FIM™ internal construct validity revisited
Testlet approaches
Within the traditional testlet approach, 3 different
versions of testlet combinations were applied, based on the
underlying subscale structure of the FIM™. Two versions
included 4 testlets for the motor scale, structured
according to the FIM™ subtopics (self-care, sphincter
control, transfers, locomotion), together with 2
combinations of the cognitive items. In one version, all the
cognitive FIM™ items were combined into one testlet, since
they all showed local dependency among each other in
the baseline analysis, resulting in a total of 5 testlets.
In the other version, the cognitive items were split
thematically according to the FIM™ subtopics into 2
testlets, communication and social cognition, resulting
in a total of 6 testlets. The third version attempted to
form similarly sized testlets and was guided by the
residual correlations between the items and previously
reported clusters of the FIM™ (29, 36). In this version,
3 testlets were created: a self-care testlet incorporating
items A–H, a mobility testlet incorporating items I–M,
and a cognitive testlet incorporating items N–R. None
of the 3 traditional testlet approaches (the 3-testlet, the
5-testlet and the 6-testlet version) resulted in fit to the
Rasch model (see Table II).
In contrast, the alternative 2-testlet approach (with
Testlet1 containing items A, C, E, G, I, K, M, O and
Q, and Testlet2 containing items B, D, F, H, J, L, N, P
and R) showed fit to the Rasch model across all 9
analysis steps. The p-values from the item-trait χ² were all
non-significant at the 0.01 level, the reliability indices
were all above 0.9, and the item- and person-fit estimates
were within the set acceptable values. The explained common
variance values retained in the latent estimate were
all just above 1, indicating some marginal remaining
residual local dependency among the testlets. The fit
of all testlet solutions is summarized in Table II, and
the application of the 2-testlet approach to all
aggregation levels of the calibration sample is shown in
Appendix S2¹.
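The alternating item assignment and the comparison against the acceptable values can be sketched as follows. This is a minimal illustration, not the authors' analysis: the reported values and thresholds are copied from Table II for the whole calibration sample, while the dictionary keys and the `screen` function are hypothetical names introduced here.

```python
from string import ascii_uppercase

ITEMS = ascii_uppercase[:18]      # FIM items A..R

# Alternating assignment of items to the 2 testlets.
testlet1 = list(ITEMS[0::2])      # every second item, starting at A
testlet2 = list(ITEMS[1::2])      # every second item, starting at B

# Acceptable values as listed in Table II: (comparison, bound).
ACCEPTABLE = {
    "item_fit_sd": ("<", 1.4),
    "person_fit_sd": ("<", 1.4),
    "chi2_p": (">", 0.01),
    "psi": (">", 0.7),
    "alpha": (">", 0.7),
    "explained_common_variance": (">", 0.9),
    "paired_t_test_pct": ("<", 5.00),
    "conditional_test_p": (">", 0.01),
}

# Values reported in Table II for the 2-testlet solution (FIM_all).
REPORTED_2_TESTLET = {
    "item_fit_sd": 0.317,
    "person_fit_sd": 1.003,
    "chi2_p": 0.408,
    "psi": 0.980,
    "alpha": 0.981,
    "explained_common_variance": 1.019,
    "paired_t_test_pct": 4.97,
    "conditional_test_p": 0.607,
}


def screen(values, criteria):
    """Return, per statistic, whether the value meets its acceptable bound."""
    result = {}
    for name, (comparison, bound) in criteria.items():
        value = values[name]
        result[name] = value < bound if comparison == "<" else value > bound
    return result


checks = screen(REPORTED_2_TESTLET, ACCEPTABLE)
all_within_bounds = all(checks.values())
```

Note that this screen covers only the numeric criteria; the DIF column of Table II is a separate, non-numeric criterion and is not checked here.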
Differential Item Functioning strategy
Despite overall fit, some DIF remained in the 2-testlet
solution for the whole calibration sample. To eliminate
all DIF, the successful 2-testlet solution of the
whole calibration sample had to be split twice. First,
Testlet2 had to be split by rehabilitation group. Second,
the musculoskeletal rehabilitation group of Testlet2
had to be split by the 2 time-points, i.e. admission
and discharge. This resulted in the following
super-items: Testlet1, Testlet2_NEUR, Testlet2_MSKt1,
and Testlet2_MSKt2. Testlet1 served as the anchor for
comparing the person estimates of the split and the
unsplit version. The effect size calculation resulted in
0.11 (see Appendix S3¹), indicating that there was no
need to split the final interval-scale transformation into
different subgroups.
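The text does not state which effect size formula was used (the details are in Appendix S3). As one possible reading, the sketch below computes a standardized mean difference between person estimates from the split and the unsplit solution, using the pooled standard deviation of the two sets. The formula choice, the function name and the person estimates are all assumptions made for illustration; they do not reproduce the reported value of 0.11.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_mean_difference(split, unsplit):
    """Mean difference divided by the pooled SD of both sets of estimates.

    Assumption: a Cohen's-d-style effect size; the source does not
    specify the formula actually used.
    """
    pooled_sd = sqrt((stdev(split) ** 2 + stdev(unsplit) ** 2) / 2)
    return (mean(split) - mean(unsplit)) / pooled_sd


# Hypothetical person estimates in logits for the same 5 persons (assumption).
unsplit_estimates = [-1.20, -0.40, 0.30, 1.10, 2.00]
split_estimates = [-1.10, -0.30, 0.35, 1.20, 2.10]

effect_size = standardized_mean_difference(split_estimates, unsplit_estimates)
```

A value close to zero would mean that person estimates barely change when the DIF-affected testlet is split, which is the kind of evidence the paragraph above relies on when it concludes that a single transformation suffices.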
Transformation table
Based on the 2-testlet solution, an interval-based
transformation table was created for all available
FIM™ total scores, which can be used to transfer the
Table II. Testlet solutions on the level of the whole calibration sample (FIM_all)

| Testlets (items) | n/CI | Item-fit residuals, mean (SD) | Person-fit residuals, mean (SD) | χ² p-value | PSI | α | DIF (Testlet) | A | Paired t-test, % | Cond. test of fit, CI based |
|---|---|---|---|---|---|---|---|---|---|---|
| 6 testlets: Self-Care (A–F), Sphincter Control (G–H), Transfers (I–K), Locomotion (L–M), Communication (N–O), Social Cognition (P–R) | 946/10 | –0.156 (5.077) | –0.426 (1.200) | 0.000 | 0.906 | 0.887 | gender (T6), age (T2), language (T1, T2, T3, T4, T5, T6), insurance (T1, T5), time-point (T2, T4, T5), rehab-group (T1, T3, T4, T5, T6) | 0.942 | 1.27 | Only available for the 2-testlet approach |
| 5 testlets: Self-Care (A–F), Sphincter Control (G–H), Transfers (I–K), Locomotion (L–M), Cognition (N–R) | 946/10 | –0.010 (7.046) | –0.360 (1.138) | 0.000 | 0.895 | 0.878 | age (T2, T5), language (T1, T2, T3, T4, T5), nationality (T5), insurance (T1), time-point (T2, T4), rehab-group (T3, T4, T5) | 0.930 | 1.16 | Only available for the 2-testlet approach |
| 3 testlets: Self-Care (A–H), Mobility (I–M), Cognition (N–R) | 946/10 | –1.419 (6.894) | –0.502 (1.049) | 0.000 | 0.838 | 0.859 | gender (T2, T3), age (T1), language (T1, T2, T3), nationality (T1, T3), insurance (T1), time-point (T2), rehab-group (T2, T3) | 0.871 | 1.27 | Only available for the 2-testlet approach |
| 2 testlets: Testlet1 (A, C, E, G, I, K, M, O, Q), Testlet2 (B, D, F, H, J, L, N, P, R) | 946/10 | –0.208 (0.317) | –0.614 (1.003) | 0.408 | 0.980 | 0.981 | rehab-group (T1, T2) | 1.019 | 4.97 | 0.607 |
| Acceptable values | | SD < 1.4 | SD < 1.4 | > 0.01 | > 0.7 | > 0.7 | No DIF | > 0.9 | < 5.00 | > 0.01 |

FIM: Functional Independence Measure; all: combination of time-points and rehabilitation-groups; n: sample size; CI: class intervals; SD: standard deviation;
PSI: person separation index; α: Cronbach’s alpha; A: explained common variance; DIF: differential item functioning.
J Rehabil Med 51, 2019