Abstract

A study of 10 Omron pedometers found that the devices were not reliable according to commonly used reliability indices, i.e., the Pearson correlation (r), the intraclass correlation coefficient (ICC), Cronbach's alpha, and the G-coefficient (based on generalizability theory), even though the precision of the pedometers was in fact outstanding. Low between-subject variability is suspected to be the reason. PURPOSE: To examine why the reliability indices were low despite the high precision of the devices and to identify an appropriate statistical index for this kind of phenomenon. METHODS: Twelve healthy male (M ± SD of height, weight, and BMI: 178.65 ± 5.89 cm, 90.40 ± 15.59 kg, and 28.24 ± 4.09) and eight healthy female (163.20 ± 5.35 cm, 77.92 ± 21.72 kg, and 28.97 ± 7.08) adults (42.35 ± 13.47 yr) were recruited to wear five sets of 10 BI pedometers during testing at the left (L) and right (R) waist positions on the front of the body. With each set of pedometers, the subjects walked 100 counted straight steps twice (back and forth) on a level sidewalk at a comfortable walking speed. To artificially increase between-subject variance, a cumulative increment of five steps per subject was added to both the pre- and post-trial raw data, starting with the 2nd subject (i.e., raw score 2 + 5, raw score 3 + 10,…, raw score 20 + 95). Reliability coefficients, including the ICC between (B) trials (T), sides (S), and pedometers (P), r, alpha, and the G-coefficient (3 facets: trial, side, and pedometer), were computed for both the raw (RA) and “simulated” (S) data. RESULTS: The M ± SD of the raw scores were 101.10 ± 4.51 (R) and 100.30 ±77 (L), with an average absolute error% ([recorded steps – actual steps]/actual steps * 100) of 1.83 ± 4.26 (R) and 0.48 ± 0.67 (L), respectively.
The reliability coefficients for both data sets were as follows. ICC: BT: RA = .192, S = .999; BS: RA = .155, S = .994; BP: RA = .228, S = .993. r: RA = .14 ± .39, S = .99 ± .01. Alpha: RA = .59, S = .99. G-coefficient: RA = .03, S = .41. CONCLUSION: Commonly used reliability indices are based on classical true-score theory (true-score variance/obtained-score variance), and the “low” reliability observed in the raw data set is due to the design of the study and the high precision of the device. Since each pedometer counted 100 steps and the pedometers precisely measured the actual steps taken, there was little between-subject variance. As a result, falsely “unreliable” results were found. The same measurements appeared reliable once artificial between-subject variation was introduced. Clearly, absolute precision indices (e.g., absolute error%, root mean square difference index, etc.) and related statistical tests (e.g., paired t-test or repeated-measures ANOVA) are more appropriate for this kind of research design and data set.
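
The mechanism described in the conclusion can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the step counts are made up (they are not the study's data), the one-way random-effects ICC(1,1) is used because the abstract does not say which ICC form was computed, and the names `icc_oneway` and `abs_error_pct` are hypothetical helpers. The shift of 5 steps per subject mirrors the manipulation described in METHODS.

```python
# Illustrative only: made-up step counts for 20 subjects x 2 trials,
# every subject walking 100 actual steps. NOT the study's data.
ACTUAL_STEPS = 100

trial1 = [100, 101, 99, 100, 100, 101, 100, 99, 100, 100,
          101, 100, 100, 99, 100, 100, 102, 100, 100, 101]
trial2 = [100, 100, 100, 101, 99, 100, 100, 100, 101, 100,
          99, 100, 101, 100, 100, 99, 100, 100, 101, 100]


def icc_oneway(scores):
    """One-way random-effects ICC(1,1); scores is a list of per-subject rows."""
    n = len(scores)            # number of subjects
    k = len(scores[0])         # number of trials per subject
    grand = sum(x for row in scores for x in row) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(scores, row_means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def abs_error_pct(counts, actual):
    """Mean absolute error as a percentage of the actual step count."""
    return sum(100 * abs(c - actual) / actual for c in counts) / len(counts)


raw = [list(pair) for pair in zip(trial1, trial2)]

# Artificial between-subject variance: subject 1 unchanged, subject 2 gets +5,
# subject 3 gets +10, and so on, applied to both trials.
shifted = [[x + 5 * i for x in row] for i, row in enumerate(raw)]

icc_raw = icc_oneway(raw)
icc_sim = icc_oneway(shifted)
err_pct = abs_error_pct(trial1 + trial2, ACTUAL_STEPS)

print(f"ICC raw:     {icc_raw:.3f}")
print(f"ICC shifted: {icc_sim:.3f}")
print(f"Mean absolute error (raw): {err_pct:.2f}%")
```

In this toy data the within-subject error is identical in both versions; only the spread between subjects changes. The ICC for the unshifted counts should come out near zero while the shifted counts give a value close to 1, even though the absolute error of the counts stays well under 1%.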
