Exploration of Diagnostic Testing Instruments: Validity, Reliability, and Item Characteristics

Wahyu Hartono; Samsul Hadi; Raden Rosnawati; Heri Retnawati

doi:10.47750/pegegog.13.03.39

Authors

Wahyu Hartono Universitas Swadaya Gunung Djati
Samsul Hadi Universitas Negeri Yogyakarta https://orcid.org/0000-0003-3437-2542
Raden Rosnawati Universitas Negeri Yogyakarta
Heri Retnawati Universitas Negeri Yogyakarta https://orcid.org/0000-0002-1792-5873

DOI:

https://doi.org/10.47750/pegegog.13.03.39

Keywords:

Validity, Reliability, Classical Test Theory

Abstract

Diagnostic assessments are designed to measure students' knowledge structures and processing skills and thereby provide information about their cognitive attributes. The purpose of this study is to determine the validity and reliability of a diagnostic instrument and to identify its item characteristics using classical test theory. The data are the responses of 166 respondents from 5 public elementary schools in Cirebon to elementary school mathematics items. Item characteristics were analyzed under classical test theory using the R software package. The results show that the developed diagnostic instrument for mathematical ability has high content validity based on the Aiken formula and that its construct validity is supported by the CFA approach. According to the Spearman-Brown formula, the coefficient is about 0.889, indicating high internal consistency reliability. Overall, the items fall in the moderate category of the difficulty index. The discrimination index shows that two items, items 9 and 17, have low discriminating power, so these two items should be revised or not used. Of the 60 distractors in total, 5 (8.3%) did not function well because fewer than 5% of the participants chose them, whereas 55 (91.7%) functioned well.
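The quantities named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' analysis, which was carried out in R; the function names, the rating scale, and all sample data below are hypothetical and chosen only to show the standard formulas: Aiken's V for content validity, the Spearman-Brown step-up for split-half reliability, the proportion-correct difficulty index, and the 5% rule for distractor functioning.

```python
def aiken_v(ratings, lo, hi):
    """Aiken's V for one item: sum(r - lo) / (n * (hi - lo)).

    ratings: expert ratings on an integer scale from lo to hi (assumed).
    """
    n = len(ratings)
    return sum(r - lo for r in ratings) / (n * (hi - lo))


def spearman_brown(r_half):
    """Step up a half-test correlation to full-test reliability."""
    return 2 * r_half / (1 + r_half)


def item_difficulty(scores):
    """Difficulty index: proportion of correct (1) answers among 0/1 scores."""
    return sum(scores) / len(scores)


def distractor_report(choices, options, key, threshold=0.05):
    """Map each distractor to (proportion chosen, functions_well).

    A distractor is flagged as not functioning when fewer than
    `threshold` (5% here, as in the abstract) of respondents chose it.
    """
    n = len(choices)
    report = {}
    for opt in options:
        if opt == key:
            continue  # the answer key is not a distractor
        p = choices.count(opt) / n
        report[opt] = (p, p >= threshold)
    return report


# Hypothetical data: 4 expert ratings on a 1-4 scale, one item's 0/1 scores,
# and 40 responses to one multiple-choice item whose key is "A".
expert_ratings = [4, 4, 3, 4]
item_scores = [1, 0, 1, 1]
responses = ["A"] * 20 + ["B"] * 10 + ["C"] * 9 + ["D"] * 1

print(aiken_v(expert_ratings, lo=1, hi=4))
print(spearman_brown(0.6))  # 0.6 is a made-up half-test correlation
print(item_difficulty(item_scores))
print(distractor_report(responses, options="ABCD", key="A"))
```

In this made-up item, option D is chosen by 1 of 40 respondents (2.5%), so it would be flagged as a distractor that does not function well, in the same sense as the 5 distractors reported in the abstract.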


