Beck depression inventory-II: A study for meta analytical reliability generalization



Meta-Analysis, Reliability Generalization, Beck Depression Inventory-II, VC Model, Cronbach Alpha


The main aim of reliability generalization (RG) is to investigate the variability in reliability estimates and to characterize the sources of that variability. In this research, an RG study was carried out on the Beck Depression Inventory-II to investigate potential factors contributing to the variability in the reliability of its measurement results and to examine the sources of measurement error. The study examined 40 publications, restricted to journal articles published in English between 2011 and 2019. The Kappa coefficient for the coding form was 0.93, and it was concluded that the coding produced valid and reliable results. The analyses were conducted with Jamovi and R. Taken together, the results of the publication bias tests indicated no publication bias among the included studies. Because the observed heterogeneity was judged to merit examination, moderator analyses were performed. These analyses showed that none of the continuous or categorical moderator variables explained the variability among the reliability estimates of the inventory. To support high-quality RG studies in the future, it is recommended that researchers report the reliability estimates obtained from the measurement results of their own studies.
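As an illustration of the coder-agreement statistic mentioned above, the following is a minimal sketch of how Cohen's Kappa can be computed for two coders who classify the same set of studies. It is not the authors' procedure (the study used Jamovi and R); the function name `cohen_kappa`, the category labels, and the ten example codes are all hypothetical and chosen only to show the arithmetic.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's Kappa for two coders rating the same items (hypothetical helper)."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    n = len(coder_a)

    # Observed agreement: share of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: product of the coders' marginal proportions,
    # summed over every category either coder used.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (observed - expected) / (1.0 - expected)


# Illustrative data only (not from the study): sample type coded for 10 articles.
coder_a = ["clinical", "clinical", "nonclinical", "nonclinical", "clinical",
           "nonclinical", "clinical", "nonclinical", "clinical", "nonclinical"]
coder_b = ["clinical", "clinical", "nonclinical", "nonclinical", "clinical",
           "nonclinical", "clinical", "nonclinical", "clinical", "clinical"]

kappa = cohen_kappa(coder_a, coder_b)
print(round(kappa, 2))
```

In this made-up example the coders agree on 9 of 10 articles (observed agreement 0.90) and chance agreement is 0.50, so Kappa is (0.90 - 0.50) / (1 - 0.50) = 0.80. A value such as the 0.93 reported for the coding form would correspond to even closer agreement.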








How to Cite

Eser, M. T., & Aksu, G. (2021). Beck depression inventory-II: A study for meta analytical reliability generalization. Pegem Journal of Education and Instruction, 11(3), 88–101. Retrieved from