Abstract
Latent class (LC) analysis is used by researchers in the social, behavioral, and medical sciences, among others, as a tool for clustering (or unsupervised classification) with categorical response variables, for analyzing the agreement between multiple raters, for evaluating the sensitivity and specificity of diagnostic tests in the absence of a gold standard, and for modeling heterogeneity in developmental trajectories. Despite the growing popularity of LC analysis, little is known about statistical power and required sample size in LC modeling. This paper shows how to perform power and sample size computations in LC models using Wald tests for the parameters describing the association between the categorical latent variable and the response variables. It also studies the design factors affecting the statistical power of these Wald tests. More specifically, we show how design factors that are specific to LC analysis, such as the number of classes, the class proportions, and the number of response variables, affect the information matrix. The proposed power computation approach is illustrated using realistic scenarios for the design factors. A simulation study conducted to assess the performance of the proposed power analysis procedure shows that it performs well in all situations one may encounter in practice.
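As a rough illustration of the kind of computation summarized above, the following Python sketch evaluates the asymptotic power of a two-sided Wald test for a single parameter and searches for the smallest sample size that reaches a target power. It is not the authors' implementation. The function names and the `unit_variance` argument are hypothetical: `unit_variance` stands for the asymptotic variance of the estimator for one observation, which in an LC model would have to be obtained from the inverse of the information matrix under the assumed population values and is simply taken as given here. The sketch covers only the one-degree-of-freedom case, where the noncentral chi-square distribution of the Wald statistic reduces to a shifted normal distribution.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def wald_power(beta, unit_variance, n, alpha=0.05):
    """Asymptotic power of a two-sided, 1-df Wald test of H0: beta = 0.

    unit_variance (hypothetical name) is the asymptotic variance of the
    estimator for a single observation, so the variance at sample size n
    is unit_variance / n.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # delta is the square root of the noncentrality parameter n * beta^2 / v.
    delta = beta / sqrt(unit_variance / n)
    # P(|Z + delta| > z_crit) for a standard normal Z.
    return _STD_NORMAL.cdf(delta - z_crit) + _STD_NORMAL.cdf(-delta - z_crit)


def required_sample_size(beta, unit_variance, target_power=0.80, alpha=0.05):
    """Smallest n whose asymptotic power reaches target_power."""
    if beta == 0:
        raise ValueError("power never exceeds alpha when beta is 0")
    if not alpha < target_power < 1:
        raise ValueError("target_power must lie strictly between alpha and 1")
    # Power increases with n, so bracket the answer by doubling, then bisect.
    lo, hi = 0, 1
    while wald_power(beta, unit_variance, hi, alpha) < target_power:
        lo, hi = hi, hi * 2
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if wald_power(beta, unit_variance, mid, alpha) >= target_power:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    n_needed = required_sample_size(beta=0.5, unit_variance=4.0)
    print(n_needed, round(wald_power(0.5, 4.0, n_needed), 3))
```

A test of several parameters at once would need the noncentral chi-square distribution with more than one degree of freedom, which the Python standard library does not provide.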
Additional information
This work is part of research project 406-11-039 “Power analysis for simple and complex mixture models” financed by the Netherlands Organization for Scientific Research (NWO).
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Gudicha, D.W., Tekle, F.B. & Vermunt, J.K. Power and Sample Size Computation for Wald Tests in Latent Class Models. J Classif 33, 30–51 (2016). https://doi.org/10.1007/s00357-016-9199-1
DOI: https://doi.org/10.1007/s00357-016-9199-1