High-dimensional Exploratory Item Factor Analysis by A Metropolis–Hastings Robbins–Monro Algorithm

Cai, Li

doi:10.1007/s11336-009-9136-x

Theory and Methods
Open access
Published: 28 July 2009

Psychometrika, Volume 75, pages 33–57 (2010)

Abstract

A Metropolis–Hastings Robbins–Monro (MH-RM) algorithm for high-dimensional maximum marginal likelihood exploratory item factor analysis is proposed. The sequence of estimates from the MH-RM algorithm converges with probability one to the maximum likelihood solution. Details on the computer implementation of this algorithm are provided. The accuracy of the proposed algorithm is demonstrated with simulations. As an illustration, the proposed algorithm is applied to explore the factor structure underlying a new quality of life scale for children. It is shown that when the dimensionality is high, MH-RM has advantages over existing methods such as the numerical quadrature-based EM algorithm. Extensions of the algorithm to other modeling frameworks are discussed.
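
To make the general idea concrete, the following is a minimal sketch of how a Metropolis–Hastings sampler can be combined with a Robbins–Monro update. It is not the author's implementation and not an item factor model: it assumes a toy model in which each observation is y_i ~ N(z_i, 1) with a latent z_i ~ N(mu, 1), so the maximum likelihood estimate of mu is simply the sample mean and the result can be checked. The function name `mhrm_toy`, the gain sequence k^(-0.75), the proposal scale, and the number of sampler steps are all illustrative assumptions.

```python
import numpy as np

def mhrm_toy(y, n_iter=2000, mh_steps=5, prop_sd=1.0, seed=0):
    """Toy MH + Robbins-Monro estimate of mu (hypothetical helper, not from the paper).

    Assumed model: y_i | z_i ~ N(z_i, 1), z_i ~ N(mu, 1).
    """
    rng = np.random.default_rng(seed)
    n = y.size
    mu = 0.0          # starting value for the parameter
    z = y.copy()      # starting values for the latent variables

    def log_target(z_val, mu_val):
        # Log posterior of z given y and mu, up to an additive constant.
        return -0.5 * (y - z_val) ** 2 - 0.5 * (z_val - mu_val) ** 2

    for k in range(1, n_iter + 1):
        # Stochastic imputation: random-walk Metropolis steps for every z_i,
        # continuing the chain from the previous iteration.
        for _ in range(mh_steps):
            prop = z + prop_sd * rng.standard_normal(n)
            log_ratio = log_target(prop, mu) - log_target(z, mu)
            accept = np.log(rng.random(n)) < log_ratio
            z = np.where(accept, prop, z)

        # Complete-data score for mu evaluated at the imputed z.
        score = np.sum(z - mu)

        # Robbins-Monro update. The gains satisfy sum(gain) = inf and
        # sum(gain^2) < inf; the complete-data information for mu is n here.
        gain = k ** -0.75
        mu += gain * score / n
    return mu
```

Under these assumptions, calling `mhrm_toy(y)` on simulated data should return a value close to `y.mean()`. In the item factor analysis setting the latent variables are factor scores and the parameters are item slopes and intercepts, so the score and information terms are more involved than in this sketch.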

Author information

Authors and Affiliations

GSE & IS, UCLA, Los Angeles, CA 90095-1521, USA
Li Cai

Corresponding author

Correspondence to Li Cai.

Additional information

I thank the editor, the AE, and the reviewers for helpful suggestions. I am indebted to Drs. Chuanshu Ji, Robert MacCallum, and Zhengyuan Zhu for helpful discussions. I would also like to thank Drs. Mike Edwards and David Thissen for supplying the data sets used in the numerical demonstrations. The author gratefully acknowledges financial support from Educational Testing Service (the Gulliksen Psychometric Research Fellowship program), National Science Foundation (SES-0717941), National Center for Research on Evaluation, Standards and Student Testing (CRESST) through award R305A050004 from the US Department of Education’s Institute of Education Sciences (IES), and a predoctoral advanced quantitative methods training grant awarded to the UCLA Departments of Education and Psychology from IES. The views expressed in this paper are the author’s alone and do not reflect the views or policies of the funding agencies.

Rights and permissions

Open Access. This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Cai, L. High-dimensional Exploratory Item Factor Analysis by A Metropolis–Hastings Robbins–Monro Algorithm. Psychometrika 75, 33–57 (2010). https://doi.org/10.1007/s11336-009-9136-x

Received: 14 September 2008
Revised: 25 April 2009
Accepted: 31 May 2009
Published: 28 July 2009
Issue Date: March 2010
DOI: https://doi.org/10.1007/s11336-009-9136-x
