Main Content

Factor Analysis

Multivariate data often includes a large number of measured variables, and sometimes those variables overlap, in the sense that groups of them might be dependent. For example, in a decathlon, each athlete competes in 10 events, but several of them can be thought of as speed events, while others can be thought of as strength events, etc. Thus, you can think of a competitor's 10 event scores as largely dependent on a smaller set of three or four types of athletic ability.

Factor analysis is a way to fit a model to multivariate data to estimate just this sort of interdependence. In a factor analysis model, the measured variables depend on a smaller number of unobserved (latent) factors. Because each factor might affect several variables in common, they are known ascommon factors。每个变量被认为是依赖near combination of the common factors, and the coefficients are known asloadings。Each measured variable also includes a component due to independent random variability, known asspecific variancebecause it is specific to one variable.

Specifically, factor analysis assumes that the covariance matrix of your data is of the form

x = Λ Λ Τ + Ψ

where Λ is the matrix of loadings, and the elements of the diagonal matrixΨare the specific variances. The functionfactoranfits the Factor Analysis model using maximum likelihood.

See Also

Related Topics