Principal Component Analysis

Principal component analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. This transformation is defined in such a way that the first principal component has the largest possible variance (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed. PCA is sensitive to the relative scaling of the original variables. Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT), the Hotelling transform or proper orthogonal decomposition (POD).
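
The properties described above can be checked numerically. The following is a minimal sketch, assuming NumPy and a small synthetic data set; the helper name `pca_eig` and the data are hypothetical and chosen only for illustration. It shows that the component variances are non-increasing, that the transformed variables are uncorrelated, and that rescaling a single variable changes the direction of the first component.

```python
import numpy as np

def pca_eig(X):
    """PCA via eigendecomposition of the sample covariance matrix.

    X has shape (n_samples, n_features). Returns the component variances,
    the component directions (as columns), and the transformed data,
    all ordered by decreasing variance.
    """
    Xc = X - X.mean(axis=0)                  # mean-center each variable
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]        # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, Xc @ eigvecs

# Hypothetical data: two variables driven by a shared latent factor,
# plus one unrelated variable.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 1))
X = np.hstack([
    latent + 0.3 * rng.normal(size=(500, 1)),
    2.0 * latent + 0.3 * rng.normal(size=(500, 1)),
    rng.normal(size=(500, 1)),
])

variances, components, scores = pca_eig(X)

# Covariance of the transformed variables: diagonal up to rounding error,
# i.e. the principal components are linearly uncorrelated.
score_cov = np.cov(scores, rowvar=False)

# Sensitivity to scaling: express the third variable in units 100 times
# smaller, so that its numerical variance dominates.
X_rescaled = X.copy()
X_rescaled[:, 2] *= 100.0
_, components_rescaled, _ = pca_eig(X_rescaled)

print("component variances:", variances)
print("weight of variable 3 in PC1, original data:", abs(components[2, 0]))
print("weight of variable 3 in PC1, rescaled data:", abs(components_rescaled[2, 0]))
```

In this sketch the first component of the original data follows the two correlated variables, while after rescaling it aligns almost entirely with the third variable. This is one reason the variables are often standardized before PCA is applied.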

PCA was invented in 1901 by Karl Pearson. It is now mostly used as a tool in exploratory data analysis and for making predictive models. PCA can be done by eigenvalue decomposition of a data covariance (or correlation) matrix or by singular value decomposition of a data matrix, usually after mean-centering each attribute of the data matrix (and, optionally, normalizing it or converting it to Z-scores). The results of a PCA are usually discussed in terms of component scores, sometimes called factor scores (the transformed variable values corresponding to a particular data point), and loadings (the weight by which each standardized original variable should be multiplied to get the component score).
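
The two computational routes mentioned above give the same result up to the sign of each component. The following is a minimal sketch, assuming NumPy and a hypothetical random data matrix `X`; the variable names are illustrative only. It mean-centers the data, then compares the eigendecomposition of the covariance matrix with the singular value decomposition of the centered data matrix.

```python
import numpy as np

# Hypothetical data matrix: 200 observations of 4 correlated variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)) @ rng.normal(size=(4, 4))
n = X.shape[0]

# Mean-center each attribute. Dividing each column by its standard
# deviation as well (Z-scores) would correspond to PCA on the
# correlation matrix instead of the covariance matrix.
Xc = X - X.mean(axis=0)

# Route 1: eigenvalue decomposition of the covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: singular value decomposition of the centered data matrix.
# The rows of Vt are the component directions; the squared singular
# values divided by (n - 1) are the component variances.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
svd_variances = s**2 / (n - 1)

# Component scores: the transformed values for each data point.
scores_eig = Xc @ eigvecs
scores_svd = U * s                           # same as Xc @ Vt.T

print("variances (eigendecomposition):", eigvals)
print("variances (SVD):               ", svd_variances)
```

The columns of `eigvecs` (equivalently the rows of `Vt`) hold the weights applied to each centered variable to obtain a score. Because an eigenvector is only defined up to sign, the two routes may return components that differ by a factor of -1, so comparisons are best made on absolute values.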

PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in a way that best explains the variance in the data. If a multivariate dataset is visualised as a set of coordinates in a high-dimensional data space (one axis per variable), PCA can supply the user with a lower-dimensional picture, a "shadow" of this object when viewed from its (in some sense) most informative viewpoint. This is done by keeping only the first few principal components, so that the dimensionality of the transformed data is reduced.
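
The "shadow" idea can be illustrated with a short sketch, assuming NumPy; the function name `project_top_k` and the synthetic data are hypothetical. The data are five-dimensional points that lie close to a two-dimensional plane, so the first two principal components should capture nearly all of the variance.

```python
import numpy as np

def project_top_k(X, k):
    """Project X onto its first k principal components.

    Returns the reduced data (n_samples, k), its reconstruction in the
    original space, and the fraction of total variance retained.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                          # first k directions (rows)
    reduced = Xc @ components.T                  # lower-dimensional picture
    reconstructed = reduced @ components + mean  # map back to original axes
    retained = (s[:k] ** 2).sum() / (s ** 2).sum()
    return reduced, reconstructed, retained

# Hypothetical data: 5-dimensional points near a 2-dimensional plane.
rng = np.random.default_rng(2)
plane = rng.normal(size=(300, 2)) @ rng.normal(size=(2, 5))
X = plane + 0.05 * rng.normal(size=(300, 5))

reduced, reconstructed, retained = project_top_k(X, k=2)

print("reduced shape:", reduced.shape)
print("fraction of variance retained:", retained)
print("mean squared reconstruction error:", np.mean((X - reconstructed) ** 2))
```

How many components to keep is a judgment call that depends on the application. The retained-variance fraction computed here is one common guide, not a rule prescribed by the text above.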

PCA is closely related to factor analysis. Factor analysis typically incorporates more domain-specific assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix.
