Nonlinear PCA
Nonlinear PCA toolbox for MATLAB, by Matthias Scholz
Auto-associative neural network (Autoencoder)
Nonlinear principal component analysis (NLPCA)
is commonly seen as a nonlinear generalization of standard
principal component analysis (PCA).
It generalizes the principal components from straight lines to curves.
Thus, the subspace of the original data space that is described by all nonlinear components is also curved.
Nonlinear PCA can be achieved by using a neural network with an autoassociative architecture,
also known as an autoencoder, replicator network, bottleneck network, or sandglass-type network.
Such an autoassociative neural network is a multi-layer perceptron that performs an identity
mapping, meaning that the output of the network is required to be identical to the input.
However, in the middle of the network is a layer that works as a bottleneck in which
a reduction of the dimension of the data is enforced. This bottleneck-layer provides
the desired component values (scores).
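The bottleneck idea described above can be illustrated with a minimal sketch in Python with NumPy. This is not the implementation of the MATLAB toolbox; the layer sizes, the tanh activation, the plain gradient-descent training, and the synthetic parabola data are all assumptions chosen to keep the example short. The network has the shape input, hidden, bottleneck, hidden, output, and is trained so that the output reproduces the input.

```python
import numpy as np

# Assumption: tanh on the two mapping layers, linear bottleneck and linear output.
NONLINEAR = (True, False, True, False)

def forward(X, W, b):
    """Return the activations of every layer, starting with the input itself."""
    acts = [X]
    for Wi, bi, nl in zip(W, b, NONLINEAR):
        z = acts[-1] @ Wi + bi
        acts.append(np.tanh(z) if nl else z)
    return acts

def train(X, n_hidden=8, n_components=1, lr=0.1, epochs=3000, seed=0):
    """Fit an identity mapping through a bottleneck by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    sizes = [X.shape[1], n_hidden, n_components, n_hidden, X.shape[1]]
    W = [rng.normal(0.0, 0.5, (m, k)) for m, k in zip(sizes[:-1], sizes[1:])]
    b = [np.zeros(k) for k in sizes[1:]]
    losses = []
    for _ in range(epochs):
        acts = forward(X, W, b)
        err = acts[-1] - X                    # output should equal the input
        losses.append(float(np.mean(err ** 2)))
        delta = 2.0 * err / err.size          # gradient of the mean squared error
        for i in reversed(range(len(W))):
            if NONLINEAR[i]:
                delta = delta * (1.0 - acts[i + 1] ** 2)   # derivative of tanh
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            delta = delta @ W[i].T            # propagate before updating W[i]
            W[i] -= lr * grad_W
            b[i] -= lr * grad_b
    return W, b, losses

# Synthetic stand-in data: noisy points along a parabola in two dimensions.
rng = np.random.default_rng(1)
t = rng.uniform(-1.0, 1.0, 200)
X = np.column_stack([t, t ** 2]) + rng.normal(0.0, 0.02, (200, 2))
X = X - X.mean(axis=0)

W, b, losses = train(X)
acts = forward(X, W, b)
scores = acts[2]            # bottleneck activations: one nonlinear component per sample
reconstruction = acts[-1]   # network output, an approximation of X
print("reconstruction error, first vs. last epoch:", losses[0], losses[-1])
```

Here `scores` plays the role of the component values read off the bottleneck layer, and `reconstruction` is the point on the learned curve that corresponds to each sample. A practical implementation would typically use a more robust optimizer and some form of regularization.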
Here, NLPCA is applied to 19-dimensional spectral data representing equivalent widths of 19 absorption lines of 487 stars, available at www.cida.ve. The figure in the middle shows a visualisation of the data by using the first three components of standard PCA. Data of different colors belong to different spectral groups of stars. The first three components of linear PCA and of NLPCA are represented by grids in the left and right figure, respectively. Each grid represents the two-dimensional subspace given by two components while the third one is set to zero. Thus, the grids represent the new coordinate system of the transformation. In contrast to linear PCA (left) which does not describe the nonlinear characteristics of the data, NLPCA gives a nonlinear (curved) description of the data, shown on the right.
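For the linear PCA case, the grid construction described above (two components varied, the third set to zero) can be sketched as follows. The data here is a random placeholder with the same shape as the stellar data set (487 samples, 19 variables), not the actual spectra, and the grid resolution of 10 by 10 is an arbitrary choice.

```python
import numpy as np

# Placeholder data with the shape of the stellar data set; NOT the real spectra.
rng = np.random.default_rng(0)
X = rng.normal(size=(487, 19))

# Linear PCA via singular value decomposition of the centered data.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:3]                        # first three principal directions, shape (3, 19)
scores = (X - mean) @ components.T         # component values, shape (487, 3)

# Grid over components 1 and 2, with component 3 fixed at zero.
g1 = np.linspace(scores[:, 0].min(), scores[:, 0].max(), 10)
g2 = np.linspace(scores[:, 1].min(), scores[:, 1].max(), 10)
G1, G2 = np.meshgrid(g1, g2)
grid_scores = np.column_stack([G1.ravel(), G2.ravel(), np.zeros(G1.size)])

# Map the grid back into the 19-dimensional data space.
grid_in_data_space = mean + grid_scores @ components   # shape (100, 19)
```

Because the mapping back to data space is linear, this grid is a flat plane. For NLPCA the same grid of component values would instead be passed through the decoding half of the network, which is what produces a curved grid.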
see all publications: [Matthias Scholz: publications]
Nonlinear PCA: www.nlpca.de
Matthias Scholz: www.matthias-scholz.de