In January 2016, I was honored to receive an “Honorable Mention” of the
John Chambers Award 2016.
This article was written for R-bloggers,
whose founder, Tal Galili, kindly invited me
to write an introduction to the rARPACK package.
Eigenvalue decomposition is a commonly used technique in
numerous statistical problems. For example, principal component analysis (PCA)
basically conducts eigenvalue decomposition on the sample covariance of a data
matrix: the eigenvalues are the component variances, and eigenvectors are the
variable loadings.
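
The link between PCA and eigenvalue decomposition can be checked in a few lines of base R. The sketch below is only an illustration: the built-in iris data set and the variable names X, S, e and p are my own choices, not part of the original article. It compares eigen() on the sample covariance matrix with the built-in prcomp() function.

```r
# Built-in data set: 150 observations of 4 numeric variables
# (chosen here purely for illustration)
X <- as.matrix(iris[, 1:4])

# Eigenvalue decomposition of the sample covariance matrix
S <- cov(X)
e <- eigen(S, symmetric = TRUE)

# Reference result from the built-in PCA routine
p <- prcomp(X)

# The eigenvalues are the component variances
all.equal(e$values, p$sdev^2)

# The eigenvectors are the loadings, up to the sign of each column
all.equal(abs(e$vectors), abs(unname(p$rotation)))
```

The absolute values are compared in the last step because an eigenvector is only determined up to its sign.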
In R, the standard way to compute eigenvalues is the eigen() function.
However, when the matrix becomes large, eigen() can be very time-consuming:
the complexity of calculating all eigenvalues of an $n \times n$ matrix is
$O(n^3)$.
In real applications, though, we usually only need a few
eigenvalues or eigenvectors. For example, to visualize high-dimensional
data using PCA, we may only use the first two or three components to draw
a scatterplot. Unfortunately, eigen() has no option to limit the
number of eigenvalues to be computed. This means that we always need to do the
full eigendecomposition, which can waste a great deal of computation.
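
To make the limitation concrete, here is a minimal base R sketch. The matrix size, the random test matrix and the variable names are assumptions made for illustration only. Even when just the two leading eigenpairs are wanted, eigen() computes all of them and the extras have to be discarded afterwards.

```r
set.seed(123)
n <- 200
M <- matrix(rnorm(n * n), n, n)
A <- crossprod(M)          # symmetric n x n test matrix (illustrative)

# eigen() always returns all n eigenvalues, in decreasing order
e <- eigen(A, symmetric = TRUE)
length(e$values)

# Keep only the two leading eigenpairs; the rest of the work is wasted
top2_values  <- e$values[1:2]
top2_vectors <- e$vectors[, 1:2]

# only.values = TRUE skips the eigenvectors,
# but all n eigenvalues are still computed
ev <- eigen(A, symmetric = TRUE, only.values = TRUE)
length(ev$values)
```

The only.values argument reduces the work somewhat, but it does not change how many eigenvalues are computed.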
This is why the rARPACK package was developed. As the name indicates, rARPACK
was originally an R wrapper of the
ARPACK library, a FORTRAN package
that is used to calculate a few eigenvalues of a square matrix. However,
ARPACK has not been actively developed for a long time, and it has some compatibility
issues with the current version of LAPACK. Therefore, to keep rARPACK in a
good state, I wrote a new backend for rARPACK: the C++ library
Spectra.
The name rARPACK was POORLY chosen, I admit. Starting from version
0.8-0, rARPACK no longer relies on ARPACK, but due to CRAN policies and
reverse dependencies, I have to keep using the old name.
The usage of rARPACK is simple. If you want to calculate some eigenvalues
of a square matrix A, just call the function eigs() and tell it …
Source: r-bloggers.com