class skcuda.linalg.PCA(n_components=None, handle=None, epsilon=1e-07, max_iter=10000)

Principal Component Analysis with an API similar to sklearn.decomposition.PCA.

The algorithm implemented here was first implemented with CUDA in [Andrecut, 2008]. It performs linear dimensionality reduction of a data matrix, mapping the data onto a K-dimensional subspace. See the reference for more information.

  • n_components (int, default=None) – The number of principal component column vectors to compute in the output matrix.
  • handle (c_void_p, default=None) – CUBLAS context handle. If None, the default skcuda CUBLAS handle is used.
  • epsilon (float, default=1e-7) – The maximum error tolerance for eigenvalue approximation.
  • max_iter (int, default=10000) – The maximum number of iterations used in approximating each eigenvalue.


If n_components is None, then for an NxP data matrix, K = min(N, P). Otherwise, K = min(n_components, N, P).
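The roles of epsilon and max_iter can be illustrated with a CPU-side sketch of an iterative, deflation-based PCA (NIPALS-style, similar in spirit to the GS-PCA algorithm of [Andrecut, 2008]). This is an illustrative NumPy sketch under those assumptions, not the library's GPU code path:

```python
import numpy as np

def pca_nipals(X, n_components=None, epsilon=1e-7, max_iter=10000):
    """Illustrative CPU sketch of iterative PCA; not the skcuda GPU kernel."""
    X = X - X.mean(axis=0)              # centre the data
    N, P = X.shape
    K = min(n_components, N, P) if n_components is not None else min(N, P)
    R = X.copy()                        # residual matrix, deflated per component
    T = np.zeros((N, K))                # scores: the dimension-reduced output
    for k in range(K):
        t = R[:, 0].copy()              # initial score guess
        prev = 0.0
        for _ in range(max_iter):       # max_iter bounds the per-eigenvalue loop
            p = R.T @ t                 # loading direction
            p /= np.linalg.norm(p)
            t = R @ p                   # updated score
            ev = t @ t                  # eigenvalue estimate (Rayleigh quotient)
            if abs(ev - prev) < epsilon * ev:
                break                   # epsilon: relative eigenvalue tolerance
            prev = ev
        T[:, k] = t
        R -= np.outer(t, p)             # deflate: remove the converged component
    return T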


[Andrecut, 2008] M. Andrecut, "Parallel GPU Implementation of Iterative PCA Algorithms", 2008.


>>> import pycuda.autoinit
>>> import pycuda.gpuarray as gpuarray
>>> import numpy as np
>>> import skcuda.linalg as linalg
>>> from skcuda.linalg import PCA as cuPCA
>>> pca = cuPCA(n_components=4) # map the data to 4 dimensions
>>> X = np.random.rand(1000,100) # 1000 samples of 100-dimensional data vectors
>>> X_gpu = gpuarray.GPUArray((1000,100), np.float64, order="F") # order="F" or a transpose is necessary: fit_transform expects column-major (Fortran-order) matrices, and row-major is the default
>>> X_gpu.set(X) # copy data to gpu
>>> T_gpu = pca.fit_transform(X_gpu) # calculate the principal components
>>> linalg.dot(T_gpu[:,0], T_gpu[:,1]) # show that the resulting eigenvectors are orthogonal
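For intuition, what fit_transform computes can be sketched on the CPU with NumPy: centre the data and project it onto the leading K principal axes. Here the axes are obtained by SVD rather than the library's iterative GPU scheme, and mean-centring is assumed as in standard PCA:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((1000, 100))            # 1000 samples, 100 variables
Xc = X - X.mean(axis=0)                # centre each variable
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
K = 4
T = Xc @ Vt[:K].T                      # scores, shape (1000, 4)
# the score columns are mutually orthogonal, as in the doctest above
ortho = abs(T[:, 0] @ T[:, 1])
```

The dot product of any two distinct score columns is zero up to floating-point error, mirroring the `linalg.dot(T_gpu[:,0], T_gpu[:,1])` check in the GPU example.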
__init__(n_components=None, handle=None, epsilon=1e-07, max_iter=10000)

Methods

__init__([n_components, handle, epsilon, …]) Initialize the PCA object; see help(type(x)) for the full signature.
fit_transform(X_gpu) Fit the Principal Component Analysis model, and return the dimension-reduced matrix.
get_n_components() n_components getter.
set_n_components(n_components) n_components setter.