Mathematical Derivation: PCA Whitening Implementation

This document details the mathematical derivation of the PCA (Principal Component Analysis) Whitening method as implemented in the PCA.R function of the eegwhiten package.

1. Core Concept

PCA whitening transforms a dataset $X$ such that the covariance matrix of the transformed data is the identity matrix $I$. It achieves this by projecting the data onto its principal components and scaling them by the inverse square root of their variances (eigenvalues).

2. Mathematical Formulation

Step 1: Eigendecomposition

Given a centered data matrix $X \in \mathbb{R}^{n \times p}$ (where $n$ is the number of trials and $p$ is the number of channels), the sample covariance matrix $\Sigma$ is computed as:

\[\Sigma = \frac{1}{n-1} X^\top X\]
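
For concreteness, here is a minimal R sketch of this computation on synthetic data; the sizes and the matrix `X` are illustrative placeholders, not taken from the package:

```r
n <- 200; p <- 8                           # trials x channels (illustrative)
X <- scale(matrix(rnorm(n * p), n, p),
           center = TRUE, scale = FALSE)   # center each channel
Sigma <- crossprod(X) / (n - 1)            # (1 / (n-1)) * t(X) %*% X
all.equal(Sigma, cov(X))                   # agrees with R's built-in cov()
```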

In PCA.R, we perform the eigendecomposition of the symmetric matrix $\Sigma$:

\[\Sigma = U \Lambda U^\top\]
  • $U \in \mathbb{R}^{p \times p}$ is the orthogonal matrix whose columns are the eigenvectors of $\Sigma$ (variable `U` in code).
  • $\Lambda = \text{diag}(\lambda_1, \dots, \lambda_p)$ is the diagonal matrix of eigenvalues (variable `lambda` in code).
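
In R, this step might look like the following sketch (the names `U` and `lambda` mirror the variables mentioned above; `Sigma` is the covariance matrix from the previous sketch):

```r
eig    <- eigen(Sigma, symmetric = TRUE)   # eigendecomposition of Sigma
U      <- eig$vectors                      # columns are eigenvectors
lambda <- eig$values                       # eigenvalues, in decreasing order
# sanity check: U Lambda U^T reconstructs Sigma
all.equal(U %*% diag(lambda) %*% t(U), Sigma)
```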

Step 2: Constructing the Whitening Matrix

The goal of whitening is to find a matrix $W$ such that the transformed data $Z = X W^\top$ satisfies $\text{Cov}(Z) = I$.

In our implementation, the whitening matrix $W_{\text{PCA}}$ is constructed as:

\[W_{\text{PCA}} = \Lambda^{-1/2} U^\top\]
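
Substituting $W_{\text{PCA}}$ into $\text{Cov}(Z)$ with $Z = X W_{\text{PCA}}^\top$ confirms the whitening property, since $U^\top U = I$:

\[\text{Cov}(Z) = W_{\text{PCA}} \, \Sigma \, W_{\text{PCA}}^\top = \Lambda^{-1/2} U^\top \left(U \Lambda U^\top\right) U \Lambda^{-1/2} = \Lambda^{-1/2} \Lambda \Lambda^{-1/2} = I\]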

R Code Correspondence:

```r
# W_PCA = Lambda^{-1/2} %*% t(U)
# tcrossprod(A, B) computes A %*% t(B); here k is the number of retained eigenvalues
W <- tcrossprod(diag(1 / sqrt(lambda), nrow = k, ncol = k), U)
```
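
Putting the steps together, a self-contained sketch (synthetic data, with `k = p` so that all components are retained) can verify that the whitened data has identity covariance:

```r
set.seed(1)
n <- 500; p <- 6; k <- p                     # illustrative sizes; keep all components
A <- matrix(runif(p * p), p, p)              # mixing matrix to induce correlations
X <- scale(matrix(rnorm(n * p), n, p) %*% A,
           center = TRUE, scale = FALSE)     # centered, correlated data
Sigma  <- crossprod(X) / (n - 1)
eig    <- eigen(Sigma, symmetric = TRUE)
U      <- eig$vectors
lambda <- eig$values
W      <- tcrossprod(diag(1 / sqrt(lambda), nrow = k, ncol = k), U)
Z      <- X %*% t(W)                         # whitened data, Z = X W^T
round(cov(Z), 10)                            # numerically the identity matrix
```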