What CORAL, SA, TCA, and OT Actually Align

Domain adaptation sounds broad.

In practice, each method aligns something specific.

If that object is where the mismatch actually lies, the method can help. If it is not, the method can move the data without improving the classifier.

So I like to ask one question first:

What exactly is being aligned?

The Setup

There is a source session and a target session.

The source has features and labels:

\[(X_S, y_S).\]

The target has features:

\[X_T.\]

In unsupervised domain adaptation, target labels are not used for fitting the adaptation map.

The goal is to train on source labels and predict target trials.

The problem is that \(X_S\) and \(X_T\) may not follow the same distribution.

CORAL Aligns Covariance

CORAL means correlation alignment.

The main idea is direct:

  1. whiten the source features,
  2. recolor them with the target covariance.

The source covariance is pushed toward the target covariance.

A simple reading is:

source shape -> target shape

CORAL is useful when the main difference is second-order structure: scale, spread, and feature correlation.

For EEG features, that often means session-level covariance shift.

CORAL does not know class labels. It does not know which samples are left-hand or right-hand trials. It aligns the global feature cloud.
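The whiten-then-recolor steps can be sketched in NumPy. This is a minimal sketch under stated assumptions, not a reference implementation: the function names, the `eps` ridge term that keeps the covariances invertible, and the final shift to the target mean are all choices made here for illustration.

```python
import numpy as np

def _sym_matrix_power(C, power):
    # Power of a symmetric positive-definite matrix via its eigendecomposition.
    vals, vecs = np.linalg.eigh(C)
    return (vecs * vals**power) @ vecs.T

def coral_fit_transform(Xs, Xt, eps=1e-3):
    """Whiten source features, then recolor them with the target covariance."""
    d = Xs.shape[1]
    # eps * I is a small ridge term (an assumption) so both matrices are invertible.
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(d)
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(d)
    Xs_centered = Xs - Xs.mean(axis=0)
    whitened = Xs_centered @ _sym_matrix_power(Cs, -0.5)   # step 1: whiten
    recolored = whitened @ _sym_matrix_power(Ct, 0.5)      # step 2: recolor
    # Moving onto the target mean is an extra step beyond covariance alignment.
    return recolored + Xt.mean(axis=0)
```

Note that no labels appear anywhere in the function, which matches the point above: the transform only sees the global feature cloud. A classifier would then be trained on the returned source features with the original source labels.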

SA Aligns Subspaces

SA means subspace alignment.

First, compute a low-dimensional basis for the source. Then compute a low-dimensional basis for the target. These are often PCA subspaces.

Call them:

\[P_S,\quad P_T.\]

SA learns a map from one basis to the other:

\[M = P_S^\top P_T.\]

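Source features are projected onto \(P_S\) and then mapped through \(M\). Target features are projected onto \(P_T\). The source representation is moved toward the target subspace.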

The simple reading is:

source axes -> target axes

SA is useful when source and target share a low-dimensional structure but that structure is rotated between sessions.

It does not align every sample. It aligns the main directions of variation.
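The subspace alignment steps can be sketched as follows. This is a minimal sketch assuming PCA bases computed by SVD on mean-centered data; the function names and the subspace size `k` are illustrative choices, and `k` would normally be tuned.

```python
import numpy as np

def pca_basis(X, k):
    # Top-k principal directions as columns (shape: n_features x k).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def subspace_alignment(Xs, Xt, k):
    """Return aligned source coordinates, target coordinates, and the map M."""
    Ps, Pt = pca_basis(Xs, k), pca_basis(Xt, k)
    M = Ps.T @ Pt                              # k x k map between the two bases
    Zs = (Xs - Xs.mean(axis=0)) @ Ps @ M       # source: project, then align
    Zt = (Xt - Xt.mean(axis=0)) @ Pt           # target: project only
    return Zs, Zt, M
```

When the two sessions have identical subspaces, `M` is the identity and nothing moves. The further `M` is from the identity, the more the main axes differ between sessions.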

TCA Aligns a Latent Space

TCA means transfer component analysis.

It builds a new feature space where source and target are closer under a distribution distance. The common choice is maximum mean discrepancy (MMD).
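For reference, the empirical squared maximum mean discrepancy (MMD) is the distance between the two mean embeddings under a feature map \(\phi\) induced by the chosen kernel. The sample counts \(n_S\) and \(n_T\), the source samples \(x_i\), and the target samples \(z_j\) are notation introduced here:

\[\mathrm{MMD}^2(X_S, X_T) = \left\| \frac{1}{n_S} \sum_{i=1}^{n_S} \phi(x_i) - \frac{1}{n_T} \sum_{j=1}^{n_T} \phi(z_j) \right\|_{\mathcal{H}}^2.\]

With a linear kernel, \(\phi\) is the identity and this reduces to the squared distance between the source and target feature means.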

The simple reading is:

find a shared space where source and target mismatch is smaller

TCA is more flexible than plain covariance alignment. It can use kernels. It can make nonlinear mismatch easier to handle.

But the choice of kernel, dimension, and regularization matters. If the latent space removes class information while reducing source-target mismatch, the classifier loses accuracy.

That is the tradeoff:

less domain mismatch
vs
enough class structure
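A small sketch can make that tradeoff concrete. The version below assumes a linear kernel to stay short, so it does not show the nonlinear case. The names `tca_linear`, `n_components`, and `mu` are illustrative, and `mu` is the regularization weight that trades mismatch reduction against keeping the projection simple.

```python
import numpy as np

def tca_linear(Xs, Xt, n_components=2, mu=1.0):
    """Linear-kernel TCA sketch: returns embedded source and target samples."""
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    X = np.vstack([Xs, Xt])
    K = X @ X.T                                   # linear kernel over all samples
    # MMD coefficients: +1/ns for source rows, -1/nt for target rows.
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    L = np.outer(e, e)
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    # Leading eigenvectors of (K L K + mu I)^{-1} K H K keep variance
    # while penalizing the source-target mean gap in the new space.
    A = K @ L @ K + mu * np.eye(n)
    B = K @ H @ K
    vals, vecs = np.linalg.eig(np.linalg.solve(A, B))
    top = np.argsort(-vals.real)[:n_components]
    W = vecs[:, top].real
    Z = K @ W                                     # embed all samples
    return Z[:ns], Z[ns:]
```

Nothing in this objective looks at the labels, so whether class structure survives has to be checked separately after embedding.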

OT Aligns Samples by Transport

OT means optimal transport.

It treats source and target samples as two piles of mass. It finds a plan for moving source mass toward target mass with low cost.

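A transport plan \(\gamma\) solves a problem like:

\[\min_{\gamma \ge 0} \sum_{i,j} \gamma_{ij}\, c(x_i, z_j) \quad \text{subject to} \quad \sum_j \gamma_{ij} = a_i,\quad \sum_i \gamma_{ij} = b_j.\]

Here \(x_i\) is a source sample, \(z_j\) is a target sample, and \(c\) is a cost such as squared Euclidean distance. The weights \(a_i\) and \(b_j\) are the mass placed on each source and target sample, often uniform.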

The simple reading is:

move source samples toward target samples

OT is useful when sample-level geometry matters. It can handle shifts that are more local than a single covariance transform.

It can also be sensitive to noise, sample size, and cost choice.
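The transport idea can be sketched without a dedicated library. The version below swaps in an entropy-regularized variant solved by Sinkhorn iterations instead of the exact linear program, assumes uniform sample weights and a squared Euclidean cost, and moves each source sample to the weighted average of the target samples it is coupled with. The function names, `reg`, and `n_iter` are illustrative.

```python
import numpy as np

def sinkhorn_plan(Xs, Xt, reg=0.1, n_iter=500):
    """Entropy-regularized transport plan between uniform-weight samples."""
    ns, nt = len(Xs), len(Xt)
    a = np.full(ns, 1.0 / ns)                     # source mass per sample
    b = np.full(nt, 1.0 / nt)                     # target mass per sample
    # Squared Euclidean cost between every source and target sample.
    C = ((Xs[:, None, :] - Xt[None, :, :]) ** 2).sum(axis=2)
    C = C / C.max()                               # rescale so exp() stays stable
    G = np.exp(-C / reg)                          # Gibbs kernel
    u = np.ones(ns)
    for _ in range(n_iter):
        v = b / (G.T @ u)                         # match target marginals
        u = a / (G @ v)                           # match source marginals
    return u[:, None] * G * v[None, :]

def barycentric_map(gamma, Xt):
    # Each source sample moves to the plan-weighted mean of target samples.
    return (gamma @ Xt) / gamma.sum(axis=1, keepdims=True)
```

A smaller `reg` gives a sharper plan that is closer to exact transport but converges more slowly, which is one place where the sensitivity to noise and sample size shows up in practice.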

The Four Methods Side by Side

| Method | What it aligns | Good when |
| --- | --- | --- |
| CORAL | covariance | source and target differ in global feature spread |
| SA | subspace basis | sessions share a low-dimensional structure with rotated axes |
| TCA | latent distribution | a new shared space can reduce mismatch and keep labels useful |
| OT | sample mass | local source-target geometry matters |

This table is more useful than ranking the methods.

There is no single best alignment object for all EEG sessions.

What to Check Before Trusting the Result

After adaptation, look at more than accuracy.

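Check whether the source-target distance (mean gap, covariance gap, or MMD) actually decreased.

Check whether class separation survived.

Check whether the adapted source became too compressed, with its variance collapsing toward a point.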

Check whether the target representation moved into a region covered by source labels.

A smaller source-target distance is good only if the classifier still sees the task.
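A few of these checks can be sketched as small helper functions. The specific metrics are assumptions chosen for simplicity: a mean gap and a covariance gap as the distance, a between-class over within-class scatter ratio for class separation, and a variance ratio for compression. All function names are illustrative.

```python
import numpy as np

def mean_gap(A, B):
    # Distance between the two feature means.
    return np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))

def cov_gap(A, B):
    # Frobenius distance between the two covariance matrices.
    return np.linalg.norm(np.cov(A, rowvar=False) - np.cov(B, rowvar=False))

def fisher_ratio(X, y):
    # Between-class scatter divided by within-class scatter (higher = more separable).
    overall = X.mean(axis=0)
    between, within = 0.0, 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * np.sum((Xc.mean(axis=0) - overall) ** 2)
        within += np.sum((Xc - Xc.mean(axis=0)) ** 2)
    return between / within

def spread_ratio(X_adapted, X_original):
    # Total variance after adaptation relative to before; values near 0 signal collapse.
    return np.trace(np.cov(X_adapted, rowvar=False)) / np.trace(np.cov(X_original, rowvar=False))
```

Comparing each number before and after adaptation is more informative than any single value: a smaller `mean_gap` paired with a much lower `fisher_ratio` on the source is the warning sign described above.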

The Practical Order

I usually read the methods from simple to flexible.

CORAL asks: are the covariances mismatched?

SA asks: are the main subspaces rotated?

TCA asks: can a shared latent space reduce mismatch?

OT asks: can source samples be transported toward target samples?

This order gives a clean debugging path.

If CORAL works, the problem may be mostly covariance shift.

If TCA or OT helps more, the mismatch may be more than a global covariance change.

If all methods fail, the feature representation itself may be the problem.

Domain adaptation is not one operation. It is a choice about what kind of shift you believe is blocking transfer.