What CORAL, SA, TCA, and OT Actually Align
Domain adaptation sounds broad.
In practice, each method aligns something specific.
If that object is the right problem, the method can help. If that object is not the problem, the method can move the data without improving the classifier.
So I like to ask one question first:
What exactly is being aligned?
The Setup
There is a source session and a target session.
The source has features and labels:
\[(X_S, y_S).\]
The target has features:
\[X_T.\]
In unsupervised domain adaptation, target labels are not used for fitting the adaptation map.
The goal is to train on source labels and predict target trials.
The problem is that \(X_S\) and \(X_T\) may not follow the same distribution.
CORAL Aligns Covariance
CORAL means correlation alignment.
The main idea is direct:
- whiten the source features,
- recolor them with the target covariance.
The source covariance is pushed toward the target covariance.
A simple reading is:
source shape -> target shape
CORAL is useful when the main difference is second-order structure: scale, spread, and feature correlation.
For EEG features, that often means session-level covariance shift.
CORAL does not know class labels. It does not know which samples are left-hand or right-hand trials. It aligns the global feature cloud.
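The whiten-then-recolor idea can be sketched in a few lines of NumPy. This is a minimal sketch, not a reference implementation: the function names, the `eps` ridge term, and the choice to leave the means untouched are my assumptions for illustration.

```python
import numpy as np


def _sym_matrix_power(C, power):
    """Raise a symmetric positive-definite matrix to a real power."""
    eigvals, eigvecs = np.linalg.eigh(C)
    return (eigvecs * eigvals**power) @ eigvecs.T


def coral_fit_transform(X_source, X_target, eps=1e-3):
    """Map source features so their covariance approximates the target's.

    X_source, X_target: arrays of shape (n_trials, n_features).
    eps: small ridge added to both covariances (an assumption, for stability).
    """
    d = X_source.shape[1]
    C_s = np.cov(X_source, rowvar=False) + eps * np.eye(d)
    C_t = np.cov(X_target, rowvar=False) + eps * np.eye(d)
    # Whiten with C_s^(-1/2), then recolor with C_t^(1/2).
    A = _sym_matrix_power(C_s, -0.5) @ _sym_matrix_power(C_t, 0.5)
    return X_source @ A
```

A classifier would then be trained on the transformed source features and applied to the untouched target features. Note that this sketch only changes second-order structure; a mean offset between sessions would need separate handling.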
SA Aligns Subspaces
SA means subspace alignment.
First, compute a low-dimensional basis for the source. Then compute a low-dimensional basis for the target. These are often PCA subspaces.
Call them:
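\[P_S,\quad P_T.\]
The columns of each matrix are the basis vectors. SA learns a map from one basis to the other:
\[M = P_S^\top P_T.\]
Source features are projected with \(P_S M\) and target features with \(P_T\), so the source representation is moved toward the target subspace.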
The simple reading is:
source axes -> target axes
SA is useful when source and target share a low-dimensional structure but that structure is rotated between sessions.
It does not align every sample. It aligns the main directions of variation.
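Here is a small sketch of that recipe with PCA bases computed by SVD. The function names, the per-domain mean centering, and the choice of `k` are assumptions for illustration; other preprocessing (for example z-scoring) is equally plausible.

```python
import numpy as np


def pca_basis(X, k):
    """Return the top-k principal directions of X as columns, shape (d, k)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return Vt[:k].T


def subspace_align(X_source, X_target, k):
    """Project both domains to k dimensions, aligning source axes to target axes."""
    P_s = pca_basis(X_source, k)
    P_t = pca_basis(X_target, k)
    M = P_s.T @ P_t  # alignment map between the two bases
    Z_source = (X_source - X_source.mean(axis=0)) @ P_s @ M
    Z_target = (X_target - X_target.mean(axis=0)) @ P_t
    return Z_source, Z_target, M
```

A useful sanity check: if the two sessions are identical, `M` is the identity and both projections coincide. The further `M` is from the identity, the more the main axes differ between sessions.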
TCA Aligns a Latent Space
TCA means transfer component analysis.
It builds a new feature space where source and target are closer under a distribution distance. The common choice is maximum mean discrepancy (MMD).
The simple reading is:
find a shared space where source and target mismatch is smaller
TCA is more flexible than plain covariance alignment. It can use kernels. It can make nonlinear mismatch easier to handle.
But the choice of kernel, dimension, and regularization matters. If the latent space removes class information while reducing source-target mismatch, the classifier loses.
That is the tradeoff:
less domain mismatch
vs
enough class structure
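To make the tradeoff concrete, here is a rough sketch of TCA with a linear kernel. It follows the usual formulation (minimize MMD in the latent space while keeping variance, with a regularizer `mu`), but the function name, the linear kernel, and the default values are assumptions on my part, and a real pipeline would likely tune all three.

```python
import numpy as np


def tca_linear(X_source, X_target, n_components=2, mu=1.0):
    """Embed source and target in a shared space with smaller MMD.

    Uses a linear kernel for simplicity (an assumption); any kernel
    matrix K over the stacked samples could be substituted.
    """
    n_s, n_t = len(X_source), len(X_target)
    n = n_s + n_t
    X = np.vstack([X_source, X_target])
    K = X @ X.T

    # L encodes the MMD between domains: trace(K L) is the squared mean gap.
    e = np.concatenate([np.full(n_s, 1.0 / n_s), np.full(n_t, -1.0 / n_t)])
    L = np.outer(e, e)
    # H is the centering matrix, so K H K measures variance to preserve.
    H = np.eye(n) - np.ones((n, n)) / n

    A = K @ L @ K + mu * np.eye(n)
    B = K @ H @ K
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(A, B))
    # Keep the directions with the largest eigenvalues; drop tiny imaginary parts.
    order = np.argsort(-eigvals.real)[:n_components]
    W = eigvecs[:, order].real

    Z = K @ W
    return Z[:n_s], Z[n_s:]
```

Nothing in this objective looks at `y_S`, which is exactly why class structure can be lost: the embedding is free to discard a discriminative direction if doing so shrinks the domain gap.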
OT Aligns Samples by Transport
OT means optimal transport.
It treats source and target samples as two piles of mass. It finds a plan for moving source mass toward target mass with low cost.
A transport plan \(\gamma\) solves a problem like:
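\[\min_{\gamma} \sum_{i,j} \gamma_{ij} c(x_i, z_j).\]
Here \(x_i\) is a source sample, \(z_j\) is a target sample, and \(c\) is a cost. The plan is constrained: its entries are nonnegative, each row sums to the mass of its source sample, and each column sums to the mass of its target sample. Without those constraints, moving nothing would cost nothing.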
The simple reading is:
move source samples toward target samples
OT is useful when sample-level geometry matters. It can handle shifts that are more local than a single covariance transform.
It can also be sensitive to noise, sample size, and cost choice.
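A small sketch of the transport idea follows. To stay in plain NumPy it uses entropy-regularized OT solved by Sinkhorn iterations, which is a swapped-in approximation of the exact problem above, followed by a barycentric mapping of each source sample. The function names, uniform sample weights, squared Euclidean cost, and the `reg` and `n_iter` defaults are all assumptions for illustration.

```python
import numpy as np


def sinkhorn_plan(X_source, X_target, reg=0.1, n_iter=500):
    """Approximate transport plan via Sinkhorn iterations (entropic OT)."""
    n_s, n_t = len(X_source), len(X_target)
    a = np.full(n_s, 1.0 / n_s)  # uniform source mass (assumption)
    b = np.full(n_t, 1.0 / n_t)  # uniform target mass (assumption)

    # Squared Euclidean cost, scaled to [0, 1] so reg has a stable meaning.
    diff = X_source[:, None, :] - X_target[None, :, :]
    C = (diff**2).sum(axis=2)
    C = C / C.max()

    K = np.exp(-C / reg)
    u = np.ones(n_s)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]


def barycentric_map(gamma, X_target):
    """Move each source sample to the weighted mean of its target matches."""
    weights = gamma / gamma.sum(axis=1, keepdims=True)
    return weights @ X_target
```

A smaller `reg` gives a sharper plan that is closer to exact OT but more sensitive to noise and numerically harder; a larger `reg` blurs the plan and pulls every source sample toward the target mean. For real work, a dedicated OT library is a safer choice than this loop.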
The Four Methods Side by Side
| Method | What it aligns | Good when |
|---|---|---|
| CORAL | covariance | source and target differ in global feature spread |
| SA | subspace basis | sessions share a low-dimensional structure with rotated axes |
| TCA | latent distribution | a new shared space can reduce mismatch and keep labels useful |
| OT | sample mass | local source-target geometry matters |
This table is more useful than ranking the methods.
There is no single best alignment object for all EEG sessions.
What to Check Before Trusting the Result
After adaptation, look at more than accuracy.
Check whether the source-target distance decreased.
Check whether class separation survived.
Check whether the adapted source became too compressed.
Check whether the target representation ends up in a region covered by labeled source samples.
A smaller source-target distance is good only if the classifier still sees the task.
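Some of these checks can be turned into numbers. The helpers below are one possible set of diagnostics, not a standard: the names are hypothetical, the domain distance is just the squared gap between means (the linear-kernel MMD), and class separation is a simple between-class over within-class scatter ratio on the labeled source.

```python
import numpy as np


def mean_gap(Z_source, Z_target):
    """Squared distance between the two domain means."""
    return float(np.sum((Z_source.mean(axis=0) - Z_target.mean(axis=0)) ** 2))


def fisher_ratio(Z, y):
    """Between-class scatter divided by within-class scatter."""
    overall_mean = Z.mean(axis=0)
    between, within = 0.0, 0.0
    for label in np.unique(y):
        Z_c = Z[y == label]
        class_mean = Z_c.mean(axis=0)
        between += len(Z_c) * np.sum((class_mean - overall_mean) ** 2)
        within += np.sum((Z_c - class_mean) ** 2)
    return float(between / within)


def total_variance(Z):
    """Sum of per-feature variances; a sharp drop suggests compression."""
    return float(np.var(Z, axis=0).sum())
```

Computing all three before and after adaptation gives a quick picture: `mean_gap` should go down, `fisher_ratio` on the source should not collapse, and `total_variance` of the adapted source should stay in the same range as the target's.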
The Practical Order
I usually read the methods from simple to flexible.
CORAL asks: are the covariances mismatched?
SA asks: are the main subspaces rotated?
TCA asks: can a shared latent space reduce mismatch?
OT asks: can source samples be transported toward target samples?
This order gives a clean debugging path.
If CORAL works, the problem may be mostly covariance shift.
If TCA or OT helps more, the mismatch may be more than a global covariance change.
If all methods fail, the feature representation itself may be the problem.
Domain adaptation is not one operation. It is a choice about what kind of shift you believe is blocking transfer.
