Practical Guide: Distribution Distance Metrics for EEG Domain Adaptation

1 minute read

Published: February 18, 2026

This note summarizes three continuous-distribution distances that I repeatedly use to quantify source-target shift in EEG domain adaptation workflows.

Why these three?

In practice, I track:

Wasserstein distance for geometric transport cost.
MMD (with RBF kernel) for distribution mismatch in RKHS.
Energy distance for a distance-based discrepancy that is often stable in high dimensions.

All are interpreted as smaller is better, with 0 meaning perfect overlap.

Core Definitions

For source distribution \(\mu_S\) and target distribution \(\mu_T\):

\[W_1(\mu_S,\mu_T)=\inf_{\gamma\in\Pi(\mu_S,\mu_T)} \int \|x-y\|\,d\gamma(x,y).\]

For samples \({x_i}{i=1}^m\), \({y_j}{j=1}^n\), MMD is:

\[\mathrm{MMD}^2 =\frac{1}{m^2}\sum_{i,j} k_\sigma(x_i,x_j) +\frac{1}{n^2}\sum_{i,j} k_\sigma(y_i,y_j) -\frac{2}{mn}\sum_{i,j} k_\sigma(x_i,y_j).\]

Energy distance:

\[D_E=\sqrt{2\,\mathbb{E}\|X-Y\|-\mathbb{E}\|X-X'\|-\mathbb{E}\|Y-Y'\|}.\]

Typical Workflow

Standardize both domains in the same feature space (z-score or whitened feature space).
Compute all three metrics before adaptation.
Apply adaptation (e.g., CORAL, OT, SA, TCA, ART/PT).
Recompute all three metrics after adaptation.
Report both distance reduction and target-domain accuracy together.

Interpretation Heuristic

In my EEG pipelines (standardized feature space), Wasserstein values are often interpreted as:

0-2: light shift, transfer may already be acceptable.
2-10: moderate shift, adaptation is usually beneficial.
>10: strong shift, naive transfer often fails.

These are heuristics, not universal thresholds. The same numeric value can mean different things under different feature constructions.

Practical Takeaway

Do not trust a single metric. If one decreases while the others increase, it usually indicates a geometry mismatch (for example, covariance alignment improved but support mismatch remains). I use the three together as a compact diagnostic panel before model selection.

Share on

Twitter Facebook LinkedIn

Yiming Shen

Practical Guide: Distribution Distance Metrics for EEG Domain Adaptation

Why these three?

Core Definitions

Typical Workflow

Interpretation Heuristic

Practical Takeaway

Share on

You May Also Enjoy

Research Note: Markov-Switching State-Space Models for Neural Decoding

ICA Objective Functions in One Place: Kurtosis, Negentropy, MI, and MLE

From Matrix ICA to Tensor ICA: Architectures and Decomposition Choices

Mathematical Derivation: PCA Whitening Implementation