Multi-Source Domain Adaptation for EEG Sessions
One old EEG session is useful.
Several old sessions look better.
That is the temptation in multi-source domain adaptation: use all the data.
The problem is that EEG sessions are not equal. Some are close to the target. Some are far. Some carry useful variation. Some pull the classifier in the wrong direction.
Multi-source adaptation is the problem of deciding which source sessions to use and how much influence each one gets.
The Basic Setup
Suppose a subject has several previous sessions:
\[S_1, S_2, \ldots, S_m.\]
There is a new target session:
\[T.\]
The source sessions have labels. The target session may have no labels or only a small calibration set.
The goal is simple:
Use the old sessions to classify the new session.
The hard part is choosing which old sessions to trust.
Why Merging All Sources Is Not Enough
The first baseline is to merge all sources.
S1 + S2 + ... + Sm -> one training set
Then fit the feature extractor, the domain adaptation step, and the classifier on the merged set.
This is attractive because it is simple. It uses all labels. It reduces variance.
But it also assumes every source is useful.
That assumption fails often in EEG.
A far source can change the decision boundary. It can dominate the covariance. It can make adaptation chase the wrong target geometry.
More data helps only when the added data points in the right direction.
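As a minimal sketch of the merge baseline, assume each session is stored as a feature matrix and a label vector. The function names and the data layout are my assumptions for illustration, not a fixed API.

```python
import numpy as np

def merge_sources(sources):
    """Stack all source sessions into one training set.

    `sources` is a list of (X, y) pairs, one per session, where X has
    shape (n_trials, n_features) and y has shape (n_trials,).
    This layout is an assumption for illustration.
    """
    X = np.concatenate([X_s for X_s, _ in sources], axis=0)
    y = np.concatenate([y_s for _, y_s in sources], axis=0)
    # Keep the session index of every trial so the pool stays inspectable.
    session_id = np.concatenate(
        [np.full(len(y_s), i) for i, (_, y_s) in enumerate(sources)]
    )
    return X, y, session_id

def session_share(session_id):
    """Fraction of the merged pool contributed by each session."""
    counts = np.bincount(session_id)
    return counts / counts.sum()
```

The share is worth checking before training. A session with many trials takes a large part of the pool whether or not it is close to the target.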
Distance Gives a First Filter
A source session can be compared with the target session in feature space.
Common choices are maximum mean discrepancy (MMD), Wasserstein distance, energy distance, or a distance between session covariance matrices.
The distance does not give the full answer. It gives a first reading.
near source -> likely useful
far source -> inspect before trusting
This turns multi-source adaptation into a session-role problem.
Each old session can be treated as near, far, bridge, selected, downweighted, or ignored.
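One way to get that first reading is a squared MMD estimate with an RBF kernel. The sketch below uses the biased estimator. The default bandwidth and the near/far threshold are assumptions that have to be chosen per feature space.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point error.
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def mmd2(X_s, X_t, gamma=None):
    """Biased estimate of squared MMD between source and target features."""
    if gamma is None:
        gamma = 1.0 / X_s.shape[1]  # assumed default bandwidth
    k_ss = rbf_kernel(X_s, X_s, gamma).mean()
    k_tt = rbf_kernel(X_t, X_t, gamma).mean()
    k_st = rbf_kernel(X_s, X_t, gamma).mean()
    return float(k_ss + k_tt - 2.0 * k_st)

def first_reading(distances, threshold):
    """Label each source 'near' or 'far' with a hypothetical threshold."""
    return ["near" if d <= threshold else "far" for d in distances]
```

The labels are only a starting point. A far source still deserves a look before it is dropped.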
Four Useful Strategies
In my cross-session runners, I often compare four families.
MAP: Merge & Adapt
MAP chooses a feature, adaptation method, and classifier setup, then merges all source sessions. It is the clean all-source baseline.
It answers:
What happens if every source participates equally?
DWP: Distance-Weighted Pooling
DWP keeps all sources but gives larger weight to closer sessions.
It answers:
Can we keep the full pool but reduce the influence of far sessions?
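A minimal sketch of the weighting step, assuming a softmax over negative distances. The softmax form and the `temperature` parameter are my assumptions here. Other monotone decreasing functions of distance would also fit the idea.

```python
import numpy as np

def distance_weights(distances, temperature=1.0):
    """Turn source-target distances into session weights that sum to 1.

    Closer sessions get larger weights. A small temperature makes the
    weighting sharper, a large one makes it closer to uniform.
    """
    d = np.asarray(distances, dtype=float)
    logits = -d / temperature
    logits = logits - logits.max()  # numerical stability
    w = np.exp(logits)
    return w / w.sum()

def per_trial_weights(session_weights, n_trials):
    """Spread each session's weight evenly over its trials."""
    return np.concatenate(
        [np.full(n, w / n) for w, n in zip(session_weights, n_trials)]
    )
```

The per-trial weights can then be passed to any classifier that accepts sample weights. Spreading the session weight over its trials keeps a long session from regaining influence through trial count alone.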
MMP: Minimum-distance Multi-source
MMP uses distance and uncertainty to select a near set. Then it combines that near set either by merging before adaptation or by a mixture-of-experts style vote.
It answers:
Can we trust a smaller near set more than the full pool?
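A sketch of the two pieces, selection and voting. It uses distance only and leaves the uncertainty term out, since its exact form is not specified here. The function names and the choice of `k` are assumptions.

```python
import numpy as np

def select_near(distances, k=2):
    """Indices of the k sources closest to the target."""
    return np.argsort(distances)[:k]

def moe_vote(probas, weights):
    """Weighted soft vote over per-source classifiers.

    `probas` is a list of (n_trials, n_classes) probability arrays,
    one per selected source. `weights` has one entry per source.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    stacked = np.stack(probas, axis=0)        # (n_experts, n_trials, n_classes)
    mixed = np.tensordot(w, stacked, axes=1)  # (n_trials, n_classes)
    return mixed.argmax(axis=1), mixed
```

Merging the near set before adaptation gives one model trained on more trials. Voting keeps one model per session, so a single bad expert can be spotted and removed.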
BDP: Bridge-Domain Proxy
BDP separates sessions into bridge and far groups. It uses the bridge/far structure to tune choices through a proxy task before final target prediction.
It answers:
Can a middle session help tune transfer across a larger shift?
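One possible reading of the proxy task, as a sketch: treat the labeled bridge sessions as a stand-in target, transfer from the far sessions to the bridge, and keep the setting that scores best there. The split rule, the function names, and the `proxy_score` callable are all assumptions for illustration.

```python
import numpy as np

def split_bridge_far(distances, n_bridge=1):
    """Split source indices into bridge (closest to target) and far."""
    order = np.argsort(distances)
    return order[:n_bridge].tolist(), order[n_bridge:].tolist()

def tune_on_proxy(candidates, proxy_score):
    """Pick the candidate setting with the best proxy score.

    `proxy_score(candidate)` is a user-supplied callable. In this sketch
    it is assumed to train on the far sessions with that setting and
    return accuracy on the labeled bridge sessions.
    """
    scores = {c: proxy_score(c) for c in candidates}
    best = max(scores, key=scores.get)
    return best, scores
```

The bridge has labels, so the proxy score is measurable. The target often has none, which is why the tuning happens one step away from it.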
The Role View
Accuracy is not the whole output.
The role assigned to each session is part of the output too.
A good benchmark should tell you:
- which sessions were used,
- which sessions were downweighted,
- which sessions were treated as bridge,
- which sessions were ignored,
- which sessions caused degradation or fallback.
This makes the result readable.
Without roles, multi-source adaptation becomes a score table with no mechanism.
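A role table can be as small as one record per session. The sketch below is one hypothetical layout. The field names and the role vocabulary are assumptions drawn from the list above.

```python
from dataclasses import dataclass

ROLES = ("selected", "downweighted", "bridge", "ignored", "fallback")

@dataclass
class SessionRole:
    session: str
    role: str
    distance: float  # source-to-target distance
    weight: float    # influence on the final model, 0.0 if unused

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role!r}")

def format_roles(rows):
    """Plain-text role table, nearest session first."""
    lines = [f"{'session':<10}{'role':<14}{'distance':>10}{'weight':>8}"]
    for r in sorted(rows, key=lambda r: r.distance):
        lines.append(
            f"{r.session:<10}{r.role:<14}{r.distance:>10.3f}{r.weight:>8.2f}"
        )
    return "\n".join(lines)
```

Printing this table next to the accuracy makes it clear which sessions produced the score.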
Why Two Good Sources Can Beat Five Sources
Imagine five old sessions.
Two are close to the target. Three are far.
Merging all five may blur the target structure. A weighted method may help by shrinking the far sessions. A selection method may help by using only the two close sessions.
This is not data waste. It is source control.
The target session needs relevant information, not every available trial.
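A toy calculation makes the point concrete. Assume each session is summarized by one number: the offset of its feature mean from the target mean. The offsets below are made up for illustration, and all sessions are assumed to have the same number of trials.

```python
import numpy as np

# Hypothetical offsets from the target: two near sessions, three far ones.
offsets = np.array([0.1, 0.2, 2.0, 3.0, 4.0])

def pooled_offset(offsets, weights=None):
    """Offset of the pooled training mean from the target mean."""
    if weights is None:
        weights = np.ones_like(offsets)
    weights = np.asarray(weights, dtype=float)
    return float((weights * offsets).sum() / weights.sum())

merged = pooled_offset(offsets)                       # all five, equal weight
weighted = pooled_offset(offsets, np.exp(-offsets))   # shrink the far sessions
selected = pooled_offset(offsets[:2])                 # only the two near sessions
```

In this toy case the selected pool sits closest to the target, the weighted pool is in between, and the merged pool is farthest. Real sessions differ in more than a mean offset, so this is a picture of the mechanism, not a result.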
What Multi-Source Methods Should Report
A cross-session benchmark should report at least four things.
First: target accuracy.
Second: distance from each source to target.
Third: source weights or source roles.
Fourth: runtime.
Runtime matters because some methods need bootstrapping, proxy tuning, or repeated adaptation. A method that gains one point of accuracy at ten times the cost may not be the right deployment choice.
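Runtime is easy to record alongside accuracy. A minimal sketch, assuming each method is wrapped in a callable that returns target accuracy. The helper names are hypothetical.

```python
import time

def timed_run(name, run):
    """Run one method and record its accuracy and wall-clock time.

    `run` is a zero-argument callable that returns target accuracy.
    """
    start = time.perf_counter()
    accuracy = run()
    seconds = time.perf_counter() - start
    return {"method": name, "accuracy": accuracy, "seconds": seconds}

def gain_per_cost(result, baseline):
    """Accuracy gain over the baseline and runtime ratio to the baseline."""
    if baseline["seconds"] > 0:
        cost_ratio = result["seconds"] / baseline["seconds"]
    else:
        cost_ratio = float("inf")
    return {
        "method": result["method"],
        "accuracy_gain": result["accuracy"] - baseline["accuracy"],
        "cost_ratio": cost_ratio,
    }
```

Using the merge baseline as the reference puts every other method on the same two axes: how much it gains and how much longer it takes.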
The Practical Reading
Multi-source EEG adaptation is also a trust problem.
Which old session should influence the new session?
How much should it influence it?
Should it be used as a direct source, a bridge, or not at all?
The useful pipeline is:
measure source-target shift
-> assign source roles
-> adapt with those roles
-> evaluate accuracy and mechanism
Once source roles are visible, adaptation stops being a black box.
It becomes a set of decisions that can be inspected.
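The four pipeline steps can be sketched end to end with deliberately simple stand-ins. Everything below is an assumption for illustration: the distance is the norm of the mean difference, the roles come from one threshold, the adaptation is per-session mean centering, and the classifier is nearest centroid. None of these is one of the four families described earlier.

```python
import numpy as np

def mean_shift(X_s, X_t):
    """Distance between the source and target feature means."""
    return float(np.linalg.norm(X_s.mean(axis=0) - X_t.mean(axis=0)))

def center(X):
    """Per-session mean centering, a very simple alignment step."""
    return X - X.mean(axis=0)

def fit_centroids(X, y):
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])

def predict(classes, centroids, X):
    sq_dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[sq_dist.argmin(axis=1)]

def run_pipeline(sources, X_t, y_t, near_threshold):
    """`sources` maps a session name to an (X, y) pair."""
    # 1. Measure source-target shift.
    distances = {sid: mean_shift(X, X_t) for sid, (X, _) in sources.items()}
    # 2. Assign source roles.
    roles = {
        sid: "selected" if d <= near_threshold else "ignored"
        for sid, d in distances.items()
    }
    used = [sid for sid, role in roles.items() if role == "selected"]
    if not used:
        # Nothing is near: fall back to the single closest session.
        nearest = min(distances, key=distances.get)
        roles[nearest] = "fallback"
        used = [nearest]
    # 3. Adapt with those roles.
    X_train = np.concatenate([center(sources[sid][0]) for sid in used])
    y_train = np.concatenate([sources[sid][1] for sid in used])
    classes, centroids = fit_centroids(X_train, y_train)
    y_pred = predict(classes, centroids, center(X_t))
    # 4. Evaluate accuracy and mechanism together.
    return {
        "accuracy": float((y_pred == y_t).mean()),
        "distances": distances,
        "roles": roles,
    }
```

The return value carries the distances and roles next to the accuracy, so every score comes with the decisions that produced it. The target labels `y_t` are used only for evaluation.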
