When Domain Adaptation Helps and When It Fails
Published:
Domain adaptation changes the data.
That is all it promises.
It can make transfer better. It can also make transfer worse.
The useful question is not “which method is best?” The useful question is:
What problem is adaptation trying to fix?
Start With the Failure
A classifier trained on one EEG session often fails on another session.
The feature distribution moved.
The covariance changed.
The subject state changed.
The cap moved.
The task may look the same, but the data no longer sits in the same place.
Domain adaptation tries to reduce that mismatch.
When It Helps
Adaptation helps when the source and target still share the same task structure.
The left-hand and right-hand trials must remain separable in a related way.
The problem is then a shift around a shared structure.
Common cases:
- the mean moved,
- the covariance changed,
- the feature scale changed,
- the main subspace rotated,
- source and target support still overlap after alignment.
In this setting, adaptation can move the source representation closer to the target without destroying the class boundary.
That is the good case.
When It Fails
Adaptation fails when it aligns the wrong thing.
A method may reduce global distance while mixing the classes.
It may match covariances but erase the direction that separates labels.
It may transport source samples toward target samples that belong to another class.
It may overfit a tiny target calibration set.
It may trust a bad source because that source is close by one metric and bad by another.
This is negative transfer.
The adapted pipeline becomes worse than the no-adaptation baseline.
The Three Checks
I use three checks before reading an adaptation result.
First: did source-target distance decrease?
Second: did target accuracy improve?
Third: did the adapted source still keep class structure?
The first check alone is not enough.
A method can win the distance metric and lose the classification task.
The second check alone is also incomplete.
Accuracy tells the final score. It does not explain the mechanism.
The third check tells whether the method kept the reason the classifier works.
A Simple Decision Table
| Observation | Reading | Decision |
|---|---|---|
| distance high, accuracy low | shift is blocking transfer | try adaptation or source selection |
| distance decreases, accuracy improves | adaptation is fixing relevant shift | keep method |
| distance decreases, accuracy drops | alignment hurt class structure | reject method |
| distance stays high, accuracy improves | metric missed useful structure | inspect feature space |
| all methods fail | feature representation may be wrong | revisit preprocessing and features |
This table is simple. That is why it is useful.
Why Source Choice Matters
A bad source can ruin a good adaptation method.
If the source is far from the target, the method may spend most of its effort forcing two unrelated distributions together.
If several sources are available, the first decision is not the adaptation method. The first decision is the source set.
Use all sources when they are broadly compatible.
Weight sources when they differ but still carry useful signal.
Select sources when only a subset is close.
Use a bridge strategy when the target is far and a middle session gives a better proxy.
The Target Data Question
Adaptation can use target features.
That is allowed in many unsupervised settings.
But the target data must match the deployment story.
If the future target session is unavailable at training time, then adaptation cannot use it.
If a short unlabeled calibration block is available, then adaptation can use that block.
If target labels are used for tuning, then the setting is no longer unsupervised.
The experiment should say which target information is allowed.
What I Look For in Results
A good result has a clean pattern.
The no-adaptation baseline is visible.
The adapted result is visible.
The source-target distances before and after adaptation are visible.
The source roles are visible when multiple sources are used.
The failure cases are visible.
Hiding failures makes the method harder to trust. Showing failures makes the method easier to use.
The Practical Reading
Domain adaptation helps when it fixes the shift that actually hurts the classifier.
It fails when it fixes a different shift, or when it breaks the class structure while making distributions look closer.
So the workflow should be:
build safe features
-> measure shift
-> choose sources
-> apply adaptation
-> compare against no adaptation
-> inspect failures
Adaptation is not the first step. It is the middle step.
The first step is a feature space where the task is still visible.
The last step is a decision: adapt, select another source, collect calibration data, or retrain.
