Kaggle Silver Medal: HMS - Harmful Brain Activity Classification
Competition Goal: To detect and classify six patterns of harmful brain activity (Seizure, LPD, GPD, LRDA, GRDA, Other) in critically ill patients using EEG recordings. This work contributes to automating neurocritical care diagnostics.
🏆 Achievement
- Rank: 98th out of 2,767 teams (Silver Medal).
- Role: Solo Competitor / Lead Data Scientist.
💡 Technical Approach
My solution treated the multi-channel EEG time series as a computer-vision problem: raw signals were converted into spectrograms and classified with modern 2D CNNs.
1. Data Preprocessing & Feature Engineering
- Spectrogram Conversion: Converted raw EEG signals (10-20 system) into Log-Mel Spectrograms to capture time-frequency features.
- Montage Engineering: Utilized “double banana” and other clinical montages to highlight spatial differences between brain hemispheres.
- Signal Cleaning: Applied bandpass filters to remove noise and power-line interference (50/60 Hz). A sketch of this preprocessing chain follows below.
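For illustration, here is a minimal sketch of that preprocessing chain: bipolar montage differencing, bandpass filtering, and log-mel conversion. The channel pairs are the standard longitudinal bipolar ("double banana") chains; the filter cutoffs, FFT size, and mel parameters are illustrative assumptions rather than the exact values from my pipeline.

```python
import numpy as np
import librosa
from scipy.signal import butter, filtfilt

FS = 200  # HMS EEG recordings are sampled at 200 Hz

# Standard longitudinal bipolar ("double banana") chains, one per region
DOUBLE_BANANA = [
    ("Fp1", "F7"), ("F7", "T3"), ("T3", "T5"), ("T5", "O1"),  # left temporal
    ("Fp1", "F3"), ("F3", "C3"), ("C3", "P3"), ("P3", "O1"),  # left parasagittal
    ("Fp2", "F8"), ("F8", "T4"), ("T4", "T6"), ("T6", "O2"),  # right temporal
    ("Fp2", "F4"), ("F4", "C4"), ("C4", "P4"), ("P4", "O2"),  # right parasagittal
]

def bandpass(x, low=0.5, high=20.0, fs=FS, order=4):
    # Butterworth bandpass; cutoff frequencies are illustrative
    b, a = butter(order, [low, high], btype="band", fs=fs)
    return filtfilt(b, a, x)

def to_log_mel(x, fs=FS, n_mels=128):
    # Log-mel spectrogram of a single montage channel; parameters are illustrative
    mel = librosa.feature.melspectrogram(
        y=x.astype(np.float32), sr=fs,
        n_fft=2048, hop_length=fs // 4, n_mels=n_mels, fmin=0.0, fmax=20.0)
    return librosa.power_to_db(mel, ref=np.max)

def eeg_to_spectrograms(eeg):
    """eeg: dict mapping electrode name -> 1D signal array (hypothetical input format)."""
    channels = [bandpass(eeg[a] - eeg[b]) for a, b in DOUBLE_BANANA]
    return np.stack([to_log_mel(c) for c in channels])  # shape: (16, n_mels, T)
```

The stacked output (one spectrogram per montage channel) is what feeds the CNN described next.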
2. Model Architecture
I employed an ensemble of 2D Convolutional Neural Networks, specifically EfficientNet (B0-B2) variants, pre-trained on ImageNet.
- Backbone: EfficientNet for high feature-extraction efficiency.
- Input: Stacked spectrograms representing different spatial montages.
- Pooling: GeM (Generalized Mean) Pooling to capture salient features across time; a sketch of this head follows below.
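Below is a minimal sketch of this architecture: a standard GeM pooling module attached to an EfficientNet-B0 feature extractor. The `EEGClassifier` wrapper and the `p=3.0` initialization are common conventions shown for illustration, not the exact head from my training code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from efficientnet_pytorch import EfficientNet

class GeM(nn.Module):
    """Generalized Mean pooling: a learnable exponent p interpolates
    between average pooling (p = 1) and max pooling (p -> inf)."""
    def __init__(self, p=3.0, eps=1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.ones(1) * p)
        self.eps = eps

    def forward(self, x):
        # x: (B, C, H, W) feature map from the CNN backbone
        return F.avg_pool2d(x.clamp(min=self.eps).pow(self.p),
                            (x.size(-2), x.size(-1))).pow(1.0 / self.p)

class EEGClassifier(nn.Module):
    def __init__(self, num_classes=6):  # six harmful-activity patterns
        super().__init__()
        self.backbone = EfficientNet.from_pretrained('efficientnet-b0')
        self.pool = GeM()
        self.head = nn.Linear(self.backbone._fc.in_features, num_classes)

    def forward(self, x):
        feats = self.backbone.extract_features(x)  # (B, 1280, H', W')
        pooled = self.pool(feats).flatten(1)       # (B, 1280)
        return self.head(pooled)
```

Swapping average pooling for GeM lets the model emphasize the most discriminative time-frequency regions, which helps when an abnormal pattern occupies only part of the window.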
3. Inference & Post-Processing
Because the competition metric was the KL divergence between predicted and expert-annotated label distributions, I implemented a weighted-ensemble strategy that combines predictions from multiple model checkpoints trained on different folds.
💻 Code Snippet: Weighted Ensemble Inference
The following snippet demonstrates the inference pipeline, where each model's predictions are weighted by its validation performance to produce the final ensemble probabilities.
```python
import pandas as pd
import numpy as np
import torch
from efficientnet_pytorch import EfficientNet

# --- Ensemble Configuration ---
# Weights derived from Nelder-Mead optimization on OOF (Out-of-Fold) data
model_weights = {
    'model_v1': 0.28111,
    'model_v2': 0.23014,
    'model_v3': 0.31241,
    'model_v4': 0.17634,
}

def inference_ensemble(test_loader, model_weights):
    final_preds = []

    # Iterate through each model in the ensemble
    for model_name, weight in model_weights.items():
        # Load architecture (6 output classes, one per activity pattern)
        model = EfficientNet.from_name('efficientnet-b0', num_classes=6)
        checkpoint = torch.load(f"./models/{model_name}.pth", map_location="cpu")
        model.load_state_dict(checkpoint)
        model.cuda()
        model.eval()

        # Batch prediction
        fold_preds = []
        with torch.no_grad():
            for batch in test_loader:
                images = batch['image'].cuda()
                outputs = model(images)
                # Softmax for a probability distribution over the 6 classes
                probs = torch.softmax(outputs, dim=1)
                fold_preds.append(probs.cpu().numpy())

        # Apply ensemble weight
        weighted_preds = np.concatenate(fold_preds) * weight
        final_preds.append(weighted_preds)

    # Sum weighted predictions across models
    ensemble_result = np.sum(final_preds, axis=0)
    return ensemble_result
```