CV
Summary
Ph.D. candidate in Computational Science developing R/Python software for EEG and multivariate time-series analysis. Built four R packages spanning domain adaptation, whitening, feature extraction, and physics-constrained simulation, with complementary research in deployment under distribution shift, benchmarking, and statistically grounded model diagnostics.
Technical Skills
- Programming: R (package development, Rcpp), Python (NumPy, scikit-learn, PyTorch), SQL (MySQL).
- Statistics & Machine Learning: Multivariate analysis, tensor methods, dimensionality reduction, regularized regression, mixed-effects models, domain adaptation, transfer learning, time-series modeling, EEG signal processing, nested cross-validation.
- Engineering & Infrastructure: Docker, Git, GitHub Actions, Linux, AWS, HPC/Slurm.
Education
- Ph.D. in Computational Science (Data Analytics Track), University of Massachusetts Boston, Expected Summer 2026
- M.S. in Analytics and Modelling, Valparaiso University, May 2018
- B.Eng. in Electrical and Electronics Engineering, Chongqing University, China, Jun 2014
Selected Open-Source R Packages
DA4BCI — Domain adaptation framework for EEG-based BCI
- Developed a suite of domain-adaptation methods for non-stationary EEG/BCI settings, supporting systematic evaluation under cross-subject and cross-session distribution shift.
- Packaged the methods in a reusable research workflow with testing, continuous integration, and vignette-style documentation.
eegwhiten — Whitening transforms for EEG and multichannel signals
- Implemented covariance-whitening transforms with an explicit train/apply API that fits preprocessing on training data and reuses learned parameters on validation and test sets.
- Added inverse transforms, diagnostics, tuning, and reporting to support leakage-safe preprocessing and auditable signal pipelines.
TensorEEG — Physics-constrained EEG simulation on manifolds
- Built a simulator for third-order synthetic EEG tensors using graph-Laplacian smoothing on spherical manifolds, geodesic random walks on rotation/Stiefel manifolds, and structured temporal dynamics.
- Designed the package for controlled benchmarking of tensor decomposition, Riemannian-geometry classifiers, and source-localization methods.
BCIFeatR — Feature extraction toolkit for EEG-based BCI
- Developed a unified train/test interface for feature extraction from multichannel EEG trials, standardizing experimentation across feature families in offline decoding pipelines.
- Reduced pipeline friction by making feature modules composable within a consistent API for comparative studies.
Professional Experience
Doctoral Researcher, Data Science and Algorithm Development | University of Massachusetts Boston | Sep 2019 – Present
Method Development and Deployment Under Shift
- Developed proxy tuning for label-scarce deployment, enabling automatic hyperparameter selection when target-domain labels are unavailable.
- Built a confidence-gated source-selection procedure using bootstrap confidence intervals; in simulation, reduced negative-transfer failures from 20.2% under random source selection to 0%.
Tensor Methods, Diagnostics, and Validation
- Developed TMCCA, a tensor-based method for integrating heterogeneous multi-view datasets, and released the method as the tensorMCCA R package.
- Designed a drift-feature-performance decomposition framework to distinguish input drift from feature-extraction failure as causes of performance degradation.
- Used mixed-effects models to quantify environmental effects on predictive accuracy and inform retraining policy.
- Built evaluation pipelines spanning accuracy, latency, and compute cost across classical and deep-learning classifiers.
- Ran simulation studies across noise and sample-size regimes to evaluate robustness and scalability of feature-matching algorithms.
Data Engineer, IoT Analytics | China Mobile IoT Company Limited, Chongqing, China | Jul 2014 – Aug 2016
- Managed GB-scale time-series sensor data in MySQL and improved heavy aggregation queries through indexing and partitioning, keeping reporting workflows within SLA limits.
- Built monitoring dashboards and automated visual reports for network-health tracking and anomaly detection.
Selected Research Outputs
Honors
- Silver Medal (Rank 98/2767, top 4%), Kaggle HMS – Harmful Brain Activity Classification, 2024
- Doctoral Fellowship, University of Massachusetts Boston (full tuition and stipend), 2019 – Present
Manuscripts
- Shen, Y. et al. Drift-Feature-Performance Decomposition via Structured Geometric Modeling. Under review. 2025
- Shen, Y. et al. Decision-Oriented BCI: Confidence-Gated Adaptation. Under review. 2025
- Degras, D. & Shen, Y. Scalable Feature Matching Across Large Data Collections. 2021
Talks
- Invited Talk: Benchmarking Classification Pipelines, MIND Seminar, Inria, France. Jun 2025
- Poster: Cross-Session BCI Transfer: Global vs. Selective Pooling, ENAR 2025, New Orleans. Mar 2025
