TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge
TinyMyo is a 3.6M-parameter Transformer-based foundation model for surface EMG (sEMG). It is pretrained on >480 GB of EMG data and optimized for ultra-low-power, real-time deployment, including microcontrollers (GAP9) where it achieves an inference time of 0.785 s, energy of 44.91 mJ and power envelope of 57.18 mW.
TinyMyo is built for broad generalization across datasets, sensor configurations, movement tasks, subjects, and domains (gesture, kinematics, speech).
License & Usage (Model Weights)
The released TinyMyo weights are licensed under CC BY-ND 4.0. This summary is not legal advice; please read the full license.
You may:
- Use and redistribute the unmodified TinyMyo weights (including commercially) with attribution.
- Fine-tune/modify internally for research or production without redistributing modified weights.
- Publish code, configs, evaluations, and papers using TinyMyo.
You may not:
- Share or host modified weights in any form (including LoRA/adapter deltas, pruned/quantized models).
- Claim endorsement from the TinyMyo authors without permission.
- Use the TinyMyo name for derivative models.
Contributing Improvements
To upstream improvements, submit a PR to the BioFoundation repository with:
- Full reproducibility artifacts (configs, logs, seeds, environment).
- Evaluation on standard protocols (e.g., DB5, EPN-612, UCI EMG, DB8, Silent Speech).
- Comparison to TinyMyo's reported metrics.
Approved PRs will be retrained and released as official TinyMyo checkpoints under CC BY-ND.
1. Default Input & Preprocessing
Unless specified otherwise, TinyMyo expects:
Channels: 16
Sampling rate: 2000 Hz
Segment length: 1000 samples (0.5 s)
Windowing: 50% overlap (pretraining)
Preprocessing:
- 4th-order 20–450 Hz bandpass filter
- 50 Hz notch filter
- Minβmax normalization (pretraining)
- Z-score normalization (downstream)
Datasets with <16 channels are zero-padded (pretraining only).
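As a sketch, the default chain above can be implemented with SciPy (assuming SciPy is available; the filter settings follow the list above, and the helper name `preprocess` is illustrative, not part of the released code):

```python
# Illustrative default preprocessing: 4th-order 20-450 Hz bandpass,
# 50 Hz notch, then min-max (pretraining) or z-score (downstream).
import numpy as np
from scipy.signal import butter, iirnotch, sosfiltfilt, filtfilt

FS = 2000  # Hz, default sampling rate

def preprocess(emg: np.ndarray, mode: str = "pretrain") -> np.ndarray:
    """emg: (channels, samples) raw sEMG at 2000 Hz."""
    # 4th-order Butterworth bandpass, zero-phase.
    sos = butter(4, [20, 450], btype="bandpass", fs=FS, output="sos")
    x = sosfiltfilt(sos, emg, axis=-1)
    # 50 Hz notch for powerline interference.
    b, a = iirnotch(w0=50, Q=30, fs=FS)
    x = filtfilt(b, a, x, axis=-1)
    if mode == "pretrain":  # min-max to [0, 1], per channel
        lo = x.min(axis=-1, keepdims=True)
        hi = x.max(axis=-1, keepdims=True)
        return (x - lo) / (hi - lo + 1e-8)
    # downstream: per-channel z-score
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + 1e-8)
```

The notch quality factor `Q=30` is an assumed value; the paper only specifies the notch frequency.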
2. Pretraining Overview
TinyMyo is pretrained via masked reconstruction on three large-scale EMG datasets:
| Dataset | Subjects | fs | Channels | Size |
|---|---|---|---|---|
| Ninapro DB6 | 10 | 2000 Hz | 14 | 20.3 GB |
| Ninapro DB7 | 22 | 2000 Hz | 12 | 30.9 GB |
| EMG2Pose | 192 | 2000 Hz | 16 | 431 GB |
Tokenization: Channel-Independent Patches
Unlike EEG FMs that mix channels early, TinyMyo uses per-channel patching:
- Patch length: 20 samples
- Patch stride: 20 samples
- Tokens/channel: 50
- Total sequence length: 800 tokens (16 × 50)
- Positional encoding: RoPE (rotary)
This preserves electrode-specific structure while allowing attention to learn cross-channel relationships.
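The tokenization above reduces to a reshape when patch length equals stride; a minimal sketch with the numbers from the list (16 channels × 1000 samples → 800 tokens of length 20):

```python
# Channel-independent patching: each channel is split into
# non-overlapping patches; channels are never mixed at this stage.
import numpy as np

def patchify(window: np.ndarray, patch_len: int = 20) -> np.ndarray:
    """window: (C, T) -> tokens: (C * T // patch_len, patch_len)."""
    c, t = window.shape
    assert t % patch_len == 0, "segment must divide evenly into patches"
    patches = window.reshape(c, t // patch_len, patch_len)  # (C, 50, 20)
    return patches.reshape(c * (t // patch_len), patch_len)

tokens = patchify(np.zeros((16, 1000)))
print(tokens.shape)  # (800, 20)
```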
Transformer Encoder
- 8 layers, 3 heads
- Embedding dim: 192
- Pre-LayerNorm
- Dropout & drop-path: 0.1
Lightweight Decoder
A single linear layer (~3.9k params) reconstructs masked patches. Following SimMIM, this forces the encoder to learn robust latent structure.
Masking Objective
- 50% random masking with a learnable `[MASK]` token
- Loss: Smooth L1 with a small penalty on visible patches
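A minimal NumPy sketch of this objective (the visible-patch weight `visible_weight` is an illustrative assumption, not the paper's value):

```python
# Masked-reconstruction loss: Smooth L1 (Huber) on all patches,
# with masked patches weighted fully and visible ones down-weighted.
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d**2 / beta, d - 0.5 * beta)

def masked_recon_loss(pred, target, mask_ratio=0.5,
                      visible_weight=0.05, rng=None):
    """pred/target: (tokens, patch_len). Returns a scalar loss."""
    rng = rng or np.random.default_rng()
    masked = rng.random(pred.shape[0]) < mask_ratio  # True = masked token
    per_tok = smooth_l1(pred, target).mean(axis=-1)
    w = np.where(masked, 1.0, visible_weight)        # small visible penalty
    return float((w * per_tok).sum() / w.sum())
```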
Training Setup
- Optimizer: AdamW (β = (0.9, 0.98), weight decay 0.01)
- LR: 1e-4 with cosine decay
- Batch size: 512 (with grad accumulation)
- Epochs: 50, warm-up: 10
- Hardware: 4× NVIDIA GH200 GPUs
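The warm-up plus cosine schedule above can be sketched framework-agnostically (plug the returned value into AdamW's `lr` each epoch; `lr_at` is an illustrative helper, and linear warm-up is an assumption):

```python
# LR schedule: linear warm-up for 10 epochs to a 1e-4 peak,
# then cosine decay to ~0 over the remaining 40 epochs.
import math

def lr_at(epoch, peak=1e-4, warmup=10, total=50):
    if epoch < warmup:
        return peak * (epoch + 1) / warmup          # linear warm-up
    t = (epoch - warmup) / max(1, total - warmup)   # progress in [0, 1)
    return 0.5 * peak * (1 + math.cos(math.pi * t)) # cosine decay
```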
3. Architecture Summary
Model Variant
| Variant | Params | (Layers, Dim) |
|---|---|---|
| TinyMyo | 3.6M | (8, 192) |
4. Downstream Tasks
TinyMyo generalizes across gesture classification, kinematic regression, and speech EMG, with state-of-the-art or competitive results.
4.1 Hand Gesture Classification
Evaluated on:
- Ninapro DB5 (52 classes, 10 subjects)
- EPN-612 (5 classes, 612 subjects)
- UCI EMG (6 classes, 36 subjects)
- Meta Neuromotor Interface (9 gestures)
Preprocessing
EMG filtering: 20–90 Hz bandpass + 50 Hz notch
Window sizes:
- 200 ms (best for DB5)
- 1000 ms (best for EPN, UCI)
Linear Classification Head
- Input: C × 192
- Params: <40k
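One plausible reading of the head above, as a sketch: per-channel token embeddings are pooled into the C × 192 input, averaged across channels, and mapped to class logits by a single linear layer. The pooling scheme is an assumption for illustration, not confirmed by the paper:

```python
# Hypothetical linear classification head over TinyMyo's encoder output.
import numpy as np

def classify(tok_emb, n_channels, W, b):
    """tok_emb: (n_channels * tokens_per_ch, 192); W: (192, K); b: (K,)."""
    per_ch = tok_emb.reshape(n_channels, -1, 192).mean(axis=1)  # (C, 192)
    feat = per_ch.mean(axis=0)                                  # (192,)
    return feat @ W + b                                         # (K,) logits

K = 52                                 # e.g. Ninapro DB5 classes
W, b = np.zeros((192, K)), np.zeros(K)
print(W.size + b.size)                 # ~10k params, within the <40k budget
```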
Performance (Fine-tuned)
| Dataset | Metric | Result |
|---|---|---|
| Ninapro DB5 (200 ms) | Acc | 89.41 ± 0.16% |
| EPN-612 (1000 ms) | Acc | 96.74 ± 0.09% |
| UCI EMG (1000 ms) | Acc | 97.56 ± 0.32% |
| Neuromotor | CLER | 0.153 ± 0.006 |
TinyMyo achieves a new state of the art on DB5, EPN-612, and UCI EMG.
4.2 Hand Kinematic Regression (Ninapro DB8)
- Predict 5 joint angles
- Windows: 200 ms or 1000 ms
- Normalization: z-score only
Regression Head (~788k params)
- Depthwise + pointwise convs
- Upsampling
- Global average pooling
- Linear projection to 5 outputs
Performance
- MAE = 8.77 ± 0.12° (1000 ms)
Note: prior works reporting ~6.9° MAE are subject-specific; TinyMyo trains a single cross-subject model, a significantly harder setting.
4.3 Speech Production & Recognition (Silent Speech)
Dataset: Gaddy Silent Speech (8 channels, 1000 Hz, face/neck EMG)
Speech Production (EMG → MFCC → HiFi-GAN → Audio)
Pipeline:
- Residual downsampling
- TinyMyo encoder
- Linear projection → 26-dim MFCCs
- HiFi-GAN vocoder
WER: 33.54 ± 1.12%, state-of-the-art with >90% fewer parameters in the transduction model.
Speech Recognition (EMG → Text)
- TinyMyo encoder
- Linear projection → 37 characters
- CTC loss
- 4-gram LM + beam search
WER: 33.95 ± 0.97%
TinyMyo is EMG-only, unlike multimodal systems like MONA-LISA.
5. Edge Deployment (GAP9 MCU)
TinyMyo runs efficiently on GAP9 (RISC-V) via:
- INT8 quantization, including attention
- Multi-level memory streaming (L3 → L2 → L1)
- Integer LayerNorm, GELU, softmax
- Static memory arena via liveness analysis
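As intuition for the first bullet, a toy symmetric per-tensor INT8 quantizer; the actual GAP9 toolchain's scheme, which also covers activations, attention, and the integer norms listed above, is considerably more involved:

```python
# Toy symmetric INT8 post-training quantization of a weight tensor.
import numpy as np

def quantize_int8(w):
    """Map float weights to int8 with a single per-tensor scale."""
    scale = max(float(np.abs(w).max()), 1e-12) / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale
```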
Runtime (DB5 pipeline)
- Inference time: 0.785 s
- Energy: 44.91 mJ
- Average power: 57.18 mW
This is the first EMG foundation model demonstrated on a microcontroller.
6. Results Summary
Pretraining
- Smooth L1 reconstruction with high fidelity
- Total compute ≈ 4.0 GFLOPs
Downstream Highlights
- DB5: 89.41%
- EPN-612: 96.74%
- UCI EMG: 97.56%
- Neuromotor: 0.153 CLER
- DB8 regression: MAE 8.77°
- Silent Speech Production: 33.54% WER
- Silent Speech Recognition: 33.95% WER
TinyMyo matches or exceeds state-of-the-art performance, while being smaller and more efficient than all prior EMG foundation models.
Code & Usage
To fine-tune TinyMyo on downstream tasks, follow the examples in the BioFoundation repository.
```shell
python -u run_train.py +experiment=TinyMyo_finetune \
    pretrained_safetensors_path=/path/to/model.safetensors
```
Environment variables:
- `DATA_PATH`: dataset path
- `CHECKPOINT_DIR`: checkpoint to load
Citation
Please cite TinyMyo using:
```bibtex
@misc{fasulo2025tinymyotinyfoundationmodel,
  title={TinyMyo: a Tiny Foundation Model for Flexible EMG Signal Processing at the Edge},
  author={Matteo Fasulo and Giusy Spacone and Thorir Mar Ingolfsson and Yawei Li and Luca Benini and Andrea Cossettini},
  year={2025},
  eprint={2512.15729},
  archivePrefix={arXiv},
  primaryClass={eess.SP},
  url={https://arxiv.org/abs/2512.15729},
}
```
Contact & Support
- Questions or issues? Open an issue on the BioFoundation GitHub repository.
Evaluation results (self-reported)
| Dataset | Metric | Value |
|---|---|---|
| Ninapro DB5 | acc@1 | 0.894 |
| Ninapro DB5 | F1 | 0.780 |
| EPN-612 | acc@1 | 0.967 |
| EPN-612 | F1 | 0.967 |
| UCI-EMG | acc@1 | 0.976 |
| UCI-EMG | F1 | 0.976 |
| Generic Neuromotor Interface (Discrete Gesture) | CLER | 0.153 |
| Ninapro DB8 | MAE | 8.770 |
| Ninapro DB8 | RMSE | 13.350 |
| Ninapro DB8 | R² | 0.620 |