ASL to Text Video Translation

Kaggle competition by Google, top 4% finish.

A model that transcribes American Sign Language (ASL) from video to text in two stages: hand position detection, then translation of the detected positions into text.

Approach: image processing and data augmentation, MediaPipe for hand movement detection, then a Conformer model (convolution combined with self-attention) to produce the text.
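The preprocessing and augmentation step could look roughly like the sketch below. It is an illustration, not the project's actual code: it assumes each video frame has already been reduced to a list of (x, y) hand landmark coordinates (MediaPipe Hands reports 21 landmarks per hand), and the function names `normalize_frame`, `mirror_frame`, and `augment_sequence` are hypothetical. Mirroring is assumed here as one plausible augmentation, loosely imitating left- and right-handed signers.

```python
import random


def normalize_frame(landmarks):
    """Center one frame's (x, y) landmarks on their mean and scale to unit extent."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    centered = [(x - cx, y - cy) for x, y in landmarks]
    # Largest absolute coordinate after centering; used as the scale factor.
    extent = max(max(abs(x), abs(y)) for x, y in centered)
    if extent == 0:
        return centered  # all landmarks coincide, nothing to scale
    return [(x / extent, y / extent) for x, y in centered]


def mirror_frame(landmarks):
    """Flip a centered frame horizontally (assumed augmentation)."""
    return [(-x, y) for x, y in landmarks]


def augment_sequence(frames, rng, flip_prob=0.5):
    """Normalize every frame, then mirror the whole sequence with probability flip_prob."""
    normalized = [normalize_frame(f) for f in frames]
    if rng.random() < flip_prob:
        normalized = [mirror_frame(f) for f in normalized]
    return normalized


# Usage: a toy two-frame sequence with three landmarks per frame.
frames = [
    [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)],
    [(0.5, 0.5), (1.5, 0.5), (1.0, 2.0)],
]
augmented = augment_sequence(frames, random.Random(0))
```

Flipping the whole sequence at once, rather than frame by frame, keeps the motion consistent across time, which matters when the downstream model reads the frames as a sequence.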

Stack: TensorFlow, TFLite, OpenCV, MediaPipe, Transformers, Conformers, CNN, CTCLoss, AdamW
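Training with a CTC loss means the model emits one score vector per frame, including a special blank class, and the text is recovered by decoding those frames. The sketch below shows greedy CTC decoding in plain Python: take the best class per frame, collapse consecutive repeats, and drop blanks. It is a minimal illustration under assumptions: the function name `ctc_greedy_decode`, the toy vocabulary, and placing the blank at index 0 are all hypothetical, and the project may well have used a different decoder.

```python
def ctc_greedy_decode(frame_scores, vocab, blank=0):
    """Greedy CTC decoding: argmax per frame, collapse repeats, remove blanks.

    frame_scores: list of per-frame score lists, one score per vocab entry.
    vocab: string or list where vocab[i] is the character for class i;
           the entry at index `blank` is a placeholder and never emitted.
    """
    # Best class index for each frame.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]

    chars = []
    prev = None
    for idx in best:
        # Emit only when the class changes and is not the blank.
        if idx != prev and idx != blank:
            chars.append(vocab[idx])
        prev = idx
    return "".join(chars)


# Usage: toy vocabulary where "-" stands in for the blank class.
vocab = "-ab"
scores = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.80, 0.10],  # a (new, because a blank came between)
    [0.10, 0.20, 0.70],  # b
    [0.10, 0.10, 0.80],  # b (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
]
print(ctc_greedy_decode(scores, vocab))  # prints "aab"
```

The blank between the two runs of "a" is what lets CTC represent doubled letters, which is relevant for fingerspelled words.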

View on GitHub