Soothly
An offline baby cry detection system I built to run locally on mobile devices, helping parents decode what their little ones need.

As a new dad, I wanted a small weekend project to help with parenting. What started as a simple idea quickly turned into a deep dive down the rabbit hole of analyzing different baby cry datasets and building a complete training pipeline to make everything work on-device. Classic scope creep, but worth every minute!
In the future, I plan to implement these same models on embedded devices with baby monitoring video/audio systems. Stay tuned!
Demo
Soothly in action, analyzing baby cries and providing insights
Preview
Soothly app interface showing cry classification and history tracking
Tech Stack
React Native · TypeScript · Machine Learning · ONNX · Audio Processing
Key Features
- AI-Powered Analysis: The machine learning model analyzes baby cries in real time to identify their needs - no more guesswork at 3 AM.
- Quick Recognition: Get instant insights into whether your baby is hungry, tired, uncomfortable, or needs a diaper change.
- History Tracking: Keep track of crying patterns and needs over time to better understand your baby's unique communication style.
- Privacy-First: All processing happens locally on your device - no data ever leaves your phone.
- Real-Time Performance: Optimized for speed to deliver near-instant cry detection, even on older devices.
How It Works
- Audio Capture & Preprocessing: The app records audio using the device's microphone, resamples it to 16 kHz, and converts it to mono for consistent processing.
- Feature Extraction: I extract rich audio features including MFCC, Chroma, Mel Spectrogram, Spectral Contrast, and Tonnetz to capture the unique characteristics of the baby's cry.
- Feature Aggregation: Features are calculated over small frames of audio and averaged to create a compact 194-element feature vector.
- Model Inference: The feature vector feeds into an ONNX model that predicts the cry type along with a confidence score (see the sketch below).
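
For concreteness, here's a minimal TypeScript sketch of those four stages. The helpers `extractFrameFeatures` and `runOnnxModel` are hypothetical placeholders standing in for the real extraction and inference code (covered under Technical Implementation below), and the frame sizes are illustrative values, not the app's actual settings.

```typescript
// Illustrative sketch of the four pipeline stages above.
type CryPrediction = { label: string; confidence: number };

// Per-frame feature extraction (MFCC, chroma, mel spectrogram,
// spectral contrast, tonnetz) -- placeholder for the real code.
declare function extractFrameFeatures(frame: Float32Array): Float32Array;

// Placeholder for ONNX inference (see Technical Implementation).
declare function runOnnxModel(features: Float32Array): Promise<CryPrediction>;

const FEATURE_DIM = 194; // size of the aggregated feature vector

// 1. Assume `samples` is already 16 kHz mono PCM, then split it
//    into overlapping frames (sizes here are illustrative).
function frames(samples: Float32Array, frameSize = 2048, hop = 512): Float32Array[] {
  const out: Float32Array[] = [];
  for (let start = 0; start + frameSize <= samples.length; start += hop) {
    out.push(samples.subarray(start, start + frameSize));
  }
  return out;
}

// 2-3. Extract per-frame features and average them into a single
//      compact 194-element vector.
function aggregateFeatures(samples: Float32Array): Float32Array {
  const fs = frames(samples);
  const mean = new Float32Array(FEATURE_DIM);
  for (const f of fs) {
    const feats = extractFrameFeatures(f);
    for (let i = 0; i < FEATURE_DIM; i++) mean[i] += feats[i] / fs.length;
  }
  return mean;
}

// 4. Feed the aggregated vector to the model.
async function classifyCry(samples: Float32Array): Promise<CryPrediction> {
  return runOnnxModel(aggregateFeatures(samples));
}
```
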
Tech Challenges
- Mobile Optimization: Implementing efficient audio processing on mobile devices with limited resources was a real pain.
- Model Efficiency: Had to optimize the machine learning model for on-device inference while maintaining accuracy.
- Feature Pipeline: Creating a feature extraction pipeline that matches the training environment took several iterations.
- Battery Usage: Balancing performance with battery consumption for real-time analysis required careful optimization (see the sketch after this list).
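
On the battery point, one common pattern (an illustrative sketch, not necessarily what Soothly does) is to gate the expensive feature extraction behind a cheap RMS energy check, so the full pipeline only runs when there's actually sound worth analyzing:

```typescript
// Cheap RMS energy gate: skip feature extraction and inference on
// near-silent buffers to save CPU and battery. The threshold is an
// illustrative value, not a tuned constant from the app.
const SILENCE_RMS_THRESHOLD = 0.01;

declare function classifyCry(samples: Float32Array): Promise<unknown>; // from the sketch above

function isLoudEnough(samples: Float32Array): boolean {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length) >= SILENCE_RMS_THRESHOLD;
}

async function onAudioBuffer(samples: Float32Array): Promise<void> {
  if (!isLoudEnough(samples)) return; // silence: do nothing
  await classifyCry(samples);         // full pipeline from "How It Works"
}
```
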
Technical Implementation
- Audio Processing: Audio capture and feature extraction are handled by my own package @siteed/expo-audio-studio (formerly @siteed/expo-audio-stream).
- Custom Features: Built custom implementations for features like spectral contrast that weren't available in mobile libraries.
- Cross-Platform: Used the ONNX model format for cross-platform compatibility and performance (see the inference sketch below).
- Implementation: Built in React Native, with native modules handling performance-heavy tasks.
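
To illustrate the ONNX side, here's a hedged sketch using the onnxruntime-react-native package. The label set and the input tensor name `input` are assumptions for illustration; they depend on how the model was exported.

```typescript
import { InferenceSession, Tensor } from 'onnxruntime-react-native';

// Assumed label set and input tensor name -- illustrative only.
const LABELS = ['hungry', 'tired', 'uncomfortable', 'needs_changing'];

export async function predictCry(
  modelPath: string,
  features: Float32Array, // the 194-element aggregated feature vector
): Promise<{ label: string; confidence: number }> {
  // In the real app the session would be created once and cached,
  // not rebuilt on every call.
  const session = await InferenceSession.create(modelPath);

  // Shape [1, 194]: one feature vector per inference call.
  const feeds = { input: new Tensor('float32', features, [1, features.length]) };
  const results = await session.run(feeds);
  const scores = results[session.outputNames[0]].data as Float32Array;

  // Report the highest-scoring class and its score as the confidence.
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return { label: LABELS[best], confidence: scores[best] };
}
```
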