Soothly
An offline baby cry detection system I built to run locally on mobile devices, helping parents decode what their little ones need.

As a new dad, I wanted a small weekend project to help with parenting. What started as a simple idea quickly turned into a deep dive down the rabbit hole of analyzing different baby cry datasets and building a complete training pipeline to make everything work on-device. Classic scope creep, but worth every minute!
In the future, I plan to implement these same models on embedded devices with baby monitoring video/audio systems. Stay tuned!
Demo
Soothly in action, analyzing baby cries and providing insights
Preview
Soothly app interface showing cry classification and history tracking
Tech Stack
React Native · TypeScript · Machine Learning · ONNX · Audio Processing
Key Features
- AI-Powered Analysis: The machine learning model analyzes baby cries in real time to identify their needs - no more guesswork at 3 AM.
- Quick Recognition: Get instant insights into whether your baby is hungry, tired, uncomfortable, or needs a diaper change.
- History Tracking: Keep track of crying patterns and needs over time to better understand your baby's unique communication style.
- Privacy-First: All processing happens locally on your device - no data ever leaves your phone.
- Real-Time Performance: Optimized for speed to deliver near-instant cry detection, even on older devices.
How It Works
- Audio Capture & Preprocessing: The app records audio using the device's microphone, resamples it to 16 kHz, and converts it to mono for consistent processing.
- Feature Extraction: I extract rich audio features including MFCC, Chroma, Mel Spectrogram, Spectral Contrast, and Tonnetz to capture the unique characteristics of the baby's cry.
- Feature Aggregation: Features are calculated over small frames of audio and averaged to create a compact 194-element feature vector.
- Model Inference: The feature vector feeds into an ONNX model that predicts the cry type along with a confidence score (see the sketch below).
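
For concreteness, here's a minimal TypeScript sketch of those four stages. The helpers `extractFrameFeatures` and `runOnnxModel` are hypothetical placeholders standing in for the real extraction and inference code (covered under Technical Implementation below), and the frame sizes are illustrative values, not the app's actual settings.

```typescript
// Illustrative sketch of the four pipeline stages above.
type CryPrediction = { label: string; confidence: number };

// Per-frame feature extraction (MFCC, chroma, mel spectrogram,
// spectral contrast, tonnetz) -- placeholder for the real code.
declare function extractFrameFeatures(frame: Float32Array): Float32Array;

// Placeholder for ONNX inference (see Technical Implementation).
declare function runOnnxModel(features: Float32Array): Promise<CryPrediction>;

const FEATURE_DIM = 194; // size of the aggregated feature vector

// 1. Assume `samples` is already 16 kHz mono PCM, then split it
//    into overlapping frames (sizes here are illustrative).
function frames(samples: Float32Array, frameSize = 2048, hop = 512): Float32Array[] {
  const out: Float32Array[] = [];
  for (let start = 0; start + frameSize <= samples.length; start += hop) {
    out.push(samples.subarray(start, start + frameSize));
  }
  return out;
}

// 2-3. Extract per-frame features and average them into a single
//      compact 194-element vector.
function aggregateFeatures(samples: Float32Array): Float32Array {
  const fs = frames(samples);
  const mean = new Float32Array(FEATURE_DIM);
  for (const f of fs) {
    const feats = extractFrameFeatures(f);
    for (let i = 0; i < FEATURE_DIM; i++) mean[i] += feats[i] / fs.length;
  }
  return mean;
}

// 4. Feed the aggregated vector to the model.
async function classifyCry(samples: Float32Array): Promise<CryPrediction> {
  return runOnnxModel(aggregateFeatures(samples));
}
```
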
Tech Challenges
- Mobile Optimization: Implementing efficient audio processing on mobile devices with limited resources was a real pain.
- Model Efficiency: Had to optimize the machine learning model for on-device inference while maintaining accuracy.
- Feature Pipeline: Creating a feature extraction pipeline that matches the training environment took several iterations.
- Battery Usage: Balancing performance with battery consumption for real-time analysis required careful optimization (see the sketch after this list).
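
On the battery point, one common pattern (an illustrative sketch, not necessarily what Soothly does) is to gate the expensive feature extraction behind a cheap RMS energy check, so the full pipeline only runs when there's actually sound worth analyzing:

```typescript
// Cheap RMS energy gate: skip feature extraction and inference on
// near-silent buffers to save CPU and battery. The threshold is an
// illustrative value, not a tuned constant from the app.
const SILENCE_RMS_THRESHOLD = 0.01;

declare function classifyCry(samples: Float32Array): Promise<unknown>; // from the sketch above

function isLoudEnough(samples: Float32Array): boolean {
  let sumSquares = 0;
  for (const s of samples) sumSquares += s * s;
  return Math.sqrt(sumSquares / samples.length) >= SILENCE_RMS_THRESHOLD;
}

async function onAudioBuffer(samples: Float32Array): Promise<void> {
  if (!isLoudEnough(samples)) return; // silence: do nothing
  await classifyCry(samples);         // full pipeline from "How It Works"
}
```
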
Technical Implementation
- Audio Processing: Audio capture and feature extraction are handled by my own package @siteed/expo-audio-studio (formerly @siteed/expo-audio-stream).
- Custom Features: Built custom implementations for features like spectral contrast that weren't available in mobile libraries.
- Cross-Platform: Used the ONNX model format for cross-platform compatibility and performance (see the inference sketch below).
- Implementation: Built in React Native, with native modules handling performance-heavy tasks.
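
To illustrate the ONNX side, here's a hedged sketch using the onnxruntime-react-native package. The label set and the input tensor name `input` are assumptions for illustration; they depend on how the model was exported.

```typescript
import { InferenceSession, Tensor } from 'onnxruntime-react-native';

// Assumed label set and input tensor name -- illustrative only.
const LABELS = ['hungry', 'tired', 'uncomfortable', 'needs_changing'];

export async function predictCry(
  modelPath: string,
  features: Float32Array, // the 194-element aggregated feature vector
): Promise<{ label: string; confidence: number }> {
  // In the real app the session would be created once and cached,
  // not rebuilt on every call.
  const session = await InferenceSession.create(modelPath);

  // Shape [1, 194]: one feature vector per inference call.
  const feeds = { input: new Tensor('float32', features, [1, features.length]) };
  const results = await session.run(feeds);
  const scores = results[session.outputNames[0]].data as Float32Array;

  // Report the highest-scoring class and its score as the confidence.
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return { label: LABELS[best], confidence: scores[best] };
}
```
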