Ascribe - Professional Transcription SDK
Ascribe is a professional audio and video transcription application for macOS, built as a modular .framework SDK on top of Deepgram's speech-to-text API. It was designed for content creators, podcasters, and YouTubers who need accurate transcripts of their media files. Dropbox acquired the product outright, and the SDK architecture let them integrate it directly into their macOS application. The combination of native macOS performance, Deepgram's accuracy, and modular design created strategic value that helped Axle survive financially during a company transition.
Problem Solved
Content creators needed professional-grade transcription for podcasts and videos, but existing solutions were too expensive, too slow, or required uploading sensitive content to third-party cloud services.
- • Designed and built .framework SDK architecture for external integration
- • Integrated Deepgram speech-to-text API with robust networking layer
- • Implemented multi-format audio and video processing using AVFoundation
- • Created Core Data model for transcription jobs with word-level timestamps
- • Built reusable UI components for progress tracking and transcript viewing
- ✓ Successfully acquired by Dropbox and integrated directly into their macOS application
- ✓ Built as reusable .framework SDK enabling Dropbox integration without modifying core codebase
- ✓ Integrated Deepgram speech-to-text API for professional-grade transcription accuracy
- ✓ Achieved native macOS performance with efficient memory management for large audio files
- ✓ Opened content creator market (podcasters, YouTubers) beyond enterprise media clients
- ✓ Created additional revenue stream that helped Axle survive financially during transition
- ✓ Enabled seamless UX with direct Dropbox storage integration: transcribe files without duplication
- ✓ Leveraged AX1 Platform media processing expertise for rapid development in transcription domain
Scale
- • Supported WAV, MP3, M4A, AAC audio formats
- • Video audio extraction from MOV, MP4
Technology Stack
Challenge
Building Ascribe as a .framework SDK rather than a standalone application required fundamentally different architectural thinking, centered on clean API surfaces for external developers.
Solution
Designed layered architecture with clear separation between public SDK interfaces and internal implementation. Created comprehensive public headers with detailed documentation. Built reusable UI components with minimal configuration requirements.
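The separation between public interface and internal implementation can be sketched in Swift as follows. This is a minimal illustration, not Ascribe's actual API: every name here (`TranscriptionClient`, `TranscriptionJobID`, `TranscriptionStatus`, `TranscriptionSDK`, `InMemoryTranscriptionClient`) is hypothetical, and the in-memory client only stands in for the real networking and persistence layers.

```swift
import Foundation

// Public surface: the only types a host application would see.
// All names are hypothetical and chosen for illustration.
public struct TranscriptionJobID: Hashable {
    public let rawValue: UUID
    public init(rawValue: UUID = UUID()) { self.rawValue = rawValue }
}

public enum TranscriptionStatus: Equatable {
    case queued
    case uploading(progress: Double)
    case completed(transcript: String)
    case failed(reason: String)
}

public protocol TranscriptionClient {
    /// Registers a media file for transcription and returns a handle to the job.
    func submit(fileURL: URL) -> TranscriptionJobID
    /// Returns the current status, or nil if the job is unknown.
    func status(of job: TranscriptionJobID) -> TranscriptionStatus?
}

// Internal layer: no `public` modifier, so it is invisible outside the framework.
// A real implementation would talk to the network and a persistent store.
final class InMemoryTranscriptionClient: TranscriptionClient {
    private var jobs: [TranscriptionJobID: TranscriptionStatus] = [:]

    func submit(fileURL: URL) -> TranscriptionJobID {
        let id = TranscriptionJobID()
        jobs[id] = .queued
        return id
    }

    func status(of job: TranscriptionJobID) -> TranscriptionStatus? {
        jobs[job]
    }
}

// Public factory: the single entry point that hands out the protocol type,
// so the concrete class can change without breaking integrators.
public enum TranscriptionSDK {
    public static func makeClient() -> TranscriptionClient {
        InMemoryTranscriptionClient()
    }
}
```

A host application written against this sketch would depend only on the protocol and the factory, which is what allows the internals to evolve without changes on the integrator's side.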
Impact
Dropbox successfully integrated Ascribe without modifying Axle's core codebase.
Challenge
Integrating Deepgram's speech-to-text API required handling network failures, rate limits, and long-running operations for large files, all while maintaining a responsive UX despite cloud latency.
Solution
Built robust networking layer with retry logic, exponential backoff, and graceful degradation. Implemented chunked uploads with progress feedback and resumable transfers. Cached transcripts in Core Data for offline access.
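Retry with exponential backoff, as described above, can be sketched in a few lines of Swift. The policy values, the `TransientError` cases, and the `withRetry` helper are all assumptions made for illustration; the actual limits and error taxonomy used in Ascribe are not documented here. Jitter is left out to keep the sketch short, although production retry code usually adds it.

```swift
import Foundation

// Hypothetical retry policy; the numbers are illustrative defaults.
struct RetryPolicy {
    var maxAttempts = 4
    var baseDelay: TimeInterval = 0.5
    var maxDelay: TimeInterval = 8.0

    /// Delay before retry number `retry` (1-based): base * 2^(retry - 1), capped at maxDelay.
    func delay(forRetry retry: Int) -> TimeInterval {
        min(maxDelay, baseDelay * pow(2.0, Double(retry - 1)))
    }
}

// Errors assumed to be worth retrying (for example a rate-limit response or a dropped connection).
enum TransientError: Error {
    case rateLimited
    case networkUnavailable
}

/// Runs `operation`, retrying on TransientError with exponentially growing pauses.
/// Any other error is rethrown immediately. `sleep` is injectable so callers
/// (and tests) can avoid real waiting.
func withRetry<T>(
    policy: RetryPolicy,
    sleep: (TimeInterval) -> Void = { Thread.sleep(forTimeInterval: $0) },
    operation: () throws -> T
) throws -> T {
    var attempt = 1
    while true {
        do {
            return try operation()
        } catch let error as TransientError {
            // Give up once the attempt budget is spent.
            guard attempt < policy.maxAttempts else { throw error }
            sleep(policy.delay(forRetry: attempt))
            attempt += 1
        }
    }
}
```

With the default policy, the pauses between attempts are 0.5 s, 1 s, and 2 s before the fourth and final attempt.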
Impact
Reliable transcription experience that felt native despite the cloud processing dependency.
Challenge
Content creators use diverse audio formats and quality levels from various recording setups, which required handling codec compatibility, sample rate variations, and corrupted files.
Solution
Leveraged AVFoundation's robust format handling with audio extraction from video using AVAssetReader for memory-efficient streaming. Built validation pipeline normalizing audio to Deepgram's preferred formats.
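A rough Swift sketch of this approach follows. It first classifies a file by extension using the formats listed in this document, then streams decoded audio with `AVAssetReader` so the whole file never has to sit in memory. The function and type names (`mediaKind`, `streamPCMAudio`, `AudioExtractionError`) are hypothetical, and 16-bit little-endian PCM is an assumed target format, not a documented Deepgram requirement. The AVFoundation part is wrapped in `canImport` only so the sketch still compiles on platforms without that framework.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

// Formats named in this document; the routing logic itself is an assumption.
enum MediaKind: Equatable { case audio, video, unsupported }

func mediaKind(for url: URL) -> MediaKind {
    switch url.pathExtension.lowercased() {
    case "wav", "mp3", "m4a", "aac": return .audio
    case "mov", "mp4": return .video
    default: return .unsupported
    }
}

#if canImport(AVFoundation)
enum AudioExtractionError: Error { case noAudioTrack, readFailed, copyFailed }

/// Decodes the first audio track of `url` to 16-bit PCM and passes it to
/// `onChunk` one sample buffer at a time. Works for audio files and for
/// the audio track of a video container.
func streamPCMAudio(from url: URL, onChunk: (Data) -> Void) throws {
    let asset = AVURLAsset(url: url)
    guard let track = asset.tracks(withMediaType: .audio).first else {
        throw AudioExtractionError.noAudioTrack
    }

    let reader = try AVAssetReader(asset: asset)
    let settings: [String: Any] = [
        AVFormatIDKey: kAudioFormatLinearPCM,
        AVLinearPCMBitDepthKey: 16,
        AVLinearPCMIsFloatKey: false,
        AVLinearPCMIsBigEndianKey: false,
        AVLinearPCMIsNonInterleaved: false
    ]
    let output = AVAssetReaderTrackOutput(track: track, outputSettings: settings)
    guard reader.canAdd(output) else { throw AudioExtractionError.readFailed }
    reader.add(output)
    guard reader.startReading() else {
        throw reader.error ?? AudioExtractionError.readFailed
    }

    // copyNextSampleBuffer() returns nil at end of track or on failure.
    while let sample = output.copyNextSampleBuffer() {
        guard let block = CMSampleBufferGetDataBuffer(sample) else { continue }
        let length = CMBlockBufferGetDataLength(block)
        guard length > 0 else { continue }
        var chunk = Data(count: length)
        let status = chunk.withUnsafeMutableBytes { buffer in
            CMBlockBufferCopyDataBytes(block, atOffset: 0, dataLength: length,
                                       destination: buffer.baseAddress!)
        }
        guard status == kCMBlockBufferNoErr else { throw AudioExtractionError.copyFailed }
        onChunk(chunk)
    }

    // Distinguish a clean end of track from a decode failure (e.g. a corrupted file).
    if reader.status == .failed {
        throw reader.error ?? AudioExtractionError.readFailed
    }
}
#endif
```

Because each chunk is handed off as soon as it is decoded, memory use stays roughly constant regardless of file length, which is the property the streaming approach is after.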
Impact
Ascribe became accessible to diverse content creator workflows without format restrictions.
Situation
After completing the intensive AX1 Platform rebuild, Axle needed additional revenue streams and market diversification. The content creator market (podcasters, YouTubers) was growing rapidly, and professional transcription was a pain point. Meanwhile, Dropbox was looking to add transcription capabilities to their platform.
Task
Build a professional transcription application that could serve content creators while being architected for potential integration into larger platforms.
Action
Khaled designed Ascribe as a modular .framework SDK from the start, enabling clean integration rather than being a standalone application. He integrated Deepgram's speech-to-text API, building a robust networking layer with retry logic, chunked uploads, and intelligent caching. The Core Data model stored transcription jobs with word-level timestamps and confidence scores. He leveraged AVFoundation expertise from AX1 to handle diverse audio and video formats through memory-efficient streaming. The SDK exposed clean APIs for job management, progress tracking, and transcript access, plus drop-in UI components that external applications could embed with minimal configuration.
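To make the word-level data concrete, here is a small Swift sketch of a transcript record with per-word timestamps and confidence scores. Plain `Codable` structs stand in for the Core Data entities, whose real schema is not shown in this write-up; the type names, field names, and the sample JSON are all made up for illustration.

```swift
import Foundation

// Hypothetical value types standing in for the Core Data entities.
struct TimedWord: Codable, Equatable {
    let word: String
    let start: TimeInterval   // seconds from the start of the media
    let end: TimeInterval
    let confidence: Double    // 0.0 ... 1.0
}

struct TranscriptRecord: Codable {
    let words: [TimedWord]

    /// Plain-text transcript rebuilt from the individual words.
    var text: String {
        words.map { $0.word }.joined(separator: " ")
    }

    /// The word being spoken at `time`, e.g. for highlighting during playback.
    func word(at time: TimeInterval) -> TimedWord? {
        words.first { $0.start <= time && time < $0.end }
    }

    /// Words a reviewer may want to double-check.
    func lowConfidenceWords(below threshold: Double) -> [TimedWord] {
        words.filter { $0.confidence < threshold }
    }
}

// Made-up sample payload, only to exercise the types above.
let sampleJSON = """
{"words": [
  {"word": "hello", "start": 0.10, "end": 0.48, "confidence": 0.99},
  {"word": "world", "start": 0.52, "end": 0.90, "confidence": 0.71}
]}
"""
let transcript = try JSONDecoder().decode(TranscriptRecord.self,
                                          from: Data(sampleJSON.utf8))
```

Storing timestamps per word rather than per paragraph is what makes features such as click-to-seek and playback highlighting straightforward in a transcript viewer.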
Result
Ascribe was acquired entirely by Dropbox and integrated directly into their macOS application, providing seamless transcription for files already in Dropbox. The SDK architecture enabled integration without code modifications. The acquisition provided crucial financial support for Axle during a transition period and validated the modular design approach. Completion in May 2020 coincided with COVID-19's arrival, marking a natural career transition point.
Technical
- • SDK architecture and clean API surface design
- • .framework bundle development with versioning
- • Third-party API integration with comprehensive error handling
- • RESTful API consumption patterns with retry logic
- • Audio/video format handling with AVFoundation
Soft Skills
- • Building products with acquisition potential
- • Integration-friendly architecture for external teams
- • Documentation for external developers
- • Strategic product pivoting to adjacent markets
Key Insights
- 💡 Building for integration forces architectural clarity that benefits all development
- 💡 Clean architecture and documentation create acquisition value beyond features
- 💡 Adjacent market strategy (enterprise to content creators) can diversify revenue quickly

