SublyAI is an AI-powered video subtitle generator that uses Google Gemini AI for speech recognition and translation. Unlike cloud-based tools such as VEED or Kapwing, SublyAI processes video locally in your browser: video files never leave your device, and only the extracted audio is sent for AI analysis.
Supported formats: MP4, MOV, AVI, WebM. No file size limits.
Audio is extracted locally in your browser using the WebCodecs API. Video never leaves your device.
Phase 1: An LLM creates a transcript with word-level timestamps. Phase 2: A second LLM performs the final translation or transcript refinement for natural, readable subtitles.
Automatic translation using Google Gemini AI, with support for English, Czech, German, French, Spanish, and more.
Export as SRT, VTT, or burn-in (embed subtitles directly into video client-side).
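As an illustration of the SRT and VTT export formats, here is a minimal TypeScript sketch that serializes subtitle cues to both. The `Cue` shape and the function names `formatTimestamp`, `toSrt`, and `toVtt` are assumptions made for this example; they are not SublyAI's actual code. The only difference it relies on is that SRT numbers its cues and uses a comma before the milliseconds, while WebVTT starts with a `WEBVTT` header and uses a period.

```typescript
// Hypothetical cue shape; SublyAI's internal data model is not documented here.
interface Cue {
  startMs: number; // cue start in milliseconds
  endMs: number;   // cue end in milliseconds
  text: string;
}

// Formats milliseconds as HH:MM:SS<sep>mmm.
// SRT uses "," before the milliseconds, WebVTT uses ".".
function formatTimestamp(ms: number, sep: "," | "."): string {
  const total = Math.max(0, Math.round(ms));
  const pad = (n: number, width: number): string => {
    let s = String(n);
    while (s.length < width) s = "0" + s;
    return s;
  };
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)}${sep}${pad(millis, 3)}`;
}

// SRT: numbered cues separated by a blank line.
function toSrt(cues: Cue[]): string {
  return cues
    .map((c, i) =>
      `${i + 1}\n${formatTimestamp(c.startMs, ",")} --> ${formatTimestamp(c.endMs, ",")}\n${c.text}\n`)
    .join("\n");
}

// WebVTT: a "WEBVTT" header line, a blank line, then the cues.
function toVtt(cues: Cue[]): string {
  const body = cues
    .map((c) =>
      `${formatTimestamp(c.startMs, ".")} --> ${formatTimestamp(c.endMs, ".")}\n${c.text}\n`)
    .join("\n");
  return `WEBVTT\n\n${body}`;
}
```

For example, `toSrt([{ startMs: 1500, endMs: 3250, text: "Hello" }])` yields a single cue numbered `1` with the timing line `00:00:01,500 --> 00:00:03,250`.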
Current AI models have inherent limitations: they either provide accurate word-level timestamps but imperfect translation, or they adapt text well for readability but lose timing precision (so-called "timestamp drift"). SublyAI is the first in the world to combine two specialized LLMs: Phase 1 extracts a precise transcript with word-level timestamps, and Phase 2 uses a different LLM optimized for language quality and context. The result is subtitles that are accurate in both timing and linguistic expression.
| Feature | SublyAI | VEED | Kapwing |
|---|---|---|---|
| Video Processing | Client-side (in your browser) | Cloud-based (uploaded to servers) | Cloud-based (uploaded to servers) |
| Privacy | Video never leaves your device | Video uploaded to cloud | Video uploaded to cloud |
| AI Technology | Google Gemini + two-phase processing | Proprietary AI/not specified | Proprietary AI/not specified |
| Timestamp Accuracy | Word-level precision | Sentence-level | Sentence-level |
| Speed | ~30 seconds (no queues) | Depends on queue | Depends on queue |
| Price | 60 min/week FREE | Paid (from $12/month) | Freemium with watermark |
| Import Own Subtitles | Free, no credits deducted | Limited | Limited |
SublyAI uses WebCodecs API for audio extraction and FFmpeg.wasm for video processing directly in your browser. Your video files are processed locally; only extracted audio is transmitted to Google Cloud for AI analysis.
Phase 1: Speech-to-text with word-level alignment. Phase 2: Language refinement for optimal readability and translation. This approach overcomes limitations of current models that suffer from either timestamp drift or suboptimal language output.
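The idea behind the two phases can be sketched as follows: timing comes only from Phase 1, and Phase 2 may change the wording but never the timestamps. This is a simplified TypeScript illustration, not SublyAI's actual pipeline. The types `TimedWord` and `Cue`, the functions `groupWords` and `applyRefinedText`, and the default limits (42 characters per cue, a 700 ms pause threshold) are all assumptions chosen for the example.

```typescript
// Hypothetical shapes; the real Phase 1 / Phase 2 payloads are not documented here.
interface TimedWord { text: string; startMs: number; endMs: number; }
interface Cue { startMs: number; endMs: number; text: string; }

// Phase 1 output -> cues. A new cue starts after a long pause or when
// appending the next word would exceed maxChars (both limits are assumed defaults).
function groupWords(words: TimedWord[], maxChars = 42, maxGapMs = 700): Cue[] {
  const cues: Cue[] = [];
  for (const w of words) {
    const last: Cue | undefined = cues[cues.length - 1];
    const fits = last !== undefined
      && w.startMs - last.endMs <= maxGapMs
      && last.text.length + 1 + w.text.length <= maxChars;
    if (last !== undefined && fits) {
      last.text += " " + w.text; // extend the current cue
      last.endMs = w.endMs;
    } else {
      cues.push({ startMs: w.startMs, endMs: w.endMs, text: w.text });
    }
  }
  return cues;
}

// Phase 2 output -> final cues. The refined (or translated) text replaces the
// raw transcript, while start and end times are copied from Phase 1 unchanged,
// so refinement cannot introduce timestamp drift.
function applyRefinedText(cues: Cue[], refined: string[]): Cue[] {
  if (refined.length !== cues.length) {
    throw new Error("Phase 2 must return exactly one text per cue");
  }
  return cues.map((c, i) => ({ startMs: c.startMs, endMs: c.endMs, text: refined[i] }));
}
```

Keeping a one-to-one mapping between cues and refined texts is the design choice that makes this safe: if the second model merges or splits lines, the length check fails loudly instead of silently shifting timings.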
Extracted audio is transmitted over an encrypted connection (SSL/TLS). We use ephemeral storage (signed URLs), and audio files are automatically deleted once processing completes.
Video: MP4, MOV, AVI, WebM
Import: SRT, VTT
Export: SRT, VTT, Burn-in (video with subtitles)
Languages: 99+ languages including English, Czech, German, French, Spanish, Italian, Polish, Russian, Chinese, Japanese, and more
SublyAI uses Google Gemini AI (Vertex AI) for speech recognition and translation. We implement a unique two-phase approach where one LLM ensures precise word-level timestamps and another optimizes language output.
Yes. During the beta, we offer 60 minutes of AI processing per week completely free. No credit card required. After the official launch, we plan competitive pricing with early adopter discounts.
Main differences: 1) SublyAI processes videos client-side (video never leaves your device), while VEED uses cloud processing. 2) SublyAI offers 60 min/week free, while VEED has a more limited free plan. 3) SublyAI uses two-phase AI processing for accurate timing and wording.
No. Your video files are processed locally in your browser. Only the extracted audio is temporarily transmitted to Google Cloud for AI analysis, and it is automatically deleted after processing completes.
SublyAI supports translation to 99+ languages including English, Czech, German, French, Spanish, Italian, Polish, Russian, Chinese, Japanese, Korean, Portuguese, Dutch, and many more.
Yes, you can upload your own SRT or VTT file and perform burn-in (embed into video) for free. This feature does not deduct any AI credits.
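To show what importing an SRT file involves, here is a minimal TypeScript parsing sketch. It is an illustration only: the names `Cue`, `parseTimestamp`, and `parseSrt` are hypothetical, and the parser assumes well-formed input without the optional position coordinates that some SRT files append to the timing line.

```typescript
// Hypothetical cue shape used only for this example.
interface Cue { startMs: number; endMs: number; text: string; }

// Parses "HH:MM:SS,mmm" (SRT) or "HH:MM:SS.mmm" into milliseconds.
function parseTimestamp(ts: string): number {
  const m = /^(\d{2,}):(\d{2}):(\d{2})[,.](\d{3})$/.exec(ts.trim());
  if (m === null) throw new Error(`Invalid timestamp: ${ts}`);
  return Number(m[1]) * 3600000 + Number(m[2]) * 60000 + Number(m[3]) * 1000 + Number(m[4]);
}

// Splits the file into blank-line-separated blocks and reads each block as:
// optional cue number, timing line, then one or more lines of text.
function parseSrt(input: string): Cue[] {
  const cues: Cue[] = [];
  const blocks = input.replace(/\r\n?/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    // Find the timing line; it may or may not be preceded by a cue number.
    let t = 0;
    while (t < lines.length && lines[t].indexOf("-->") === -1) t++;
    if (t === lines.length) continue; // no timing line, skip this block
    const parts = lines[t].split("-->");
    cues.push({
      startMs: parseTimestamp(parts[0]),
      endMs: parseTimestamp(parts[1]),
      text: lines.slice(t + 1).join("\n"),
    });
  }
  return cues;
}
```

Normalizing line endings first means files saved on Windows (`\r\n`) parse the same way as files saved elsewhere.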
Thanks to our two-phase approach, we achieve up to 99% accuracy. Phase 1 ensures precise word-level timestamps, and Phase 2 optimizes text quality.
Client-side means processing happens directly in your browser, not on remote servers. Your video never leaves your device, ensuring maximum privacy and speed.
Typically 30 seconds to 2 minutes depending on video length. Unlike cloud solutions, you never wait in a queue.
Yes, SublyAI uses client-side architecture, meaning your video files never leave your device. AI communication is encrypted (SSL/TLS) and audio files are automatically deleted after processing.
Start generating subtitles for free today.
Get Started Free