How do I convert audio to SRT?

Drop your audio file onto the tool and start. Typist transcribes it with word-level timing, splits the text into cues, and returns a downloadable SRT in seconds.

What is an SRT file?

A plain-text subtitle file. Each cue has a number, a start and end timecode, and the caption text, with a blank line between cues. Any player, editor, or text editor can read it.
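To make that layout concrete, here is a minimal Python sketch that writes cues in SRT form. The helper names `format_timecode` and `to_srt` are made up for illustration and are not part of Typist; the sketch only assumes the standard `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line.

```python
def format_timecode(ms):
    """Render milliseconds as an SRT timecode, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Build SRT text from (start_ms, end_ms, text) tuples."""
    blocks = []
    for number, (start_ms, end_ms, text) in enumerate(cues, start=1):
        timing = f"{format_timecode(start_ms)} --> {format_timecode(end_ms)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # Joining with a newline leaves one blank line between cues.
    return "\n".join(blocks)


cues = [
    (0, 3200, "Today we are looking at"),
    (3200, 6400, "why some projects stall"),
]
print(to_srt(cues))
```

Because the result is ordinary text, it can be saved with a `.srt` extension and opened in any text editor.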

Can I edit the timings?

Yes. The SRT is plain text, so you can change any timecode or wording in your editor. Word-level timing usually leaves the cues needing little adjustment.

How do I sync the SRT to a video?

If the audio is the same recording as the video, the cues line up on load. If you trimmed the start, shift all timings by that offset in your editor.
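If your editor has no bulk-shift option, the offset can also be applied with a short script. This is a rough sketch, not a Typist feature: `shift_srt` is a hypothetical helper that rewrites only the timing lines (those containing `-->`) and clamps anything that would fall below zero.

```python
import re

# Matches one SRT timecode such as 00:01:02,345.
TIMECODE = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(srt_text, offset_ms):
    """Shift every cue by offset_ms (negative moves cues earlier)."""

    def shift(match):
        h, m, s, ms = (int(part) for part in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
        total = max(total, 0)  # clamp so no timecode goes below zero
        h, rem = divmod(total, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1_000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    shifted = []
    for line in srt_text.splitlines():
        # Only timing lines are rewritten, so caption text is left alone.
        if "-->" in line:
            line = TIMECODE.sub(shift, line)
        shifted.append(line)
    return "\n".join(shifted)


original = "1\n00:00:03,200 --> 00:00:06,400\nwhy some projects stall\n"
# Two seconds were trimmed from the start, so cues move 2000 ms earlier.
print(shift_srt(original, -2000))
```

A negative offset suits a trimmed start; a positive offset would suit a video with an added intro.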

Which audio formats can I convert?

MP3, M4A, WAV, FLAC, and OGG, plus the audio inside MP4 and WebM. You do not need to convert the file first.

Is audio to SRT free?

Yes. The Turbo model includes free minutes with no signup. Longer files and the Studio model are paid, but you can generate and preview the SRT before paying.

How accurate are the captions?

On clean speech in a well-supported language, accuracy reaches around 99%. Noise, heavy accents, and overlapping speakers lower it, and the Studio model is sharper on hard audio.

Free AI subtitle generator

Free audio to SRT converter

Turn any audio recording into a timed SRT. Drop a file and get subtitle cues with short, readable lines.

Drag and drop, or click to upload

MP3, MP4, and any audio or video

Try free, no card required

3 AI models

4 free export formats

99 languages

Transcribe audio and video in 99 languages

English
Español
中文
Français
Deutsch
日本語
Русский
Português
Italiano
한국어
العربية
हिन्दी
Türkçe
Polski
Nederlands
Български
বাংলা
Čeština
Dansk
Ελληνικά
فارسی
Suomi
עברית
Magyar
Bahasa Indonesia
മലയാളം
Română
Svenska
Kiswahili
தமிழ்
తెలుగు
ไทย
Українська
اردو
Tiếng Việt

How it works

Audio to SRT in 3 steps

1. Upload your file
Drop your file or click to choose. MP3, M4A, WAV, FLAC, OGG, and more.
2. Pick language and model
Auto-detect the language or choose from 99. Use free Turbo for speed, or Studio for the best accuracy.
3. Export your subtitles
Read the transcript in seconds, then export timestamped SRT subtitles ready for any editor.

Why Typist

Built for fast, accurate transcripts

An hour in about a minute

Groq-served Turbo runs at roughly 200x real time, so your transcript is ready almost immediately.

Every export, free

Download as plain text, Word, PDF, or timestamped SRT subtitles on every plan.

Your file stays yours

Uploaded only to transcribe, removed afterward, and never sold, shared, or used to train models.

99 languages

Auto-detect the language or pick your own, with the most accurate model recommended per language.

Beyond transcription

Your transcript is just the start

AI summary and key moments
One tap turns the transcript into a TL;DR, key quotes, and action items.
Auto chapters
Long recordings are split into navigable chapters you can jump between.
Share or export anywhere
Send a clean public link, or export to TXT, DOCX, PDF, or SRT.

Summary

Chapters

Intro, Key points, Q&A, Wrap-up

Ready to turn audio into subtitles?

Drop a file and read your transcript in seconds. Free to start, no signup.

Transcribe a file

The format

What audio actually is

An SRT is a plain-text subtitle file: numbered cues with a start and end timecode and the caption text. Audio files carry no timing for captions, so Typist transcribes the recording with word-level timing and writes the cues, turning any audio into subtitles you can use over video or as standalone captions.

Audio comes in a family of containers and codecs (MP3, AAC, Opus, PCM), and Typist reads the common ones, so you never convert first. Lossy or lossless makes no difference for clear speech. SRT quality tracks the quality of the recording. Cue boundaries come from word-level timing, and the words are then grouped into short lines.

Where these files come from

Phone recordings, voice notes, podcasts, lectures, and interviews. Whatever a microphone captured that you want as timed captions.

Podcasts
Voice notes
Lectures
Interviews

How audio becomes text

Accuracy comes from the audio, not the file type

Audio file: Your upload
Audio decoded: The speech is what we transcribe
Transcript: Copy or export to TXT, DOCX, PDF, SRT

Output: SRT subtitles
Timing: Word-level
Lines: ~42 chars
Works with: Any editor

Subtitles

Timed captions, ready for your editor

captions.srt

1
00:00:00,000 --> 00:00:03,200
Today we are looking at

2
00:00:03,200 --> 00:00:06,400
why some projects stall

3
00:00:06,400 --> 00:00:09,800
even when the team works hard

Loads into your tools

CapCut
Premiere Pro
DaVinci Resolve
YouTube Studio
Final Cut Pro
VLC

Readable on screen

Typist re-segments long speech into short timed lines of about 42 characters, at most two lines per cue, so captions stay readable. It will not stuff a whole paragraph into one cue.
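One way to picture this re-segmentation is a greedy packer over word-level timings. The sketch below is an assumption about the general technique, not Typist's actual algorithm: `group_words` is a hypothetical helper, the two limits are taken from the description above, and a real segmenter would likely also break on pauses and punctuation.

```python
MAX_CHARS = 42  # target line length, taken from the description above
MAX_LINES = 2   # lines per cue, taken from the description above


def group_words(words):
    """Greedily pack (text, start, end) words into cues.

    Returns a list of (start, end, lines) tuples, one per cue.
    A single word longer than MAX_CHARS still gets a line of its own.
    """
    cues = []
    lines, start, end = [], None, None
    for text, w_start, w_end in words:
        if lines and len(lines[-1]) + 1 + len(text) <= MAX_CHARS:
            lines[-1] += " " + text  # word fits on the current line
        elif len(lines) < MAX_LINES:
            lines.append(text)  # open a new line in the same cue
        else:
            cues.append((start, end, lines))  # cue is full, so close it
            lines, start = [text], None
        if start is None:
            start = w_start  # a cue starts with its first word
        end = w_end  # and ends with its latest word
    if lines:
        cues.append((start, end, lines))
    return cues


# Twenty evenly spaced dummy words, timed in seconds.
words = [("abcdefghij", float(i), i + 0.5) for i in range(20)]
for cue_start, cue_end, cue_lines in group_words(words):
    print(cue_start, cue_end, cue_lines)
```

Each cue takes its start time from its first word and its end time from its last, which is why word-level timing makes the cue boundaries fall on real speech.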
