From timed cues to plain prose
A subtitle file is mostly scaffolding — numbers, timecodes, arrows, tags — wrapped around the words. This tool
keeps the words and discards the rest, joining each cue’s internal line breaks into a single line and stripping any
<i>/<b> formatting.
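The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual code: the function name `srt_to_lines` and both regular expressions are assumptions made for the example, and it handles only the basic SubRip layout (a number, a timecode line, then text, with blank lines between cues).

```python
import re

# Assumed patterns for illustration; the tool's real rules are not documented here.
TIMECODE = re.compile(r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}")
TAG = re.compile(r"</?[a-zA-Z][^>]*>")  # matches <i>, </i>, <b>, <font color=...>, etc.


def srt_to_lines(srt_text: str) -> list[str]:
    """Return one plain-text line per cue, without numbers, timecodes or tags."""
    lines_out = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        rows = [row.strip() for row in block.split("\n") if row.strip()]
        # The cue text is everything after the timecode row.
        idx = next((i for i, row in enumerate(rows) if TIMECODE.search(row)), None)
        if idx is None:
            continue  # not a cue, skip it
        text = " ".join(rows[idx + 1:])           # join internal line breaks
        text = TAG.sub("", text)                  # strip formatting tags
        text = re.sub(r"\s+", " ", text).strip()  # tidy leftover whitespace
        if text:
            lines_out.append(text)
    return lines_out


sample = (
    "1\n00:00:01,000 --> 00:00:03,000\nWho's there?\n\n"
    "2\n00:00:03,200 --> 00:00:05,000\n<i>Nobody.</i>\nGo back to sleep.\n"
)
print("\n".join(srt_to_lines(sample)))
```

Locating the timecode row, instead of assuming it is always the second line, lets the sketch cope with files where the cue number is missing.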
input.srt

    1
    00:00:01,000 --> 00:00:03,000
    Who's there?

    2
    00:00:03,200 --> 00:00:05,000
    <i>Nobody.</i>
    Go back to sleep.

output.txt

    Who's there?
    Nobody. Go back to sleep.

Speaker labels and paragraphs
Two switches shape the output. “Remove speaker labels” drops leading names (JOHN:) and
the “- ” dashes used for alternating speakers, which is useful when you want only the words. “Join into
paragraphs” flows consecutive cues together and starts a new paragraph after a clear gap in the dialogue,
turning a list of fragments into something you can actually read.
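Both switches can be approximated as below. This is a hedged sketch under stated assumptions, not the tool's implementation: the names `strip_speaker_label` and `join_paragraphs` are hypothetical, a speaker label is assumed to be an all-caps name followed by a colon, and the 2000 ms threshold for a “clear gap” is an arbitrary value picked for the example.

```python
import re

# Assumption: a leading "- " marks an alternating speaker.
DASH = re.compile(r"^\s*-\s+")
# Assumption: a label is an all-caps name followed by a colon, e.g. "JOHN:".
LABEL = re.compile(r"^\s*[A-Z][A-Z0-9 .'-]*:\s*")


def strip_speaker_label(line: str) -> str:
    """Remove a leading dash and/or an all-caps speaker label from one subtitle line."""
    line = DASH.sub("", line)
    return LABEL.sub("", line)


def join_paragraphs(cues, gap_ms: int = 2000) -> str:
    """Join cue texts, starting a new paragraph when the pause reaches gap_ms.

    `cues` is a list of (start_ms, end_ms, text) tuples in playback order.
    """
    paragraphs = []
    current = []
    prev_end = None
    for start, end, text in cues:
        # A long enough silence since the previous cue closes the paragraph.
        if current and prev_end is not None and start - prev_end >= gap_ms:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        prev_end = end
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(paragraphs)


cues = [
    (1000, 3000, strip_speaker_label("JOHN: Who's there?")),
    (3200, 5000, strip_speaker_label("- Nobody. Go back to sleep.")),
    (9000, 10000, "Morning."),
]
print(join_paragraphs(cues))
```

Label stripping works on individual subtitle lines, so in this sketch it would run before a cue's lines are joined. The all-caps rule is deliberately narrow: it leaves ordinary sentences that contain a colon alone, at the cost of missing mixed-case names.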
What it’s good for
Pulling a transcript out of a video you’ve subtitled, getting a block of text to translate or summarise, quoting dialogue, or feeding clean text into another tool. It’s the fastest way from a subtitle file to readable copy — with no timestamps in the way.
Limits
The output is only as good as the subtitles: the tool extracts existing text and does not transcribe audio. Paragraph grouping uses pauses between cues as a heuristic, so very dense dialogue may stay as one block. Everything runs locally: nothing is uploaded, and the result appears instantly.