# Convert VobSub file into .srt file
First of all, you need either an MKV file containing a VobSub subtitle track, or a pair of idx/sub files.
## Extracting idx/sub files
When you run `mkvmerge -i mkvfile.mkv`, you should see `Track ID x: subtitles (VobSub)` (where x is a number).
Do not forget to install the mkvtoolnix package, which contains mkvmerge.
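If you would rather pick out the track ID in a script than read it by eye, the check above can be sketched as a small shell filter. `vobsub_track_ids` is a hypothetical helper name, and the pattern assumes that `mkvmerge -i` prints lines in exactly the `Track ID x: subtitles (VobSub)` form shown above.

```shell
# Hypothetical helper: read `mkvmerge -i` output on stdin and print the ID
# of every VobSub subtitle track, one per line.
# Assumes lines of the form "Track ID x: subtitles (VobSub)".
vobsub_track_ids() {
    sed -n 's/^Track ID \([0-9][0-9]*\): subtitles (VobSub).*$/\1/p'
}

# Usage (requires mkvtoolnix):
#   mkvmerge -i mkvfile.mkv | vobsub_track_ids
```

If nothing is printed, the file probably has no VobSub track and the rest of this guide does not apply to it.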
- Extract this VobSub track from your MKV by using my script from here: run `chmod +x sub_mkv.sh`, then `./sub_mkv.sh mkvfile.mkv`. When it asks “Subtitle Track ID:”, enter the number x from above.
- Normally you’ll now have two files: a `.idx` and a `.sub` (and optionally a `.ifo`).
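In case the script is not at hand, roughly the same extraction can be sketched with `mkvextract`, which ships in the same mkvtoolnix package. This is not the script mentioned above, only an approximation: `extract_vobsub` is a hypothetical helper name, and it assumes your mkvtoolnix version accepts the `mkvextract tracks FILE ID:OUTPUT` form.

```shell
# Hypothetical helper, not the author's sub_mkv.sh.
# Usage: extract_vobsub mkvfile.mkv TRACK_ID
extract_vobsub() {
    if [ "$#" -ne 2 ]; then
        echo "usage: extract_vobsub FILE.mkv TRACK_ID" >&2
        return 1
    fi
    mkv=$1
    id=$2
    base=${mkv%.*}    # strip the extension: movie.mkv -> movie
    # For a VobSub track, mkvextract is expected to write both the .sub
    # file and its matching .idx file next to each other.
    mkvextract tracks "$mkv" "$id:$base.sub"
}
```

For example, `extract_vobsub mkvfile.mkv 2` should leave `mkvfile.idx` and `mkvfile.sub` beside the MKV, where 2 stands for the track ID x reported by mkvmerge.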
## OCR the two files
### By using ffmes
- Install the `ogmrip`, `tesseract-data-eng` and `tesseract-data-fre` packages
- Clone https://github.com/Jocker666z/ffmes
- Move your idx/sub files into a folder of their own
- Launch ffmes: `bash ffmes.sh [FOLDER OF IDX/SUB FILE]/`
- Choose option 18, and the language
- Done, you have the srt file!
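The folder preparation step above can be sketched as a short shell function. `isolate_vobsub` is a hypothetical helper, and the `subs_only` folder name in the usage note is only an example.

```shell
# Hypothetical helper: move one idx/sub pair into a folder of its own.
# Usage: isolate_vobsub BASENAME DEST_DIR   (BASENAME without extension)
isolate_vobsub() {
    base=$1
    dest=$2
    # Refuse to continue unless both halves of the pair exist.
    [ -f "$base.idx" ] && [ -f "$base.sub" ] || {
        echo "missing $base.idx or $base.sub" >&2
        return 1
    }
    mkdir -p "$dest" && mv -- "$base.idx" "$base.sub" "$dest"/
}
```

For instance, `isolate_vobsub mkvfile subs_only` followed by `bash ffmes.sh subs_only/` would match the steps listed above.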
### Old version: by using vobsub2srt
- Install vobsub2srt and a tesseract language package (see all the available language packages here)
  - Install `tesseract-data-eng` and `tesseract-data-fre`
- vobsub2srt converts subtitles in VobSub (.idx / .sub) format into subtitles in .srt format.
- Run `vobsub2srt mkvfile`, where mkvfile is the name of the subtitle files WITHOUT the extension (.idx / .sub).
- vobsub2srt writes the subtitles to a file called `mkvfile.srt`.
- It’s finished.
If a subtitle file contains more than one language, you can use the `--lang` parameter to set the correct language. More info here.
If you want to dump the subtitles as images (e.g. to check for correct OCR), you can use the `--dump-images` flag.
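With several idx/sub pairs to convert, the vobsub2srt steps above can be wrapped in a loop. This is a minimal sketch: `convert_all_vobsub` is a hypothetical helper name, and the `en` language code in the usage note is an assumption about what your subtitle file declares.

```shell
# Hypothetical helper: run vobsub2srt on every .idx file in a directory.
# Usage: convert_all_vobsub [DIR] [LANG]
convert_all_vobsub() {
    dir=${1:-.}
    lang=${2:-}
    for idx in "$dir"/*.idx; do
        [ -e "$idx" ] || continue    # no .idx files: the glob stayed literal
        base=${idx%.idx}             # vobsub2srt wants the name WITHOUT extension
        if [ -n "$lang" ]; then
            vobsub2srt --lang "$lang" "$base"
        else
            vobsub2srt "$base"
        fi
    done
}
```

Calling `convert_all_vobsub . en` would then run `vobsub2srt --lang en` once per pair in the current folder, which should leave one `.srt` file next to each `.idx`.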
Updated on September 12, 2020.