Extracting subtitles

November 4, 2007

Step one is easy.
You will need the following programs:

  • transcode
  • mplayer
  • subtitleripper
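
Before running anything, it may be worth checking that the tools are actually installed. The following is only a sketch: missing_tools is a hypothetical helper, and the command names passed to it are assumptions about what these packages install (tccat and tcextract from transcode, subtitle2pgm from subtitleripper, plus lsdvd, which the script below also calls).

```shell
#!/bin/bash
# Hypothetical helper: prints every command from the argument list
# that cannot be found on PATH. Prints nothing if all are present.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed command names; adjust them to whatever your distribution ships.
missing_tools lsdvd mplayer tccat tcextract subtitle2pgm
```

Any name that gets printed still needs to be installed.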

On Linux, just use the following script (originally taken from the Gentoo Wiki):

#!/bin/bash
lsdvd
echo "Please type in the stream number"
read DVDSTREAM
mplayer -dvd-device /dev/dvd dvd://$DVDSTREAM -vo null -ao null -frames 0 -v 2>&1 | grep sid
# Ask the user for the sid.
# The correct number is 0x20 + sid.
echo "Please type in the subtitle SID in hexadecimal, with 0x20 added. Example: for sid 0, type 0x20"
read SID

tccat -i /dev/dvd -T $DVDSTREAM -L | tcextract -x ps1 -t vob -a $SID > subs
# Convert the extracted subtitle stream into one image per subtitle
subtitle2pgm -o subtitles-$DVDSTREAM -c 0,255,255,255 < subs
We will end up with a lot of pictures, each containing one piece of the final base64 encoded file. The next step will be to convert these pictures into text using an OCR program.
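
The 0x20 offset that the script asks for can also be computed instead of typed by hand. A minimal sketch, assuming the sid reported by mplayer is a plain decimal number without leading zeros; sid_to_track is a hypothetical helper and not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper: converts a decimal sid (as printed by mplayer)
# into the hexadecimal value expected by the script, i.e. 0x20 + sid.
sid_to_track() {
    printf '0x%x\n' $(( 0x20 + $1 ))
}

sid_to_track 0   # prints 0x20
sid_to_track 3   # prints 0x23
```

The result could then be stored with something like SID=$(sid_to_track 3) instead of reading it from the keyboard.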

What’s this all about?

November 4, 2007

I recently ordered “The IT Crowd” DVD set from Amazon.co.uk. Season 1 contains some normal easter eggs, but the Season 2 DVD goes much further: the leet subtitles are actually base64 encoded files.
So, what to do now?

  1. Extract subtitles from DVD
  2. Convert subtitles to text
  3. Decode text to files
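
To give an idea of what step 3 could look like, here is a rough sketch. It assumes the recognized text has already been collected into a single file and that a base64 tool accepting -d (as in GNU coreutils) is available. decode_subtitles and all file names are made up for illustration, and the payload is a tiny stand-in, not data from the DVD.

```shell
#!/bin/bash
# Hypothetical helper: removes line breaks and other whitespace from the
# text file given as $1, then base64-decodes the result into file $2.
decode_subtitles() {
    tr -d '[:space:]' < "$1" | base64 -d > "$2"
}

# Demonstration with a made-up payload split over two lines,
# the way it might arrive from separate subtitle pictures.
printf 'aGVsbG8g\nd29ybGQ=\n' > demo.txt
decode_subtitles demo.txt demo.bin
cat demo.bin   # prints: hello world
```

Stripping whitespace first matters because each subtitle picture only holds one fragment, so the joined text will contain line breaks that are not part of the encoded data.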