Due to lack of interest, the Disc ID DB will go offline on March 1st. The source code will remain available for those interested.

Note: This documentation was last updated on Feb 9, 2009.

Disc ID DB is my attempt at standardizing the identification of optical discs, to make it easier to classify and sort optical media backups. The code is currently in beta, but it should work well enough to test the waters. It identifies DVD media using libdvdread’s method for generating the disc ID, and it can identify DVDs of TV shows and movies, using the TVDB and the Movie DB as sources. There is no magic here: nobody sent me a list of hundreds of thousands of DVD IDs and their contents, so if you want to use this database, it’s up to you to populate it. To use this service, do the following:

Step 0: Generate a Disc ID

libdvdread has a straightforward way of doing this. If you build it from source, you will find a disc_id binary in the build tree. In case you want to write your own disc_id program or script, the algorithm is very simple:
  1. Mount the DVD
  2. Go to VIDEO_TS directory.
  3. Concatenate VIDEO_TS.IFO and up to 9 VTS_0X_0.IFO files, in order, where X is a digit from 1 to 9.
  4. Take the MD5 hash of the resulting data.
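
The steps above can be sketched in Python as follows. This is a minimal illustration, not the libdvdread implementation: the function name dvd_disc_id is made up for this example, it assumes the mounted disc exposes upper-case file names, and it has not been checked against the output of the disc_id binary.

```python
import hashlib
from pathlib import Path


def dvd_disc_id(video_ts_dir):
    """Return the hex MD5 of VIDEO_TS.IFO followed by VTS_01_0.IFO..VTS_09_0.IFO.

    Hypothetical helper: assumes upper-case file names inside VIDEO_TS.
    """
    video_ts = Path(video_ts_dir)
    names = ["VIDEO_TS.IFO"] + [f"VTS_0{n}_0.IFO" for n in range(1, 10)]
    digest = hashlib.md5()
    files_hashed = 0
    for name in names:
        path = video_ts / name
        if not path.is_file():
            continue  # discs with fewer title sets simply lack the later files
        # Feeding the files to the hash in order is equivalent to hashing
        # their concatenation.
        digest.update(path.read_bytes())
        files_hashed += 1
    if files_hashed == 0:
        raise FileNotFoundError(f"no IFO files found in {video_ts}")
    return digest.hexdigest()
```

For a DVD mounted at /media/dvd (an assumed mount point), the call would be dvd_disc_id("/media/dvd/VIDEO_TS").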

Step 1: Look Up Your DVD

Direct your favorite browser to http://disciddb-test.igorpartola.com/dvd/5cd9d0b31adb223dd199df80fd0e6e88.debug. You will get a response object formatted using PHP’s print_r() function. The object has two mandatory attributes: “success” and “error”. “success” indicates whether the request worked: 1 or true indicates success, while 0, false, or an empty string indicates failure. On failure, the “error” attribute contains the details. A third, optional attribute called “data” contains the payload of the response. For this call, if the request was successful but the data is NULL, the DVD was not found. Let’s say this is the case, so we need to identify the disc.

A note on formats: currently only two formats are supported: JSON and “debug”. “Debug” is the output of PHP’s print_r() function. I plan on adding more, but for now these two should suffice.
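
For scripts, the JSON format is easier to consume than the print_r() output. The sketch below shows one way to interpret the “success”, “error”, and “data” attributes described above. The function name interpret_response is hypothetical, and the sample bodies in the usage note are invented for illustration. It also assumes the JSON variant is requested by swapping the .debug suffix for .json, which the text above does not spell out.

```python
import json


def interpret_response(body):
    """Classify a JSON response body as 'error', 'not_found', or 'found'.

    Returns a (status, payload) tuple. Hypothetical helper for illustration.
    """
    response = json.loads(body)
    if not response.get("success"):
        # 0, false, an empty string, and a missing key all count as failure.
        return "error", response.get("error")
    data = response.get("data")
    if data is None:
        # The request worked, but the database has no entry for this disc.
        return "not_found", None
    return "found", data
```

For example, interpret_response('{"success": 1, "error": "", "data": null}') classifies the disc as not found, which is the situation the following steps deal with.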

Step 2: Look Up a TV Series or a Movie

Let’s say you know that the DVD is part of the series “The Big Bang Theory”. Go to http://disciddb-test.igorpartola.com/lookup-series/big bang theory.debug. You will get a listing of the TV series that match your search criteria (big bang theory), which will include series #80379 – The Big Bang Theory. 80379 is the TVDB ID of this series, so if you have other code that talks to their API, you can use it there as well (see the Errata section). In fact, the “lookup-series” call is purely a convenience at this point, although I might add some caching later on.

Now let’s assume that the DVD contains a movie: “The Hangover”. Go to http://disciddb-test.igorpartola.com/lookup-movie/hangover.debug. You will see a listing very similar to the one for a TV series, but containing movies. Grab the ID of the movie to use later.
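
A browser percent-encodes the spaces in a search such as “big bang theory” automatically, but a script has to do it itself. The sketch below builds the two lookup URLs shown above. The helper name lookup_url is hypothetical, and it assumes the server accepts %20 for spaces in the same way it accepts a browser’s request.

```python
from urllib.parse import quote

BASE_URL = "http://disciddb-test.igorpartola.com"


def lookup_url(kind, query, fmt="debug"):
    """Build a lookup-series or lookup-movie URL with the query percent-encoded."""
    if kind not in ("series", "movie"):
        raise ValueError("kind must be 'series' or 'movie'")
    # safe="" also encodes "/" so a query can never add extra path segments.
    return f"{BASE_URL}/lookup-{kind}/{quote(query, safe='')}.{fmt}"
```

Here lookup_url("series", "big bang theory") yields the series search URL with each space replaced by %20.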

Step 3: Getting Series or Movie Details

Now that you have a series ID written down, use it to look up the details of the series: http://disciddb-test.igorpartola.com/series/80379.debug. The structure should be self-explanatory: a series contains seasons, which contain episodes. The data returned is a subset of the data that the TVDB currently has, so if you want to know what a particular field means, check out their documentation.

Using a movie ID, you can get the details of the movie as well: http://disciddb-test.igorpartola.com/movie/18785.debug. This is very similar to the movie data you got from the search, except that you also get the movie’s runtime.

Step 4: Identify DVD Tracks as Episodes or Movies

Since you will want to enhance the Disc ID DB, you will identify which tracks on the DVD correspond to which episodes of the series that you got from the previous call. After you do so, you will need to put together a JSON structure of the following format:
{
    "341996": 2,
    "342790": 3,
    "356711": 4,
    "358455": 5,
    "360371": 6,
    "363199": 7
}
Here the keys are the episode IDs from step 3 and the values are the title numbers on the DVD. Next, simply submit a POST request to http://disciddb-test.igorpartola.com/update-dvd/5cd9d0b31adb223dd199df80fd0e6e88.debug with a single POST variable called “episodes” whose value is, you guessed it, the JSON string we put together above.

Similarly, you can identify movies. You need to put together the following JSON structure:
{
    "18785": 1
}
Here the keys are movie IDs (yes, you can have multiple movies on one disc) and the values are track numbers. Now do a POST request to http://disciddb-test.igorpartola.com/update-dvd/6f7f354890f2e028dd5a1e921d9ad44f.debug with a parameter called “movies”. Technically you could submit both episodes and movies data at the same time, but I don’t think there are any DVDs out there that actually contain both.
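
The two POST requests above can be prepared from a script along these lines. This is a sketch under assumptions: the helper name build_update_request is hypothetical, and it assumes the POST variables are sent as an ordinary form-encoded body (application/x-www-form-urlencoded), which is what a PHP service would typically read as POST variables. The function only builds the request and does not send it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://disciddb-test.igorpartola.com"


def build_update_request(disc_id, episodes=None, movies=None, fmt="debug"):
    """Prepare (but do not send) a POST to update-dvd for the given disc ID.

    episodes and movies map episode or movie IDs to title numbers on the DVD.
    """
    fields = {}
    if episodes:
        # JSON object keys are strings, so IDs given as ints are converted.
        fields["episodes"] = json.dumps({str(k): v for k, v in episodes.items()})
    if movies:
        fields["movies"] = json.dumps({str(k): v for k, v in movies.items()})
    if not fields:
        raise ValueError("provide episodes, movies, or both")
    body = urlencode(fields).encode("ascii")
    return Request(f"{BASE_URL}/update-dvd/{disc_id}.{fmt}", data=body, method="POST")
```

A request built with build_update_request("6f7f354890f2e028dd5a1e921d9ad44f", movies={18785: 1}) can then be sent with urllib.request.urlopen(). Point it at the test server only.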

Step 5: Look Up the DVD Again

After you have submitted the identifying information about the DVD, you can verify that everything worked as advertised: go to http://disciddb-test.igorpartola.com/dvd/5cd9d0b31adb223dd199df80fd0e6e88.debug and look at the structure. You should see a “DVD” object that contains “movies” and/or “episodes”; each episode lists its season, which in turn lists its series. Note that a particular DVD could technically contain episodes from multiple seasons or even different series. This functionality is intentional, but the chances of something like this are slim with commercially authored discs.

Misc Functions

http://disciddb.igorpartola.com/system/stats.debug contains the database stats, such as the number of DVDs it currently has. http://disciddb.igorpartola.com/backups/disciddb-latest.sql.bz2 contains the latest database dump, in case you would like to see it. Use it in conjunction with the schema.sql file located in the SVN repo.
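
The dump is bzip2-compressed, so it has to be unpacked before it can be loaded. A minimal sketch using Python’s standard library follows; the function name decompress_dump and the file paths are placeholders chosen for this example.

```python
import bz2
import shutil


def decompress_dump(src_path, dst_path):
    """Decompress a .bz2 dump to a plain SQL file, streaming in chunks."""
    with bz2.open(src_path, "rb") as compressed, open(dst_path, "wb") as plain:
        # copyfileobj streams the data, so a large dump is never held in memory.
        shutil.copyfileobj(compressed, plain)
    return dst_path
```

After downloading the file, decompress_dump("disciddb-latest.sql.bz2", "disciddb-latest.sql") leaves a plain SQL file next to it.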

Final Thoughts

Try all your code against http://disciddb-test.igorpartola.com. The production database is located at http://disciddb.igorpartola.com, but please don’t run your development/test code against it, as bugs in your code might force me to restore the database to a previous state. If you want to check out the internals of the Disc ID DB: svn co https://disciddb.svn.sourceforge.net/svnroot/disciddb/trunk.

Errata

There are several issues with this service that you should know about. I do not consider any of these critical at this point; however, I will try to work them out as time permits:
  • You need to do a series/#####.format or a movie/#####.format lookup before doing an update-series or update-movie call. This is because update-XXXXX calls do not automatically download the TV series or movie data into the Disc ID database. You might ask why I would want to keep a local copy of this data, and the reality is, I don’t. However, it will increase the responsiveness of any future requests AND it is required by the API TOS of both the TVDB and the Movie DB. I don’t think this is a problem, unless you have a bunch of code that runs against either of these databases directly. If you do, simply issue the necessary calls, verify that no error happened, and ignore the rest of the returned data.
  • Movie requests are currently cached indefinitely. This is a problem since if a movie’s listing is incorrect in the Movie DB and the Disc ID DB has already cached it, there is no way to clear that cache. I don’t think this is a problem in the short term since by the time a movie ends up on a DVD, it’s most likely already in the Movie DB and all the data is correct.
  • Currently anybody can overwrite any DVD’s content. This is a problem in terms of spam. I don’t consider this a huge problem since I do make nightly backups. In the future, I will implement more sanity checks on DVD content so that we don’t end up with corrupt data.
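
The ordering described in the first erratum (look up the series or movie first, check for errors, ignore the payload, then update) could be wrapped as below. Everything here is a sketch: prime_then_update, fetch_json, and send_update are hypothetical names, the two callables stand in for whatever HTTP code you already have, and the .json suffix is an assumption.

```python
def prime_then_update(kind, item_id, fetch_json, send_update, fmt="json"):
    """Look up a series or movie so the server stores it, then submit the update.

    fetch_json(path) must return the decoded response object for a path such
    as "series/80379.json"; send_update() performs the actual update call.
    Both are supplied by the caller (hypothetical interfaces).
    """
    if kind not in ("series", "movie"):
        raise ValueError("kind must be 'series' or 'movie'")
    response = fetch_json(f"{kind}/{item_id}.{fmt}")
    if not response.get("success"):
        raise RuntimeError(f"{kind} lookup failed: {response.get('error')}")
    # The returned data is deliberately ignored; only the side effect matters.
    return send_update()
```

Passing the network functions in as arguments keeps the ordering logic testable without touching either server.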