Open CD Index  
 
How Open CD Index identifies a CD

Audio CDs do not contain a code that can be used to identify them. That is why the CDDB and freedb.org CD data archives used (and still use) the infamous CDDBID: an 8-hex-digit checksum computed from the offset of each track on the CD.

Unfortunately, the CDDBID checksum has two major problems:

  • Two different CDs can share the same CDDBID
  • Some CD readers report the track offsets of the same CD displaced (usually by 75 or 150 frames)
In Open CD Index we are trying to overcome these two problems while keeping compatibility with freedb.org, so that we can easily import its huge, open source database.

To do that, we decided to identify each CD in the index with a checksum based on data available in the freedb.org archive: the CD track offsets, the number of tracks on the CD, and the total length of the CD in seconds.

To overcome the problem of CD readers reporting different offsets for the same CD, we decided to use not the absolute offsets but the differences between consecutive offsets, which are basically the track lengths. This way, even if one CD reader reports 0 as the first track offset and another reports 150, every track on the second reader is displaced by the same 150 frames, so the differences between consecutive offsets are the same.
Unfortunately, freedb.org data reports only the starting offset of each track, so we cannot compute the length of the last track in frames (we could compute it in seconds, but that is not accurate enough). This means the method works only for CDs with at least 2 tracks.

This is why Open CD Index does NOT accept submissions of CDs with fewer than 2 tracks.
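
The effect of using offset differences can be checked with a minimal sketch. The helper name trackLengths and the 4-track offsets below are made up for illustration and are not part of Open CD Index; the second reader simply reports every offset displaced by 150 frames.

```php
<?php
// Hypothetical helper (not part of Open CD Index): turns track starting
// offsets into track lengths in frames. The last track is skipped because
// its end offset is not known.
function trackLengths(array $offsets): array
{
    $lengths = [];
    for ($i = 1; $i < count($offsets); $i++) {
        $lengths[] = $offsets[$i] - $offsets[$i - 1];
    }
    return $lengths;
}

// Made-up offsets for a 4-track CD as reported by two readers;
// the second reader displaces every offset by 150 frames.
$readerA = [0, 13117, 23182, 34827];
$readerB = [150, 13267, 23332, 34977];

// Both readers yield the same lengths: 13117, 10065 and 11645 frames.
var_dump(trackLengths($readerA) === trackLengths($readerB)); // bool(true)
```

Note that a single-track CD would produce an empty list of lengths here, which is consistent with the 2-track minimum stated above.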

Other known limits

We also found that, while this method works well for music CDs, there is a category of audio CDs where several CDs with different contents can have the very same structure: audiobooks.
Many audiobooks with different contents are published with the same number of tracks of exactly the same length. In this case Open CD Index (like freedb.org and other CD databases) is unable to distinguish them.

How to compute the CD ID

To compute the CD ID you need the following information from the CD:

  • Total number of tracks on the CD
  • Length in CD frames of all the tracks but the last one
  • Total length of the CD in seconds
Express each track length as a zero-padded 5-digit hex number and concatenate them in track order, then compute a standard MD5 hash of the resulting string. Append to the 32 hex digits of the MD5 hash the number of tracks as a 2-digit hex number and the length in seconds as a 6-digit hex number. The final 40-digit hex string is our CD ID.

A PHP sample

The following PHP snippet computes the CD ID of an 18-track CD from the number of tracks, the track offsets, and the total CD length in seconds.

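<?php
// Input data
$nTracks = 18; // 18 tracks on the CD
// Starting offset of each track, in CD frames
$aryOffsets = array(150, 13267, 23332, 34977, 47940, 59585, 73922, 84655, 96255, 108972, 118272, 129302, 140710, 151512, 163047, 172000, 181667, 192712);
$iSeconds = 2691; // Total CD length: 2691 seconds = 44 minutes and 51 seconds

// Compute: concatenate the length of every track but the last as 5-digit hex
$qq = '';
for ($i = 1; $i < $nTracks; $i++) {
    $diff = $aryOffsets[$i] - $aryOffsets[$i - 1];
    $qq .= sprintf("%05x", $diff);
}
// MD5 hash + number of tracks (2 hex digits) + length in seconds (6 hex digits)
$cdid = md5($qq) . sprintf("%02x", $nTracks) . sprintf("%06x", $iSeconds);

// Output
echo "CDID: $cdid";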

This snippet will output:
CDID: 35cba5e3d204a32b0c4328f4c369cbda12000a83
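
Because the last 8 hex digits of a CD ID are not hashed, the number of tracks and the total length can be read back from the ID itself. The following is a minimal sketch of that, assuming the 40-digit layout described above; the helper name parseCdId is hypothetical and not part of Open CD Index.

```php
<?php
// Hypothetical helper (not part of Open CD Index): splits a 40-digit CD ID
// into its three parts.
function parseCdId(string $cdid): array
{
    if (!preg_match('/^[0-9a-f]{40}$/', $cdid)) {
        throw new InvalidArgumentException('Expected 40 lowercase hex digits');
    }
    return [
        'hash'    => substr($cdid, 0, 32),         // MD5 of the track lengths
        'tracks'  => hexdec(substr($cdid, 32, 2)), // number of tracks
        'seconds' => hexdec(substr($cdid, 34, 6)), // total length in seconds
    ];
}

// The CD ID printed by the sample above
$parts = parseCdId('35cba5e3d204a32b0c4328f4c369cbda12000a83');
echo $parts['tracks'], ' tracks, ', $parts['seconds'], " seconds\n";
// 18 tracks, 2691 seconds
```

The MD5 part cannot be reversed into track lengths, so two CD IDs can only be compared for equality on that part.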