heyrick
« on: March 06, 2011, 05:43:05 pm » |
** If you're looking for software to repair damaged MP4 files, you won't find it here! **
I am looking into the viability of having some way of being able to attempt to "repair" damaged MP4 files. What follows below are my notes, though it looks like it is ridiculously complicated.
For what it is worth, when my other PVR used to screw up finalising files, I tried a good few programs to recover MP4 data and I can say that exactly NONE of them worked. Hell, half of them didn't even seem to understand the file format. Things may have changed in a year, but honestly I don't expect much. The only thing I can be happy about is that in my time of owning the OSD, it only messed up finalising Zatoichi, and I was able to get the DVD from Amazon (used&new) for less than the cost of the postage. ;-)
Anyway, here's a rough guide to the murky insides of the mp4 file.
God almighty. It looks like there's a giant TOC at the end of the file, so in the case of a failed write pretty much nothing is pre-defined (only the "ftyp").
The structure of a valid file is:
iso base
'--- ftyp = "isom" This specifies the file type (the major brand), plus a compatible type, here "mp41" (MPEG-4 revision 1). It also seems as if "isom" is a tag more intended for in-house work, and when "in the wild" the file should specify its type correctly (i.e. "mp41").
It looks like the QuickTime references ought to be the best source here, as the later "mp42" format is more in the realm of ISO, and official standards ain't free...
'----- mdat Gives size of movie data, plus the sample count, which would appear to be how many fragments there are. Also there is a non-parsed size (huh?).
This may or may not be present? It looks like some data is present here in the non-finalised file, but the "mdat" ID and the word before are empty.
Note - the *actual* data following resides elsewhere.
'--- track (id #1) Defines the first track. Doesn't actually hold its own data, but instead is a sequence of chunks. In our test file, the first three chunks are at: +2756 +141195 +173453 Note the chunk sizes are NOT regular. The first word of each chunk is "00 00 01 b6".
'--- track (id #2) Defines the second track. It appears these ones are written before the video data, and they are smaller, so I think we can assume this is audio data. Test file, first three: +8 +139383 +171770 Which if we calculate offsets from track one, would be: -2748 -1812 -1683 So we cannot rely upon any offset relationship. The only thing that seems pretty much for certain is the first byte is &21, but this may not be true for long recordings.
'----- moov Is apparently a "movie" atom as opposed to a "movie data" atom. I figure the difference is the data is the raw data (duh!) and this is the offset table. The type is "moov", and it is largish containing a long list of offsets. Example data: 36 2784 139411 which don't seem to correlate to the offsets above?
'--- mvhd Movie header, gives length, speed, some sort of matrix... Of note, the tracks are listed backwards: 2,1
'--- iods Unknown, "Initial Object Descriptor". Data is: 00 00 00 21 69 6F 64 73 00 00 00 00 10 13 00 4F FF FF 0F 01 FF 0E 04 00 00 00 02 0E 04 00 00 00 01 What is this pointing to? Or meaning?
For what it is worth, it is debatable if "mp41" should even have an "iods", as this is an "mp42" enhancement by the looks of things.
'--- "trak" (id = 2)
'--- "tkhd" Track header, seems to dupe a lot of the info in "mvhd"?
'--- "edts"
'--- "elst" Edits, edit list? Seems to list media rate and segment duration.
'--- "mdia"
'--- "mdhd" Media header, gives duration (centiseconds?). It is of note that the "mvhd" gives a "timescale" of 90000 while here the timescale is 1000. Additionally, the language seems to be undefined.
'--- "hdlr" Description, which is handler "soun", track type "Sound Track", name "Sound Media Handler".
'--- "minf" Media info.
'--- "smhd" Sound media handler.
'--- "dinf" '--- "dref" '--- "url" Empty pointer to a URL; this is used by RTP and some streaming media.
'--- "stbl" Sample table
'--- "stsd" Sample table description.
'--- "mp4a" Now we know it is MP4A, I guess we can assume AAC instead of MP3. Sample rate is 16000.0, sample size is 16, and 2 channels.
'--- "esds" Extended descriptors - there are three extended descriptor tags, which are all 128. Oddly enough this is the bitrate. Coincidence? ;-)
'--- "stts" Decoding time to sample. Gives sample count and delta, usually 64 but may vary. This may be impossible to regenerate, so we might have to just push all to be "64" and accept that some random audio mis-syncs are better than a junk file.
'--- "stsc" Appears to list all samples in order, specifying the description index as "1" and the samples per chunk as "2" for all of them.
'--- "stsz" Lists the size of each sample. Good luck. ;-) For what it's worth, the first three are: 2033, 715, 868. Yeah, again, another sequence of digits which are gibberish compared to previous data.
'--- "stco" Sample chunk offset. 36, 139411, 171798, which is every other (odd) entry in the "moov" list.
'--- "trak" (id = 1) [NOTE 1 follows 2]
'--- "tkhd" Track header, seems to dupe a lot of the info in "mvhd"? There is a "height" and a "width" which is specified as "480" and "640" respectively. This would correlate to the record dimensions (640x480).
'--- "edts"
'--- "elst" Edits, edit list? Seems to list media rate and segment duration.
'--- "mdia"
'--- "mdhd" Media header, gives duration (centiseconds?). As for the audio track, the timescale is 1000.
'--- "hdlr" Description, which is handler "vide" (that means empty in French! ;-) ), track type "Video Track", name "Video Media Handler".
'--- "minf" Media info.
'--- "vmhd" Video media handler. Graphicsmode=0, opcolor=0,0,0
'--- "dinf" '--- "dref" '--- "url" As before, empty URL pointer.
'--- "stbl" Sample table
'--- "stsd" Sample table description.
'--- "mp4v" Now we know it is MPEG-4 Simple Profile, H.263. Of note: compressorname = undefined, depth = 24, framecount = 1, height = 480, horizontal resolution = 72ppi, vertical resolution = 72ppi, width = 640
'--- "esds" Extended descriptors - which are 59,0,0. Significance?
'--- "stts" Decoding time to sample. Gives sample count and delta, usu. 40 but may vary (38-42).
'--- "stsc" Appears to list all samples in order, specifying the description index as "1" and the samples per chunk as... 5, 3, 3, 3, 3, 4, 3, 3, 3, 3, 4, 3, 3, 3, 3 [etc] I didn't see another 5 in the example file, so can we assume it is "4 3 3 3 3" repeated?
'--- "stsz" Lists the size of each sample. Good luck. ;-) For what it's worth, the first four are: 37226, 36283, 42108, 9187. I give the 4th to show that we really can't make any assumptions.
'--- "stco" Sample chunk offset. 2784, 141223, 173481, which is every other (even) entry in the "moov" list.
'--- "stss" Sync sample, the list is: 1, 2, 3, 16, 31, 46, 61, 76, 91, 106, 121... ??? Keyframes?
I do not, as yet, know where in the file a lot of this stuff is actually located.
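One relationship can at least be checked from the numbers in the tree: if the audio "stco" offset is taken as the start of a chunk, and the "stsc" samples-per-chunk count says how many "stsz" sizes belong to it, then the chunk should end where the next one begins. A minimal sketch, using only the values quoted above (the function name is mine, purely for illustration):

```python
# Values quoted in the notes (first audio chunk only).
audio_stco_first = 36            # first audio chunk offset, from "stco"
audio_samples_per_chunk = 2      # from "stsc"
audio_stsz = [2033, 715, 868]    # first three sample sizes, from "stsz"
video_stco_first = 2784          # first video chunk offset, from "stco"


def sample_offsets(chunk_offset, sizes):
    """Return the start offset of each sample in one chunk, plus the end offset."""
    offsets = []
    pos = chunk_offset
    for size in sizes:
        offsets.append(pos)
        pos += size
    return offsets, pos


offsets, chunk_end = sample_offsets(
    audio_stco_first, audio_stsz[:audio_samples_per_chunk]
)
print(offsets, chunk_end)
print(chunk_end == video_stco_first)
```

For this one chunk, the two audio samples end exactly where the first video chunk starts, so the "stsz" numbers may not be gibberish after all. This is only one data point from one test file, so treat it as a hint rather than a rule.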
There are a few things in our favour. The OSD uses, primarily, one video codec (so-called "industry standard mpeg4" which is technically meaningless; what we mean is MPEG-4 Part 2 Simple Profile, the H.263-derived flavour, sort of XviD like, only in a different wrapper) and two audio codecs (MP3 and AAC). As the writing library is the same in each version, it may be possible to calculate a lot of the restoration data by brute force, by taking apart enough "osd.mp4" files to get an idea of *where* the first samples are written.
At its most basic level, once we step beyond the null header, the file consists of a series of "atoms", each of which should carry a size. If this is true, then we may be able to recover data. If this is not true, we might be able to restore something by low-level searching for the marker that video chunks appear to begin with ("00 00 01 b6"), and as the data is interleaved, we could infer where the audio chunks are. Again, this is all pretty hairy, but as we'd only be dealing with the OSD's output...
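Both ideas (stepping through atoms by their size word, and falling back to a raw search for the video marker) can be sketched in a few lines. This is a rough illustration on a synthetic buffer, not a repair tool: the function names are made up, the test data is invented, and the special size values 0 and 1 that the format defines are simply treated as "stop here".

```python
import struct

VOP_MARKER = bytes.fromhex("000001b6")  # first word of each video chunk, per the notes


def walk_atoms(data, start=0):
    """Yield (offset, type, size) per atom; stop at the first implausible size."""
    pos = start
    while pos + 8 <= len(data):
        size, kind = struct.unpack(">I4s", data[pos:pos + 8])
        if size < 8 or pos + size > len(data):
            break  # e.g. the empty size word of a non-finalised "mdat"
        yield pos, kind.decode("latin-1"), size
        pos += size


def find_markers(data, marker=VOP_MARKER):
    """Return every offset at which the marker bytes occur."""
    positions = []
    pos = data.find(marker)
    while pos != -1:
        positions.append(pos)
        pos = data.find(marker, pos + 1)
    return positions


# Synthetic example: a 16-byte "ftyp", an "mdat" with two fake video chunks, an empty "moov".
ftyp = struct.pack(">I4s4sI", 16, b"ftyp", b"isom", 0)
payload = VOP_MARKER + b"\x00" * 4 + VOP_MARKER + b"\xff" * 2
mdat = struct.pack(">I4s", 8 + len(payload), b"mdat") + payload
moov = struct.pack(">I4s", 8, b"moov")
sample = ftyp + mdat + moov

atoms = list(walk_atoms(sample))
markers = find_markers(sample)
print(atoms)
print(markers)
```

On a damaged file the atom walk would be expected to stop early, at which point the marker search is the only thing left to go on. Note that a raw byte search can also hit the same four bytes by chance inside audio or video data.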
We would need user assistance for: dimensions; audio codec and bitrate.
Yeah, and, um, exactly where is the fps specified?
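One possible answer, offered as a guess: the frame rate may not be stored as such, but falls out of the "mdhd" timescale and the "stts" delta. The arithmetic below uses the figures from the tree; the 1024 samples per AAC frame is my assumption (it is the usual AAC-LC frame length, but the notes don't say which profile the OSD uses).

```python
# Video: "mdhd" timescale and the usual "stts" delta from the notes.
video_timescale = 1000
video_delta = 40
fps = video_timescale / video_delta
print(fps)

# Audio: sample rate from "mp4a", timescale from "mdhd".
audio_timescale = 1000
audio_rate = 16000
aac_frame_samples = 1024  # ASSUMPTION: standard AAC-LC frame length
audio_delta = aac_frame_samples * audio_timescale / audio_rate
print(audio_delta)
```

If that holds, a delta of 40 means 25 frames per second, and the audio delta of 64 is just one AAC frame at 16 kHz, which would make pushing everything to "64" less of a gamble than it sounds.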
Right, so here is a rough overview of the file at low level:
It begins:
00 00 00 01 f t y p i s o m 00 00 00 00 m p 4 1 00 00 00 00 00 00 00 00 xx xx xx xx m d a t 21 5F FE 80 00 64 0F C8 64 0F C8 DF
The xx xx xx xx is a pointer to the "moov" element. It seems to be a couple of words early, so it might be "offset from <X>" rather than from the start of the file. This and the following "mdat" are not present in the non-finalised file. Byte order is big-endian: 00 59 1E 88 is the address &00591E88 (not &881E5900 or anything weirder).
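For anyone poking at this with a script, the byte order point looks like this in practice. A tiny sketch using the example pointer bytes from above:

```python
pointer_bytes = bytes.fromhex("00591E88")

# Most significant byte first, which is how the file stores it.
big = int.from_bytes(pointer_bytes, "big")

# The "anything weirder" reading, shown only for contrast.
little = int.from_bytes(pointer_bytes, "little")

print(hex(big), hex(little))
```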
The 215F....C8DF seems to be the same? Purpose? It is the first chunk of track #2, so perhaps it is the MP4A header or somesuch?
This claims to be at offset +8, but it is really at address &24 in the file. Prior is the "mdat" and its pointer.
Following the pointer, we reach the end of the data and end up (after advancing the same seven words; why can't this stuff start from address zero?) at the word before the "moov". It is 00 00 3F 58, which is 16216, which is the atom length.
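The "seven words" and the +8 versus &24 mismatch look like the same 28-byte shift. Comparing the per-track chunk offsets listed in the tree with the "stco" values gives a constant difference, which suggests (but doesn't prove) that the "stco" numbers are plain file addresses and the other offsets are counted from 28 bytes in. A quick check with the numbers from these notes:

```python
WORD = 4
shift = 7 * WORD  # the "seven words"

# Chunk offsets as listed under each track, and the matching "stco" entries.
reported_audio = [8, 139383, 171770]
stco_audio = [36, 139411, 171798]
reported_video = [2756, 141195, 173453]
stco_video = [2784, 141223, 173481]

audio_deltas = [s - r for s, r in zip(stco_audio, reported_audio)]
video_deltas = [s - r for s, r in zip(stco_video, reported_video)]
print(audio_deltas, video_deltas)

# The first audio chunk "claims" +8 but sits at &24.
first_chunk_address = 0x24
print(reported_audio[0] + shift == first_chunk_address)

# The word before "moov", read big-endian, is the atom length.
moov_length = int.from_bytes(bytes.fromhex("00003F58"), "big")
print(moov_length)
```

If the shift really is fixed by the size of the header the OSD writes, then it is one less unknown when trying to rebuild an offset table.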
And so for the TOC. Are the "mdat" track/chunk IDs built from the media description, or is it possible to step through the file to calculate the chunks?
So many questions... But one thing is for certain, any attempt to cut'n'paste header info will most likely result in plenty of failure. I might (*no* promises) play around with a damaged file to see if I can see what the bare minimum necessary to get a playable file actually is.
Okay, that's all for now.
Best wishes,
Rick.