Checksum on uploaded files

Why is there no checksum metadata on the uploaded files?
Managing files without any checksum is a huge headache!
This makes it really difficult to version video files correctly!
It also makes it nearly impossible to correctly sync a catalog of video files!

The checksums are not visible through the uploadToken API or the media API.

Hello @jo2012,

I agree with you that calculating the checksum of the source file on the server side can help in the event that something in the upload process has gone wrong and the resulting file is corrupted; there’s no doubt about it.

However, when you look realistically at the chances of that happening, you’ll realise that they are pretty slim.
The upload is done over TCP, so faulty packets will be retransmitted, and even if the file does end up being corrupt, the transcoding process will most likely fail.
True, in such cases, comparing the hash of the original file on the client side with that of the file on the server side is the easiest way to understand the cause of the conversion failure, but from our experience, this is quite rare.
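
For illustration, the comparison mentioned above could be sketched as follows. This is only a minimal example under stated assumptions: SHA-256 is an arbitrary choice of hash, `file_sha256` and `same_content` are hypothetical helper names, and the second file is assumed to be a copy of the server-side source obtained by some other means, since the APIs discussed in this thread do not expose a checksum.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read chunk by chunk so a large video file never sits fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(local_path, other_path):
    """Return True when both files hash to the same SHA-256 digest."""
    return file_sha256(local_path) == file_sha256(other_path)
```

Reading in chunks keeps memory use flat, but every byte still has to be read once, which is where the cost on very large sources comes from.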

In the event the transcoding process completed successfully, it is highly unlikely that the source file is incomplete/corrupted.

Considering the small number of cases in which the uploaded source file ends up being corrupted, along with the fact that calculating a hash on big files can be fairly resource-intensive [we do sometimes handle very large source files and, of course, process quite a lot of source files in parallel at any given moment], calculating the hash was deemed non-critical.

BTW, for ingestion via the drop folder mechanism, specifically, a checksum is calculated.

I’m also interested in the reason why you’re asking about it. Have you come across many such cases when ingesting files into the Kaltura server? I’m happy to look into any such issues.

I’m using the API, so switching to the drop folder would not be useful.
The checksum would have been a nice addition as metadata for comparing files quickly.
It was used as a standard in my previous project, where we needed to sync a collection of streaming files to a secondary NAS, an outside file/web server, Amazon, or even Limelight.
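
To illustrate the kind of sync check described here, a minimal client-side sketch follows. It assumes both the source catalog and the mirror are reachable as local or mounted directories, uses SHA-256 as an arbitrary hash choice, and the names `build_manifest` and `diff_manifests` are hypothetical; nothing in it touches the Kaltura API.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Hash in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def diff_manifests(source, mirror):
    """Return (missing, changed): names absent from mirror, and names whose digests differ."""
    missing = sorted(name for name in source if name not in mirror)
    changed = sorted(
        name for name in source if name in mirror and source[name] != mirror[name]
    )
    return missing, changed
```

With a checksum stored as metadata on each entry, the mirror side of this comparison would not need to re-read the files at all, which is the convenience being asked for.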

Since this is not the first time I have read that Kaltura is not interested in integrating checksums into the storage, I won’t insist.

I was just trying to provide more arguments for its usefulness.
:slight_smile:

thank you.

joel

Hi Joel,

Like I said, I understand where you’re coming from but I hope my reply also managed to convince you that, in most cases, calculating checksums for the uploaded sources isn’t critical, which is why we haven’t made an effort in that direction.