VideoHelp Forum




Results 1 to 15 of 15
  1. Member
    Join Date
    Jan 2015
    Location
    singapore
    Hi, I like to keep track of the checksums of important files. Sometimes I find two videos with the same content, the exact same playback length, resolution, size, and everything else such as audio rate. They appear to be exactly the same, yet the two files have different checksums (e.g. MD5 or SHA-1). Why could that happen? Does it mean there must be a very slight difference in quality (one that can't be seen with the naked eye)?
  2. Baldrick (I'm a MEGA Super Moderator)
    Join Date
    Aug 2000
    Location
    Sweden
    Exact same file size also?

    But you can't get any quality information from any checksums.
  3. Two files will have different checksums even if only a single bit differs, so a checksum is not useful for comparing quality. However, you can use checksums to periodically check the integrity of your own files. Files might become corrupted due to bad sectors on an HDD, or due to RAM problems during copying, etc.
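
    A minimal sketch of the one-bit point, in Python. The helper names (digest, flip_bit) and the sample data are made up for illustration, and SHA-256 is used only as a default; MD5 and SHA-1 react to a flipped bit in the same way.

```python
import hashlib


def digest(data: bytes, algo: str = "sha256") -> str:
    """Return the hex digest of a byte string for the named hash algorithm."""
    return hashlib.new(algo, data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


# Stand-in for the contents of a video file (hypothetical data).
original = b"stand-in for the bytes of a video file" * 1000
damaged = flip_bit(original, 12345)

print(len(original) == len(damaged))    # True: same size
print(digest(original) == digest(damaged))  # False: different checksum
```

    So a matching size, duration and resolution says nothing about whether every byte matches; only the checksum comparison does.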
  4. Member
    Join Date
    Jan 2015
    Location
    singapore
    Originally Posted by Detmek View Post
    Two files will have different checksums even if only a single bit differs, so a checksum is not useful for comparing quality. However, you can use checksums to periodically check the integrity of your own files. Files might become corrupted due to bad sectors on an HDD, or due to RAM problems during copying, etc.
    How can I make sure a file is perfect (identical to the copy on the server) at the moment it is downloaded? Data corruption over time can be handled by keeping several copies and checking the checksums regularly.

    Could data be lost or corrupted during transmission from the remote side to the user side without being noticed? Most websites don't provide a checksum to the user.
  5. For normal data transfers, TCP should provide decent error protection.
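
    For the curious, here is a rough sketch of the kind of 16-bit ones' complement sum that TCP's checksum is built on (in the style of RFC 1071). It is simplified: the real TCP checksum also covers the header and a pseudo-header, and the function name and sample bytes here are only for illustration. It shows why "decent" is the right word: a changed byte is caught, but two swapped 16-bit words are not.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


segment = b"ABCDEFGH"
byte_error = b"ABCDEFGI"  # last byte changed
swapped = b"CDABEFGH"     # first two 16-bit words swapped

print(internet_checksum(segment) != internet_checksum(byte_error))  # True: detected
print(internet_checksum(segment) == internet_checksum(swapped))     # True: NOT detected
```

    Lower layers usually add their own, stronger checks on top of this, so undetected errors in transit should be rare, but the sum alone is not a guarantee.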
  6. Originally Posted by yanyancr View Post
    Hi, I like to keep track of the checksums of important files. Sometimes I find two videos with the same content, the exact same playback length, resolution, size, and everything else such as audio rate. They appear to be exactly the same, yet the two files have different checksums (e.g. MD5 or SHA-1). Why could that happen? Does it mean there must be a very slight difference in quality (one that can't be seen with the naked eye)?
    You're referring to an individual file and a copy of that file in another folder on the same PC, or on another PC, etc., I assume?

    For multimedia files that store data in various kinds of tags, and from which thumbnails might be generated, maybe they could be modified by a program accessing them. Theoretically it'd be the tags being updated, if a program were inclined to do that sort of thing automatically. I don't know... just a thought.

    I'll confess I've had so few issues with file corruption over the years that I don't really give it any thought. When a hard drive went bad years ago it corrupted files, but aside from that... maybe very occasionally Firefox will save a file that's corrupted and won't unzip or open, etc. But if it works as expected I've always assumed it's a perfect copy of the original. Is it possible it's not?
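
    One way to test the tag idea would be to look at where two same-size files differ rather than only at their checksums. The sketch below is just that, a sketch: differing_offsets is a made-up helper and the two temporary files are stand-ins. As a rough rule of thumb (not a guarantee), a small cluster of differences near the start or end of a file might hint at rewritten metadata, while differences scattered through the middle look more like corruption.

```python
import tempfile
from pathlib import Path


def differing_offsets(path_a, path_b, chunk_size=1 << 20, limit=20):
    """Return up to `limit` byte offsets at which two files differ."""
    offsets = []
    base = 0  # offset of the current chunk within the files
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while len(offsets) < limit:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                for i in range(max(len(a), len(b))):
                    # Slicing avoids an IndexError if one file is shorter.
                    if a[i:i + 1] != b[i:i + 1]:
                        offsets.append(base + i)
                        if len(offsets) == limit:
                            break
            base += max(len(a), len(b))
    return offsets


with tempfile.TemporaryDirectory() as tmp:
    file_a = Path(tmp) / "a.bin"
    file_b = Path(tmp) / "b.bin"
    payload = bytes(range(256)) * 64
    file_a.write_bytes(payload)
    # Same size, but four bytes near the start rewritten (stand-in for a tag update).
    file_b.write_bytes(payload[:16] + b"\xff\xff\xff\xff" + payload[20:])
    demo_offsets = differing_offsets(file_a, file_b)

print(demo_offsets)  # [16, 17, 18, 19]
```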
  7. Originally Posted by yanyancr View Post
    How can I make sure a file is perfect (identical to the copy on the server) at the moment it is downloaded? Data corruption over time can be handled by keeping several copies and checking the checksums regularly.

    Could data be lost or corrupted during transmission from the remote side to the user side without being noticed? Most websites don't provide a checksum to the user.

    You can't do anything about files on servers you don't own, such as YouTube or online storage where you did not upload the file yourself, etc. But if you want to back up your own files, you can add some data redundancy.

    You could store your files in a RAR archive and add a recovery record. That helps keep the files intact: even if you experience some file corruption, the recovery record should be able to recover the original files. The downside is that you will always have to extract a file to open or play it.

    The second option is to use PAR2 recovery files. They are like the recovery record in a RAR archive, except that the information is stored as a separate file. If a file gets corrupted, you can run QuickPar to restore your original file. The bigger the recovery data, the better the protection, but the more space it takes. Usually 5% should do the job. There isn't any real downside, as the files are not packed or modified.
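
    Before anything can be repaired, a tool has to find out which part of the file is damaged. The sketch below shows one simple way to do that with per-block checksums. It is not the PAR2 file format; block_digests, damaged_blocks and the block size are assumptions for illustration, and it assumes the file length has not changed.

```python
import hashlib


def block_digests(data: bytes, block_size: int) -> list[str]:
    """Hash each fixed-size block of the data separately."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def damaged_blocks(data: bytes, manifest: list[str], block_size: int) -> list[int]:
    """Return the indexes of blocks whose digest no longer matches the manifest."""
    current = block_digests(data, block_size)
    return [i for i, (old, new) in enumerate(zip(manifest, current)) if old != new]


original = bytes(range(250)) * 4          # 1000 bytes of stand-in data
manifest = block_digests(original, 100)   # saved while the file is known to be good

corrupted = bytearray(original)
corrupted[250] ^= 0xFF                    # damage one byte inside block 2

print(damaged_blocks(bytes(corrupted), manifest, 100))  # [2]
```

    A whole-file checksum only says "something changed"; a per-block list narrows it down to the blocks that need repairing.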
  8. Member
    Join Date
    Jan 2015
    Location
    singapore
    Originally Posted by Detmek View Post
    You can't do anything about files on servers you don't own, such as YouTube or online storage where you did not upload the file yourself, etc. But if you want to back up your own files, you can add some data redundancy.

    You could store your files in a RAR archive and add a recovery record. That helps keep the files intact: even if you experience some file corruption, the recovery record should be able to recover the original files. The downside is that you will always have to extract a file to open or play it.

    The second option is to use PAR2 recovery files. They are like the recovery record in a RAR archive, except that the information is stored as a separate file. If a file gets corrupted, you can run QuickPar to restore your original file. The bigger the recovery data, the better the protection, but the more space it takes. Usually 5% should do the job. There isn't any real downside, as the files are not packed or modified.

    Hi, I don't mean downloading YouTube videos or doing anything with files on servers that I don't own. What I mean is: when they are 'giving' me a file (i.e. I am downloading it), how can I make sure that the file I receive is an exact copy of the one they sent? In most cases they don't provide a checksum.

    Store my files in a RAR archive and add a recovery record? I know about RAR; do I have to split the file into several RAR parts? And what is a recovery record? Sorry, I hope you can explain more about this.
  9. Banned
    Join Date
    Oct 2014
    Location
    Northern California
    Originally Posted by yanyancr View Post
    What I mean is: when they are 'giving' me a file (i.e. I am downloading it), how can I make sure that the file I receive is an exact copy of the one they sent?
    Ahh, you can't!

    How do you know the light turns off when you close the door of the fridge?

    But seriously, why worry? Transmission errors are rare, as there are already integrity checks during transmission.

  10. Originally Posted by yanyancr View Post
    Hi, I don't mean downloading YouTube videos or doing anything with files on servers that I don't own. What I mean is: when they are 'giving' me a file (i.e. I am downloading it), how can I make sure that the file I receive is an exact copy of the one they sent? In most cases they don't provide a checksum.

    Store my files in a RAR archive and add a recovery record? I know about RAR; do I have to split the file into several RAR parts? And what is a recovery record? Sorry, I hope you can explain more about this.
    If they do not provide a checksum of the source, you cannot verify that the file was downloaded without errors. But, depending on the file type, you can often tell when a file was downloaded with errors: installers won't run, archives won't extract, and music or video files will have artifacts or won't play at all.
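
    The "archives won't extract" case can be checked without extracting anything, because archive formats usually carry their own per-file checksums. The sketch below uses ZIP only because Python's standard library can read it (RAR needs a separate tool); first_bad_member and the sample data are made up for illustration. It builds a small archive in memory, flips one byte, and tests both copies.

```python
import io
import zipfile


def first_bad_member(archive_bytes: bytes):
    """Return the name of the first member that fails its CRC check, or None."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.testzip()


payload = b"stand-in for downloaded data " * 100

# Build a small uncompressed archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("video.bin", payload)
good = buf.getvalue()

# Flip one byte inside the stored payload; the archive size is unchanged.
pos = good.index(payload) + 10
bad = good[:pos] + bytes([good[pos] ^ 0xFF]) + good[pos + 1:]

print(len(good) == len(bad))   # True
print(first_bad_member(good))  # None
print(first_bad_member(bad))   # video.bin
```

    This only works for formats with built-in checksums. A plain MP4 or MP3 generally has nothing comparable, which is why a damaged one may simply glitch during playback.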

    When you right-click a file and choose Add to archive, you get a window with WinRAR's compression options. One of those options enables the recovery record. On the second tab you can choose the size of the recovery record.

    The recovery record uses an advanced algorithm that scans the whole file and generates additional data, which can be used to recover any part of the RAR archive that is missing or damaged (within the limits of the record's size). The additional data is stored in the same archive.

    You don't need to split the archive, but you can if a single archive becomes very large and cannot be stored on the chosen media (an HDD with FAT32 partitions, a USB drive, a DVD, etc.). The recovery record works with a single archive or a multipart archive.
    [Attachment 29853: Image 011.png]
    [Attachment 29852: Image 012.png]
    This is useful for storing files as a backup. For files in regular use it can be a problem, as not many applications can read from archives. For those files, PAR2 recovery files are easier, because the recovery data is stored as a separate file and the original files stay as-is.
    Last edited by Detmek; 22nd Jan 2015 at 12:17.
  11. Member
    Join Date
    Jan 2015
    Location
    singapore
    Originally Posted by Detmek View Post
    If they do not provide a checksum of the source, you cannot verify that the file was downloaded without errors. But, depending on the file type, you can often tell when a file was downloaded with errors: installers won't run, archives won't extract, and music or video files will have artifacts or won't play at all.

    When you right-click a file and choose Add to archive, you get a window with WinRAR's compression options. One of those options enables the recovery record. On the second tab you can choose the size of the recovery record.

    The recovery record uses an advanced algorithm that scans the whole file and generates additional data, which can be used to recover any part of the RAR archive that is missing or damaged (within the limits of the record's size). The additional data is stored in the same archive.

    You don't need to split the archive, but you can if a single archive becomes very large and cannot be stored on the chosen media (an HDD with FAT32 partitions, a USB drive, a DVD, etc.). The recovery record works with a single archive or a multipart archive.
    [Attachment 29853]
    [Attachment 29852]

    This is useful for storing files as a backup. For files in regular use it can be a problem, as not many applications can read from archives. For those files, PAR2 recovery files are easier, because the recovery data is stored as a separate file and the original files stay as-is.

    Thank you very much for the explanation. Just one more question: what does the size of the recovery record mean? What number should I choose, and what do the numbers mean?

    Thank you again.
  12. Originally Posted by yanyancr View Post


    Thank you very much for the explanation. Just one more question: what does the size of the recovery record mean? What number should I choose, and what do the numbers mean?
    The size is the amount of additional data. The higher the value, the larger the archive on the HDD, but the higher the probability that a damaged archive can be repaired or recovered.

    5% means 5% additional size. Only you can determine how paranoid you are about data loss. People typically use around 5-10%.

    A recovery record is NOT a backup solution. If things are important to you, make multiple copies and store them in multiple different locations.
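
    As a rough worked example of the percentages above, assuming the recovery data is simply that percentage of the archive size (the size a real archiver writes may differ slightly; the function name and the 1 GB figure are only for illustration):

```python
def archive_size_with_recovery(data_bytes: int, recovery_percent: float) -> int:
    """Estimate the total size of an archive plus its recovery data."""
    return data_bytes + round(data_bytes * recovery_percent / 100)


one_gb = 1_000_000_000
print(archive_size_with_recovery(one_gb, 5))   # 1050000000
print(archive_size_with_recovery(one_gb, 10))  # 1100000000
```

    So on a 1 GB archive the choice between 5% and 10% costs about 50 MB of extra space.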
  13. Member
    Join Date
    Jan 2015
    Location
    singapore
    Originally Posted by poisondeathray View Post
    The size is the amount of additional data. The higher the value, the larger the archive on the HDD, but the higher the probability that a damaged archive can be repaired or recovered.

    5% means 5% additional size. Only you can determine how paranoid you are about data loss. People typically use around 5-10%.

    A recovery record is NOT a backup solution. If things are important to you, make multiple copies and store them in multiple different locations.
    Yes, I agree that making multiple copies is the best bet.

    But 5% additional size should only be able to back up 5% of the information. Have I understood that correctly? When corruption occurs in a file, it strikes at random and does not choose where to strike. It's like a perfect chocolate bar where only 5-10% of the surface is protected with tin foil: what about the other 90-95%? That unprotected area is still highly susceptible to attack by ants or rats...

    I think I must have misunderstood something, and I hope you can correct me.

    And do you see it? This is a screenshot I just captured from JDownloader:

    The two files downloaded 100% and their sizes match, but the MD5/CRC check failed. I tried to play the MP4: it plays fine but sometimes stops for a few seconds. WinRAR returns an error message for the RAR archive and cannot extract it at all.
    Last edited by yanyancr; 23rd Jan 2015 at 04:10.
  14. In theory you can use "checksums" to verify quality and compare files: pick the same metrics for both files and do the comparison on the basis of those metrics. Beware that the quality of the metrics themselves, and how well they correlate with perceived quality, is a completely different topic.

    http://google.com/?q=human+visual+system+metrics

    http://google.com/?q=human+auditory+system+metrics

    Error protection is an integral part of any reasonable data protocol. MD5 or similar checksums are good for verifying the integrity of a system; however, they do not provide error protection, only error detection, and only for all bits at once.
    It is inefficient to add error protection to a file, as error protection should exploit the characteristics of the protocol.
    In most situations all data is coded in a way that provides an additional level of error protection and error detection (on HDDs, SSDs, etc. as well).

    This may be interesting for you: http://en.wikipedia.org/wiki/Forward_error_correction - 5% additional data bits may provide a significantly higher chance of recovering all the data bits. There are multiple factors that improve file integrity and error robustness.
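
    To make the 5% point concrete, here is a toy sketch using plain XOR parity. This is a much simpler scheme than the Reed-Solomon style codes that tools such as PAR2 are built on, and all names in it are made up for illustration. With 20 data blocks and 1 parity block (5% extra), the parity does not protect one particular 5% of the data: it can rebuild whichever single block turns out to be lost, provided you know which block that is.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """The parity block is the XOR of all data blocks."""
    return xor_blocks(data_blocks)


def recover_missing(blocks_with_gap, parity):
    """Rebuild the single block marked None from the survivors plus parity."""
    survivors = [b for b in blocks_with_gap if b is not None]
    return xor_blocks(survivors + [parity])


# 20 data blocks plus 1 parity block: 5% extra data.
data_blocks = [bytes([i]) * 64 for i in range(20)]
parity = make_parity(data_blocks)

# Whichever single block is lost, the same parity block rebuilds it.
recovered_ok = []
for lost in range(20):
    damaged = list(data_blocks)
    damaged[lost] = None
    recovered_ok.append(recover_missing(damaged, parity) == data_blocks[lost])

print(all(recovered_ok))  # True
```

    In chocolate-bar terms, the extra 5% is less like foil over one corner and more like a recipe that can re-make any one square. Lose two blocks at once and this toy scheme fails, which is where larger recovery percentages and stronger codes come in.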
    Last edited by pandy; 23rd Jan 2015 at 04:12.
  15. Member
    Join Date
    Jan 2015
    Location
    singapore
    Originally Posted by pandy View Post
    In theory you can use "checksums" to verify quality and compare files: pick the same metrics for both files and do the comparison on the basis of those metrics. Beware that the quality of the metrics themselves, and how well they correlate with perceived quality, is a completely different topic.

    http://google.com/?q=human+visual+system+metrics

    http://google.com/?q=human+auditory+system+metrics

    Error protection is an integral part of any reasonable data protocol. MD5 or similar checksums are good for verifying the integrity of a system; however, they do not provide error protection, only error detection, and only for all bits at once.
    It is inefficient to add error protection to a file, as error protection should exploit the characteristics of the protocol.
    In most situations all data is coded in a way that provides an additional level of error protection and error detection (on HDDs, SSDs, etc. as well).

    This may be interesting for you: http://en.wikipedia.org/wiki/Forward_error_correction - 5% additional data bits may provide a significantly higher chance of recovering all the data bits. There are multiple factors that improve file integrity and error robustness.

    Thank you.
    I know that Chrome can stop a download without any sign or failure message, that IDM acknowledges it can download corrupted files, and that bad RAM and bit rot exist. All of these could lead to errors.