Backstory: in addition to offsite/cloud, I maintain two local copies of captures: one on a NAS and the other on a local PC. The PC copy is NTFS on high-quality drives (WD Blacks, Red Pros, or Seagate Exos X-series); the NAS was ZFS with duplication for some years, then later migrated to Stablebit Drivepool running on high-reliability drives (Red Pro, Ironwolf Pro, or Exos X). In short, all fairly robust, quality hardware and software. I typically capture on one PC then, post-capture, copy the file(s) to these two remote destinations. I use BeyondCompare to regularly compare directory trees between the two systems to keep things in sync, but I generally have not done bit-level comparisons... there is on the order of 30+TB worth of captures, so bit-level would take forever!
I had to do some data migration recently, so I decided to do some bit-level compares for the first time in a long while before moving data. To my surprise, across about 12TB that I compared between the two systems, two files showed up with a single-bit difference between copies. One is a 47GB SD DV capture, the other is a 209GB huffyuv file. In both cases a single bit is flipped, and I don't have a good way to check offhand which is the 'good' file. I can pull down the cloud copy, but given the sizes... ugh. I know where in each file the flip has occurred, so I'm going to try arithmetic + VirtualDub to see if I can home in on the frame in each file where the error occurred -- hopefully it'll manifest as visible corruption.
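For the DV capture at least, that arithmetic is straightforward, since SD DV frames are fixed-size (120,000 bytes NTSC, 144,000 bytes PAL). A rough Python sketch -- the byte offset shown is hypothetical, and container headers/index in an AVI will shift the true offset slightly:

```python
# Sketch: map a flipped bit's byte offset to a frame number in a raw SD DV
# stream. DV frames are fixed-size: 120,000 bytes (NTSC) or 144,000 (PAL).
# Note: an AVI container's headers/index shift offsets slightly, and huffyuv
# frames are variable-size, so this only works for fixed-frame formats.
DV_FRAME_BYTES_NTSC = 120_000

def frame_of_offset(byte_offset, frame_bytes=DV_FRAME_BYTES_NTSC):
    """Zero-based index of the frame containing the given byte offset."""
    return byte_offset // frame_bytes

# Hypothetical offset reported by the bit-level compare:
print(frame_of_offset(6_000_123_456))  # frame 50001
```

Jumping to that frame number in VirtualDub should put the suspect frame on screen.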
Interestingly, both files were captured in early 2013, within a couple of months of each other.
While I need to do the post-mortem on how this occurred in the first place, and put in measures to ensure it doesn't happen again, what I'd like to know is whether there is an automated test mechanism I can run against captures, that is codec aware, to detect corruption issues. Similar to using 7-zip to test an archive. Since I often capture but do not post-process (unless there's an immediate need to) I'm counting on the integrity of these files both at capture-time and during deep storage. Not having an action in my workflow to verify integrity (both post-capture and routinely as part of long-term storage) is a big hole in my workflow. Tools that can help with this would be useful!
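For the codec-aware part, one widely used approach is to have FFmpeg fully decode the file to a null output with only errors logged; any stderr output flags a decode problem. A small Python sketch that builds that command line (the file name is hypothetical; actually running it requires ffmpeg on PATH):

```python
def decode_check_cmd(path):
    """Build an ffmpeg command line that fully decodes a file, throws the
    decoded output away, and prints only decode errors -- a codec-aware
    integrity test, similar in spirit to 7-Zip's archive test."""
    return ["ffmpeg", "-v", "error", "-i", path, "-f", "null", "-"]

# To actually run it (requires ffmpeg on PATH):
#   import subprocess
#   result = subprocess.run(decode_check_cmd("capture.avi"),
#                           capture_output=True, text=True)
#   # any text in result.stderr indicates decode problems
print(" ".join(decode_check_cmd("capture.avi")))
```

Looping this over a directory tree and logging the stderr per file would give a routine deep-storage scrub; note a decode pass can only catch corruption the codec notices, so a flipped bit in uncompressed data may still slip through silently.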
I use Multipar, which is a frontend for handling PAR2 files. PAR2 can check whether any bits have changed since you first created the PAR2, and, at the cost of using up more storage, it can fix damaged files with repair blocks. One repair block can fix one damaged block of the original file. You can tell the program how big or small you want the blocks, with big blocks being more efficient but less good for random damage scattered throughout a file.
I normally put as much as I can on a BD-R, then for the remaining 200MB I just fill it with PAR2 data, which will cover the data should some minor problem arise in the future.
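To put numbers on the block-size trade-off, here's a quick Python sketch estimating PAR2 source/repair block counts; the 47 GB size, 1 MiB block size, and 5% redundancy are illustrative, not a recommendation:

```python
def par2_blocks(file_bytes, block_bytes, redundancy_pct):
    """Estimate PAR2 source and repair block counts. Smaller blocks survive
    scattered damage better; bigger blocks mean less per-block overhead."""
    source = (file_bytes + block_bytes - 1) // block_bytes   # ceiling division
    repair = (source * redundancy_pct + 99) // 100           # ceiling of pct
    return source, repair

# Illustrative: a 47 GB capture, 1 MiB blocks, 5% redundancy
print(par2_blocks(47 * 10**9, 2**20, 5))  # (44823, 2242)
```

Each of those 2242 repair blocks can stand in for any one damaged source block, so 5% redundancy here tolerates up to ~2.3 GB of damage, provided it isn't spread across more than 2242 distinct blocks.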
You can try Video Comparer (there's a fully functional free version with a 500-file limit) in Full mode on your bad video(s), but it's really slow. A test I did for this thread took ~12 hours (not the days that I thought it would) to scan 1113 files, 1.6TB, in Through Mode (one step below Full).
What themaster1 and KarMa (hey, I just realized it's not Kar Ma, but karma! ) said is all you can and should do. Verify the files during copying and use PAR files for repairs. You can also RAR the files and add REV (recovery) files, which are essentially the same as PAR files. None of these will be perfect, but it's more likely that a bit flip or corruption will occur during copying/moving than directly on the disc, optical or hard drive.
Bottom line: test often and keep a log, and you'll know when and how the corruption happened.
No hard drive or PC is perfect.
There's always failure rates and error rates on HDDs - see the drive specs.
For the PC, same with RAM.
Unless you're using ECC RAM, there's no error correction if a RAM bit happens to go bad (the cause can be anything from static to gamma rays).
Windows Server has features built in to verify and maintain the integrity of data across multiple drives.
Copy corruption - copy /v, TeraCopy, etc. can all do verify-after-write copies.
Drive corruption - RAID 5 can typically handle bit errors on a drive thanks to the parity it automatically creates.
Lots more ideas
Last edited by babygdav; 14th Feb 2020 at 01:40.