I've been wondering for a while now how I could OCR a full video.
I know there are programs like SubRip and AviSubDetector, but they aren't the best: they struggle with complicated fonts, they need a lot of manual input, and they are a bit clunky in general. For what I need them to do (OCR hardsubbed cartoon/anime episodes with complicated fonts), they aren't practical enough.

So I'm thinking: is there a way to use Google Keep's OCR to do this? It's the best OCR I know of at the moment and it recognised everything I threw at it, but it only accepts images that aren't larger than 4 MB.

I just discovered AutoIt, so I'm wondering: is there a way to automate the process of converting a video into a JPG sequence, cropping every image, uploading the first one to Google Keep, running the OCR, copying the result, pasting it somewhere, deleting the image and uploading the next one?
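To make the first two steps concrete, here is a rough sketch of how I imagine the frame extraction and cropping could be done in one go. It assumes ffmpeg is installed and that the subtitles sit in the bottom quarter of the picture; the function names, the 2 frames per second rate and the `frame_%06d.jpg` naming are just my own placeholders, not something any of the tools above require.

```python
from pathlib import Path

# The 4 MB upload limit mentioned above (assumed to mean 4 * 1024 * 1024 bytes).
KEEP_LIMIT_BYTES = 4 * 1024 * 1024


def build_ffmpeg_command(video, out_dir, fps=2, band=0.25):
    """Build an ffmpeg argument list that writes cropped JPEG frames.

    fps  -- how many frames per second to extract (hypothetical default)
    band -- fraction of the frame height to keep, measured from the bottom
    """
    # crop=w:h:x:y -> full width, bottom `band` of the height
    crop = f"crop=iw:ih*{band}:0:ih*{1 - band}"
    pattern = str(Path(out_dir) / "frame_%06d.jpg")
    return [
        "ffmpeg", "-i", str(video),
        "-vf", f"fps={fps},{crop}",
        "-q:v", "2",  # high JPEG quality; cropped frames should still be small
        pattern,
    ]


def too_large_for_keep(image_path):
    """Return True if the image exceeds the assumed 4 MB limit."""
    return Path(image_path).stat().st_size > KEEP_LIMIT_BYTES


# Example (not executed here):
#   import subprocess
#   subprocess.run(build_ffmpeg_command("episode.mkv", "frames"), check=True)
```

Cropping before the upload should also help with the size limit, since only the subtitle band is kept. The upload/OCR/copy part would still have to be automated separately.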

Obviously I haven't taken the whole "timing" question into consideration, so I'd be creating a text file with only the actual dialogue, without timestamps.
But maybe it can be done using AviSubDetector: when it stops because it has recognised a subtitle, I grab the timing from it with the AutoIt script and paste it into my text file.
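Another option for the timing, if the frames are extracted at a fixed rate, might be to derive it from the frame position instead of reading it from another program. This is only a sketch under that assumption: it takes one OCR result per frame (an empty string when no subtitle was found), merges consecutive identical results into one cue, and writes SubRip-style timestamps. All function names here are hypothetical.

```python
from itertools import groupby


def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (SubRip style)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def frames_to_cues(texts, fps):
    """Merge runs of identical OCR text into (start, end, text) cues.

    texts -- OCR result per extracted frame, in order; index 0 is the
             first frame, "" means no subtitle was detected
    fps   -- the rate the frames were extracted at
    """
    cues = []
    index = 0
    for text, group in groupby(texts):
        count = len(list(group))
        if text.strip():
            cues.append((index / fps, (index + count) / fps, text))
        index += count
    return cues


def to_srt(cues):
    """Render cues as the text of an .srt file."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    sample = ["", "Hello", "Hello", "", "Bye"]
    print(to_srt(frames_to_cues(sample, fps=2)))
```

The timing would only be as precise as the extraction rate (half a second at 2 frames per second), and small OCR differences between frames of the same subtitle would split it into several cues, so some fuzzy matching would probably be needed in practice.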

These are just ideas, but I'd like to know what you guys think.
Is there a better way to OCR an entire video with a Google OCR engine? (Not Tesseract; as far as I know, it's nowhere near Google Keep in terms of accuracy.)
Or maybe with an even better engine, if one exists.
Let me know.