VideoHelp Forum
  1. Hi everyone, I have this one goal to be able to control the caption signal of digital TV and using Google Translate to live translate these and output on TV screen so my parents and many other elders can understand if they're in a foreign country, can anyone point me to the right direction to learn or get help?
  2. Cornucopia (Member)
    Join Date
    Oct 2001
    Location
    Deep in the Heart of Texas
    You could start with CEA-708 captioning standard.

    Scott
  3. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    Hi everyone, I have this one goal to be able to control the caption signal of digital TV and using Google Translate to live translate these and output on TV screen so my parents and many other elders can understand if they're in a foreign country, can anyone point me to the right direction to learn or get help?
    What you want to do is probably more difficult and complex than you believe. You should think about limiting the scope of this project to reduce the complexity and difficulty. You would need a PC based solution and the PC would need to be equipped with an appropriate tuner for the broadcast system in the country your elders are visiting.

    Different countries use different broadcast systems. DVB-T, ATSC, ISDB-T, DTMB, PAL, and SECAM are still in use, and these broadcast systems use different captioning/subtitling technologies, some of which are graphics-based or code-based rather than text-based. The code-based subtitles would need to be decoded prior to being translated. The graphics-based subtitles would have to be accurately converted into text using OCR before being translated. Captioning/subtitling systems often provide dialog in multiple languages, as well as subtitles for the deaf and hard of hearing, so you would also need to decide which captions to translate.

    Real-time translations are probably not achievable at present. It is more likely that a delay would be needed to provide time to process the captions/subtitles. Also, OCR is rarely 100% accurate and Google Translate isn't always accurate either, so the end result will not always make a lot of sense.

    I cannot provide a lot of help for anything other than ATSC. CCExtractorGUI can decode ATSC CEA-608 subtitles, but the author is still working on ATSC CEA-708 subtitles. My knowledge of captioning systems other than those used for ATSC is minimal and I'm not sure about what software can be used to process them.
    Last edited by usually_quiet; 19th Nov 2016 at 11:11. Reason: correct typo
    Ignore list: hello_hello, tried, TechLord, Snoopy329
  4. Member
    Join Date
    Aug 2010
    Location
    San Francisco, California
    Originally Posted by devin View Post
    Hi everyone, I have this one goal to be able to control the caption signal of digital TV and using Google Translate to live translate these and output on TV screen so my parents and many other elders can understand if they're in a foreign country, can anyone point me to the right direction to learn or get help?
    How much are you paying?
  5. KarMa (Dinosaur Supervisor)
    Join Date
    Jul 2015
    Location
    US
    It seems that any country using DVB will use bitmap subtitles, which differ from the text-based captions in ATSC (used mostly in North America). With ATSC, you can easily copy the text to a small file with CCExtractorGUI, and your TV can even change the font of the characters since it's just text data. But DVB, the most prevalent OTA broadcast standard around the world, seems to use only bitmap-based subtitles. Much like the subtitles on many DVDs, bitmap subtitles are really just pictures of the words. That makes it difficult to copy the subtitles to a text file, since it requires an OCR program to look at the picture of the letters and figure out which letters it is actually looking at. This becomes even more complex when not dealing with the Latin alphabet.

    While I doubt you are going to find a solution to your problem cheaply, you would need a workflow like this.

    DVB tuner ----> Something to extract the bitmap subs from the stream ----> OCR to text ----> Feed text to Google ----> Display Google results on screen
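
    The workflow above can be sketched as a chain of small stages. This is only an illustration under stated assumptions: `Caption`, `run_pipeline`, `fake_ocr` and `fake_translate` are hypothetical names, and the OCR and translation steps are stand-ins rather than calls to any real OCR engine or to Google Translate.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Tuple


@dataclass
class Caption:
    start_s: float   # when the caption should appear, in seconds
    end_s: float     # when it should disappear
    text: str        # translated text to draw on screen


def run_pipeline(
    bitmaps: Iterable[Tuple[float, float, bytes]],
    ocr: Callable[[bytes], str],
    translate: Callable[[str], str],
) -> Iterator[Caption]:
    """Turn (start, end, bitmap) subtitle images into translated captions."""
    for start_s, end_s, image in bitmaps:
        text = ocr(image).strip()
        if not text:            # OCR found nothing usable; skip this bitmap
            continue
        yield Caption(start_s, end_s, translate(text))


# Stand-in stages so the sketch runs without a tuner, OCR engine, or network.
def fake_ocr(image: bytes) -> str:
    return image.decode("utf-8", errors="ignore")


def fake_translate(text: str) -> str:
    return f"[vi] {text}"


demo = [(0.0, 2.0, b"Good evening"), (2.0, 3.0, b"   "), (3.0, 5.5, b"Here is the news")]
results = list(run_pipeline(demo, fake_ocr, fake_translate))
for c in results:
    print(f"{c.start_s:5.1f} -> {c.end_s:5.1f}  {c.text}")
```

    Keeping each stage as a plain function means the OCR or translation step could be swapped out later without touching the rest of the chain.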
    Last edited by KarMa; 19th Nov 2016 at 17:38.
  6. Member
    Join Date
    Aug 2006
    Location
    United States
    ATSC EIA-608 and EIA-708 captions don't contain text as one of the standard character codes that ordinary computer programs would recognize. They have to be decoded and converted to something ordinary computer programs can use.
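
    To illustrate what "decoded and converted" means for CEA-608: caption data arrives as byte pairs, each byte holding 7 data bits plus an odd-parity bit, and a handful of codes stand for accented letters rather than their ASCII meaning. The sketch below handles only the basic character set and skips control codes entirely; the function names are hypothetical, and a real decoder such as CCExtractor does far more (cursor positioning, roll-up modes, extended characters).

```python
# CEA-608 basic characters that differ from ASCII (7-bit code -> character).
CEA608_SPECIAL = {
    0x2A: "á", 0x5C: "é", 0x5E: "í", 0x5F: "ó", 0x60: "ú",
    0x7B: "ç", 0x7C: "÷", 0x7D: "Ñ", 0x7E: "ñ", 0x7F: "█",
}


def has_odd_parity(byte: int) -> bool:
    """CEA-608 bytes carry 7 data bits plus an odd-parity bit in bit 7."""
    return bin(byte).count("1") % 2 == 1


def decode_pair(b1: int, b2: int) -> str:
    """Decode one two-byte caption pair into printable text.

    Control-code pairs (first byte 0x10-0x1F once parity is removed) set
    positions, styles and modes; this sketch simply skips them.
    """
    if not (has_odd_parity(b1) and has_odd_parity(b2)):
        return ""                      # parity error: drop the pair
    c1, c2 = b1 & 0x7F, b2 & 0x7F      # strip the parity bit
    if 0x10 <= c1 <= 0x1F:
        return ""                      # control code pair, not text
    out = []
    for c in (c1, c2):
        if c >= 0x20:                  # codes below 0x20 are not printable
            out.append(CEA608_SPECIAL.get(c, chr(c)))
    return "".join(out)


def add_parity(c: int) -> int:
    """Demo helper: set bit 7 when needed so the byte has odd parity."""
    return c if has_odd_parity(c) else c | 0x80


raw = [add_parity(ord(ch)) for ch in "HELLO "]
decoded = "".join(decode_pair(raw[i], raw[i + 1]) for i in range(0, len(raw), 2))
print(decoded)
```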
    Last edited by usually_quiet; 19th Nov 2016 at 22:42.
  7. @Cornucopia Thanks Scott, I took the time to read a little bit about it and I have a better picture now but as a pure amateur, I think there's still a lot for me to go through to get this job done. So far, I've read about and all of it from Wikipedia which are digital television, ATSC and after your suggestion, CEA-708 and closed captioning (somehow I didn't think I should have read about this one!).

    @usually_quiet So much great info, I really appreciate it! Believe me, though I can't picture exactly how hard this task will be, I know I shouldn't underestimate how complicated or even impossible it might be. I once gave up on the idea and researched speech-to-speech interpreter tools as an alternative, but there isn't much available on that side either, so I came back to it and sought help.

    As you suggested and I'm not so sure if this can make it any easier, my new and first specific phase of the project would be able to capture the closed captions displayed on digital TVs in US and convert them to plain text if it's not already a text based caption. The second and last part would be to translate these texts into Vietnamese (which must be displayed in Arial or Unicode fonts) and feed them back to TV signal receiver for display on screen. Does it sound right or even make sense?

    As you were saying about ATSC EIA-608 and EIA-708, it looks like neither is text-based and both need to be decoded/converted, but since I'm aware that this whole thing is a big, complex job, any OCR errors or display delays are more than acceptable to me right now. I just want to know that it can give some results, and hopefully someone can perfect it later by making it more responsive and accurate, so I'm really happy to learn from all of you that this dream can come true.

    @JVRaines I wish I majored in business and could make a profit out of the projected final product to offer you more generously but right now, I'm just a college student so $100, $200, $500? If we can get far and perfect the system to use in multiple households, I will find work during summer to pay you in thousands.

    @KarMa I have seen subtitles on DVDs and video files, both integrated and as separate files, but I had little knowledge of them until now. Thanks for providing the guide on how to achieve this dream. I also live in the US, as some of you might, so does that change the flow in any way in terms of the broadcast system, and could you please be so kind as to suggest some possible tools for each stage for me to try?
  8. KarMa (Dinosaur Supervisor)
    Join Date
    Jul 2015
    Location
    US
    Could start by saying what country you want this to work in.
  9. Your problem is very interesting. Given all the issues that have been brought out, especially the idea that some TV captions might be hard-coded, you might want to look into a way in which you could adapt the technology shown in this translator app:



    In theory, you could sit close to the TV, point your phone at the caption text, press a button, and get the translation. If this same technology could be ported (or perhaps already has been ported) to work with a camera attached to a laptop, you could set up a small camera, point it at the bottom of the screen where the captions appear, run its output to the app on the laptop, and then have the app "press the button" every few seconds so that the translated text shown on the laptop is constantly updated. You would then position the laptop in front of the viewers so they barely have to glance downwards while watching the movie.
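
    The "press the button every few seconds" idea amounts to a polling loop. The sketch below assumes three hypothetical callables (`read_screen_text`, `translate`, `show`) that stand in for the camera-plus-OCR step, the translator, and the laptop display; none of them refers to a real app's API. It only asks for a new translation when the text on screen has actually changed.

```python
import time
from typing import Callable, Optional


def poll_captions(
    read_screen_text: Callable[[], str],
    translate: Callable[[str], str],
    show: Callable[[str], None],
    interval_s: float = 2.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Re-read the caption area every interval_s seconds.

    Returns the number of translations requested. Text that has not changed
    since the previous poll is not sent to the translator again.
    """
    last_text = ""
    translations = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        text = " ".join(read_screen_text().split())   # normalise whitespace
        if text and text != last_text:
            show(translate(text))
            translations += 1
            last_text = text
        polls += 1
        sleep(interval_s)
    return translations


# Demo with canned "OCR" readings instead of a real camera.
readings = iter(["Good evening", "Good  evening", "", "Here is the news"])
shown = []
count = poll_captions(
    read_screen_text=lambda: next(readings),
    translate=lambda s: f"[vi] {s}",
    show=shown.append,
    max_polls=4,
    sleep=lambda _s: None,          # skip real waiting in the demo
)
print(count, shown)
```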

    I have done quite a bit of work extracting both hard-coded subtitles (using OCR software built for the purpose) as well as merging subtitles from other sources, and it is always quite a bit of work. No matter what you do, this won't be simple.

    Finally, I assume that you did a lot of Googling before you posted here, and probably found some professional solutions, such as this one:

    Subtitles and captions in real-time and post-event


    The basic idea is to use speech-to-text technology to generate the caption in the original language and then translate that, using a translator built into the laptop. This has a lot of advantages over trying to use an online translation tool like Google Translate, because it eliminates lots of delays, etc.

    Bottom line: there is no simple near-free solution that I can find.
  10. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    As you suggested and I'm not so sure if this can make it any easier, my new and first specific phase of the project would be able to capture the closed captions displayed on digital TVs in US and convert them to plain text if it's not already a text based caption. The second and last part would be to translate these texts into Vietnamese (which must be displayed in Arial or Unicode fonts) and feed them back to TV signal receiver for display on screen. Does it sound right or even make sense?
    No, it doesn't make sense to involve the "TV signal receiver" (by which I assume you mean the TV tuner). You would just need to connect the PC to a TV using HDMI. Software on the PC would take care of displaying the picture with subtitles.

    A computer running software would turn the ATSC closed captions from the output of the PC TV tuner into text-based subtitles, translate the subtitles into Vietnamese, and create text based subtitles in Unicode. Then, a software media player would be used to overlay the Vietnamese subtitles on video. If you didn't want to do this in real time, you could easily do it with free tools.

    Actually, according to its developer, CCExtractor (the command line program, not the GUI) can create text-based subtitles from CEA-608 captions and translate them in real time using a plug-in, cctranslate. http://www.ccextractor.org/doku.php?id=public:gsoc:translating_captions However, while there are many media players able to read subtitles from a file and display them, I don't know if any that currently exist can use the real-time, translated output from CCExtractor as the subtitle source.
    Last edited by usually_quiet; 20th Nov 2016 at 09:55.
  11. @KarMa The country is the US, and my current main goal is to translate English captions into Vietnamese, either with or without Internet. My ultimate goal is to have a universal solution within the US so anyone can easily obtain the tools and implement it, either free or at an affordable cost like $100 or so, besides the TV and PC. In this conversation, however, I can deal with thousands if that is what is required for my personal goal, which only includes my parents.

    @johnmeyer That picture is as interesting as this project don't you think? I actually saw it when researching on this and thought of putting an LCD bar underneath like what some theaters might do so but I didn't want to include any external custom screen into the plan because then we would have to deal with all kinds of different sized TVs and plus, it would bring more headache for elders in the terms of price and managing so I try to stay with what's already available physically.

    The laptop screen, as you said, might not work well in my case either because the screen needs to be large enough, and even so, we would have to look down or up at this screen while watching the other one, which is the TV. Your idea of making the app "press the button", though, is what enlightens me the most. I have a lot of questions about how things basically work and what all the choices are, and this is one of them: how to tell the engine where to stop and start a new translation. Unfortunately, I am no programmer or engineer of any kind, so I would need help from professionals if I ended up with this task.

    It looks like you've done quite a bit of research on this also so thank you for doing that. I did come across that link you shared and also saw conferences where different countries leaders using the technology or even like the live speech to speech interpreting you might have seen on Facebook and YouTube. I agree with you on the offline translation but until I find a good solution for at least Vietnamese, I would still have to rely on Google Translate for common understanding and flexibility in different languages. Below is a short passage from Wikipedia so you know how diverse the US are.

    In the United States, the National Captioning Institute noted that English as a foreign or second language (ESL) learners were the largest group buying decoders in the late 1980s and early 1990s, before built-in decoders became a standard feature of US television sets. This suggested that the largest audience of closed captioning was people whose native language was not English.

    @usually_quiet Sorry for my amateur language but it does sound more possible to me now because before all these posts, I have no idea what a TV tuner is and what it can do! So just to confirm my understanding, in a normal household setup an antenna is used for picking up the signals from broadcasters, then a tuner with a port for attaching the antenna is placed within the TV and responsible for decoding/delivering the audio/video/captions signals to TV screen/speakers to output?

    If that is correct and my focus can be completely on the PC now like the solutions you suggested, it would be so much easier than what I pictured in my head before. My questions are, are all TV signals currently in US CEA-708? If so then I would need to wait for CCExtractor developers to finish the hard work and release a new version? If not then how can I know which channel is 608 or 708 and are these the only two here in US? Is CCExtractor currently the only one for extracting from TV signals?

    I should also mention that I'm really a newbie who is not so familiar with command lines so do you think it's simple enough for me to follow, if there is instructions? On the video overlaying part, this is something that I didn't think of so thank you very much for bringing it up but will I be able to have transparent background to not covering the whole picture on TV? Should I start on trying out the tools suggested here and if so should I start with buying a TV tuner? If so which brand and from reading on Wikipedia, is it the same as a game recorder which I bought on Amazon a few years ago to do live video streaming?

    @Everyone I really thought that I had to get in touch with a programmer from a TV manufacturer or closed caption vendor (which I had no idea existed!) to get in between the antenna and the TV screen, but these posts really gave me some great alternatives. In fact, I have reached out to multiple TV channels and websites to get some thoughts on this, and I even asked for help from major US producers of captions, which are WGBH-TV, VITAC, CaptionMax and the National Captioning Institute (Wikipedia). What I'm trying to say is, no matter how far this project can go, I really appreciate all the help and info, and let's not forget the hardworking tool developers. Have a wonderful and blessed Thanksgiving Day and season, everyone.
  12. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    So just to confirm my understanding, in a normal household setup an antenna is used for picking up the signals from broadcasters, then a tuner with a port for attaching the antenna is placed within the TV and responsible for decoding/delivering the audio/video/captions signals to TV screen/speakers to output?
    Yes

    Originally Posted by devin View Post
    My questions are, are all TV signals currently in US CEA-708? If so then I would need to wait for CCExtractor developers to finish the hard work and release a new version? If not then how can I know which channel is 608 or 708 and are these the only two here in US? Is CCExtractor currently the only one for extracting from TV signals?
    CEA-608 and CEA-708 refer to the closed caption standards used for ATSC broadcasts in N. America. They are the only kinds of closed captions allowed for ATSC broadcasts, and both are supposed to be available in all ATSC broadcasts in the USA at present. However, the PVR and media player software that I have tried which displays closed captions from ATSC broadcasts can only display CEA-608 closed captions. I'm not aware of any software of that kind which has implemented CEA-708 captions.

    Originally Posted by devin View Post
    I should also mention that I'm really a newbie who is not so familiar with command lines so do you think it's simple enough for me to follow, if there is instructions?
    It may be too soon to worry about writing command lines and scripts for using the real-time translation feature. According to the documentation on the page I linked to, it appears that CCExtractor's developer, Carlos, was still working on real-time translation at the time that page was created. I have no idea of how much progress he has made on that project, or if it could be adapted for your use. You would have to write to Carlos and ask him about the feasibility of using this feature for your project.

    Originally Posted by devin View Post
    On the video overlaying part, this is something that I didn't think of so thank you very much for bringing it up but will I be able to have transparent background to not covering the whole picture on TV? Should I start on trying out the tools suggested here and if so should I start with buying a TV tuner? If so which brand and from reading on Wikipedia, is it the same as a game recorder which I bought on Amazon a few years ago to do live video streaming?
    Most of the PVR and media player software I've tried which can display closed captions uses a transparent background.

    Carlos really likes Silicondust's HDHomeRun tuners. This is the Amazon page for the N. American version of the HDHomeRun Connect: https://www.amazon.com/SiliconDust-HDHomeRun-CONNECT-broadcast-2-Tuner/dp/B00GY0UB54

    There are a lot of unknowns, and it's possible that my idea won't turn out to be workable in the end, so before buying a PC TV tuner you should think about whether or not it would be of any use to you if this project goes nowhere.
  13. Thanks for the heads up, usually_quiet, and again for the great info. I have picked PVR as my word of the day because I know little about it, yet it seems so interesting!

    I have emailed Carlos as you referred so hopefully he can have free time to answer and help. In the mean time, could I ask if it's possible for us to find out which captioning standard is being displayed? I can picture that we could know which standard a software can extract by reading its description but on the broadcaster side, how can we know to confirm that the signal is in this standard or that or if it even carries captions like ads/commercials between programs?
  14. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    I can picture that we could know which standard a software can extract by reading its description but on the broadcaster side, how can we know to confirm that the signal is in this standard or that or if it even carries captions like ads/commercials between programs?
    As I recall, ATSC broadcasts in the US will have both CEA-608 and CEA-708 captions, but there may be no caption text present, either because of a technical problem, or because the broadcast is exempt for one reason or another. https://www.fcc.gov/general/self-implementing-exemptions-closed-captioning-rules Most advertising is exempt.

    How do you know what kind of captions you are seeing when watching over-the-air broadcasts with a TV? By default most TVs display CEA-608 captions, white text on a black background. There is usually a menu option somewhere on the TV which must be used to set up the TV to display CEA-708 captions instead. However, I have a small LCD TV in my bedroom that only allows CEA-608 captions.

    If someone is using CEA-708 captions, the set-up for those allows customizing the text color, background color, background transparency, and font.
  15. Ah I see, so that is why sometimes I don't see captions on TV even though the setting is ON.

    So for the option to customize captions with 708, I will have to try and see which type of captions display would be more readable by CCExtractor right? One thing that I still don't know for sure is that while the captions in either standard is being displayed on the TV window on my PC, where can I put the translated text so these two different language captions won't cover the whole screen?

    On the PVR note, does every manufacture often provide their own software for viewing TV on PC? Now that I know that I can do this right on the computer, do you think the 3rd party media player software which can stream from a TV tuner someone I should reach out next since they know how to decode the signals? Which one do you recommend for this purpose? I am trying to get the hang on the captions signals and separate it from being displayed together with AV so do you know any engineers or programmers who can give some thoughts on this?
  16. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    So for the option to customize captions with 708, I will have to try and see which type of captions display would be more readable by CCExtractor right?
    I already told you that at present, CCExtractor is better at processing CEA-608 captions. CCExtractor's developer has not yet perfected his software for decoding CEA-708 captions.

    Also, once in a while a TV show may use hard subtitles (subtitles that are an integral part of the picture itself) when someone is speaking a foreign language, or speaks with a heavy foreign accent. Be aware that CCExtractor's output won't include those unless they are included in the closed captions as well.

    Originally Posted by devin View Post
    One thing that I still don't know for sure is that while the captions in either standard is being displayed on the TV window on my PC, where can I put the translated text so these two different language captions won't cover the whole screen?
    You should only display the subtitles containing the translation. If you attempt to display both the original closed caption text and the translation, it will indeed take up too much room, and there will be too much text to read in the amount of time allotted for displaying them.

    Originally Posted by devin View Post
    On the PVR note, does every manufacture often provide their own software for viewing TV on PC?
    Not always, and the manufacturer's PVR software doesn't always include the ability to decode and display closed captions, let alone display external subtitles.

    Originally Posted by devin View Post
    Now that I know that I can do this right on the computer, do you think the 3rd party media player software which can stream from a TV tuner someone I should reach out next since they know how to decode the signals? Which one do you recommend for this purpose? I am trying to get the hang on the captions signals and separate it from being displayed together with AV so do you know any engineers or programmers who can give some thoughts on this?
    There is third-party PVR software able to display closed captions when they are present in TV broadcasts. NextPVR is able to do it quite well for CEA-608 captions. However, as far as I know, NextPVR cannot display external subtitles instead of closed captions when watching live TV. Sub (NextPVR's developer) is a busy man. He may be able to tell you if he thinks your project is feasible, but he probably won't have the time to do the work required for it, especially for free or a small salary. Writing software to do all that you want in real time will not be a simple task. I'm afraid not many programmers will be interested if you want them to work for only a small salary until such time as you can develop and sell a product.

    If you record the TV show with PVR software, turn the closed captions into subtitles, translate the subtitles, then watch the recorded TV show with the translated subtitles sometime later, that is doable now using various free tools and many media players.
    Last edited by usually_quiet; 28th Nov 2016 at 11:52.
  17. That watch later scenario you suggest might not work well in my case since the program I'm aiming at is live and news broadcasts but I'm interested in knowing which tools are capable of doing these so could you name some popular ones and what is that feature that let you import external subtitles and display using a transparent background often called?

    As for the live display, I will reach out to Sub to at least see what he thinks. I understand that whoever can put all these things together deserves good pay, but unfortunately I can't afford that given my current situation, so I know that I can't expect much.

    On which captions language to display, I first thought of displaying only the translated language but then I thought if the English captions is not displayed on screen for CCExtractor to extract, how can I get the text to be able to translate? That brings me to look into the software that decode the signals from the tuner to see if they can separate and route it to somewhere else for translation but at this point, I'm still not sure how does the software and tuner communicate and in what form so this will be a mystery.

    Carlos has also responded to me, and he said this whole thing will require some advanced programming knowledge, of which I have none, so I guess I just have to see what I can do with all the available tools. But anyway, thank you for the heads up on the case with hard subs, because I hadn't thought of this, so it will save me a noob question.
  18. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    That watch later scenario you suggest might not work well in my case since the program I'm aiming at is live and news broadcasts but I'm interested in knowing which tools are capable of doing these so could you name some popular ones and what is that feature that let you import external subtitles and display using a transparent background often called?
    If you buy a Hauppauge PCTV tuner, it will come with WinTV unless you buy a "white box" or MCE version of the device. WinTV has no program guide itself, but the scheduler allows making recordings based on channel, start time and stop time, to be performed one time, daily or weekly. There is also a way to use the shortcut Hauppauge provides to TitanTV for scheduling recordings. Hauppauge's PC TV tuners also work with NextPVR and Windows Media Center.

    Silicondust's HDHomeRun tuners don't come with PVR software, but they work with NextPVR and Windows Media Center. Silicondust's own DVR software is still in beta and requires paying a yearly subscription.

    NextPVR, Windows 7's Media Center and WinTV use an opaque black background for displaying closed captions. PotPlayer can be set up to use a transparent background for closed captions, but I've never figured out how to set it up to display closed captions correctly, using multiple lines as needed. For that reason, closed captions displayed by PotPlayer are often illegible. Not one of the DVR programs that I have mentioned above makes use of CEA-708 captions. They are decoding CEA-608 captions, regardless of what they look like when displayed by the software.

    Once you have the SRT subtitles from CCExtractorGUI, you can open the SRT file in Subtitle Edit and use the built-in Google Translate function to translate the subtitles from English to Vietnamese.
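
    For anyone who would rather script that translation step than click through a GUI, the same idea can be sketched in a few lines. This is an assumption-laden illustration, not how Subtitle Edit works internally: `translate_srt` is a hypothetical helper, and the `translate` argument is a placeholder for whatever translation service is actually used. Only the dialogue lines are changed; cue numbers and timings are copied through so the subtitles stay in sync.

```python
import re
from typing import Callable

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate only the dialogue lines of an SRT document."""
    blocks = re.split(r"\n\s*\n", srt_text.strip())
    out_blocks = []
    for block in blocks:
        lines = block.splitlines()
        # Find the timing line; everything after it is dialogue.
        idx = next((i for i, ln in enumerate(lines) if TIMING.search(ln)), None)
        if idx is None:
            out_blocks.append(block)       # not a normal cue; leave as-is
            continue
        # Join wrapped lines so the translator sees a whole sentence.
        dialogue = " ".join(ln.strip() for ln in lines[idx + 1:])
        translated = translate(dialogue) if dialogue else ""
        out_blocks.append("\n".join(lines[: idx + 1] + [translated]))
    return "\n\n".join(out_blocks) + "\n"


sample = """1
00:00:01,000 --> 00:00:03,500
GOOD EVENING.

2
00:00:03,600 --> 00:00:06,000
HERE IS
THE NEWS.
"""

translated_srt = translate_srt(sample, lambda s: f"[vi] {s}")
print(translated_srt)
```

    When saving the result, writing the file as UTF-8 (for example `open(path, "w", encoding="utf-8")`) keeps Vietnamese diacritics intact for the media player.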

    If you convert the closed captions to subtitles, most media players use a transparent background to display them. Just try one that you like which allows using subtitles.

    Originally Posted by devin View Post
    On which captions language to display, I first thought of displaying only the translated language but then I thought if the English captions is not displayed on screen for CCExtractor to extract, how can I get the text to be able to translate? That brings me to look into the software that decode the signals from the tuner to see if they can separate and route it to somewhere else for translation but at this point, I'm still not sure how does the software and tuner communicate and in what form so this will be a mystery.
    There is no need to play the recording or display anything in order to use CCExtractorGUI for converting closed captions to SRT subtitles. It doesn't read closed captions from the screen. CCExtractor GUI can process a file containing a recording from an ATSC source with closed captions within a minute or two.
    Last edited by usually_quiet; 29th Nov 2016 at 16:48.
  19. usually_quiet, you are not quiet (in terms of information and helpfulness) at all! I really appreciate your breakdown with all the tools and steps you mentioned above. This sure will benefit who want to have their hands on subtitles for watch later shows which I think many. That also includes extracting subtitles from local offline source using CCExtractor but if I were to do it in real time, I would need to have the captions on and have some OCR software recognize it and translate right?

    I just made my first step by ordering the WinTV-dualHD tuner and an amplified antenna and I think this is currently how "far" I can get in the process but I would like to see if it can pop me some possible approaches. I didn't pick the SiliconDust because it seems to be more about wireless streaming which doesn't help much with my goal and the fact that it doesn't have provided software like you have said so will see how things go.

    With any of these two though, it seems to me that I can watch two channels at a time so if I want to have one for captions recognizing and the other one for viewing with translated captions, I would need to use two monitors but the out of hand question is, is there currently any Hauppauge compatible PVR software that can stream one channel to both windows and a captions on/off setting for each window despite the fact that these two windows are showing the same thing?

    As a different approach, I just tried speech-to-text translation with Google Translate (video below), and the result is interesting to watch if anyone is curious. It eventually hangs on me, and Google might set a limit on day-long translation, but it is really something for a quick translation need. I also tried out Dragon NaturallySpeaking and SYSTRAN Translator software for this purpose, as it looks really promising on YouTube, but I couldn't configure it to use Stereo Mix mode and it still requires Google instant translation, so I stopped there.

    https://www.youtube.com/watch?v=ma6aGOimr30

    On a "basic" user-level programming question, though: does anyone know whether the Chrome browser lets us tweak its appearance with its included Developer Tools, so I can chop off the top URL bar and change the result box size, background, font color, font size, etc.?
  20. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    usually_quiet, you are not quiet (in terms of information and helpfulness) at all! I really appreciate your breakdown of all the tools and steps you mentioned above. It will surely benefit anyone who wants subtitles for shows they watch later, which I think is many people. That also covers extracting subtitles from a local offline source using CCExtractor, but if I were to do it in real time, I would need to have the captions on and have some OCR software recognize and translate them, right?
    No. You do not need to display the captions and use OCR. In theory if you knew how to do the programming for it, you could get the captions from the video in the transport stream in real time. They can be decoded from data in the video's GOP user data.

    Originally Posted by devin View Post
    I just took my first step by ordering the WinTV-dualHD tuner and an amplified antenna. I think this is currently as "far" as I can get in the process, but I would like to see whether it suggests some possible approaches. I didn't pick the SiliconDust because it seems to be more about wireless streaming, which doesn't help much with my goal, and because it doesn't come with software, as you said, so we will see how things go.
    Too bad you didn't ask me why I recommended the HDHomerun Connect before ordering. You can plug HDHomerun's tuners into the ethernet port in your PC, if you don't want to set them up on a wired home network. I have a different HDHomerun device so I know that connecting one directly to the Ethernet port works. You merely need to set up a private network on the PC to use the tuner.

    It is possible to get the closed captions in real time using the HDHomerun Connect with CCExtractorGUI: http://www.ccextractor.org/doku.php?id=public:general:working_with_hdhomerun Carlos swears this works, but I have not tried it.
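
    To give a rough idea of what the receiving end of such a setup deals with, here is a small sketch. It assumes the tuner has been told to send a raw MPEG transport stream to the PC over UDP, with several 188-byte packets per datagram, each starting with the sync byte 0x47. The port number 5000 is an arbitrary placeholder, and this is not how CCExtractor itself is written.

```python
import socket

TS_PACKET_SIZE = 188   # every MPEG transport stream packet is 188 bytes
TS_SYNC_BYTE = 0x47    # and starts with this sync byte


def split_ts_packets(datagram: bytes) -> list[bytes]:
    """Split one UDP payload into 188-byte transport stream packets.

    Packets that do not start with the sync byte are dropped,
    as is any trailing partial packet.
    """
    packets = []
    for offset in range(0, len(datagram) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = datagram[offset:offset + TS_PACKET_SIZE]
        if packet[0] == TS_SYNC_BYTE:
            packets.append(packet)
    return packets


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit packet identifier (PID) from the TS header."""
    return ((packet[1] & 0x1F) << 8) | packet[2]


def receive_packets(port: int = 5000):
    """Yield TS packets as they arrive on the given UDP port (placeholder value)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        while True:
            datagram, _sender = sock.recvfrom(65535)
            yield from split_ts_packets(datagram)
```

    The packets belonging to the video PID would still have to be reassembled into the video stream before any caption data could be pulled out of them.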

    I don't have time for the rest of your questions now.
  21. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    With either of these two, though, it seems I can watch two channels at a time. So if I want one for caption recognition and the other for viewing with translated captions, I would need two monitors. The offhand question is: is there currently any Hauppauge-compatible PVR software that can stream one channel to two windows, with a captions on/off setting for each window, even though both windows are showing the same thing?
    There is a way to run two instances of WinTV, even with both instances tuned to the same channel, but if you are using WinTV to watch a live broadcast there is no way to display subtitles containing a translation. WinTV will only display the closed captions contained in the broadcast that is being viewed in that window.

    Originally Posted by devin View Post
    On a "basic" user-level programming question, though: does anyone know whether the Chrome browser lets us tweak its appearance with its included Developer Tools, so I can chop off the top URL bar and change the result box size, background, font color, font size, etc.?
    I don't know the answer to this one.

    I played around with configuring CCExtractor to process captions from the UDP stream sent by my HDHomerun device (a CableCARD tuner). I haven't gotten anything to work yet.
  22. Thank you very much for your time, usually_quiet. I did take a quick look on Amazon before buying, and the HDHomerun was indeed the best seller at the moment, but I'm not confident enough right now to play with command lines and CCExtractor, so I think I will try a simpler approach first and see how things go. Plus, my Ethernet port is already configured for a printer, so I don't know whether switching between the two will create more work. I will definitely order it if someone comes across this thread and shares a solution, as my knowledge is too limited right now to try it out myself.

    AverMedia, B&H, SiliconDust and, I think, Hauppauge tech support (I contacted too many to remember!) have all responded by email that they can't help me choose a product that can do this work, or tell me how to assemble a working process. So I will leave it for the future, when I have gained more knowledge about programming; from your posts, it would surely help a lot if I knew how to play around with this guy. For now, I will just have to make the most of what I have and what I have learned from this thread.
  23. Dinosaur Supervisor KarMa's Avatar
    Join Date
    Jul 2015
    Location
    US
    Here is an ATSC broadcast stream of one of my local stations with both 608 and 708 subtitles, if you don't already have a sample to work with. The subtitles for this program are delayed because these are live subtitles, being written quickly after the words are spoken. These live subtitles are often full of typing errors since this is done in real time. Programs that are not done live will usually have fewer errors and better sync with the audio.
    Image Attached Files
  24. Thanks KarMa, I was able to see the text after running CCExtractor, but I have no idea what to do with it now. Interestingly, looking at the file's media info, both subtitle tracks show zero bytes, and I couldn't find an option for turning them on in Media Player Classic either, but they are definitely there somewhere in the stream, aren't they?
  25. Originally Posted by devin View Post
    I was able to see the text after running CCExtractor but I have no idea what to do with it now.
    Cut and paste into http://translate.google.com and see that the translated text makes no sense at all.

    Original English:
    Code:
    1
    00:00:01,501 --> 00:00:03,469
       THE GORILLA TO PROTECT       
    
    2
    00:00:03,603 --> 00:00:04,069
       THE CHILD.                   
    
    3
    00:00:04,204 --> 00:00:05,270
       >>> AND TOURISTS IN          
    
    4
    00:00:05,405 --> 00:00:06,071
       NEW YORK HARBOR GOT          
    
    5
    00:00:06,206 --> 00:00:06,872
       DOUBLE THE VIEW TODAY.       
    
    6
    00:00:07,007 --> 00:00:07,806
       THEY WERE THERE TO           
    
    7
    00:00:07,941 --> 00:00:09,241
       CHECK OUT THE STATUE         
    
    8
    00:00:09,376 --> 00:00:10,009
       OF LIBERTY WHEN              
    
    9
    00:00:10,143 --> 00:00:10,776
       REPORTS STARTED TO           
    
    10
    00:00:10,910 --> 00:00:11,643
       FLOOD IN OF RARE             
    
    11
    00:00:11,778 --> 00:00:12,778
       SIGHTINGS OF A WHALE         
    
    12
    00:00:12,912 --> 00:00:13,612
       NEAR LIBERTY ISLAND.         
    
    13
    00:00:13,747 --> 00:00:14,880
       AS YOU CAN IMAGINE,          
    
    14
    00:00:15,015 --> 00:00:15,547
       IT'S A HEAVILY               
    
    15
    00:00:15,682 --> 00:00:16,682
       TRAFFICKED WATERWAY,         
    
    16
    00:00:16,816 --> 00:00:17,449
       AND THE COAST GUARD          
    
    17
    00:00:17,584 --> 00:00:18,584
       PUT OUT AN ALERT FOR         
    
    18
    00:00:18,718 --> 00:00:20,486
       BOATS TO BE ON THE           
    
    19
    00:00:20,620 --> 00:00:21,053
       LOOKOUT
    Translated to Japanese, then back to English:
    Code:
    1
    00: 00: 01, 501 -> 00: 00: 03, 469
       Protecting gorilla
    
    2
    00: 00: 03, 603 -> 00: 00: 04, 069
       children.
    
    3
    00: 00: 04, 204 -> 00: 00: 05, 270
       >>> And tourists
    
    Four
    00: 00: 05405 -> 00: 00: 06,071
       New York · Harbor · Got
    
    Five
    00: 00: 06,206 -> 00: 00: 06,872
       I will double it today.
    
    6
    00: 00: 07,007 -> 00: 00: 07, 806
       They were there.
    
    7
    00: 00: 07,941 -> 00: 00: 09,241
       Check the status
    
    8
    00: 00: 09,376 -> 00: 00: 10,009
       Statue of Liberty any time
    
    9
    00: 00: 10, 143 -> 00: 00: 10, 776
       Reporting has started
    
    Ten
    00: 00: 10,910 -> 00: 00: 11, 643
       Rare species flooding
    
    11
    00: 00: 11,778 -> 00: 00: 12,778
       Signs of whale
    
    12
    00: 00: 12,912 -> 00: 00: 13,612
       Near the Statue of Liberty of Liberty
    
    13
    00: 00: 13,747 -> 00: 00: 14,880
       As you can imagine,
    
    14
    00: 00: 15,015 -> 00: 00: 15, 547
       It's heavy
    
    15
    00: 00: 15,682 -> 00: 00: 16, 682
       Traffic water way,
    
    16
    00: 00: 16,816 -> 00: 00: 17,449
       Coastguard
    
    17
    00: 00: 17,584 -> 00: 00: 18,584
       An alert is issued
    
    18
    00: 00: 18,718 -> 00: 00: 20, 486
       The boat
    
    19
    00: 00: 20,620 -> 00: 00: 21,053
       look out
    Last edited by jagabo; 5th Dec 2016 at 06:52.
  26. Dinosaur Supervisor KarMa's Avatar
    Join Date
    Jul 2015
    Location
    US
    You can test the translation on a full news broadcast. Both were decoded via CCExtractor, one with a 1-line limit and the other with no limit, and both had their roll-up subtitles removed.
    Last edited by KarMa; 5th Dec 2016 at 11:36.
  27. Member
    Join Date
    Aug 2006
    Location
    United States
    Originally Posted by devin View Post
    Thanks KarMa, I was able to see the text after running CCExtractor, but I have no idea what to do with it now. Interestingly, looking at the file's media info, both subtitle tracks show zero bytes, and I couldn't find an option for turning them on in Media Player Classic either, but they are definitely there somewhere in the stream, aren't they?
    ATSC closed captions aren't stored as separate streams in TS files. They are found in the video stream's GOP user data, which is used to store a few different kinds of non-picture information. That is probably why they show up as 0 bytes. VLC can be set up to display CEA-608 closed captions when playing TS files (Subtitle->Sub Track->Closed Captions). As I recall, Media Player Classic lacks the ability to display closed captions.
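
    For the curious, here is a rough sketch of what "captions in the user data" looks like at the byte level. It assumes MPEG-2 video that has already been pulled out of the transport stream into one continuous buffer, and it follows the ATSC A/53 layout as I understand it: a user data start code, the identifier "GA94", a type code of 0x03, then a count followed by three-byte caption triplets. The function names are made up for illustration, and treating CEA-608 characters as plain ASCII is a simplification.

```python
USER_DATA_START = b"\x00\x00\x01\xb2"   # MPEG-2 user_data_start_code
ATSC_IDENTIFIER = b"GA94"                # ATSC identifier
CC_DATA_TYPE = 0x03                      # user_data_type_code for caption data


def extract_cc_triplets(video_es: bytes) -> list[tuple[int, int, int]]:
    """Return (cc_type, byte1, byte2) for every valid caption triplet found.

    cc_type 0 and 1 carry CEA-608 data (field 1 and field 2);
    cc_type 2 and 3 carry CEA-708 packet data.
    """
    triplets = []
    header = USER_DATA_START + ATSC_IDENTIFIER
    position = video_es.find(header)
    while position != -1:
        body = position + len(header)
        if body + 3 <= len(video_es) and video_es[body] == CC_DATA_TYPE:
            cc_count = video_es[body + 1] & 0x1F   # low 5 bits of the flags byte
            first = body + 3                       # skip type, flags and em_data bytes
            for i in range(cc_count):
                start = first + 3 * i
                if start + 3 > len(video_es):
                    break
                marker, byte1, byte2 = video_es[start:start + 3]
                cc_valid = (marker >> 2) & 0x01
                cc_type = marker & 0x03
                if cc_valid:
                    triplets.append((cc_type, byte1, byte2))
        position = video_es.find(header, position + 1)
    return triplets


def decode_608_pair(byte1: int, byte2: int) -> str:
    """Strip the parity bits and return basic printable characters, if any.

    Pairs whose first byte is below 0x20 are control codes or padding, not text.
    """
    first, second = byte1 & 0x7F, byte2 & 0x7F
    if first < 0x20:
        return ""
    return "".join(chr(c) for c in (first, second) if 0x20 <= c <= 0x7E)
```

    A real decoder also has to handle the control codes (roll-up, pop-on, positioning) and the full CEA-708 packet layer, which is the work CCExtractor already does.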

    [Edit] If you have the SRT subtitles from CCExtractor GUI, you can open the SRT file in Subtitle Edit and use the built-in Google Translate function to translate the subtitles from English to Vietnamese. Those can be saved as a different SRT file. Media Player Classic can play subtitles, if I recall correctly.
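
    If you ever script this step instead of using Subtitle Edit, the main thing is to translate only the caption text and leave the cue numbers and timing lines alone, so they don't get mangled the way they do when a whole SRT file is pasted into a translator. A minimal sketch, where `translate` is a stand-in for whatever translation function you have access to (no real translation service is called here):

```python
import re
from typing import Callable

TIMING_LINE = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def translate_srt(srt_text: str, translate: Callable[[str], str]) -> str:
    """Translate only the caption text of an SRT file.

    The first line of each cue (its number) and the second line (its timing)
    are copied unchanged; every following non-empty line is passed to `translate`.
    """
    output = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [line.strip() for line in block.splitlines()]
        if len(lines) < 2 or not TIMING_LINE.match(lines[1]):
            continue  # skip anything that is not a well-formed cue
        text = [translate(line) for line in lines[2:] if line]
        output.append("\n".join(lines[:2] + text))
    return "\n\n".join(output) + "\n"
```

    For a quick check without any translator, `translate_srt(srt_text, str.lower)` should return the same cues with lower-case text and untouched timings.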
    Last edited by usually_quiet; 5th Dec 2016 at 12:32.
  28. I can see the subtitles fine using MPC, but only after the SRT file is extracted and placed alongside the TS file. There might be an option to turn captions on in the settings, just like in VLC, but I'm not sure. What I notice is that the text changes super fast, so I think I will need to combine multiple lines together, but that will be a new challenge on top of this unsolved problem. The translation won't be perfect either, as jagabo pointed out, but what can I expect for a free project, right?
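
    Combining the lines could look something like the sketch below. It assumes the SRT layout CCExtractor produced earlier in this thread (a number line, a timing line, then text) and simply keeps joining cues until the text ends with a period, question mark or exclamation mark. The `Cue` class and function names are made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: str
    end: str
    text: str


def parse_srt(srt_text: str) -> list[Cue]:
    """Parse SRT text into cues, ignoring the cue numbers."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [line.strip() for line in block.splitlines()]
        if len(lines) < 3 or " --> " not in lines[1]:
            continue  # not a complete cue
        start, end = lines[1].split(" --> ")
        cues.append(Cue(start, end, " ".join(line for line in lines[2:] if line)))
    return cues


def merge_into_sentences(cues: list[Cue]) -> list[Cue]:
    """Join consecutive cues until the text ends a sentence.

    The merged cue runs from the first cue's start time to the last cue's end time.
    """
    merged = []
    pending = None
    for cue in cues:
        if pending is None:
            pending = Cue(cue.start, cue.end, cue.text)
        else:
            pending.end = cue.end
            pending.text = f"{pending.text} {cue.text}"
        if pending.text.endswith((".", "?", "!")):
            merged.append(pending)
            pending = None
    if pending is not None:
        merged.append(pending)  # keep an unfinished sentence at the end
    return merged
```

    Whole sentences stay on screen longer and give a translator more context than two or three words at a time. The catch for live use is that a sentence can only be shown once its last line has arrived, which adds delay.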

    I'm reaching out to the Subtitle Edit author to see if s/he can implement live and offline translation, but I think that to begin solving this puzzle I would need to start from where Carlos already is, which is decoding the stream into separate signals, and then decide what else can be done with it. Do you think that's correct, and will I be able to do it just by reading online? I don't want to ask Carlos because it seems a little too much, since the idea is like creating another CCExtractor when we already have one.
  29. Originally Posted by devin View Post
    what can I expect for a free project right?
    It won't be a free project. At some point you will have to hire a programmer to put it all together. Unless you plan to learn to program yourself.
  30. DIY is my plan, but programming has been sitting on my to-do list for quite a while now, so I'm a little doubtful about that. I'm also willing to pay someone to do it, as I said in my second post, but I can't afford enough to get anyone interested, so it looks like I'm on my own for now. On the bright side, I think I got more than I expected from this thread, so that is something to keep me moving.


