VideoHelp Forum
Results 1 to 18 of 18
  1. Member
    Join Date
    Feb 2012
    Location
    Europe
    Hi,

    Do you know whether there are solutions/software that can read the real lip movements of a person and then apply them to a model?
    I'm perfectly OK with the fact that these might be two different programs.

    My need is to use lipsync for language teaching, so I need to simulate realistic lip movement.
    Would I need to buy any new hardware?

    I've played a bit with Daz and iClone, and maybe I'll check out Blender a bit (I find Blender an awesome product, but I don't have that much time to invest in it). I can use any other animation software that can animate realistic human models.

    Of course the cheaper the better.

    Thank you
  2. I'm not aware of any that can read lip movements from live-action footage (real people) and apply them to a 3D mesh with any accuracy - but there are products that do it from the audio. The products that do exist aren't that great.

    Have you tried Mimic Live! for Daz ?
    http://www.youtube.com/watch?v=VVECWiYpSds
    http://www.daz3d.com/mimic-live

    Another way you can "fake" it in 2D is to use an After Effects script:
    http://aescripts.com/auto-lip-sync/
  3. Member budwzr's Avatar
    Join Date
    Apr 2007
    Location
    City Of Angels
    Don't deaf people read lips? Maybe a head-mounted camera? I guess you would need to mic the speaker and record that, then use "PluralEyes" to sync it up. I've never used "PluralEyes", but I've heard it's very good.

    Oops, hehe, you'll need an in-between person to sing the song back as the lip reader enunciates the words. This project can get complicated quickly.

    Recap: OK, so everyone needs headphones, I guess, except the lip reader, who needs a mic.

    I give up. This is turning into a Rube Goldberg machine, hahaha. Whew, there haven't been many interesting posts lately, so my brain must have turned to mash, haha.
    Last edited by budwzr; 10th Jan 2013 at 21:43.
  4. Member
    Join Date
    Feb 2012
    Location
    Europe
    Originally Posted by poisondeathray View Post
    I'm not aware of any that can read lip movements from live-action footage (real people) and apply them to a 3D mesh with any accuracy - but there are products that do it from the audio. The products that do exist aren't that great.

    Have you tried Mimic Live! for Daz ?
    http://www.youtube.com/watch?v=VVECWiYpSds
    http://www.daz3d.com/mimic-live

    Another way you can "fake" it in 2D is to use an After Effects script:
    http://aescripts.com/auto-lip-sync/
    Thank you,

    I tried Mimic Pro with one character and it worked, but with a newer character, M4, it did not work at all.
    I didn't try Mimic Live, and I was wondering whether it's worth buying. I found a few threads on the DAZ forum that are less encouraging:
    http://www.daz3d.com/forums/viewthread/13569/#196933
    http://forumarchive.daz3d.com/viewtopic.php?t=169909

    Thank you for the Adobe After Effects script. I do not have AE, and I understand that it's expensive... I'll check the price at some point.

    Do you know anything about CrazyTalk, or the iClone type of products? Are they any better?
  5. Originally Posted by jimmyy View Post
    Do you know anything about CrazyTalk, or the iClone type of products? Are they any better?

    Not sure about iClone, but I tried CrazyTalk a few years ago and wasn't that impressed. I don't know what your expectations are.

    If you're looking for truly "realistic", you won't get that unless you do facial motion capture with many markers/track points. The systems are very expensive, with dozens, even hundreds of cameras that capture different angles. The software is expensive too. There are "markerless" facial capture systems, but they tend to be not as realistic. A low-end system might be $5-10K (including cameras and software). High end... is obscenely expensive.
  6. Member
    Join Date
    Feb 2012
    Location
    Europe
    Originally Posted by poisondeathray View Post
    Originally Posted by jimmyy View Post
    Do you know anything about CrazyTalk, or the iClone type of products? Are they any better?

    Not sure about iClone, but I tried CrazyTalk a few years ago and wasn't that impressed. I don't know what your expectations are.

    If you're looking for truly "realistic", you won't get that unless you do facial motion capture with many markers/track points. The systems are very expensive, with dozens, even hundreds of cameras that capture different angles. The software is expensive too. There are "markerless" facial capture systems, but they tend to be not as realistic. A low-end system might be $5-10K (including cameras and software). High end... is obscenely expensive.
    Hi,

    My experience is with software that was not working properly. I had iClone 5 installed as a trial, and I still have Daz.

    My requirements are far from requiring an investment of thousands of dollars; it is a hobby, done for pleasure, not a business - and moreover my wife would kill me if I spent that amount.

    I would like to have realistic lip movements in terms of the way sounds are pronounced in the non-English language; the software that I've used has been very far from reality.

    I've come across Zing Track 2, which apparently is not working with DAZ, and thought that this could be a solution - software at around $150, no hardware... why not? But most probably I was dreaming with my eyes open.

    If you happen to know of some software that works reasonably well at reading lip movements, I'd be happy to try it out.

    Thank you very much for your support, it is appreciated
  7. Member budwzr's Avatar
    Join Date
    Apr 2007
    Location
    City Of Angels
    You might be able to composite the real-time motion of the lips, via a displacement map, onto another set of lips. In fact, I'm sure of it.

    I could describe the process, and perhaps someone else could help you understand how to apply it. I'm a bad teacher; I have no patience.
    Last edited by budwzr; 12th Jan 2013 at 20:13.
  8. Originally Posted by budwzr View Post
    You might be able to composite the real-time motion of the lips, via a displacement map, onto another set of lips. In fact, I'm sure of it.
    Like Conan O'Brien "lips"???

  9. Member budwzr's Avatar
    Join Date
    Apr 2007
    Location
    City Of Angels
    Hahaha, not exactly. That's a crude facsimile of what I have in mind. I propose to capture the motion only.

    Stay tuned, I have to return something at WalMart. And collect my thoughts. My brain has latched onto the concept, but I have to flesh it out procedurally.

    Don't conk out yet.
    Last edited by budwzr; 12th Jan 2013 at 20:23.
  10. Originally Posted by jimmyy View Post

    I would like to have realistic lip movements in terms of the way sounds are pronounced in the non-English language; the software that I've used has been very far from reality.
    That might be part of your problem; a lot of software is attuned to English "phonemes". Tones and pronunciations are different even within one language across a variety of accents. The other part of the problem might be using relatively low-poly models and/or models not rigged for complex speech - some mouth movements might be limited. Human speech is very intricate to model in a realistic fashion.
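    To make the phoneme point concrete, here's a rough sketch of how audio-driven lipsync tools typically work: each phoneme is mapped to one of a small set of visemes (mouth shapes), and anything outside the table falls back to a neutral shape. The table and function below are made up for illustration - not taken from Mimic, CrazyTalk, or any other product:

```python
# Hypothetical English-centric phoneme -> viseme table. A tool tuned
# only for English will mis-shape sounds that aren't in its table.
PHONEME_TO_VISEME = {
    "AA": "open",          # as in "father"
    "AE": "open",          # as in "cat"
    "IY": "wide",          # as in "see"
    "UW": "round",         # as in "too"
    "M":  "closed",        # lips pressed together
    "B":  "closed",
    "P":  "closed",
    "F":  "teeth_on_lip",
    "V":  "teeth_on_lip",
}

def phonemes_to_keyframes(phonemes, frames_per_phoneme=3):
    """Turn a phoneme sequence into (frame, viseme) keyframes.

    Unknown phonemes fall back to a neutral mouth shape - which is
    exactly why non-English sounds often end up looking wrong.
    """
    keyframes = []
    for i, ph in enumerate(phonemes):
        viseme = PHONEME_TO_VISEME.get(ph, "neutral")
        keyframes.append((i * frames_per_phoneme, viseme))
    return keyframes

# "ma" -> closed lips, then an open mouth
print(phonemes_to_keyframes(["M", "AA"]))
# [(0, 'closed'), (3, 'open')]
```

    Any non-English sound missing from the table lands on "neutral", which is one reason these tools look flat outside English.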


    I've come across Zing Track 2, which apparently is not working with DAZ, and thought that this could be a solution - software at around $150, no hardware... why not? But most probably I was dreaming with my eyes open.
    I haven't heard of this one; if it works out for you, please write a few comments about your experiences with it.


    If you happen to know of some software that works reasonably well at reading lip movements, I'd be happy to try it out.
    As I said earlier, I don't know of any that work well from visuals (e.g. analyzing pre-shot footage). The only ones that work fairly well are marker-tracked facial mocap systems.
  11. Member budwzr's Avatar
    Join Date
    Apr 2007
    Location
    City Of Angels
    Yeah, there are a lot of things that could make this undoable.

    But if the subject and camera are fixed, I think it's technically possible to build a luminance mask on the lips, which have their own color, using a secondary color corrector.

    And the speaker could be coached to minimize showing tongue and teeth, maybe? Like maybe pretending to be paralyzed except for the lips.

    Also, the camera can be very close, to help build the mask, with the footage scaled down later.

    Anyway, the completed mask could be parented to a gradient, then the gradient would distort, yes? Then capture that distortion as a bump or displacement map?

    Sounds plausible.
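    The color-keying half of this idea can be sketched in a few lines. This is purely illustrative (the threshold and the toy frame are made up), assuming a fixed, face-on camera and a frame represented as rows of (r, g, b) tuples:

```python
def lip_mask(frame_rgb, red_dominance=40):
    """Build a rough lip mask from a fixed, face-on frame.

    frame_rgb is a row-major list of rows of (r, g, b) tuples. Lips
    are usually redder than the surrounding skin, so we keep pixels
    where red exceeds green by some margin - a crude stand-in for a
    secondary color corrector. red_dominance is a made-up threshold.
    """
    return [[(r - g) > red_dominance for (r, g, b) in row]
            for row in frame_rgb]

def displacement_map(mask):
    """Turn the boolean mask into a 0..1 'displacement' map; a real
    pipeline would blur this and track it frame to frame."""
    return [[1.0 if hit else 0.0 for hit in row] for row in mask]

# Tiny 2x2 frame: one reddish "lip" pixel, the rest skin-toned.
frame = [[(200, 120, 110), (180, 170, 160)],
         [(185, 175, 165), (190, 180, 170)]]
print(lip_mask(frame))
# [[True, False], [False, False]]
```

    Lighting changes, teeth, and tongue all break a simple color key like this, which is roughly where the later objections in the thread come in.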
  12. Originally Posted by budwzr View Post
    Yeah, there are a lot of things that could make this undoable.

    But if the subject and camera are fixed, I think it's technically possible to build a luminance mask on the lips, which have their own color, using a secondary color corrector.

    And the speaker could be coached to minimize showing tongue and teeth, maybe? Like maybe pretending to be paralyzed except for the lips.

    Also, the camera can be very close, to help build the mask, with the footage scaled down later.

    Anyway, the completed mask could be parented to a gradient, then the gradient would distort, yes? Then capture that distortion as a bump or displacement map?

    Sounds plausible.

    I can see several problems:


    I think he wants to apply it to a 3D model (e.g. an .obj). A 2D gradient won't "map" to a 3D mesh very well, even if it's one face-on shot.

    How would you build the gradient in the first place? What characteristics of the source footage would provide the gradient?

    A gradient won't be able to drive a mouth animation very accurately.
  13. Member budwzr's Avatar
    Join Date
    Apr 2007
    Location
    City Of Angels
    Yeah, this thing gets ugly, fast. Hey, gotta run out for a pizza, I'm hongry.
  14. Member
    Join Date
    Feb 2012
    Location
    Europe
    Hi,

    Thank you for all the brainstorming.

    You are right, poisondeathray, that my intention is to apply it to a 3D object, and the non-English language plays a role as well in making it more difficult to find what I'm searching for.
    I was even thinking of 2D models, but I haven't found any much better solutions.

    I haven't tried Zing Track 2; I've just read the Daz forum:
    http://www.daz3d.com/forums/viewthread/3748
    and concluded that Zing Track 2 does not really work - if I understood correctly, it's mainly that you can't really capture the facial movements.

    I started searching the internet for alternatives to Zing Track 2 and found Maskarad from Di-O-Matic, but it's $1500, way over my budget. The reviews of Maskarad that I read were good.
    http://www.di-o-matic.com/products/Software/Maskarad/#page=overview

    Di-O-Matic has other, cheaper products as well, such as LipSync MX,
    http://www.di-o-matic.com/products/Software/LipsyncMX/
    which is 2D, but looking at the samples on their website:
    http://www.lipsync-mx.com/
    it looks like their poly count is very low.

    I was thinking of doing a close-up of the face with the mouth movements.
    Imagine another scenario: a deaf person trying to learn sign language. He/she would need to see clear lip movements, even if it's not the best-looking picture... and even if it's 2D, but be able to distinguish between an "a" and an "e".

    For example, I was thinking of creating a free lesson just for pronouncing the sounds (e.g. starting with the vowels).

    Do you think LipSync MX from Di-O-Matic might do it?

    I'll try to download the trial version and give it a try.

    Thank you once more for all your help.
  15. Originally Posted by jimmyy View Post

    I was thinking of doing a close-up of the face with the mouth movements.
    Imagine another scenario: a deaf person trying to learn sign language. He/she would need to see clear lip movements, even if it's not the best-looking picture... and even if it's 2D, but be able to distinguish between an "a" and an "e".
    Can I ask why this has to be done with a 3D model (instead of real live action, with real people)?

    For example, I was thinking of creating a free lesson just for pronouncing the sounds (e.g. starting with the vowels).

    Do you think LipSync MX from Di-O-Matic might do it?

    I'll try to download the trial version and give it a try.

    I doubt it will be accurate enough for what you want; I think those are markerless, and markerless trackers tend to have drift or lots of error. But it's worth checking out since they have a trial.

    Not only do the movements have to be precise, the models have to be higher poly and rigged properly.

    You can't apply motion data to a non-rigged mesh and expect it to move properly.
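    To see why rigging matters: rigged heads typically animate speech by blending morph targets (per-viseme vertex offsets from a base mesh), and motion data just drives the weights. A minimal, illustrative sketch with made-up two-vertex data - not any particular package's API:

```python
def blend_morph_targets(base, targets, weights):
    """Blend morph targets (per-vertex deltas from the base mesh).

    base: list of (x, y, z) vertices; targets: dict name -> list of
    per-vertex deltas; weights: dict name -> blend amount in 0..1.
    A mesh without such targets (a non-rigged mesh) gives motion
    data nothing to drive.
    """
    out = [list(v) for v in base]
    for name, w in weights.items():
        for i, delta in enumerate(targets[name]):
            for axis in range(3):
                out[i][axis] += w * delta[axis]
    return [tuple(v) for v in out]

# Two-vertex "mouth": the 'open' target lowers the jaw vertex 1 unit.
base = [(0.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
targets = {"open": [(0.0, 0.0, 0.0), (0.0, -1.0, 0.0)]}
print(blend_morph_targets(base, targets, {"open": 0.5}))
# [(0.0, 0.0, 0.0), (0.0, -1.5, 0.0)]
```

    A real speech rig needs many such targets, one per viseme, which is why low-poly or unrigged models can't reproduce fine mouth shapes.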
  16. Member
    Join Date
    Feb 2012
    Location
    Europe
    Originally Posted by poisondeathray View Post
    Can I ask why this has to be done with a 3D model (instead of real live action, with real people) ?
    This is a very good question.

    The other option is to use a real camera, with a real person, that pronounces the sounds very slowly...

    I tried it, and spent $200 on renting the equipment (camera, mics, actors)... but I'm disappointed by the results, because I've never recorded any semi-professional material: sometimes the camera is too low, sometimes it is too dark, sometimes there is background noise... I could not get two wireless mics; I got only one, plus a rod microphone.

    And the most difficult part is that I have to shoot in another location, and I have to fly to get there. Because I shot in a different location I had fewer options and got the last camera they had for rent, a MiniDV one. Then I had to import the footage at home, which was a small nightmare with the FireWire that was not working (after I struggled to make the FireWire PCI card work on WinXP, it no longer works on Win8). And I realized that the rod mic mounted on the camera appears in the shot, in the upper right corner...

    I know that I could do a chroma key, but that would mean quite some work as well - maybe less money, with a less expensive camera?

    Whereas with the animation, I could record the sound files myself (or have somebody else do it), have them animated by the software, change the camera position if needed, do a close-up...

    I was thinking of maybe buying a semi-professional camera, perhaps second-hand; the most important thing is the possibility to plug in an external mic... and bought new, that is more than $1000.

    I'm happy that you were thinking about the same option.
  17. If I were doing this, I would do it with real people. It would be easier and less expensive to get consistently good results. Yes, there is a lot to learn about shooting/studio setup and audio, but in the long run I think it would be a better investment - you can use the equipment for other purposes as well, so there's much more flexibility. To get realistic results with 3D models, you really need a marker-tracked mocap system. And even if you spend a load of cash, it still won't be as good as the real thing.
  18. Member budwzr's Avatar
    Join Date
    Apr 2007
    Location
    City Of Angels
    The benefit of using a model is that it can be adapted to different ethnic or cultural looks, or made into a cartoon for kids, etc. Like "Barney".

    There is another possibility. Motion Graphics.

    There is a newer particle-tool variation out now that uses a "pixel array" which can be fully animated in 3D, but can also be tuned to react to sound. It's quite sophisticated; in fact, you can tune it to react to different frequencies.

    Boris Continuum 8 has it. You can check it out at the Boris site. Look here: http://www.borisfx.com/sony/bccsvp/movie_gallery.php
    Last edited by budwzr; 13th Jan 2013 at 17:14.


