I have recorded a dialogue in a studio, and now I would like to make a short movie from it to publish on YouTube to support my educational material.
It's a dialogue between two people.
I was thinking of dividing the frame in two parts, for example the left part for person 1 and the right part for person 2.
I'm planning to buy some pictures to represent person 1 and 2.
I would need your help with two questions:
1) Which size of picture should I buy? The smaller ones (450 x 300 px) are cheaper, but I can also afford the bigger size (848 x 565 px).
I do not really need HD resolution for my free educational movies, but I think that nowadays 450 x 300 is quite small. If I place two of them side by side I end up with a frame of 900 x 300, which is a pretty strange format. I could place the subtitles under the pictures, which might lead to 900 x 400, but this is still a strange format.
2) How do I highlight who is talking?
a) I could add a roundish text box with the actual text, or
b) I was thinking of placing the subtitles under each speaker in turn, or
c) somehow darken the picture of the person who is not speaking, or
d) use only one picture at a time, that of the person speaking, filling the whole frame (using the bigger picture size), so it will be obvious who is speaking. But I think it might become too flickery, since in a dialogue the speaker sometimes changes very fast when a line is short, which is my case.
Many thanks for helping me out
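The frame arithmetic in question 1 can be checked with a few lines of Python. This is only a rough sketch: the function names `side_by_side` and `pad_to_aspect` are made up for illustration, and the 16:9 target is an assumption based on the usual widescreen video shape, not something stated in the question.

```python
from fractions import Fraction
from math import ceil


def side_by_side(width, height, count=2):
    """Size of a frame made by placing `count` equal pictures next to each other."""
    return width * count, height


def pad_to_aspect(width, height, aspect=Fraction(16, 9)):
    """Smallest frame with the given aspect ratio that fully contains width x height.

    Dimensions are rounded up to even numbers, which many video encoders prefer.
    The 16:9 default is an assumption, not a requirement.
    """
    if Fraction(width, height) > aspect:
        # Content is too wide for the target shape: keep the width, grow the height.
        new_w, new_h = width, ceil(width / aspect)
    else:
        # Content is too tall (or already matches): keep the height, grow the width.
        new_w, new_h = ceil(height * aspect), height
    # Round each dimension up to the next even number.
    return new_w + new_w % 2, new_h + new_h % 2


small = side_by_side(450, 300)   # (900, 300)
large = side_by_side(848, 565)   # (1696, 565)
print(small, "->", pad_to_aspect(*small))   # (900, 300) -> (900, 508)
print(large, "->", pad_to_aspect(*large))   # (1696, 565) -> (1696, 954)
```

Under these assumptions, two small pictures side by side leave roughly 200 pixels of vertical space in a 900 x 508 frame, which could hold the subtitles instead of leaving the frame at 900 x 300.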
Why not use 2 cameras and 2 friends to read the script and video it? Then edit it to cut between the speakers.
--
"a lot of people are better dead" - prisoner KSC2-303
It's about the scenery, the dialogue takes place in a specific place (e.g. restaurant) so I hope one day I'll be able to shoot it in a restaurant, but that's quite complicated to be arranged in my particular case (e.g. I might have to travel to a specific location...)
Unfortunately my friends are more serious types than the ones willing to go in front of the camera.
So with the pictures I can somehow "simulate" that environment, the question that I had is how to organise the pictures in the frame.
I would appreciate an answer on my initial questions.
I'm still hoping that someone could help me understand whether having only one person in the frame is better than having two people in the frame... In the end, if I can understand which one is better, I can try to figure out the rest by myself.
My fear is that if I have only one person (one picture) in the frame, the viewer will have to switch from one person to the other every 10-30 seconds, and the video will be difficult to follow.
I think this is more a question of shooting technique than an issue specific to my little project. The question is: if the exchange of lines is very fast, does it make sense to focus on the speaker all the time? I would tend to answer: not necessarily, sometimes you want to see the emotions/reactions on one speaker's face while the other one is saying something. But in my case, since I'll have still images, I can't show any emotions, and anyway the dialogue is quite simple.
Another, complementary question: how long should I wait before changing the focus so that the experience does not become too awkward for the viewer? I remember something like a rule of 5 seconds, but that seems too short, changing the speaker every 5 seconds...
Many thanks for your support
Last edited by jimmyy; 23rd Feb 2016 at 16:34.
What is the background context? What is this about? Why are you doing it, what is the purpose, or at least what is the topic of the conversation? That would probably help to guide how you might do it. You mentioned educational, specific place, 2 people, but that's pretty broad.
For example, if it's a dialog about cancer you would probably handle it differently than a talk about beer or the Super Bowl.
Why do you need to "buy" the pictures? What is it about? Do you need a specific, famous face?
You have over-looked the obvious.
To the watcher, it is easy to know who is talking since no two people sound alike.
Watch some dialogue scenes from movies or cartoons you like and see what kind of pace makes sense. See what you can follow comfortably. You're asking for a technical answer to a creative problem, and it seems you may be underestimating your audience
That said, try stuff out. Then tweak it and make it better.
Couple of thoughts:
You get out what you put in (or GIGO): These kinds of decisions should all be made with regard to the content, the emotion(s) you are trying to convey, and how you are trying to direct the audience's focus. There are probably 120 different ways to shoot & edit/composite this, depending on what you are trying to convey. Don't be stingy or lazy with your choices in one area, just to accommodate another. It'll show.
There are literally MILLIONS of people wanting to break into acting who could do this job, so the bit about not using your "serious" friends is a poor excuse. Are you going for an INFORMATIONAL/DOCUMENTARIAN or a DRAMATIC rendition of this? The answer should help steer you.
It seems you are going about this in a strange fashion/order: decisions like this should usually be made prior to actually recording ANYTHING ("pre-production"), yet you've already got the dialog recorded & edited, it seems. Well, think about similar historical projects that may have gone the way yours has. Re-purposed radio shows? Music videos?
You could possibly do a greenscreen shoot, and "place" your actors in the scenery/environment that you aren't able to travel to (here is where bought stock footage might make sense)?
Those arguments about the pacing of editing or the subtitling are weak: both have successfully used a wide variety of timings & styles over the years. Again, it's about what you want to convey. Figure out the WHY before you figure out the HOW (it should guide you).
@DB83, no 2 people sound EXACTLY alike, but viewers are often fooled. That's why there's such a thing as impersonators & impressionists. Not sure how this pertains anyway.
Of course with an impressionist one does see the performer so one knows it is not the real person.
Basically here we have a 'radio' performance. Having already created the script the simplest thing for the OP is to create a short introduction. He could still use the photos (I also do not see merit in this approach) and even use the non-speaker as the point of focus -full screen - which could indicate that the other is also listening.
First of all I would like to thank you all for your answers and your constructive approach. I'll try to answer and clarify all questions.
I have a website with free educational materials, it's about learning a foreign language.
I managed to find two native speakers, a man and a woman, that recorded the lines in a studio.
The dialogue is teaching you how to book a table, order a meal at a restaurant... these are short dialogues of max 10 lines each (10 lines man 10 lines woman).
I would love to shoot the dialogues with real actors, but it's quite complicated since I do not live in that country, so I would need to arrange everything...
And actually I tried it some years ago, and the actors were poor, actually they were not actors, they were just some friends, and in the restaurant there was background music... I couldn't really use that shooting.
I'm planning one day to still find another restaurant and shoot the dialogues but in the meantime I would like to give some visual aids to the students (public at large since the lessons are free), and this is why I would like to make a few short movies with the dialogues.
Each person has their own way of assimilating new information and learning new things. Some of us are more visual, others remember things more easily if they hear them. This is why I would like to have some video support.
I also find it much easier to remember if I would watch a movie, even with pictures related to the subject of the dialogue. This is why I found nice pictures with a waiter on the phone and another man on the phone (the one that orders), and then other pictures related to my dialogues.
I would love to do all this by myself, but this is a hobby as I have a job, family so cannot really afford to do it myself, and the lessons being free the budget is quite limited...
This is why I was thinking of buying some pictures and then making a small video.
I have Sony Vegas (not the professional) at home and I can learn very fast.
I can send you the link to the website by pm just let me know if you are interested, I'm not posting it here not to be accused of advertising...
I hope this answers your question, if you need more info, I'm happy to provide it.
I did what you've just suggested and searched for ways dialogues are shot. Most of the time you have a shot over the back of one person's head with the other speaker a bit further away, but I will never find such pictures, especially with a restaurant in the background, which is what I'm searching for.
What I was thinking is something like this:
And I was thinking of improving it by showing a picture with a real person in a real restaurant.
Another idea that I had was to animate the characters but I remember when I checked it some time ago it was quite complicated... as I would have to model the character and at least what they are holding in their hand (e.g. phone...)
What do you think?
Many thanks for your support
Last edited by jimmyy; 24th Feb 2016 at 07:57.
Well if it is a man and woman then it's a no-brainer.
Variation is the key here. It really does not matter what the viewer sees, since even a blind person knows a woman's voice does not normally come out of a man's mouth.
For me, the best learning conditions would be to see an actual human mouth pronounce the words (real human), along with both written languages as subtitles. I realize the real human component is not feasible for you now.
For me, a static picture of people doesn't really help me to actually learn the language. In that situation it doesn't matter if it's "cartoon" or real because there are no mouth movements. It's almost no different than a blank screen - but I suppose it does help in terms of situational context, and perhaps as a memory aid for some people as you suggested. The dialog in a restaurant would be different than in a doctor's office for example.
But the written text does help. I don't know who your primary audience is. It helps me to see both languages, though that might be too cluttered for what you are trying to do. I think both speakers should be present in the visual. A man's and a woman's voice are easy to distinguish, so I wouldn't worry too much about "highlighting" who is "speaking". But I would pay attention to the pace of the speech (slow) and the legibility of the text.
I remember now - you were asking about lipsync and 3d character animation a few years back.
Nothing has changed in terms of audio driven "automatic" animation. But a lot has changed in terms of mocap for both marker and markerless animation - basically it's getting more affordable, much better, more accurate and realtime (eg. you can have an avatar for IP chat sessions between people) - but it doesn't really help your situation, just some followup information
The slow part is already incorporated in the audio recording.
Could you help me, though, with my question on whether to divide the frame in two (left, right) and put, let's say, the woman on the left and the man on the right? I believe this type of composition is usually used for dialogues over the phone, isn't it (regardless of the actor being a human or a cartoon)? When it's a human actor it's obvious who is speaking. With a picture it is less obvious, but still obvious enough since it's a man and a woman. The other approach could be to fill the whole frame with the picture of the man when he is talking and with the picture of the woman when she is answering. I find it more "natural" to divide the screen in two (left, right). What do you think?
Many thanks for the links, I'll check them out for my curiosity.
All the best and I'm really happy to read your helpful replies and advice.
Last edited by jimmyy; 24th Feb 2016 at 14:23.
I would do it face to face, as in the examples you posted. If it is a "telephone" conversation with images, I would still do it face to face, but with a divider. I would position the text more on the side of whoever is speaking at the time. I think it's too distracting to do other things like adding a ring or highlighting the person.
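As a rough illustration of that layout (two pictures, a divider, text on the side of whoever is speaking), the rectangles can be worked out like this. It is only a sketch: the names `split_layout` and `subtitle_box` are hypothetical, the 900 x 400 frame with a 100 px subtitle strip is taken from the numbers mentioned earlier in the thread, and the resulting `(x, y, width, height)` boxes would still have to be entered by hand in whatever editor is used.

```python
def split_layout(frame_w, frame_h, subtitle_h):
    """Split a frame into left/right picture areas with a subtitle strip under each.

    Every rectangle is (x, y, width, height) with the origin at the top-left corner.
    """
    half = frame_w // 2                 # x position of the vertical divider
    picture_h = frame_h - subtitle_h    # height left over for the pictures
    return {
        "divider_x": half,
        "left": {
            "picture": (0, 0, half, picture_h),
            "subtitle": (0, picture_h, half, subtitle_h),
        },
        "right": {
            "picture": (half, 0, frame_w - half, picture_h),
            "subtitle": (half, picture_h, frame_w - half, subtitle_h),
        },
    }


def subtitle_box(layout, speaker):
    """Rectangle where the current line goes; `speaker` is "left" or "right"."""
    return layout[speaker]["subtitle"]


layout = split_layout(900, 400, subtitle_h=100)
print(subtitle_box(layout, "left"))    # (0, 300, 450, 100)
print(subtitle_box(layout, "right"))   # (450, 300, 450, 100)
```

With this arrangement both pictures stay on screen the whole time and only the text moves from one side to the other, so nothing flickers when the lines are short.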
Definitely human is the best for learning. Watching the mouth/facial movements and body language, not necessarily just zooming in on the mouth. The actors don't have to be supermodel good looking or anything like that, they just have to pronounce clearly and speak slowly. Cornucopia is right: actors are easy to find. You can find acting students who will do it for free, or people who just want to pad their resume. All they usually need is a credit at the end.
You explored the free DAZ Studio before, but the Genesis 3 model has more phonemes (still geared towards English), better facial animation, and a face rig with bones that can plug into various programs.
Here is a quick "automatic" example, using one of the male voice videos you posted above as the source (funny because it's a female model, doing both voices). Of course you can tweak/refine the movements, add facial expressions to make the character more "alive" - but this was 100% automatically audio driven + text of the words; even the eye blinking and face movements are automatic. (It's a low quality render, poor lighting, poor antialiasing, etc...but just to demonstrate the mouth movements). But it looks like it has slightly improved on the audio driven end for daz.
I still have the issue with the environment, which would have to be modeled, and that is quite complicated.