VideoHelp Forum




  1. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    I think I understand. If we used something besides SSRC, and resampled an audio clip from 44.1KHz to 48KHz without shifting it to a new 'bin', we would get a sizable pitch change, because the pitch would be shifted up along with the frequency. By shifting the data to another 'bin' (what's it called, by the way? domain?) you avoid the change in pitch associated with the frequency change.

    Can you give me more detail on these domains (something like a real-world example that resembles this technique)? I'm having a hard time getting my head around this.

    Do they simply record the audio, at the new sample rate? This would retain the pitch, but give you the new sample rate, which could then be slowed to the original sample rate.
    Impossible to see the future is. The Dark Side clouds everything...
  2. OK, think of a 2.205KHz tone. You have a 44.1KHz sampled version of a tone that lasts for 1 second (and for the sake of this discussion, has a phase of 0 degrees) in a WAV that is 3 seconds long (1 empty second before and 1 empty second after the tone). This means there are exactly 132,300 samples. In the middle are exactly 2205 cycles (2.205KHz * 1 second), which will take up 20 samples each (2.205KHz * 20 = 44.1KHz).

    Obviously if we play back that sound at 44.1KHz, we'll hear 1 second of silence, followed by one second of a 2.205Khz tone, followed by 1 second of silence.

    But let's say we instead play this back at 48KHz. We don't modify the data in any way whatsoever. We just play the samples faster. Immediately you should see that we will finish with our 132,300 samples in less than 3 seconds. We actually only have 2.75625 seconds worth of data, if we play it back at 48KHz. Also, we'll reach the end of the first 1/3 of our WAV in less than 1 second. The first section of silence will now last 0.91875 of a second instead.

    When we get to the tone, we'll be playing that back faster as well. Like the silence, the tone will only last 0.91875 of a second. But we'll also be playing the samples faster. So our tone will actually become a 2.4KHz tone. Obviously this is a higher tone.
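As a quick check of the numbers in this "just play it faster" step, here is a minimal Python sketch. The variable names are made up for illustration; the values come straight from the example above.

```python
# Playing unmodified 44.1 kHz samples at 48 kHz: the clip gets shorter
# and every frequency in it rises by the same ratio.
ORIGINAL_RATE = 44_100               # Hz, rate the WAV was sampled at
PLAYBACK_RATE = 48_000               # Hz, rate we play the same samples at
TOTAL_SAMPLES = 3 * ORIGINAL_RATE    # 3-second clip
TONE_HZ = 2_205                      # tone in the middle second

clip_seconds = TOTAL_SAMPLES / PLAYBACK_RATE           # whole clip, played fast
section_seconds = ORIGINAL_RATE / PLAYBACK_RATE        # each former 1-second section
played_tone_hz = TONE_HZ * PLAYBACK_RATE / ORIGINAL_RATE

print(f"samples in clip:   {TOTAL_SAMPLES}")
print(f"clip duration:     {clip_seconds:.5f} s")
print(f"each section:      {section_seconds:.5f} s")
print(f"tone heard at:     {played_tone_hz:.1f} Hz")
```

Nothing in the sample data changes here; only the rate at which the samples are consumed does.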

    Now that we are playing our audio at 48KHz, we have shortened it (while inadvertently raising the pitch), but our finished audio needs to be 44.1KHz (say for a VCD). We didn't modify the data at all, so if we just slow down to 44.1KHz, we're right back where we started. So instead, we have to resample, and actually modify the data. There are a variety of ways to resample, each with their own benefits and drawbacks.

    The worst method would be to drop a sample now and then so that we end up with 121551 samples (2.75625 seconds * 44.1KHz ~= 121551). We'd drop about one in every 12 samples (10749 of the 132,300). The quality of this type of sample rate conversion is very bad (for reasons I won't go into here).
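To make the "drop a sample now and then" idea concrete, here is a rough Python sketch. `drop_samples` is a hypothetical helper written only for this illustration; it is not what BeSweet or AVISynth do internally.

```python
# Crude sample-rate conversion by discarding samples (poor quality, shown
# only to illustrate the arithmetic).
INPUT_SAMPLES = 132_300
SRC_RATE = 48_000    # the rate we are pretending the samples have
DST_RATE = 44_100    # the rate the finished audio must have

output_samples = round(INPUT_SAMPLES * DST_RATE / SRC_RATE)
dropped = INPUT_SAMPLES - output_samples
drop_every = INPUT_SAMPLES / dropped   # roughly one dropped per this many


def drop_samples(samples, src_rate, dst_rate):
    """Keep the input sample at or just before each output position."""
    n_out = round(len(samples) * dst_rate / src_rate)
    return [samples[i * src_rate // dst_rate] for i in range(n_out)]


kept = drop_samples(list(range(INPUT_SAMPLES)), SRC_RATE, DST_RATE)
print(output_samples, dropped, round(drop_every, 1), len(kept))
```

Because whole samples simply vanish, the waveform gets small discontinuities, which is one reason this approach sounds bad.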

    Some better methods of sample rate conversion attempt to model the sound wave from the input samples, and then try to guess what the sound wave would have been doing if the output sample rate had been used. For example, if we were going from 44.1KHz to 88.2KHz, we would have to create twice as many samples as there were in the input. We could use the provided values for half, but we have to make up the other half. Since each unknown sample falls exactly halfway between two known samples (the ratio of 88.2KHz to 44.1KHz is exactly 2), we could just average the two on either side. This is called linear interpolation. There are some more complex interpolation methods, and of course our ratio of 48KHz to 44.1KHz isn't a nice convenient multiple, so something more sophisticated is needed. But in any case, we end up with a 2.75625 second clip at 44.1KHz.
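The linear interpolation described above can be sketched in a few lines of Python. Both function names are invented for this example, and real resamplers (SSRC, for instance) use much better filters than this.

```python
def upsample_2x_linear(samples):
    """Double the rate: keep each sample, insert the average of neighbours."""
    out = []
    for left, right in zip(samples, samples[1:]):
        out.append(left)
        out.append((left + right) / 2)   # made-up sample halfway between
    out.append(samples[-1])
    return out


def resample_linear(samples, src_rate, dst_rate):
    """Same idea for any ratio: blend the two nearest input samples."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate            # position in input samples
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


print(upsample_2x_linear([0, 10, 20]))
print(len(resample_linear([0.0] * 48_000, 48_000, 44_100)))
```

The second function handles awkward ratios like 48KHz to 44.1KHz, but it is still only a straight-line guess between samples.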

    Anyway, this is what BeSweet does. AVISynth can do the same thing with:
    Code:
    AssumeSampleRate(48000)
    ResampleAudio(44100)
    Now the method I described above goes through a different process. In that case, you take your original 3 second WAV, with the 2.205Khz tone and convert to the frequency domain. Now instead of 132,300 samples, you have 132,300 bins (there can be a different number, and the bins can hold more than one number, but this is a simple example). Each bin represents a frequency.

    In practice, the entire WAV wouldn't be converted to the frequency domain as a whole. Instead, pieces called frames would be converted, operated on, and then converted back. The size of the frame varies, and the size of the frames determines what the actual frequency values of the bins are. The more bins, the closer together the frequencies will be. For this example, let's assume we have the right number of bins so that one of them represents exactly 2.205KHz (if this isn't the case, then in most implementations the amplitude must be spread out among more than one bin around the target frequency, which is much too complicated for this example, and if the phase isn't 0, the amplitude is spread out as well). So only that bin will have anything in it. A more complex waveform would have values in lots of bins.

    For the first and last 1 second worth of the frames, all the bins would be 0, since there is no audio present. But in the frames of the 2nd second, we have the tone. If our original tone had an amplitude of 20,000, the 2.205Khz bin will contain a value that represents that amplitude (again the implementation will determine the granularity of the bin).

    Now, to make the same magnitude of change as in the time domain example above, we would want to change the frequency of our 2.205KHz tone to 2.4KHz. So we set the 2.205KHz bin value to 0, and then move the 20,000 value to the 2.4KHz bin. For a more complex waveform, with data in more bins, we'd move each bin's contents up by the same ratio (48/44.1, or about 1.088 times its frequency), not by a fixed .195KHz offset, which would put the harmonics out of tune with each other (any bins left empty are zero-filled, and any data that would land above the available bins is lost). Then we convert back to the time domain.

    At this point we still have 132,300 samples, but we have a 2.4Khz tone instead of a 2.205KHz tone. The duration is the same 3 seconds as before, and the tone is still 1 second long.
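Here is a toy Python version of the "move the tone to another bin" step. Everything about it is an assumption chosen to keep it small: a single 64-sample frame, a naive DFT instead of an FFT, a tone sitting exactly on bin 4, and a move to bin 5. Real tools use overlapping, windowed frames.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (slow, fine for a tiny frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


FRAME = 64                 # samples per frame (illustrative)
SRC_BIN, DST_BIN = 4, 5    # tone starts on bin 4, we move it to bin 5

frame = [math.sin(2 * math.pi * SRC_BIN * t / FRAME) for t in range(FRAME)]
spectrum = dft(frame)

# Zero every bin, then copy the tone's value into its new bin. The mirror
# bin (FRAME - k) must move too, or the result would not be a real signal.
shifted = [0j] * FRAME
shifted[DST_BIN] = spectrum[SRC_BIN]
shifted[FRAME - DST_BIN] = spectrum[FRAME - SRC_BIN]

result = idft(shifted)
expected = [math.sin(2 * math.pi * DST_BIN * t / FRAME) for t in range(FRAME)]
max_error = max(abs(a - b) for a, b in zip(result, expected))
print(len(result), max_error < 1e-9)
```

The frame still has 64 samples afterwards, so the duration is unchanged; only the frequency of the tone inside it has moved.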

    Of course, as in the example from the previous post, our goal isn't to shift the pitch, but to speed things up. If we go through the time domain exercise above, we'll end up shifting the frequency even higher. So what we should have done is shift the frequency down. Instead of moving the 20,000 from the 2.205KHz bin to the 2.4KHz bin, we should have moved it down to the 2.026KHz bin (2.205 * 44.1 / 48 ~= 2.026). Then after we convert back to the time domain, we have a 3 second clip with a 1 second tone that is lower than the one we started with.

    Now we can apply our original time domain conversion. We treat the 44.1KHz samples as if they were actually 48KHz. Our clip becomes 0.91875 times the original. And the pitch is shifted up by approximately 1.09x. 1.09 x 2.026Khz = 2.205Khz, so we are back to our original frequency.
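The two-step trick (shift the pitch down first, then play faster) can be checked with a little Python arithmetic. The names below are illustrative only.

```python
# Pre-shift the tone down, then speed up playback: pitch returns to the
# original while the clip still gets shorter.
ORIGINAL_RATE = 44_100
PLAYBACK_RATE = 48_000
TONE_HZ = 2_205.0

speed_ratio = PLAYBACK_RATE / ORIGINAL_RATE      # about 1.088
pre_shifted_hz = TONE_HZ / speed_ratio           # frequency-domain step
final_hz = pre_shifted_hz * speed_ratio          # after the time-domain step
duration_factor = 1 / speed_ratio                # clip length multiplier

print(f"pre-shifted tone: {pre_shifted_hz:.1f} Hz")
print(f"final tone:       {final_hz:.1f} Hz")
print(f"duration factor:  {duration_factor:.5f}")
```

The pre-shift and the speed-up use the same ratio in opposite directions, which is why the pitch cancels out and only the duration change remains.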

    Whew! That's one way of doing it. There is a lot of complexity left out (believe it or not), and as I mentioned in the previous post, this isn't how WSOLA does things. In that case, they combine some of the stuff, and I think they do the resampling in the frequency domain, or at least when converting between the frequency and time domains. But hopefully you get the idea.

    Xesdeeni
  3. Member
    Join Date
    Jun 2002
    Location
    United States
    DJRumpy, yes, I am talking about BeSweet. Concerning the PAL SVCDs that I have done, the audio is at 44.1KHz; going to NTSC DVD, we all know it must be 48KHz.

    All I was saying is that in my case of taking the PAL SVCD and converting the framerate from 25 fps to 23.976 fps, I have not noticed any difference in pitch, but Xesdeeni's posts have been informative as usual.

    I'm always willing to try new things and BeSweet has not failed me as of yet in the type of conversion I do. I was wondering about using the AVISynth script that you all have used. Better results?
  4. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    If you're not hearing the pitch shift from BeSweet, then I would think it should be fine to use AVISynth.

    Xesdeeni, would a new WAV recording, recorded at a higher frequency, and then slowed down to the old frequency rate also accomplish the same thing? I know you can record an existing sound into a new file at a different frequency, retaining the sound, pitch, and length, but using the new frequency. Also, are all frequencies equal in the amount, or quality of sound they can hold?

    I understand the concept of what you're saying. I'm still having a bit of trouble with the process of converting to the frequency domain. I may do some more research on the net to see what I can turn up. I like to know what makes things tick if you haven't figured that out already...
  5. Hey Guys,

    I followed your instructions perfectly, but I ran into a problem.
    After I created the AVISynth file and put it in TMPGenc, it runs fine. But it doesn't do the whole movie: it finishes at 100% at 1hr 54 minutes, but the film is actually 2hrs and 6 minutes. Is that right?


    Thanks
  6. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    I'm guessing you sped up the framerate on your output? (you didn't specify). You can always just view the output to ensure the entire video encoded properly.
  7. @DJRumpy

    Sorry I didn't see this earlier:
    Originally Posted by DJRumpy
    Xesdeeni, would a new WAV recording, recorded at a higher frequency, and then slowed down to the old frequency rate also accomplish the same thing?
    No, because when you slow down the playback, you lower the pitch. Step in the "Way Back Machine" and think about playing a 45 RPM record at 33 RPM.
    I know you can record an existing sound, into a new file, at a different frequency, retaining the sound, and pitch, and length, but using the new frequency.
    You can also resample, which accomplishes roughly the same thing (roughly, because the result is only an algorithmic approximation).
    Also, are all frequencies equal in the amount, or quality of sound they can hold?
    I'm not sure what you mean by this, but I'll scatter-shoot. The human hearing system is logarithmic in nature. You've seen the dB (decibel) system of sound amplitude (loudness) measurement, which is logarithmic as well. But our hearing system is logarithmic in frequency as well. If you take the musical note A below middle C, it is 220Hz. The next A is 440Hz. The next A is 880Hz. But the perceived distance between these notes is the same. So in reality, as the frequency increases, we need less resolution. Some techniques, especially for compression of audio, use non-linear frequency bins. But I believe that the Fourier and Cosine Transforms natively are linearly spaced.
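The point about the three A notes can be shown numerically. This is just a small Python illustration with made-up names: the gaps in Hz double each time, but measured in octaves (a log2 scale) they are identical.

```python
import math

A_NOTES_HZ = [220.0, 440.0, 880.0]   # three successive A notes


def octaves_between(f_low, f_high):
    """Perceived musical distance: log base 2 of the frequency ratio."""
    return math.log2(f_high / f_low)


pairs = list(zip(A_NOTES_HZ, A_NOTES_HZ[1:]))
linear_gaps = [high - low for low, high in pairs]
octave_gaps = [octaves_between(low, high) for low, high in pairs]

print("gaps in Hz:     ", linear_gaps)
print("gaps in octaves:", octave_gaps)
```

With linearly spaced bins, the upper octave gets twice as many bins as the one below it, even though both sound like the same musical distance.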
    I understand the concept of what you're saying. I'm still having a bit of trouble with the process of converting to the frequency domain.
    I have looked at the formulae, but they make my head hurt. Many people have spent many brain-hours on them, and I'm happy to ride on their coattails. The Fast Fourier Transform (FFT) and Discrete Cosine Transform (DCT) exist in a number of heavily optimized forms just for these types of uses. A friend has an entire textbook dedicated to just DCTs.
    I may do some more research on the net to see what I can turn up.
    Check out http://www.fftw.org/ and http://www.cs.sfu.ca/CourseCentral/365/li/material/notes/Chap4/Chap4.2/Chap4.2.html (especially the picture about 1/4 of the way down).
    I like to know what makes things tick if you haven't figured that out already...
    Don't we all

    Xesdeeni
  8. Hey DJrumpy,

    Here is my script:

    LoadPlugin("C:\Program Files\GordianKnot\MPEG2Dec3.dll")
    mpeg2source("Matrix.d2V")
    AssumeFPS(23.976,True)
    LanczosResize(720,480)

    I am taking a PAL progressive source and trying to frameserve it into TMPGenc. I would much rather use CCE, but I get an error stating it can't find a codec for YV12.

    Thanks in advance for the help
  9. Forgot to mention it's a DVD as well, thanks.
  10. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    Thanks Xesdeeni. I'll check these out tonight.
  11. Hey Guys,

    Is my script ok for what I am trying to do? And does anyone know why CCE won't work with it? Thanks
  12. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    Sorry repdetect2, I didn't see your post. You need to make the following change:

    LoadPlugin("C:\Program Files\GordianKnot\MPEG2Dec3.dll")
    mpeg2source("Matrix.d2V")
    AssumeFPS(23.976,True)
    LanczosResize(720,480)
    ConvertToYUY2()

    That will eliminate the YV12 error you're getting. If you're using TMPGenc, then use ConvertToRGB() instead.

    You should just place your MPEG2DEC3.DLL in your AVISynth Plugins directory. That way you don't have to explicitly call it in every script. It will be autoloaded for you. You'll find the plugin directory in your Program Files\AVISynth folder. Just drop the DLL in there, and remove the LoadPlugin line from your script.
  13. Thanks a lot for the help!! I am definitely going to learn as much as I can and try to help people like you have helped me. Thanks to you and everyone.
  14. Hey Djrumpy,

    I must be doing something wrong. I loaded the script and I didn't get the error which is awesome!

    The problem now is CCE shows the duration of my film as being 10 minutes when it's 2 hours.

    I verified in DVD2AVI that the project file completed with a time of 2 hours.

    Am I doing something wrong here?
  15. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    Make sure you've got the decimal point in the AssumeFPS(23.976,True) command and the right framerate (23.976)

    You can also drag/drop the .AVS script into VirtualDub to view the results/file info, or just double click it, and associate it with Media Player. It will play as if it was an AVI.

    What does the output look like? Does it just stop at 10 minutes, or is it extremely fast?
  16. Ok, quick question guys. I have some cartoons on PAL DVD that I want to convert over to NTSC. The guide listed looks great in theory... Only one problem. I am running AVISynth 2.5x and SmoothDeinterlacer doesn't work with it. Any suggestions other than downgrading to an earlier AVISynth version? Is there a comparable deinterlacer that will also change the framerate within the function (i.e. doublerate)?

    Also, I too would like to keep the menus from the DVD. Actually I would like to keep the entire PAL DVD, just as NTSC. I will work on some things to convert the menus over but if any one has any suggestions I will be more than glad to listen. Thanks.
  17. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    Start with the AVISynth site. Look at the main page filter list. They list tons of 2.5x filters. There are probably multiple 3rd party plugins that will do an equivalent job.

    www.avisynth.org
  18. Someone over at Doom9 did a port of SmoothDeinterlacer to 2.5.x. Do a search over there.

    The only way I know of to keep the whole DVD is to re-author by hand. You have to convert the motion menus as you would video. The still menus must go back to still images for most authoring software, so I use an image program (a la Photoshop) to scale them. Then you have to recreate the buttons and all the links.

    Sorry. I've posted a few times over at Doom9 to try to get a handle on an automated way of doing this, but so far, no joy.

    Xesdeeni
  19. DJrumpy,

    Thanks a lot for the help; it was a simple syntax issue.

    I need to pick your brain again though. I ran everything through CCE and it created the file with no problem.

    When I try to author it with DVD Maestro or TMPGEnc DVD Author, it says the framerate is invalid at 23.976. How can I author it? Do I need to patch the file or something?

    Thanks in advance DJRumpy, you're my new hero
  20. Member FulciLives's Avatar
    Join Date
    May 2003
    Location
    Pittsburgh, PA in the USA
    Originally Posted by repdetect2
    DJrumpy,

    Thanks a lot for the help; it was a simple syntax issue.

    I need to pick your brain again though. I ran everything through CCE and it created the file with no problem.

    When I try to author it with DVD Maestro or TMPGEnc DVD Author, it says the framerate is invalid at 23.976. How can I author it? Do I need to patch the file or something?

    Thanks in advance DJRumpy, you're my new hero
    You probably created a progressive NTSC video stream at 23.976fps, so you need to run PULLDOWN.EXE on it, which will add the "flags" needed so the DVD player knows it needs to perform 3:2 pulldown. After doing this, your authoring program should see it as 29.97fps.

    - John "FulciLives" Coleman
    "The eyes are the first thing that you have to destroy ... because they have seen too many bad things" - Lucio Fulci
    EXPLORE THE FILMS OF LUCIO FULCI - THE MAESTRO OF GORE
  21. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    Exactly. Get the GUI for pulldown. You should be able to find both in the TOOLS section. If not, look on the www.doom9.org site.

    The GUI is called PullDownBatchFE. You can download it direct here: http://guiguy.wminds.com/downloads/pulldownbatchfe/down.html

    When you run this, you will first be prompted for the location of your PULLDOWN.EXE file. Just select it, and click OK. After that, you simply use the BROWSE button to find your .M2V video file. For your OPTIONS, turn off (uncheck) everything except the 'Set drop_frame_flag' option. The other options don't need to be checked under normal circumstances.

    The output will be another copy of your .M2V file (it will have _PD added to the filename: example_pd.m2v), but with pulldown flags added to it. It will report itself as 29.97 fps, and should import into your authoring software without issue.
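For anyone wondering how 23.976 fps turns into 29.97 fps, here is a small Python sketch of the 3:2 cadence a player follows when it honours the pulldown flags. The function name and frame labels are made up for illustration; PULLDOWN.EXE itself only sets flags in the stream and does not duplicate any picture data.

```python
FILM_FPS = 23.976


def pulldown_32(frames):
    """Expand progressive frames into fields using a 3,2,3,2... cadence."""
    fields = []
    for index, frame in enumerate(frames):
        repeat = 3 if index % 2 == 0 else 2   # alternate 3 fields, 2 fields
        fields.extend([frame] * repeat)
    return fields


film_frames = ["A", "B", "C", "D"]
fields = pulldown_32(film_frames)
# Two fields make one displayed (interlaced) video frame.
video_frames = [fields[i:i + 2] for i in range(0, len(fields), 2)]
display_fps = FILM_FPS * len(video_frames) / len(film_frames)

print(fields)
print(video_frames)
print(f"{display_fps:.2f} fps")
```

Every 4 film frames become 5 video frames, so the running time stays the same while the reported frame rate rises by a factor of 5/4.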
  22. DJRumpy and Fulcilives

    Thanks a lot, I was hunting trying to find what settings to use this weekend.

    I am going to try this again with the PAL version of The Matrix and let you know how it goes.

    One quick thought: when I use BeSweet GUI to change my AC3 audio, do I just use the template for PAL to NTSC 23.976 or PAL to NTSC 29.97?

    Thanks so much guys, learning as I go.
  23. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    PAL -> NTSC (25.000 to 23.976)

    This option will match the audio to your video's new 23.976 framerate.
  24. DJRumpy,

    Thanks a lot. You should really write a guide; your knowledge on this stuff is amazing. I got fortunate that I found this particular post. Just plugged my file into CCE, got about 2 hours to go....
  25. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    I've been thinking about it.

    By the way, BeSweet GUI will also let you convert 5.1 sound to/from PAL and NTSC while retaining the distinct 5.1 sound channels. You must be sure to set up both the AC3 options under the AC3 & Ogg sections, as well as the BeSweet options. Your command line listing should have a tag added to it that looks like this: -ac3enc( -b 384 -6ch )

    If the -6ch bit isn't there, recheck your AC3 config. You must have the latest version of the BeSweet GUI to see the AC3 5.1 sound option (mine is v0.6 B76). This way you can keep your 5.1 sound and convert from/to PAL. I'm not sure what version of BESWEET.EXE you need, but I would get the latest to be sure ( http://dspguru.doom9.net/ ).

    ::EDIT::
    I went in with the latest betas on both the GUI and the EXE. You have to manually open a command prompt, and add the -6ch switch (use copy/paste to get all of it into a batch file via Notepad). This method does work, as I verified it over the weekend. The easiest method would be to copy/paste the entire command line into Notepad, and save it as a .BAT file. Make sure you add the -6ch switch right after the -b parameter (see above).
  26. DJrumpy,

    Thanks for the additional information, I would have missed that one.

    I just left CCE template the way it was when I loaded my avs file, any suggestions on that?

    I am going to pick your brain while I have you around

    And I would definitely encourage you to write the guide; I know a lot of guys like me would love it as source material...
  27. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    Assuming your source is progressive, and Top Field First, I would normally leave everything disabled, except for the Top Field First option, and some tweaking on the QUALITY page.

    I turn off all filters on that page. For DVD video, I set the Quantizer Characteristics slider to 32, or to 16 for VCD/CVD/SVCD.

    Intra Block DC Precision: 9 for DVD, 8 for VCD/CVD/SVCD

    Block Scanning Order: ZigZag for progressive input, and Alternate for interlaced.

    Progressive Frame Flag: Check this only for progressive input (25fps progressive, or 23.976 progressive).

    I don't use the Simple settings.

    On the VIDEO settings page, I pick the Matrix according to the format I'm converting to:

    Standard: DVD
    Very Low Bitrate: SVCD
    Ultra Low Bitrate: VCD

    On the Main settings page, depending on your version (I'm on 2.66), I would use 3:2 pulldown if my source was telecined. This option will automatically enable the 'Progressive Frame Flag' for MPEG-2 Video. I normally would not use this option, as I perform IVTC before encoding.

    Letter Box Hint is used in conjunction with the 3:2 Pulldown option, to tell CCE to ignore the letterboxed area when trying to detect telecined (duplicate) frames. Use this if you are using 3:2 Pulldown detection, and your input is letterboxed. It has no effect if "3:2 Pulldown detection" is not enabled.

    The Panscan option will affect Playback on a DVD player for 16:9 video. It will tell the player to Pan & Scan playback on a 4:3 television (Gives you fullscreen, instead of letterboxed display on a 4:3 television).

    Other than those settings, I typically leave everything else disabled.
  28. DJRumpy,

    Thanks alot for the information. I did everything just like you said.
    Before I author, I noticed the video is now longer than the audio after I used the pulldown program. It's longer by about 5 minutes; is that OK?

    Thanks again
  29. Member DJRumpy's Avatar
    Join Date
    Sep 2002
    Location
    Dallas, Texas
    Most definitely not. Since you're using AVISynth and the AssumeFPS command, we know your video is 23.976 frames per second. You can verify the command in your script if you like. It should look like this:

    AssumeFPS(23.976)

    You can verify the script by dropping it into VirtualDub, and looking at FILE | FILE INFORMATION. It will report the output framerate of your script.

    I would suspect the audio is the culprit. When converting audio in BeSweet GUI, it's tempting to just select your input, check the PAL to FILM option, and set the output to MP2/AC3 or whatever you're converting to. I find it tends to fail when doing this 'one-step' process.

    Convert your PAL file to WAV first.
    Take the new PAL wav and convert it to NTSC wav.
    Take the NTSC wav, and convert it to your output format (MP2/AC3).

    Do each of these as a separate step in BeSweet GUI, and see if that doesn't fix your difference in time. You can always look at the length of your original PAL video to see if your audio matches the old PAL length. If it does, then the audio hasn't been converted to FILM properly.

    Both your audio and video should be about 5 minutes longer than your original PAL video.
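The "about 5 minutes longer" figure follows from the frame rate ratio. Here is a rough Python check, assuming a 120 minute PAL running time as in the film discussed above; the function name is only for illustration.

```python
PAL_FPS = 25.0
FILM_FPS = 23.976


def slowed_duration_minutes(pal_minutes):
    """Same frames played at 23.976 fps instead of 25 fps take longer."""
    return pal_minutes * PAL_FPS / FILM_FPS


pal_minutes = 120.0
film_minutes = slowed_duration_minutes(pal_minutes)
extra_minutes = film_minutes - pal_minutes
audio_tempo = FILM_FPS / PAL_FPS   # audio must be slowed by this factor too

print(f"new running time: {film_minutes:.2f} min")
print(f"extra time:       {extra_minutes:.2f} min")
print(f"audio tempo:      {audio_tempo:.5f}")
```

If the audio still matches the original 120 minutes while the video has grown by this amount, the audio conversion step did not take effect.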
  30. DJRumpy,

    Thanks for the information. That did it; everything is as it should be now. I wish I could think of some more questions to ask, but you're just too darn good at answering them for me


