VideoHelp Forum


  1. So I have a fundamental ignorance of when audio needs reencoding. When trimming a video on keyframes, one can export losslessly in copy mode in many applications; however, trimming between keyframes will cause corruption. Is the same true for audio? There's something called the sample rate, measured in Hz (e.g. 48000 Hz); does that mean you can cut audio anywhere and concatenate without reencoding? I'm really just not sure how audio is encoded in the first place, so I can't judge when it can be cut losslessly. If it matters, I'm generally referring to AAC LC and Opus (libopus). Any help?
  2. Member Budman1
    Join Date
    Jul 2012
    Location
    NORTHWEST ILLINOIS, USA
    My experience has been that cutting, encoding and trimming can leave the audio longer or shorter than the video, by anything from a couple of milliseconds to several video frame lengths. This is normally not a problem during playback, but it can cause a loss of sync when joining several segments that each have this variation in length.

    Normally I fix this by slightly stretching the audio track or by padding it with silent audio. For instance, if you use Clone to add still frames to the start or end of a segment, that portion will have no audio unless you fill it with silence. So if you cloned the first frame for 5 seconds at the start of the video (FFMpeg Clone filter), the resulting audio would be around 5 seconds shorter than the video.
  3. Originally Posted by Budman1 View Post
    My experience has been that cutting, encoding and trimming can leave the audio longer or shorter than the video, by anything from a couple of milliseconds to several video frame lengths.
    Thanks for the reply. So this is when you export in copy mode? Or does it also happen when reencoding? For my purpose I want to learn how to cut audio without reencoding, if possible. I understand that with video, because of interframe compression, you have to cut on I-frames, which contain the full data for that frame, but with audio I frankly don't know how the compression works. Is there so-called "inter" compression that is temporal in nature? Where is it safe to cut, and how do you know when you need to reencode: only when you apply filters to the audio?
  4. DECEASED
    Join Date
    Jun 2009
    Location
    Heaven
    Compressed audio is divided into segments called "frames" and each frame normally represents a fixed quantity of samples.
    So, when an application trims a compressed audio track losslessly, it always removes an integer quantity of frames, and at least AFAIK there is no (exact) audio equivalent of the so-called "smart rendering" for video.
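
To make the frame idea concrete, here is a minimal Python sketch of how a lossless trim might snap a requested range to whole audio frames. It assumes a fixed 1024 samples per frame at 44.1 kHz (typical for AAC LC) and a hypothetical policy of keeping only frames that lie fully inside the requested range. `snap_to_frames` is an illustrative name, not any tool's actual API, and real applications may round differently.

```python
from fractions import Fraction

def snap_to_frames(start_s, end_s, sample_rate=44100, samples_per_frame=1024):
    """Snap a requested trim range to whole audio frames.

    Assumptions (hypothetical policy, not taken from any specific tool):
    every frame holds exactly `samples_per_frame` samples, and only frames
    lying fully inside [start_s, end_s] are kept.
    Returns (aligned_start_s, aligned_end_s, frames_kept).
    """
    frame = Fraction(samples_per_frame, sample_rate)  # exact duration of one frame
    first = -(-Fraction(start_s) // frame)            # ceil: first frame boundary at/after start
    last = Fraction(end_s) // frame                   # floor: last frame boundary at/before end
    return float(first * frame), float(last * frame), int(last - first)

start, end, frames = snap_to_frames(10, 30)
print(f"requested 10.000 s to 30.000 s")
print(f"kept      {start:.3f} s to {end:.3f} s ({frames} whole frames)")
```

The point is only that the kept range can differ from the requested one by up to one frame at each end, which is where millisecond-scale length differences can come from.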
    Last edited by El Heggunte; 12th Jun 2021 at 21:25. Reason: .
    "Like this facility, I don't exist."
  5. Originally Posted by El Heggunte View Post
    Compressed audio is divided into segments called "frames" and each frame normally represents a fixed quantity of samples.
    So, when an application trims a compressed audio track losslessly, it always removes an integer quantity of frames, and at least AFAIK there is no (exact) audio equivalent of the so-called "smart rendering" for video.
    So there is temporal compression, but the "reference frame" for audio does not match the video? So when you cut on keyframes for video, should one always reencode the audio, or am I misinterpreting? If so, is it best to use the same algorithm at the same bitrate or higher?
    When I use ffmpeg to concatenate losslessly sometimes it complains "DTS is not monotonically increasing". So is there a timestamp to the audio samples which matches the video and those warnings indicate a sync issue?
    I would just like to know when it is safe to trim and save audio losslessly vs when one must encode (quality loss).
  6. Member
    Join Date
    Mar 2008
    Location
    United States
    I've cut many videos in Avidemux with the audio on "copy" and never noticed a problem. According to this article, a frame of MP3 data contains 1152 samples and lasts about 26 ms (at 44.1 kHz).
    https://stackoverflow.com/questions/6220660/calculating-the-length-of-mp3-frames-in-milliseconds
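
The 26 ms figure is just samples per frame divided by sample rate. A small sketch of that arithmetic follows, in Python. The MP3 frame size comes from the linked article; the 1024-sample AAC LC frame is the usual value, and the Opus entry assumes the common 20 ms frame at 48 kHz (Opus supports other frame durations, so treat that row as an assumption).

```python
def frame_ms(samples_per_frame, sample_rate_hz):
    """Duration of one compressed audio frame in milliseconds."""
    return 1000.0 * samples_per_frame / sample_rate_hz

# (samples per frame, sample rate in Hz); the Opus row is an assumed default.
examples = {
    "MP3, 1152 samples @ 44.1 kHz": (1152, 44100),
    "AAC LC, 1024 samples @ 44.1 kHz": (1024, 44100),
    "AAC LC, 1024 samples @ 48 kHz": (1024, 48000),
    "Opus, 960 samples @ 48 kHz (assumed 20 ms frame)": (960, 48000),
}

for name, (spf, rate) in examples.items():
    print(f"{name}: {frame_ms(spf, rate):.2f} ms per frame, {rate / spf:.3f} frames/s")
```

So a cut in copy mode can only land on a grid of roughly 20 to 26 ms steps, depending on codec and sample rate.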
  7. Originally Posted by davexnet View Post
    I've cut many videos in Avidemux with the audio on "copy" and never noticed a problem. According to this article, a frame of MP3 data contains 1152 samples and lasts about 26 ms (at 44.1 kHz).
    https://stackoverflow.com/questions/6220660/calculating-the-length-of-mp3-frames-in-milliseconds
    Are these videos in which you copied the audio but reencoded the video because you didn't cut on keyframes? And there's no audio drift, corruption or muting in the final product?
  8. Member Budman1
    Join Date
    Jul 2012
    Location
    NORTHWEST ILLINOIS, USA
    OK, a quick test for demonstration with a 1 min 0 s video: cut starting at the I-frame at frame 300 (10 s) and running for 20 seconds (up to an I-frame).



    Original:
    General
    Complete name : D:\inside.mp4
    Format : MPEG-4
    Format profile : Base Media
    Codec ID : isom (isom/iso2/avc1/mp41)
    File size : 8.94 MiB
    Duration : 1 min 0 s
    Overall bit rate : 1 250 kb/s

    Video
    ID : 1
    Format : AVC
    Format/Info : Advanced Video Codec
    Format profile : Main@L3.1
    Format settings : CABAC / 5 Ref Frames
    Duration : 1 min 0 s
    Bit rate : 1 147 kb/s
    Frame rate : 30.000 FPS

    Audio
    ID : 2
    Format : AAC LC
    Format/Info : Advanced Audio Codec Low Complexity
    Codec ID : mp4a-40-2
    Duration : 1 min 0 s
    Bit rate mode : Constant
    Bit rate : 96.0 kb/s
    Channel(s) : 2 channels
    Channel layout : L R
    Sampling rate : 44.1 kHz
    Frame rate : 43.066 FPS (1024 SPF)

    =============================
    CUT for 20 SECONDS
    ffmpeg -ss 10.000000 -i "D:\inside.mp4" -vframes 600 -c:v copy -c:a copy -y "D:\0_0_inside.mp4"

    D:\inside.mp4
    Successful Cut at Frame 900
    0 10.000000 1 I
    19.833333,P
    19.866667,P
    19.900000,P
    19.933333,P
    19.966667,P Last Frame of cut
    60 12.000000 1 I

    General
    Complete name : inside.mp4
    Format : MPEG-4
    Duration : 20 s 0 ms
    Overall bit rate : 1 345 kb/s

    Video
    ID : 1
    Format : AVC
    Format/Info : Advanced Video Codec
    Format profile : Main@L3.1
    Format settings : CABAC / 5 Ref Frames
    Duration : 20 s 0 ms
    Bit rate : 1 242 kb/s

    Audio
    ID : 2
    Format : AAC LC
    Format/Info : Advanced Audio Codec Low Complexity
    Codec ID : mp4a-40-2
    Duration : 19 s 993 ms
    Bit rate mode : Constant
    Bit rate : 96.0 kb/s


    All streams were copied, and there is a slight difference between the video and audio durations (0.007 s, i.e. 7 ms), which may vary in other videos depending on the length of the cut, the video content and the compression algorithm.
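
One plausible reason the copied audio cannot come out at exactly 20 s is that 20 s is not a whole number of 1024-sample frames at 44.1 kHz. The Python sketch below only shows that arithmetic; it is not a claim about what ffmpeg or MediaInfo actually did here, since container details such as encoder delay can shift reported durations by a millisecond or so.

```python
import math

SAMPLE_RATE = 44100        # Hz, from the MediaInfo report
SAMPLES_PER_FRAME = 1024   # AAC LC frame size, from the MediaInfo report
TARGET_S = 20.0            # duration of the copied video

def frames_to_seconds(n):
    """Duration of n whole audio frames in seconds."""
    return n * SAMPLES_PER_FRAME / SAMPLE_RATE

exact = TARGET_S * SAMPLE_RATE / SAMPLES_PER_FRAME  # fractional frame count
below, above = math.floor(exact), math.ceil(exact)

print(f"20 s would need {exact:.3f} frames")
for n in (below, above):
    dur = frames_to_seconds(n)
    print(f"{n} frames = {dur:.4f} s ({(dur - TARGET_S) * 1000:+.1f} ms vs video)")
```

With these numbers the nearest whole-frame durations sit a few milliseconds below and about 16 ms above 20 s, which is in the same range as the difference reported above.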

    If it will cause a glitch between joined segments, I personally fix it with atempo:

    ffmpeg -i "D:\56f434bb28eb8.mp4" -c:v copy -c:a aac -af atempo=1.000339 "D:\AV_56f434bb28eb8.mp4"
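
For anyone wondering where a factor like that comes from, here is a hedged Python sketch. It relies on the fact that atempo divides the audio duration by the tempo factor, so matching the video needs a factor of audio duration over video duration. The durations below are the ones from the 20 second test clip, not from the file in the command above, and `atempo_factor` is just an illustrative helper name.

```python
def atempo_factor(audio_s, video_s):
    """Tempo factor that makes the audio as long as the video.

    atempo changes duration to (input duration / factor), so
    factor < 1 lengthens the audio and factor > 1 shortens it.
    """
    return audio_s / video_s

# Durations from the 20 second test clip (audio 19 s 993 ms, video 20 s 0 ms).
factor = atempo_factor(19.993, 20.000)
print(f"-af atempo={factor:.6f}")
print(f"resulting audio length: {19.993 / factor:.3f} s")
```

A factor above 1, like the 1.000339 in the command, would shorten the audio slightly, which fits a file whose audio ran a little longer than its video.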
    Last edited by Budman1; 13th Jun 2021 at 22:52. Reason: with personal fix added
  9. Originally Posted by Budman1 View Post
    OK, a quick test for demonstration with a 1 min 0 s video: cut starting at the I-frame at frame 300 (10 s) and running for 20 seconds (up to an I-frame).
    D:\inside.mp4
    Successful Cut at Frame 900
    0 10.000000 1 I
    19.833333,P
    19.866667,P
    19.900000,P
    19.933333,P
    19.966667,P Last Frame of cut
    60 12.000000 1 I

    Video
    ID : 1
    Format : AVC
    Duration : 20 s 0 ms

    Audio
    ID : 2
    Format : AAC LC
    Duration : 19 s 993 ms
    Thanks for the analysis. Couple questions:
    If, instead of extracting a 20 second clip, you removed a 20 second portion in the middle and kept the other 40 seconds, all else being equal, would the last 20 seconds of the audio be shifted back 7 ms, or would there be 7 ms of no audio, or a clipping sound at the join?
    Also, say in Avidemux one chose to cut lossy (in between I-frames) on the video, necessitating an encode: what would happen to the audio if it was kept in copy mode?
  10. Member Budman1
    Join Date
    Jul 2012
    Location
    NORTHWEST ILLINOIS, USA
    Joining the same segment 2 and 3 times delivers 2 different videos that MediaInfo still reports as having a difference of 0.007 s, or 7 ms. However, if I look at the visible audio, there is a longer gap at the end with no visible audio, at least with the AudioGraph(4) filter. I don't know why the report still says 7 ms for 3 joined segments, each having 7 ms less audio.

    Other than that, the start PTS is not 0.0, but the timestamps stay contiguous, with no glitches at the joins, and it plays well. I cannot tell about the audio since the difference is so short.

    No. pts_time type
    0 0.077995 1 I
    1 0.111328 0 P
    2 0.144661 0 P
    3 0.177995 0 P
    4 0.211328 0 P

    597 19.977995 0 P
    598 20.011328 0 P
    599 20.044661 0 P
    600 20.077995 1 I -0.077995 Start PTS 0.0

    1198 40.011328 0 P
    1199 40.044661 0 P
    1200 40.077995 1 I -0.077995 Start PTS 0.0
    1201 40.111328 0 P
    1202 40.144661 0 P
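
The listing can be checked for contiguity with a few lines of Python. This sketch assumes a constant 30 fps (from the MediaInfo report) and the 0.077995 s start PTS shown for frame 0; the frame numbers and timestamps are copied from the table, and `expected_pts` is only an illustrative helper.

```python
START_PTS = 0.077995   # pts_time of frame 0 in the listing
FPS = 30               # constant frame rate reported by MediaInfo

def expected_pts(frame_number):
    """Timestamp a frame should have if the joined video is gap-free."""
    return START_PTS + frame_number / FPS

# (frame number, pts_time) pairs copied from the listing, including the joins.
listed = [
    (0, 0.077995), (1, 0.111328), (4, 0.211328),
    (597, 19.977995), (599, 20.044661), (600, 20.077995),
    (1198, 40.011328), (1200, 40.077995), (1202, 40.144661),
]

for n, pts in listed:
    error = abs(expected_pts(n) - pts)
    status = "ok" if error < 2e-6 else "GAP?"
    print(f"frame {n:>4}: listed {pts:.6f}, expected {expected_pts(n):.6f} {status}")
```

Every listed timestamp lands on the 1/30 s grid, including the I-frames at frames 600 and 1200 where the segments meet, so the video timestamps carry straight across the joins.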

    All in all it plays well, but since there is a gap of no audio at the end, I can only assume that when joining, the audio segments are placed contiguously. The final difference is far less than 1 frame of video, so I don't believe it would be noticeable.

    As I said earlier, the program I use to cut and paste checks for excessive audio loss and corrects it with atempo, making the audio the same length as the video.

    If I reencode the video and copy the audio of the video with less audio, the 7 ms remains. Reencoding both makes them equal, of course.
  11. Originally Posted by Budman1 View Post
    Joining the same segment 2 and 3 times delivers 2 different videos that MediaInfo still reports as having a difference of 0.007 s, or 7 ms. However, if I look at the visible audio, there is a longer gap at the end with no visible audio, at least with the AudioGraph(4) filter. I don't know why the report still says 7 ms for 3 joined segments, each having 7 ms less audio.

    Other than that, the start PTS is not 0.0, but the timestamps stay contiguous, with no glitches at the joins, and it plays well. I cannot tell about the audio since the difference is so short.

    All in all it plays well, but since there is a gap of no audio at the end, I can only assume that when joining, the audio segments are placed contiguously. The final difference is far less than 1 frame of video, so I don't believe it would be noticeable.
    I see, however my source videos are typically on the order of 1-3 hours as opposed to 20 seconds, so it's possible the minor discrepancy could add up to an audio drift by the end of the video trims. Is atempo considered a filter that requires encoding? Regardless, if I need to reencode, should I use the same encoder with the same or higher bitrate, or since I'm encoding again I can change encoders at will (while maintaining a high quality)?
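
A rough way to size that worry is sketched below in Python. It is a worst-case estimate under a loud assumption: every joined segment loses a fixed 7 ms of audio (the figure from the test above) and the audio is packed back to back while the video keeps its full length. Whether a given tool really behaves that way is not established in this thread, and `worst_case_drift_ms` is an illustrative name only.

```python
VIDEO_FRAME_MS = 1000 / 30      # one video frame at 30 fps
PER_SEGMENT_SHORTFALL_MS = 7    # assumed fixed audio shortfall per joined segment

def worst_case_drift_ms(segments, shortfall_ms=PER_SEGMENT_SHORTFALL_MS):
    """Audio/video offset at the end if every segment's shortfall accumulates."""
    return segments * shortfall_ms

for n in (1, 3, 5, 10, 50):
    drift = worst_case_drift_ms(n)
    print(f"{n:>2} segments: up to {drift} ms ({drift / VIDEO_FRAME_MS:.1f} video frames)")
```

Under that assumption the drift scales with the number of joins, not with how many hours the source runs, so a 3 hour file cut into a handful of pieces would behave much like the short test.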
  12. Member Budman1
    Join Date
    Jul 2012
    Location
    NORTHWEST ILLINOIS, USA
    https://superuser.com/questions/593869/does-it-make-sense-converting-a-file-to-a-highe...-audio-bitrate is a good page to base that on.

    atempo is a filter, so it requires reencoding the audio, and the bitrate should be watched, mainly if you are changing formats, since some are not as efficient at maintaining quality at lower bitrates. If you are keeping the same format, then what you have is as good as it's going to get.
  13. Originally Posted by Budman1 View Post
    https://superuser.com/questions/593869/does-it-make-sense-converting-a-file-to-a-highe...-audio-bitrate is a good page to base that on.

    atempo is a filter, so it requires reencoding the audio, and the bitrate should be watched, mainly if you are changing formats, since some are not as efficient at maintaining quality at lower bitrates. If you are keeping the same format, then what you have is as good as it's going to get.
    Thanks for the help and the link, I think when I searched I used a "past year" restriction on results to get the most recent data but audio codecs don't change that often and it's probably still applicable!


