I'm working on an application that reads subtitle files with media format timecodes (HH:MM:SS.mmm, for example) and I'm trying to calculate the correct frame at which a timecode is placed. This is basically converting the timecodes to total seconds and then to frames, based on the frame rate, for several purposes. I had done this by taking the total seconds of the timecode and multiplying it by the frame rate. So, for example, at 23.98 fps, 00:06:17.127, which is 377.127 seconds, would be 377.127 * 23.98 = 9043.50546. Rounding or flooring this value is something that would depend on what the program will do with it. However, this value of 9043.50546 frames caught my attention, because the 377.127 was obtained in Subtitle Edit by moving frame by frame, so it shouldn't be in the middle of a frame; it should be closer to an integer.
To further analyze this, I used ffmpeg's showinfo to show me info on all frames, and found that frame #9042 was rendered at 377.127:
n:9042 pts:33941407 pts_time:377.127
So, the timecode from Subtitle Edit is correct, but my calculation of seconds * frame rate is not.
Trying to clear the confusion, I wrote a program to read the .txt file from ffmpeg's showinfo and get the time for all the frames in the video, and did it for several videos, from different sources and different frame rates, to plot and see the patterns. What I plotted is the duration of frames according to ffmpeg's analysis, taking the time of each frame minus the time of the previous frame. These are the plots for what I found.
In red, you'll see the duration of each frame according to ffmpeg's analysis. In blue (constant), you'll see the duration of each frame according to 1 / frame rate, which is, for example, 0.041701418 seconds @23.98 fps. The title is just an identifier for the source and series of the video.
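For anyone who wants to reproduce the red curve, here is a minimal sketch of the kind of parsing described above. It assumes the showinfo lines contain the `n:`, `pts:` and `pts_time:` fields in that order, as in the samples quoted in this thread; the function names are made up for illustration, and this is not the actual program used for the plots. Durations are taken from the integer `pts` values, so no rounding from `pts_time` sneaks in.

```python
import re

# Matches the fields quoted in this thread, e.g. "n:9042 pts:33941407 pts_time:377.127".
SHOWINFO_RE = re.compile(r"\bn:\s*(\d+)\s+pts:\s*(\d+)\s+pts_time:\s*([0-9.]+)")

def parse_showinfo(lines):
    """Return a list of (frame_number, pts_ticks, pts_seconds) tuples."""
    frames = []
    for line in lines:
        match = SHOWINFO_RE.search(line)
        if match:
            frames.append((int(match.group(1)), int(match.group(2)), float(match.group(3))))
    return frames

def durations_in_ticks(frames):
    """Duration of each frame as the pts difference to the next frame."""
    return [nxt[1] - cur[1] for cur, nxt in zip(frames, frames[1:])]

# Sample lines copied from this thread.
sample = [
    "n: 0 pts: 0 pts_time:0",
    "n: 1 pts: 3753 pts_time:0.0417",
]
frames = parse_showinfo(sample)
print(durations_in_ticks(frames))
```

Dividing each tick count by the stream's time base (90000 for the samples here) gives the duration in seconds.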
- Videos @24 fps recorded with a camera from the same series, different durations:
[Attachment 60965 - Click to enlarge]
[Attachment 60966 - Click to enlarge]
[Attachment 60967 - Click to enlarge]
[Attachment 60968 - Click to enlarge]
As you can see here, there is a pattern in all videos. The duration of frames always oscillates, but with a different amplitude in different ranges. In the first 300 frames (more or less), the oscillations are very close to the value of 1/24 = 0.041666666666 seconds, but then start to oscillate further away from it. Then, in all videos except for the shortest one, the oscillations get even bigger at around 24000 frames. In the shortest one, this happens at around 2400 frames, but the values are a bit smaller. When I get the average of all these durations in red, the value is always the same as in the blue line, which I calculated with 1/24 = 0.041666666666. Maybe this also happens in the longest videos too, but it's not easy to see in the plot.
- Video @24 fps, screen recording:
[Attachment 60969 - Click to enlarge]
Here, even though the video is from a different source, I'm seeing the same pattern. Oscillations change at around 300 frames, and 2400 frames, which matches that of the shortest video above.
- Videos @23.98 fps from a camera, same series:
[Attachment 60970 - Click to enlarge]
*There is a line that's very, very clear here in the second oscillation segment. This is because the values there were very few.
[Attachment 60971 - Click to enlarge]
[Attachment 60972 - Click to enlarge]
A similar pattern, but not quite the same. Changes around 300 frames and 2400 frames, and also around 24000 frames for the longest video.
- Video @25fps, screen recording:
[Attachment 60973 - Click to enlarge]
Here the durations are identical to my calculations.
- Videos @25fps, animation:
[Attachment 60974 - Click to enlarge]
[Attachment 60975 - Click to enlarge]
Durations are identical, even though the sources are different. There is no problem with 25fps.
Does anyone know the reason for this and can help me understand these patterns and predict them depending on frame rate and maybe duration? The errors I'm getting in my application are not very frequent, but they exist. And, even though the average of these durations is always the same as what I calculate, there are cases in which it won't work correctly. Maybe I can determine a calculation that reflects these patterns, but I would like to have a better understanding. Could it be that it's not 300 frames, but ~240 and ~2400? Perhaps the durations start to oscillate further away at every power of 10?
I'm not including videos @29.97 fps, yet, but I think this is enough to explain the situation.
Thanks in advance. I hope some of you can shed some light on this.
I don't know what you're doing (and don't care much, either) but if you depend on a correct framerate, 23.98fps isn't it. It's usually written as 23.976, or even more accurately, 23.97602397602398...
You get that with 24 x (1000/1001).
The fractional frame rate is the culprit. It's the dumbest thing we inherited, in addition to standard units; even though it is not needed now, we still use it. That's how stubborn our ego is. Lucky Europeans, they don't have to deal with that crap. Let's not get into standard vs metric (oh boy).
Thank you for your reply, manono.
The situation is not only happening with 23.98 fps; if you see the plots, it also happens with 24fps. Additionally, even with the most precise framerate, doing 1/frame rate will always yield a constant value, but the actual duration of frames is not constant. This is what my question is about.
For the video that corresponds to the plot I identified as E3 @23.98fps, I get this in VLC
[Attachment 60976 - Click to enlarge]
For the one that corresponds to the plot I identified as A121 @24fps, I get this:
[Attachment 60977 - Click to enlarge]
I also get the same frame rates from Windows's file properties and from ffmpeg. Could they be showing 24 but it's wrong and actually it's 23.98? I could do another check using OpenCV, but that would take more time.
Keep in mind that the video doesn't have to start at t=0.
In case it is relevant, this is what I get from ffmpeg's showinfo for the videos I showed, first the one at 24 fps (pts step of 3750) and then the one at 23.98 fps (pts step of 3753):
n: 0 pts: 0 pts_time:0
n: 1 pts: 3750 pts_time:0.0416667
n: 0 pts: 0 pts_time:0
n: 1 pts: 3753 pts_time:0.0417
With this information, is it still possible that the videos don't start at t=0?
To add to the explanation in the original post and the plots, the first frame always lasts 1/frame rate, but then the durations of the following frames start to fluctuate with the pattern I showed.
They are correct:
Using 24 FPS * 1000 / 1001 * 377.127 sec = 9,042.005994005994 frames. More exact. Closer by over a frame. And this could be rounded, though I don't think it needs to be, as your Subtitle Edit might be giving you (slightly) inaccurate or rounded numbers.
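To make the comparison concrete, here is a small sketch using exact fractions instead of the rounded 23.98. The helper name `frame_from_seconds` is hypothetical, and rounding to the nearest frame is an assumption (the timecode is itself rounded to milliseconds); flooring may suit other uses better, as the original post says.

```python
from fractions import Fraction

# Exact "23.976" rate: 24 * 1000 / 1001.
NTSC_FILM = Fraction(24000, 1001)

def frame_from_seconds(seconds: str, fps: Fraction) -> int:
    """Nearest frame index for a decimal timecode given as a string, e.g. "377.127"."""
    return round(Fraction(seconds) * fps)

naive = 377.127 * 23.98                       # the calculation from the first post
exact = Fraction("377.127") * NTSC_FILM       # same timecode, exact frame rate

print(naive)          # lands about halfway between two frames
print(float(exact))   # lands just above a whole frame
print(frame_from_seconds("377.127", NTSC_FILM))
```

Passing the timecode as a string keeps the decimal value exact, so the only rounding left is the final one to a frame number.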
My guess is also that your "24" isn't really true 24 either. Another inaccuracy in the reporting app.
Last edited by Rain-Maker; 26th Sep 2021 at 22:19. Reason: Unnecessary ranting, but worked as those Lincoln's hot letters.
FWIW... I have used ffmpeg show info and ffmpeg show time and found them not to be precise. After developing a program to cut at frame-accurate times, I use a CMD script using ffprobe.
No.  pts_time  dts_time     type  Frame#
0    0.036000  0.036000  1  I     0
1    0.069367  0.069367  0  B     3
2    0.102733  0.102733  0  B     2
3    0.136100  0.136100  0  B     4
4    0.169467  0.169467  0  P     1
5    0.202833  0.202833  0  B     7
6    0.236200  0.236200  0  B     6
7    0.269567  0.269567  0  B     8
8    0.302933  0.302933  0  P     5
9    0.336300  0.336300  0  B     11
10   0.369667  0.369667  0  B     10
11   0.403033  0.403033  0  B     12
12   0.436400  0.436400  0  P     9
This shows PTS, DTS and best time. Occasionally I run into badly edited or badly joined videos or videos whose audio does not equal the video length.
Video 1 (Good Video)
Frame 0 PTS= 0.000000
1/29.97002997002997 = .0333666666666667
Video 2 (Bad Video)
Frame 0 PTS = 0.036000
1/29.97002997002997 = .0333666666666667
618.3510666666667 + 0.036000 = 618.3870666666667
FFprobe Time = 618.481167
I have found FFPROBE to be more accurate in catching errors and correcting them. As jagabo pointed out, frame 0 does not have to be at 0.000000 PTS/DTS. This can throw off calculations a lot.
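A rough sketch of how a non-zero first PTS could be accounted for when turning a timestamp into a frame number. The function name `frame_index` is invented for illustration, and it assumes constant frame rate and that the first frame's PTS is known (for example from ffprobe); it uses the 0.036000 start and the 29.97 rate from the listing above.

```python
from fractions import Fraction

# Exact "29.97" rate: 30 * 1000 / 1001.
NTSC_VIDEO = Fraction(30000, 1001)

def frame_index(pts_time: str, first_pts_time: str, fps: Fraction) -> int:
    """Frame number counted from the first frame, given both timestamps as strings."""
    elapsed = Fraction(pts_time) - Fraction(first_pts_time)
    return round(elapsed * fps)

# Row No. 12 of the listing above has pts_time 0.436400.
print(frame_index("0.436400", "0.036000", NTSC_VIDEO))  # start offset removed
print(frame_index("0.436400", "0", NTSC_VIDEO))         # offset ignored: off by one
```

With the offset subtracted the result agrees with the row number; ignoring it lands one frame late, which is the kind of error described above.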
Also, mp4 usually uses a time base of 1/90000 seconds as recommended in the spec. NTSC frame times can't be represented exactly by N/90000 where N is an integer. So the muxer alternates between two values. For example, 90000 / (24000 / 1001) = 3753.75. So you get a pattern of three frames at 3754/90000 and one frame at 3753/90000, which averages out to 3753.75/90000 over time.
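The alternating pattern can be sketched with integer arithmetic. This assumes the muxer truncates n * 3753.75 down to a whole tick, which is only a guess, although it does reproduce the two values quoted earlier in the thread (pts 3753 for frame 1 and pts 33941407 for frame 9042). The function names are hypothetical.

```python
TIME_BASE = 90000  # ticks per second, as described above

def pts_for_frame(n: int, fps_num: int, fps_den: int) -> int:
    """Tick value of frame n, assuming the exact time is truncated to a whole tick."""
    return (n * fps_den * TIME_BASE) // fps_num

def frame_for_pts(pts: int, fps_num: int, fps_den: int) -> int:
    """Inverse of pts_for_frame: smallest n whose exact time is at or after pts."""
    # Integer ceiling division, written as -(-a // b).
    return -((-pts * fps_num) // (fps_den * TIME_BASE))

ticks = [pts_for_frame(n, 24000, 1001) for n in range(5)]
steps = [b - a for a, b in zip(ticks, ticks[1:])]
print(ticks)   # cumulative pts of the first five frames
print(steps)   # one short step and three long ones per group of four

print(pts_for_frame(9042, 24000, 1001))       # pts reported by showinfo for 377.127
print(frame_for_pts(33941407, 24000, 1001))   # back to the frame number
```

Working from the integer pts rather than from seconds times frame rate avoids the half-frame ambiguity, because each pts value maps back to exactly one frame under this assumption.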
My guess is that some of your videos had portions which originally had a lower time base which was multiplied by a constant when muxed with 90000 time base, hence the larger oscillations.
Last edited by jagabo; 27th Sep 2021 at 06:44.