Thread: NTSC frame rate and the relation to time (and timecode)

1. Hi,

I'm a software developer working with video. Most of my experience is with PAL environments, where everything is integer-accurate and easy to comprehend. However, I'm now working on a project that takes me into the vague world of NTSC video, so I need some advice on how things work there.
My question is about how the frame rate relates to time. Here is an example:
• 35 s of video at a 25 fps timebase corresponds to exactly 875 frames (35 × 25)
• 35 s of video at a 29.97 fps timebase corresponds to a fractional 1048.95 frames (35 × 29.97)
As you can see, for the 29.97 fps timebase the number of frames is not an integer. How does one relate to this if someone specifies exactly 35 s of video? Obviously that exact number of frames does not exist, and my guess is that the frame count has to be rounded to an integer. In my example above, rounding the number of frames up to 1049 results in 35.0016 s, and rounding down to 1048 frames results in 34.9683 s.
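To make sure I'm not fooling myself with the rounded 29.97, here is roughly how I do the arithmetic with exact rationals (a minimal Python sketch; the function name is mine):

```python
import math
from fractions import Fraction

NTSC = Fraction(30000, 1001)  # the exact NTSC rate; 29.97 is only an approximation

def frames_for(seconds):
    """Exact (possibly fractional) frame count for a duration in seconds."""
    return Fraction(seconds) * NTSC

exact = frames_for(35)                       # ~1048.951 frames
down, up = math.floor(exact), math.ceil(exact)

print(down, up)                              # 1048 1049
print(float(down / NTSC), float(up / NTSC))  # ~34.9683 s and ~35.0016 s
```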
Is there any common practice for this? Should one always round in a specific direction (upwards)?
One frame more or less could seem like a small problem, but in some environments (broadcast in this case), it's important to use the exact number of frames.

Also, does anyone know how video duration is commonly specified, preferably in a production environment? Should I use the fractional 29.97 for calculations, or do people calculate using a 30 fps timebase? Do people use timecode to specify the length of a video? That seems a bit unnecessary..

Does anyone have any experience from similar problems?

Lots of questions, I hope someone can answer them.
Thank you!
2. The solution to this issue is drop-frame timecode.

http://en.wikipedia.org/wiki/SMPTE_timecode

The basic details are there. It keeps timecode in sync with real time.

In practice, if you type 01;01;00;00 into a proper broadcast editing system (such as Avid) it will take you to 01;00;59;28. The semicolons are important: if the timecode were written with colons, the system would assume a 30 fps counting base and it would not be accurate to real time.
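The counting rule behind this can be sketched in a few lines; it also shows why a label like 01;01;00;00 doesn't exist in drop-frame at all (a rough Python sketch of the SMPTE 12M rule; the function name is illustrative):

```python
def df_to_frame(h, m, s, f):
    """Zero-based frame index for a 29.97 drop-frame timecode label.

    Frame numbers 00 and 01 are skipped at the start of every minute
    except minutes divisible by 10, so a label such as 01;01;00;00
    does not exist.
    """
    if m % 10 != 0 and s == 0 and f < 2:
        raise ValueError("dropped frame label: %02d;%02d;%02d;%02d" % (h, m, s, f))
    total_minutes = 60 * h + m
    dropped = 2 * (total_minutes - total_minutes // 10)
    return 30 * 3600 * h + 30 * 60 * m + 30 * s + f - dropped

print(df_to_frame(0, 0, 0, 0))   # 0
print(df_to_frame(1, 0, 0, 0))   # 107892 -- one drop-frame timecode hour
```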

Program duration in a professional production environment is specified in hours, minutes, and often seconds. In that environment shows do not end at fractions of seconds, even if that means there is a fraction of a second of black at the beginning or (more likely) end.

I have read about SMPTE timecode, drop-frame and non-drop-frame. But it mostly makes me more confused. From what I've read, timecode is not a way of measuring time, but merely a way of addressing a frame by a unique identifier. This FCP7 manual explains it quite well.

Could you explain why entering 01;01;00;00 into Avid would take me to 01;00;59;28? If both the entered timecode and the returned timecode are drop frame, shouldn't the returned timecode be the one typed in?

Ok, so if program duration is specified in hours, minutes and seconds, how come a show cannot end at a fraction of a second? A show has to be an integer number of frames (the frame is the smallest unit in video), and that would make a 35 s show actually end at 35.0016 or 34.9683 s, depending on the rounding method.

The 59:58 issue has to do with making machine-readable edits that don't choke on "non-existent" frames. It probably has to do with the legacy of timecode giving over to control track at the edit point in the early days of frame-accurate editing. You'll have to explore the SMPTE archives to get an exact answer to that one.

I was careful NOT to say a show CANNOT end at a fraction of a second, but that it DOES not. Broadcast shows are built to end at exact seconds by convention, to circumvent precisely the issues you bring up.

Also, one would never round down, because that would cut off part of a show.

edit: The unit of measure is frames for both the individual program as well as the overall broadcast stream, so counting in ms isn't particularly meaningful. In practice, there are usually at least a few frames of black for a fade-in or fade out at either end of a show so there is some wiggle room.

Hope that helps a little.
5. Originally Posted by xcile
How does one relate to this if someone specifies exactly 35 s of video?
What would you do if someone wanted exactly 35.1 seconds of PAL video?
6. Originally Posted by jagabo
Originally Posted by xcile
How does one relate to this if someone specifies exactly 35 s of video?
What would you do if someone wanted exactly 35.1 seconds of PAL video?
Exactly right -- it's a non-issue because it can't happen.

But here's a more realistic scenario. Let's say I'm supposed to deliver a :30 commercial with no fades.

In NTSC land that could mean 900 frames, 899 frames or 898 frames.

I would build a 900 frame spot and make damn sure there was no vital information that existed in only those last two frames.
7. I was pointing out that the problem you refer to isn't limited to NTSC video. Obviously you have to round up or down. I don't know if there is a common practice.

By the way, the NTSC frame rate is 30000/1001, not exactly 29.97.
8. Originally Posted by smrpix
Well, I'm looking for common practices, so that's fine

Originally Posted by smrpix
Broadcast shows are built to end at exact seconds by convention, to circumvent precisely the issues you bring up.
But I guess this points out my problem: a show cannot end at exact seconds, measured in real time. Even with drop-frame timecode it's still not exact time. Think of a sequence starting at frame 00;00;00;00. When we reach frame 30, the timecode wraps from 00;00;00;29 to 00;00;01;00. Already at this point the timecode is out of sync with real time: about 1.001 real seconds (30 / 29.97) have elapsed, but the timecode shows that exactly 1 second has passed (hence the dropping of timecode frames later on..) The site I linked earlier explains this very well.
So, if a show is 1050 frames, converted to real time it becomes 35.035 seconds (1050 × 1001 / 30000). But the timecode will show exactly 00;00;35;00 (if my calculations are correct.)
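For my own sanity, here's how I sketched the inverse conversion (frame index to drop-frame label); the constant 17982 is the frame count of one 10-minute drop-frame block (Python, function name mine):

```python
def frame_to_df(frame):
    """29.97 drop-frame timecode label (h, m, s, f) for a zero-based frame index."""
    ten_min = 10 * 60 * 30 - 18   # 17982 frames per 10-minute DF block (9 minutes drop 2)
    one_min = 60 * 30 - 2         # 1798 frames per minute that drops 2 numbers
    d, r = divmod(frame, ten_min)
    # add back the dropped numbers so plain base-30 counting gives the label
    frame += 18 * d + (2 * ((r - 2) // one_min) if r >= 2 else 0)
    return (frame // 108000 % 24,  # 108000 = 30 * 3600 labels per hour
            frame // 1800 % 60,    # 1800 = 30 * 60 labels per minute
            frame // 30 % 60,
            frame % 30)

print(frame_to_df(1050))     # (0, 0, 35, 0) -> 00;00;35;00
print(1050 * 1001 / 30000)   # 35.035 real seconds
```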

Originally Posted by smrpix
edit: The unit of measure is frames for both the individual program as well as the overall broadcast stream, so counting in ms isn't particularly meaningful.
Ok, so you mean that one always specifies program length in timecode, not real time?

9. Originally Posted by jagabo
Originally Posted by xcile
How does one relate to this if someone specifies exactly 35 s of video?
What would you do if someone wanted exactly 35.1 seconds of PAL video?
Well, the problem with NTSC is that the timecode is virtually never correct compared to real time. When it comes to PAL, the frame counter runs from 00 to 24 and then wraps, which corresponds to exactly 1 second. That keeps the timecode always in sync with real time.

edit: But, maybe this isn't a problem in NTSC? That's what my question is all about, I think..
10. Originally Posted by xcile
Originally Posted by jagabo
Originally Posted by xcile
How does one relate to this if someone specifies exactly 35 s of video?
What would you do if someone wanted exactly 35.1 seconds of PAL video?
Well, the problem with NTSC is that the timecode is virtually never correct compared to real time.
No, it's correct every 1/29.97th of a second, whereas PAL is only correct every 1/25th of a second. Your division of time into whole seconds is arbitrary. When an engineer asks you for 30 seconds of video he means not 29 seconds and not 31 seconds. He understands that you can't have exactly 30 seconds of NTSC video. As you've stated, it's impossible. I doubt any of them care whether you give them 899 frames or 900 frames.
11. An NTSC hour is a "real" hour, just like PAL. Timecode is based on a 24 hour clock.
If you are looking at the number of frames, then in either system, PAL or NTSC, you cannot have fractions of a frame. In PAL, frame boundaries occur every 1/25th of a second (0.04 s). In NTSC, they occur every ~1/29.97th of a second, or more precisely every 1001/30000 s ≈ 0.0333667 s. If it's easier for your brain, use 1/30th of a second (≈ 0.0333333 s) and then understand that since NTSC frames are slightly slower, each one lasts slightly longer.

But as smrpix said: PAL has exactly 90,000 frames in one hour. In NTSC, drop-frame timecode counts exactly 107,892 frames per timecode hour: after those 107,892 frames the counter shows 01;00;00;00. Using non-drop, those same 107,892 frames show 3.6 seconds less; it would take 108,000 frames to show 01:00:00:00, which is one hour plus 3.6 seconds in real time.
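If you want to check those figures, exact rational arithmetic makes it easy (a quick Python sanity check; note that even drop-frame carries a tiny residual error, about 3.6 ms per hour):

```python
from fractions import Fraction

NTSC = Fraction(30000, 1001)

pal_hour = 25 * 3600                 # 90000 frames, exactly one real hour
df_hour  = 30 * 3600 - 2 * 54       # 107892: 2 numbers dropped in 54 of 60 minutes
ndf_err  = 30 * 3600 / NTSC - 3600  # real-time drift of one NDF "hour"

print(df_hour)                      # 107892
print(float(df_hour / NTSC))        # ~3599.9964 s: DF is about 3.6 ms short per hour
print(float(ndf_err))               # 3.6 s long per NDF hour
```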

Confusing, yes, but you get used to it. Now add 24FPS, and Fields into the mix!...

Scott
13. xcile, if you're willing to share, what's the nature of the project you're doing that raised these questions?
14. Originally Posted by jagabo
I doubt any of them care whether you give them 899 frames or 900 frames.
I don't know about NTSC markets (that's why I'm asking, duh! ) but in PAL markets broadcasters need frame-accurate files. Not one frame more and not one frame less.

Originally Posted by smrpix
xcile, if you're willing to share, what's the nature of the project you're doing that raised these questions?
The project is about commercials. By the end of the day, it's about specifying the right amount of time/frames when cutting files with FFmpeg.
15. Originally Posted by xcile
The project is about commercials. By the end of the day, it's about specifying the right amount of time/frames when cutting files with FFmpeg.

Unless your files are intra-only, cutting with FFmpeg is hardly accurate with long-GOP files...
16. They are I-frame only and cutting with FFmpeg is not an issue.
17. So can I conclude that in an NTSC production environment (read: FCP), when producing a 30-second ad, one sets up the timeline to match the specification (NDF or DF), edits, and expects the resulting number of frames to match the desired count?
18. In a broadcast environment df is the norm.

Also, files don't just start and end with the program. They have bars and tone, identification slates and countdowns at the beginning and extra black at the end. Once the spots are on the broadcaster's server, only the program portion of the file is displayed. Usually there is embedded metadata as well (closed-captioning and timecode, for example.)

A lot of material is still delivered on tape, HDCAM being common, but not universal. File based protocols, formats and requirements still vary widely -- XDCam, P2, and DNxHD again being common but not exclusive.

edit: originally said ndf, I meant, drop-frame is normal!
19. Yes, I know about slates and black frames etc. But that's not part of the problem. The question is: how does one measure time in an NTSC production environment? This question is also independent of whether the delivery is file based or tape based.

Originally Posted by smrpix
In a broadcast environment ndf is the norm.
So I guess we're back to square one then..
If a delivery specification says 2 min 30 s (= 150 seconds) and I edit relying on NDF timecode (since it is the norm), that would mean I'm actually delivering 4500 frames, which equals 150.15 real seconds. How does the broadcaster deal with those 0.15 extra seconds? In 1 hour it becomes approximately 3.6 seconds; in 1 day it adds up to 1 min 26.4 s. To me it seems they would have a constantly growing delay in the schedule throughout the day (unless they have a heavily modified DeLorean DMC-12 and are called McFly ..)
20. Originally Posted by smrpix
edit: originally said ndf, I meant, drop-frame is normal!
Ok, that makes a bit more sense. But as I have already stated, not even DF is frame accurate (also read this Adobe Premiere document).
However, maybe it is frame accurate enough to be accepted by a broadcaster..
21. It is absolutely frame accurate.

That Adobe document is amateur stuff. If you want the real deal,

Start here: http://standards.smpte.org/content/978-1-61482-268-4/st-12-1-2008/SEC1.abstract?sid=b7...b-3e263deea48c
22. When I say it isn't frame accurate, I mean that the time implied by the timecode does not exactly correspond to the real time at which the frame is displayed on the display device.
23. That's why broadcasters use genlocks, frame syncs etc.
24. Remember, ffmpeg is time based, rather than frame based, but it still does not render partial frames. Also as far as I know (and I would love to be wrong on this) ffmpeg does not support true smpte timecode.
25. Originally Posted by jagabo
That's why broadcasters use genlocks, frame syncs etc.
Thanks, but that's not a very detailed answer. Could you explain that?
26. Originally Posted by smrpix
Remember, ffmpeg is time based, rather than frame based, but it still does not render partial frames. Also as far as I know (and I would love to be wrong on this) ffmpeg does not support true smpte timecode.
FFmpeg has the -vframes option to specify how many frames to process, so that shouldn't be a problem. The timecode-to-frame-count calculation is implemented by me, hence this discussion..
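For what it's worth, here is roughly how I tie the rounding rule to the FFmpeg call (Python sketch; the file names are placeholders, and it assumes an intra-only source so stream copy can cut on any frame; -frames:v is the current spelling of -vframes):

```python
import math
from fractions import Fraction

NTSC = Fraction(30000, 1001)

def cut_command(src, dst, seconds, rate=NTSC, round_up=True):
    """Build an ffmpeg command line that keeps an exact number of frames."""
    exact = Fraction(seconds) * rate            # exact frame count, maybe fractional
    n = math.ceil(exact) if round_up else math.floor(exact)
    return ["ffmpeg", "-i", src, "-c", "copy", "-frames:v", str(n), dst]

print(cut_command("in.mov", "out.mov", 30))
# ['ffmpeg', '-i', 'in.mov', '-c', 'copy', '-frames:v', '900', 'out.mov']
```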
27. Originally Posted by smrpix
That Adobe document is amateur stuff. If you want the real deal,
Start here: http://standards.smpte.org/content/978-1-61482-268-4/st-12-1-2008/SEC1.abstract?sid=b7...b-3e263deea48c
I wouldn't say Adobe are amateurs. However, citing from the SMPTE standard you mention:

To minimize the NTSC time error, the first two frame numbers (00 and 01) shall be omitted from the count at the start of each minute except minutes 00, 10, 20, 30, 40, and 50.
That's what I mean by not accurate: it's always about minimizing the error, never being exactly accurate (except at every 10th minute.)
28. So just write your code to SMPTE specs and you're good to go.
29. ...
30. xcile, are you familiar with, or even associated with the ffmbc project? There is knowledge there on HOW to do this that goes beyond mine.