– relies on multicast transmissions, in which the sender connects with multiple listeners. Typically, IP multicast addresses in the administratively scoped range of 239.0.0.0 to 239.255.255.255, along with port numbers, are used to carry video data to multiple receivers. The rules of the subnet still apply — receivers and senders need to be in the same IP subnet in order to multicast to those within that subnet.
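As an illustration, subscribing to one of these multicast groups takes only a few lines of socket code. The sketch below is a minimal example using the standard IP_ADD_MEMBERSHIP socket option; the group address and port are arbitrary illustrations, not values from any standard.

```python
import socket
import struct

def multicast_membership_request(group: str, iface: str = "0.0.0.0") -> bytes:
    """Pack the ip_mreq structure: the multicast group address plus the local
    interface (0.0.0.0 lets the OS pick an interface on the right subnet)."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface))

def make_multicast_receiver(group: str, port: int) -> socket.socket:
    """Open a UDP socket and join an IP multicast group to receive a stream."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))  # listen on the port the sender is addressing
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    multicast_membership_request(group))
    return sock

# Hypothetical usage: sock = make_multicast_receiver("239.1.1.1", 5004)
```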
By partitioning a switch into what are effectively smaller, independent switches, Virtual Local Area Networks (VLANs) are useful for keeping traffic isolated and confined.
In a broadcast environment, it
might make sense to keep all
of the video in one VLAN as
a way to group video content
separate from audio and data.
It might also be important to
keep certain video confined to
studios; in this manner, an entire
studio might be kept in a sep-
arate VLAN even though the
switch might also serve another
studio or production area.
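On the wire, VLAN membership shows up as the four-byte IEEE 802.1Q tag a switch inserts into each Ethernet frame. As a rough sketch of what that tag contains:

```python
import struct

def dot1q_tag(vlan_id: int, priority: int = 0) -> bytes:
    """Build the 4-byte IEEE 802.1Q tag inserted into an Ethernet frame.

    TPID 0x8100 identifies the tag; the TCI field packs a 3-bit priority
    (PCP), a drop-eligible bit (DEI, left 0 here), and the 12-bit VLAN ID.
    """
    if not 0 < vlan_id < 4095:
        raise ValueError("usable VLAN IDs are 1-4094")
    tci = (priority << 13) | vlan_id
    return struct.pack("!HH", 0x8100, tci)

# Hypothetical usage: a "video" VLAN 100 kept separate from audio and data
# tag = dot1q_tag(100)
```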
DEMYSTIFYING
ETHERNET PACKETS FOR
UNCOMPRESSED VIDEO
For decades, the most common packet used to transport video in Ethernet environments has been the datagram of the User Datagram Protocol (UDP). UDP datagrams are very basic, lacking capabilities such as error correction, sequencing, duplicate elimination, flow control, or congestion control. But this simplicity is also why UDP is so commonly used; it doesn’t require direct connection management, making it versatile for data, including video. In Ethernet, the maximum amount of data in a frame is 1,500 bytes. With 28 bytes reserved for the IP and UDP headers, this leaves 1,472 bytes for video data, just enough to avoid fragmentation.
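The arithmetic above is worth making concrete. A minimal sketch (the chunking helper is illustrative, not from any standard) of splitting a video frame into payloads that stay under the fragmentation limit:

```python
ETHERNET_MTU = 1500  # maximum IP packet size on standard Ethernet
IPV4_HEADER = 20     # IPv4 header without options
UDP_HEADER = 8       # fixed-size UDP header

# Largest UDP payload that avoids IP fragmentation: 1500 - 20 - 8 = 1472
MAX_UNFRAGMENTED = ETHERNET_MTU - IPV4_HEADER - UDP_HEADER

def chunk_frame(frame: bytes, size: int = MAX_UNFRAGMENTED) -> list[bytes]:
    """Split a video frame into UDP-sized payloads, each small enough
    to ride in a single Ethernet frame without being fragmented."""
    return [frame[i:i + size] for i in range(0, len(frame), size)]
```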
For video, the catch with UDP datagrams is that there’s no way to number them. Since video is structured frame by frame, according to a certain number of frames per second, it’s important to make sure these frames are in the correct order. That’s where the Real-time Transport Protocol (RTP) comes in – carried inside a UDP datagram, an RTP packet not only solves the vital task of sequencing, but its header is small enough that a full payload (seven 188-byte MPEG transport stream packets, in the compressed world) still fits within a single datagram. RTP packets are also time-stamped, meaning that timing information travels as an entirely separate piece of header data, rather than as an actual marker on the video. Ethernet’s use of packets is in direct contrast to the old-school method of using a sync pulse.
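A minimal sketch of the RTP header defined in RFC 3550 shows where the sequence number and timestamp live (payload type 96 is simply a common dynamic value, used here as an example):

```python
import struct

RTP_VERSION = 2  # current RTP version per RFC 3550

def rtp_header(seq: int, timestamp: int, ssrc: int, payload_type: int = 96) -> bytes:
    """Build a minimal 12-byte RTP header.

    The 16-bit sequence number orders packets on receive; the 32-bit
    timestamp places the payload on the media clock, independent of
    arrival order.
    """
    byte0 = RTP_VERSION << 6     # version 2; no padding, extension, or CSRCs
    byte1 = payload_type & 0x7F  # marker bit clear
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

def parse_seq_ts(header: bytes) -> tuple[int, int]:
    """Recover the sequence number and timestamp for reordering on receive."""
    _, _, seq, ts, _ = struct.unpack("!BBHII", header[:12])
    return seq, ts
```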
The method used to stamp the
RTP packets is Precision Time
Protocol (PTP), a cornerstone
of the new SMPTE ST-2110 stan-
dard. As a pure Ethernet timing
standard, PTP was described
in the Video Services Forum’s
Technical Recommendations
3 and 4 (TR-03 and TR-04)
as a method of synchroniz-
ing audio, video, and data sig-
nals together. Among all of the
components of SMPTE ST-2110,
Section ST 2110-10, “System Timing and Definitions,” is the most relevant here because it describes how PTP packets are to be used and how the video, audio, and data streams will be time-stamped and carried in the network.
86 • Broadcast Beat Magazine • www.broadcastbeat.com
The PTP signal is a separate
stream of packets that con-
tains nothing more than precise
timing information. The trans-
mitting device is responsible
for reading the PTP packets
on the network and stamping
the corresponding RTP packets
as they’re transmitted to the
Ethernet network. When each video, audio, and data stream carries a timestamp, other signals can be compared against the time marks in each separate stream. Not
only can the timestamps be
used for timing the composite
streams for switching or other
timed events, but they can also
be used within the streams to
maintain the relationship of
video to audio (for lip-sync)
and video to data (for closed
captioning). This capability
alone sets the PTP signal apart
from the industry’s pulse-based
audio and video signal heri-
tage – at last, there’s a means
to address lip-synchronization
issues once and for all.
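To make the lip-sync point concrete: because every stream’s RTP timestamps are derived from the same PTP clock, a receiver can compare them directly once each is converted to seconds. This sketch assumes the ST 2110 RTP clock rates of 90 kHz for video and 48 kHz for audio; the function names are hypothetical.

```python
VIDEO_CLOCK = 90_000  # video RTP clock rate, Hz (ST 2110-20)
AUDIO_CLOCK = 48_000  # audio RTP clock rate, Hz (ST 2110-30)

def to_seconds(rtp_timestamp: int, clock_rate: int) -> float:
    """Convert an RTP timestamp to seconds on the common PTP-derived timeline."""
    return rtp_timestamp / clock_rate

def lip_sync_offset(video_ts: int, audio_ts: int) -> float:
    """Signed offset in seconds between a video and an audio packet;
    positive means the video sample sits later on the shared clock."""
    return to_seconds(video_ts, VIDEO_CLOCK) - to_seconds(audio_ts, AUDIO_CLOCK)
```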
PUTTING IT ALL TOGETHER
– THE OSI STACK
The Open Systems Interconnection (OSI) model is the foundation for all Ethernet technologies; therefore, it’s important to explain how OSI is used in the application of Studio Video over IP. The first layer is the Physical (PHY) layer, comprised of the raw transmission of electrical pulses over copper or, converted to light, over fiber. The second layer is the Data Link layer, home of Ethernet and its formal addressing scheme, followed by the IP layer that defines all the rules that apply