In many television systems the video will pass through delaying devices
such as frame synchronizers, color correctors and noise reducers. Currently,
both 4-field and 8-field frame synchronizers are in use. These video delays
are continuously variable, and are combined in the system with delays
which are switched in and out as the operator selects different modes
of operation.
As an example, in many systems the operator switches a color corrector
or noise reducer in and out of a remote feed as needed. In addition, the
feed is also passed through a frame synchronizer. This gives video delays
which can instantly change from 0 to more than 8 fields.
The most obvious problem these video delays cause is a visible "lip
sync" error. Even when the errors are too small to be consciously
noticed, these timing problems can still affect viewers.
It is possible to delay the corresponding audio signal to match the video
delay, but because the video delay is constantly changing, the audio delay
needs a steering signal that tracks it. Newer video devices output a DDO
(digital delay output) signal for this purpose.
Measurement of video delays is a fairly complex problem. By measuring
the relative delay of input vertical sync and output vertical sync, it
is only possible to determine delays of up to one frame period. Many video
devices can have much larger delays which range from zero to several frames.
Measuring only the delay of vertical sync therefore cannot reveal how many
whole frames of delay are present. For example, delays of ¼, 1¼ and 2¼
frames all look identical, as do any other delays that differ by a whole
number of frames.
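The ambiguity can be illustrated with a short sketch. It assumes, as a simplification, that comparing input and output vertical sync yields the true delay modulo one frame period; the function name `sync_phase` is made up for this illustration and is not part of any product.

```python
from fractions import Fraction


def sync_phase(true_delay_frames):
    """Delay as seen from vertical sync alone (assumed model).

    Only the fractional part of a frame survives; the whole-frame
    count is lost.
    """
    return true_delay_frames % 1


# Three different true delays, expressed in frames.
for delay in (Fraction(1, 4), Fraction(5, 4), Fraction(9, 4)):
    print(delay, "->", sync_phase(delay))
# prints:
# 1/4 -> 1/4
# 5/4 -> 1/4
# 9/4 -> 1/4
```

All three delays produce the same sync phase, so a second, independent measurement is needed to recover the whole-frame count.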
Measuring and tracking lip sync errors with LipTracker
LipTracker™ is a non-invasive measurement
tool for in-service lip sync analysis. After detecting a face in the video,
LipTracker™ compares selected sounds
in the audio with the mouth shapes that create them in the video. The
relative timing of these sounds and corresponding mouth shapes (called
Mutual Events or MuEvs) is analyzed to produce a measurement of the lip
sync error. Audio offsets of up to ±5 video frames can be measured.
This unique approach of analyzing real-time video and audio content does
not require the insertion of cues, codes or watermarks into the program
material. Therefore, LipTracker™ can
be used at any point in the transmission path.
The numeric and graphic displays of the current audio offset are updated
periodically until the current face is lost or a new face is detected.
The history graph charts the most recent error profile, and event logging
saves the results for scene-by-scene analysis. The Audio Offset Status
indicator is a visual warning of the current offset. User-programmable
thresholds determine whether the indicator is Green, Yellow or Red at
any given offset reading.
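The threshold logic of such an indicator might look like the sketch below. The function name, the use of video frames as the unit, and the default threshold values are all assumptions chosen for illustration; the actual thresholds are set by the user.

```python
def offset_status(offset_frames, yellow_at=0.5, red_at=1.0):
    """Map an audio offset to an indicator color.

    offset_frames: measured audio offset in video frames (sign = direction).
    yellow_at, red_at: hypothetical user thresholds on the offset magnitude.
    """
    magnitude = abs(offset_frames)
    if magnitude >= red_at:
        return "Red"
    if magnitude >= yellow_at:
        return "Yellow"
    return "Green"


print(offset_status(0.2))   # Green
print(offset_status(-0.7))  # Yellow
print(offset_status(2.0))   # Red
```

Taking the magnitude treats early and late audio alike; a real configuration could just as well use separate thresholds for each direction.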
The sounds and mouth shapes that are used for MuEv analysis are commonly
found in the natural speech patterns of many languages. When a face is
detected, the input video is processed by locating the upper and lower
lips within the face and extracting the mouth shape characteristics to
generate a field-by-field stream of video MuEvs.
At the same time, the input audio is transformed to frequency space and
normalized to minimize differences due to vocal pitch. A stream of audio
MuEvs is then identified for comparison with the video MuEvs.
The audio and video MuEv streams are then correlated at each of the possible
offset values to determine the offset that produces maximum correlation.
This maximum correlation result is used to generate the measurement of
the lip sync error that is displayed on the screen. The silence segments
that occur in the audio input are also identified to provide additional
cues for measuring the lip sync error.
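The search for the offset of maximum correlation can be sketched as follows. This is a minimal illustration, not the LipTracker algorithm: it assumes each MuEv stream has been reduced to one number per field (1 where an event occurs, 0 elsewhere), and the function name `best_offset` is hypothetical.

```python
def best_offset(video_events, audio_events, max_offset):
    """Return the audio offset (in fields) that maximizes correlation.

    A positive result means the audio events occur later than the
    matching video events. Streams are assumed to be per-field lists
    of numbers.
    """
    best, best_score = 0, float("-inf")
    for offset in range(-max_offset, max_offset + 1):
        score = 0
        for i, v in enumerate(video_events):
            j = i + offset
            if 0 <= j < len(audio_events):
                score += v * audio_events[j]
        if score > best_score:
            best, best_score = offset, score
    return best


video = [0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0]
audio = [0, 0] + video[:-2]  # same events, two fields late
print(best_offset(video, audio, max_offset=5))  # 2
```

Every candidate offset is scored, so the cost grows with the search range; that is acceptable here because the range is small.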
LipTracker™ searches frame by frame
for a face in the input video. After finding a face, LipTracker™
maintains lock during typical camera pans, tilts, zooms, and through the
normal range of head motion. If LipTracker™
makes the wrong selection in a scene with multiple faces, all it takes is
a double click of the mouse to select the face you want to analyze.
When event logging is enabled, the audio offset measurements are written
to an HTML file. For each segment analyzed, a thumbnail of the first frame
is stored along with the segment start time, the time of each measurement
point and the audio offset at each point. The times recorded in the log
can be the real time clock or VITC from the video input.
Pixel Instruments video delay detectors
The Pixel Instruments DD2100 (analog) and DD3100
(digital) delay detectors automatically measure video delays from zero
to several frames. These delay detectors take a unique approach of comparing
frames of active video which are input to the delaying device with frames
of video which come out of the device. The input and output video frames
are correlated by a high-speed DSP circuit to measure the relative delay.
A frame of video is grabbed from the input and compared against
the output frames until a match is found. As long as the picture is changing,
so that each frame is somewhat different from the others, it is possible
to track how long a given input frame takes to travel through the system
to the output. This comparison gives a coarse delay measurement by counting
the number of output frames that pass until the match is found.
In addition, a fine delay is calculated based on the relative phase of
input and output vertical sync. The coarse and fine delays are added to
give the total delay.
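The coarse-plus-fine calculation can be sketched as below. The sketch makes two simplifying assumptions: frames are compared for exact equality (the detector itself correlates frames in a DSP), and the fine delay is expressed as a fraction of a frame. The function names are hypothetical.

```python
def coarse_delay(grabbed_frame, output_frames):
    """Count the output frames that pass before the grabbed frame appears.

    Returns None if no match is found, for example when the video is
    static or missing.
    """
    for count, frame in enumerate(output_frames):
        if frame == grabbed_frame:
            return count
    return None


def total_delay(coarse_frames, fine_frames):
    """Total delay in frames: whole-frame count plus sync-phase fraction."""
    return coarse_frames + fine_frames


# Frames are stood in for by labels; "C" was grabbed at the input.
coarse = coarse_delay("C", ["A", "B", "C", "D"])
print(coarse)                     # 2
print(total_delay(coarse, 0.25))  # 2.25
```

The coarse count resolves the whole-frame ambiguity that sync phase alone cannot, while the sync phase supplies the resolution within a frame.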
The delay detector is very easy to add to an existing system. The video
input to the delaying device and the video output from the delaying device
are both looped through the delay detector inputs. No modification to
the video signal or the video delaying device is required. Delays up to
8.99 fields can be detected with 525- or 625-line signals, and the detector
provides a DDO signal. The DDO is connected by coax or telephone line
to a companion audio delay, such as the AD3100 audio synchronizer, which
applies the appropriate compensating audio delay.
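To give a sense of scale, a delay in fields can be converted to the equivalent audio delay in milliseconds. The sketch below uses the standard field rates of 525-line (60000/1001 fields per second) and 625-line (50 fields per second) systems; the function name and the dictionary keys are illustrative only and say nothing about how the DDO encodes the delay.

```python
# Nominal field rates in fields per second.
FIELD_RATE_HZ = {
    "525": 60000 / 1001,  # about 59.94 fields per second
    "625": 50.0,
}


def fields_to_ms(fields, standard):
    """Convert a video delay in fields to milliseconds."""
    return fields / FIELD_RATE_HZ[standard] * 1000.0


print(round(fields_to_ms(8.99, "625"), 1))  # 179.8
print(round(fields_to_ms(1, "625"), 1))     # 20.0
```

At the 8.99-field maximum, the compensating audio delay is therefore about 180 ms for a 625-line signal and roughly 150 ms for a 525-line signal.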
A front panel LCD display provides delay and update information as well
as alarm messages in the event of problems such as loss of video. Operation
is automatic and does not require any operator intervention once the unit
is installed.
The delay detector continuously calculates and updates the delay measurement.
Measurements are made based on relative vertical sync phase for fine measurement
and frame correlation for coarse measurements. A readout of elapsed time
since the last coarse measurement is provided, along with the current
total delay which is being sent to the audio delay via the DDO.