Notes from Lecture on Audio/Video
Digital Audio Concepts
- Sampling rate
- To convert an analog sound wave into digital form, samples of the
wave are taken at regular intervals. The more samples per second, the
greater the fidelity of the sound and the bigger the file.
Sampling rates are measured in kilohertz (kHz). Standard rates
are 8 kHz, 11.025 kHz, 22.05 kHz, 44.1 kHz, and 48 kHz.
- Bit depth
- Bit depth is the resolution with which the amplitude of each sample
is stored. The more bits, the better the quality of the audio and the
bigger the file. Common bit depths are 8-bit, 16-bit, 20-bit, 24-bit,
32-bit and 48-bit.
- Channels
- Audio files can support from 1 to 6 channels. The familiar
configurations are 1 channel (mono) and 2 channels (stereo). Most
existing file formats support only mono or stereo.
- Bit rate
- Bit rate is the number of bits per second devoted to storing audio
data. It is measured in kilobits per second (kbps):
bit rate = file size in kilobits / length of audio clip in seconds.
For streaming, keep the bit rate lower than the speed of the user's
connection to the Internet.
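The four quantities above are tied together by simple arithmetic. The sketch below is illustrative only: the function names and the 56 kbps connection speed are assumptions, not from the lecture, and it takes 1 kilobit to be 1000 bits and the audio to be uncompressed.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def file_size_kilobits(bit_rate_kbps, seconds):
    """Rearranged from: bit rate = file size in kilobits / length in seconds."""
    return bit_rate_kbps * seconds


def fits_connection(bit_rate_kbps, connection_kbps):
    """True if the bit rate is lower than the user's connection speed."""
    return bit_rate_kbps < connection_kbps


# 44.1 kHz, 16-bit, stereo versus 8 kHz, 8-bit, mono
cd_quality = pcm_bit_rate_kbps(44_100, 16, 2)
phone_quality = pcm_bit_rate_kbps(8_000, 8, 1)

# Hypothetical 56 kbps dial-up connection
MODEM_KBPS = 56
print(cd_quality, fits_connection(cd_quality, MODEM_KBPS))
print(phone_quality, fits_connection(phone_quality, MODEM_KBPS))
print(file_size_kilobits(phone_quality, 30), "kilobits for a 30 second clip")
```

Under these assumptions even the lowest-quality uncompressed audio exceeds the assumed modem speed, which is why compressed formats matter for web delivery.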
Copyright Restrictions
Respect copyright restrictions. It is illegal to broadcast a sound
recording without permission of the copyright owner. Prerecorded sound
and music can be obtained for a one-time fee; search for royalty-free
music on the web. The other alternative is to prepare your own music.
Streaming Audio
Previously, one downloaded the complete sound file to the hard disk, and
then the browser would launch an external player or use a plug-in to play
the audio. Streaming audio begins playing almost immediately after the
request is made and continues playing while the audio data is being
transferred. Unfortunately, server software is expensive, and sound
quality suffers on slow or inconsistent Internet connections.
Streaming Audio Server Software
Special server software breaks the audio file into packets that are
reassembled on the client side. The audio player collects a couple of
packets before actually playing them (buffering). The protocols used
are:
- User Datagram Protocol (UDP)
- Real Time Streaming Protocol (RTSP) - two-way streaming protocol that
allows users to send messages back to the server.
- Real-time Transport Protocol (RTP) - one-way stream used by Apple
QuickTime.
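The packet-and-buffer idea described above can be sketched in a few lines. This is a toy simulation, not how any real streaming server works: the function names, the packet size, and the buffer size are all assumptions, and the shuffle merely stands in for packets arriving out of order.

```python
import random


def packetize(data, packet_size):
    """Split data into (sequence_number, payload) packets."""
    return [
        (seq, data[start:start + packet_size])
        for seq, start in enumerate(range(0, len(data), packet_size))
    ]


def play_with_buffer(packets, buffer_size):
    """Reassemble packets in order, starting once the buffer has filled.

    Returns the bytes in the order they would be played.
    """
    buffer = {}
    played = bytearray()
    next_seq = 0
    started = False
    for seq, payload in packets:
        buffer[seq] = payload
        # Playback begins only after a few packets have been collected.
        if not started and len(buffer) >= buffer_size:
            started = True
        # Play every packet that is next in sequence and already here.
        while started and next_seq in buffer:
            played += buffer.pop(next_seq)
            next_seq += 1
    # The stream has ended: play whatever is still buffered, in order.
    for seq in sorted(buffer):
        played += buffer[seq]
    return bytes(played)


audio = bytes(range(256)) * 4            # stand-in for audio data
packets = packetize(audio, packet_size=64)
random.Random(0).shuffle(packets)        # simulate out-of-order arrival
received = play_with_buffer(packets, buffer_size=3)
print(received == audio)
```

The sequence numbers are what let the client put packets back in order; the buffer gives late packets time to arrive before playback reaches them.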
Some media formats begin playing before they have completely downloaded.
This is pseudo-streaming and does not need special server software, but
it cannot handle heavy server loads or many simultaneous connections.
RealMedia, QuickTime, MP3, Flash, and Shockwave will pseudo-stream from
an HTTP server.
Web Audio Formats
- WAV / AIFF (.wav, .aiff)
- WAV (Waveform Audio File Format) was developed for Microsoft Windows
but is supported on Macintosh. AIFF (Audio Interchange File Format)
was developed for Macintosh but is now supported on other platforms.
They can support sampling rates of 8 and 11 kHz and bit depths of
8 and 16 bits. They sound good when uncompressed but suffer drastic
loss of quality when compressed. They are good for short audio clips
such as greetings.
- MP3 (.mp3)
- The MP3 format retains good sound fidelity under compression. Four
minutes of high quality sound in WAV format would occupy 40 MB
whereas in MP3 format it occupies 3.5 MB. MP3 files are technically
MPEG-1 Audio Layer III files. MPEG is a family of multimedia standards
created by the Moving Picture Experts Group.
MPEG uses a lossy compression scheme. Sounds that are not discernible to
the human ear are thrown out in the compression process. MP3s can
be served from an HTTP server.
- Apple QuickTime Audio (.mov)
- QuickTime is known as a video technology, but it is possible to create
audio-only files. Both Netscape and IE come with a QuickTime plugin. It
is good for continuous-play audio and is capable of true streaming via
RTP or RTSP.
- RealMedia / RealAudio (.rm, .ra)
- RealAudio is a server-based streaming audio technology. RealServer
offers more advanced features such as bandwidth negotiation. It uses RTSP
transmission for smooth playback. A RealServer system can allow thousands
of simultaneous listeners. Plugins for listening to RealAudio files are
available for most browsers.
- Flash (.swf) / Shockwave (.dcr)
- Flash, developed by Macromedia, is ideal for adding high-impact
interactivity and animation to web sites. Audio can be embedded in a
Flash movie. Shockwave takes Director files and compresses them
for web delivery as .dcr files.
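The WAV and MP3 sizes quoted in the MP3 entry above can be roughly reproduced by arithmetic. The lecture does not state the audio settings behind its figures, so the values below (44.1 kHz, 16-bit, stereo for WAV; a 128 kbps MP3) are assumptions, as is counting 1 MB as 2**20 bytes.

```python
SECONDS = 4 * 60          # four minutes of audio
MB = 2 ** 20              # bytes per megabyte (assumed convention)

# Uncompressed WAV: samples/second * bytes/sample * channels * seconds
wav_bytes = 44_100 * (16 // 8) * 2 * SECONDS

# MP3 at an assumed constant bit rate of 128 kbps
mp3_bytes = 128_000 // 8 * SECONDS

wav_mb = wav_bytes / MB
mp3_mb = mp3_bytes / MB
ratio = wav_bytes / mp3_bytes

print(f"WAV: {wav_mb:.1f} MB, MP3: {mp3_mb:.1f} MB, ratio about {ratio:.0f}:1")
```

With these assumed settings the results land close to the 40 MB and 3.5 MB quoted in the notes; a slightly lower MP3 bit rate would bring the second figure closer still.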
Adding Audio to a Web Page
You can use a simple anchor tag to link to an audio file from a web page.
<a href="./audio/song.wav">Play our song (3.5 MB)</a>
<a href="./audio/song.mp3"><img src="./images/playme.gif" alt="Play our song"></a>
When the reader clicks on the link or the graphic, the browser retrieves the
audio file from the server and launches a helper application to play the
file.
If you want the audio file to play in the background in both Netscape and
IE, use a combination of the embed tag (Netscape) and the bgsound tag (IE):
<embed src="./audio/song.wav" autostart="true" hidden="true">
<noembed><bgsound src="./audio/song.wav"></noembed>
Digital Video Concepts
- Movie Length
- Longer video clips take more space. If you have to show a longer video,
use a streaming video solution.
- Frame size
- Full screen video is 640 x 480 pixels. The most common frame size for
web video is 160 x 120 pixels; some go as small as 120 x 90 pixels. Actual
size limits depend on CPU power and the bandwidth of the user's Internet
link.
- Frame rate
- Frame rate is the number of frames per second (fps). On TV it is about
30 fps. For the web 15 fps or even 10 fps is appropriate.
- Color bit depth
- The size of the video is affected by the number of bits used to store
the color of each pixel in a frame. Reducing the color depth from 24-bit
to 8-bit will reduce the file size.
- Data rate
- The size of the movie in kilobits divided by the length of the movie
in seconds is the data rate. For streaming media the file's data rate
is more important than its total size.
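To see how frame size, frame rate, and color bit depth combine into a data rate, here is a rough calculation for uncompressed video. The function name is hypothetical, 1 kilobit is taken as 1000 bits, and audio and compression are ignored, so real files will be far smaller.

```python
def raw_video_data_rate_kbps(width, height, color_bits, fps):
    """Data rate of uncompressed video in kilobits per second."""
    bits_per_frame = width * height * color_bits
    return bits_per_frame * fps / 1000


full_screen_tv = raw_video_data_rate_kbps(640, 480, 24, 30)
web_24_bit = raw_video_data_rate_kbps(160, 120, 24, 15)
web_8_bit = raw_video_data_rate_kbps(160, 120, 8, 15)

print(full_screen_tv)   # full screen, 24-bit color, 30 fps
print(web_24_bit)       # typical web frame size, 24-bit color, 15 fps
print(web_8_bit)        # same clip reduced to 8-bit color
```

Shrinking the frame and halving the frame rate cuts the rate by a factor of 32, and dropping from 24-bit to 8-bit color cuts it by a further factor of 3, which shows why all three settings are reduced for web video.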
Compression
- Lossless versus Lossy Compression
- Lossless compression does not lose information and the final file
is identical to the original. Lossy compression uses complicated
algorithms to remove data for sound and image detail that is not
discernible to the human ear or eye. The decompressed file is
similar but not identical to the original.
- Spatial versus Temporal Compression
- Spatial Compression takes each individual frame and compresses the
pixel information as though it were a still image. Temporal compression
happens over a series of frames and takes advantage of areas of the
image that remain unchanged from frame to frame, throwing out data
for repeated pixels.
- Video Codecs
- Codecs are compression / decompression algorithms that can be applied
to video files for the web. Video editing software packages often offer
a long list of codecs in their compressor options. Among them
are:
- Radius Cinepak
- Sorenson
- Intel Indeo
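The temporal compression idea described above, storing only what changes from frame to frame, can be illustrated with a toy example. This sketch is lossless and deliberately simplistic; it is not how Cinepak, Sorenson, or Indeo work internally, and the function names and tiny 8-pixel "frames" are made up for illustration.

```python
def delta_encode(previous, current):
    """Return (index, new_value) pairs for pixels that changed."""
    return [
        (i, cur)
        for i, (prev, cur) in enumerate(zip(previous, current))
        if prev != cur
    ]


def delta_decode(previous, changes):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for index, value in changes:
        frame[index] = value
    return frame


# Two tiny one-row "frames": a bright object moves one pixel to the right.
frame1 = [10, 10, 10, 10, 200, 200, 10, 10]
frame2 = [10, 10, 10, 10, 10, 200, 200, 10]

changes = delta_encode(frame1, frame2)
rebuilt = delta_decode(frame1, changes)

print(changes)             # only the pixels that differ are stored
print(rebuilt == frame2)   # the second frame is recovered exactly
```

Only two of the eight pixels need to be stored for the second frame. A lossy codec would go further and also discard small changes the eye is unlikely to notice.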
Video File Formats
- QuickTime Movie (.mov)
- QuickTime was developed to view video information on the Macintosh but
is now supported on Windows. QuickTime movies can be streamed using
a number of streaming server packages. You can edit video using
Apple's QuickTime Player.
- RealMedia (.rm)
- RealMedia is the industry-standard streaming media format. It is
good for long-playing video clips and live broadcasts to a large
number of people.
- Windows Media (.wmv or .asf)
- Windows Media is the new standard for audio and video created by
Microsoft. The Windows Media Player is capable of playing proprietary
Microsoft formats as well as a number of other formats. The media
server runs only on the Windows platform, and there are tools for
creating .wmv and .asf files.
- MPEG (.mpg or .mpeg)
- MPEG files offer extremely high compression rates with little loss of
quality. The compression technique strips out data that is not
discernible to the human ear or eye.
Adding Video to a Web Page
A downloadable video file can be linked to HTML documents using the
standard anchor tag.
<a href = "./videos/movie.mov"> Check out the video (1.3MB) </a>
When the user clicks on the link the browser looks at the file type and
launches an external player application or uses a plugin to play the movie
right in the browser window.
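The step where the browser "looks at the file type" can be approximated by mapping a file name to a media type. The sketch below uses Python's standard mimetypes module as a stand-in; a real browser usually relies on the type reported by the server, and the describe_media helper is a hypothetical name introduced here.

```python
import mimetypes


def describe_media(filename):
    """Return the top-level media type ("audio", "video", ...) or "unknown"."""
    media_type, _encoding = mimetypes.guess_type(filename)
    if media_type is None:
        return "unknown"
    return media_type.split("/")[0]


for name in ("./videos/movie.mov", "./audio/song.mp3"):
    print(name, "->", describe_media(name))
```

A player or plugin would then be chosen based on that media type, for example handing anything reported as video to a movie player.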