Notes from Lecture on Audio/Video

Digital Audio Concepts

Sampling rate
To convert an analog sound wave into digital form, samples of the wave are taken at regular intervals. The more samples taken per second, the greater the fidelity of the sound and the bigger the file. Sampling rates are measured in kilohertz (kHz). Standard rates are 8 kHz, 11.025 kHz, 22.05 kHz, 44.1 kHz, and 48 kHz (the middle three are often written as 11, 22, and 44 kHz).
Bit depth
Bit depth corresponds to the resolution of the amplitude of the sound, that is, the number of bits used to store each sample. The more bits, the better the quality of the audio and the bigger the file. Common bit depths are 8-bit, 16-bit, 20-bit, 24-bit, 32-bit and 48-bit.
Channels
Audio files can support from 1 to 6 channels. The familiar configurations are 1 channel for mono and 2 channels for stereo. Most existing file formats support only mono or stereo.
Bit rate
Bit rate is the number of bits per second devoted to storing audio data. It is measured in kilobits per second (kbps) and is computed as the file size in kilobits divided by the length of the audio clip in seconds. Keep the bit rate lower than the speed of the user's Internet connection so that streamed audio can play without interruption.
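
The four concepts above combine into simple arithmetic for uncompressed audio. The following is a minimal sketch, not part of the lecture: the function names are made up for illustration, and it assumes 1 kilobit = 1000 bits.

```python
def audio_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio in bytes.

    Every second holds sample_rate_hz samples per channel,
    and each sample takes bit_depth bits.
    """
    return sample_rate_hz * bit_depth * channels * seconds // 8


def bit_rate_kbps(size_bytes, seconds):
    """Average bit rate in kilobits per second (assumes 1 kilobit = 1000 bits)."""
    return size_bytes * 8 / 1000 / seconds


# Four minutes (240 s) of 44.1 kHz, 16-bit stereo sound.
size = audio_size_bytes(44_100, 16, 2, 240)
print(size)                      # 42336000 bytes, roughly 42 MB
print(bit_rate_kbps(size, 240))  # 1411.2 kbps
```

Halving any one of the sampling rate, bit depth, or channel count halves both the file size and the bit rate.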

Copyright Restrictions

Respect copyright restrictions. It is illegal to broadcast a sound recording without the permission of the copyright owner. Prerecorded sounds and music can be obtained for a one-time fee; search for royalty-free music on the web. The alternative is to prepare your own music.

Streaming Audio

Previously, the complete sound file had to be downloaded to the hard disk before the browser would launch an external player or use a plug-in to play the audio. Streaming audio begins playing almost immediately after the request is made and continues playing as the audio data is being transferred. Unfortunately, the server software is expensive, and the sound quality suffers on slow or inconsistent Internet connections.

Streaming Audio Server Software

Special server software breaks the audio file into packets that are reassembled on the client side. The audio player collects a few packets before it actually starts playing them (buffering). Protocols used for true streaming include RTP and RTSP.

Some media formats begin playing before they have completely downloaded. This is pseudo-streaming, and it does not need special server software, but it cannot handle heavy server loads or many simultaneous connections. RealMedia, QuickTime, MP3, Flash, and Shockwave will pseudo-stream from an HTTP server.
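
The idea of splitting a file into packets and buffering them on the client can be sketched in a few lines. This is only an illustration under simplifying assumptions: there is no network, `packetize` and `play_stream` are hypothetical names, and real streaming protocols carry far more information than a sequence number.

```python
def packetize(data, packet_size):
    """Server side: split data into (sequence_number, payload) packets."""
    return [
        (seq, data[start:start + packet_size])
        for seq, start in enumerate(range(0, len(data), packet_size))
    ]


def play_stream(packets, buffer_size):
    """Client side: reassemble packets in sequence order.

    Playback starts only once buffer_size packets have been collected
    (buffering), or when the stream ends.
    """
    buffered = {}        # packets received but not yet played
    next_seq = 0         # sequence number the player needs next
    started = False
    played = bytearray()
    for seq, payload in packets:
        buffered[seq] = payload
        if not started and len(buffered) >= buffer_size:
            started = True
        while started and next_seq in buffered:
            played += buffered.pop(next_seq)
            next_seq += 1
    # End of stream: play whatever is still waiting in order.
    while next_seq in buffered:
        played += buffered.pop(next_seq)
        next_seq += 1
    return bytes(played)


audio = b"streaming audio data"
packets = packetize(audio, 4)
# Simulate packets arriving out of order.
arrival = [packets[1], packets[0], packets[2], packets[4], packets[3]]
print(play_stream(arrival, buffer_size=2) == audio)  # True
```

A larger buffer delays the start of playback but gives the player more room to absorb late or out-of-order packets.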

Web Audio Formats

WAV / AIFF (.wav, .aiff)
WAV (Waveform Audio File Format) was developed for Microsoft Windows but is also supported on the Macintosh. AIFF (Audio Interchange File Format) was developed for the Macintosh but is now supported on other platforms. Both support a range of sampling rates (8 and 11 kHz at the low end, up to 44.1 kHz for CD quality) and bit depths of 8 and 16 bits. They sound good when uncompressed but suffer a drastic loss of quality when compressed. Because uncompressed files are large, they are best for short audio clips like greetings.
MP3 (.mp3)
The MP3 format retains good sound fidelity under compression. Four minutes of high-quality sound in WAV format would occupy about 40 MB, whereas in MP3 format it occupies about 3.5 MB. MP3 is technically MPEG-1 Audio Layer III. MPEG is a family of multimedia standards created by the Moving Picture Experts Group. MPEG uses a lossy compression scheme: sounds that are not discernible to the human ear are thrown out in the compression process. MP3s can be served from an HTTP server.
Apple QuickTime Audio (.mov)
QuickTime is known as a video technology, but it can also be used to create audio-only files. Both Netscape and IE come with the QuickTime plugin. QuickTime is good for continuous-play audio and is capable of true streaming via RTP or RTSP.
RealMedia / RealAudio (.rm, .ra)
RealAudio is a server-based streaming audio technology. RealServer offers more advanced features such as bandwidth negotiation. It uses RTSP transmission for smooth playback. A RealServer system can support thousands of simultaneous listeners. Plugins for listening to RealAudio files are available for most browsers.
Flash (.swf) / Shockwave (.dcr)
Flash, developed by Macromedia, is ideal for adding high-impact interactivity and animation to web sites. Audio can be embedded in a Flash movie. Shockwave takes Director files and compresses them for web delivery as .dcr files.
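
The WAV and MP3 sizes quoted in the MP3 entry above (40 MB and 3.5 MB for four minutes) can be turned into a compression ratio and an average bit rate. This is a rough sketch: the function names are made up for illustration, and it assumes 1 MB = 1,000,000 bytes and 1 kilobit = 1000 bits.

```python
def compression_ratio(original_mb, compressed_mb):
    """How many times smaller the compressed file is."""
    return original_mb / compressed_mb


def average_bit_rate_kbps(size_mb, seconds):
    """Average bit rate, assuming 1 MB = 1,000,000 bytes and 1 kilobit = 1000 bits."""
    return size_mb * 1_000_000 * 8 / 1000 / seconds


print(round(compression_ratio(40, 3.5), 1))    # 11.4
print(round(average_bit_rate_kbps(3.5, 240)))  # 117
```

At roughly 117 kbps, this MP3 still needs more bandwidth than a 56 kbps dial-up modem can deliver, so on such a connection it would have to be downloaded rather than played in real time.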

Adding Audio to a Web Page

You can use a simple anchor tag to link to an audio file from a web page.
<a href="./audio/song.wav">Play our song (3.5 MB)</a>

<a href="./audio/song.mp3"><img src="./images/playme.gif" alt="Play our song"></a>

When the reader clicks on the link or the graphic, the browser retrieves the audio file from the server and launches a helper application to play the file.

If you want the audio file to play in the background in both Netscape and IE, use a combination of the bgsound tag (IE) and the embed tag (Netscape):

<embed src="./audio/song.wav" autostart="true" hidden="true">

<noembed><bgsound src="./audio/song.wav"></noembed>

Digital Video Concepts

Movie Length
Longer video clips take more space. If you have to show a longer video, use a streaming video solution.
Frame size
Full-screen video is 640 x 480 pixels. The most common frame size for web video is 160 x 120 pixels; some go as small as 120 x 90 pixels. Actual size limits depend on CPU power and the bandwidth of the user's Internet link.
Frame rate
Frame rate is the number of frames per second (fps). On TV it is about 30 fps. For the web 15 fps or even 10 fps is appropriate.
Color bit depth
The size of the video is affected by the number of colors available for each pixel in a frame. Reducing the color depth from 24-bit to 8-bit will reduce the file size.
Data rate
The size of the movie in kilobits divided by the length of the movie in seconds is the data rate. For streaming media the file's data rate is more important than its total size.
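
Frame size, frame rate, and color bit depth multiply together to give the data rate of uncompressed video. The sketch below uses the figures from this section; the function name is made up for illustration, and it assumes 1 kilobit = 1000 bits.

```python
def raw_video_kbps(width, height, color_bits, fps):
    """Data rate of uncompressed video in kilobits per second.

    Each frame holds width * height pixels, each pixel takes color_bits bits,
    and fps frames are shown every second.
    """
    return width * height * color_bits * fps / 1000


print(raw_video_kbps(160, 120, 24, 15))  # 6912.0   (typical web frame size)
print(raw_video_kbps(160, 120, 8, 15))   # 2304.0   (same clip with 8-bit color)
print(raw_video_kbps(640, 480, 24, 30))  # 221184.0 (full screen at TV frame rate)
```

Even the small 8-bit clip is far above the speed of a dial-up connection, which is why web video depends on compression.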

Compression

Lossless versus Lossy Compression
Lossless compression does not lose information and the final file is identical to the original. Lossy compression uses complicated algorithms to remove data for sound and image detail that is not discernible to the human ear or eye. The decompressed file is similar but not identical to the original.
Spatial versus Temporal Compression
Spatial Compression takes each individual frame and compresses the pixel information as though it were a still image. Temporal compression happens over a series of frames and takes advantage of areas of the image that remain unchanged from frame to frame, throwing out data for repeated pixels.
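
Temporal compression can be illustrated with simple frame differencing: store only the pixels that changed since the previous frame. This is a toy sketch with made-up function names, treating a frame as a flat list of pixel values. Real codecs use far more elaborate (and usually lossy) techniques.

```python
def encode_delta(previous, current):
    """Return {pixel_index: new_value} for the pixels that changed."""
    return {
        i: cur
        for i, (prev, cur) in enumerate(zip(previous, current))
        if prev != cur
    }


def decode_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for i, value in delta.items():
        frame[i] = value
    return frame


frame1 = [0, 0, 5, 5, 0, 0, 0, 0]
frame2 = [0, 0, 0, 5, 5, 0, 0, 0]  # the bright area moved one pixel right

delta = encode_delta(frame1, frame2)
print(delta)                                  # {2: 0, 4: 5}
print(decode_delta(frame1, delta) == frame2)  # True
```

Only two of the eight pixels need to be stored for the second frame; the more of the image that stays the same between frames, the bigger the saving.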
Video Codecs
Codecs are compression / decompression algorithms that can be applied to video files for the web. Video editing software packages often offer a long list of codecs in their compressor options. Among them are:

Video File Formats

QuickTime Movie (.mov)
QuickTime was developed to view video information on the Macintosh but is now supported on Windows. QuickTime movies can be streamed using a number of streaming server packages. You can edit video using Apple's QuickTime Player.
RealMedia (.rm)
RealMedia is an industry-standard streaming media format. It is good for long-playing video clips and live broadcasts to a large number of people.
Windows Media (.wmv or .asf)
Windows Media is the new standard for audio and video created by Microsoft. The Windows Media Player is capable of playing proprietary Microsoft formats as well as a number of other formats. The media server runs only on the Windows platform, and there are tools for creating .wmv and .asf files.
MPEG (.mpg or .mpeg)
MPEG files offer extremely high compression rates with little loss of quality. The compression technique strips out data that is not discernible to the human ear or eye.

Adding Video to a Web Page

A downloadable video file can be linked to HTML documents using the standard anchor tag.
<a href="./videos/movie.mov">Check out the video (1.3 MB)</a>

When the user clicks on the link, the browser looks at the file type and either launches an external player application or uses a plugin to play the movie right in the browser window.