Abstract
Streaming technology has been in existence for a few years. It has not,
however, matured to the stage where full-screen video with surround sound can
be enjoyed by all. Conventional streaming techniques are wasteful and do not
yield high-quality presentations. In this paper I will discuss streaming
compression and its delivery. In particular, I will explain some of the
codecs, the conventional streaming techniques and the recent development of
burst streaming. Finally, I will show that burst technology is significantly
more intelligent than the traditional methods.
Keywords:
Streaming, Codecs, Burst
Contents
- Introduction
- Compression and Decompression
- Conventional Streaming Techniques
- Burst
- Conclusion
- References
1. Introduction
Streaming is the transmission of multimedia information, in the form of audio
and video, from a server to a client, where it can be viewed immediately.
Broadcasting, on the other hand, involves the transmission of similar
information from a server to multiple clients. A television station
broadcasts its signal to many viewers, whereas a stream is sent specifically
to the client that requests it. Streaming and broadcasting transmit the same
kind of information using different technologies.
In this paper I will concentrate on streaming, its compression and its
delivery.
I will first discuss the motives behind streaming, its major hurdles and the
two ways of tackling them. In the next section I will discuss the compression
algorithms required. Following this, I will outline the conventional
streaming techniques that use real-time delivery. Lastly, I will compare
these with burst technology, which promises to revolutionize streaming. I
will conclude by giving suggestions to improve current streaming techniques.
1.1. Motives and Problems
Streaming has many uses. It carries multimedia that requires jitter-free,
eye-catching images mixed with CD-quality sound. It provides interactive
entertainment through video-on-demand, radio, web cams and live concerts or
sports events. All of these uses, however, require a lot of bandwidth.
Uncompressed video is too big to stream. The dual stream that professionals
use requires 44 megabits per second (Mbs). Overcoming this requires two
concurrent approaches. The first minimizes the bandwidth required by using
compression algorithms. The second maximizes the bandwidth available by using
intelligent network management. Together these techniques improve the quality
of streaming media. In the next section I will discuss the techniques
involved in compressing streaming media.
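To put the bandwidth problem in numbers, the following is a rough sketch of
the raw bit rate of uncompressed video. The frame size, colour depth and frame
rate are assumed example values, not figures taken from any particular
format, and the function name is only illustrative.

```python
def raw_bitrate(width, height, bits_per_pixel, frames_per_second):
    """Bits per second needed to send every pixel of every frame uncompressed."""
    return width * height * bits_per_pixel * frames_per_second

# Assumed example format: 640x480 pixels, 24-bit colour, 30 frames per second.
video_bps = raw_bitrate(640, 480, 24, 30)
modem_bps = 56_000  # nominal rate of a 56K modem

print(f"Uncompressed video: {video_bps / 1e6:.1f} Mbs")
print(f"Compression needed for a 56K modem: about {video_bps / modem_bps:.0f} to 1")
```

Even at this modest assumed size the raw stream is several thousand times
larger than a modem can carry, which is why both compression and careful
delivery are needed.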
2. Compression and Decompression
Streaming requires heavy use of compression and decompression algorithms,
commonly called codecs. These algorithms share a few techniques that are best
illustrated by discussing both MPEG-1 and MPEG-4 encoding.
2.1. MPEG-1 Predictive Encoding
The MPEG-1 codec was designed to provide quality video primarily for CD-ROMs.
It uses a bandwidth of 1.2 megabits per second (Mbs), which is too high for
the majority of the Internet population, who use 56K modems. Although the
MPEG-1 codec itself is not well suited to streaming, the ideas behind it are.
MPEG-1 uses predictive encoding; that is, it stores the differences between
successive frames. A frame of video is the still picture representing an
exact point in time. Every second of a typical MPEG-1 stream contains either
25 or 30 frames. The majority of these frames are only slightly different
from the previous frame. This observation, that the difference between frames
is small, is the foundation of the MPEG-1 codec.
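The idea of storing only the differences between frames can be sketched as
follows. This is a deliberately simplified illustration: frames are assumed
to be flat lists of pixel values, and the helper names encode_delta and
apply_delta are hypothetical. Real MPEG-1 works on blocks of pixels with
motion compensation rather than on individual pixels.

```python
def encode_delta(previous, current):
    """Return (index, new_value) pairs for the pixels that changed."""
    return [(i, c) for i, (p, c) in enumerate(zip(previous, current)) if p != c]

def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame and the changes."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame

# Two assumed 8-pixel frames: a bright two-pixel object moves one step right.
frame1 = [10, 10, 10, 10, 200, 200, 10, 10]
frame2 = [10, 10, 10, 10, 10, 200, 200, 10]

delta = encode_delta(frame1, frame2)
print(f"{len(delta)} of {len(frame2)} pixels stored")
```

When little changes between frames, the list of changes is much shorter than
the frame itself, which is where the saving comes from.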
2.1.1. I-Frame
The starting frame of an MPEG-1 sequence is called the intra frame or the
I-frame. This frame stores a complete still image, compressed on its own
without reference to any other frame. The whole image is required since the
following frames record the changes from this I-frame.
2.1.2. P-Frame
Following the I-frame is the predictive frame or the P-frame. The P-frame is
predicted from the closest previous I- or P-frame. It only has to store the
changes from that frame and nothing more. High compression thus results for
video that does not have a lot of motion. In the worst case, where every
frame of the original video is vastly different from the previous one, MPEG-1
compression is ineffective. This is because every P-frame would store about
the same amount of information as an I-frame, so predictive encoding gains
nothing.
2.1.3. B-Frame
In between the I- and P-frames comes the bidirectional frame or the B-frame.
The storage requirements of this frame are minimal since it is reconstructed
during decompression from the closest I- or P-frames on either side of it.
The B-frames ensure a continuous, smooth motion between successive I- and
P-frames. Compression rates increase greatly using this technique since these
frames carry very little data of their own.
2.1.4. Structure
A typical MPEG-1 stream has the following structure:
IBBPBBPBBPBBIBBPBBPBBPBBI
Every twelfth frame is an I-frame. Between successive I-frames there are
three P-frames, and each I- or P-frame is followed by two B-frames. An
I-frame needs to be stored frequently to stop errors from propagating through
the P-frames.
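The frame pattern above can be generated with a few lines of code. This is
only a sketch: the function name gop_pattern and its two parameters are
assumptions made for illustration, with defaults chosen to match the
structure shown.

```python
def gop_pattern(group_size=12, anchor_spacing=3):
    """Build one group of frames: an I-frame, then P-frames at regular
    intervals, with B-frames filling the gaps."""
    frames = []
    for position in range(group_size):
        if position == 0:
            frames.append("I")
        elif position % anchor_spacing == 0:
            frames.append("P")
        else:
            frames.append("B")
    return "".join(frames)

group = gop_pattern()
# Two full groups followed by the I-frame that starts the third.
sequence = group * 2 + "I"
print(sequence)
```

Changing group_size trades compression against how quickly an error is
corrected, since a larger group means fewer I-frames.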
2.2. Key Frames
The I-frame is often called the key frame because its presence is essential
to the quality of the presentation. The quality of a streaming presentation
depends on the on-time arrival of the frames. Key frames allow some
intermediate frames to be dropped for a marginal loss in quality. This is
because intermediate frames only record changes from a key frame. If they are
lost, the presentation will be momentarily jerky or unclear, but when the
next key frame arrives the quality will be restored. If a key frame is lost,
the presentation can still continue but it will be unclear until the next key
frame arrives.
Another advantage of key frames is that a viewer can join a live broadcast at
the next I-frame. It is not necessary to view the presentation from the very
beginning. Much as a CD can be played from any track, a streaming
presentation can be played from any key frame. So if someone in Brazil
starts watching the Olympic closing ceremony after it has started in Sydney,
they do not need to watch the streaming presentation from the beginning of
the ceremony. When they start the streaming broadcast, they will be watching
the ceremony live from the next key frame.
Finally, key frames allow for easy fast forwarding and rewinding of the
streaming media. If someone watching a video on the Internet wants to watch a
particular scene again, they can easily do so by rewinding to a specific key
frame. Likewise, if a user wants to skip a certain scene, they can easily do
so by fast forwarding to a later key frame. Key frames give users fine
control over streaming presentations.
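Joining a stream or seeking within it then reduces to finding the nearest key
frame. The sketch below assumes the player knows the frame numbers of the key
frames as a sorted list; the function names and the spacing of 12 frames are
assumptions for illustration.

```python
from bisect import bisect_left, bisect_right

def next_key_frame(key_frames, position):
    """First key frame at or after `position`, e.g. when joining a live stream."""
    i = bisect_left(key_frames, position)
    return key_frames[i] if i < len(key_frames) else None

def previous_key_frame(key_frames, position):
    """Last key frame at or before `position`, e.g. when rewinding to a scene."""
    i = bisect_right(key_frames, position)
    return key_frames[i - 1] if i > 0 else None

# Assumed layout: a key frame every 12 frames in a 120-frame clip.
key_frames = list(range(0, 120, 12))

print(next_key_frame(key_frames, 40))      # where a late viewer starts
print(previous_key_frame(key_frames, 40))  # where a rewind lands
```

A viewer arriving at frame 40 waits for the following key frame, while a
rewind request for frame 40 starts decoding from the key frame just before it.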
2.3. MPEG-4 Object Encoding
MPEG-4 encoding was specifically tailored to very low bit rates. This suits
streaming, which has to work within low bandwidth. MPEG-4 achieves high
compression using object encoding; that is, it recognizes foreground and
background objects and applies a different level of compression to each. A
background object does not require much detail and thus can be heavily
compressed. A foreground object requires a lot of detail and thus is not
heavily compressed. This technique gives most of the available space to the
foreground and blurs the background. Human vision works in much the same
way. The object that is the focus of our attention is clear whilst the rest
of the scene is blurred.
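The principle of compressing the background harder than the foreground can be
illustrated with a toy example. This is not how MPEG-4 is actually
implemented; it only assumes that each pixel has already been labelled as
foreground or background, and it rounds background values far more coarsely.
The function name and step sizes are hypothetical.

```python
def encode_region(pixels, foreground_mask, fine_step=4, coarse_step=32):
    """Round foreground pixels to a fine step and background pixels to a
    coarse step, so the background keeps far fewer distinct values."""
    encoded = []
    for value, is_foreground in zip(pixels, foreground_mask):
        step = fine_step if is_foreground else coarse_step
        encoded.append((value // step) * step)
    return encoded

# Assumed scan line: the two middle pixels belong to the foreground object.
pixels = [13, 77, 130, 131, 201, 250]
mask = [False, False, True, True, False, False]

print(encode_region(pixels, mask))
```

The foreground values stay close to the originals, while the background is
reduced to a handful of levels that take fewer bits to store.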
2.4. Resolution
A technique that is often used in streaming is to lower the resolution, or
screen size, of the video. Undesirable though this may be, it does
significantly decrease the size of the presentation. The smaller the viewing
area, the less information has to be carried and the greater the compression.
I consider this a last resort for getting the most out of the available
bandwidth. When a presentation has been compressed using all of the other
techniques and it still takes up too much bandwidth, lowering the resolution
will definitely decrease the size of the presentation.
3. Conventional Streaming Techniques
In the previous section I discussed how to compress the media into the
smallest possible size. This minimizes the amount of bandwidth required to
stream. In this section I will show the traditional methods that maximize the
use of all available bandwidth. I will compare these methods with a more
recent approach in the next section.
3.1. Constant Bit Rate
Conventional streaming servers send data at the same rate as the client can
play it. So when a client requests streaming media, the server looks at the
user's bandwidth and play rate and adjusts the streaming speed accordingly.
The streaming data is scheduled to arrive just in time. This means that each
streaming packet is played as soon as it arrives. A reliable network is thus
necessary for seamless streaming. If the network is unreliable and the stream
is delayed, the whole presentation will hang.
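The effect of just-in-time delivery can be sketched with a small simulation.
It assumes every packet holds the same amount of playing time and that
playback begins when the first packet arrives; the function name and the
arrival times are made up for illustration.

```python
def total_stall(arrival_times, packet_duration):
    """Seconds the player spends waiting when packets are played in order,
    each one as soon as both it and its turn have arrived."""
    stalled = 0.0
    clock = arrival_times[0]  # playback starts with the first packet
    for arrival in arrival_times:
        if arrival > clock:
            # The player ran out of data and must wait for this packet.
            stalled += arrival - clock
            clock = arrival
        clock += packet_duration
    return stalled

on_time = [0.0, 1.0, 2.0, 3.0]
delayed = [0.0, 1.0, 2.5, 3.0]  # the third packet is half a second late

print(total_stall(on_time, 1.0))
print(total_stall(delayed, 1.0))
```

With no buffer to draw on, any delay in the network shows up directly as a
pause in the presentation.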
3.2. Network Problems
The typical network is in fact unreliable. Expecting that a constant bit rate
will be passed on seamlessly through all the different routers is unrealistic.
A constant bit rate stream is often delayed, which in turn frequently causes
the presentation to halt. There is no room for delay. The user cannot be
shielded from network problems. Network congestion messages are commonly
displayed and this dramatically affects the presentation. The worst part is
that not much can be done about it.
3.2.1. Protocol - UDP
Since these delays in delivery are very undesirable, the streaming protocol
must be as fast as possible. TCP guarantees that data arrives, but resending
lost packets takes too long. The User Datagram Protocol (UDP), on the other
hand, is much faster and is thus suitable for streaming. The basic idea
behind the protocol is as follows. The streaming data is broken down into
small packets and sent to the client. The client then reassembles these
packets and plays the presentation. The difference with this protocol is that
a lost packet is not resent. This is suitable for streaming since a packet
that is late as a result of being resent is out of place in the presentation.
Late packets are simply discarded. The streaming presentation does not
require all of the packets to arrive. It only requires a reasonable
percentage to arrive on time.
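The client-side rule of discarding late packets can be sketched as below. No
real sockets are used; the packets are assumed to be (sequence number,
arrival time) pairs, each covering the same amount of playing time, and the
function name and timings are illustrative assumptions.

```python
def sort_packets(packets, start_time, packet_duration):
    """Split packets into those that arrive by their play time and those
    that arrive too late and are discarded."""
    played, discarded = [], []
    for sequence, arrival in packets:
        deadline = start_time + sequence * packet_duration
        if arrival <= deadline:
            played.append(sequence)
        else:
            discarded.append(sequence)
    return sorted(played), sorted(discarded)

# Assumed arrivals: packet 2 is delayed and turns up after packet 3.
packets = [(0, 0.9), (1, 1.8), (3, 3.5), (2, 4.2), (4, 4.9)]
played, discarded = sort_packets(packets, start_time=1.0, packet_duration=1.0)

print(f"played {played}, discarded {discarded}")
print(f"{len(played) / len(packets):.0%} arrived on time")
```

The presentation carries on with the packets that made it, and the late one
is dropped rather than played out of place.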
3.3. Streaming Architecture
The conventional streaming server requires plenty of bandwidth. It has to
allocate enough bandwidth to cover for all the requests during peak periods.
Most of the time, however, bandwidth is simply unused and wasted. The diagram
below illustrates the typical architecture.
The dark blue areas show the actual bandwidth used. The light blue areas
show the amount of bandwidth that is wasted. As you can see, for the majority
of the time bandwidth is wasted. This is costly for the web site that has set
up the streaming server. In the next section I will discuss a better way of
constructing a streaming server that does not waste resources.
4. Burst
In the previous section I looked at the conventional streaming techniques
and showed how they are prone to network problems and waste a lot of
bandwidth. The new burst streaming architecture aims to overcome these
problems through intelligent buffer management.
4.1. Burst Architecture
The burst architecture efficiently averages out the peaks and troughs that are
present in conventional streaming. This averaging results in less overall
bandwidth use and thus higher profit margins. The diagram below compares the
bandwidth allocated to conventional servers and that allocated to burst
servers.
As you can see, the bandwidth used by the burst architecture is significantly
less than that required by conventional architectures. This is achieved
through intelligent buffer management, which will be explained in section
4.3.
4.2. Hiding Network Problems
Burst technology aims to shield the user from network problems. This is
achieved by monitoring the network topology, the application configuration and
the video load. The network topology is monitored by looking at the bandwidth
of the server and the client. If the bandwidth of either the client or the
server is not fully used, more packets are sent to the client. The
application configuration is monitored through the length of the presentation
and the bandwidth it requires. A presentation is given a higher priority if
it is short or if the bandwidth it requires is high. The streaming load on
the server is monitored by keeping track of the maximum number of clients and
the variation in the number of clients. If there is a high number of clients
on a server, it is less likely to accept another streaming request. The
server has to keep track of all these things to make informed, intelligent
decisions.
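A connection acceptance decision of this kind might look like the sketch
below. The fields of ServerState and the two checks are assumptions chosen to
mirror the quantities mentioned above; they are not the rules of any actual
burst server.

```python
from dataclasses import dataclass

@dataclass
class ServerState:
    capacity_kbps: int  # total outgoing bandwidth of the server
    used_kbps: int      # bandwidth already committed to clients
    clients: int        # clients currently being served
    max_clients: int    # limit the operator has configured

def accept_request(server, required_kbps):
    """Accept a new stream only if there is a free client slot and enough
    spare bandwidth to carry it."""
    if server.clients >= server.max_clients:
        return False
    return server.used_kbps + required_kbps <= server.capacity_kbps

busy = ServerState(capacity_kbps=10_000, used_kbps=9_000, clients=5, max_clients=10)
full = ServerState(capacity_kbps=10_000, used_kbps=1_000, clients=10, max_clients=10)

print(accept_request(busy, 500))    # fits in the spare bandwidth
print(accept_request(busy, 1_500))  # would exceed capacity
print(accept_request(full, 500))    # no client slots left
```

A server that turns a request away is in a position to hand it to another
machine instead of degrading the clients it already has.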
4.2.1. Testing Scenarios
The client watching or requesting a presentation has to be shielded from any
network problems. If an unfortunate event happens, the user should not
notice it. In the scenario where a server dies, a backup server instantly
detects the failure and takes its place. This requires that a server be on
standby at all times in case something happens. The client's presentation
continues uninterrupted and is thus of higher quality.
Another scenario is when the server is too busy. In this case the busy
server refers the next client to another, less heavily loaded server. This
requires that the streaming servers keep track of each other's loads so that
all clients receive high-quality presentations.
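Both scenarios come down to choosing a healthy, lightly loaded server for the
client. The sketch below assumes each server reports whether it is alive and
what fraction of its capacity is in use; the server names and the function
choose_server are hypothetical.

```python
def choose_server(servers):
    """Return the name of the least-loaded server that is still alive,
    or None if every server is down."""
    alive = {name: load for name, (is_alive, load) in servers.items() if is_alive}
    if not alive:
        return None
    return min(alive, key=alive.get)

# Assumed status reports: (is_alive, fraction of capacity in use).
servers = {
    "primary": (False, 0.20),  # has failed, so it must be skipped
    "backup": (True, 0.60),
    "spare": (True, 0.30),
}

print(choose_server(servers))
```

The failed server is ignored even though its last reported load was low, and
the request goes to the live server with the most room.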
4.3. Intelligent Buffer Management
In the previous subsection I showed how a burst server shields the user from
network problems. This is primarily intended to increase the quality of the
presentation. Another way to increase quality is intelligent buffer
management. The streaming server keeps track of the buffers in the client
machines. If a client's buffer is running low, that client is given a high
priority. If the client has a full buffer, it is given a low priority. This
can be seen in the diagram below.
Client one has a full buffer and thus does not require a high priority.
The buffers of clients two and three are almost empty and thus require an
immediate refill from the server. The server then bursts, or floods, the
stream down to each of these clients' machines so that its buffer is quickly
replenished. This intelligent buffer management ensures that every client
has data in its buffer, which increases the quality of the presentation.
4.4. Comparison with HTTP streaming
So how does this burst technology compare with conventional HTTP streaming?
This is best summarized by looking at the following table.
Burst                                  | HTTP Streaming
---------------------------------------|---------------------------------------
Network centered                       | Client centered
Need based                             | Greed based
Connection acceptance criteria         | No criteria applied
Fall-over scheme                       | No fall-over scheme
Designed to handle files of all sizes  | Designed to handle fairly small files
HTTP streaming tends to satisfy the client with the biggest pipe. The
client with the largest bandwidth occupies the majority of the server's
attention. The other clients are unable to get reasonable service because
the bandwidth is all being used by the client with the fastest connection.
Burst streaming is much fairer in the sense that clients are served according
to their needs. Burst wins on all counts. It is significantly better than
HTTP streaming and is thus the option to choose.
5. Conclusion
Streaming is a fantastic idea and will be used more and more in the future.
The current architectures, however, do not offer high-quality service to all
clients. They also waste a lot of bandwidth and are thus expensive. Burst
streaming is an intelligent step forward but it is not the be-all and
end-all. Burst still needs to be refined to further increase the quality of
streaming media. The solution is not higher bandwidth but intelligent
management of the overall network. The network is not reliable and users are
not currently being shielded from this fact. When intelligent management is
employed, the user is shielded from the problems and thus receives
higher-quality presentations. This will be better for us all.
6. References