Audio Streaming

What Is Audio Streaming & How Does It Work?

As technology has advanced, audio quality has improved considerably, and physical recordings have largely given way to streaming. Many people now listen through audio streaming services, and faster internet connections make it easy to tune in to a favorite live audio streaming platform.

Streaming means you can enjoy things like podcasts, movies, TV shows, and webcasts online. You can listen to or watch these things on different devices, like your phone or computer. This article will explain how audio streaming works and what you need to create your own music streaming app.

What is Audio Streaming?

Audio streaming is the one-way transmission of audio over a data network. It is used to listen to audio clips and radio from the Internet on computers, tablets, and smartphones. It can also be set up to stream a user’s music collection to a digital media hub connected to a stereo or home theatre.

Computers may also stream directly to powered speakers through Bluetooth or Wi-Fi. Unlike a music file that is downloaded and played later, a streamed song starts playing as soon as a small amount of audio data has been received, and the song file is not stored permanently on the target device. Audio streams can be played with well-known players such as Spotify, iTunes, Apple Music, Media Player, etc.

How Does Audio Streaming Work?

Audio streaming in the current media landscape offers a vast array of content, including popular podcasts and music. With music streaming services like Apple Music and Spotify, users can access millions of tracks instantly without the need for file downloads.

Launching a music streaming service involves providing a diverse selection of songs from various artists. These platforms automatically create playlists based on user preferences, offering a seamless and personalized listening experience.

Podcasts, on the other hand, can be either downloaded or streamed, allowing for easy access and smooth streaming across different devices. The media files are transmitted in packets of data, facilitating instant streaming without occupying storage space on the user’s devices, a departure from traditional audio recordings.

To enjoy audio streaming, users simply need a reliable high-speed internet connection and a subscription to a streaming platform or a mobile application. This setup enables uninterrupted access to a plethora of audio content without the need for extensive local storage.


How audio streaming works for internet radio stations:

Broadcasters need streaming software. This software converts the audio into a streaming format and sends it to a streaming server, and all listeners connect to that streaming server.

Streaming software:

The streaming software converts audio into a streaming format. The audio can come into your PC through a sound card, or you can run a music play-out system such as Aircast on the PC. In the streaming software, the user selects a streaming audio format and the server to connect to. Streaming providers offer streaming servers.

Streaming servers:

The streaming server is a remote server to which many listeners can connect in parallel. These servers are hosted in data centers where bandwidth is not a constraint. There are many types of streaming servers, most of them well known.

Audio Streaming On The Internet

An increasing number of applications use local area networks and the Internet for transmission. Besides the transmission of text and graphics, provided mainly by the World Wide Web (WWW), more and more audio and video services are emerging.

In cooperation with many German broadcasters, GMD has implemented an integrated Internet Radio system that uses a WWW interface to provide worldwide access to a live audio stream (the current radio program) and to archives of audio streams built from previously broadcast content.

Delivering Audio Content Over a Low Bandwidth Network:

 Streaming and Downloading:

We can provide audio content through a network in two different ways:

1. The audio file can be downloaded and then played from the local hard disk of the subscriber.

2. The audio content is streamed from the server to the subscriber, who decodes the received packets in real-time, plays the content immediately, and then discards the received data.

The advantages of the downloading method are:

• Downloading works at any data rate and allows any audio quality one needs to offer.

• The file is transferred error-free, so no loss of quality occurs during transmission.

But there are significant drawbacks of this method of audio delivery:

• If high audio quality is offered over a low-bandwidth network, download times are very long. For instance, a five-minute music clip encoded at 128kbps takes half an hour or more to download over typical private Internet access with an effective long-term bandwidth of 20kbps. The audio quality has to be reduced to shorten download times.

• The user has to download the complete file before listening to any part of it, and cannot preview the content to decide whether it is of interest.

• There is no possibility to provide a “live” service.
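The download-time figure quoted above can be checked with a few lines of arithmetic. This is only a sketch: the function name `download_seconds` is made up for illustration, and it ignores protocol overhead and assumes the link sustains its rated bandwidth.

```python
def download_seconds(duration_s: float, audio_kbps: float, link_kbps: float) -> float:
    """Seconds needed to download a clip encoded at audio_kbps over a link_kbps link."""
    size_kbit = duration_s * audio_kbps   # total size of the encoded clip
    return size_kbit / link_kbps          # transfer time at the sustained link rate


# Five-minute clip, 128 kbps encoding, 20 kbps effective bandwidth.
minutes = download_seconds(5 * 60, 128, 20) / 60
print(f"{minutes:.0f} minutes")  # 32 minutes
```

The result of 32 minutes matches the "half an hour or more" stated in the text.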

The advantages of streaming are:

• The listener can hear the content immediately after requesting it, and can fast-forward to other parts of the content without waiting for the whole file to download.

• It is possible to provide “live” services by encoding the audio signal in real-time and transferring the resulting audio data stream instantly to the client.

The drawbacks of streaming are:

• To stream real-time audio data, the transmission line has to provide the full bandwidth of the stream during the whole transmission period. It imposes a limit on the bandwidth and hence the quality of the audio stream.

On the Internet, the available bandwidth is very low if you want to reach the vast number of private users who connect by modem through service providers.

• The transmitted stream is sensitive to network load, which may cause lost or delayed data packets. This leads to drop-outs in the client’s audio output.

• To exercise the required flow control, additional server software (a “streaming server”) is needed.

It has to be considered that the main advantage of downloading, the possible high audio quality even over low-bandwidth networks, is offset in practical use by the resulting long download times.

The disadvantage of streaming, however, can be reduced by dynamically adapting the amount of transmitted data to the actual capacity of the network connection. To do this, store the audio data in a format in which dropping sections of the data results in a “graceful degradation”, or store it in several versions so that the server can choose the one best suited to the bandwidth of a given connection. In conclusion, streaming is a suitable method to offer the user fast access to content, and it is the only way to provide a live stream.
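The idea of storing several versions and letting the server choose one per connection could be sketched as follows. The function name `pick_variant`, the list of bitrates, and the 0.8 headroom factor are all assumptions made for this illustration, not part of any particular streaming server.

```python
def pick_variant(variants_kbps, measured_kbps, headroom=0.8):
    """Return the highest stored bitrate that fits the measured bandwidth.

    headroom (an assumed safety factor) keeps the stream below the raw
    measurement so short dips in capacity do not cause drop-outs.
    """
    usable = measured_kbps * headroom
    fitting = [v for v in variants_kbps if v <= usable]
    # If nothing fits, fall back to the smallest version rather than refusing.
    return max(fitting) if fitting else min(variants_kbps)


stored = [8, 16, 32, 64]           # hypothetical encodings of the same recording
print(pick_variant(stored, 20))    # 16
print(pick_variant(stored, 5))     # 8
```

A real server would re-run this choice whenever its bandwidth estimate for the connection changes.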

How To Launch Your Audio Streaming Services

Launching an audio streaming service has become a sought-after venture for many aspiring entrepreneurs and entertainment enthusiasts. To guide you through building a successful music streaming app, here are essential features to consider:

User Registration:

Grant users complete access to audio content through authenticated logins. Offer flexibility by allowing registration via Facebook, email, or other platforms.

Smart Search:

Facilitate easy navigation with a smart search feature, enabling listeners to find audio tracks or preferred podcasts quickly.

Customizable HQ Audio Player:

Provide a customizable player with essential controls (pause, play, rewind, shuffle, stop, forward) and display lyrics for an interactive user interface.

Formulated Playlists:

Curate specific genres of audio content using top-notch streaming software. Recommend playlists of highly appreciated audio files based on favorites, artists, or popularity.

Downloadable or Offline Function:

Enhance user flexibility by including download and offline modes. Allow listeners to enjoy their favorite tracks without an internet connection.

Expert Suggestions:

Elevate your platform’s value with a smart recommendation engine. Provide expert suggestions based on user activities for audiobooks, music, podcasts, etc.

Multi-Format Support:

Ensure compatibility with multiple formats, such as .mp3 for audio and VP8, HEVC, and H.264 for any accompanying video. Guarantee smooth playback by encoding at different bitrates.

Interactive Social-Sharing:

Enable listeners to share celebrated audio content across their social circles. Boost user traffic and enhance social interaction among your audience.

Monetization Models:

Monetize your audio content with various models like SVOD, AVOD, TVOD, coupons, etc. Integrate these models to increase Return on Investment (ROI).

Audio White-Label Platform:

Create a branded audio streaming platform with your logo or brand name. Self-host the platform to enhance credibility and grow your listener base.

Data Reports:

Utilize real-time audio analytics and reports to analyze and track the performance of audio content. Gain insights into listening patterns, user devices, drop-off rates, and more.

By incorporating these features, you can create a compelling and user-friendly audio streaming service that meets the expectations of today’s audience.

 HTTP Streaming:

The Hypertext Transfer Protocol (HTTP) describes the transmission of data on the World Wide Web. HTTP streaming tries to reduce some drawbacks of downloading. The file is downloaded via HTTP (i.e. no dedicated streaming server is involved) but played immediately as the data arrives.

Thus, the user can listen to the content right away, as with streaming. But there is no fast-forward and no possibility for the system to react to insufficient network bandwidth or temporary network congestion. The data is transported over a reliable TCP/IP protocol stack, which is slower and less appropriate for audio streaming than UDP.
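The difference from a plain download is only in when playback starts. A minimal sketch, assuming the HTTP response is exposed as a file-like object with a `read()` method and that `play` is some callback handing bytes to a decoder (both are placeholders here, simulated with an in-memory buffer):

```python
import io


def play_progressively(response, chunk_size, play):
    """Hand each chunk to the player as soon as it arrives; return bytes received."""
    received = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:            # empty read means the transfer is complete
            break
        play(chunk)              # playback begins before the file is complete
        received += len(chunk)
    return received


# Simulated response body; a real client would read from an HTTP connection.
played = []
total = play_progressively(io.BytesIO(b"x" * 10), 4, played.append)
print(total, [len(c) for c in played])  # 10 [4, 4, 2]
```

Note that the loop has no way to skip ahead or to ask for a lower bitrate, which mirrors the limitations described above.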


 Audio Formats:

All streaming systems use more or less different perceptual audio codecs to convey as much sound quality as possible to the listener. The most frequently used codecs are Dolby AC-3-based ones (e.g. RealAudio 3.0) and MPEG Layer 2/Layer 3 implementations. GMD additionally uses AT&T’s PAC codec. All codec implementations are real-time capable on at least one hardware platform (e.g. optimized for Intel Pentium processors). Hence, they can be used in real-time audio streaming applications like MBone Live Radio.

Audio Quality

The perceived audio quality differs from codec to codec and depends on the given bitrate. RealAudio seems to produce the best results at very low bit rates between 8 and 16 Kbit/s. At bit rates up to 112 Kbit/s, AT&T’s PAC and Fraunhofer’s (FhG) MPEG Layer 3 implementation reproduce the original with the lowest noticeable distortion. The higher the requested bitrate, the smaller the difference between the original and the codec output.


We will soon examine the differences in the error tolerance of the various codecs, focusing on packet losses, and test the whole streaming system under (simulated) Internet conditions. We will then focus our work on the system’s behavior under packet delays and packet loss.

Transmission Protocols For Audio Streaming:

End system applications do not implement all communication features but use existing communication protocols instead. Typically, we use a network protocol to forward datagrams across a network, and a transport protocol for end-to-end services. The combination of protocols is called the protocol stack. 

The well-known TCP protocol provides reliable data transmission. The receiver acknowledges received data to the sender, which can then retransmit lost data. Because of retransmission delays, however, TCP transmission is never continuous on an unreliable network. In contrast to TCP, UDP is unreliable by design and never retransmits lost data. For most streaming applications, packet loss is more acceptable than discontinuous presentation.
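One way a receiver can keep playback continuous despite lost UDP packets is to fill the gap with silence instead of waiting. The sketch below assumes fixed-size frames identified by consecutive sequence numbers; the function name and the zero-byte "silence" are simplifications for illustration, and real codecs usually apply more refined loss concealment.

```python
def conceal_losses(packets, first_seq, last_seq, frame_size):
    """Assemble a continuous byte stream, substituting silence for missing frames.

    packets maps sequence number -> frame bytes for the frames that arrived.
    """
    silence = bytes(frame_size)   # all-zero frame stands in for a lost packet
    return b"".join(
        packets.get(seq, silence) for seq in range(first_seq, last_seq + 1)
    )


# Frame 1 was lost in transit; playback continues with a short gap.
arrived = {0: b"aa", 2: b"cc"}
print(conceal_losses(arrived, 0, 2, 2))  # b'aa\x00\x00cc'
```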

You can use the Real-time Transport Protocol [8] on top of UDP.

It provides additional presentation information (timestamps, payload type) at the price of some further transmission and processing overhead. A combination of IP, UDP, and RTP adds a minimum protocol overhead of (20+8+12=) 40 bytes to each datagram.

While this is no problem for high-quality (= high-bandwidth) audio, it becomes significant for 5 Kbit/s streams, where a 20 ms datagram carries only about 12 bytes of payload. The datagram duration is therefore typically increased, e.g. to 100 ms, for very low bandwidth. (IP/UDP/RTP header compression to a total of two bytes per datagram is under discussion.)
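The overhead figures above can be reproduced with a short calculation. The helper names are invented for this sketch, and `rtp_header` packs only the fixed 12-byte RTP header (version 2, no extensions or CSRC entries) to show where the 12 comes from; the payload type value 14 is the static RTP type for MPEG audio.

```python
import struct

HEADER_BYTES = 20 + 8 + 12   # IPv4 + UDP + RTP minimum headers


def payload_bytes(bitrate_bps, packet_ms):
    """Audio bytes carried by one datagram covering packet_ms of sound."""
    return bitrate_bps * packet_ms / 1000 / 8


def header_share(bitrate_bps, packet_ms):
    """Fraction of each datagram that is protocol header rather than audio."""
    payload = payload_bytes(bitrate_bps, packet_ms)
    return HEADER_BYTES / (HEADER_BYTES + payload)


def rtp_header(payload_type, seq, timestamp, ssrc):
    """Fixed RTP header: flags, payload type, sequence number, timestamp, SSRC."""
    return struct.pack("!BBHII", 0x80, payload_type, seq, timestamp, ssrc)


print(payload_bytes(5000, 20))            # 12.5
print(f"{header_share(5000, 20):.0%}")    # 76%
print(f"{header_share(5000, 100):.0%}")   # 39%
print(len(rtp_header(14, 1, 0, 1234)))    # 12
```

At 5 Kbit/s, moving from 20 ms to 100 ms datagrams cuts the header share from roughly three quarters to well under half, at the cost of added latency.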

Requirements For A Streaming System For Internet Radio:

 The Components and How They Work Together:

An Internet Radio System as discussed here contains both a live stream and automatically updated audio archives. An audio streaming system for this service must include:

1. permanent real-time digitizing and encoding of the (analogue) live audio signal into a streamable format;

2. continuous updating of the audio archives by recording parts or the whole encoded live audio data on the hard disk;

3. a streaming server, capable of serving the live stream and the recorded streams;

4. streaming clients that receive and play the audio data stream on the client host’s audio system.

Audio streaming system

The audio source (e.g. a radio tuner) delivers the content to be re-broadcast on the net. Its output, an analog audio signal, is digitized and encoded in real-time by the digitizer (or analog-digital converter, ADC) and the Encoder (which can be hardware or software). The output of the Encoder is reduced digital data in a streamable format. The reflector duplicates this stream and passes the first copy to the streaming server. The streaming server transmits the data to the listening streaming clients if any are currently requesting the live stream. Otherwise, the data is discarded.

The second data stream passes to the recorder. The recorder stores the data on the hard disk according to a schedule that determines file names and start and end times. The recorder fills the archives with audio files of different lengths and names, depending on which content is to be stored. We call these audio files “recordings”. Whenever a streaming client requests a stored audio stream, the streaming server reads the audio data from the hard disk and transmits it to the client.
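The data flow between reflector, streaming server, and recorder can be sketched in a few small classes. All class and attribute names are invented for this illustration, clients are modeled as plain lists, and the recorder's schedule is reduced to a simple on/off flag.

```python
import io


class Reflector:
    """Duplicates every encoded packet to all attached sinks."""

    def __init__(self):
        self.sinks = []

    def attach(self, sink):
        self.sinks.append(sink)

    def push(self, packet):
        for sink in self.sinks:
            sink(packet)


class LiveServer:
    """Forwards packets to listening clients; with no clients the data is dropped."""

    def __init__(self):
        self.clients = []

    def __call__(self, packet):
        for client in self.clients:
            client.append(packet)


class Recorder:
    """Writes packets to storage while the (simplified) schedule is active."""

    def __init__(self, out):
        self.out = out
        self.active = False

    def __call__(self, packet):
        if self.active:
            self.out.write(packet)


reflector, server = Reflector(), LiveServer()
archive = io.BytesIO()            # stands in for a file on the hard disk
recorder = Recorder(archive)
reflector.attach(server)
reflector.attach(recorder)

listener = []
server.clients.append(listener)
recorder.active = True
reflector.push(b"frame-1")
print(listener, archive.getvalue())  # [b'frame-1'] b'frame-1'
```

Because both sinks receive their own copy, the same stream can be served live and archived at the same time.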

Further Requirements:

While working with several streaming systems, we learned that for practical use, fulfilling the above requirements is not enough. There are additional requirements that are essential for Internet Radio to work:

• The audio codec must produce sufficient quality at a very low bandwidth. Most Internet users receive a sustained data rate of 20kbps or less, even if they are connected to their Internet provider via ISDN. To reach them, an audio stream must not exceed that data rate while still providing acceptable audio quality.

• The streaming server must be capable of serving parts of audio files instead of whole recordings only. A long recording can then be divided into sections (which we call “clips”) just by creating pointers instead of physically cutting it. This makes “cutting”, automation, and correction much faster and easier.
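A clip defined by pointers needs nothing more than the recording's name and two time offsets. The sketch below assumes a constant-bitrate recording so that times map linearly to byte offsets; the `Clip` type, the file name, and the helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    recording: str    # file name of the full recording
    start_s: float    # clip start, seconds from the beginning of the recording
    end_s: float      # clip end, seconds from the beginning of the recording


def byte_range(clip, bitrate_bps):
    """Map a clip's time offsets to byte offsets, assuming constant bitrate."""
    bytes_per_second = bitrate_bps // 8
    return int(clip.start_s * bytes_per_second), int(clip.end_s * bytes_per_second)


news = Clip("morning_show.rec", start_s=10, end_s=25)
print(byte_range(news, 16000))  # (20000, 50000)
```

Correcting a clip boundary then means changing a number, not re-cutting and re-encoding the audio file.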

Commercial Streaming System:

Today, many systems exist for streaming audio and video through low-bandwidth networks such as the Internet. 

The Components

Many vendors offer the digitizer and the Encoder as a single hardware component. The software includes a recorder called the “Capture Tool”, a streaming audio server, and streaming clients. Some vendors offer no reflector. The Audio Transmitter turns out to be a 486 PC with a special audio capture/encoder ISA board, an ISA network interface board, and DOS software.

The Transmitter digitizes and encodes an analog audio signal. It transmits the encoded audio data as a stream of UDP packets into the LAN to which it is connected. Since the Transmitter comes with no hard disk, it boots from a floppy disk containing DOS and ATRANS. A DOS-based, menu-driven configuration program lets the user specify parameters such as the desired audio codec, the Transmitter’s IP address, and the destination IP address and port of the output stream.

The Transmitter’s output stream is received either by the Capture Tool (to be stored and re-broadcast later) or by the server (to be re-broadcast immediately as a live stream). The Capture Tool receives and saves the Transmitter’s output stream as a recording. The tool takes two parameters: the duration of the recording and the recording’s file name.

The server, which is available for several UNIX systems and Windows NT, receives the Transmitter’s output stream and serves it as a live stream via the Internet/Intranet to listening clients. The server can also serve existing recordings as audio streams.

These recordings can be:

• standard MPEG Layer I or II audio files

• LBR files converted from Windows WAV format by DOS software available from Xing

• MPEG or LBR files encoded by the Transmitter and captured by the Capture Tool.

The server can serve clips (sections of recordings). It can also offer different content depending on the client’s requested bandwidth. To describe an audio clip, some service providers use small ASCII text files called “play files”. A play file defines a clip by giving the file name of a recording and the start and end times of the clip relative to the beginning of the recording.

This data can depend on the client’s requested bandwidth, making it possible to serve clips of different quality for each bandwidth. To listen to a Xing-based Internet radio, one needs the Player (the client software). It can be downloaded freely from Xing’s WWW site. It supports Windows 3.x and 95, Power Macintosh, Solaris, IRIX, and Linux.

Streaming System working

When working with the system, we found that the Capture Tool is not implemented well. It consumes nearly the full CPU power and must be restarted for every new audio file it creates, which leads to unpredictable delays. Besides, there is no reflector within the package, so a stream can be either broadcast live or stored for later use, not both. GMD implemented a reflector and a new recorder for Xing’s LBR format audio streams. The software runs on.

The recorder is programmable and stores the live stream coming from Xing’s Transmitter via the reflector into audio files according to the desired schedule. It consumes less than 1% of a Sun SS20 CPU (SuperSparc 2 at 60MHz). It does not need to be stopped and restarted to produce a new recording. Therefore, it can start faster and follow a time-based schedule more accurately.

With these extensions, the system meets the requirements described in chapter III. Its LBR audio codec delivers acceptable speech quality at data rates below 15kbps. It plays sections of recordings with reasonable accuracy. The client software supports many platforms, thus reaching a maximum number of listeners. Feedback emails show that the service is used worldwide with reportedly good audio quality.

Real Audio by Progressive Networks

In spring 1996, when we chose a system for our projects, RealAudio by Progressive Networks was already the market leader for Internet audio streaming, but the system we chose provided better audio quality at low bandwidth. With the announcement of version 3.0 in autumn 1996, RealAudio provides several new audio codecs that surpass those of Streamworks. For radio programs containing mainly speech, there is a codec at 6.5kbps delivering better audio quality than LBR at 11kbps. For music, jingles, etc., the 6.5kbps codec is no longer usable, but there are several other low-bandwidth codecs suitable for music as well.

The components

The RealAudio system comprises software only. In terms of Fig. 2, digitizer and Encoder are parts of a Windows PC. The digitizer may be any Windows MME compatible sound card (e.g. So while the Encoder is software by Progressive Networks (“RealAudio Encoder”). A fast 486 PC can digitize and encode audio in real-time. Again, the resulting audio stream is transmitted via a network interface to the streaming audio server (“RealAudio server”). The RealAudio Encoder can also store the audio data as a file, acting as a reflector and recorder at the same time.

The RealAudio Encoder is Windows software capable of encoding audio data directly from the digitizer to a live stream and a recording. It can also encode an existing WAV audio file into a streamable format. A DOS version makes it easy to encode the same WAV file into several streamable formats with different bandwidths and qualities using DOS batch processing. A Solaris CLI version encodes WAV and AU format audio files.

The RealAudio server is available for several platforms (Windows NT, Solaris). It can play live streams and sections of recordings with good accuracy and choose between files according to the bandwidth of the client’s connection. The RealAudio player can be downloaded freely for Windows (3.x/95), Macintosh, OS/2, and several UNIX systems. With its new version 3.0, RealAudio made significant progress.




