Sound is recorded as samples of the volume (in decibels) of the sound an object makes over time.
- Analog - Infinite resolution
- Digital - Fixed resolution
- Analog to Digital Conversion - Digitize
Types of sounds:
- MIDI - device dependent (instrument sounds), small file size.
  - It contains a music score, like for a player piano.
  - It has a lookup table for an instrument's notes.
- Digital Audio - large file size, device independent
  - Digital audio sound is sampled.
  - 1 kHz = 1000 Hz, or 1000 samples per second.
  - Popular sample rates are 44.1 kHz (CD quality), 22 kHz, and 11 kHz.
  - Samples typically have a resolution of 8 or 16 bits per sample.
  - At 1 or 2 bytes per sample, sound takes up a lot of room.
  - If you record in stereo, you need 2 channels of sound (left and right),
    so you store two samples for every sample taken.
  - Quantization is the rounding off of values to fit within the sample
    resolution.
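As a rough illustration of quantization, the sketch below rounds a sample to the nearest level that fits in a given bit resolution. The function name `quantize`, the input range of -1.0 to 1.0, and the signed, symmetric integer range are all assumptions made for this example, not something the notes specify.

```python
def quantize(value: float, bits: int) -> int:
    """Round a sample in [-1.0, 1.0] to the nearest signed integer level.

    Assumption: levels run from -(2**(bits-1) - 1) to +(2**(bits-1) - 1).
    """
    max_level = 2 ** (bits - 1) - 1          # 127 for 8 bits, 32767 for 16 bits
    clipped = max(-1.0, min(1.0, value))     # keep the input inside the valid range
    return round(clipped * max_level)        # rounding is where precision is lost


print(quantize(0.3, 8))    # coarse: few levels available
print(quantize(0.3, 16))   # finer: many more levels available
```

The same input lands on a much coarser grid at 8 bits than at 16 bits, which is why higher resolutions sound cleaner but take more room.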
Audio calculations for an uncompressed audio file:
sample rate x length in seconds x (bit resolution / 8) x channels of sound
10 seconds of 22 kHz, 8-bit audio in mono sound is:
22,000 x 10 x (8 / 8) x 1 = 220,000 bytes
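The calculation above can be sketched as a small Python function. The function and parameter names are made up for this example, and it assumes whole seconds and a bit resolution that is a multiple of 8.

```python
def uncompressed_size_bytes(sample_rate_hz: int, seconds: int,
                            bits_per_sample: int, channels: int) -> int:
    """sample rate x seconds x (bit resolution / 8) x channels."""
    bytes_per_sample = bits_per_sample // 8   # 8 bits -> 1 byte, 16 bits -> 2 bytes
    return sample_rate_hz * seconds * bytes_per_sample * channels


# The example from the notes: 10 seconds of 22 kHz, 8-bit mono audio.
print(uncompressed_size_bytes(22_000, 10, 8, 1))    # 220000

# One minute of CD-quality audio: 44.1 kHz, 16-bit, stereo.
print(uncompressed_size_bytes(44_100, 60, 16, 2))   # 10584000
```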
Audio compression - generally lossy (you lose data)
- Reasons for compression
- DPCM (Differential Pulse Code Modulation) encoding stores only the differences
  between samples, rather than the absolute values.
- µ-Law (Mu-Law) reduces the number of bits needed to store samples.
  It does so by rounding the data off to take fewer bits.
  In this example, to record at 13 bits but save as 7 bits, you could do the following.
  Bit 5 is the start of the offset: a highest bit at position 5 has offset 0,
  and each position above 5 adds 1 to the offset.
  - Find the offset of the highest bit turned on.
  - Store the offset in 3 bits.
  - Copy the next 4 data bits.

13-bit input:
  12 | 11 | 10 |  9 |  8 |  7 |  6 |       5        |  4 |  3 |  2 |  1 |  0
     |    |    |    |    |    |    | Highest Bit On | D3 | D2 | D1 | D0 |  X
7-bit output:
           6 | 5 | 4     |  3 |  2 |  1 |  0
Offset Of Highest Bit On | D3 | D2 | D1 | D0
So for example:
  12 | 11 | 10 |  9 |  8 |  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0
   0 |  0 |  0 |  0 |  0 |  0 |  1 |  0 |  1 |  0 |  1 |  0 |  0
Becomes:
   6 |  5 |  4 |  3 |  2 |  1 |  0
   0 |  0 |  1 |  0 |  1 |  0 |  1
Bit 6 is the highest bit turned on. It is 1 away from bit 5, so the offset value
is 001. Then I copy over the next 4 bits (0101) to fill up the 4 data bits.
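The 13-bit to 7-bit example above can be sketched in Python. This is a simplified illustration of the scheme in these notes, not the standard µ-law codec. The function name `compress_13_to_7` is hypothetical, and the notes do not say what happens when no bit at position 5 or higher is on; treating that case as offset 0 is an assumption of this sketch.

```python
def compress_13_to_7(sample: int) -> int:
    """Pack a 13-bit value into 7 bits: a 3-bit offset followed by 4 data bits."""
    if not 0 <= sample < 2 ** 13:
        raise ValueError("sample must fit in 13 bits")

    highest = sample.bit_length() - 1      # position of the highest bit turned on (-1 for 0)
    # Bit 5 is offset 0, bit 6 is offset 1, ... bit 12 is offset 7.
    # Assumption: anything below bit 5 is also stored with offset 0.
    offset = max(highest - 5, 0)

    # The 4 data bits sit directly below the highest bit; lower bits are dropped.
    data = (sample >> (offset + 1)) & 0b1111

    return (offset << 4) | data


# The worked example from the notes: 0000001010100 becomes 0010101.
print(format(compress_13_to_7(0b0000001010100), "07b"))
```

Everything below the 4 copied data bits is thrown away, which is where the rounding (and the loss) comes from.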
Codecs are used to encode/decode the files. Make sure you use a standard codec
available on the user's PC.
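To give a feel for what an encode/decode pair does, here is a minimal sketch of the DPCM idea mentioned earlier: store the differences between samples and rebuild the values by adding them back up. The function names are made up for this example. It also keeps the differences exact, whereas a real DPCM codec would typically round them to fewer bits, so treat it as an illustration only.

```python
def dpcm_encode(samples: list[int]) -> list[int]:
    """Store each sample as its difference from the previous one (first from 0)."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas


def dpcm_decode(deltas: list[int]) -> list[int]:
    """Rebuild the absolute values by keeping a running total of the differences."""
    total = 0
    samples = []
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples


encoded = dpcm_encode([100, 102, 105, 103])
print(encoded)                 # [100, 2, 3, -2]
print(dpcm_decode(encoded))    # [100, 102, 105, 103]
```

Because neighboring samples tend to be close together, the differences are usually small numbers that need fewer bits than the absolute values.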
Streaming Media
MP3 (Info gotten from here)
As a form of compression, MP3 is based on a psycho-acoustic model which recognizes
that the human ear cannot hear all the audio frequencies on a recording. The
human hearing range runs from 20 Hz to 20 kHz, and it is most sensitive between
2 and 4 kHz. When sound is compressed into an MP3 file, an attempt is made to
get rid of the frequencies that can't be heard. As such, this is known as 'destructive'
compression. After a file is compressed, the data that is eliminated in the
creation of the MP3 cannot be recovered.
When encoding a file into MP3, a variety of bit rates can be set.
For instance, an MP3 encoded at 128 kbit/s will be of higher quality and
larger file size than one encoded at 56 kbit/s. The lower the bit rate,
the lower the sound quality and the smaller the file.
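Assuming the 128 and 56 figures above are constant bit rates in kilobits per second (1 kbit = 1000 bits), the approximate file size can be estimated as below. The function name is hypothetical, and the estimate ignores file headers and tags.

```python
def mp3_size_bytes(bitrate_kbps: int, seconds: int) -> int:
    """Approximate size: bits per second x seconds, divided by 8 bits per byte."""
    return bitrate_kbps * 1000 * seconds // 8


# One minute of audio at the two bit rates mentioned above.
print(mp3_size_bytes(128, 60))   # 960000
print(mp3_size_bytes(56, 60))    # 420000
```

For comparison, one minute of uncompressed CD-quality stereo audio is 44,100 x 60 x 2 x 2 = 10,584,000 bytes, so the 128 kbit/s file is roughly 11 times smaller.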
Details on how MP3s work
Text to Speech - Download