Saturday, April 28, 2007

Music and Computers

This article introduces the main concepts supporting the use of computers to record, generate, edit, process, mix and play sound, and music in particular. Published in: American Institute of Physics, Conference Proceedings, Volume 905, pp. 220-223, April 28, 2007.

Keywords: computer, sound, music, digital audio, sound synthesis, sequencers, MIDI.

Reprinted with permission. Copyright 2007, American Institute of Physics. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the American Institute of Physics.


Electronic, digital computers are devices designed to represent objects. They are made up of a Central Processing Unit (CPU), central and external memory, and some elements to interact with the end user: I/O peripherals like the keyboard, the screen or the printer. At a basic functional level, the memory is made up of a finite number of elements, each of which can be in one of two possible states and is said to hold one bit of information. Duly organized in packets of a certain size and with an associated address, different combinations of the states of these bits can be mapped very flexibly to different sets of objects (numbers in digit form, for example).
When these representations are the elements to be transformed, or the result of those transformations, they are called data. When the objects represented are the operations to be performed on the data, they are called instructions and are usually grouped in programs. Some of these programs manage the system's resources as a whole, and they are said to constitute the Operating System of the computer [1].
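As a concrete illustration of this mapping, the following Python sketch shows how the same group of bits can stand for different objects, text or a number, depending on the interpretation chosen (the byte values are arbitrary examples):

```python
import struct

# The same 4 bytes in memory can represent different objects,
# depending on the mapping the program applies to them.
raw = b"\x41\x42\x43\x44"

as_text = raw.decode("ascii")               # interpreted as 4 characters
as_uint = struct.unpack(">I", raw)[0]       # interpreted as one 32-bit integer
as_bits = "".join(f"{b:08b}" for b in raw)  # the underlying 32 bits

print(as_text)   # ABCD
print(as_uint)   # 1094861636
print(len(as_bits), "bits")
```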
Instructions and data are fetched from memory into the CPU, where operations are performed in synchronization with an electronic clock. The external memory is usually larger than the central one, and does not need in general to be powered to keep the data.

Representable numbers in digit form

Owing to the limited number of bits in any computer, when the objects to be represented are numbers in digit form (i.e., pi as 3.14159... and not as “pi”, for example), only a finite quantity of numbers, each with a finite quantity of decimals, is representable; in other words, only a finite subset of Q is representable. We will refer to this set as Q*. Because different computers have different memory capacities, and because there are different ways to represent numbers using bits, the elements of Q* and the quantity of representable numbers will in general depend on the particular computer system under consideration.
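The effect of working inside Q* can be observed directly in any language with standard 64-bit floating-point numbers, as in this Python sketch:

```python
import sys

# 64-bit floats are one common choice of the finite set Q*.
# 0.1 is not a member of Q*, so both sides below get rounded to
# nearby representable numbers and the familiar identity fails:
a = 0.1 + 0.2
print(a == 0.3)              # False
print(abs(a - 0.3) < 1e-15)  # True: the rounding error is tiny but nonzero

# Q* also has a largest element, beyond which arithmetic overflows:
print(sys.float_info.max)
```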
For the purpose of this paper, we are interested in the possibilities of computers to record, create, edit, process, mix and play music, and as music is sound with a particular structure, we will first turn our attention to it.


Sound is the name for pressure waves in elastic media like air, water or certain solids. For these sound waves to be audible, their frequency must be in the range 20 Hz to 20,000 Hz. In air, for example, the variations happen around the atmospheric pressure, whose standard value is 100 kPa. An additional condition for sound to be audible concerns the amplitude of the pressure wave, which must be above 20 μPa rms. At the other end, amplitudes of around 65 Pa can damage the listener’s ears. By means of microphones, sound waves can be turned into electrical waves, and these back into sound with the help of amplifiers and loudspeakers.
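These amplitude limits are commonly expressed in decibels relative to the 20 μPa threshold; a minimal Python sketch of the standard conversion:

```python
import math

P_REF = 20e-6  # 20 micropascal rms, the threshold of hearing

def spl_db(pressure_rms):
    """Sound pressure level in dB relative to 20 uPa rms."""
    return 20 * math.log10(pressure_rms / P_REF)

print(round(spl_db(20e-6)))  # 0 dB: just audible
print(round(spl_db(65.0)))   # 130 dB: can damage the ears
```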
Theoretically, sound pressure (and its associated voltage signal) can be represented as a continuous function of time, typically assigning 0 to the value around which the pressure oscillates, with positive portions corresponding to overpressures and negative ones to underpressures. With amplitude and time taking values in R, this continuous representation relies upon the theory of real numbers, which states that some real numbers (the so-called irrational ones) can only be represented in digit form as a non-repeating, arbitrarily long sequence of digits in a particular base.
Furthermore, continuous functions like the one described have input and output sets that also contain arbitrarily large quantities of both irrational and rational numbers. This is the limit case of the more practical situations in which, due in part to limitations in the equipment used to convert pressure into voltage, both the time and the voltage actually take values in just another finite subset of Q, which we will call Q**. Though Q** is obviously smaller than R, it is in general much larger than Q*. In practice, this means that these so-called analog signals can’t be represented on computers.

Digital vs. Analog Sound Signals

A possible solution is to give up some precision by transforming the analog signal into another one, a digital signal, one in which both the time and the amplitude take values only in Q*. Signal Theory shows that this does work if certain conditions –easily met by current technology– are satisfied. In practice, to record sound on a computer, the transformation implies the use of an analog-to-digital converter (ADC), a system usually contained in a single integrated electronic circuit connected to the computer. The ADC takes samples of the analog signal (produced by a microphone, an electric guitar or an electronic keyboard, for example) at a rate that is at least twice the highest frequency component in the signal, and rounds off the values obtained so that all of them become members of Q*.
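The sampling-and-rounding step can be sketched in software; the 440 Hz test tone and the parameter values below are illustrative, not those of any particular converter:

```python
import math

SAMPLE_RATE = 44100   # Hz, more than twice the 20 kHz audibility limit
BITS = 16             # each sample is rounded to one of 2**16 levels
FREQ = 440.0          # an A4 test tone standing in for the analog input

def adc(duration):
    """Sample and quantize a unit-amplitude sine, as an ADC would."""
    n = int(SAMPLE_RATE * duration)
    max_level = 2 ** (BITS - 1) - 1   # 32767 for 16-bit audio
    return [round(math.sin(2 * math.pi * FREQ * t / SAMPLE_RATE) * max_level)
            for t in range(n)]

samples = adc(0.01)           # 10 ms of audio
print(len(samples))           # 441 samples
print(max(samples) <= 32767)  # True: every value fits in 16 bits
```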
Note that the number of samples taken, which depends both on the sample rate and the duration of the signal, must also fit in the computer’s memory. A related process is the construction of an analog signal from the original samples by means of a digital-to-analog converter (DAC), which is carried out by holding each sample for the duration of the interval of time that separates samples, producing in this way a staircase-like signal. This signal is later smoothed out by means of a filter. If the conditions mentioned above are met, the human ear will be unable to tell the original signal from the reconstructed one.
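The hold step of the DAC can be sketched in a few lines of Python (the coarse three-sample input, and the finer grid standing in for continuous time, are purely illustrative):

```python
def zero_order_hold(samples, factor):
    """Build the staircase signal a DAC produces by holding each
    sample for the whole inter-sample interval; `factor` points of
    a finer time grid stand in for continuous time."""
    out = []
    for s in samples:
        out.extend([s] * factor)
    return out

stair = zero_order_hold([0, 3, -2], 4)
print(stair)  # [0, 0, 0, 0, 3, 3, 3, 3, -2, -2, -2, -2]
```

In a real DAC, the smoothing filter mentioned above then removes the sharp steps of this staircase.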


Once in memory, a suite of programs allows us to edit the signal (erase, split and splice portions of it) and to process it with Digital Signal Processing (DSP) techniques, adding effects like filtering, reverberation, delay, etc., all in order to achieve a particular musical result. Other interesting operations are described below.
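As one example of such processing, a basic feedback delay can be written in a few lines of Python (the function name and parameters are illustrative, not any particular DSP library's API):

```python
def add_delay(samples, delay_samples, feedback):
    """Mix each sample with an attenuated copy of the signal from
    `delay_samples` earlier -- a basic digital delay effect. Because
    the delayed copy is read from the output, echoes feed back."""
    out = list(samples)
    for i in range(delay_samples, len(out)):
        out[i] += feedback * out[i - delay_samples]
    return out

dry = [1.0, 0.0, 0.0, 0.0]
wet = add_delay(dry, 2, 0.5)
print(wet)  # [1.0, 0.0, 0.5, 0.0]: the impulse echoes 2 samples later
```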

Sampling and Sequencing

We can also sample the different notes of a particular instrument, tweak them if necessary and organize the whole into a sound library. Once in the computer’s memory, samples can be triggered to produce music, thus playing the sounds of that instrument from the computer. This can be achieved either by sending the triggering messages live from a keyboard (see MIDI below), or by having a computer program known as a sequencer automate the recording and dispatching of the triggering messages.
When we have samples from several instruments held in memory in this way, a sequencer can trigger them in a timely manner, thus opening the door to compositions for several instruments. A sequencer organizes visually the parts of the intervening instruments as a stack of tracks, and provides many tools for the composer to record, arrange and mix a piece of music. Sequencers, samplers and sound libraries are common nowadays among musicians who use computers [2].
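A sketch of how a sequencer might hold such tracks: each track is a list of (time in beats, note, velocity) tuples, and playback merges all tracks in time order. The names and event format are illustrative, not any real sequencer's internals:

```python
def merge_tracks(tracks):
    """Return the events of all tracks sorted by their time stamp,
    ready to be dispatched one by one during playback."""
    events = [(t, name, note, vel)
              for name, track in tracks.items()
              for (t, note, vel) in track]
    return sorted(events)

tracks = {
    "bass":  [(0.0, 36, 100), (2.0, 36, 100)],
    "piano": [(0.0, 60, 80), (1.0, 64, 80), (2.0, 67, 80)],
}
for event in merge_tracks(tracks):
    print(event)   # (time, track, note, velocity) in playback order
```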

Sound Synthesis

Another interesting possibility is the programmatic generation of sound signals, i.e., the synthesis of sound via the direct calculation of the value of each sample in turn, followed by the corresponding DAC conversion. The calculations may be based on a particular physical model, for example the differential equation that governs the vibration of a string, or may stray from physical constraints to produce a rich variety of sounds. One could apply, for instance, subtractive synthesis to a signal having a rich spectral content; by removing certain frequencies with the help of filters, one can end up obtaining very interesting sounds. Additive synthesis, on the other hand, proceeds by adding several simple signals to get a final sound.
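Additive synthesis lends itself to a very short sketch: sum sine partials at integer multiples of a fundamental, each with its own amplitude. The partial amplitudes below are illustrative choices, not a model of any real instrument:

```python
import math

SAMPLE_RATE = 44100  # Hz

def additive(freq, harmonics, n_samples):
    """Additive synthesis: sum sine partials at integer multiples
    of `freq`; `harmonics` is a list of (multiple, amplitude) pairs."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        out.append(sum(a * math.sin(2 * math.pi * freq * k * t)
                       for k, a in harmonics))
    return out

# Fundamental at 220 Hz plus two weaker partials.
tone = additive(220.0, [(1, 1.0), (2, 0.5), (3, 0.25)], 1000)
print(len(tone))  # 1000 samples, ready for DAC conversion
print(tone[0])    # 0.0: every sine starts at zero
```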


Good standards are always a boon to users and manufacturers. Back in the seventies, electronic musical equipment was becoming increasingly affordable [3]. The need to make keyboards, synthesizers, sequencers and computers –all from different manufacturers– talk to each other in a standard way gave birth to the MIDI specification in 1983. MIDI is an acronym for Musical Instrument Digital Interface, and refers to a suite of specifications covering the way devices connect to each other and the kind of messages they exchange.
For example, a keyboard connected to a computer sends the MIDI message note-on each time a key is depressed. The message carries several numbers that identify the note assigned to the key, the intensity with which the note was played, and a channel of communication from among 16 possible ones [4]. If there is a synthesizer or a sampler player program running on the computer, tuned to the channel specified by the message, it will trigger the corresponding note with the corresponding intensity. After conversion by the DAC, its sound can be heard through loudspeakers or headphones connected to the computer.
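Per the MIDI specification, a note-on message is three bytes: a status byte (0x90 plus the channel number, 0-15) followed by the note and velocity, each in the range 0-127. A minimal Python sketch:

```python
def note_on(channel, note, velocity):
    """Build the three bytes of a MIDI note-on message:
    status byte 0x90 | channel (0-15), then note and velocity
    (each 0-127)."""
    assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
    return bytes([0x90 | channel, note, velocity])

msg = note_on(0, 60, 100)   # middle C, moderately loud, channel 1
print(msg.hex())            # 903c64
```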
If there is instead a sequencer program running, the message can be time-stamped according to the sequencer’s clock and recorded in the computer’s memory. As each note-on message is followed by the corresponding note-off message when the key is released, the sequencer can easily figure out the duration of the sound from the time-stamping information of both messages, and optionally display that duration in a score as a particular note.
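The pairing of time-stamped note-on and note-off messages can be sketched as follows (the event format and names are illustrative, and the sketch assumes one pending note per pitch):

```python
def note_durations(events):
    """Pair each time-stamped note-on with the following note-off
    for the same note, the way a sequencer derives note lengths.
    Returns (note, start, duration) entries."""
    pending = {}  # note -> time of its note-on
    out = []
    for time, kind, note in events:
        if kind == "on":
            pending[note] = time
        elif kind == "off" and note in pending:
            start = pending.pop(note)
            out.append((note, start, time - start))
    return out

events = [(0.0, "on", 60), (0.5, "off", 60),
          (0.5, "on", 64), (1.5, "off", 64)]
print(note_durations(events))  # [(60, 0.0, 0.5), (64, 0.5, 1.0)]
```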
The sequencer is thus the heart of the computer-based recording studio. Besides providing a useful visual interface to the user and the possibility to record and play MIDI messages from and to external devices, high-end modern sequencers come with integrated samplers and programmable synthesizers like the ones described in this paper. Many are also capable of recording and reproducing digital audio, thus integrating in a single environment all elements conducive to modern music production.

On the downside, if the computer is not fast enough, the overhead of handling many tracks of MIDI and audio data (which includes their retrieval from memory, processing, mixing, etc.) may render the overall process so slow that the computer delivers the music not in real time, but with some latency. For a fixed workload, the delays involved get shorter with each improvement in technology.


The author wishes to thank Professor Antonio Alfonso Faus for his kind support on the occasion of the 8th International Symposium “Frontiers of Fundamental Physics”, held in Madrid in October 2006.


  1. Gregorio Fernández Fernández and Fernando Sáez Vacas, Fundamentos de los ordenadores, Vol. I, Madrid: E.T.S.I.T. Ciudad Universitaria, 1978.
  2. Miguel Palomo, El Estudio de Grabación Personal, Madrid: Amusic, 1995.
  4. Giulio Clementi, Non solo MIDI, Ancona (Italy): Berben Edizioni Musicali, 1989.
