Many relations have been discovered in recent years between fractals and music. This paper will cover some of the research that has been done on these relations, including some of the controversies over conflicting discoveries. It will show how some artists are currently using fractals to generate the basic melodies in their compositions. A few computer programs which can be used to generate these melodies will be discussed. This paper will also examine possible practical uses for software which can be used to hear patterns in fractals. The algorithms behind the music-generating programs will be discussed, as well as some of the basic relations between mathematics and music. Some of the work of music theorist Joseph Schillinger, whose ideas in the 1920's and 1930's about generating music by recursive and chaotic means were far ahead of their time, will also be covered.
The mathematical study of music is certainly nothing new, dating back to ancient Greece. Around the 5th century BC, the Pythagoreans formulated a scientific approach to music, expressing musical intervals as numeric proportions. This was probably done by observing the tones produced by plucked strings of different lengths; for example, the tone produced by a string held at the middle is an octave higher than that of the whole string. They went on to calculate the intervals for several different scales, including the chromatic and diatonic scales. Archytas of Tarentum, a Pythagorean mathematician who lived around 400-350 BC, was even able to work out the relationships between notes in the enharmonic scale, which includes quarter tones. (Britannica)
The most rigorous mathematical study of music in more recent years would be the system formulated by Joseph Schillinger in the 1920's and 1930's. Schillinger, a Russian-American music theorist, was obsessed with developing a "scientification" of music through mathematics. In 1941 he published The Schillinger System of Musical Composition, a massive twelve-book work that was years ahead of its time. This system has been described as "a sort of computer music before the computer," since his work presaged many developments of algorithmic composition which would not be expanded upon until decades later (Degazio).
Book 1 of Schillinger's system dealt primarily with the theory of rhythm. Schillinger had the idea that the most basic rhythms in music could be generated by "the interference of two synchronized monomial periodicities." Simple rhythms could be found by superimposing two waves of different periodicities and forming a new wave that contained the attacks of both waves. For example, a wave of period three would be combined with a wave of period four. If these waves were divided into twelfths (12 being the least common multiple of 3 and 4), the first wave would consist of three notes of length four and the second wave would consist of four notes of length three. When the attacks of these waves are combined into one wave, the resultant is a note of length 3, a note of length 1, a note of length 2, a note of length 2, a note of length 1, and a note of length 3. Schillinger illustrates this technique graphically, as in Figure 1 below.
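The interference of two periodicities described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Schillinger's own notation; the function name resultant_rhythm is hypothetical, and the two arguments are the spacings between attacks in each wave.

```python
from math import gcd

def resultant_rhythm(a, b):
    """Durations between successive attacks when two pulse trains are superimposed.

    One train attacks every `a` units and the other every `b` units; a full
    cycle lasts lcm(a, b) units.
    """
    cycle = a * b // gcd(a, b)  # least common multiple
    # Collect the attack times of both trains, dropping duplicates.
    attacks = sorted(set(range(0, cycle, a)) | set(range(0, cycle, b)))
    attacks.append(cycle)  # close the cycle so the last duration is counted
    return [later - earlier for earlier, later in zip(attacks, attacks[1:])]

print(resultant_rhythm(3, 4))  # [3, 1, 2, 2, 1, 3]
```

The result for 3 against 4 matches the pattern given in the text, and the durations always add up to the length of the full cycle.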
Another way Schillinger produced rhythmical patterns was by means of distributive powers. He would take a series of fractions that added up to 1 in the form (a+b), then square it to come up with a new pattern. Since a+b=1, then (a+b)^2 = a^2 + ab + ab + b^2 = 1. For example, starting with a 2-1 pattern, we form the binomial (2/3 + 1/3). When we square this we get (2/3 + 1/3)^2 = 4/9 + 2/9 + 2/9 + 1/9, which gives rise to a new 4-2-2-1 rhythmic pattern. Schillinger also squared and cubed trinomials and higher order polynomials that added up to 1. This led to even richer rhythmic patterns. Schillinger also noted that the squares of (1/4 + 2/4 + 1/4) and (1/4 + 1/4 + 2/4) formed patterns that had been used intuitively by many classical composers. Using Schillinger's method, composers could simply multiply out polynomials until they found a rhythmic pattern that seemed interesting. (Schillinger System...) A later book by Schillinger, The Mathematical Basis of the Arts, contains more than a hundred pages of these patterns in the appendix.
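The multiplying-out step can be sketched as follows. The function name distributive_power is hypothetical, and the order in which the product terms are listed (first factor varying slowest) is an assumption; Schillinger's own tables may arrange the terms differently. Numerators are kept as whole numbers so that the pattern reads directly as durations.

```python
from itertools import product
from math import prod

def distributive_power(pattern, power=2):
    """Expand (p1 + p2 + ...)**power term by term, keeping every product.

    Returns the list of durations and the number of units in the whole,
    so each duration is durations[i] / unit of the total length.
    """
    durations = [prod(combo) for combo in product(pattern, repeat=power)]
    unit = sum(pattern) ** power
    return durations, unit

print(distributive_power([2, 1]))     # ([4, 2, 2, 1], 9)
print(distributive_power([1, 2, 1]))  # ([1, 2, 1, 2, 4, 2, 1, 2, 1], 16)
```

The first line reproduces the 4-2-2-1 pattern (in ninths) from the text; the second squares the trinomial (1/4 + 2/4 + 1/4), giving nine durations in sixteenths.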
One interesting sidenote is the way Schillinger made a point about the multi-levelled character of music:
"There are two sides to the problem of melody: one deals with the sound wave itself and its physical components and with physiological reactions to it. The other deals with the structure of melody as a whole, and esthetic reactions to it.
Further analysis will show that this dualism is an illusion and is due to considerable quantitative differences. The shore-line of North America, for example, may be measured in astronomical, or in topographical, or in microscopic values." (Schillinger 229).
This same argument was made more than thirty years later by Benoit Mandelbrot, the founder of fractal geometry, when describing the fractal nature of a coastline and how the length seems to change depending on how finely it is measured. (Degazio)
1/f noise and music
In the mid-1970's, an even more general mathematical study of music was performed by Richard F. Voss and John Clarke at the University of California. This time, rather than studying the structure of the music as it is written, the researchers decided to study the actual physical sound of the music as it is played. This was accomplished by analyzing the audio signal which, in a stereo system, would correspond to the voltage used to drive the speakers. The signal was fed through a PDP-11 computer, which then measured a quantity called the spectral density.
Spectral density is often used in the analysis of random signals or noise, and is a useful characterization of the average behavior of any quantity varying in time. In technical terms, the spectral density S_V(f) of a quantity V(t) fluctuating with time t is a measure of the mean squared variation of V in a unit bandwidth centered on the frequency f. The average is usually taken over at least 30 periods. Another quantity, called the autocorrelation function, measures how the fluctuations in the signal are related to previous fluctuations.
The concepts of spectral density and autocorrelation are a bit difficult to grasp mathematically, but can be understood intuitively; Benoit Mandelbrot explains them in the following manner. If one takes a tape recorder and records a sound, then plays it faster or slower than normal, the character of the sound often changes considerably. Some sounds, however, will sound exactly the same as before if they are played at a different speed; one only has to adjust the volume to make it sound the same. These sounds are called "scaling sounds."
The simplest example of a scaling sound is white noise, which is commonly encountered as static on a radio. This is caused by the thermal noise produced by random motions of electrons through an electrical resistance. The autocorrelation function of white noise is zero for any nonzero time lag, since the fluctuations at one moment are unrelated to previous fluctuations. If white noise is recorded and played back at a different speed, it sounds pretty much the same: like a "colorless" hiss. In terms of spectral density, white noise has a spectral density of 1/f^0. (Gardner)
Another type of scaling sound is sometimes called Brownian noise because it is characteristic of Brownian motion, the random motion of small particles suspended in a liquid and set into motion by the thermal agitation of molecules. Brownian motion resembles a random walk in three dimensions. Since where a particle goes next does depend on its current position, Brownian motion is random but still highly correlated. Similarly, Brownian noise is much more correlated than white noise, since the fluctuations at a point in time do depend on previous fluctuations and cannot stray too far from them in too short a time. Brownian noise has a spectral density of 1/f^2.
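The difference in correlation between the two kinds of noise can be seen numerically. The sketch below is an illustration only: it builds white noise from a pseudo-random generator, builds Brownian noise as the running sum of white noise, and estimates the autocorrelation at a lag of one sample. The function names, the uniform step distribution, and the seed are all assumptions made for the example.

```python
import random

def white_noise(n, rng):
    """n independent samples; each is unrelated to the ones before it."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def brownian_noise(n, rng):
    """Running sum of white noise: each value is the previous value plus a random step."""
    position, walk = 0.0, []
    for step in white_noise(n, rng):
        position += step
        walk.append(position)
    return walk

def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of the series at the given lag (1.0 means perfectly correlated)."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return covariance / variance

rng = random.Random(0)  # fixed seed so the run is repeatable
white = white_noise(5000, rng)
brown = brownian_noise(5000, rng)
print("white:", round(lag_autocorrelation(white), 3))
print("brown:", round(lag_autocorrelation(brown), 3))
```

The white series should give a value near zero and the Brownian series a value near one, which is the sense in which Brownian noise is "highly correlated."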
Voss and Clarke analyzed several recordings of music and speech. They first analyzed the spectral density of the audio signal itself. This consisted of a series of sharp peaks between 100 Hz and 2 kHz, which was far from the kind of results they were seeking. Since they wanted to measure quantities that varied more slowly, they then examined a quantity they called the audio power of the music. This was proportional to the power delivered to the speakers rather than the voltage. The audio power seemed to show 1/f behavior, which is midway between white noise (1/f^0) and Brownian noise (1/f^2). This was interesting, since many other phenomena have been found to have a spectral density of 1/f, including electronic flicker noise, sunspot activity (Voss and Clarke), uncertainties in time as measured by an atomic clock, the wobbling of the Earth's axis, traffic flow on freeways (Gardner), and even the flood levels of the river Nile. (Mandelbrot)
Voss and Clarke found the 1/f behavior to hold even for completely different kinds of music. For example, they analyzed both Bach's First Brandenburg Concerto and a recording of Scott Joplin piano rags. They found that both exhibited 1/f behavior. They also tried recording the power fluctuations for three different radio stations: a rock station, a classical station, and a news and talk station. These seemed to demonstrate 1/f behavior as well.
Benoit Mandelbrot had an interesting explanation for the scaling behavior found by Voss and Clarke:
"The argument that I favor is that musical compositions are, as indicated by their name, composed: First, they subdivide into movements characterized by different overall tempos and/or levels of loudness. The movements subdivide further into the same fashion. And teachers insist that every piece of music be "composed" down into the shortest meaningful subdivisions. The result is bound to be scaling!" (Mandelbrot 375)
After Voss and Clarke found 1/f behavior in music, they decided to try applying these results in composing music using white, Brownian, and 1/f noises and compare the results. This composition technique was done in the following manner. A physical noise source was first used to provide a fluctuating voltage with the desired spectrum. This was done by using various electronic methods which could produce the desired noise. The voltages were then sampled, digitized, and stored in a computer as a series of numbers with a spectral density the same as the noise source. These numbers were then rounded and scaled and matched to notes over two octaves of a musical scale, matching the higher numbers to the notes with higher frequencies and the lower numbers to notes of lower frequencies. The process was then repeated, this time interpreting the numbers produced as durations of notes. Using a PDP-11 computer, they then turned these data into musical scores.
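The rounding-and-scaling step can be sketched as below. Voss and Clarke do not give their exact mapping here, so the linear rescaling, the function name map_to_notes, the choice of a C major scale, and the use of MIDI note numbers are all assumptions made for illustration.

```python
def map_to_notes(values, notes):
    """Linearly rescale values onto positions in `notes` (low value -> low note)."""
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid dividing by zero for a constant series
    top = len(notes) - 1
    return [notes[round((v - low) / span * top)] for v in values]

# Two octaves of C major as MIDI note numbers (an assumed scale).
TWO_OCTAVES = [60, 62, 64, 65, 67, 69, 71, 72, 74, 76, 77, 79, 81, 83, 84]

samples = [0.12, 0.80, 0.45, 0.05, 0.99, 0.50]  # stand-in for digitized voltages
print(map_to_notes(samples, TWO_OCTAVES))
```

The same function could be applied a second time with a list of note lengths in place of TWO_OCTAVES to turn a second series of numbers into durations.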
The three types of music were then played for several listeners, who made comments about the pieces. Most of them said that the white music was too random and the Brownian music was too correlated. The 1/f music, however, seemed to sound the most like regular music. Voss and Clarke used this as further evidence of the 1/f nature of music.
However, these results were not without controversy. Nearly twenty years after these results were published, a researcher named Nigel Nettheim criticized their methods. Nettheim was concerned with their use of long stretches of recorded material, such as the data they obtained through recordings of radio stations. These recordings, which often lasted up to twelve hours, included pieces by different composers, different styles of music, and spoken announcements and comments. Nettheim felt that a single musical piece is the largest unit of artistic significance. Doing his own research in a similar manner, he produced several instances where single movements of 18th and 19th century classical music exhibited results which were at variance with 1/f behavior.
Jean-Pierre Boon and two other researchers formulated a different technique for the quantitative analysis of music. Instead of analyzing recordings of music, the following technique was used. A synthesizer is interfaced with a computer and the printed score of the composition is played on the synthesizer by a musician. The pieces are digitally stored in the computer; this involves discretization of the pitch and duration. The smallest duration measured is the length of a 64th note triplet, and the pitches are scaled to the half-tone. The score is then converted into a time series X(t) (or a set of time series X(t), Y(t), Z(t) for multi-part scores). Data processing is then used to construct a phase portrait. For single part pieces, a time-delay method is used. These phase portraits are then used to compute the dimensionality, and the correlation function C(t) and the corresponding power spectrum S(f) can be calculated from the original time series. According to Boon, "The phase portrait constitutes a spatial representation (in the abstract phase space) of the temporal dynamics of the music piece reconstructed from the time series obtained from the pitch variations as a function of time." A third quantity, called the entropy, was also measured; this was a measure of how diverse the notes in the piece were and how often the composer used notes outside of the scale the piece was written in. For example, a piece where most of the notes were within the scale would have a low entropy. A few variations on this entropy measure were also examined.
Twenty-three musical pieces were selected for this type of analysis, including 19 classical pieces and 4 jazz pieces. Two other sequences were tested as well for comparison with the other pieces: repeated ascending and descending scales, and a sequence of random music based on a white noise algorithm.
The analysis of these pieces in this manner produced interesting results. First of all, the Hausdorff dimension Df of each piece, obtained by the box-counting method, ranged between 1.3 and 1.9 for most pieces; one notable exception was Epistrophy by Thelonious Monk, which measured 0.9400. The highest Hausdorff dimension for the music pieces (1.8600) was for Bach's Second suite for cello. By comparison, the Hausdorff dimension of the chromatic scale was 1.07, and the dimension of the 5000 note white music was 2.97. The values v for the spectral densities of the pieces 1/f^v were also computed; however, these results differed significantly from the results obtained by Voss and Clarke. This time, v seemed to vary between 1.79 and 1.97, which is closer to Brownian noise than 1/f noise. The difference was explained by the use of single pieces of music rather than long stretches. Boon emphasized that "if musical dynamics analysis is meant as a procedure to identify and characterize elements of musical significance, the single piece is the commonly recognized object to be studied. In this respect the meaning of long stretches of blended musical pieces is unclear." (Boon 507)
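The time-delay method for a single-part piece can be illustrated with a short sketch. The embedding dimension and the delay used by Boon are not stated here, so the defaults below are assumptions, as are the function name delay_embed and the sample pitch series (given as MIDI note numbers).

```python
def delay_embed(series, dimension=2, delay=1):
    """Build phase-portrait points (x[t], x[t+delay], ..., x[t+(dimension-1)*delay])."""
    last_start = len(series) - (dimension - 1) * delay
    return [
        tuple(series[t + k * delay] for k in range(dimension))
        for t in range(last_start)
    ]

pitches = [60, 62, 64, 65, 67, 65, 64, 62]  # a made-up pitch series X(t)
print(delay_embed(pitches, dimension=2, delay=1))
```

Each point pairs a pitch with the pitch that follows it, so a melody that moves by small steps traces a path close to the diagonal of the portrait, while large leaps land far from it.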
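A box-counting estimate of the kind mentioned above can be sketched for points in the plane. This is a generic version of the method, not a reconstruction of Boon's procedure: it assumes the points have already been scaled into the unit square, uses box sizes that are powers of 1/2, and fits the slope of log N against log(1/size) by least squares. The function name and the default range of box sizes are assumptions.

```python
from math import floor, log

def box_counting_dimension(points, exponents=range(1, 7)):
    """Slope of log(occupied boxes) against log(boxes per side) for box size 2**-k."""
    xs, ys = [], []
    for k in exponents:
        per_side = 2 ** k
        occupied = {(floor(x * per_side), floor(y * per_side)) for x, y in points}
        xs.append(log(per_side))
        ys.append(log(len(occupied)))
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

# Two sanity checks: a straight line and a filled square.
line = [(i / 10000, i / 10000) for i in range(10000)]
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]
print(box_counting_dimension(line), box_counting_dimension(square))
```

The line should come out with a dimension close to 1 and the filled square close to 2; a phase portrait of a piece of music would be expected to fall somewhere in between.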
Chaos and composition
Even though controversial, Voss and Clarke's research prompted a lot of interest in algorithmic composition. Voss himself tried several variations on composing 1/f music. His first experiments were with electronic noise, but he then tried other algorithms to produce the different types of music as well. White music and Brownian music are easily generated with dice or random number generators. One way to generate white music is to simply assign numbers to different notes and toss dice to determine what note is played. Brownian music can be generated in a similar manner. One way of doing this would be to start at a given note, then flip a coin to determine each following note; if the coin comes up heads, go up a step on the scale, and if it comes up tails, go down a step. However, algorithms for randomly generating 1/f music get to be a little more complicated. One relatively simple way to generate 1/f music is described in Martin Gardner's article in the April 1978 issue of Scientific American. First assign the numbers 3 through 18 to sixteen adjacent notes on a piano. Then obtain three six-sided dice, each of a different color. Then write the numbers zero through seven in binary notation, and assign a die color to each column, as in Figure 2 below. The first note in the melody is obtained by tossing all three dice and taking the sum. This is toss zero. For each successive toss, look at the chart and see which digits change as you go to the next number in base two, then toss only the dice corresponding to the digits which change. For example, on the chart below, we see that when we go from toss zero to toss one, the digit corresponding to the red die changes, so we would toss only the red die and take the new sum. Similarly, going from the fifth toss to the sixth toss, we see that both the digits corresponding to the red die and the green die change, so we would toss those two dice.
This algorithm produces a sequence that is between white and brown; while it is not exactly 1/f, it is so close that it is difficult to tell the difference between melodies generated in this manner and melodies generated with actual 1/f noise.
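The dice procedure translates directly into a short program. In the sketch below the dice are numbered by binary digit, with die 0 standing for the rightmost digit (the red die in the text's example) and die 1 for the next digit (the green die); the function names are hypothetical, and wrapping around to zero after toss seven is an assumption about how the chart is reused.

```python
import random

def dice_to_toss(toss_number, n_dice=3):
    """Dice to re-roll at this toss: the binary digits that change when
    counting from toss_number - 1 to toss_number (die 0 = rightmost digit)."""
    cycle = 2 ** n_dice
    previous = (toss_number - 1) % cycle
    current = toss_number % cycle
    changed = previous ^ current  # a 1 bit marks each digit that differs
    return [i for i in range(n_dice) if (changed >> i) & 1]

def dice_melody(length, rng, n_dice=3):
    """Sums of the dice after each toss; with three dice each sum is 3..18."""
    dice = [rng.randint(1, 6) for _ in range(n_dice)]  # toss zero: roll every die
    melody = [sum(dice)]
    for toss in range(1, length):
        for i in dice_to_toss(toss, n_dice):
            dice[i] = rng.randint(1, 6)
        melody.append(sum(dice))
    return melody

print(dice_to_toss(1), dice_to_toss(6))  # [0] [0, 1]
print(dice_melody(16, random.Random(1)))
```

Because the die for the rightmost digit is re-rolled at every toss and the die for the leftmost digit only every fourth toss, the sums change on several time scales at once, which is what places the sequence between white and Brownian behavior.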
Of course, Voss found that using natural 1/f phenomena provided a rich source for compositions; one of his compositions was even derived from the records of the annual flood levels of the Nile. He also tried experimenting with different scales; when he applied 1/f noise to a pentatonic scale, the resulting music resembled Oriental music.
Another composer who developed an interest in algorithmic composition using the mathematics of chaos theory is David Clark Little. After receiving a degree in chemistry, Little went on to study harpsichord and composition in the Netherlands. Little has based many of his compositions on mathematical models that demonstrate chaotic behavior. One of his pieces, Fractal Piano 6, uses the logistic equation as inspiration. Raw data from iterating the function was converted into pitches, durations, and loudnesses, then edited until it formed a pleasing piece of music. Another radical piece called Brain-Wave involves improvisation with at least three musicians. The musicians sit spread out around a music hall, each facing in an arbitrary direction, and begin playing. The musicians follow a set of guidelines for what to play according to what the other players are doing and where the other players are sitting relative to them, thus taking their cues from the other players. They essentially try to do things similar to what the players in front of them are doing and try to do the opposite of what the players behind them are doing. For example, if the player in front of them speeds up, they should also speed up; if the player behind them starts playing softer, then they should play louder. Little compares this type of composition with behavior found in the human brain. In the brain, there are both inhibitory and excitatory synapses, which decrease or increase the rate at which neurons send out impulses. Researchers have discovered chaotic behavior in the brain, and Little felt that the music of Brain-Wave could imitate this behavior. (Little)
Another composer, Gary Lee Nelson, used a unique twist on the logistic equation to compose a piece called The Voyage of the Golah Iota. The logistic equation, X = P*X*(1-X), was used as a source for the notes. Nelson's twist was to vary P during the course of the nine-minute piece between 1.0 and 4.0. P starts out at 1.0, then rises to 4.0, then falls back again to 1.0. The values of X were iterated during this time; the results were then scaled and mapped onto a seven octave pitch range. The piece takes advantage of the bifurcations that occur when P is increased and the eventual chaotic behavior. This piece was composed with a nautical metaphor in mind; a graph of the logistic equation resembles a rising and falling wave if P is increased and decreased. Nelson also offers further interpretation of his piece:
"The shape of my piece would mirror a voyage from the Mediterranean to South America on a mythical ship named the Golah Iota. This tiny galley would experience, through my sounds, a journey from the calm waters off North Africa through heavy seas and winds to the tranquil shores of what is now Brazil." (Nelson)
The "heavy seas" of course refer to the wildly chaotic behavior in the middle of the piece, as P rises from 3.6 to 4.0 and back down again.
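The idea of sweeping P while iterating the logistic equation can be sketched as follows. This is not Nelson's implementation: the linear rise and fall of P, the starting value of X, the number of steps, and the mapping onto MIDI note numbers 24 through 108 (seven octaves) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def logistic_sweep(n_steps, p_low=1.0, p_high=4.0, x0=0.5):
    """Iterate X <- P*X*(1-X) while P rises from p_low to p_high and falls back."""
    peak = (n_steps - 1) / 2 or 1  # step at which P reaches its maximum
    x, values = x0, []
    for step in range(n_steps):
        fraction = 1.0 - abs(step - peak) / peak  # 0 at both ends, 1 in the middle
        p = p_low + (p_high - p_low) * fraction
        x = p * x * (1.0 - x)
        values.append(x)
    return values

def to_pitches(values, lowest=24, octaves=7):
    """Scale values in [0, 1] onto a pitch range of the given number of octaves."""
    return [lowest + round(x * 12 * octaves) for x in values]

pitches = to_pitches(logistic_sweep(1000))
print(pitches[:10])
print(min(pitches), max(pitches))
```

Printing successive stretches of the pitch list shows the behavior the text describes: nearly constant values while P is small, alternation between a few values after the first bifurcations, and wide, irregular leaps near the middle of the sweep.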
Thanks to the efforts of a few software designers, algorithmic composition techniques are now available to anyone with a personal computer through a variety of programs. One such program is The Well-Tempered Fractal, written by Robert Greenhouse. Greenhouse started out writing a series of programs called Music From the Fringe. These were experimental programs designed to produce sounds from fractals. These programs initially produced fascinating yet decidedly non-musical wild howls and screeches. After playing around with these programs for a while, Greenhouse decided to try using fractals to produce something that sounded more like what most people would recognize as music. He accomplished this in The Well-Tempered Fractal.
The Well-Tempered Fractal contains ten built-in families of fractals for the user to choose from. Once a family is selected, the program generates some parameters at random for that family and begins to plot the points in the fractal with an iterative algorithm. As the points are plotted, notes are played which correspond to the points. This is done by dividing the region in which the fractals are plotted into a 16 x 16 grid and assigning notes to correspond to the x and y coordinates for points which fall within each square of the grid. The user can control several aspects of which notes are played; for example, the user can sample every nth note, or play only the notes which correspond to the x or y values. Another useful feature is that the user can choose from more than twenty different musical scales for the notes. This can give a different flavor to the music that is produced. While the music is playing, the user can record the notes and save them in a MIDI event file, which can later be converted into a standard MIDI file. The piece can then be manipulated in a sequencer and used as the inspiration for a musical composition. The author includes a few such MIDI file compositions in the program together with pictures of the fractals which created them.
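The grid idea can be sketched in a few lines. The program's actual fractal families and note assignments are not given here, so everything specific below is an assumption: the Henon map is used only as a stand-in iterative system, the plotting bounds are chosen by hand, the sixteen-note scale is arbitrary, and the function names are hypothetical.

```python
def grid_cell(x, y, bounds, size=16):
    """Return (column, row) of the grid square containing the point (x, y)."""
    x_min, x_max, y_min, y_max = bounds
    col = int((x - x_min) / (x_max - x_min) * size)
    row = int((y - y_min) / (y_max - y_min) * size)
    # Clamp so that points on or beyond the edge still land in a valid square.
    return max(0, min(col, size - 1)), max(0, min(row, size - 1))

def henon_points(n, a=1.4, b=0.3):
    """Stand-in fractal: n successive points of the Henon map starting at (0, 0)."""
    x, y, points = 0.0, 0.0, []
    for _ in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        points.append((x, y))
    return points

# Sixteen notes (MIDI numbers, assumed), one per grid column or row.
SCALE = [48, 50, 52, 53, 55, 57, 59, 60, 62, 64, 65, 67, 69, 71, 72, 74]
BOUNDS = (-1.5, 1.5, -0.45, 0.45)  # hand-picked plotting region

for x, y in henon_points(8):
    col, row = grid_cell(x, y, BOUNDS)
    print(SCALE[col], SCALE[row])  # one note from x, one from y
```

Playing only the first number of each pair corresponds to the option of hearing only the x values, and taking every nth point from henon_points corresponds to sampling every nth note.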
Greenhouse stresses that The Well-Tempered Fractal is not meant to be an automated composing system which allows one mindlessly to press a key and create music. His intention is for people to be able to use the program to produce raw data which can then be manipulated into an interesting musical piece. The main purpose is to use the program to get ideas for musical compositions. According to Greenhouse, "...the ideas which lead to the musical compositions are generated from fractals and as such may have a quality to them that would be hard for a traditional composer to generate." (Greenhouse)
Another program which uses an algorithmic composing process is Musinum, by Lars Kindermann. This program counts in a base selected by the user; as it counts, it takes the sum of the digits and plays a note corresponding to that sum: c for one, d for two, e for three, etc. The program also has the capability of counting by different step sizes. Different voices can be used for playing the notes, and up to twelve different voices can be played simultaneously, each with a different algorithm for coming up with the notes. Like The Well-Tempered Fractal (WTF), this program also has the capability of saving these tunes to a MIDI file.
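The digit-sum rule is simple enough to try directly. The sketch below is an illustration rather than the program's own code: starting the count at one, using a seven-note scale, and wrapping back to c after b are assumptions, and the function names are hypothetical.

```python
def digit_sum(n, base=2):
    """Sum of the digits of n when written in the given base."""
    total = 0
    while n > 0:
        n, digit = divmod(n, base)
        total += digit
    return total

NOTE_NAMES = ["c", "d", "e", "f", "g", "a", "b"]

def digit_sum_tune(length, base=2, step=1):
    """Note names for the counts step, 2*step, 3*step, ... in the given base."""
    tune = []
    for k in range(1, length + 1):
        s = digit_sum(k * step, base)
        tune.append(NOTE_NAMES[(s - 1) % len(NOTE_NAMES)])  # 1 -> c, 2 -> d, 3 -> e
    return tune

print(digit_sum_tune(8))          # ['c', 'c', 'd', 'c', 'd', 'd', 'e', 'c']
print(digit_sum_tune(8, base=3))
```

Changing the base or the step argument gives a different tune from the same rule, which is the kind of experimenting the program is built for.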
Programs such as WTF and Musinum are clearly useful for coming up with ideas for musical compositions, but this is certainly not the only possibility for their use. Hearing the way fractals are created adds another dimension to how we can perceive fractals, namely time. When we see a picture of a fractal that is created by one of the algorithms used in WTF, we are unable to see in what order these points were generated. WTF is an excellent tool for watching as the fractals are created and also hearing how they are created. If we were to get a sense for what notes corresponded to what sections on the grid, we could easily remember the approximate order in which the points are generated by simply memorizing the melody which corresponds to the fractal. This could give us more insight into how these algorithms operate.
It is also possible that we may be able to perceive some properties held by fractals more easily by hearing them rather than seeing them. Consider the difference between the sheet music to a song and an audio recording of someone playing the song as it is written. Both the sheet music and the recording contain the same information, but in this case it is easier to hear such things as regularities in the notes and rhythms by listening to the recording rather than looking at the sheet music. In the documentation for Musinum, Kindermann points out a few ways Musinum can be used in this manner. For example, the tune created when the program is set to binary with step size 1 is exactly the same as the tune created when the step size is set to two, four, eight, or any other power of two. This tells us that the sum of the digits of a number n in binary notation is the same as the sum of the digits of a number 2*n in binary notation. This mathematical fact is easy to come by analytically, of course, but it is even easier just to stumble across in Musinum. Kindermann also points out that the tunes generated by step sizes of the form 2^n + 1 and 2^n - 1 have similarities to each other as well. Playing around with Musinum can lead to such discoveries, thus prompting the user to try to discover the mathematical origins of such similarities.
Finally, another possible use for such programs would be as an aid for the visually impaired. Fractals are often studied in ways which are very visual, such as graphs and computer-generated pictures. Someone who has been blind since birth is unable to make use of many of these methods. Programs similar to WTF which turn fractals into sounds may someday be of great use in helping blind people get a better idea of what fractals are.
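The power-of-two observation has a one-line analytic justification. Writing s_2(n) for the sum of the binary digits of n (a notation introduced here only for this illustration), doubling n shifts every digit one place to the left and appends a zero, so the digits themselves are unchanged:

```latex
n = \sum_{i=0}^{k} b_i \, 2^{i}, \quad b_i \in \{0, 1\}
\;\Longrightarrow\;
2n = \sum_{i=0}^{k} b_i \, 2^{i+1},
\qquad\text{so}\qquad
s_2(2n) = \sum_{i=0}^{k} b_i = s_2(n).
```

Applying this m times gives s_2(2^m n) = s_2(n), so counting in binary by any step size 2^m produces the same sequence of digit sums, and hence the same tune, as counting by ones.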
Boon, Jean Pierre and Olivier Decroly. "Dynamical Systems Theory for Music Dynamics." Chaos 5(3) (1995): 501-508.
Degazio, Bruno. "Nikola Tesla and Joseph Schillinger: The Music of NT: The Man Who Invented the Twentieth Century." http://www-ks.rus.uni-stuttgart.de/people/schulz/fmusic/tesla.html
Gardner, Martin. "Mathematical Games." Scientific American April 1978: 16-32.
Greenhouse, Robert. "The Well-Tempered Fractal v3.0" (Documentation to WTF). Available on-line at http://www-ks.rus.uni-stuttgart.de/people/schulz/fmusic/wtf/docu.html
Greenhouse, Robert. Personal correspondence.
Kindermann, Lars. Musinum - The Music in the Numbers (Documentation to Musinum software). Available on-line at http://www.forwiss.uni-erlangen.de/~kinderma/musinum.html
Little, David Clark. "Composing with Chaos: Applications of a New Science for Music." Available on-line at http://www.xs4all.nl/~19521952/dcl/
Mandelbrot, Benoit. The Fractal Geometry of Nature. W.H. Freeman and Company, New York: 1977.
Nelson, Gary Lee. "Wind, Sand, and Sea Voyages: An Application of Granular Synthesis and Chaos to Musical Composition." Available on-line at http://www-ks.rus.uni-stuttgart.de/people/schulz/fmusic/gnelson.html
Schillinger, Joseph. The Schillinger System of Musical Composition Vol. 1. Carl Fischer, Inc., New York: 1941.
Schillinger, Joseph. The Mathematical Basis of the Arts. Philosophical Library, New York: 1948.
Voss, Richard F. and John Clarke. " '1/f noise' in Music: Music From 1/f Noise." J. Acoust. Soc. Am. 63(1) (1978): 258-261.