Diegetic Videogame-Music


An analysis of the level scores from Banjo-Kazooie


2016 March 26 - Tommy Dräger


Cohen’s studies on the correlation between visual and aural stimuli, in which participants rated the authenticity of a simple animation of a bouncing ball, shown either with or without diegetic sound, on a scale of one to five, produced one distinct result: those who saw the animation accompanied by diegetic sound perceived it as much more authentic. Cohen’s study is far from the only one to show how sound and visual media such as animation reinforce the perception of each other. For precisely this reason, this article addresses an element of videogames that is sometimes underestimated: videogame music.

This text draws on Zach Whalen’s article "Play Along - An Approach to Videogame Music", Chris Greening’s interview with composer Grant Kirkhope, Matthew Belinkie’s "Useful History of Game Music", and various studies and scientific articles by Chion and Cohen. It also borrows many examples from the references listed in the sources.

This text analyzes various aspects of the Banjo-Kazooie level music, such as instrumentation, intention, tempo and structure, with a focus on the use of instruments. Diegetic and nondiegetic music is a theme common to film and videogames, and it forms a bridge between film music and videogame music. Whalen wrote that early cartoon music and horror movies in particular introduced specific possibilities of diegesis which to this day serve as a reference for videogames (Zach Whalen, 2004).

I. Diegetic and nondiegetic music



It is possible to categorize a game’s music and sound into two groups: diegetic and nondiegetic. Like diegetic sounds, diegetic music simulates a part of the visual environment. Sue Morris once wrote that sound is used in a first-person shooter (FPS) in order to "put an aural complementary contrast to the action on the screen… and to give the impression of real physics." (Morris, 2002) To maintain the most authentic possible overview of the surrounding space, Morris argues, the player needs 360-degree aural feedback. A radio inside the game world, for example, is diegetic: it can be heard by the player character within the game. Diegetic game music also offers vast possibilities for conveying the intended impression of three-dimensional space.

II. Basic form of diegesis (and related forms)



Comparing videogames and films is a good way to analyze elements such as score music. The most fundamental common ground between the two media is that both rely on acoustic and visual signals to evoke the impression of three-dimensionality and diegesis.

A more obvious and more helpful comparison is the relationship between videogames and animation. Paul Ward (2002) made an interesting argument for videogames as a form of animation: both media aspire to a form of representation that is better described as an "emulation" than a "simulation", since videogames and animated films use the same production techniques.

Three-dimensional spatiality and diegesis in animated films present characters in a way that is perceived as authentic precisely when it is exaggerated. Cartoons, for example, use diegetic music to reinforce visual motion. Percussive music accompanies all sorts of violent or quick movements to give the motion more emphasis than would be possible in silence.

Studies on the perception of animation show that objects are perceived as "alive" and anthropomorphic in behaviour when their movements are synchronised with music (Cohen, 2000). This phenomenon points back to Chion’s (1994) poetic description of the close connection between sound and visible space in animated films, so that the two observations (Chion’s theory and Cohen’s research) support each other. The combination of diegetic musical signals and non-musical sound effects creates an illusion that lets us perceive an object as more lively than we would from a purely visual depiction of figures in movement.

III. Mickey Mousing



The professional term for this combination of animation and diegetic music is "Mickey Mousing". A familiar example is The Sorcerer’s Apprentice from Fantasia (1940), in which the sorcerer’s apprentice Mickey dreams that he summons an army of broomsticks and causes huge amounts of water to amass into a raging ocean.


(Figure 1)

The scene is accompanied by loud, clashing orchestral cymbals, symbolizing the foamy crashing of the breaking waves.

This practice goes back to the earliest days of animation, when theater pianists and other musicians accompanied silent films. "Mickey Mousing" comes into play in animated films when the music is meant to synchronise so closely with the action that it illustrates what is happening on screen (Neumeyer and Buhler, 2001).

An especially good example of the interplay between diegetic and nondiegetic music is Skeleton Dance (1929), an animated film set to a score by Carl Stalling, in which the transitions between diegetic and nondiegetic music are woven into a powerful story.


(Figure 2)

This picture shows a scene from Skeleton Dance (1929) in which two frightened cats watch a skeleton rise from the grave. As the skeleton emerges, we hear a rising D minor scale (Figure 2) played by strings, a very common pattern in cartoon music. Around the time the skeletons begin to dance, their footsteps are accompanied by a (harmonizing) D minor scale in a marimba timbre, which produces a hollow-sounding effect.

Another example, this time from a game, demonstrates the same type of synchronisation applied to a different medium: "The Mighty Jinjonator" (Banjo-Kazooie, 1998).


(Figure 3)

This image shows the evil witch receiving the finishing blow. The attack consists of several plummeting maneuvers, each accompanied by a tremendous A major chord, followed by an F minor chord (Figure 3), which is played by every instrument to give the attack added emphasis. "Even when I watch it now the big chords for when the Jinjonator slams into Grunty give me chills" (Grant Kirkhope, 2010). This quite sarcastic comment reinforces the intention of the boss fight’s focus on diegesis.

IV. Banjo-Kazooie



Video game music promotes and improves the narrative experience of video games: it is one of the elements directly responsible for the visual and aural impressions that allow the player to become immersed in a fictional world (Zach Whalen, 2004). The following is an analysis of exemplary characteristics of the score music from the worlds of the Nintendo 64 classic "Banjo-Kazooie", as the title contains a wide spectrum of diegetic score music. "It all comes down to the idea of Banjo and Kazooie being totally opposite characters and I wanted to reflect that in the music, hence the use of C major followed by F# major, the furthest point of any two notes being the tri-tone. I ended up going mad with that idea and using it everywhere in the Banjo games." (Grant Kirkhope, 2010). Kirkhope’s statement can be confirmed by observing the individual themes of the level scores. What is much more remarkable is that this game uses several of the aforementioned cartoon music techniques. In particular, the musical feedback of the collectable items and several of Banjo’s attacks make use of "Mickey Mousing".
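Kirkhope’s remark that the tritone is "the furthest point of any two notes" can be checked with a little arithmetic. The following is only a minimal sketch, assuming twelve-tone equal temperament and pitch classes numbered from C; the names PITCH_CLASSES and pitch_class_distance are made up for this illustration and do not come from the interview.

```python
# Pitch classes in twelve-tone equal temperament, numbered from C = 0.
# (Assumption for illustration: enharmonic spellings such as Gb are ignored.)
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class_distance(a: str, b: str) -> int:
    """Shortest distance in semitones between two pitch classes (0 to 6)."""
    up = (PITCH_CLASSES.index(b) - PITCH_CLASSES.index(a)) % 12
    # Going up or going down reaches the same pitch class; take the shorter way.
    return min(up, 12 - up)


# Distance from C to every pitch class, and the one that lies farthest away.
distances = {name: pitch_class_distance("C", name) for name in PITCH_CLASSES}
farthest = max(distances, key=distances.get)

print(distances["F#"])  # 6 semitones, i.e. a tritone
print(farthest)         # F#
```

Measured this way, no pitch class lies farther from C than F#, which matches the C major to F# major move Kirkhope describes.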

V. Instrument Analysis



The choice of instruments is limited to 16 in every single level (Grant Kirkhope, 2010). The analysis of all level scores shows that certain instruments stand out through their frequent reappearance: marimba, bassoon, strings, alto saxophone, piccolo, trombone, synthesizer, harp, glockenspiel, banjo, trumpet, and clarinet. The instruments are evenly distributed across the areas of the game world, and their number tends to be reduced to five instruments per area (see Figure 4C). An area is a section of a level that can be distinguished by its individual score (underwater, cave, etc.).

Download: Data Analysis (in German)

(Figure 4A: number of areas (Y) per level (X))

(Figure 4B: number of instruments (Y) per level (X))

(Figure 4C: average number of instruments per area (Y) per level (X))

VI. Theme-specific Instruments



Each area of the game world has its own variation of the level score with its own characteristic instruments. The musical accompaniment for an underwater scene, for example, uses the harp as the only instrument (Grant Kirkhope, 2010).


(A) Mumbo's Mountain (B) Treasure Trove Cove
(Figure 5)

A further example shows the same intention of the composer applied to a different environment. Caves, coves and narrow spaces are always accompanied by a marimba, which stands for the setting, along with a few other theme-specific instruments.


(A) Bubble Gloop Swamp (B) Treasure Trove Cove
(Figure 6)

Furthermore, the composer makes use of the alto saxophone in all areas containing mud, muck or fecal matter.


(A) Mad Monster Mansion (B) Bubble Gloop Swamp

(Figure 7)

In all of the wintry and cold regions, bells and glockenspiel can be heard as solo instruments interplaying with the other instruments.


(A) Freezeezy Peak (B) Click Clock Wood
(Figure 8)

VII. Beats per Minute



Certain events evoke associations which, when reinforced by very specific types of music, allow the game designer to improve the player’s impression of three-dimensional spatiality. When the tempo of the score increases, the music signals to the player that they must expend more effort to master the situation. Thus, in all boss sequences, the tempo of the level-specific score is raised to signal the player to move quickly:

Nipper (Treasure Trove Cove): from 125 bpm to 160 bpm;

Mutie Snippets (crabs, Clanker's Cavern): from 100 bpm to 190 bpm;

Fighting Flibbits (Bubble Gloop Swamp): from 135 bpm to 180 bpm;

Boss Boom (Rusty Bucket Bay): from 120 bpm to 185 bpm;

Final Fight (Gruntilda's Lair): from 95 bpm to 180 bpm.
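To make the size of these tempo jumps easier to compare, the figures listed above can be converted into relative increases. This is only a rough sketch: the tempo pairs are copied from the list, while the names TEMPO_CHANGES and percent_increase are assumptions made for this illustration.

```python
# Tempo pairs copied from the list above: (boss, level, level bpm, boss bpm).
TEMPO_CHANGES = [
    ("Nipper", "Treasure Trove Cove", 125, 160),
    ("Mutie Snippets", "Clanker's Cavern", 100, 190),
    ("Fighting Flibbits", "Bubble Gloop Swamp", 135, 180),
    ("Boss Boom", "Rusty Bucket Bay", 120, 185),
    ("Final Fight", "Gruntilda's Lair", 95, 180),
]


def percent_increase(level_bpm: float, boss_bpm: float) -> float:
    """Relative tempo increase from the level score to the boss score, in percent."""
    return (boss_bpm - level_bpm) / level_bpm * 100


# Relative increase per boss sequence, plus the mean over all five.
increases = {
    boss: percent_increase(level_bpm, boss_bpm)
    for boss, _level, level_bpm, boss_bpm in TEMPO_CHANGES
}
average_increase = sum(increases.values()) / len(increases)

for boss, pct in increases.items():
    print(f"{boss}: +{pct:.0f}%")
print(f"average: +{average_increase:.0f}%")
```

With these values the increases range from roughly 28% (Nipper) to 90% (Mutie Snippets), while the boss tempos themselves all land between 160 and 190 bpm, whatever the starting tempo of the level.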

VIII. Conclusion



In this text I have tried to examine several important aspects of video game music and its use in different video game settings. It is important to note that the findings cannot be applied to every genre of video game music. Cohen’s and Chion’s studies make clear the depth that this topic can reach in relation to video games. All credit is due to the authors and their works listed in the references. Video game music remains a controversial part of video games. However, this article has shown, by referring to video games and animated films, that video game music can create a cognitive association between types of music and the interpretation of visual aspects.

IX. References



Belinkie, Matthew. (1999) Video Game Music: Not Just Kids Stuff. 15 December, accessed 18 March 2013, http://www.vgmusic.com/vgpaper.shtml.

Chion, Michel. (1994) Audio-Vision: Sound on Screen. New York, Columbia University Press.

Cohen, Annabel. (1998) The Functions of Music in Multi-Media: A Cognitive Approach. Fifth Annual Conference on Music Perception and Cognition. Seoul National University, Seoul, Western Music Research Institute.

Cohen, Annabel. (2000) Film Music: Perspectives from Cognitive Psychology. In: Buhler, James, Flinn, Caryl & Neumeyer, David (Eds.) Music and Cinema. Hanover, NH, University Press of New England.

Douglas, J. Yellowlees & Hargadon, Andrew. (2004) The Pleasure of Immersion and Interaction: Schemas, Scripts, and the Fifth Business. In: Wardrip-Fruin, Noah & Harrigan, Pat (Eds.) First Person: New Media as Story, Performance, and Game. Cambridge, MIT Press.

Grieg, Edvard. (1994) March of the Trolls, Lyric Suite, Op. 54: No. 4. Cond. Bernstein, Leonard, Sony.

Hee, T., Ferguson, Norman & Leopold Stokowski, cond. (1960) Fantasia. 60th Anniversary Special Edition, Disney.

Iwerks, Ub. (1928) Galloping Gauchos. In Walt Disney Treasures - Mickey Mouse in Black and White 2002. [DVD], Walt Disney Home Video.

Iwerks, Ub. (1929) Skeleton Dance. In Disney Treasures: Silly Symphonies 2001. Comp. Carl Stalling. [DVD], Disney Home Video.

Strauss, Neil. (2002) Tunes for Toons: A Cartoon Music Primer. In: Goldmark, Daniel & Taylor, Yuval (Eds.) The Cartoon Music Book.

Morris, Sue. (2002) First-Person Shooters - A Game Apparatus. In: King, Geoff & Krzywinska, Tanya (Eds.) Screenplay: Cinema/Videogame/Interface. London, Wallflower Press.

Nattiez, Jean-Jacques. (1990) Can One Speak of Narrativity in Music? Journal of the Royal Musical Association, 115, 240-257.

Neumeyer, David & Buhler, James. (2001) Analytical and Interpretive Approaches to Film Music (I): Analysing the Music. In: Donnelly, K.J. (Ed.) Film Music: Critical Approaches. New York, The Continuum International Publishing Group.

Wadhams, Nick. (2004) Of Ludology and Narratology. [Online article], accessed 22 March 2013.

Ward, Paul. (2002) Videogames as Remediated Animation. In: King, Geoff & Krzywinska, Tanya (Eds.) Screenplay: Cinema/Videogame/Interface. London, Wallflower Press.

Interview with Grant Kirkhope (May 2010) by Chris Greening, http://www.squareenixmusic.com/features/interviews/grantkirkhope.shtml, accessed 27 March 2013.

Whalen, Zach. (2004) Play Along - An Approach to Videogame Music, http://www.gamestudies.org/0401/whalen/, accessed 12 March 2013.

Banjo-Kazooie™ (1998) ©Rare published by Nintendo