| Online Publication | Signal Processing | Research Proposal | http://werner.yellowcouch.org/Papers/frisam06/ |
Werner Van Belle1* - werner@yellowcouch.org, werner.van.belle@gmail.com
Bruno Laeng2 - bruno.laeng@psykologi.uio.no
1- Personal Research; CH-9294 Basel, Switzerland
2- Biological Psychology; Department of Psychology; University of Tromsø; Tromsø; Norway
* Corresponding author
Abstract: What are the underlying dimensions that give structure and meaning to music? To answer this question, we aim to integrate methodological techniques from other disciplines (signal processing and mathematics) into the fields of psychology and psycho-physiology. Our main goal is to measure musical parameters mathematically and computationally, relate them to human judgments of a song's emotional content, and compare these to psycho-physiological measures (pupillometry and EEG).
Keywords: psychoacoustics, emotional cues, audio content extraction, BPM, tempo, rhythm, composition, echo, spectrum, sound color
Reference: Werner Van Belle, Bruno Laeng; Where are the emotional cues in music ?; Ideas in Psychology; YellowCouch Scientific; May 2006
Disclaimer: This is a proposal written by the aforementioned partners. Copying part of it without giving due credit must be considered plagiarism. If you are interested in this research, consider contacting one of the partners to discuss it.
Most of us feel that music is closely related to emotion, or that music expresses emotions and/or can elicit emotional responses in listeners. There is no doubt that music is a human universal: every known human society has made music from its very beginnings. One fascinating possibility is that, although cultures may differ in how they make music and may have developed different instruments and vocal techniques, we may all perceive music in a very similar way. That is, there may be considerable agreement in what emotion we associate with a specific musical performance [22,25,44,21,58,72,49,13,23,57,8,54,20].
Musical emotions can be characterized in much the same way as the basic human emotions. Happy and sad emotional tones are among the most commonly reported in music, and these basic emotions may be expressed, across musical styles and traditions, by similar structural features. Among these, pitch and rhythm seem to be basic. Research on infants has shown that these structural features can be perceived early in life [33]. Recently, it has been proposed that at least some portion of the emotional content of a musical piece is due to its close relationship to the vocal expression of emotions (either as used in speech, e.g., a sad tone, or in non-verbal expressions, e.g., crying). In other words, the accuracy with which specific emotions can be communicated depends on specific patterns of acoustic cues [45]. This account can explain why music is considered expressive of certain emotions. Specifically, some of the relevant acoustic cues that pertain to both domains (music and verbal communication, respectively) are speech rate/tempo, voice intensity/sound level, and high-frequency energy. Speech rate/tempo may be most related to basic emotions such as anger and happiness (when it increases) or sadness and tenderness (when it decreases). Similarly, high-frequency energy plays a role in anger and happiness (when it increases) and in sadness and tenderness (when it decreases). Different combinations and/or levels of these basic acoustic cues could result in several specific emotions. For example, fear may be associated in speech or song with low voice intensity and little high-frequency energy, whereas panic is expressed by increasing both intensity and energy.
Juslin & Laukka [45] observed that the following dimensions relate to the emotional expression of music and/or speech: pitch or fundamental frequency F0 (i.e., the strongest cycle component of a waveform), contour or intonation, vibrato, intensity or loudness, attack or the rapidity of tone onset, tempo or the velocity of music, articulation or the proportion of sound-to-silence, timing or rhythm variation, and timbre or the high-frequency energy of an instrument or a singer's formant.
Certainly, the above is not an exhaustive list of all the relevant dimensions of music or of the relevant dimensions of emotional content in music. Additional dimensions might include echo, harmonics, interval structure, melody and low frequency oscillations.
The proposed project will gather psycho-acoustic information by measuring the emotional responses of a group of participants to various short musical pieces. Signal processing expertise will be used to objectively analyze musical excerpts for various parameters believed to be emotional cues. Below we discuss the general procedures and tools that will be used. Afterwards, the specific projects will be described.
The stimuli we will present to the listeners will be short (a couple of measures) and will capture the essence of the musical piece. Participants will be asked to rate on a step-scale how well a particular emotion describes the sound fragment heard. Responses and latency will be recorded with mouse clicks from participants on a screen display implemented in an extension of BpmDj.
Ratings will be made by use of 24 antonyms. The antonyms will name emotions and moods (e.g., happy-sad), whereas some items will mention objects (e.g., sun-moon) or non-acoustic attributes (e.g., bright-dark). Lines of equal length (10 cm) will be drawn between the two opposites. Participants will be asked to indicate the degree of appropriateness of the probed expression by marking with a pencil a position on the line. If the participants find any of the sounds to be neutral in meaning with respect to a specific pair of opposites, they should mark the center of the line.
All experiments will use within-subjects designs. A Latin square design will be used to order the presentations of the intervals to prevent either fatigue or practice effects. The sounds will be played on a CD-player and heard through stereo headphones. The volume will be the same in all experiments and set to the level of normal speech. The listeners will sit comfortably in a quiet room while listening to the sound fragments.
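The Latin square ordering mentioned above can be sketched as follows. This is a minimal illustration, assuming a simple cyclic square in which each participant receives one rotated row; the function names and fragment labels are hypothetical. A cyclic square balances the position of each fragment but not which fragment precedes which, so a different construction may be preferable if carry-over effects are a concern.

```python
def latin_square(n):
    """Return an n x n cyclic Latin square: row r is 0..n-1 rotated by r."""
    return [[(r + c) % n for c in range(n)] for r in range(n)]


def presentation_orders(stimuli, n_participants):
    """Give each participant one row of the square, cycling through the rows."""
    square = latin_square(len(stimuli))
    return [[stimuli[i] for i in square[p % len(stimuli)]]
            for p in range(n_participants)]


# Hypothetical fragment labels; in practice these would be the sound fragments.
fragments = ["A", "B", "C", "D"]
orders = presentation_orders(fragments, 4)
for participant, order in enumerate(orders):
    print(participant, order)
```

Across the four participants, every fragment appears exactly once in every serial position.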
The selection of musical pieces presented to the participants will be determined using a 'design of experiment' methodology [38,28,35]. This allows one, before conducting the experiment, to vary the relevant factors systematically so as to optimize the results that can be obtained from that specific data-set. A design of experiment relies on design variables and response variables. In our experiments, the design variables will encompass a set of measured song properties, while the response variables will encompass the emotional response. Cluster analysis and principal component analysis will be applied to the song collection to determine classes of sounds that can be used as input into the design of experiment.
After conducting the experiment, analysis of variance will identify the factors that most influence the results, as well as the existence of interactions and synergies between factors [12,29].
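As a rough illustration of the analysis-of-variance step, the sketch below computes a one-way, between-groups F statistic. It is a deliberate simplification: the within-subjects design described above would call for a repeated-measures model with interaction terms, and both the grouping by tempo class and the ratings are hypothetical.

```python
def one_way_anova_f(groups):
    """Return the F statistic of a one-way, between-groups ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation explained by the factor (differences between group means).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation (spread of the ratings inside each group).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical happy-sad ratings (marks in cm) for slow, medium and fast fragments.
ratings_by_tempo = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_statistic = one_way_anova_f(ratings_by_tempo)
print(f_statistic)
```

A large F value suggests that the factor (here, the assumed tempo class) accounts for more variation in the ratings than the spread within each class does.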
Pupillometry will be performed by means of the Remote Eye Tracking Device, R.E.D., built by SMI-SensoMotoric Instruments in Teltow (Germany). Analyses of recordings will be computed using the iView software, also developed by SMI. The R.E.D. II can operate at a distance of 0.5-1.5 m, and the eye-tracking sample rate is 50 Hz, with a resolution better than 0.1 degree. The eye-tracking device operates by determining the positions of two elements of the eye: the pupil and the corneal reflection. The sensor is an infrared-sensitive video camera typically centered on the left eye of the subject. The coordinates of all the boundary points are fed to a computer that, in turn, determines the centroids of the two elements. The vector difference between the two centroids is the "raw" computed eye position. Pupil diameters are expressed as the number of video pixels of the horizontal and vertical diameters of the ellipsoid projected onto the video image by the eye pupil at every 20 ms sample.
The psychology department at the University of Tromsø recently acquired a 64 channel BrainAmp MRPlus system (model BP-1310) from Algol Pharma (Finland). The hardware allows simultaneous recording of EEG and ERP. The software associated with the scanner (the BrainVision II software) provides an array of tools to inspect and analyze the data. Most important is the possibility to export the raw data, so that we can import it into BpmDj. Analysis tools present in the analyzing software include standard Fourier and wavelet analysis. The wavelet analysis provides, among others, Morlet and Mexican hat wavelets.
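To make the pupil measure concrete, the sketch below turns a sequence of horizontal and vertical pixel diameters, sampled every 20 ms, into a baseline-corrected trace. It is only an illustration: averaging the two diameters, using a pre-stimulus baseline window, and the sample values are assumptions, and blink handling is ignored.

```python
SAMPLE_PERIOD_S = 0.020  # one sample every 20 ms, i.e. 50 Hz


def mean_diameter(sample):
    """Average the horizontal and vertical pupil diameters (in video pixels)."""
    horizontal, vertical = sample
    return (horizontal + vertical) / 2.0


def baseline_corrected(samples, baseline_seconds):
    """Return (time_s, relative_change) pairs relative to the baseline mean."""
    n_base = int(round(baseline_seconds / SAMPLE_PERIOD_S))
    if n_base < 1 or n_base > len(samples):
        raise ValueError("baseline window must cover between 1 and len(samples) samples")
    diameters = [mean_diameter(s) for s in samples]
    baseline = sum(diameters[:n_base]) / n_base
    return [(i * SAMPLE_PERIOD_S, d / baseline - 1.0)
            for i, d in enumerate(diameters)]


# Hypothetical recording: 5 baseline samples, then 5 samples with a dilated pupil.
recording = [(40, 40)] * 5 + [(44, 44)] * 5
trace = baseline_corrected(recording, baseline_seconds=0.1)
for time_s, change in trace:
    print(round(time_s, 2), round(change, 3))
```

Expressing dilation relative to a baseline makes traces comparable across participants whose absolute pupil size in pixels differs.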
The open source software BpmDj [3] analyzes and stores a large number of soundtracks. The program has been developed by Werner Van Belle since 2000 as a hobby project. It contains advanced algorithms to measure spectrum, tempo, rhythm, composition and echo characteristics. All analyzers in BpmDj make use of the Bark psycho-acoustic scale [32,47,27]. The spectrum or sound color is visualized as a 3-channel color (red/green/blue) based on a Karhunen-Loève transform [19] of the available songs.
Tempo module - Five different tempo measurement techniques are available, of which auto-correlation [62] and ray-shooting [5] are the most appropriate. For other techniques see [78,77,70,30].
Echo/delay modules - Measuring the echo characteristics is based on a distribution analysis of the frequency content of the music, which is then enhanced using a differential auto-correlation [7].
Rhythm/composition modules - To calculate rhythm and composition properties, the song is split into its measures. The rhythm property is the superimposition of all those measures. The composition property measures the probability of a content change after a given number of measures.
From an end-user point of view the program supports distributed analysis, automatic mixing of music, distance metrics for all analyzers, as well as clustering and automatic classification based on this information. Everything is tied together in a Qt-based [41] user interface. BpmDj will form the base platform in which musical parameters will be measured.
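The idea behind auto-correlation tempo measurement can be sketched in a few lines. This is not BpmDj's implementation, only a minimal illustration under stated assumptions: the input is an already computed energy envelope, the tempo search is limited to one octave (80 to 160 BPM here) to sidestep half-tempo and double-tempo ambiguity, and the function names and the synthetic pulse train are hypothetical.

```python
import math


def autocorrelation(signal, lag):
    """Unnormalised autocorrelation of `signal` at a single positive lag."""
    return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))


def estimate_bpm(envelope, envelope_rate_hz, min_bpm=80.0, max_bpm=160.0):
    """Return the tempo whose beat period maximises the envelope autocorrelation."""
    min_lag = int(math.ceil(envelope_rate_hz * 60.0 / max_bpm))
    max_lag = int(math.floor(envelope_rate_hz * 60.0 / min_bpm))
    mean = sum(envelope) / len(envelope)
    centred = [x - mean for x in envelope]  # remove the constant offset first
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(centred, lag))
    return 60.0 * envelope_rate_hz / best_lag


# Synthetic envelope sampled at 100 Hz with one energy burst every 0.5 s.
pulses = [1.0 if i % 50 == 0 else 0.0 for i in range(1000)]
print(estimate_bpm(pulses, 100.0))
```

With real music the envelope would first have to be extracted from the audio, and the lag resolution (here 10 ms) limits the precision of the estimate.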
Music can not only elicit emotional responses; many observations indicate that one's mental state influences music preference, which thus symptomatically reveals mental aspects of the listener [53,16,66,69]. This means that one cannot simply relate audio cues to a reported emotion, since the participant's emotional state will resist or accept the 'cued' emotion differently. This specific project aims to understand the influence of the participants' initial mood on their reports.
Normal participants (equally distributed male/female, N=100) will be drawn from the student population. Before and after the test we will assess their mental state using a combination of different questionnaires and also ask them to list a number of their favorite songs. Limited biographic information will be gathered, including gender, handedness, age and the presence of a neurological and/or psychiatric history. Information regarding the music preference and musical expertise of each participant will also be gathered. We will classify every person into one of 3 groups: "Naïve" listeners, without any musical training; "Amateurs", who have learned to play an instrument since childhood, via formal training, or have become autodidacts at some point in their lives; and "Professionals", who have at least some training at the conservatory and are still practicing their instrument (at least four hours a day).
Short fragments of songs (N=50) will be presented and questions asked regarding the emotional content. The response variables will be based on 8 basic emotions [52,63,65] presented in a semantic differential questionnaire (as explained in Procedures in the Methods section). The sounds will be perceived through Philips SBC HP 840 headphones.
The design variables will be a well chosen set of songs targeting pitch, loudness, tempo, articulation, rhythm variation, timbre and high-frequency energy (according to [45]). These variables will be mathematically measured for a large set of songs (>20000 songs) using BpmDj and then a subset will be chosen based on the various multi-variate distributions and clusters.
The questionnaire will be created by merging the Pittsburgh Sleep Quality Index, the Epworth Sleepiness Scale, the Beck Depression Inventory II, the Beck Anxiety Inventory, the Profile of Mood States [59], the State Anxiety Inventory [50] and Plutchik's Emotions Profile Index [64]. Analysis of the results will reveal which relations exist between initial participant mood and observed emotion, and how strong they are.
A second experiment will rely on new design variables including low frequency oscillations, harmonic classes, key, echo, contour, vibrato, attack and velocity. We will furthermore measure pupil responses throughout the experiments.
Emotions are not only subjective mental states; they are also physiological states that can be observed externally and measured. Many of the physiological manifestations of emotions are mediated by the autonomic nervous system [26,15,55,76], and there are systematic changes in various physiological responses mediated by this system to naturally occurring acoustic stimuli [14]. Given that the pupil of the eye is also controlled by the autonomic nervous system [56], monitoring changes of pupil diameter can provide a window onto the emotional state of an individual.
The experiment will be set up similarly to the first experiment, but with additional procedures to measure pupil response. A standard calibration routine will be conducted at the beginning of each session. During the calibration process, the system 'learns' the relationship between the eye movement and gaze position. Specifically, each participant will be asked to gaze at the screen while a calibration map appears on screen, consisting of nine standard points marked as white crosses on a blue background. Each participant will be asked to gaze at each of the crosses on the screen in a particular order. The eye position is then recorded. Subsequently, a blank screen, consisting of a uniform light blue background, will replace the calibration map. This will remain on screen during the experiments. Each participant will be tested in a windowless laboratory room, and artificial lighting will be kept constant for each participant and across sessions.
The new parameters will be mathematically measured using new BpmDj modules. The key/scale module will measure the occurrence of chords by measuring individual notes. To provide information on the scale (we assume an equitemporal scale) this module will also measure the tuning/detuning of the different notes. The dynamic module will measure energy changes at different frequencies. First order energy changes provide attack and decay parameters of the song. Second order energy changes might provide information on song temperament. The LFO module will measure low frequency oscillations between 0 and 30 Hz using a digital filter on the energy envelope of a song. The harmonic module will inter-relate different frequencies in a song by investigating the probability that specific frequencies occur together. A Bayesian classification of the time based frequency and phase content will determine different classes. Every class will describe which attributes (frequencies and phases) belong together, thereby providing a characteristic sound or waveform of the music. This classification will allow us to correlate harmonic relations to the perception of music. Autoclass [75,18] will perform the Bayesian classification. The melody module will rely on a similar technique by measuring relations between notes in time. All of the above modules need to decompose a song into its frequency content. To this end, we will initially make use of a sliding window Fourier transform [62]. Later in the project, integration of multi-rate filter-banks will achieve more accurate decomposition [42,11] by relying on various wavelet bases [46] (see figure).
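The low frequency oscillation measurement described above can be illustrated with a small sketch. It does not reproduce the planned BpmDj module: instead of a digital filter it probes the energy envelope directly with a discrete Fourier sum at a few frequencies, and the block length, function names and the amplitude-modulated test tone are all assumptions made for illustration. To observe oscillations up to 30 Hz, the envelope itself has to be sampled at more than 60 Hz; 100 Hz is used here.

```python
import math


def energy_envelope(samples, sample_rate_hz, block_seconds):
    """Mean squared amplitude over consecutive, non-overlapping blocks."""
    block = max(1, int(round(sample_rate_hz * block_seconds)))
    n_blocks = len(samples) // block
    envelope = [sum(x * x for x in samples[b * block:(b + 1) * block]) / block
                for b in range(n_blocks)]
    return envelope, sample_rate_hz / block  # envelope and its sample rate


def lfo_spectrum(envelope, envelope_rate_hz, frequencies_hz):
    """Magnitude of the mean-removed envelope at each probed frequency."""
    mean = sum(envelope) / len(envelope)
    centred = [e - mean for e in envelope]
    spectrum = {}
    for f in frequencies_hz:
        re = sum(c * math.cos(2 * math.pi * f * n / envelope_rate_hz)
                 for n, c in enumerate(centred))
        im = sum(c * math.sin(2 * math.pi * f * n / envelope_rate_hz)
                 for n, c in enumerate(centred))
        spectrum[f] = math.hypot(re, im) / len(centred)
    return spectrum


# Hypothetical test signal: a 400 Hz tone whose loudness oscillates at 4 Hz.
rate = 8000
tone = [(1.0 + 0.5 * math.sin(2 * math.pi * 4 * t / rate))
        * math.sin(2 * math.pi * 400 * t / rate) for t in range(2 * rate)]
envelope, envelope_rate = energy_envelope(tone, rate, 0.01)
spectrum = lfo_spectrum(envelope, envelope_rate, range(1, 31))
strongest = max(spectrum, key=spectrum.get)
print(strongest)
```

The strongest probed frequency corresponds to the rate at which the loudness of the test tone oscillates.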
Another experiment investigates how well complex emotional content is recognized in music. It has been recognized that combinations of basic emotions yield more complex (less powerful) emotions [52,63,65]; it is therefore of interest to verify whether those higher-level emotions can be found in music and whether these are consistent with what one would expect. This experiment is set up using procedures similar to the previous ones. The main difference is the antonyms we will be using (with opposing complex emotions this time instead of basic emotions). Two different designs will be used. First, we will create new songs by perfectly superimposing existing songs with a strong, well-known basic emotional content. This will be done using the automixer feature of BpmDj. Second, we will choose songs that have multiple, almost equally strong, basic emotional reports and see whether probing the combined emotion yields a significantly better response.
Low frequency oscillations, as observed in electroencephalograms, provide information on the modus operandi of the brain. Delta-waves (below 4 Hz), theta-waves (4 to 8 Hz), alpha-waves (8 to 12 Hz) and beta-waves (13 to 30 Hz) all relate to some form of attention and focus. They also seem to relate to mood [2,34]. The brainwave pattern can synchronize to external cues in the form of video and audio [1,48]. Interestingly, major and minor chords produce second order undertones that beat at frequencies of 10 Hz and 16 Hz respectively, which places them in distinct brainwave bands. Rhythmical energy bursts also seem to influence the brainwave pattern (techno and 'trance' vs 'ambient'). We believe that both a quantification of the low frequency content and a measurement of long period energy oscillations in songs can provide crucial input into our study [60,10].
A small number of participants (N=10) will be connected to the EEG measurement device and then asked to listen to full length songs (with a maximum of 3 minutes per song). Every 15 minutes a pause of 5 minutes will be introduced (totaling 24 songs over 2 hours). To avoid electromagnetic interference from headphones, standard loudspeakers will be used.
The music will be chosen primarily on its low frequency behavior. Analysis will include cross-correlation between the music energy envelope and different EEG channels. Further analysis will relate EEG imbalance and laterality effects (e.g., right hemisphere potentials versus left hemisphere potentials) to previously reported emotions as well as expected emotions.
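The cross-correlation between a music energy envelope and an EEG channel could look roughly like the sketch below. It assumes both signals have already been resampled to a common rate; the normalisation (Pearson correlation over the overlapping part), the function names and the synthetic "EEG" trace, which is simply the envelope delayed by three samples, are illustrative assumptions.

```python
import math
import random


def normalised_cross_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag] over their overlap."""
    if lag >= 0:
        a, b = x[:len(x) - lag], y[lag:]
    else:
        a, b = x[-lag:], y[:len(y) + lag]
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant segment carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def best_lag(x, y, max_lag):
    """Lag in samples, within [-max_lag, max_lag], at which y best follows x."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: normalised_cross_correlation(x, y, lag))


# Hypothetical data: the "EEG" channel is the music envelope delayed by 3 samples.
rng = random.Random(0)
music_envelope = [rng.random() for _ in range(200)]
eeg_channel = [0.0] * 3 + music_envelope[:-3]
print(best_lag(music_envelope, eeg_channel, 10))
```

A positive best lag means the EEG channel follows the music envelope; real recordings would of course yield much weaker correlations than this constructed example.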
After establishing the prominent variables related to music, we will be in a position to create music relying on this knowledge. Instead of selecting appropriate songs, music will be created to test specific parameters. This is especially important for parameters that cannot be easily isolated using standard songs. Parameters of interest include a) dynamics, such as fortissimo, forte, mezzo-forte, mezzo-piano, piano and pianissimo, b) attack and sustain factors, c) microtonality, to measure the impact of detuning, d) melodic tempo, e) song waveforms, which can be altered by relying on different instruments, and f) echo and delay characteristics, which can be influenced in the sound production phase. Further parameters of influence will be g) rhythm (time signatures, poly-rhythmical structures), h) key (major, minor and modal scales) and i) melody (ambitus and intervals). By creating a number of etudes, presented in different styles, we will validate our research. To this end, we will cooperate with a professional musician, Geir Davidsen, at The northern Norwegian Music Conservatory.
The planned research will primarily be of interest to international research communities in psychology and computer science. Articles based on the research will be submitted to top-end journals in cognitive science and cognitive neuroscience, such as Cognitive Psychology, Cognition, Cognitive Science, the Journal of Experimental Psychology, or Perception & Psychophysics. In addition, findings from this research will be presented at international conferences. Methods and software produced in this project will be open-sourced to attract international attention from computer scientists working in the field of database meta-data extraction and content extraction. The unique inter-disciplinary nature of this project and the highly interesting concept of 'emotion and music' allow us to present our findings to larger audiences [6].
The presented research might improve the design of music therapy. The effect of music as a psycho-therapeutic tool has been recognized for a long time. Clearly, music can have a soothing and relaxing effect and can enhance well-being by reducing anxiety, enhancing sleep, and distracting a patient from agitation, aggression, and depressive states. We briefly touch upon various health-related areas. Depression - depression and dementia remain two of the most significant mental health issues for nursing home residents [73]. Nowadays, there is a growing interest in the therapeutic use of music in nursing homes. A widely shared conclusion is that music can supplement medical treatment and has a clear potential for improving nursing home care. Music also seems to improve major depression [43]. Anxiety - it would seem that, in general, affective processes are critical to understanding and promoting lasting therapeutic change. Insomnia - music improves sleep quality in older adults; [51] showed that this improvement is significant. Pain reduction - music therapy seems an efficient treatment for different forms of chronic pain, including fibromyalgia, myofascial pain syndromes and polyarthritis [61], chronic headaches [68] and chronic low back pain [31]. Music seems to affect especially the communicative and emotional dimension of chronic pain [61]. Sound-induced trance also enables patients to distract themselves from their condition, and it may result in pain relief 6-12 months later [68].
The general impact of music on the nervous system extends to the immune system. Research by [36] indicates that listening to music after a stressful task increases norepinephrine levels. This is in agreement with [9], who verified the immunological impact of drum circles. Drum circles have been part of healing rituals in many cultures throughout the world since antiquity. Composite drumming directs the immune system away from classical stress and results in increased dehydroepiandrosterone-to-cortisol ratios, natural killer cell activity and lymphokine-activated killer cell activity without alteration in plasma interleukin 2 or interferon-gamma. One area of application for these effects could be cancer treatment. Autologous stem cell transplantation, a common treatment for hematologic malignancies, causes significant psychological distress due to its effect on the immune system. A study by [17] reveals that music therapy reduces mood disturbance in such patients. The fact that music can be used as a mood induction procedure with the required physiological effects can make its use relevant for pharmaceutical companies [67]. Positive benefits of music therapy have also been observed in multiple sclerosis patients [71].
These positive aspects of music have led to the use of 'music therapy' as an aid in the everyday care of patients in, for example, nursing homes [24,74]. Understanding which musical aspects lead to an emotional response might lead to the creation of efficient play-lists and a more scientific way of assessing and selecting songs. Depending on the results of the presented work, we might be able to recommend to different patients what kind of music might be suitable for them. Creation of typical 'likes-a-lot' and 'should-listen-to' play-lists per emotional state might enhance the psychotherapist's toolbox.
The commercial relevance of this project lies in the possibility to search the Internet for similar musical pieces. This can further be extended into semantic peer-to-peer systems in which file sharing programs cluster songs onto machines whose owners will probably like the music. Clustering songs based on their emotional connotations is also indispensable in database systems. Recognizing similar emotions is also a first step in data content extraction algorithms. Radio stations and DJs might be able to generate play-lists and do interesting things regarding the 'emotion' of a play-list. For example, it might be possible to make a transition from sad to happy, from low-energy to high-energy and so on. Further commercial relevance can be found in plugins for existing software such as Cubase [40], Cakewalk [37], Protools [39] and others.
All participants in the experiments will participate on a voluntary basis and after giving written informed consent. They will be informed that they can interrupt the procedure at any time, without having to give a reason and at no cost for withdrawing. In none of the experiments will sensitive personal information, names or other characteristics that might identify the participant be recorded. All participants will be thoroughly debriefed after the experiment.
The computational processes underlying music and the emotions are a little-investigated topic, and interdisciplinary collaborations on this topic are rare. Hence, the present proposal is relevant in that it combines computational techniques with the empirical study of human responses, in concert with explicit compositional methods and musicological structure.
Dr. Werner Van Belle - originally from Belgium, now lives in Norway, where he changed his career from pure computer science to signal processing in life sciences. In his spare time he is passionate about digital signal processing for audio applications. Of particular relevance for this proposal is his work on mood induction [4] and sound analysis [3,6,7,5].
Prof. Bruno Laeng - has a 100% academic position (50% research appointment) in the biologisk psykologi division of the Department of Psychology. Recent quality evaluations from Norges Forskningsråd show that the division of biologisk psykologi at Universitetet i Tromsø (UiTø) has received the highest level of evaluation within the institute for psychology from the examining committee (i.e., very good). Moreover, this applicant was awarded the Pris til yngre forsker from UiTø in the year 2000.
The project also benefits from a collaboration with 'het Weyerke', a Belgian service center/nursing home for the mentally handicapped and elderly. They are mainly interested in music as a stimulation and soothing mechanism to alleviate stress and depressive symptoms in elderly people with dementia. Their long-standing tradition in this matter will provide input into our study.