Determining the affective quality of a piece of music is a task of interest in the field of music information retrieval. Current methodologies for determining what emotions a song may elicit rely on formal lab experiments or informal mass questionnaires such as the Amazon Mechanical Turk surveys used by AMG-1608 and other music emotion recognition datasets. These processes can be time-consuming and may slow progress on projects in this domain.
The goal of this research is to explore the use of social media as a tool for evaluating and predicting the emotions a song might evoke. Taking song lists from existing music emotion datasets, we build a data scraping system that crawls social media platforms such as Reddit and YouTube and retrieves user comments discussing these songs. We perform semantic word analysis on these conversations, estimating the overall affect of a conversation by measuring the affective qualities of individual words and bigrams using lexicons such as the Extended Affective Norms for English Words (Extended ANEW) and NRC Semantic Lexicons. We aggregate the valence and arousal of the words to create a set of features which measure the average affect of the comments for each song.
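As a rough illustration of the aggregation step described above, the following minimal sketch averages per-word valence and arousal over all comments for one song. It is not the system used in this work: the function names, the tokenizer, and the toy lexicon (whose values are invented placeholders, not entries from Extended ANEW or the NRC lexicons) are assumptions made for the example, and bigram matching is omitted for brevity.

```python
import re
from statistics import mean

# Hypothetical toy lexicon: word -> (valence, arousal).
# The numbers are illustrative placeholders, NOT real Extended ANEW or NRC values.
TOY_LEXICON = {
    "happy": (8.0, 6.0),
    "calm": (7.0, 2.0),
    "sad": (2.0, 4.0),
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def song_affect_features(comments, lexicon):
    """Average the valence and arousal of every lexicon word found in the comments.

    Returns None when no word in any comment appears in the lexicon.
    """
    scores = [
        lexicon[word]
        for comment in comments
        for word in tokenize(comment)
        if word in lexicon
    ]
    if not scores:
        return None
    return {
        "mean_valence": mean(v for v, _ in scores),
        "mean_arousal": mean(a for _, a in scores),
        "n_matched_words": len(scores),
    }


if __name__ == "__main__":
    example_comments = ["This song makes me so happy", "Sad but calm."]
    print(song_affect_features(example_comments, TOY_LEXICON))
```

In this sketch every matched word contributes equally to the song-level averages; weighting by comment or by word frequency would be an alternative design choice that the text above does not specify.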
This research contributes a framework for estimating a song’s emotional valence and arousal based on discourse about the song. In future work, the features generated here could be used to train a neural network to predict a song’s affective qualities based solely on social media discourse about the song.