Music Genre Classification of Lyrics using LSTM
RESEARCH QUESTION
Automatic classification of music is an important and well-researched task in music information retrieval (MIR). Genre classification by lyrics presents itself as a natural language processing (NLP) problem. In NLP the aim is to assign meaning and labels to text; here this equates to classifying the lyrical text by genre.
Support vector machines (SVMs), k-nearest neighbors (k-NN), and naive Bayes (NB) have been heavily used in previous research on lyric classification. These models produced classification accuracies of 69% among 5 genres and 43% among 10 genres.
Can a bidirectional LSTM model using GloVe embeddings improve the accuracy of genre classification?
UNDERSTANDING THE DATA
DATA SOURCE
The data comes from a Kaggle dataset of artists and lyrics. It contains about 209K lyrics across 6 genres and multiple languages. For this project, we use only the English-language lyrics, around 109K across 3 genres.
Exploring Genres
Exploring the Word Counts
Topic Models
We used Latent Dirichlet Allocation (LDA) to identify topics in the lyrics and the words associated with each topic.
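To make the topic-modeling step concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is not the code used in this project (a library such as gensim or scikit-learn would be the usual choice). The function names, the toy documents, and the values of alpha, beta, and the iteration count are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def lda_gibbs(docs, n_topics, n_iters=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized docs (lists of words)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # n_dk: topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]  # n_kw: word counts per topic
    topic_total = [0] * n_topics                       # n_k: total words per topic
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = [rng.randrange(n_topics) for _ in doc]
        assignments.append(zs)
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # p(z = k | rest) is proportional to
                # (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word, doc_topic

def top_words(topic_word, n=3):
    """Most frequent words assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]

# Invented toy documents, not taken from the Kaggle dataset.
docs = [
    "love heart kiss love night".split(),
    "heart love baby kiss".split(),
    "truck road whiskey road town".split(),
    "whiskey town truck road".split(),
]
topic_word, doc_topic = lda_gibbs(docs, n_topics=2)
print(top_words(topic_word))
```

On a real corpus the lyrics would first be tokenized and stripped of stop words, and the number of topics would be tuned rather than fixed at 2.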
MODEL
The motivation for using a Long Short-Term Memory (LSTM) model stems from the fact that lyrics are inherently sequential, so the similarity between two lyrics must be determined, at least in part, by the similarity between their word sequences over time.
An important idea in NLP is the use of dense vectors to represent words. For the purpose of this project, we will be using GloVe embeddings.
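As an illustration of how pretrained GloVe vectors are typically wired into a model, the sketch below parses the GloVe text format (one word per line, followed by its space-separated vector components) and builds an embedding matrix indexed by word id. The function names, the tiny in-memory sample, and the `vocab` mapping are hypothetical. A real run would read a downloaded file such as `glove.6B.100d.txt`, but which GloVe file this project used is not stated, so that name is only an assumption.

```python
import io

def load_glove(lines):
    """Parse GloVe text-format lines into a {word: [floats]} dictionary."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(v) for v in parts[1:]]
    return vectors

def build_embedding_matrix(vocab, vectors, dim):
    """Row i holds the vector of the word with index i.

    Row 0 is reserved for padding. Words missing from GloVe keep a zero row.
    """
    matrix = [[0.0] * dim for _ in range(len(vocab) + 1)]
    for word, idx in vocab.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix

# Tiny in-memory stand-in for a real GloVe file (values are made up).
sample = io.StringIO("love 0.1 0.2 0.3\nroad 0.4 0.5 0.6\n")
glove = load_glove(sample)

# Hypothetical word -> index map, as a tokenizer fitted on the lyrics would produce.
vocab = {"love": 1, "road": 2, "yeah": 3}
embedding_matrix = build_embedding_matrix(vocab, glove, dim=3)
print(embedding_matrix)
```

A matrix built this way is usually passed to the model's embedding layer as its initial weights, often frozen so that a small corpus does not overwrite the pretrained vectors.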
To address overfitting, since the corpus is only 5000 lyrics, we incorporate dropout.
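For readers unfamiliar with dropout, here is a minimal sketch of the common "inverted dropout" formulation. During training each activation is zeroed with probability `rate` and the survivors are scaled by 1 / (1 - rate), so the expected value is unchanged. At inference the input passes through untouched. The function name and the rate of 0.5 are assumptions for illustration; the dropout rates used in this project are not stated.

```python
import random

def dropout(values, rate, training=True, rng=random):
    """Inverted dropout over a list of activations."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)  # inference: no units are dropped
    keep = 1.0 - rate
    # Keep each unit with probability `keep` and rescale it; otherwise zero it.
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)  # seeded so the sketch is repeatable
activations = [1.0] * 10
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)
```

In a deep learning framework this is normally a built-in layer or an argument of the recurrent layer rather than hand-written code.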
LSTM (Long Short-Term Memory) networks are very good at analyzing sequences of values and predicting the next one. For example, an LSTM could be a good choice if you want to predict the next point of a given time series.
In text, a sentence is basically a sequence of words. So it is natural to assume an LSTM could be useful for generating the next word of a given sentence. The same ability to summarize a sequence can also be used to assign a label to the whole sequence.
In summary, the objective of our LSTM neural network will be to predict the genre of a song given its lyrics.
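To show how an LSTM turns a sequence of word vectors into a genre prediction, the following sketch implements the standard LSTM cell equations in plain Python, reads a song one word vector at a time, and applies a softmax over 3 genres to the final hidden state. It is a simplified, untrained, single-direction illustration, not the project's model. A bidirectional model would run a second LSTM over the reversed sequence and concatenate the two final states. All names, sizes, and the random weights are assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def lstm_step(x, h, c, params):
    """One LSTM time step; params maps gate name -> (W, b), W acting on [h; x]."""
    z = h + x  # concatenate previous hidden state and current input

    def gate(name, fn):
        W, b = params[name]
        return [fn(a + bias) for a, bias in zip(matvec(W, z), b)]

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much of the candidate to write
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # candidate cell values
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(sequence, params, W_out, hidden):
    """Run the LSTM over the sequence and score genres from the last hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return softmax(matvec(W_out, h))

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

hidden, embed, n_genres = 4, 3, 3  # toy sizes, chosen only for illustration
params = {
    name: (rand_matrix(hidden, hidden + embed), [0.0] * hidden)
    for name in ("forget", "input", "output", "candidate")
}
W_out = rand_matrix(n_genres, hidden)

# A "song" of two word vectors (made-up values standing in for embeddings).
song = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
probs = classify(song, params, W_out, hidden)
print(probs)
```

Because the weights are random, the printed probabilities carry no meaning; training would adjust them so that the highest probability falls on the correct genre.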
EXPERIMENT
As an initial step, we calculated the accuracy of predicting the genre using an SVM with count vectors. The results are shown in Table 1.
We noticed that the accuracy increased with the sample size but then decreased when tested across the full corpus of 109K lyrics.
We then ran the data through the LSTM model and calculated the accuracy for various sample sizes with 5, 10, and 15 epochs. The results are shown in Table 2 and Fig 7.
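For reference, a count vector represents each lyric as the number of times each vocabulary word occurs in it, ignoring word order. The sketch below builds such vectors with the standard library only. The tokenizer, the function names, and the two toy lyrics are assumptions for illustration; in practice a library vectorizer would be used and the resulting vectors passed to the SVM.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def fit_vocabulary(docs):
    """Map every distinct token in the corpus to a column index."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}

def count_vector(doc, vocab):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokenize(doc))
    vec = [0] * len(vocab)
    for tok, n in counts.items():
        if tok in vocab:  # words unseen at fit time are ignored
            vec[vocab[tok]] = n
    return vec

# Invented toy lyrics, not taken from the dataset.
lyrics = ["Baby baby hold me", "Ride down the dusty road"]
vocab = fit_vocabulary(lyrics)
vectors = [count_vector(doc, vocab) for doc in lyrics]
print(vocab)
print(vectors)
```

Because word order is discarded, this representation cannot capture the sequential structure that motivates the LSTM, which makes it a natural baseline.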
DISCUSSION
The LSTM model had a better accuracy rate in predicting the genre of lyrics even with a small corpus of 5000 lyrics, and its accuracy increased significantly with a larger corpus of 10000 lyrics.
For complete code click here