Project Summary



For this project, I set out to investigate the musical characteristics of a sample of songs (tracks) from Spotify. The aim was to build a reliable model that takes the musical attributes of a track and predicts its likely popularity on Spotify.

The dataset was assembled from a downloaded CSV file of song characteristics, to which I appended song popularity ratings obtained through the Spotify public API.

Track Attribute Data

The track attributes came from the downloaded CSV file, which contains a range of musical attributes for over 1.2 million songs.

Retrieving Track Popularity

Track popularity was retrieved from the Spotify public API, which returns the popularity of each song as a number between 0 and 100. I ran into limits on the number of calls I could make to the API, but by splitting the requests into batches and using different API keys I was able to retrieve the data in multiple batches. These were then cleaned, validated, and stored, giving me popularity values for 74,000 songs.
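The batching step can be sketched roughly as follows. This is a minimal illustration rather than the code used in the project: the helper names `batched` and `assign_keys`, the placeholder key names, and the batch size of 50 IDs per request are all assumptions, and the sketch only plans the requests without calling the API.

```python
from itertools import cycle, islice
from typing import Iterable, Iterator, List, Sequence, Tuple

# Assumed number of track IDs per request; check the current API limits.
BATCH_SIZE = 50


def batched(ids: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` track IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def assign_keys(
    ids: Iterable[str], api_keys: Sequence[str], size: int = BATCH_SIZE
) -> Iterator[Tuple[str, List[str]]]:
    """Pair each batch of IDs with an API key, cycling through the keys."""
    if not api_keys:
        raise ValueError("at least one API key is required")
    for key, batch in zip(cycle(api_keys), batched(ids, size)):
        yield key, batch


# Hypothetical track IDs and key names, for illustration only.
track_ids = [f"track{i}" for i in range(120)]
plan = list(assign_keys(track_ids, ["key_a", "key_b"]))
for key, batch in plan:
    print(key, len(batch))
```

Each `(key, batch)` pair would then drive one request, with the responses collected for cleaning and validation afterwards.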

Dataset

After cleaning the data and reducing the number of columns, I was left with a dataset containing a key (id) and 14 track attributes, including popularity. The data were stored locally as a CSV file for use in building the model.
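A small pandas sketch of how the attribute data and the popularity values might be combined on the `id` key. The rows are made up, and the column names other than `id` and `popularity` (`danceability`, `tempo`, `album`) are assumptions chosen for illustration, not the project's actual column list.

```python
import pandas as pd

# Made-up rows standing in for the attribute CSV (note the duplicated id "c3").
attributes = pd.DataFrame(
    {
        "id": ["a1", "b2", "c3", "c3"],
        "danceability": [0.71, 0.42, 0.55, 0.55],
        "tempo": [120.0, 98.5, 140.2, 140.2],
        "album": ["X", "Y", "Z", "Z"],
    }
)

# Made-up rows standing in for the popularity values returned by the API.
popularity = pd.DataFrame({"id": ["a1", "c3", "d4"], "popularity": [0, 37, 12]})

# Keep only the columns of interest, drop repeated tracks, then keep
# the tracks that have both attributes and a popularity value.
keep = ["id", "danceability", "tempo"]
songs = (
    attributes[keep]
    .drop_duplicates(subset="id")
    .merge(popularity, on="id", how="inner")
)

print(songs)
```

The result could then be written out with `songs.to_csv(..., index=False)` to give a local CSV for modelling.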

[Image: songs dataframe]

The mode of song popularity in this dataset was 0: 39% of the sampled songs had a popularity of 0.
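For reference, the mode and its share of the sample can be computed as in the sketch below. The helper name `mode_and_share` and the sample values are made up for illustration, so the printed share does not reproduce the 39% figure.

```python
from collections import Counter
from typing import Sequence, Tuple


def mode_and_share(values: Sequence[int]) -> Tuple[int, float]:
    """Return the most common value and the fraction of entries equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    mode_value, mode_count = Counter(values).most_common(1)[0]
    return mode_value, mode_count / len(values)


# Made-up popularity scores on the 0 to 100 scale.
sample = [0, 0, 0, 12, 35, 35, 60, 0, 71, 5]
mode_value, share = mode_and_share(sample)
print(f"mode = {mode_value}, share = {share:.0%}")
```

A large share of zeros like this is worth keeping in mind when modelling, since the target is heavily concentrated at a single value.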