An Analysis of Ariana Grande Tracks

Ben Rosenberg

7/3/2021

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
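# Matplotlib 3.6 renamed the seaborn styles, so fall back to the old name on older versions
style = "seaborn-v0_8-whitegrid"
plt.style.use(style if style in plt.style.available else "seaborn-whitegrid")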

track_data = pd.read_csv("ari_data.csv")

About the data

The data cover roughly 90 to 95 Ariana Grande tracks from 11 albums. As can be seen below, the features are Title, Album, Date (release year), BPM, and Objective Rating. The last of these was handcrafted by yours truly (me, not the album).

print(track_data.head())

              Title        Album  Date  BPM  Objective Rating
0  Honeymoon Avenue  Yours Truly  2013  125                 5
1            Baby I  Yours Truly  2013  102                 7
2       Right There  Yours Truly  2013  156                 4
3    Tattooed Heart  Yours Truly  2013   72                 5
4         Lovin' It  Yours Truly  2013   94                 6

BPM

We'll start with BPM. Plotting Objective Rating against BPM shows no obvious relationship between the two:

plt.plot(track_data["BPM"], track_data["Objective Rating"], '.')
plt.xlabel("BPM")
plt.ylabel("Objective Rating")
plt.title("Objective Ratings by BPM")
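
To put a number on that impression, the Pearson correlation between the two columns can be computed directly. The sketch below is not part of the original analysis: `bpm_rating_correlation` is a hypothetical helper, and `sample` holds only the five rows printed by `track_data.head()` as a stand-in for the full data, so the value it prints says nothing about the whole data set.

```python
import pandas as pd

def bpm_rating_correlation(df):
    """Pearson correlation between the BPM and Objective Rating columns."""
    return df["BPM"].corr(df["Objective Rating"])

# The five rows shown by track_data.head(), used only as stand-in data.
sample = pd.DataFrame({
    "BPM": [125, 102, 156, 72, 94],
    "Objective Rating": [5, 7, 4, 5, 6],
})

r = bpm_rating_correlation(sample)
print(f"Pearson r on the sample rows: {r:.2f}")
```

On the full data the call would be `bpm_rating_correlation(track_data)`; a value near zero would agree with the scatter plot.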

BPM itself looks roughly normally distributed. Not much to report here.

plt.hist(track_data["BPM"], bins=20)
plt.title("Track BPM")
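
A quick numerical check can back up the eyeball test: for a roughly symmetric distribution, the mean and median sit close together and the sample skewness is near zero. The following is a minimal sketch under assumptions: `shape_summary` is a hypothetical helper, and the five BPM values come from the rows printed earlier, so they only show the mechanics.

```python
import pandas as pd

def shape_summary(values):
    """Return the mean, median, and sample skewness of a numeric column."""
    s = pd.Series(values, dtype="float64")
    return {"mean": s.mean(), "median": s.median(), "skew": s.skew()}

# BPM values from the five rows shown by track_data.head(); not the full data set.
sample_bpm = [125, 102, 156, 72, 94]
summary = shape_summary(sample_bpm)
print(summary)
```

On the full data this would be `shape_summary(track_data["BPM"])`, and the same helper works for any other numeric column.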

Objective Rating

Now let's move on to analyzing the data by Objective Rating. First, we'll look at the average Objective Rating of each album:

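# numeric_only=True skips the Title column, which newer pandas versions refuse to average
albums = track_data.groupby("Album", as_index=False).mean(numeric_only=True).sort_values(by="Date")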
fig, ax = plt.subplots()
ax.scatter(np.arange(len(albums)), albums["Objective Rating"])
ax.set_ylim(0, 10)
plt.xlabel("Approximate Release Order")
plt.ylabel("Average Objective Rating")
plt.title("Objective Ratings Over Time (Album Averages)")

# Hand-tuned label heights, nudged per album (yeah, this is cringe)
y_coords = [j + 0.2 for j in albums["Objective Rating"].reset_index(drop=True)]
y_coords[0] -= 0.8
y_coords[1] -= 0.8
y_coords[2] += 0.2
y_coords[3] += 0.6
y_coords[4] -= 0.8
y_coords[5] -= 1.2
y_coords[-1] -= 1

# Label each point with its album name at the hand-tuned height
for i, txt in enumerate(albums["Album"]):
    ax.annotate(txt, (i, y_coords[i]), ha="center", va="bottom", size=12)

There are some obvious outliers here: the "albums" MONOPOLY, boyfriend, and Put Your Hearts Up are just singles, which is why MONOPOLY and boyfriend manage to have such high ratings (and Put Your Hearts Up manages to suck). Similarly, Charlie's Angels is essentially a single, since the data contain only one track from that album. Positions (Deluxe) is also small in our data, because tracks repeated from Positions were not double-counted. The albums with fewer than five tracks are:

track_counts = track_data.groupby(["Album"]).count()
small_albums = track_counts[track_counts["Title"] < 5]

print(small_albums)

                    Title  Date  BPM  Objective Rating
Album                                                 
Charlie's Angels        1     1    1                 1
MONOPOLY                1     1    1                 1
Positions (Deluxe)      3     3    3                 3
Put Your Hearts Up      1     1    1                 1
boyfriend               1     1    1                 1
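
One way to keep the singles from distorting the album plot is to drop small albums before averaging. The sketch below assumes a cutoff of five tracks, matching the filter above. `album_means` is a hypothetical helper, and the rows in `sample` are copied from the printouts in this document rather than read from the full file.

```python
import pandas as pd

def album_means(df, min_tracks=5):
    """Average Date and Objective Rating per album, skipping albums with fewer than min_tracks tracks."""
    # Number of tracks in each row's album, aligned with df's index
    track_counts = df.groupby("Album")["Title"].transform("count")
    kept = df[track_counts >= min_tracks]
    return (
        kept.groupby("Album", as_index=False)[["Date", "Objective Rating"]]
        .mean()
        .sort_values(by="Date")
    )

# Stand-in rows taken from the printed output in this document (not the full data set).
sample = pd.DataFrame({
    "Title": ["Honeymoon Avenue", "Baby I", "Right There", "Tattooed Heart",
              "Lovin' It", "Popular Song", "sweetener", "MONOPOLY"],
    "Album": ["Yours Truly"] * 6 + ["Sweetener", "MONOPOLY"],
    "Date": [2013] * 6 + [2018, 2019],
    "Objective Rating": [5, 7, 4, 5, 6, 2, 2, 9],
})

means = album_means(sample)
print(means)
```

With the real data, `album_means(track_data)` could replace the `albums` frame in the plotting code, although the hand-tuned label offsets would then need redoing.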

Now let's look at how the Objective Ratings are distributed across individual tracks:

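# Count tracks per rating, filling in 0 for any rating from 1 to 10 that never occurs
rating_counts = track_data["Objective Rating"].value_counts().reindex(range(1, 11), fill_value=0)
plt.bar(rating_counts.index, rating_counts.values)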
plt.xlabel("Objective Rating")
plt.ylabel("Number of Tracks")
plt.title("Objective Ratings By Track")

The ratings look reasonably spread out, if a little left-skewed. Notably, some tracks are awful:

print(track_data.sort_values(by="Objective Rating").head(6))

           Title            Album  Date  BPM  Objective Rating
42    Step On Up  Dangerous Woman  2016  111                 1
24     Bang Bang    My Everything  2014  150                 1
10  Popular Song      Yours Truly  2013   99                 2
22   Hands On Me    My Everything  2014   98                 2
49     sweetener        Sweetener  2018  120                 2
35      Everyday  Dangerous Woman  2016  131                 2

...while some are bangers:

print(track_data.sort_values(by="Objective Rating", ascending=False).head(3))

        Title            Album  Date  BPM  Objective Rating
25     Only 1    My Everything  2014  101                10
89   MONOPOLY         MONOPOLY  2019  144                 9
36  Sometimes  Dangerous Woman  2016   78                 9
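
Both listings cut off at a fixed number of rows, so if several tracks share the rating at the cutoff, which of them appear depends on the sort order. pandas' `nsmallest` and `nlargest` accept `keep="all"` to include every tied row. Here is a minimal sketch, using only the six low-rated rows printed above as stand-in data:

```python
import pandas as pd

# The six lowest-rated rows printed above; not the full data set.
worst_rows = pd.DataFrame({
    "Title": ["Step On Up", "Bang Bang", "Popular Song",
              "Hands On Me", "sweetener", "Everyday"],
    "Objective Rating": [1, 1, 2, 2, 2, 2],
})

# Ask for the 3 lowest-rated tracks, but keep every track tied with the third one.
lowest = worst_rows.nsmallest(3, "Objective Rating", keep="all")
print(lowest)
```

The same idea applies to the top of the list with `track_data.nlargest(3, "Objective Rating", keep="all")`.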

That's the end of this phase of the analysis. In the next phase, we'll look at lyrics and maybe other features.