Created a script which analyze the streams since creating your account.
How to get the data: The folder that you have should contain your entire
streaming history data for the life of your account. This can be obtained
by pressing the Request data
button in
this website
if you are logged in to your account. Make sure to select the field that
says Select Extended streaming history
. After some weeks, you
will get an email with the extended streaming history. After downloading
it, you will have a zip file called my_spotify_data.zip
and
when opened the directory is MyData
.
The MyData
will contain a pdf file which details the contents
of the other files in the directory. The files we care about are the ones
that start with Streaming_History_Audio
and are json files.
The following is from the Understanding my Data. A list of items (e.g. songs, videos, and podcasts) listened to or watched during the lifetime of your account, including the following details:
ts
- Date and time of when the stream ended in UTC format
(Coordinated Universal Time zone).
username
- Your Spotify username.platform
- Platform used when streaming the track (e.g.
Android OS, Google Chromecast).
ms_played
- For how many milliseconds the track was played.
conn_country
- Country code of the country where the stream
was played.
ip_addr_decrypted
- IP address used when streaming the
track.
user_agent_decrypted
- User agent used when streaming the
track (e.g. a browser).
master_metadata_track_name
- Name of the track.master_metadata_album_artist_name
- Name of the artist,
band or podcast.
master_metadata_album_album_name
- Name of the album of the
track.
spotify_track_uri
- A Spotify Track URI, that is
identifying the unique music track.
episode_name
- Name of the episode of the podcast.episode_show_name
- Name of the show of the podcast.spotify_episode_uri
- A Spotify Episode URI, that is
identifying the unique podcast episode.
reason_start
- Reason why the track started (e.g. previous
track finished or you picked it from the playlist).
reason_end
- Reason why the track ended (e.g. the track
finished playing or you hit the next button).
shuffle
- Whether shuffle mode was used when playing the
track.
skipped
- Information whether the user skipped to the next
song.
offline
- Information whether the track was played in
offline mode.
offline_timestamp
- Timestamp of when offline mode was
used, if it was used.
incognito_mode
- Information whether the track was played
during a private session.
Example of the streaming data of one song can be seen in the image on top of this page
All the functions in main.py
have docstrings which contains
the parameters of the function and it also contains what is returned by
the function. Note that the columns of the pandas DataFrame returned by
get_all_data()
can be found
here
How to run main.py
in the terminal First, make sure that the
MyData
directory is in the same directory as
main.py
. After this, you should run
python3 main.py
and it will create a txt file called
analysis.txt
which will contain the analyzed data after
running the functions of main.py
using the data from the
MyData
directory.
The analysis.txt
file contains the following analysis of the
data from the MyData
directory:
The source code can be found here.