OUR Members

I am a junior at UC Berkeley majoring in Data Science with a focus on Business/Industrial Analytics. For this project, I am responsible for the first draft along with two other members. I also handle creating document files and adding the corresponding guidelines with Qianlin, writing part of the "About" page, and researching image and video sources.

I am a senior at UC Berkeley majoring in Data Science with an Economics emphasis. I am passionate about data analysis and data engineering. For this project, I was responsible for selecting the dataset, the quantitative analysis in the data visualizations, the data critique, and part of the data narrative.

I am a junior at UC Berkeley majoring in Economics and minoring in Data Science. I am interested in exploring the intersection of economic theory and data-driven decision-making, particularly in how big data and analytics can inform marketing and business strategies. For this project, I was responsible for web design, writing part of the About page, and researching image sources.

I am a junior at UC Berkeley majoring in Data Science and MCB. I am passionate about using data analysis to explore the trends behind data. I am working on the first draft of the project and also helping write the narrative for the final draft.

Un Ieng Sit

unsit@berkeley.edu

I am a junior at UC Berkeley majoring in Economics and Data Science. I am interested in data analysis and how it impacts the economy. For this project, I am responsible for writing the narrative and helping with other parts of the project when needed. 

I am a senior at UC Berkeley majoring in Data Science with an emphasis on Quantitative Social Science. I am passionate about shaping data and using it to conduct analyses. For this project, I was responsible for part of the narrative and helped search for datasets.

Everyone contributed to the Annotated Bibliography.

OUR Project

Assets

Our primary asset is the Spotify dataset ('genre_music.csv'), obtained from Kaggle, which spans 1960 to 2019. This dataset includes detailed information on musical attributes such as danceability, energy, popularity, and genre. We also integrated a secondary dataset ('singers_gender.csv') that adds gender information for artists, enabling a gender-based analysis of music trends.
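As a rough sketch of how the two files could be combined, the snippet below joins a toy songs table to a toy artist-gender table with pandas. The join key `artist` and the column `gender` are assumptions made for illustration, as are all the values; the actual column names in 'genre_music.csv' and 'singers_gender.csv' may differ.

```python
import pandas as pd

# Toy stand-in for 'genre_music.csv' (column names and values are assumed).
songs = pd.DataFrame({
    "artist": ["Artist A", "Artist B", "Artist C"],
    "danceability": [0.6, 0.8, 0.5],
    "popularity": [50, 70, 40],
})

# Toy stand-in for 'singers_gender.csv' (column names and values are assumed).
genders = pd.DataFrame({
    "artist": ["Artist A", "Artist B"],
    "gender": ["female", "male"],
})

# A left join keeps every song; artists missing from the gender table get NaN.
# validate="many_to_one" raises an error if an artist appears twice in `genders`.
merged = songs.merge(genders, on="artist", how="left", validate="many_to_one")

# Average attributes per gender; rows with an unknown gender are skipped.
by_gender = merged.groupby("gender")[["danceability", "popularity"]].mean()
print(by_gender)
```

A left join is used here so that songs without a gender label remain visible and can be counted, rather than silently disappearing from the analysis.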

Most of the content in our Narrative section draws on journals accessed through UCB Libraries and Google Scholar, with a smaller portion from history and music-related websites and news sites such as Vox. We combine these sources with our datasets and digital humanities theory, connecting traditional scholarly resources with contemporary digital tools. This integration lets us present multifaceted perspectives, grounded in data, that address both historical context and current scholarly discourse.

The text in each section describes that section's specific content, keeping the project clear and accessible for readers. The images in the Narrative relate to the historical events discussed, and they support and illustrate the text.

Program Services

Our website is created and supported by Wix.com.

We utilized Python and its libraries (pandas, seaborn, numpy, matplotlib) for data cleaning, analysis, and visualization. These tools were chosen for their robustness in handling large datasets and performing complex statistical analyses.

Interface

The project is presented on a web-based mini-site created using Wix. This platform was chosen for its user-friendly interface and flexibility in design, allowing us to effectively display our visualizations and narratives. 

This site has five sections: the Home page; Narrative, which introduces our methodology and research content; Data Critique, which explains our datasets; Annotated Bibliography, which lists our references; and this About page, which introduces our project, members, and acknowledgements.

Tech & Data Decision

We selected Python for its comprehensive data manipulation and visualization libraries. We used Pandas for data cleaning and manipulation, and Seaborn and Matplotlib for line plots, bar plots, and other visualizations of trends over time and genre-specific differences.

At the start of data cleaning, we ensured that critical fields (e.g., danceability, energy, popularity) had no missing values, to maintain data integrity. We also converted the 'decade' variable from categorical labels (e.g., '60s', '70s') to numerical values (e.g., 1960, 1970) to facilitate analysis and visualization.

Line plots show how danceability and energy have changed over the decades. These visualizations offer insight into changes in musical production and listener preferences, reflecting cultural shifts and technical improvements. Bar plots show the average danceability and energy levels for each genre. This analysis highlighted notable genre-specific differences: genres like EDM show higher danceability and energy, consistent with their use in dance contexts.

Finally, we combined the gender data with the main dataset to construct bar plots of average danceability and popularity by gender. These visualizations revealed disparities between male, female, and non-binary artists, highlighting persisting gender inequalities within the music industry.
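The cleaning and aggregation steps described above could be sketched roughly as follows. This is a minimal illustration on a tiny in-memory table, not the project's actual code. The column names (`decade`, `genre`, `danceability`, `energy`, `popularity`) are assumed from the attributes named above, the values are made up, and the helper `decade_to_year` is hypothetical. It also assumes the 2000s and 2010s are labelled '00s' and '10s'.

```python
import pandas as pd

# Toy rows standing in for 'genre_music.csv'; all values are made up.
songs = pd.DataFrame({
    "decade": ["60s", "60s", "70s", "10s", "10s"],
    "genre": ["rock", "pop", "rock", "edm", "pop"],
    "danceability": [0.50, 0.60, 0.55, 0.80, None],
    "energy": [0.40, 0.50, 0.60, 0.90, 0.70],
    "popularity": [30, 40, 45, 70, 65],
})


def decade_to_year(label: str) -> int:
    """Map a label such as '60s' to 1960 or '10s' to 2010 (hypothetical helper)."""
    two_digits = int(label.rstrip("s"))
    # The data spans 1960-2019, so 60-90 fall in the 1900s and 00-10 in the 2000s.
    century = 1900 if two_digits >= 60 else 2000
    return century + two_digits


# Drop rows that are missing any of the critical fields.
critical = ["danceability", "energy", "popularity"]
clean = songs.dropna(subset=critical).copy()

# Replace the categorical decade labels with numeric years.
clean["decade"] = clean["decade"].map(decade_to_year)

# Averages behind the line plots (by decade) and the bar plots (by genre).
by_decade = clean.groupby("decade")[["danceability", "energy"]].mean()
by_genre = clean.groupby("genre")[["danceability", "energy"]].mean()
print(by_decade)
print(by_genre)
```

With Seaborn and Matplotlib installed, `seaborn.lineplot(data=by_decade)` would draw one line per attribute across decades, and `by_genre.plot.bar()` would draw grouped bars per genre.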

Acknowledgements 

  • Dr. Scott Caddy and all GSIs: Hannah Ellis, Rosa Norton, and Anooj Kansara. Thank you for your patience and guidance on the project.

  • UCB Libraries Search and Google Scholar for access to journals and articles.

  • Kaggle.com for providing useful datasets.

  • Wix's templates and features for building our project website.

  • Thanks to every groupmate for contributing to this project.
