A benchmark dataset for cover version identification.

progsi/SHS-YT

SHS-YT

Code for the analysis of SHS-YT, a dataset of videos crawled from YouTube based on seed songs in SHS100K-Test.

Getting Started

We recommend using our conda environment. Create and activate it with:

conda env create -f env.yml;
conda activate shs-yty

Content

Data

  • the directory data contains our annotated dataset SHS-YT and the benchmark sets (SHS-YT combined with the corresponding songs for cliques from SHS100K-Test). Other datasets are Da-TACOS and SHS100K.
  • the subdir data/annotations contains expert and worker comments
  • the subdir data/preds contains the square similarity matrix per model
  • the subdir features contains a sample of the audio features
  • figs contains the GUI of the MTurk experiment
  • documentation contains descriptions for our classes
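
As an illustration of how a square similarity matrix such as those in data/preds can be scored, here is a minimal sketch of mean average precision, a metric commonly used in cover song identification. It is not the code from benchmark.ipynb: the function name, the in-memory inputs and the toy values are assumptions, and the on-disk format of the prediction files is not covered. Row i of the matrix is assumed to hold the similarities of item i to every item, with higher meaning more similar.

```python
from typing import Hashable, Sequence


def mean_average_precision(
    sim: Sequence[Sequence[float]], clique_ids: Sequence[Hashable]
) -> float:
    """Mean average precision over all queries that have another version.

    sim is an n x n similarity matrix (higher = more similar) and
    clique_ids[i] identifies the work (clique) that item i belongs to.
    Both inputs are hypothetical stand-ins for the repository's data.
    """
    n = len(sim)
    if any(len(row) != n for row in sim) or len(clique_ids) != n:
        raise ValueError("sim must be n x n and clique_ids must have length n")

    average_precisions = []
    for q in range(n):
        # Rank every other item by descending similarity to the query.
        candidates = [j for j in range(n) if j != q]
        ranked = sorted(candidates, key=lambda j: -sim[q][j])

        hits = 0
        precisions = []
        for rank, j in enumerate(ranked, start=1):
            if clique_ids[j] == clique_ids[q]:
                hits += 1
                precisions.append(hits / rank)  # precision at this hit

        # Queries without any other version in the set are skipped.
        if precisions:
            average_precisions.append(sum(precisions) / len(precisions))

    if not average_precisions:
        raise ValueError("no query has another version in the set")
    return sum(average_precisions) / len(average_precisions)


# Toy example: items 0 and 1 form one clique, items 2 and 3 another.
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
clique_ids = [0, 0, 1, 1]
print(mean_average_precision(sim, clique_ids))  # prints 1.0
```

The function only relies on `sim[q][j]` indexing, so a NumPy array loaded from disk should work in place of the nested list.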

Download and Extract

To download and extract the relevant features for the cover song identification (CSI) task, you can use the YTFeatureExtractor repository (https://github.com/progsi/YTFeatureExtractor). For example, to download the large benchmark dataset SHS-SEED+YT, with its list file at BENCHMARK_CSV_PATH, run:

python extract_list.py --listfile BENCHMARK_CSV_PATH -i YOUR_DATA_DIR

Code

This directory contains several notebooks for analyzing the data.

  • benchmark.ipynb benchmarks the datasets (Table 5 in the paper)
  • statistics.ipynb contains basic statistics, KDEs, etc.
  • curation_analysis.ipynb contains a more in-depth analysis of the ambiguity annotations
  • pairs_analysis.ipynb contains the analyses behind Table 6, Table 7 and Figure 5 in the paper, plus some additional analyses

Uncertainty Class

| Uncertainty | Applies for | Description |
| --- | --- | --- |
| Song: Difficult Cover | Version | Strong changes in melody, harmony, timbre and rhythm, which are expected in cover song identification. During annotation, stronger changes in these characteristics make classification difficult for a human annotator, especially if the annotator does not know the song. |
| Song: Drum-Only | Version & Non-Version | Only the drum track. Typically either isolated by automatic sound source separation, covered by a drummer or programmed in a drum engine. |
| Song: Instrumental | Version & Non-Version | A version without the vocal track. Typically a karaoke version or a backing track. Might be generated by automatic sound source separation. |
| Song: Mashup/Remix | Version & Non-Version | A song which contains samples from the query song. The samples might be whole sections (typically the chorus) or just very short melodic lines. |
| Song: Medley | Version & Non-Version | A song which contains (typically sections of) multiple songs. One of the songs is (a section of) the query song. |
| Song: Same Artist | Non-Version | A different song by the same artist. |
| Song: Same Genre | Non-Version | A different song from the same genre. |
| Song: Similar | Non-Version | A different song that is musically similar in terms of melody, harmony, timbre, rhythm etc. |
| Song: Single Instrument | Version & Non-Version | A song which includes only the stem of a single harmonic instrument. The instruments that appear to occur most often are the piano and the guitar. Typically, either someone covers the query song by playing it themselves or the stem performance is programmed (e.g. piano roll representation). |
| Song: Slowed/Spedup | Version & Non-Version | The query song, but sped up or slowed down. |
| Song: Vocal-Only | Version & Non-Version | Only the vocal stem of the query song. Either isolated automatically by sound source separation or an a cappella cover. |
| Video: In-Background | Version & Non-Version | The query song appears in the background with foreground noise such as crowd noise, speech or mixed noise (e.g. in a movie or show scene). |
| Video: Low Fidelity | Version & Non-Version | The query song is presented with low fidelity. |
| Video: Multiple Songs | Version & Non-Version | Multiple songs besides the query song are contained in the video. Typical examples are concert performances or tributes. |
| Video: Similar Metadata | Non-Version | A rather obvious non-cover of the query song with rather similar metadata (especially song title and artist name), which might confuse the annotator. |
| Video: With Non-Music | Version & Non-Version | The query song is contained in the video but is interrupted, preceded and/or followed by non-music noise. |
| Placeholder: No Music | No Music | Placeholder class for videos which do not contain any music. |
| Placeholder: Non-Ambiguous | Version & Non-Version | Placeholder class for songs which were not perceived as ambiguous. |
| Placeholder: Unavailable | All | Placeholder class for videos that were unavailable on YouTube at the time of curation. |
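
The "Applies for" information above can be written down as a lookup table, for example to sanity-check annotations. The sketch below is only an illustration: the names `APPLIES_FOR` and `is_consistent` are hypothetical, and the exact label strings stored in the annotation files are an assumption that may differ from the ones used here.

```python
from typing import Dict, FrozenSet, Optional

BOTH = frozenset({"Version", "Non-Version"})
VERSION_ONLY = frozenset({"Version"})
NON_VERSION_ONLY = frozenset({"Non-Version"})

# Uncertainty class -> labels it applies to (None means "All").
# The label spellings are assumed, not taken from the data files.
APPLIES_FOR: Dict[str, Optional[FrozenSet[str]]] = {
    "Song: Difficult Cover": VERSION_ONLY,
    "Song: Drum-Only": BOTH,
    "Song: Instrumental": BOTH,
    "Song: Mashup/Remix": BOTH,
    "Song: Medley": BOTH,
    "Song: Same Artist": NON_VERSION_ONLY,
    "Song: Same Genre": NON_VERSION_ONLY,
    "Song: Similar": NON_VERSION_ONLY,
    "Song: Single Instrument": BOTH,
    "Song: Slowed/Spedup": BOTH,
    "Song: Vocal-Only": BOTH,
    "Video: In-Background": BOTH,
    "Video: Low Fidelity": BOTH,
    "Video: Multiple Songs": BOTH,
    "Video: Similar Metadata": NON_VERSION_ONLY,
    "Video: With Non-Music": BOTH,
    "Placeholder: No Music": frozenset({"No Music"}),
    "Placeholder: Non-Ambiguous": BOTH,
    "Placeholder: Unavailable": None,
}


def is_consistent(uncertainty: str, label: str) -> bool:
    """Return True if the label is allowed for the given uncertainty class.

    Raises KeyError for an uncertainty class that is not listed above.
    """
    allowed = APPLIES_FOR[uncertainty]
    return allowed is None or label in allowed


print(is_consistent("Song: Medley", "Version"))       # prints True
print(is_consistent("Song: Same Artist", "Version"))  # prints False
```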