6. Data visualization
The library provides the ability to visualize content similarities between audio recordings.
To this end, a [d3js] chord-diagram representation has been adopted.
The core visualization functionality is provided in audioVisualization.py, and in particular in the function visualizeFeaturesFolder().
This function uses dirWavFeatureExtraction() to extract the long-term features for each of the WAV files contained in the provided folder.
Then, dimensionality reduction is performed using either the PCA or the LDA method. Since LDA is supervised, the required labels are taken from the subcategories of the input files (if available). These are provided through the respective filenames, using the string --- as a separator.
For example, if the folder contains the files:
Radiohead --- Lucky.wav
Radiohead --- Karma Police.wav
The Smashing Pumpkins --- Perfect.wav
The Smashing Pumpkins --- Rhinocerous.wav
then the labels 0, 0, 1, 1 are assigned to the features of the respective files.
In this context, the first part of the filename (if the separator exists) defines the "group" (or general category) of the respective recording.
In the example above, the groups are Radiohead for the first two recordings and The Smashing Pumpkins for the last two.
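The filename-to-label mapping described above can be sketched as follows. This is a minimal illustration, not the library's own code: the helper name group_labels is hypothetical, and treating a file without the separator as its own group is an assumption of this sketch.

```python
import os

SEPARATOR = "---"  # separator string named in the text above


def group_labels(filenames, separator=SEPARATOR):
    """Map each filename to an integer label based on its group prefix.

    Hypothetical helper: the group is the part of the filename before the
    separator. A file without the separator is treated as its own group,
    which is an assumption of this sketch.
    """
    group_names = []  # groups in order of first appearance
    labels = []
    for name in filenames:
        # Drop any directory part and the file extension.
        stem = os.path.splitext(os.path.basename(name))[0]
        # Text before the first separator, or the whole stem if none exists.
        group = stem.split(separator, 1)[0].strip()
        if group not in group_names:
            group_names.append(group)
        labels.append(group_names.index(group))
    return labels, group_names


files = [
    "Radiohead --- Lucky.wav",
    "Radiohead --- Karma Police.wav",
    "The Smashing Pumpkins --- Perfect.wav",
    "The Smashing Pumpkins --- Rhinocerous.wav",
]
labels, groups = group_labels(files)
print(labels)  # [0, 0, 1, 1]
print(groups)  # ['Radiohead', 'The Smashing Pumpkins']
```

Labels are assigned in order of first appearance, which reproduces the 0, 0, 1, 1 labeling of the example.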
Note that during the conversion of MP3 to WAV (see function convertDirMP3ToWav()) the MP3 tags can be used to generate WAV filenames with the artist tag in the first part of the filename, just like the example above.
Once the dimension of the feature space has been reduced, a similarity matrix is computed in the reduced space. Thresholding this similarity matrix yields a graph that illustrates the content similarities between the recordings. This graph is represented using a self-similarity matrix.
Command-line example:
python audioAnalysis.py featureVisualization -i MusicData/
The above functionality results in the following diagrams: (a) one based on the reduced 2D space (using either PCA or LDA), and (b) a self-similarity matrix of either the groups or the individual items.
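The similarity and thresholding steps can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the 2D coordinates are made-up stand-ins for the output of PCA or LDA, cosine similarity is assumed as the metric (the library's choice may differ), the threshold value is arbitrary, and the function names are hypothetical.

```python
import math


def cosine_similarity_matrix(points):
    """Pairwise cosine similarity between points in the reduced space."""
    norms = [math.sqrt(sum(x * x for x in p)) for p in points]
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(points[i], points[j]))
            denom = norms[i] * norms[j]
            # Guard against zero-length vectors.
            sim[i][j] = dot / denom if denom > 0 else 0.0
    return sim


def similarity_edges(sim, threshold):
    """Keep only the pairs (i, j), i < j, whose similarity reaches the threshold."""
    n = len(sim)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if sim[i][j] >= threshold]


# Made-up 2D coordinates standing in for four recordings after reduction.
reduced = [(1.0, 0.1), (0.9, 0.2), (-0.2, 1.0), (-0.1, 0.8)]
sim = cosine_similarity_matrix(reduced)
edges = similarity_edges(sim, threshold=0.9)
print(edges)  # [(0, 1), (2, 3)]
```

Each surviving pair is an edge of the similarity graph; here the first two and the last two recordings are linked, while no cross-pair reaches the threshold.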
pyAudioAnalysis - Theodoros Giannakopoulos