Scripts and additional images for the article "Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model"
You can find the visualizations of the embeddings produced by the CNN component in pics/cnn_*.svg
You can find the high-resolution visualizations of the embeddings produced by the pre-trained Transformer component in pics/pre_*.[svg/eps]
You can find the high-resolution visualizations of the embeddings produced by the fine-tuned Finnish Transformer component in pics/fine_*.[svg/eps]
Pictures marked with utt2age show how age information is embedded in the models, pictures marked with utt2speaker show how well the models differentiate between speakers, and pictures marked with utt2gender visualize the gender information in the embeddings.
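
The naming scheme above can be used to sort the pictures by model component and label. The following Python snippet is only a minimal sketch and is not one of the scripts in this repository. It assumes that each file name starts with cnn_, pre_ or fine_ and contains one of the markers utt2age, utt2speaker or utt2gender somewhere after the prefix. The helper names describe and list_pictures are hypothetical, and so is the file name in the usage comment.

```python
from pathlib import Path

# Prefixes and markers taken from the naming scheme described above.
COMPONENTS = {
    "cnn": "CNN component",
    "pre": "pre-trained Transformer",
    "fine": "fine-tuned Finnish Transformer",
}
LABELS = ("utt2age", "utt2speaker", "utt2gender")


def describe(filename):
    """Return (component, label) for a picture file name.

    Either element is None if the name does not match the assumed scheme.
    """
    name = Path(filename).name
    prefix = name.split("_", 1)[0]
    component = COMPONENTS.get(prefix)
    label = next((lab for lab in LABELS if lab in name), None)
    return component, label


def list_pictures(pics_dir="pics"):
    """Map every .svg/.eps file in pics_dir to its (component, label) pair."""
    return {
        path.name: describe(path.name)
        for path in sorted(Path(pics_dir).glob("*"))
        if path.suffix in {".svg", ".eps"}
    }


# Hypothetical usage; the actual file names in pics/ may differ:
# describe("pre_utt2age.svg")
```

Running list_pictures() from the repository root gives a quick overview of which component and which label each picture belongs to. Files that do not follow the assumed scheme are reported with None.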