This repository contains scripts related to the "Underrepresented speech dataset from open data: case study on the Romanian language (USPDATRO)" project. The project's web page is available at:
The main goal of this project was to investigate whether open data (available under Creative Commons licenses) can be used to construct a speech dataset for the Romanian language that covers underrepresented speech types.
The dataset is available on the following platforms: Zenodo, ELG and RELATE: