UGSpeechData - Audio speech dataset of 5 Ghanaian languages - Akan, Ewe, Dagbani, Dagaare, and Ikposo

The dataset comprises a 5,000-hour speech corpus in Akan, Ewe, Dagbani, Dagaare, and Ikposo. Each language includes 1,000 hours of audio speech from indigenous speakers of the language and 100 hours of transcription.

Column | Description |
---|---|
IMAGE_URL | Provides the relative path to the image in the images folder |
IMAGE_SRC_URL | Provides the source path to the actual image online |
AUDIO_URL | Provides the relative path to the local-language audio file in the Local Audio folder |
ORG_NAME | Identifies the institution coordinating the audio collection |
PROJECT_NAME | Provides the name of the project |
SPEAKER_ID | Provides the ID number of the individual describing the image |
LOCALE | Provides the IETF BCP 47 language tag of the local language used in the audio file |
GENDER | Provides the gender of the individual providing the audio description |
AGE | Provides the age of the individual providing the audio description |
DEVICE | Identifies the device on which the audio was recorded |
ENVIRONMENT | Identifies the space within which the audio was recorded |
YEAR | Provides the year in which the audio was recorded |

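The columns listed above can be read with Python's standard `csv` module. The following is a minimal sketch, assuming the metadata is distributed as a comma-separated file with a header row; the file layout, the helper names `read_metadata` and `filter_by_locale`, and every sample value are hypothetical and used for illustration only.

```python
import csv
import io

# Column names taken from the metadata description.
COLUMNS = [
    "IMAGE_URL", "IMAGE_SRC_URL", "AUDIO_URL", "ORG_NAME", "PROJECT_NAME",
    "SPEAKER_ID", "LOCALE", "GENDER", "AGE", "DEVICE", "ENVIRONMENT", "YEAR",
]

# Hypothetical sample rows: all values are invented, not taken from the dataset.
SAMPLE_CSV = "\n".join([
    ",".join(COLUMNS),
    "images/img_0001.jpg,https://example.com/img_0001.jpg,audio/0001.wav,"
    "Example Org,Example Project,spk001,ak_gh,female,34,phone,indoor,2023",
    "images/img_0002.jpg,https://example.com/img_0002.jpg,audio/0002.wav,"
    "Example Org,Example Project,spk002,ee_gh,male,27,phone,outdoor,2023",
])


def read_metadata(stream):
    """Read metadata rows as dictionaries and check the expected columns exist."""
    reader = csv.DictReader(stream)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"Missing columns: {sorted(missing)}")
    return list(reader)


def filter_by_locale(rows, locale):
    """Keep only the rows whose LOCALE value matches the given locale ID."""
    return [row for row in rows if row["LOCALE"] == locale]


rows = read_metadata(io.StringIO(SAMPLE_CSV))
akan_rows = filter_by_locale(rows, "ak_gh")
print(len(rows), len(akan_rows))
```

To use this with a real metadata file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to wherever the metadata file is stored.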

Locale ID | Name |
---|---|
ak_gh | Akan |
dga_gh | Dagbani |
dag_gh | Dagaare |
ee_gh | Ewe |
kpo_gh | Ikposo |

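The locale IDs can be turned into language names with a simple lookup. This is a minimal sketch that copies the mapping exactly as listed in this document; the names `LOCALE_NAMES` and `language_name` are hypothetical, and the case-insensitive matching is an assumption made for convenience.

```python
# Mapping copied from the locale list in this document.
LOCALE_NAMES = {
    "ak_gh": "Akan",
    "dga_gh": "Dagbani",
    "dag_gh": "Dagaare",
    "ee_gh": "Ewe",
    "kpo_gh": "Ikposo",
}


def language_name(locale_id):
    """Return the language name for a locale ID, ignoring case and outer whitespace."""
    key = locale_id.strip().lower()
    if key not in LOCALE_NAMES:
        raise ValueError(f"Unknown locale ID: {locale_id!r}")
    return LOCALE_NAMES[key]


print(language_name("ee_gh"))
```

Raising an error for unknown IDs, rather than returning a default, makes unexpected `LOCALE` values visible early when processing the metadata.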
Wiafe, I., Abdulai, J., Ekpezu, A. O., Dodzi, R., Atsakpo, E. D., Nutrokpor, C., Winful, F. B. P., & Solaga, K. K. (2023). UGSPEECHDATA (Version 1.0.0) [Data set]. https://github.com/isaacwiafe/speech_data_ug