An open source platform to submit any kind of media file
Bhashini DataDaan is a portal that enables government entities and public sector undertakings (PSUs) to submit any kind of media file (audio, video, text, PDF, etc.). These files can be transformed into rich datasets (Parallel, ASR, OCR, etc.), which can be made available in ULCA and, in parallel, used to power ML models.
- The actual media files should be compressed into an archive (zip or gz).
- The platform supports a maximum zip file size of 5GB.
- The metadata file can be a txt file (although it is free text, we strongly encourage keeping it structured and precise).
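The constraints above can be checked locally before uploading. The following is a minimal sketch, not part of DataDaan itself: the function name `check_submission`, the constants, and the reading of "5GB" as 5 * 1024**3 bytes are all assumptions made for illustration, and the portal may apply its own rules.

```python
from pathlib import Path

# Assumption: "5GB" is read as 5 * 1024**3 bytes; the portal may count differently.
MAX_ARCHIVE_BYTES = 5 * 1024**3
ARCHIVE_SUFFIXES = {".zip", ".gz"}
METADATA_SUFFIX = ".txt"


def check_submission(archive_name: str, archive_size: int, metadata_name: str) -> list[str]:
    """Return human-readable problems; an empty list means the local checks passed.

    Hypothetical helper: it only inspects file names and a size in bytes,
    and does not open the archive or contact any DataDaan API.
    """
    problems = []
    # Only the final suffix is compared, so "media.tar.gz" counts as ".gz".
    if Path(archive_name).suffix.lower() not in ARCHIVE_SUFFIXES:
        problems.append("media archive must be a .zip or .gz file")
    if archive_size > MAX_ARCHIVE_BYTES:
        problems.append("media archive exceeds the 5GB limit")
    if Path(metadata_name).suffix.lower() != METADATA_SUFFIX:
        problems.append("metadata file is expected to be a .txt file")
    return problems
```

For files on disk, the size can be taken from `Path("media.zip").stat().st_size` and passed in as `archive_size`.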
The APIs used in DataDaan are specified in OpenAPI 3 format under SwaggerHub Specs.