NEL

Notes on NEL training

The baseline NEL models are built on top of the baseline NER models. No separate training is done for NEL.

Notes on NEL evaluation

For evaluation, two hyperparameters are tuned on the dev set: offset and incl_blank. offset is a fixed duration by which the predicted time stamps are shifted. incl_blank is a Boolean that decides whether the trailing blank tokens in the CTC emissions are considered part of the predicted segment. When incl_blank is True, the whole segment between the start and end word separator tokens is taken as the hypothesis.
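As a rough illustration of how the two hyperparameters could act on a single word, here is a minimal sketch. It is not the repository's implementation: the blank symbol `_`, the function name `word_segment`, and the 20 ms frame duration are all assumptions made for this example.

```python
BLANK = "_"  # hypothetical symbol for the CTC blank token

def word_segment(frames, start_sep, end_sep, incl_blank, offset, frame_dur=0.02):
    """Return (start_time, end_time) in seconds for one word.

    frames     : per-frame argmax tokens from the CTC emissions
    start_sep  : frame index of the word separator before the word
    end_sep    : frame index of the word separator after the word
    incl_blank : keep trailing blank frames in the segment if True
    offset     : fixed shift (seconds) added to both boundaries
    frame_dur  : assumed duration of one frame in seconds
    """
    first = start_sep + 1
    last = end_sep - 1
    if not incl_blank:
        # Drop blank frames at the end of the word.
        while last > first and frames[last] == BLANK:
            last -= 1
    start_time = first * frame_dur + offset
    end_time = (last + 1) * frame_dur + offset
    return start_time, end_time

frames = list("|hh_e__|")
print(word_segment(frames, 0, 7, incl_blank=True, offset=0.0))
print(word_segment(frames, 0, 7, incl_blank=False, offset=0.0))
```

With incl_blank set to False, the segment ends at the last non-blank frame, so the second call returns an earlier end time than the first.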

The word-F1 metric is evaluated with a tolerance hyperparameter ρ. The tolerance ρ, a value between 0 and 1, is the fraction of overlap between a ground-truth word segment and the predicted region needed to count the word as detected; ρ = 1 means a perfect match is required.
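The detection criterion can be sketched as follows. This only illustrates the overlap test for a single ground-truth word, not the full word-F1 computation. The function names are hypothetical, and the sketch assumes the predicted segments do not overlap one another, so their overlaps with the word can simply be summed.

```python
def overlap_fraction(gt_word, pred_segments):
    """Fraction of a ground-truth word (start, end) covered by predictions.

    Assumes pred_segments is a list of non-overlapping (start, end) pairs.
    """
    gt_start, gt_end = gt_word
    duration = gt_end - gt_start
    if duration <= 0:
        return 0.0
    covered = 0.0
    for p_start, p_end in pred_segments:
        # Length of the intersection, clipped at zero when disjoint.
        covered += max(0.0, min(gt_end, p_end) - max(gt_start, p_start))
    return covered / duration

def is_detected(gt_word, pred_segments, tolerance):
    """A word counts as detected when its covered fraction reaches the tolerance."""
    return overlap_fraction(gt_word, pred_segments) >= tolerance

word = (1.0, 2.0)
pred = [(1.5, 3.0)]          # covers the second half of the word
print(is_detected(word, pred, tolerance=0.5))
print(is_detected(word, pred, tolerance=1.0))
```

In this example half of the word is covered, so it counts as detected at a tolerance of 0.5 but not at 1.0.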

End-to-end model

Time stamps are extracted using CTC emissions from the E2E NER model. The frames between start and end special characters constitute the detected entity segment.
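A minimal sketch of this extraction, assuming the emissions have already been reduced to one argmax token per frame. The start and end symbols (`[` and `]`) and the function name are placeholders chosen for the example, not the special characters used by the actual model.

```python
def entity_segments(frames, start_tokens, end_token):
    """Return (first_frame, end_frame_exclusive) spans lying between
    an entity start token and the next entity end token."""
    segments = []
    open_at = None
    for i, tok in enumerate(frames):
        if tok in start_tokens:
            # CTC may repeat a token over several frames; keep the last one.
            open_at = i
        elif tok == end_token and open_at is not None:
            segments.append((open_at + 1, i))
            open_at = None  # ignore repeated end-token frames
    return segments

frames = list("__[[ab_c]]__[d]")
print(entity_segments(frames, start_tokens={"["}, end_token="]"))
# [(4, 8), (13, 14)]
```

Frame indices can be converted to seconds by multiplying by the model's frame duration.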

Step 1: Extract CTC emissions from the E2E NER model and save character-level timestamps.

bash baselines/nel/decode.sh e2e_ner dev

Step 2: Hyperparameter search on dev.

bash baselines/nel/eval_nel.sh e2e
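Conceptually, the search amounts to scoring every combination of offset and incl_blank on dev and keeping the best one. The sketch below shows that idea only. The candidate offsets and the `score_fn` callable are hypothetical stand-ins and do not reflect what eval_nel.sh actually runs.

```python
from itertools import product

def grid_search(score_fn, offsets, incl_blank_options=(True, False)):
    """Return ((offset, incl_blank), score) for the best-scoring combination.

    score_fn is any callable (offset, incl_blank) -> dev-set score,
    where higher is better.
    """
    best_params, best_score = None, float("-inf")
    for offset, incl_blank in product(offsets, incl_blank_options):
        score = score_fn(offset, incl_blank)
        if score > best_score:
            best_params, best_score = (offset, incl_blank), score
    return best_params, best_score

# Toy scoring function standing in for a real dev-set evaluation.
def toy_score(offset, incl_blank):
    return -(offset - 0.02) ** 2 + (0.1 if incl_blank else 0.0)

print(grid_search(toy_score, offsets=[-0.04, -0.02, 0.0, 0.02, 0.04]))
```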

Pipeline model

Time stamps are extracted using CTC emissions from the ASR model. The frames corresponding to the entity phrase as detected by the text NER model constitute the detected entity segment.
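The mapping from text NER output back to time can be sketched as below. It assumes the character-level timestamps have already been grouped into per-word (word, start, end) tuples and that the NER model returns entities as inclusive word-index spans. Both data layouts and the function name are assumptions for illustration.

```python
def entity_time_spans(words, entities):
    """Map entity word-index spans to time segments.

    words    : list of (word, start_time, end_time) from the ASR alignment
    entities : list of (label, first_word_idx, last_word_idx), inclusive
    """
    spans = []
    for label, first, last in entities:
        start_time = words[first][1]  # start of the first word in the phrase
        end_time = words[last][2]     # end of the last word in the phrase
        spans.append((label, start_time, end_time))
    return spans

words = [
    ("barack", 0.10, 0.45),
    ("obama", 0.50, 0.90),
    ("visited", 1.00, 1.40),
    ("paris", 1.50, 1.95),
]
entities = [("PERSON", 0, 1), ("PLACE", 3, 3)]
print(entity_time_spans(words, entities))
# [('PERSON', 0.1, 0.9), ('PLACE', 1.5, 1.95)]
```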

TODO: add the evaluation scripts in table format.

pipeline-w2v2

ASR model: wav2vec2.0 finetuned for ASR

Text NER model: DeBERTa-Base finetuned for NER

Step 1: Extract CTC emissions from the ASR model and save character-level timestamps.

bash baselines/nel/decode.sh asr dev

Step 2: Hyperparameter search on dev.

bash baselines/nel/eval_nel.sh ppl

pipeline-oracle

Perfect ASR: this setting assumes access to the ground-truth (GT) transcripts, so the predicted time stamps are the same as the GT force-aligned time stamps.

Evaluate the output of the text NER model on dev.

bash baselines/nel/eval_nel.sh oracle_ppl