Analysis of manual annotation of gendered and gender biased language in archival metadata descriptions using the brat rapid annotation tool.
Gendered and Gender Biased Language
├── Person Name
│ ├── Unknown
│ ├── Non-binary
│ ├── Feminine
│ └── Masculine
├── Linguistic
│ ├── Generalization
│ ├── Gendered Pronoun
│ └── Gendered Role
└── Contextual
  ├── Empowering
  ├── Occupation
  ├── Omission
  └── Stereotype
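For use in analysis code, the taxonomy above can be written as a small lookup structure. The following is a minimal sketch: the label and category names are copied from the tree, while the dictionary layout and the names `TAXONOMY`, `ALL_LABELS`, and `category_of` are illustrative assumptions, not identifiers taken from this repository's notebooks.

```python
# Labels and categories copied from the taxonomy tree; the layout is illustrative.
TAXONOMY = {
    "Person Name": ["Unknown", "Non-binary", "Feminine", "Masculine"],
    "Linguistic": ["Generalization", "Gendered Pronoun", "Gendered Role"],
    "Contextual": ["Empowering", "Occupation", "Omission", "Stereotype"],
}

# Flat list of every label, in tree order.
ALL_LABELS = [label for labels in TAXONOMY.values() for label in labels]


def category_of(label: str) -> str:
    """Return the top-level category that a label belongs to."""
    for category, labels in TAXONOMY.items():
        if label in labels:
            return category
    raise KeyError(f"label not in taxonomy: {label!r}")


if __name__ == "__main__":
    for label in ALL_LABELS:
        print(f"{category_of(label)} / {label}")
```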
annot/
├── AnnotationInstructions.docx
├── data/
│ ├── analysis_data/ (**hidden in GitHub repo**)
│ ├── iaa/
│ └── sample/
├── notebooks/
│ ├── aggregating_data/
│ ├── analyzing_data/
│ ├── cleaning_metadata/
│ └── preparing_data/
├── .gitignore
└── README.md
AnnotationInstructions.docx: instructions given to the annotators for labeling archival metadata descriptions in brat (includes the annotation taxonomy)
data:
- data/sample: directory with a sample of the annotated data as a CSV file
- data/iaa: inter-annotator agreement scores per annotator and per label
- Note: annotated data will be uploaded to this directory after further analysis
notebooks: code written to prepare, aggregate, and analyze the annotated data, and to clean additional metadata fields associated with the annotated data (e.g., date of material, language of material)
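To illustrate what a per-label agreement score can look like, here is a minimal sketch that compares two annotators' spans with exact-match F1. This is an assumption for illustration only: this README does not state which agreement metric the files in data/iaa contain, and the span tuples, annotator names, and `per_label_f1` function below are hypothetical, not taken from the notebooks.

```python
from collections import defaultdict


def per_label_f1(reference, candidate):
    """Exact-match F1 per label between two lists of (start, end, label) spans.

    A candidate span counts as a match only if a reference span has the same
    start offset, end offset, and label.
    """
    ref_by_label = defaultdict(set)
    cand_by_label = defaultdict(set)
    for start, end, label in reference:
        ref_by_label[label].add((start, end))
    for start, end, label in candidate:
        cand_by_label[label].add((start, end))

    scores = {}
    for label in set(ref_by_label) | set(cand_by_label):
        ref, cand = ref_by_label[label], cand_by_label[label]
        matched = len(ref & cand)
        precision = matched / len(cand) if cand else 0.0
        recall = matched / len(ref) if ref else 0.0
        total = precision + recall
        scores[label] = 2 * precision * recall / total if total else 0.0
    return scores


# Made-up character-offset spans, for illustration only.
annotator_a = [(0, 3, "Gendered Pronoun"), (10, 14, "Gendered Role"), (20, 28, "Occupation")]
annotator_b = [(0, 3, "Gendered Pronoun"), (10, 14, "Gendered Role"), (30, 38, "Occupation")]

scores = per_label_f1(annotator_a, annotator_b)
for label in sorted(scores):
    print(f"{label}: {scores[label]:.2f}")
```

Exact matching is strict: two spans that overlap but differ by a single character count as a disagreement, so a relaxed (overlap-based) match may be preferable depending on the analysis.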
- Data source: Archives Online, Centre for Research Collections, University of Edinburgh
- Dataset preparation repository: annot-prep
- Publications:
  - Research methodology: Situated Data, Situated Systems: A Methodology to Engage with Power Relations in Natural Language Processing Research
  - Annotation taxonomy and data creation: Uncertainty and Inclusivity in Gender Bias Annotation: An Annotation Taxonomy and Annotated Datasets of British English Text
Creative Commons Attribution 4.0 International (CC BY 4.0)
@inproceedings{havens-etal-2022-uncertainty,
title = "Uncertainty and Inclusivity in Gender Bias Annotation: An Annotation Taxonomy and Annotated Datasets of {B}ritish {E}nglish Text",
author = "Havens, Lucy and
Terras, Melissa and
Bach, Benjamin and
Alex, Beatrice",
booktitle = "Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP)",
month = jul,
year = "2022",
address = "Seattle, Washington",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.gebnlp-1.4",
pages = "30--57"
}