Each row contains the label, the toxic/non-toxic annotation ratio (from 3 annotators), and the tweet ID, separated by tabs. For example:
1[tab][3/0][tab]tweet_id
The labels are:
- 1: Toxic
- 0: Non-Toxic
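The row format above can be read with a few lines of Python. This is a minimal sketch, not part of the released corpus: the names `LabeledTweet` and `parse_row` are hypothetical, and it assumes the ratio is written as toxic votes first, then non-toxic votes, with the square brackets around it being optional.

```python
from typing import NamedTuple


class LabeledTweet(NamedTuple):
    """One corpus row (hypothetical container, not defined by the dataset)."""
    label: int           # 1 = toxic, 0 = non-toxic
    toxic_votes: int     # annotators who marked the tweet toxic
    nontoxic_votes: int  # annotators who marked the tweet non-toxic
    tweet_id: str        # kept as a string so large IDs are never altered


def parse_row(line: str) -> LabeledTweet:
    """Parse a row of the form label<TAB>[toxic/nontoxic]<TAB>tweet_id."""
    label_text, ratio_text, tweet_id = line.rstrip("\r\n").split("\t")
    label = int(label_text)
    if label not in (0, 1):
        raise ValueError(f"unexpected label: {label_text!r}")
    # Assumption: the brackets around the ratio may or may not be literal,
    # so strip them if present before splitting on the slash.
    toxic_text, nontoxic_text = ratio_text.strip("[]").split("/")
    return LabeledTweet(label, int(toxic_text), int(nontoxic_text), tweet_id)
```

A whole file could then be loaded with something like `[parse_row(l) for l in open(path, encoding="utf-8") if l.strip()]`, where `path` points to the label file.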
These are the 44 keywords that we used to collect the tweets via the Twitter Search API.
Each row contains a toxic keyword and its meaning, separated by a tab. For example:
Thai toxic word[tab]original meaning/toxic meaning
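The keyword rows can be parsed in the same way. This is a rough sketch under stated assumptions: `parse_keyword_row` is a hypothetical helper, and it assumes that the slash separates the original meaning from the toxic meaning and that a row may carry only one meaning.

```python
from typing import List, Tuple


def parse_keyword_row(line: str) -> Tuple[str, List[str]]:
    """Parse a row of the form keyword<TAB>original meaning/toxic meaning."""
    # Split only on the first tab so the meaning text is kept whole.
    keyword, meaning = line.rstrip("\r\n").split("\t", 1)
    # Assumption: meanings are slash-separated; a row without a slash
    # yields a single-element list instead of raising an error.
    senses = [sense.strip() for sense in meaning.split("/")]
    return keyword, senses
```

Returning the meanings as a list avoids guessing which sense is missing when a row contains no slash.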
In Proceedings of the Second Workshop on Text Analytics for Cybersecurity and Online Safety 2018 (to appear).
http://cl.sd.tmu.ac.jp/thaitoxicity/
This project is licensed under the terms of the Creative Commons license.