CPSC 440 project on Speaker Identification in Novels Using Large Language Models, by Richard Han, Charles Yan, and Mu-Chen Liu.
Dialogue attribution, in the context of text analysis, is the task of associating each piece of dialogue with the correct speaker in a conversation or narrative text. It is an important task in fields such as natural language processing (NLP), literary analysis, and dialogue systems. Automating it is challenging, especially in texts where multiple characters interact closely or where there are few clues identifying the speaker. In recent years, large language models (LLMs) have gained significant attention for their ability to handle a wide range of natural language tasks with high proficiency. In this paper, we experimentally evaluate the dialogue attribution capabilities of LLMs in a zero-shot setting. The Mistral 7B Instruct model achieves the highest overall accuracy, while the Llama 2 Chat model struggles to follow the instructions in the prompt. We observe a positive correlation between attribution performance and the amount of context provided, particularly when the speaker is not explicitly named in the dialogue. Additionally, the Mistral 7B Instruct model's performance plateaus when identifying speakers whose names appear directly in the dialogue.
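To make the zero-shot setup concrete, the sketch below shows one way a dialogue attribution query could be posed to an instruction-tuned model through the Hugging Face transformers text-generation pipeline. The excerpt, prompt wording, checkpoint name, and decoding settings are illustrative assumptions, not the exact prompts or configuration used in our experiments.

# Minimal sketch of zero-shot dialogue attribution with an instruction-tuned LLM.
# The prompt wording, excerpt, and decoding settings are assumptions for illustration.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # assumed checkpoint name
)

# Illustrative excerpt and target quotation (not taken from the evaluation data).
passage = (
    '"I could not sleep at all last night," said Anne, pressing a hand to her brow. '
    'Marilla glanced up from her knitting. "You worry too much, child."'
)
quote = '"You worry too much, child."'

# Zero-shot prompt: the model receives only the instructions, the excerpt, and the
# quotation, with no labeled examples of the task.
prompt = (
    "You are given an excerpt from a novel and one quotation from it.\n\n"
    f"Excerpt:\n{passage}\n\n"
    f"Quotation: {quote}\n\n"
    "Who speaks the quotation? Answer with the character's name only."
)

output = generator(prompt, max_new_tokens=10, do_sample=False)
# The pipeline returns the prompt plus the continuation; keep only the continuation.
print(output[0]["generated_text"][len(prompt):].strip())  # expected answer: Marilla

In this setup, varying how much surrounding text is placed in the excerpt is what we mean by the amount of provided context; the predicted name is then compared against the gold speaker label to compute attribution accuracy.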