Using Large Language Models to Annotate Complex Cases of Social Determinants of Health in Longitudinal Clinical Records
Social Determinants of Health (SDoH) such as housing insecurity are known to be intricately linked to patients’ health status. Large language models (LLMs) have shown potential for performing complex annotation tasks on unstructured clinical notes. Our work assessed the performance of LLMs (GPT-4 and GPT-3.5) in identifying temporal aspects of housing insecurity, and compared performance between original and de-identified notes.
Our 2024 medRxiv preprint can be found here.
- Compared with GPT-3.5 and a named entity recognition (NER) model, GPT-4 performed best, and it achieved much higher recall than human annotators in identifying patients experiencing current or past housing instability.
- In most cases, the evidence output by GPT-4 was similar or identical to that of human annotators, with no evidence of hallucination in any GPT-4 output.
- GPT-4 precision improved slightly on de-identified versions of the same notes, while recall dropped.
Recall and precision metrics for GPT-4 and GPT-3.5 for each housing label.
Comparison of GPT-4 performance in identifying general housing status or current and past housing instability from three different versions of the same set of 539 notes. Notes were either complete (original) patient notes, completely de-identified notes, or de-identified notes with no date shift.
We compared the ability of GPT-3.5 and GPT-4 to identify instances of both current and past housing instability, as well as general housing status, in 25,217 notes from 795 pregnant women. Results were compared with manual annotation, a named entity recognition (NER) model, and regular expressions (RegEx). The detailed annotation guidelines are here.
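For orientation, a minimal sketch of a RegEx baseline is shown below. The patterns and the `find_housing_mentions` helper are hypothetical illustrations, not the expressions used in the study.

```python
import re

# Hypothetical example patterns; the study's actual regular expressions are not shown here.
HOUSING_PATTERNS = [
    r"\bhomeless(?:ness)?\b",
    r"\bhousing\s+(?:instability|insecurity)\b",
    r"\bliv(?:es|ing)\s+in\s+(?:a\s+)?(?:shelter|car|motel)\b",
    r"\bunstabl[ey]\s+hous(?:ed|ing)\b",
]

def find_housing_mentions(note_text: str) -> list[str]:
    """Return any housing-related phrases matched in a clinical note."""
    matches = []
    for pattern in HOUSING_PATTERNS:
        matches.extend(m.group(0) for m in re.finditer(pattern, note_text, flags=re.IGNORECASE))
    return matches

print(find_housing_mentions("Pt reports housing instability; currently living in a shelter."))
```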
GPT-4 version 0613 had a 32K token context window, while GPT-3.5 Turbo version 0613 had a 16K token context window. Both GPT models were run using the LangChain and OpenAI libraries. The script for running the final prompt is in GPT_prompt.py.
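As a rough sketch of how a GPT-4 0613 call can be made through LangChain's OpenAI chat wrapper, see below. The system prompt, label names, and temperature setting here are placeholders rather than the study's configuration; the actual prompt is in GPT_prompt.py.

```python
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage

# Pin a GPT-4 0613 model with a 32K context window for long longitudinal notes.
llm = ChatOpenAI(model="gpt-4-32k-0613", temperature=0)

# Placeholder instructions; the full annotation prompt lives in GPT_prompt.py.
system_prompt = (
    "You are annotating clinical notes for housing status. "
    "Label the note as CURRENT, PAST, or GENERAL housing instability (or NONE), "
    "and quote the supporting evidence verbatim."
)

note_text = "..."  # one patient's note text

response = llm.invoke([
    SystemMessage(content=system_prompt),
    HumanMessage(content=note_text),
])
print(response.content)
```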
The John Snow Labs (JSL) NER model is designed to detect and label SDoH entities within text data. Its housing-specific label covers entities related to the conditions of the patient's living spaces, for example: homeless, housing, and small apartment.
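A minimal sketch of running a pretrained SDoH NER pipeline with Spark NLP for Healthcare is shown below. The model name `ner_sdoh`, the pipeline stages, and the license placeholder are assumptions for illustration and are not taken from this repository; a licensed JSL installation is required.

```python
from pyspark.ml import Pipeline
from sparknlp.base import DocumentAssembler
from sparknlp.annotator import SentenceDetector, Tokenizer, WordEmbeddingsModel
from sparknlp_jsl.annotator import MedicalNerModel, NerConverterInternal
import sparknlp_jsl

# Requires a licensed Spark NLP for Healthcare installation; the secret is a placeholder.
spark = sparknlp_jsl.start("<license_secret>")

document = DocumentAssembler().setInputCol("text").setOutputCol("document")
sentences = SentenceDetector().setInputCols(["document"]).setOutputCol("sentence")
tokens = Tokenizer().setInputCols(["sentence"]).setOutputCol("token")
embeddings = (WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")
              .setInputCols(["sentence", "token"]).setOutputCol("embeddings"))
# Model name is illustrative; substitute the SDoH NER model used in the study.
ner = (MedicalNerModel.pretrained("ner_sdoh", "en", "clinical/models")
       .setInputCols(["sentence", "token", "embeddings"]).setOutputCol("ner"))
ner_chunks = (NerConverterInternal()
              .setInputCols(["sentence", "token", "ner"]).setOutputCol("ner_chunk"))

pipeline = Pipeline(stages=[document, sentences, tokens, embeddings, ner, ner_chunks])
df = spark.createDataFrame([["Patient is homeless and currently staying in a shelter."]]).toDF("text")
results = pipeline.fit(df).transform(df)
results.selectExpr("explode(ner_chunk) as chunk").show(truncate=False)
```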
The data used in this study contains identifiable protected health information and therefore cannot be shared publicly. Investigators from Providence Health and Services and affiliates (PHSA) with an appropriate IRB approval can contact the authors directly regarding data access.
For further information or queries, please contact:
- Alexandra Ralevski (alexandraDOTralevskiATgmailDOTcom)
- Michael Nossal (mikenossalATgmailDOTcom)
- Nadaa Taiyab (nadaaDOTtaiyabATgmailDOTcom)