Abstract
Many clinically significant health conditions in older adults are underreported or recorded only in unstructured health records, even though these records contain valuable information for patient care and prognosis. This study used 10.6 million free-text entries from the electronic health records of 102,525 patients aged 50–80 across various care settings in Finland from 2010 to 2022. A deep learning-based natural language processing model performed named entity recognition (NER) to identify falls, incontinence, loneliness, and mobility limitations in the free-text entries. The performance of the NER models was evaluated using precision, recall, and F1 scores. Diagnostic codes for incontinence and falls were collected for comparison. Cox regression models were used to assess the predictive value of the identified conditions for all-cause mortality. The NER models achieved recall, precision, and F1 scores greater than 0.80 for all four health conditions. Compared with diagnostic codes, NER identified more onsets of falls (31,987 vs 4,090) and incontinence (7,059 vs 3,873) and yielded greater hazard ratios for all-cause mortality: 1.31 vs 1.04 for falls and 1.99 vs 0.65 for incontinence. Deep learning-based NER models present new opportunities to identify vulnerable patients in free-text health records.
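The evaluation metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state how predicted entities were matched to annotations, so this example assumes exact matching on span boundaries and label. The function name, the label strings, and the toy spans are hypothetical and are not taken from the study.

```python
from typing import Set, Tuple

# An entity is (start offset, end offset, label). Exact-match scoring is an
# assumption for this sketch; the study's matching criterion is not given.
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Compute entity-level precision, recall, and F1 under exact matching."""
    true_positives = len(gold & predicted)
    false_positives = len(predicted - gold)
    false_negatives = len(gold - predicted)

    predicted_total = true_positives + false_positives
    gold_total = true_positives + false_negatives

    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / gold_total if gold_total else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy annotations (hypothetical, for illustration only).
gold_entities = {(0, 4, "FALL"), (10, 22, "INCONTINENCE"), (30, 38, "LONELINESS")}
predicted_entities = {(0, 4, "FALL"), (10, 22, "INCONTINENCE"), (40, 45, "FALL")}

precision, recall, f1 = precision_recall_f1(gold_entities, predicted_entities)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case two of the three predicted entities match the annotations and one annotated entity is missed, so precision, recall, and F1 all equal 2/3.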
| Original language | English |
|---|---|
| Pages (from-to) | 341-347 |
| Number of pages | 7 |
| Journal | Computational and Structural Biotechnology Journal |
| Volume | 28 |
| Publication status | Published - 2025 |
| Publication type | A1 Journal article-refereed |
Keywords
- Electronic health records
- Falls
- Incontinence
- Loneliness
- Mobility limitations
- Natural language processing
Publication forum classification
- Publication forum level 1
ASJC Scopus subject areas
- Biotechnology
- Structural Biology
- Biophysics
- Biochemistry
- Genetics
- Computer Science Applications