Informal Dementia Forum Corpus (IDFC)

The corpus comprises 45,216 informal, unstructured conversational sentences scraped from an online dementia forum in which users share personal experiences, discuss challenges, and seek advice. The dataset consists of 775 questions and 5,571 answers. The collected sentences include personal and sensitive information related to patients’ histories. To preserve privacy, all identifiable information was replaced with anonymisation tags.

The forum texts were annotated with the BRAT annotation tool to create the Informal Dementia Forum Corpus (IDFC). The annotations use the following labels: Agitation, Verbal-aggressive, Verbal-nonaggressive, Physical-aggressive, Physical-nonaggressive, Cause, PwD (Person with Dementia), Family-Carer, People-Uncertain, Behavior-Uncertain, and Cause-Uncertain. In the IDFC dataset, the forum texts are stored in “.txt” files, while the corresponding annotations are stored in “.ann” files.
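
As a rough illustration of how the “.txt” and “.ann” pairs could be read, the sketch below parses text-bound annotations in the general BRAT standoff format (lines such as `T1<TAB>Label start end<TAB>covered text`). It is a minimal sketch, not the authors' tooling: the `Entity` class, the function names, and the sample sentence are hypothetical, and the assumption that each “.ann” file sits next to a “.txt” file with the same base name has not been checked against the IDFC files. Only entity ("T") lines are handled; relation, attribute, and note lines are skipped.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Entity:
    """One text-bound BRAT annotation (hypothetical helper class)."""
    ann_id: str                    # e.g. "T1"
    label: str                     # e.g. "Agitation", "PwD", "Cause"
    spans: list[tuple[int, int]]   # character offsets into the .txt file
    text: str                      # covered text as stored in the .ann file


def parse_ann(ann_text: str) -> list[Entity]:
    """Parse the text-bound ("T") lines of a BRAT .ann file."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # skip relations, attributes, notes, blank lines
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # malformed line; ignore rather than guess
        ann_id, type_and_offsets, covered = parts[0], parts[1], parts[2]
        label, offsets = type_and_offsets.split(" ", 1)
        # Discontinuous annotations separate fragments with ";"
        spans = []
        for fragment in offsets.split(";"):
            start, end = fragment.split()
            spans.append((int(start), int(end)))
        entities.append(Entity(ann_id, label, spans, covered))
    return entities


def load_document(txt_path: Path) -> tuple[str, list[Entity]]:
    """Load a text file and its annotations.

    Assumes (unverified for IDFC) that the .ann file shares the
    base name of the .txt file and that both are UTF-8 encoded.
    """
    text = txt_path.read_text(encoding="utf-8")
    ann_text = txt_path.with_suffix(".ann").read_text(encoding="utf-8")
    return text, parse_ann(ann_text)


if __name__ == "__main__":
    # Invented sample, not taken from the corpus.
    sample_text = "My mum was restless"
    sample_ann = "T1\tPwD 0 6\tMy mum\nT2\tAgitation 11 19\trestless\n"
    for entity in parse_ann(sample_ann):
        start, end = entity.spans[0]
        print(entity.label, repr(sample_text[start:end]))
```

Keeping the character offsets alongside the covered text makes it possible to check each annotation against the source text, which is useful if the anonymisation tags shifted any positions.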

Data and Resources

Additional Info

Field Value
Reason for dataset retention expiration
Expiration datetime
Source
Version
Author Sumaiya Suravee
Author Email Sumaiya Suravee
Maintainer Kristina Yordanova
Maintainer Email Kristina Yordanova