Abstract
We report on a research effort to create a corpus of clinical free text records enriched with annotation for symptoms of a particular disease (ovarian cancer). We describe the original data, the annotation procedure and the resulting corpus. The data (approximately 192K words) was annotated by three clinicians and a procedure was devised to resolve disagreements. We are using the corpus to investigate the amount of symptom-related information in clinical records that is not coded, and to develop techniques for recognizing these symptoms automatically in unseen text.
| Original language | English |
|---|---|
| Pages (from-to) | 43-50 |
| Number of pages | 8 |
| Journal | CEUR Workshop Proceedings |
| Volume | 744 |
| State | Published - 2010 |
| Externally published | Yes |
| Event | 3rd International Workshop on Health Document Text Mining and Information Analysis 2011, LOUHI 2011 - Bled, Slovenia. Duration: Jul 6 2011 → Jul 6 2011 |