PhenoCHF is an annotated corpus of documents drawn from two text types: narrative reports from electronic health records (EHRs) and literature articles. Medical doctors manually annotated it with detailed information about mentions of phenotype concepts and disease-phenotype relations.
The documents in PhenoCHF focus on a single medical condition, congestive heart failure (CHF). This focus is motivated by CHF's current standing as the world's deadliest disease. Nevertheless, our experiments with the corpus have demonstrated that it can be used to develop systems that recognise information about a wider range of diseases, and in a broader variety of text types, than those covered by PhenoCHF itself.