Normalization is the task of mapping a word or phrase in a document to a unique concept in an ontology, based on that concept's description in the ontology, after disambiguating potentially ambiguous surface words or phrases. This task has also been called entity disambiguation, record linkage, or entity linking.
Our sieve-based approach for the normalization of disorder mentions in biomedical data is described in the following paper.
Jennifer D’Souza and Vincent Ng. July 2015. Sieve-Based Entity Linking in the Biomedical Domain. In Proceedings of ACL-IJCNLP.
Our approach to disorder mention normalization is simple yet effective. Each sieve in our normalization system applies a distinct syntactic string transformation heuristic that converts a mention into one of its equivalent forms so that it can be matched against the form stored in the ontology. A pictorial depiction of how our sieve-based system works, using only a subset of the sieves in our full system, is provided below.
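To make the idea of a sieve concrete, the following is a minimal Python sketch and not the implementation described in the paper. The toy ontology, the concept IDs, the abbreviation table, and the three sieves (`exact_match`, `expand_abbreviation`, `plural_to_singular`) are all hypothetical, chosen only for illustration. The sketch also assumes that sieves are tried in a fixed order and that the first successful ontology match is returned.

```python
from typing import Callable, Dict, List, Optional

# Toy ontology (hypothetical): stored concept name -> made-up concept ID.
ONTOLOGY: Dict[str, str] = {
    "myocardial infarction": "C001",
    "kidney stone": "C002",
}

# Hypothetical abbreviation table used by one of the illustrative sieves.
ABBREVIATIONS: Dict[str, str] = {"mi": "myocardial infarction"}

# A sieve takes a mention and returns zero or more candidate equivalent forms.
Sieve = Callable[[str], List[str]]


def exact_match(mention: str) -> List[str]:
    """Propose the mention itself, lowercased and stripped of whitespace."""
    return [mention.lower().strip()]


def expand_abbreviation(mention: str) -> List[str]:
    """Propose the long form of a known abbreviation, if there is one."""
    key = mention.lower().strip()
    return [ABBREVIATIONS[key]] if key in ABBREVIATIONS else []


def plural_to_singular(mention: str) -> List[str]:
    """Propose a naive singular form by dropping a trailing 's'."""
    form = mention.lower().strip()
    return [form[:-1]] if form.endswith("s") else []


# Assumed ordering for this sketch: sieves are applied one after another.
SIEVES: List[Sieve] = [exact_match, expand_abbreviation, plural_to_singular]


def normalize(
    mention: str,
    sieves: List[Sieve] = SIEVES,
    ontology: Dict[str, str] = ONTOLOGY,
) -> Optional[str]:
    """Return the concept ID from the first sieve whose output matches the ontology."""
    for sieve in sieves:
        for candidate in sieve(mention):
            if candidate in ontology:
                return ontology[candidate]
    return None  # no sieve produced a form stored in the ontology


if __name__ == "__main__":
    for text in ["Myocardial Infarction", "MI", "kidney stones", "headache"]:
        print(text, "->", normalize(text))
```

In this sketch each sieve only proposes string forms, while the lookup against the ontology lives in `normalize`, so adding a sieve or reordering the list does not require touching the matching logic.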
This work was supported in part by NSF Grants IIS-1147644 and IIS-1219142. Any opinions, findings, or conclusions expressed above are those of the authors and do not necessarily reflect the views or official policies of NSF.
Questions, feedback, and suggestions for improvement are welcome by email.