Inducing Fine-Grained Semantic Classes via Hierarchical and Collective Classification
Altaf Rahman and Vincent Ng.
Proceedings of the 23rd International Conference on Computational Linguistics, pp. 931-939, 2010.
Click here for the PostScript or PDF version.
The talk slides are available here.
Abstract
Research in named entity recognition and mention detection has typically involved a fairly small number of semantic classes, which may not be adequate if semantic class information is intended to support natural language applications. Motivated by this observation, we examine the under-studied problem of semantic subtype induction, where the goal is to automatically determine which of a set of 92 fine-grained semantic classes a noun phrase belongs to. We seek to improve the standard supervised approach to this problem using two techniques: hierarchical classification and collective classification. Experimental results demonstrate the effectiveness of these techniques, whether they are applied in isolation or in combination with the standard approach.
Train-test split
Here are the lists of the names of the 200 files from the BBN Pronoun Coreference Corpus (LDC2005T33) that we used for training and testing. Note that we used only those files in LDC2005T33 that have corresponding .sense files in the LDC2008T04 corpus.
BibTeX entry
@InProceedings{Rahman+Ng:10a,
author = {Altaf Rahman and Vincent Ng},
title = {Inducing Fine-Grained Semantic Classes via Hierarchical and Collective Classification},
booktitle = {Proceedings of the 23rd International Conference on Computational Linguistics},
pages = {931--939},
year = 2010
}