Combining Sample Selection and Error-Driven Pruning for Machine
Learning of Coreference Rules
Vincent Ng and Claire Cardie.
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 55-62, 2002.
Abstract
Most machine learning solutions to noun phrase coreference resolution
recast the problem as a classification task. We examine three
potential problems with this reformulation, namely, skewed class
distributions, the inclusion of hard training instances, and
the loss of transitivity inherent in the original coreference
relation. We show how these problems can be handled via intelligent
sample selection and error-driven pruning of classification
rulesets. The resulting system achieves F-measures of 69.5 and
63.4 on the MUC-6 and MUC-7 coreference resolution data sets,
respectively, surpassing the performance of the best MUC-6 and MUC-7
coreference systems. In particular, the system outperforms the
best-performing learning-based coreference system to date.
BibTeX entry
@InProceedings{Ng+Cardie:02c,
author = {Vincent Ng and Claire Cardie},
title = {Combining Sample Selection and Error-Driven Pruning for Machine Learning of Coreference Rules},
booktitle = {Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing},
pages = {55--62},
year = 2002
}