Unsupervised Models for Coreference Resolution

Vincent Ng.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 640-649, 2008.


Note: The CEAF results reported in the version of the paper that appeared in the proceedings were in fact produced by a variant of the CEAF scoring program that removes all singleton clusters from both the key partition and the system partition before scoring. For the results produced by the CEAF scorer that conforms to the definition in Luo's HLT-EMNLP 2005 paper, see the updated version of the paper (PostScript or PDF).
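
To make the difference between the two scorer variants concrete, here is a minimal Python sketch of the singleton-removal step described in the note. It is an illustration only, not the scoring program used for the paper: the function name drop_singletons, the representation of a partition as a list of sets of mention ids, and the toy mentions are all assumptions. The CEAF alignment and scoring itself is not shown.

```python
def drop_singletons(partition):
    """Return a copy of the partition without single-mention clusters.

    Assumption: a partition is a list of sets of mention ids.
    """
    return [set(cluster) for cluster in partition if len(cluster) > 1]


# Hypothetical toy partitions over six mentions.
key = [{"m1", "m2"}, {"m3"}, {"m4", "m5", "m6"}]
response = [{"m1"}, {"m2", "m3"}, {"m4", "m5"}, {"m6"}]

# The variant scorer filters both partitions before scoring;
# a scorer following the original definition would score key and response as-is.
key_filtered = drop_singletons(key)
response_filtered = drop_singletons(response)
```

Because filtering changes which mentions take part in scoring, the two variants can give different scores for the same system output.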

Abstract

We present a generative model for unsupervised coreference resolution that views coreference as an EM clustering process. For comparison purposes, we revisit Haghighi and Klein's (2007) fully-generative Bayesian model for unsupervised coreference resolution, discuss its potential weaknesses and consequently propose three modifications to their model. Experimental results on the ACE data sets show that our model outperforms their original model by a large margin and compares favorably to the modified model.

BibTeX entry

@InProceedings{Ng:08a,
  author = {Vincent Ng},
  title = {Unsupervised Models for Coreference Resolution},
  booktitle = {Proceedings of EMNLP},
  pages = {640--649},
  year = 2008
}