Sajib Dasgupta

Department of Computer Science
University of Texas at Dallas
Advisor: Vincent Ng


Contact Information | News and Life | CV | Research

Software:
  • Unsupervised Word Segmentation: Morpheme++

  • More coming...

Datasets and Others:

  • Multifaceted Text Classification Datasets: Multifaceted Text

  • Multi-clustering Datasets Used in ICML/SIGIR 2010: Same as above (Multifaceted Text)

  • Our Unsupervised Morphological Segmentation Output: English, Bengali

  • Our Unsupervised Part-of-Speech Lexicon Induction Output: English, Bengali

  • Gold Standard Used for Unsupervised Morphological Segmentation: Bengali, Finnish and Turkish

  • Gold Standard Created for Unsupervised Part-of-Speech Lexicon Induction: Bengali

  • Some old papers: Here

  • Webpage on Google Drive: Here