CMDL: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset

Wanhong Huang, Yi Feng, Chuanyi Li, Honghan Wu, Jidong Ge, and Vincent Ng.
Findings of the Association for Computational Linguistics: ACL 2024, pp. 5895-5906, 2024.

Abstract

Legal Judgment Prediction (LJP) has attracted significant attention in recent years. However, previous studies have primarily focused on cases involving a single defendant, skipping multi-defendant cases because of their complexity and difficulty. To advance research on such cases, we introduce CMDL, a large-scale real-world Chinese Multi-Defendant LJP dataset, which consists of over 393,945 cases with nearly 1.2 million defendants in total. For performance evaluation, we propose case-level evaluation metrics designed for the multi-defendant scenario. Experimental results on CMDL show that existing state-of-the-art (SOTA) approaches exhibit weaknesses when applied to cases involving multiple defendants. We highlight several challenges that require attention and resolution.

BibTeX entry

@InProceedings{Huang+etal:24a,
  author = {Wanhong Huang and Yi Feng and Chuanyi Li and Honghan Wu and Jidong Ge and Vincent Ng},
  title = {{CMDL}: A Large-Scale Chinese Multi-Defendant Legal Judgment Prediction Dataset},
  booktitle = {Findings of the Association for Computational Linguistics: ACL 2024},
  pages = {5895--5906},
  year = 2024}