Publication Detailed Description
Exploring Metric Correlations for Legal Text Summarization Evaluation
Proceedings of the Twentieth International Conference on Artificial Intelligence and Law
Year (definitive publication)
2025
Language
English
Country
--
More Information
This publication is not indexed in Overton
Abstract
The rapid advancements in legal text summarization have not been matched by equivalent progress in evaluation metrics capable of assessing the quality of legal summaries. Traditional evaluation approaches, such as ROUGE, remain widely used despite their inability to capture semantic fidelity. While more recent metrics focus on semantic evaluation, their applicability to legal summarization has not been thoroughly tested, and their performance is highly dependent on embedding models and computational resources, particularly for long and complex legal texts. Furthermore, the absence of publicly available datasets with expert annotations hinders the development and validation of domain-specific evaluation methods. In this paper, we address these challenges by introducing the first publicly available dataset of Portuguese legal summaries, annotated by legal experts across multiple dimensions such as Coherence and Relevance. We use this dataset to systematically evaluate several recent evaluation metrics, comparing their performance against ROUGE, the standard metric for summarization tasks. Our analysis, based on Spearman correlation with human judgments, reveals that ROUGE-2 maintains the highest correlation across almost every evaluated dimension, outperforming more recent metrics, including semantic-based approaches. These results emphasize the challenges of adapting new evaluation frameworks to the legal domain and underscore the need for further research into metrics that can better capture domain-specific requirements.
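As an illustration of the kind of analysis the abstract describes, the following is a minimal sketch of how a ROUGE-2 style score could be correlated with expert ratings using Spearman's rank correlation. It is not the authors' implementation: the function names (`rouge2_f1`, `average_ranks`, `spearman`), the whitespace tokenization, and the toy summaries and ratings are all assumptions made for this example, and the published ROUGE toolkit applies additional preprocessing that is omitted here.

```python
from collections import Counter


def bigrams(text):
    """Count word bigrams using naive lowercase whitespace tokenization (an assumption)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f1(candidate, reference):
    """Simplified ROUGE-2: F1 over clipped bigram overlap between two texts."""
    cand, ref = bigrams(candidate), bigrams(reference)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum count per bigram
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (sx * sy)


# Toy data, invented for illustration only (not taken from the paper's dataset).
reference = "the court dismissed the appeal and upheld the sentence"
candidates = [
    "the court dismissed the appeal and upheld the sentence",
    "the court dismissed the appeal",
    "the appeal was considered by judges",
]
expert_relevance = [5, 4, 2]  # hypothetical expert ratings on a 1-5 scale

metric_scores = [rouge2_f1(c, reference) for c in candidates]
correlation = spearman(metric_scores, expert_relevance)
```

In a setup like this, one correlation would be computed per metric and per annotated dimension (for example Coherence or Relevance), and the metrics would then be compared by how closely their rankings follow the expert rankings.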
Acknowledgements
--
Keywords
Legal Automatic Evaluation, Semantic similarity, Lexical similarity, Spearman Correlation, Expert Evaluation Dataset
Funding Records
| Funding Reference | Funding Entity |
|---|---|
| 10.54499/UIDB/50021/2020 | FCT |
| C645008882-00000055 | EU (PRR) |