Syntactic nuclei in dependency parsing - A multilingual exploration

Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

Standard models for syntactic dependency parsing take words to be the elementary units that enter into dependency relations. In this paper, we investigate whether there are any benefits from enriching these models with the more abstract notion of nucleus proposed by Tesnière. We do this by showing how the concept of nucleus can be defined in the framework of Universal Dependencies and how we can use composition functions to make a transition-based dependency parser aware of this concept. Experiments on 12 languages show that nucleus composition gives small but significant improvements in parsing accuracy. Further analysis reveals that the improvement mainly concerns a small number of dependency relations, including nominal modifiers, relations of coordination, main predicates, and direct objects.
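The idea of a nucleus defined over Universal Dependencies can be illustrated with a minimal sketch that groups each content word with its function-word dependents. It covers only the grouping step, not the composition functions or the transition-based parser described above. The set of relations treated as nucleus-internal (`FUNCTIONAL_RELATIONS`), the function name `group_nuclei`, and the token tuple layout are assumptions made for illustration and are not taken from this record.

```python
from collections import defaultdict

# Assumption: UD relations treated as nucleus-internal (function-word relations).
FUNCTIONAL_RELATIONS = {"aux", "case", "cc", "clf", "cop", "det", "mark"}


def group_nuclei(tokens):
    """Group tokens into nuclei.

    tokens: list of (id, form, head, deprel) tuples for one sentence,
            with head == 0 for the root.
    Returns a dict mapping the id of each nucleus core (content word)
    to the word forms of the nucleus in sentence order.
    """
    by_id = {tok[0]: tok for tok in tokens}

    def core_of(tok_id):
        # Follow functional relations upward until a content word is reached.
        _, _, head, deprel = by_id[tok_id]
        while deprel.split(":")[0] in FUNCTIONAL_RELATIONS and head in by_id:
            tok_id = head
            _, _, head, deprel = by_id[tok_id]
        return tok_id

    nuclei = defaultdict(list)
    for tok_id in sorted(by_id):
        nuclei[core_of(tok_id)].append(by_id[tok_id][1])
    return dict(nuclei)


# Hypothetical example sentence with hand-written UD-style annotation.
sentence = [
    (1, "The", 2, "det"),
    (2, "dog", 5, "nsubj"),
    (3, "has", 5, "aux"),
    (4, "been", 5, "aux"),
    (5, "sleeping", 0, "root"),
    (6, "in", 8, "case"),
    (7, "the", 8, "det"),
    (8, "garden", 5, "obl"),
]
nuclei = group_nuclei(sentence)
```

In this sketch, each function word is attached to the nearest content word above it, so the example sentence yields three nuclei headed by "dog", "sleeping", and "garden".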

Original language: English
Title of host publication: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
Number of pages: 12
Publisher: Association for Computational Linguistics (ACL)
Publication date: 2021
Pages: 1376-1387
ISBN (Electronic): 9781954085022
Publication status: Published - 2021
Externally published: Yes
Event: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021 - Virtual, Online
Duration: 19 Apr 2021 - 23 Apr 2021

Conference

Conference: 16th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2021
City: Virtual, Online
Period: 19/04/2021 - 23/04/2021
Sponsor: Babelscape, Bloomberg Engineering, Facebook AI, Grammarly, LegalForce
Series: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference

Bibliographical note

Funding Information:
We thank Daniel Dakota, Artur Kulmizev, Sara Stymne and Gongbo Tang for useful discussions and the EACL reviewers for constructive criticism. We acknowledge the computational resources provided by CSC in Helsinki and Sigma2 in Oslo through NeIC-NLPL (www.nlpl.eu).

Publisher Copyright:
© 2021 Association for Computational Linguistics

ID: 366045841