Informatics and Applications
2025, Volume 19, Issue 3, pp 73-81
AUTOMATION OF ANNOTATING IMPLICIT DISCOURSE RELATIONS: CHALLENGES AND OPPORTUNITIES
- A. A. Goncharov
- P. V. Iaroshenko
Abstract
The article outlines the principal challenges encountered in automating the annotation of implicit discourse relations, analyzes their underlying causes, and suggests possible solutions. It examines the main stages of the process: (i) the extraction of examples with implicit discourse relations; (ii) the delimitation of relational argument boundaries; and (iii) the selection of features for annotating the extracted fragments. The results of applying the method of search with exclusion in parallel texts are presented along with a critical assessment of its limitations. Two factors that significantly hinder the automation of argument identification in text spans with implicit discourse relations are analyzed: the considerable variability in argument length and the noncontiguous nature of arguments, which may be interrupted by intervening tokens. A comprehensive analysis of methods for automating feature selection for linguistic data is provided. It is shown that even the processing of formal features may require the involvement of experts. Furthermore, while some semantic features are amenable to partial automation, others currently require manual annotation. The conclusions are illustrated by examples from the corpus.
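The two obstacles to argument identification named in the abstract (variable argument length and noncontiguous arguments) can be made concrete with a small data structure. The sketch below is illustrative only and is not the authors' annotation format: the `Argument` class, its method names, and the half-open token spans are assumptions, and the example sentence is invented rather than taken from the corpus.

```python
from dataclasses import dataclass
from typing import List, Tuple

Span = Tuple[int, int]  # half-open token interval [start, end); an assumed convention


@dataclass
class Argument:
    """Hypothetical container for one argument of a discourse relation."""
    spans: List[Span]

    def length(self) -> int:
        # Count only the tokens that belong to the argument itself.
        return sum(end - start for start, end in self.spans)

    def is_contiguous(self) -> bool:
        # Contiguous means there is no gap between consecutive spans.
        ordered = sorted(self.spans)
        return all(prev[1] == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))

    def text(self, tokens: List[str]) -> str:
        # Join the argument's own tokens, skipping any intervening ones.
        return " ".join(
            tok for start, end in sorted(self.spans) for tok in tokens[start:end]
        )


# Invented example: the first argument is interrupted by ", he said ,".
tokens = "The road , he said , was closed ; we turned back".split()
arg1 = Argument(spans=[(0, 2), (6, 8)])  # "The road" + "was closed"
arg2 = Argument(spans=[(9, 12)])         # "we turned back"

print(arg1.text(tokens), arg1.length(), arg1.is_contiguous())
print(arg2.text(tokens), arg2.length(), arg2.is_contiguous())
```

Storing an argument as a list of spans rather than a single start and end offset is one way to keep intervening tokens out of the argument while still recording where it sits in the sentence.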
About this article
Cover Date
2025-10-10
DOI
10.14357/19922264250309
Print ISSN
1992-2264
Publisher
Institute of Informatics Problems, Russian Academy of Sciences
Key words
linguistic annotation; discourse relations; logical-semantic relations; implicitness; parallel texts
Authors
A. A. Goncharov and P. V. Iaroshenko
Author Affiliations
Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Research Computing Center, Lomonosov Moscow State University, 1, bld. 4 Leninskie Gory, GSP-1, Moscow 119991, Russian Federation