Informatics and Applications

2021, Volume 15, Issue 2, pp 96-103

EXTRACTING KNOWLEDGE ABOUT MEANS OF EXPRESSION OF LOGICAL-SEMANTIC RELATIONS FROM THE SUPRACORPORA DATABASE

  • A. A. Goncharov
  • O. Yu. Inkova

Abstract

The goal of this paper is to demonstrate how parallel texts annotated with a supracorpora database (SCDB) can be used efficiently to extract knowledge about alternative means of expressing logical-semantic relations (LSRs). The authors review the most prominent discourse-annotated corpora (Penn Discourse Treebank, Prague Dependency Treebank, and Rhetorical Structure Theory Discourse Treebank) to support the observation that researchers have not reached a consensus on which linguistic means should be considered connectives (i.e., prototypical markers of LSRs) and which should be deemed "alternative." The research shows that applying the comparative method while leveraging the capabilities of the SCDB of connectives makes it possible not only to extract new knowledge about LSR markers but also to create thesauri of the various means of LSR expression, including the alternative ones, in the languages involved. In addition, the SCDB data make it possible to identify correlations between specific LSRs and unconventional means of expressing them and to calculate how frequently these means are used in the languages studied.
