Corpus ID: 219792405

Extraction and Evaluation of Formulaic Expressions Used in Scholarly Papers

Authors: Kenichi Iwatsuki, Florian Boudin and Akiko Aizawa
Formulaic expressions, such as 'in this paper we propose', are helpful for authors of scholarly papers because they convey communicative functions; in the above example, the function is 'showing the aim of this paper'. Thus, resources of formulaic expressions, such as a dictionary, that could be looked up easily would be useful. However, forms of formulaic expressions can often vary to a great extent. For example, 'in this paper we propose', 'in this study we propose' and 'in this paper we propose a new method to…
Communicative-Function-Based Sentence Classification for Construction of an Academic Formulaic Expression Database
This study considers a fully automated construction of a CF-labelled FE database using the top–down approach, in which the CF labels are first assigned to sentences, and then the FEs are extracted.


An Evaluation Dataset for Identifying Communicative Functions of Sentences in English Scholarly Papers
To show the usefulness of the dataset, a series of experiments were conducted that determined to what extent sentence representations acquired by recent models, such as word2vec and BERT, can be employed to detect communicative functions in sentences.
Using Formulaic Expressions in Writing Assistance Systems
This work proposes a new framework for semantic searches of FEs and a new method to leverage both existing dictionaries and domain sentence corpora, and expands an existing FE dictionary to consider building a more comprehensive and domain-specific FE dictionary.
Syntactic complexity in English as a lingua franca academic writing
This study complements previous research on linguistic features of English as a lingua franca (ELF) from a syntactic complexity perspective. Specifically, the present study seeks to find out…
Rhetorical Move Detection in English Abstracts: Multi-label Sentence Classifiers and their Annotated Corpora
MAZEA (Multi-label Argumentative Zoning for English Abstracts) is presented: a multi-label classifier that automatically identifies rhetorical moves in abstracts and allows a given sentence to be assigned as many labels as appropriate.
Formulaic language in L1 and L2 expert academic writing: Convergent and divergent usage
It is argued that the use of bundles by the L2 writers deviates from L1 norms, and it is concluded that, although they are expert writers, their formulaicity is ‘hybrid’, that is, largely, but not completely, native-like.
Feature Words of Moves in Scientific Abstracts
This study found, surprisingly, that negative feature words play a central role in improving prediction performance and perform 10% better than the baseline that uses all keywords.
An Academic Formulas List: New Methods in Phraseology Research
This research creates an empirically derived, pedagogically useful list of formulaic sequences for academic speech and writing, comparable with the Academic Word List (Coxhead 2000), called the…
A cross-disciplinary investigation of multi-word expressions in the moves of research article abstracts
Conformity to the epistemological orientations of academic disciplines is often reflected in the ways in which knowledge is constructed and communicated through certain linguistic features…
As can be seen: Lexical bundles and disciplinary variation
An important component of fluent linguistic production is control of the multi-word expressions referred to as clusters, chunks or bundles. These are extended collocations which appear more…
The purpose of this study is to: Connecting lexical bundles and moves in research article introductions
As the first step in analysing these expressions across the different sections of the research article, a group of lexical bundles was identified in a corpus of research article introductions; the analysis showed several new qualities for these expressions.