Corpus ID: 209324120

Text as Environment: A Deep Reinforcement Learning Text Readability Assessment Model

Authors: Hamid Mohammadi and Seyed Hossein Khasteh
Evaluating the readability of a text can significantly facilitate the precise expression of information in a written form. The formulation of text readability assessment demands the identification of meaningful properties of the text and correct conversion of features to the right readability level. Sophisticated features and models are being used to evaluate the comprehensibility of texts accurately. Still, these models are challenging to implement, heavily language-dependent, and do not… 
Arabic L2 readability assessment: Dimensionality reduction study
This paper presents an approach to automatically measure the readability of Arabic as a foreign language through a series of experiments, using a wide range of features plausibly relevant to readability, as found in the literature, and reducing them in subsequent experiments by eliminating features that appear to have little significance in readability prediction.
Supervised and Unsupervised Neural Approaches to Text Readability
This study exposes the strengths and weaknesses of supervised and unsupervised neural approaches, compares their performance to current state-of-the-art classification approaches to readability, which in most cases still rely on extensive feature engineering, and proposes possibilities for improvement.
Learning Syntactic Dense Embedding with Correlation Graph for Automatic Readability Assessment
This work proposes to incorporate linguistic features into neural network models by learning syntactic dense embeddings: it forms a correlation graph among the features and uses it to learn their embeddings, so that similar features are represented by similar embeddings.
Readability of Spanish e-government information
The automated evaluation of readability, through the analysis of different linguistic characteristics associated with a better understanding of the websites of the Spanish Government's administrative procedures, shows that the official Spanish Government websites have a high difficulty level.
Analysis of Posters for Informing the Population via Social Media during Covid-19: Ukrainian Network
The study revealed that psycholinguistic variables such as readability, imageability, concreteness, conceptual familiarity, semantic size, name agreement, image agreement, visual complexity, typicality, image variability, authenticity of texts, processing fluency, etc. are at the borderline between medium and high levels.
Trends, Limitations and Open Challenges in Automatic Readability Assessment Research
A brief survey of contemporary research on developing computational models for readability assessment identifies the common approaches, discusses their shortcomings, and identifies some challenges for the future.


A Machine Learning Approach to Persian Text Readability Assessment Using a Crowdsourced Dataset
This work introduces the first machine learning model for Persian text readability assessment and reports that the model is accurate and can assess the readability of Persian texts with a high degree of confidence.
An analysis of a French as a Foreign Language Corpus for Readability Assessment
The collection process of an annotated corpus of French as a foreign language texts with the purpose of training a readability model is described and it appears that, for some educational levels, the hypothesis of the annotation homogeneity must be rejected.
A framework for automatic question generation from text using deep reinforcement learning
This paper presents a novel deep reinforcement learning based framework for automatic question generation that significantly outperforms state-of-the-art systems on the widely-used SQuAD benchmark in both automatic and human evaluation.
Readability Assessment for Text Simplification
This work presents a readability assessment approach to support text simplification for poor-literacy readers, introducing a number of new features and experimenting with alternative ways to model the problem using machine learning methods, namely classification, regression and ranking.
Framework of Automatic Text Summarization Using Reinforcement Learning
It is demonstrated that the method of reinforcement learning can be adapted to automatic summarization problems naturally and simply, and other summarizing techniques, such as sentence compression, can be easily adapted as actions of the framework.
Rule-based and machine learning approaches for second language sentence-level readability
Methods and knowledge from machine learning-based readability research, from rule-based studies of Good Dictionary Examples and from second language learning syllabuses are merged to present approaches for the identification of sentences understandable by second language learners of Swedish, which can be used in automatically generated exercises based on corpora.
Text readability and intuitive simplification: A comparison of readability formulas
The results demonstrate that the Coh-Metrix L2 Reading Index performs significantly better than traditional readability formulas, suggesting that the variables used in this index are more closely aligned to the intuitive text processing employed by authors when simplifying texts.
Evaluating Neural Text Simplification in the Medical Domain
This paper introduces a dataset created by filtering aligned health sentences using expert knowledge from an existing aligned corpus and a novel, simple, language-independent monolingual text alignment method, and uses it to train a state-of-the-art neural machine translation model.
Revisiting the Readability Assessment of Texts in Portuguese
This paper presents experiments to build a readability checker to classify texts in Portuguese, considering different text genres, domains and reader ages, using naturally occurring texts.
Text Readability Assessment for Second Language Learners
A generalization method is applied to adapt models trained on larger native corpora to estimate text readability for learners, and domain adaptation and self-learning techniques are explored to make use of the native data to improve system performance on the limited L2 data.