Learning to Predict Readability using Diverse Linguistic Features (2010)

Rohit J. Kate, Xiaoqiang Luo, Siddharth Patwardhan, Martin Franz, Radu Florian, Raymond J. Mooney, Salim Roukos and Chris Welty

In this paper we consider the problem of building a system to predict the readability of natural-language documents. Our system is trained using diverse features based on syntax and language models, which are generally indicative of readability. Experimental results on a dataset of documents from a mix of genres show that the predictions of the learned system are more accurate than those of naive human judges when both are compared against the predictions of linguistically trained expert human judges. The experiments also compare the performance of different learning algorithms and different types of feature sets when used for predicting readability.

Citation:

In the 23rd International Conference on Computational Linguistics (COLING 2010), 2010.

People

Rohit Kate	Postdoctoral Alumni	katerj [at] uwm edu
Raymond J. Mooney	Faculty	mooney [at] cs utexas edu

Areas of Interest

Natural Language Processing

Labs

Machine Learning