Rohit J. Kate, Xiaoqiang Luo, Siddharth Patwardhan, Martin Franz, Radu Florian, Raymond J. Mooney, Salim Roukos and Chris Welty
In this paper we consider the problem of building a system to predict readability of natural-language documents. Our system is trained using diverse features based on syntax and language models which are generally indicative of readability. The experimental results on a dataset of documents from a mix of genres show that the predictions of the learned system are more accurate than the predictions of naive human judges when compared against the predictions of linguistically-trained expert human judges. The experiments also compare the performances of different learning algorithms and different types of feature sets when used for predicting readability
View:
PDF
Citation:
In 23rd International Conference on Computational Linguistics (COLING 2010) 2010.
Bibtex:

Presentation:
Slides (PPT)
Rohit Kate Postdoctoral Alumni katerj [at] uwm edu
Raymond J. Mooney Faculty mooney [at] cs utexas edu