Detecting Promotional Content in Wikipedia (2013)
This paper presents an approach for detecting promotional content in Wikipedia. By incorporating stylometric features, including features based on n-gram and probabilistic context-free grammar (PCFG) language models, we demonstrate improved accuracy in identifying promotional articles compared to using only lexical information and meta-features.
Citation:
In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), pp. 1851–1857, Seattle, WA, October 2013.
Shruti Bhosale, Formerly affiliated Master's Student, shruti [at] cs utexas edu
Raymond J. Mooney, Faculty, mooney [at] cs utexas edu
Heath Vinicombe, Formerly affiliated Master's Student, vini [at] cs utexas edu