Abstract
Information obtained from the Web is increasingly important for decision making and for everyday tasks. Due to the growth of uncertified sources, the blogosphere, comments on social media, and automatically generated texts, measuring the quality of textual information found on the Internet has become crucially important. It has been suggested that factual density can be used to measure the informativeness of text documents. However, this has only been shown for very specific texts, such as Wikipedia articles. In this work we turn to arbitrary Internet texts and show that factual density is applicable to measuring the informativeness of the textual content of arbitrary Web documents. For this, we compiled a human-annotated reference corpus to serve as ground-truth data for evaluating the automatic prediction of document informativeness. Our corpus consists of 50 documents randomly selected from the Web, which were ranked by 13 human annotators using the MaxDiff technique. We then ranked the same documents automatically using ExtrHech, an open information extraction system. The two rankings correlate, with Spearman's coefficient ρ = 0.41 at a significance level of 99.64%.
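As a minimal illustration of the evaluation described above (a sketch, not the authors' actual pipeline), the following Python fragment assumes per-document fact counts from an open information extraction system and token counts, computes factual density as facts per token, and correlates the resulting automatic ranking with a human ranking using Spearman's ρ. All numeric values are hypothetical placeholders.

    # Sketch only: compare a factual-density-based document ranking with a
    # human ranking via Spearman's rank correlation.
    from scipy.stats import spearmanr

    # Hypothetical data for five documents: number of facts extracted by an
    # open IE system (e.g., ExtrHech) and document length in tokens.
    fact_counts = [12, 3, 25, 7, 1]
    token_counts = [300, 150, 400, 280, 120]

    # Factual density = extracted facts per token (informativeness proxy).
    densities = [f / t for f, t in zip(fact_counts, token_counts)]

    # Hypothetical human ranks for the same documents (1 = most informative).
    human_ranks = [2, 4, 1, 3, 5]

    # spearmanr ranks its inputs internally; negate densities so that a
    # higher density corresponds to a better (smaller) rank, matching the
    # direction of the human ranking.
    rho, p_value = spearmanr([-d for d in densities], human_ranks)
    print(f"Spearman's rho = {rho:.2f}, p = {p_value:.3f}")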