Authors
Batyrshin Ildar
Sidorov Grigori
Martín del Campo Rodríguez Carolina
Title Enhancement of Performance of Document Clustering in the Authorship Identification Problem with a Weighted Cosine Similarity
Type Conference
Subtype SCOPUS
Description 17th Mexican International Conference on Artificial Intelligence, MICAI 2018
Abstract Distance and similarity measures are essential to solving many pattern recognition problems such as classification, information retrieval and clustering, where the use of a specific distance could lead to better performance than others. A weighted cosine distance is proposed that varies the weights of the exclusive attributes of the input vectors. An agglomerative hierarchical clustering of documents was used to compare the traditional cosine similarity with the one proposed in this paper. The modified measure resulted in an improvement in the formation of clusters.
Notes DOI: 10.1007/978-3-030-04497-8_4 Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 11289
Place Guadalajara
Country Mexico
Pages 49–56
Vol. / Ch.
Start 2018-10-22
End 2018-10-27
ISBN/ISSN 9783030044961