Authors
Markov Ilia
Gómez Adorno Helena Montserrat
Sidorov Grigori
Title Language- and Subtask-Dependent Feature Selection and Classifier Parameter Tuning for Author Profiling. Notebook for PAN at CLEF 2017
Type Conference
Subtype Proceedings
Description 18th Working Notes of CLEF Conference and Labs of the Evaluation Forum, CLEF 2017
Abstract We present the CIC’s approach to the Author Profiling (AP) task at PAN 2017. This year's task consists of two subtasks: gender and language variety identification in English, Spanish, Portuguese, and Arabic. We use typed and untyped character n-grams, word n-grams, and non-textual features (domain names). We experimented with various feature representations (binary, raw frequency, normalized frequency, log-entropy weighting, tf-idf), machine-learning algorithms (liblinear and libSVM implementations of Support Vector Machines (SVM), multinomial naive Bayes, ensemble classifier, meta-classifiers), and frequency threshold values. We adjusted system configurations for each of the languages and subtasks.
Notes
Place Dublin
Country Ireland
No. of pages 7 pp.
Vol. / Ch.
Start 2017-09-11
End 2017-09-14
ISBN/ISSN