Abstract
Malware families evolve constantly. To evade detection mechanisms, their authors reorder functions, insert random useless code, and add new features. It has been observed that these new variants share common characteristics with previous versions; these recurring patterns allow us to derive features that describe the malware, for later use in a machine learning algorithm. In this paper, we analyze different feature sets extracted from malicious portable executables and used as input to a machine learning algorithm. These features are extracted using n-grams and are used in three classification models: Logistic Regression, Random Forest, and Support Vector Machines.
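
As a rough illustration of the n-gram idea, the sketch below counts overlapping byte n-grams in a binary blob and turns the counts into relative-frequency vectors over a shared vocabulary. It assumes byte-level n-grams, which the text above does not specify (opcode n-grams would be handled the same way), and the helper names `byte_ngrams` and `to_feature_vector`, as well as the toy byte strings, are hypothetical and chosen only for this example.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in a binary blob."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # An input shorter than n yields an empty Counter.
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def to_feature_vector(counts: Counter, vocabulary: list[bytes]) -> list[float]:
    """Map n-gram counts to relative frequencies over a fixed vocabulary."""
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[gram] / total for gram in vocabulary]


# Toy byte strings standing in for two variants of one family (assumption:
# real inputs would be the contents of portable executable files).
sample_a = bytes.fromhex("4d5a900003000000")
sample_b = bytes.fromhex("4d5a900004000000")

counts_a = byte_ngrams(sample_a, n=2)
counts_b = byte_ngrams(sample_b, n=2)

# Vocabulary: every n-gram seen in either sample, in a stable order.
vocabulary = sorted(set(counts_a) | set(counts_b))
vector_a = to_feature_vector(counts_a, vocabulary)
vector_b = to_feature_vector(counts_b, vocabulary)
```

Vectors built this way, one per executable, could then serve as rows of the input matrix for classifiers such as Logistic Regression, Random Forest, or Support Vector Machines. The two toy samples differ in a single byte, so most of their n-grams coincide, which mirrors the observation that new variants keep patterns from earlier versions.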