A Preliminary Study of SMOTE on Imbalanced Big Datasets When Dealing with Sparse and Dense High Dimensionality
dc.contributor.author | Bolivar, Armando | |
dc.date.accessioned | 2022-08-02T18:14:29Z | |
dc.date.available | 2022-08-02T18:14:29Z | |
dc.date.issued | 2022-06-11 | es_MX |
dc.identifier.issn | 0302-9743 | |
dc.identifier.uri | http://cathi.uacj.mx/20.500.11961/22144 | |
dc.description.abstract | The interest in exploiting big datasets with machine learning has led to the adaptation of classic strategies to this new paradigm, which is characterized by volume, velocity, and variety. Because data quality is a determining factor in constructing a classifier, it has also been necessary to adapt or develop new data preprocessing techniques. One of the challenges of greatest interest is the class imbalance problem, where the class of interest has fewer examples than another class, called the majority class. To alleviate this problem, one of the most recognized techniques is SMOTE, which generates synthetic instances of the minority class through a process that uses the nearest neighbor rule and the Euclidean distance. Various articles have shown that SMOTE is not appropriate for datasets with high dimensionality. However, in big data, high-dimensional datasets often contain many zeros. Therefore, in this article, our objective is to analyze the behavior of SMOTE-BD on imbalanced big datasets with sparse and dense dimensionality. Experimental results using two classifiers and big datasets with different dimensionalities suggest that sparsity is a more predominant factor than dimensionality in the behavior of SMOTE-BD. | es_MX |
dc.language.iso | en_US | es_MX |
dc.publisher | Springer | es_MX |
dc.relation.ispartof | IIT research product | es_MX |
dc.relation.ispartof | Instituto de Ingeniería y Tecnología | es_MX |
dc.rights | Attribution-NonCommercial-ShareAlike 2.5 Mexico | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/2.5/mx/ | * |
dc.subject | Big Data | es_MX |
dc.subject | High Dimensionality | es_MX |
dc.subject | Class Imbalance | es_MX |
dc.subject.other | info:eu-repo/classification/cti/7 | es_MX |
dc.title | A Preliminary Study of SMOTE on Imbalanced Big Datasets When Dealing with Sparse and Dense High Dimensionality | es_MX |
dc.type | Full paper in conference proceedings | es_MX |
dcterms.thumbnail | http://ri.uacj.mx/vufind/thumbnails/rupiiit.png | es_MX |
dcrupi.instituto | Instituto de Ingeniería y Tecnología | es_MX |
dcrupi.cosechable | Yes | es_MX |
dcrupi.subtipo | Research | es_MX |
dcrupi.alcance | International | es_MX |
dcrupi.pais | Mexico | es_MX |
dc.contributor.coauthor | García, Vicente | |
dc.contributor.coauthor | Florencia, Rogelio | |
dc.contributor.coauthor | Rivera Zarate, Gilberto | |
dc.contributor.coauthor | Sánchez Solís, Julia Patricia | |
dc.contributor.alumno | 198665 | es_MX |
dcrupi.tipoevento | Conference | es_MX |
dcrupi.evento | Mexican Conference on Pattern Recognition (MCPR 2022) | es_MX |
dcrupi.estado | Chihuahua | es_MX |
dc.contributor.coauthorexterno | Alejo Eleuterio, Roberto | |
dcrupi.pronaces | Education | es_MX |
This item appears in the following collection(s)
- Full papers in conference proceedings [263]