Browse IIT by author "0000-0003-1053-4658"

A regression model based on the nearest centroid neighborhood

García, Vicente (2018-04-01)

The renowned k-nearest neighbor decision rule is widely used for classification tasks, where the label of any new sample is estimated based on a similarity criterion defined by an appropriate distance function. It has also ...

Addressing the Links Between Dimensionality and Data Characteristics in Gene-Expression Microarrays

Sánchez Garreta, Josep Salvador (ACM, 2018-05)

In gene-expression microarray data sets each sample is defined by hundreds or thousands of measurements. High-dimensionality data spaces have been reported as a significant obstacle to apply machine learning algorithms, ...

Dissimilarity-Based Linear Models for Corporate Bankruptcy Prediction

García, Vicente (2019-03-01)

Bankruptcy prediction has acquired great relevance for financial institutions due to the complexity of global economies and the growing number of corporate failures, especially since the world financial crisis of 2008. In ...

Estudio empírico del enfoque asociativo en el contexto de los problemas de clasificación

Sánchez, Laura Cleofas (2019)

Research by the scientific community has shown that the performance of classifiers depends not only on the learning rule, but also on the complexities inherent in ...

Exploring the synergetic effects of sample types on the performance of ensembles for credit risk and corporate bankruptcy prediction

García, Vicente (2019-05-01)

Credit risk and corporate bankruptcy prediction has widely been studied as a binary classification problem using both advanced statistical and machine learning models. Ensembles of classifiers have demonstrated their ...

Feature dimensionality vs. distribution of sample types: A preliminary study on gene-expression microarrays

Sánchez Garreta, José Salvador (AEPIA, 2018)

In gene-expression microarray data sets each sample is defined by hundreds or thousands of measurements. High-dimensionality data spaces have been reported as a significant obstacle to apply machine learning algorithms, ...

Gene selection and disease prediction from gene expression data using a two-stage hetero-associative memory

Cleofas-Sánchez, Laura (2019-04-01)

In general, gene expression microarrays consist of a vast number of genes and very few samples, which represents a critical challenge for disease prediction and diagnosis. This paper develops a two-stage algorithm that ...

Instance selection for the nearest neighbor classifier: Connecting the performance to the underlying data structure

García, Vicente (Springer, 2019)

Instance selection is one of the most successful solutions to low noise tolerance of the nearest neighbor classifier. Many algorithms have been proposed in the literature, but further research in this area is still needed ...

Understanding the apparent superiority of over-sampling through an analysis of local information for class-imbalanced data

García, Vicente (2019-10-14)

Data plays a key role in the design of expert and intelligent systems and therefore, data preprocessing appears to be a critical step to produce high-quality data and build accurate machine learning models. Over the past ...