Revisiting the Dissimilarity Representation in the Context of Regression
Abstract
In machine learning, an instance is naturally represented by a feature vector. However, several studies have shown that this representation may not characterize an object accurately. For classification problems, the dissimilarity paradigm has been proposed as an alternative to the standard feature-based approach. Encoding each object by its pairwise dissimilarities has been shown to improve data quality because it mitigates complexities such as class overlap, small disjuncts, and a lack of samples. However, its suitability and performance when applied to regression problems have not been fully explored. This paper redefines the dissimilarity representation for regression. To this end, we carried out an extensive experimental evaluation on 34 data sets with two linear regression models. The results show that the dissimilarity approach decreases the error rates of both traditional linear regression and the linear model with elastic net regularization, and it also reduces the complexity of most regression data sets.
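The core idea, encoding each object by its dissimilarities to a set of reference objects and then fitting a linear regressor on that representation, can be illustrated with a minimal sketch. This is not the paper's exact procedure; it assumes Euclidean distance as the dissimilarity measure, a randomly chosen prototype set, and ordinary least squares as the linear model:

```python
import numpy as np

def dissimilarity_transform(X, prototypes):
    """Map each row of X to its Euclidean distances to a set of prototypes."""
    # Result shape: (n_samples, n_prototypes)
    return np.sqrt(((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2))

# Toy regression data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

# Representation set: a random subset of the training data serves as prototypes
proto_idx = rng.choice(len(X), size=20, replace=False)
prototypes = X[proto_idx]

# Build the dissimilarity representation and fit ordinary least squares on it
D = dissimilarity_transform(X, prototypes)
A = np.hstack([D, np.ones((len(D), 1))])   # append an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

pred = A @ coef
rmse = np.sqrt(np.mean((pred - y) ** 2))
print("train RMSE:", rmse)
```

Note that the model is still linear, but in the dissimilarity space rather than the original feature space, which is what allows a linear regressor to capture structure that is non-linear in the raw features.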