News Classification for Identifying Traffic Incident Points in a Spanish-Speaking Country: A Real-World Case Study of Class Imbalance Learning
dc.contributor.author | Rivera Zarate, Gilberto | |
dc.date.accessioned | 2020-12-09T19:04:56Z | |
dc.date.available | 2020-12-09T19:04:56Z | |
dc.date.issued | 2020-09-09 | es_MX |
dc.identifier.uri | http://cathi.uacj.mx/20.500.11961/15614 | |
dc.description.abstract | ‘El Diario de Juárez’ is a local newspaper in a city of 1.5 million Spanish-speaking inhabitants; citizens read its texts on both a website and an RSS (Really Simple Syndication) service. This research applies natural-language-processing and machine-learning algorithms to the news items provided by the RSS service in order to classify them according to whether or not they report a traffic incident, with the final intention of notifying citizens where such incidents occur. The classification process explores the bag-of-words technique with five learners (Classification and Regression Tree (CART), Naïve Bayes, kNN, Random Forest, and Support Vector Machine (SVM)) on a class-imbalanced benchmark; this challenging issue is dealt with via five sampling algorithms: synthetic minority oversampling technique (SMOTE), borderline SMOTE, adaptive synthetic sampling, random oversampling, and random undersampling. Consequently, our final classifier reaches a sensitivity of 0.86 and an area under the precision-recall curve of 0.86, which is an acceptable performance when considering the complexity of analyzing unstructured texts in Spanish. | es_MX |
dc.description.uri | https://www.mdpi.com/2076-3417/10/18/6253 | es_MX |
dc.language.iso | en_US | es_MX |
dc.relation.ispartof | Producto de investigación IIT | es_MX |
dc.relation.ispartof | Instituto de Ingeniería y Tecnología | es_MX |
dc.subject | natural language processing | es_MX |
dc.subject | short-text classification | es_MX |
dc.subject | data extraction | es_MX |
dc.subject | sampling algorithms | es_MX |
dc.subject | support vector machine | es_MX |
dc.subject | random forest | es_MX |
dc.subject | smart cities | es_MX |
dc.subject | real-world application | es_MX |
dc.subject.other | info:eu-repo/classification/cti/1 | es_MX |
dc.title | News Classification for Identifying Traffic Incident Points in a Spanish-Speaking Country: A Real-World Case Study of Class Imbalance Learning | es_MX |
dc.type | Article | es_MX |
dcterms.thumbnail | http://ri.uacj.mx/vufind/thumbnails/rupiiit.png | es_MX |
dcrupi.instituto | Instituto de Ingeniería y Tecnología | es_MX |
dcrupi.cosechable | Yes | es_MX |
dcrupi.norevista | 18 | es_MX |
dcrupi.volumen | 10 | es_MX |
dcrupi.nopagina | 1-23 | es_MX |
dc.identifier.doi | doi.org/10.3390/app10186253 | es_MX |
dc.contributor.coauthor | Florencia, Rogelio | |
dc.contributor.coauthor | García, Vicente | |
dc.contributor.coauthor | Sánchez Solís, Julia Patricia | |
dc.contributor.alumno | 169735 | es_MX |
dc.journal.title | Applied Sciences | es_MX |
dc.lgac | COGNITIVE COMPUTING | es_MX |
dc.cuerpoacademico | Inteligencia Artificial Aplicada | es_MX |
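
The abstract in this record describes a bag-of-words representation combined with sampling algorithms for a class-imbalanced benchmark, evaluated by sensitivity. The following is a minimal sketch of those three ideas using only the Python standard library. It is not the authors' implementation: the function names (`bag_of_words`, `random_oversample`, `sensitivity`) are hypothetical, the headlines are invented toy data, only random oversampling (one of the five sampling algorithms named in the abstract) is shown, and no learner is trained.

```python
import random
from collections import Counter


def bag_of_words(texts):
    """Return (vocabulary, count vectors) for lowercased, whitespace-tokenized texts."""
    tokenized = [text.lower().split() for text in texts]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of the smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def sensitivity(y_true, y_pred, positive=1):
    """Sensitivity (recall of the positive class): TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Invented toy headlines (assumption): 1 = traffic incident, 0 = other news.
texts = [
    "choque deja dos heridos en avenida",
    "anuncian festival de música",
    "suben precios de gasolina",
    "inauguran nueva biblioteca",
]
labels = [1, 0, 0, 0]

vocab, X = bag_of_words(texts)
X_bal, y_bal = random_oversample(X, labels)
```

After oversampling, `X_bal` and `y_bal` hold equally many examples of each class and could be passed to any of the learners listed in the abstract; in a real experiment the sampling step would be applied to the training split only, so that duplicated minority examples do not leak into the evaluation data.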