Convolutional Neural Network in a Pseudo-Distributed Environment for Classification of Chest X-Ray Images of Patients with Pneumonia
Abstract
In recent years, the volume of medical data has grown sharply, with different sources generating hundreds of terabytes (TB) or even petabytes (PB) of data. This has led to the emergence of technologies such as Apache Spark, a framework for in-memory data analysis based on distributed processing. However, because it is a relatively new technology, Spark and the complementary tools developed around it lack well-organized, up-to-date documentation. In this project, a convolutional neural network was implemented in a pseudo-distributed environment, using the Dist-Keras library, for the automatic classification of chest X-Ray images of patients with pneumonia. This made it possible to explore how the convolutional neural network behaves in Spark as the size of the dataset increases. Processing time increased as the dataset grew, whereas the accuracy, precision, and sensitivity metrics did not follow a stable trend.
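For reference, the three evaluation metrics named above can be computed from a binary confusion matrix. The following is a minimal sketch in plain Python, not the project's own code: the function name `binary_metrics`, the label encoding (1 = pneumonia, 0 = normal), and the sample labels are assumptions made for illustration only.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, and sensitivity for binary labels.

    Assumed encoding (hypothetical): 1 = pneumonia, 0 = normal.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    return {
        # Fraction of all images classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Fraction of predicted pneumonia cases that are truly pneumonia.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Fraction of true pneumonia cases that were detected (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Illustrative labels only; these are not results from the study.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
```

Sensitivity is the metric most directly tied to missed pneumonia cases, which is why it is usually reported alongside accuracy and precision in this kind of classification task.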