DEV Community

Jonathan Fetterolf

Using Data Science to Fight Malaria: A Breakthrough in Blood Cell Classification

GitHub | LinkedIn | Twitter

Introduction:

Data science has real potential in the medical field. To explore it, I developed an application that quickly identifies the presence of malaria in images of individual blood cells. A tool like this can help doctors and technicians allocate their limited time and resources more effectively and ultimately save more lives. In this blog post, I walk through the development of the application, the challenges I encountered, and the possibilities for future work.

Building the Application:

The project began with a dataset of blood cell images, each labeled as either Uninfected or Parasitized. To determine where the application could be most impactful, I incorporated auxiliary data from the World Health Organization to identify regions with high malaria prevalence. I used Python with the pandas, matplotlib, seaborn, and geopandas libraries to analyze and visualize this additional data.
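As a rough illustration of the kind of analysis described above, here is a minimal pandas sketch that ranks countries by reported deaths for a single year. The column names, the `rank_by_deaths` helper, and every number in the table are placeholders invented for this example; they are not taken from the WHO data or from the project.

```python
import pandas as pd

# Placeholder rows standing in for a WHO export. The column names and all
# values are invented for illustration only.
records = pd.DataFrame(
    {
        "country": ["Country A", "Country A", "Country B", "Country B", "Country C"],
        "year": [2019, 2020, 2019, 2020, 2020],
        "deaths": [120, 150, 40, 35, 300],
    }
)


def rank_by_deaths(df: pd.DataFrame, year: int) -> pd.DataFrame:
    """Return countries sorted by reported deaths for one year, highest first."""
    one_year = df[df["year"] == year]
    totals = one_year.groupby("country", as_index=False)["deaths"].sum()
    return totals.sort_values("deaths", ascending=False).reset_index(drop=True)


ranking = rank_by_deaths(records, 2020)
```

A table like `ranking` could then be joined to country geometries in geopandas to draw a map, although the exact join keys would depend on the files used.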

Map of Malaria Deaths

Using Python, TensorFlow, and Keras, I built a neural network and trained it on over 15,000 blood cell images. The network classified the blood cells with 96% accuracy. To take advantage of GPU computing, I ran the training in a Google Colab notebook. I then built a live application with Streamlit that lets users submit their own blood cell images and receive predictions in real time.
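For readers who have not built an image classifier in Keras, the sketch below shows roughly what a small binary convolutional network looks like. It is not the architecture used in this project: the 64x64 input size, the layer sizes, the `build_classifier` and `label_from_probability` names, and the convention that an output near 1 means "Parasitized" are all assumptions made for illustration.

```python
def build_classifier(image_size=(64, 64)):
    """Build a small binary CNN. Layer sizes are illustrative, not the project's."""
    # Imported lazily so the helper below can be used without TensorFlow installed.
    import tensorflow as tf

    model = tf.keras.Sequential(
        [
            tf.keras.layers.Input(shape=(*image_size, 3)),
            tf.keras.layers.Rescaling(1.0 / 255),  # scale pixel values to [0, 1]
            tf.keras.layers.Conv2D(16, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Conv2D(32, 3, activation="relu"),
            tf.keras.layers.MaxPooling2D(),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(64, activation="relu"),
            tf.keras.layers.Dense(1, activation="sigmoid"),  # single probability
        ]
    )
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


def label_from_probability(probability, threshold=0.5):
    """Map the sigmoid output to a class name (1 = Parasitized is an assumption)."""
    return "Parasitized" if probability >= threshold else "Uninfected"
```

Calling `build_classifier()` returns a compiled model that can be passed to `model.fit` with a labeled image dataset; `label_from_probability` turns a single prediction into the text shown to a user.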

Example Image Augmentations

The Impact:

The neural network proved reliable, correctly classifying blood cells in about 96% of cases. A tool like this can help doctors make quicker and better-informed decisions when treating malaria, which in turn can help limit further transmission and save lives. The current application classifies individual cells; estimating the parasitic burden is a planned extension, described under Future Possibilities.
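Accuracy is one of several numbers that can be read off a confusion matrix, and for a diagnostic tool it helps to look at sensitivity and specificity alongside it. The sketch below shows the arithmetic. The counts are toy values chosen for easy checking, not results from this project, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn count parasitized cells that were caught/missed;
    tn/fp count uninfected cells that were cleared/flagged.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # share of infected cells detected
        "specificity": tn / (tn + fp),  # share of healthy cells cleared
    }


# Toy counts for 100 cells (50 infected, 50 healthy); not project results.
metrics = classification_metrics(tp=45, fp=2, tn=48, fn=5)
```

In a screening setting, a missed infection (a false negative) is usually the more costly error, which is why sensitivity is worth reporting separately from overall accuracy.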

Challenges Encountered:

Several challenges came up during development. One was the limit on Google Colab's compute units, which restricted how much GPU time I could use for model training. With additional resources, such as a larger budget, the model could be trained on more data, which would likely improve classification accuracy further. In addition, some Keras image pre-processing features did not integrate smoothly with Colab. To work around this while still guarding against overfitting, I used alternative pre-processing methods and left out the brightness and contrast adjustments.
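One way to augment images without touching brightness or contrast is to use only geometric transforms. The NumPy sketch below is an assumption about what such a step could look like, not the method used in the project; `augment_geometric` is a hypothetical name.

```python
import numpy as np


def augment_geometric(image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Randomly flip and rotate an H x W x C image; pixel values are unchanged."""
    if rng.random() < 0.5:
        image = np.fliplr(image)  # mirror left to right
    if rng.random() < 0.5:
        image = np.flipud(image)  # mirror top to bottom
    quarter_turns = int(rng.integers(0, 4))  # 0, 1, 2, or 3 quarter turns
    return np.rot90(image, k=quarter_turns, axes=(0, 1))


rng = np.random.default_rng(0)
sample = np.arange(4 * 4 * 3).reshape(4, 4, 3)  # stand-in for a square cell image
augmented = augment_geometric(sample, rng)
```

Because blood cells have no natural "up" direction, flips and quarter-turn rotations produce plausible new training images. For non-square inputs, an odd number of quarter turns swaps height and width, so images would need to be resized to a square first.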

Future Possibilities:

Given more time and resources, I plan to enhance the application. Expanding the dataset and retraining the model should improve its accuracy and robustness. A new feature could also let users submit images of entire blood smears, which would be automatically split into individual cell images and passed to the model. This would allow the model to estimate the parasitic burden, a factor clinicians use when deciding how to treat malaria cases.
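Once a smear has been split into individual cells and each cell has a prediction, the burden estimate itself is simple arithmetic: the percentage of cells classified as parasitized. The sketch below assumes the model returns one probability per cell, with values near 1 meaning "Parasitized", and uses a hypothetical `parasitemia_percent` helper. The cell-splitting step is not shown.

```python
def parasitemia_percent(probabilities, threshold=0.5):
    """Estimate parasitic burden as the percentage of cells flagged as infected.

    `probabilities` holds one model output per cell; values at or above
    `threshold` are counted as parasitized (an assumed convention).
    """
    if not probabilities:
        raise ValueError("at least one cell prediction is required")
    infected = sum(1 for p in probabilities if p >= threshold)
    return 100.0 * infected / len(probabilities)


# Four hypothetical cell predictions, two of them above the threshold.
burden = parasitemia_percent([0.9, 0.1, 0.8, 0.2])
```

In practice the estimate would only be as good as the cell segmentation and the classifier behind it, so it would need validation against manual counts.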

Conclusion:

Building an application that uses data science to detect malaria in blood cells shows how data-driven tools can contribute to healthcare. By combining neural networks with GPU computing, the project reached 96% accuracy despite the resource and tooling constraints described above. The planned extensions, a larger dataset and whole-smear analysis, point to how much further this approach could go. With continued work in this area, data-driven approaches can play a meaningful role in combating diseases and improving patient outcomes.

If you want to see or check my work, you can find the project details on my GitHub here: Malaria Blood Cell Classification

Want to Follow Along?

GitHub | LinkedIn | Twitter
