Center for Computational and Theoretical Biology

ALL-SESSA Deep Feature Optimization

This webpage contains links to all code and data for the paper

"Efficient Classification of White Blood Cell Leukemia with Improved Swarm Optimization of Deep Features" by Ahmed T. Sahlol, Philip Kollmannsberger and Ahmed A. Ewees, published in Scientific Reports 2020 and freely available online at www.nature.com/articles/s41598-020-59215-9


Code

The entire pipeline consists of three phases:

1) Applying very deep convolutional neural networks for feature extraction

This part was done using VGG19 pretrained on ImageNet in Keras. The code is in the following Jupyter Notebook:

Feature_Extraction_VGG19.ipynb
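
For orientation, here is a minimal sketch of feature extraction with a pretrained VGG19 in Keras. It is not taken from the notebook. The choice of layer (the last convolutional block with global average pooling, giving 512 features per image) and the helper names build_extractor and extract_features are assumptions made for illustration; the notebook may use a different layer or input size.

```python
import numpy as np
from tensorflow.keras.applications.vgg19 import VGG19, preprocess_input


def build_extractor(weights="imagenet"):
    """Return VGG19 without its classifier head (hypothetical helper).

    include_top=False drops the fully connected ImageNet layers;
    pooling="avg" averages the last convolutional feature map into
    one 512-dimensional vector per image (an assumed choice).
    """
    return VGG19(weights=weights, include_top=False, pooling="avg")


def extract_features(model, images):
    """Compute deep features for a batch of RGB images.

    images: array of shape (n, height, width, 3) with values in 0..255.
    Returns an array of shape (n, 512).
    """
    # Copy to float32 so preprocess_input does not modify the caller's data.
    batch = preprocess_input(np.array(images, dtype="float32"))
    return model.predict(batch, verbose=0)
```

Calling build_extractor() with the default argument downloads the ImageNet weights on first use. The resulting feature matrix is what the later selection steps would operate on.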

2) Feature selection using the Salp Swarm Algorithm (SSA)

The MATLAB implementation of SSA can be downloaded here.
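
For readers without MATLAB, the basic idea of SSA can be sketched in Python with NumPy. This is a simplified single-leader version written for illustration, not a port of the MATLAB code linked above. The function name salp_swarm, the toy objective, the default parameters and the 0.5 threshold used to turn a position into a feature mask are all assumptions.

```python
import numpy as np


def salp_swarm(fitness, dim, n_salps=20, n_iter=50, lb=0.0, ub=1.0, seed=0):
    """Minimize `fitness` over [lb, ub]^dim with a basic salp swarm (sketch)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(lb, ub, size=(n_salps, dim))
    scores = np.array([fitness(p) for p in pos])
    best = pos[scores.argmin()].copy()   # best position so far ("food source")
    best_score = float(scores.min())

    for it in range(1, n_iter + 1):
        # c1 decays over the iterations, moving from exploration to exploitation.
        c1 = 2.0 * np.exp(-(4.0 * it / n_iter) ** 2)

        # The leader moves around the food source in a random direction per dimension.
        c2 = rng.random(dim)
        c3 = rng.random(dim)
        step = c1 * ((ub - lb) * c2 + lb)
        pos[0] = np.where(c3 >= 0.5, best + step, best - step)

        # Each follower moves to the midpoint between itself and the salp ahead of it.
        for i in range(1, n_salps):
            pos[i] = 0.5 * (pos[i] + pos[i - 1])

        pos = np.clip(pos, lb, ub)
        scores = np.array([fitness(p) for p in pos])
        if scores.min() < best_score:
            best_score = float(scores.min())
            best = pos[scores.argmin()].copy()

    return best, best_score


# Toy objective (hypothetical): squared distance to the point (0.3, ..., 0.3).
def toy_fitness(p):
    return float(np.sum((p - 0.3) ** 2))


best, score = salp_swarm(toy_fitness, dim=8)

# For feature selection, one common convention (assumed here) is to keep
# feature j whenever the j-th coordinate of the best position exceeds 0.5.
mask = best > 0.5
```

In a feature selection setting, the fitness function would typically score the feature subset encoded by a position, for example by the error of a classifier trained on the selected columns.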

3) Statistical enhancements to SSA for improved classification

These operations were implemented using scikit-learn, specifically:

  • SelectKBest for univariate selection
  • Recursive Feature Elimination (RFE), and
  • Feature Importance using the ExtraTreesClassifier

The code is in the following Jupyter Notebook:

Statistical_Operations_Classification.ipynb
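
As a rough illustration of the three scikit-learn operations listed above, the following sketch applies each of them to a synthetic dataset. It is not the code from the notebook. The synthetic data, the number of selected features k, the f_classif score function and the LogisticRegression estimator inside RFE are assumptions chosen to keep the example self-contained.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a matrix of deep features (assumption).
X, y = make_classification(
    n_samples=200, n_features=30, n_informative=5, random_state=0
)
k = 10  # number of features to keep (assumed value)

# 1) Univariate selection: keep the k features with the highest ANOVA F-score.
kbest = SelectKBest(score_func=f_classif, k=k).fit(X, y)
X_kbest = kbest.transform(X)

# 2) Recursive Feature Elimination: repeatedly fit the estimator and
#    drop the weakest feature until k remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=k)
rfe.fit(X, y)
X_rfe = rfe.transform(X)

# 3) Feature importance: rank features by the importances of an
#    extremely randomized trees ensemble and keep the top k.
forest = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
top_idx = np.argsort(forest.feature_importances_)[::-1][:k]
X_trees = X[:, top_idx]
```

Each of the three reduced matrices can then be passed to a classifier to compare how the selection methods affect accuracy.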


Data

The datasets used in this study can be downloaded here:

ALL-IDB

C-NMC 2019