Brain Tumor Classification with a Convolutional Neural Network (Using TensorFlow) | by Vasukumar P | May, 2024


Human brain in a dark and white background.

A brain tumor is an overgrowth of cells in or near the brain. It can arise in the brain tissue itself or in nearby structures, including the nerves, the pituitary gland, the pineal gland, and the membranes that cover the surface of the brain.

Tumors that begin in the brain are known as primary brain tumors. When cancer spreads to the brain from other parts of the body, the resulting tumors are called secondary, or metastatic, brain tumors.

Many different types of primary brain tumors exist, and some of them are not cancerous. These are called noncancerous, or benign, brain tumors. Noncancerous brain tumors may still grow over time and press on the brain tissue.

Other brain tumors are cancerous, also called malignant brain tumors. Brain cancers can grow quickly, and the cancer cells can invade and destroy brain tissue.

Vocabulary

Malignant — cancer cells that can grow uncontrollably and invade nearby tissues.

Benign — noncancerous cell growths in the tissues.

Types

1. Glioma

Glioma is a growth of cells that starts in the brain or spinal cord. The cells in a glioma look similar to healthy brain cells called glial cells. Glial cells surround nerve cells and help them function.

As a glioma grows, it forms a mass of cells called a tumor. The tumor can grow to press on brain or spinal cord tissue and cause symptoms.

2. Meningioma

A meningioma is a tumor that grows from the membranes that surround the brain and spinal cord, called the meninges.

Strictly speaking, a meningioma is not a brain tumor because it grows from the meninges rather than the brain tissue, but it may press on the nearby brain, nerves, and blood vessels. It is the most common type of tumor that forms in the head.

3. Pituitary

Pituitary tumors are unusual growths that develop in the pituitary gland. This gland is an organ about the size of a pea. It’s located behind the nose at the base of the brain.

Some of these tumors cause the pituitary gland to make too much of certain hormones that control important body functions. Others can cause the pituitary gland to make too little of those hormones. Another name for these noncancerous tumors is pituitary adenomas.

Elements of the Convolutional Neural Network model architecture.
Convolutional Neural Network Model Architecture

1. Convolutional Layer

The convolutional layer is the core building block of a CNN, where the majority of computation occurs. The required components of the layer are the input data, a filter, and a feature map.

Let’s assume that the input will have three dimensions — height, width, and depth — which correspond to RGB in an image. The feature detector is known as a kernel or a filter, which will move across the receptive fields of the image, checking if the feature is present.

The feature detector (kernel) can vary in size; it is typically a small array of weights, such as 2×2 or 3×3, that represents part of the image. The filter is applied to an area of the image, and a dot product is calculated between the input pixels and the filter weights.

Then the filter shifts by a stride, repeating the process until the kernel has swept across the entire image. The final output from the series of dot products from the input and the filter is known as a feature map.

Stride is the distance, or number of pixels, that the kernel moves over the input matrix. While stride values of two or greater are rare, a larger stride yields a smaller output.
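As a small illustration of how a feature map is built, the sketch below slides a made-up 2×2 filter over a made-up 4×4 input with a stride of 1 and computes each output value as a dot product; the numbers are purely illustrative.

import numpy as np

# Hypothetical 4x4 single-channel input and a 2x2 filter (made-up values)
image = np.array([[1, 2, 0, 1],
                  [3, 1, 1, 0],
                  [0, 2, 2, 1],
                  [1, 0, 1, 3]], dtype=float)
kernel = np.array([[1, 0],
                   [0, -1]], dtype=float)
stride = 1

out_size = (image.shape[0] - kernel.shape[0]) // stride + 1
feature_map = np.zeros((out_size, out_size))

# Each output value is the dot product of the filter and the patch it covers
for i in range(out_size):
    for j in range(out_size):
        patch = image[i*stride:i*stride+2, j*stride:j*stride+2]
        feature_map[i, j] = np.sum(patch * kernel)

print(feature_map)  # 3x3 feature map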

Types of padding:

i) Valid padding: This is also known as no padding. In this case, the last convolution is dropped if the dimensions do not align.

ii) Same padding: This padding ensures that the output layer has the same size as the input layer.

iii) Full padding: This type of padding increases the size of the output by adding zeros to the border of the input.

After each convolution operation, the CNN applies a Rectified Linear Unit (ReLU) transformation to the feature map, introducing nonlinearity to the model.
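To see the effect of padding in practice, the short sketch below compares "valid" and "same" padding for a Keras Conv2D layer with a ReLU activation, using a dummy 180×180×3 input (the same image size used later in this post).

import tensorflow as tf

x = tf.random.normal((1, 180, 180, 3))  # one dummy RGB image

valid = tf.keras.layers.Conv2D(16, (3, 3), padding="valid", activation="relu")(x)
same = tf.keras.layers.Conv2D(16, (3, 3), padding="same", activation="relu")(x)

print(valid.shape)  # (1, 178, 178, 16) -- border rows/columns are lost
print(same.shape)   # (1, 180, 180, 16) -- zero padding keeps the input size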

2. Pooling Layer

The pooling layers, also known as downsampling, conduct dimensionality reduction, reducing the number of parameters in the input. The layer sweeps a filter across the entire input, but the filter does not have any weights.

Types of pooling:

i) Max pooling: As the filter moves across the input, it selects the pixel with the maximum value to send to the output array. This approach is used more often than average pooling.

ii) Average pooling: As the filter moves across the input, it calculates the average value within the receptive field to send to the output array.

While a lot of information is lost in the pooling layer, it also has a number of benefits for the CNN: it helps reduce complexity, improve efficiency, and limit the risk of overfitting. The sketch after this paragraph compares the two pooling types on the same input.
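A quick way to see the difference between the two pooling types is to apply both to the same dummy 4×4 feature map; the values below are made up for the example.

import tensorflow as tf

# Dummy 4x4 feature map with a single channel
x = tf.constant([[1., 3., 2., 4.],
                 [5., 6., 1., 2.],
                 [7., 2., 9., 3.],
                 [4., 8., 6., 5.]])
x = tf.reshape(x, (1, 4, 4, 1))  # (batch, height, width, channels)

max_pool = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)
avg_pool = tf.keras.layers.AveragePooling2D(pool_size=(2, 2))(x)

print(tf.squeeze(max_pool))  # [[6. 4.] [8. 9.]] -- largest value in each 2x2 window
print(tf.squeeze(avg_pool))  # [[3.75 2.25] [5.25 5.75]] -- mean of each 2x2 window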

3. Flatten Layer

A neural network flatten layer is a type of layer commonly used in deep learning architectures to transform multi-dimensional input data into a one-dimensional array.
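For instance, assuming a 22×22 feature map with 64 channels (roughly what the convolutional stack built later produces for a 180×180 input), the flatten layer turns it into a single vector per image:

import tensorflow as tf

x = tf.random.normal((1, 22, 22, 64))   # e.g. the last pooling layer's output
flat = tf.keras.layers.Flatten()(x)

print(flat.shape)  # (1, 30976) -- 22 * 22 * 64 values laid out as one vector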

4. Fully Connected Layer

The flattened output is processed by two dense layers. In the first fully connected layer, each node is connected directly to every node in the next layer. The second layer performs the classification based on the features extracted by the previous layers and their different filters.

While convolutional and pooling layers tend to use ReLU activations, the final fully connected layer usually uses a softmax activation to classify inputs appropriately, producing a probability between 0 and 1 for each class.
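The sketch below shows a small fully connected head on its own, with illustrative sizes: a Dense layer with ReLU followed by a softmax layer whose outputs sum to 1, so each value can be read as a class probability.

import tensorflow as tf

flat = tf.random.normal((1, 128))                               # flattened features
hidden = tf.keras.layers.Dense(64, activation="relu")(flat)     # fully connected layer
probs = tf.keras.layers.Dense(4, activation="softmax")(hidden)  # one score per class

print(probs.numpy())                 # four values between 0 and 1
print(float(tf.reduce_sum(probs)))   # ~1.0 -- softmax outputs form a probability distribution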

Dataset link: Brain Tumor Classification (MRI) (kaggle.com)

Note: The dataset does not include a validation split. We split the test data in half to create a validation set.
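One way to do this split (a rough sketch, assuming the Kaggle archive has been extracted and using hypothetical folder names "Testing" and "Validation") is to move half of each class's test images into a new validation folder before loading anything:

import os
import random
import shutil

test_dir = "Testing"       # hypothetical path to the original test folder
val_dir = "Validation"     # hypothetical path for the new validation folder

random.seed(42)
for class_name in os.listdir(test_dir):
    src = os.path.join(test_dir, class_name)
    dst = os.path.join(val_dir, class_name)
    os.makedirs(dst, exist_ok=True)

    files = sorted(os.listdir(src))
    random.shuffle(files)
    for file_name in files[: len(files) // 2]:   # move half of this class to validation
        shutil.move(os.path.join(src, file_name), os.path.join(dst, file_name))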

The first step in this process is importing the necessary libraries into our environment,

# Importing the Libraries
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten
from tensorflow.keras.optimizers import Adam

The second step is pointing to the directories of the training, validation, and test datasets,

# Loading the Dataset
train = r"path_of_training_data"
val = r"path_of_validation_data"
test = r"path_of_test_data"

The third step is loading the three datasets and resizing their images to a consistent width and height.

image_height = 180
image_width = 180
batch_size = 32

# Loading and resizing the Training Data
data_train = tf.keras.utils.image_dataset_from_directory(
    train,
    shuffle = True,
    image_size = (image_height, image_width),
    batch_size = batch_size)

# Loading and resizing the Validation Data
data_val = tf.keras.utils.image_dataset_from_directory(
    val,
    image_size = (image_height, image_width),
    shuffle = False,
    batch_size = batch_size)

# Loading and resizing the Test Data
data_test = tf.keras.utils.image_dataset_from_directory(
    test,
    image_size = (image_height, image_width),
    shuffle = False,
    batch_size = batch_size)

Print the class labels of the images in the training dataset,

data_categories = data_train.class_names
data_categories
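For this Kaggle dataset the class names come from the four image folders, so the output should look something like:

['glioma_tumor', 'meningioma_tumor', 'no_tumor', 'pituitary_tumor']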

Visualizing the training dataset images,

# Visualizing the training dataset images
plt.figure(figsize=(10, 10))
for images, labels in data_train.take(1):
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype('uint8'))
        plt.title(data_categories[labels[i]])
        plt.axis("off")

The fourth step in this process is building the Convolutional Neural Network model with the TensorFlow Keras API,

# Building the CNN Model Architecture
model = Sequential()

model.add(Conv2D(16, kernel_size = (3, 3), activation = "relu", input_shape = (image_height, image_width, 3), padding = "same"))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Conv2D(32, kernel_size = (3, 3), activation = "relu", padding = "same"))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Conv2D(64, kernel_size = (3, 3), activation = "relu", padding = "same"))
model.add(MaxPooling2D(pool_size = (2, 2)))

model.add(Flatten())
model.add(Dense(128, activation = "relu"))
model.add(Dense(len(data_categories), activation = "softmax"))

model.compile(optimizer = Adam(0.001), loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])

model.summary()

The fifth step is training the model with both the training and validation data. The training data is used to fit the model's weights to the features in the images, while the validation data is used to monitor how well the model generalizes after each epoch.

# Training the model
history = model.fit(data_train,
                    validation_data = data_val,
                    epochs = 20,
                    verbose = 1)

The sixth step in this process is evaluating the model on the test dataset. The accuracy indicates how often the model predicts the correct image class on the test set, and the loss indicates how far the predicted class probabilities are from the true labels. Ideally, the model should have minimum loss and maximum accuracy.

# Evaluating the model with Test Data
model.evaluate(data_test)

Model Loss: 0.3608

Model Accuracy: 0.9198
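Because seaborn is already imported, a confusion matrix is a natural follow-up to the single accuracy number; the sketch below collects the true and predicted labels over the (unshuffled) test dataset and shows which classes are confused with which.

# Sketch: confusion matrix over the test set (data_test was loaded with shuffle=False)
y_true = np.concatenate([labels.numpy() for _, labels in data_test])
y_pred = np.argmax(model.predict(data_test), axis=1)

matrix = tf.math.confusion_matrix(y_true, y_pred).numpy()

plt.figure(figsize=(6, 5))
sns.heatmap(matrix, annot=True, fmt="d",
            xticklabels=data_categories, yticklabels=data_categories)
plt.xlabel("Predicted class")
plt.ylabel("True class")
plt.show()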

The last step of this process is visualizing the training and validation accuracy of the model at each epoch,

accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]

loss = history.history["loss"]
val_loss = history.history["val_loss"]

epochs = range(len(accuracy))

# Comparing the Training and Validation Accuracy
plt.figure()
plt.plot(epochs, accuracy, label = "Training Accuracy")
plt.plot(epochs, val_accuracy, label = "Validation Accuracy")
plt.title("Model Accuracy")
plt.legend(loc = "lower right")
plt.show()

We also visualize the training and validation loss at each epoch to understand the model's performance.

# Comparing the Training and Validation Loss
plt.figure()
plt.plot(epochs, loss, label = "Training Loss")
plt.plot(epochs, val_loss, label = "Validation Loss")
plt.title("Model Loss")
plt.legend(loc = "lower right")
plt.show()

Finally, we test the model's performance on unseen data by giving it a single input image,

# Testing the model with an input image
image_path = "glioma.jpg"
image = tf.keras.utils.load_img(image_path, target_size=(image_height, image_width))
img_arr = tf.keras.utils.img_to_array(image)   # PIL image -> NumPy array
img_bat = tf.expand_dims(img_arr, 0)           # add a batch dimension

prediction = model.predict(img_bat)

# The final Dense layer already applies softmax, so the prediction is one probability per class
score = prediction[0]

print("Brain Tumor in image is {} with confidence {:0.2f}".format(data_categories[np.argmax(score)], np.max(score)))

In this test, a glioma tumor image was given to the model, and the model predicted the correct class.

To know more: vasukumarp7/braintumor_classification_with_cnn (github.com)

Thank you for reading!

Vasukumar P

LinkedIn

GitHub
