
Once the data has been explored, we can proceed to define the ML model, which in this case will be a CNN (Convolutional Neural Network), since we are facing an image classification problem.
The model architecture consists of an initial Conv2D layer (which also declares the input_shape of the net), a 2D convolutional layer that produces 16 filters as output using 3×3 convolution windows, followed by a MaxPooling2D layer that downsamples the Tensor resulting from the previous convolutional layer. Usually you will find this layer after two consecutive convolutions but, for the sake of simplicity, here we downsample the data after each convolution, as this is a simple CNN with a relatively small dataset (fewer than 20k images).
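To make the effect of these two layers concrete, the following sketch computes the output shape of a 3×3 convolution with 16 filters followed by a max pooling step. The 64×64×3 input size is a hypothetical value chosen for illustration, and the "valid" padding, the stride of 1 and the 2×2 pooling window are assumed here because they are the Keras defaults; the text does not state any of them.

```python
def conv2d_output_shape(height, width, filters, kernel=3, stride=1):
    """Output shape of a 'valid' (no padding) 2D convolution."""
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return out_h, out_w, filters


def max_pool_output_shape(height, width, channels, pool=2):
    """Output shape of a 'valid' max pooling whose stride equals the pool size."""
    return height // pool, width // pool, channels


# Hypothetical 64x64 RGB input; the real image size is not given in the text.
input_shape = (64, 64, 3)

conv_shape = conv2d_output_shape(input_shape[0], input_shape[1], filters=16)
pool_shape = max_pool_output_shape(*conv_shape)

print(conv_shape)  # (62, 62, 16)
print(pool_shape)  # (31, 31, 16)
```

Under these assumptions the pooling step halves the spatial resolution while keeping all 16 feature maps.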
Then we include another combination of Conv2D and MaxPooling2D layers, since increasing the number of convolutional filters lets the CNN capture more combinations of pixel values from the input image Tensor.
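As a rough illustration of what more filters means in terms of model size, the sketch below counts the trainable parameters of two stacked 3×3 convolutions. The 3 input channels (RGB) and the 32 filters of the second layer are assumptions made for the example; only the 16 filters of the first layer come from the text.

```python
def conv2d_param_count(in_channels, filters, kernel=3):
    """Trainable parameters of a Conv2D layer: kernel weights plus one bias per filter."""
    weights_per_filter = kernel * kernel * in_channels
    return (weights_per_filter + 1) * filters


# First convolution: assumed RGB input, 16 filters as described in the text.
first = conv2d_param_count(in_channels=3, filters=16)

# Second convolution: receives the 16 feature maps, 32 filters is an assumption.
second = conv2d_param_count(in_channels=16, filters=32)

print(first)   # 448
print(second)  # 4640
```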
After applying the convolutional operations, we include a Flatten layer to transform the image Tensor into a 1D Tensor, so that the data going through the CNN can be fed to a few fully connected layers.
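The flattening step can be sketched with NumPy as a reshape that keeps the batch axis and collapses everything else. The batch size and the 14×14×32 feature map shape are hypothetical values used only for illustration.

```python
import numpy as np

# Hypothetical batch of 2 feature maps of shape 14x14 with 32 channels.
feature_maps = np.zeros((2, 14, 14, 32), dtype=np.float32)

# Keep the batch axis and collapse the remaining axes into a single one.
flattened = feature_maps.reshape(feature_maps.shape[0], -1)

print(flattened.shape)  # (2, 6272)
```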
Finally, we include some Dense fully connected layers to assign the final weights of the net, and some Dropout layers to avoid overfitting during the training phase. Note that the last Dense layer contains as many units as the total number of labels to predict, which in this case is the number of The Simpsons characters available in the training set.
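The relation between the labels and the last layer can be sketched as follows. The four class names and the raw output values are made up for the example, and the softmax activation is an assumption, since the text does not say which activation the last Dense layer uses.

```python
import numpy as np

# Hypothetical subset of labels; the real list comes from the training set.
class_names = ["bart_simpson", "homer_simpson", "lisa_simpson", "marge_simpson"]
num_classes = len(class_names)  # number of units of the last Dense layer


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Made-up raw outputs of the last layer for a single image, one per class.
logits = np.array([0.5, 2.0, -1.0, 0.1])
probabilities = softmax(logits)
predicted = class_names[int(np.argmax(probabilities))]

print(predicted)  # homer_simpson
```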
The trained model has been named SimpsonsNet (this name will be used later as its identifier when serving the model) and its architecture looks like:
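A minimal Keras sketch of the layers described above could look as follows. Only the layer types, their order, the 16 filters with 3×3 windows of the first convolution and the SimpsonsNet name come from the text; the input size, the number of classes, the 32 filters of the second convolution, the Dense sizes, the Dropout rates and the activations are assumptions chosen for illustration.

```python
import tensorflow as tf

# All concrete values below are illustrative assumptions, not the author's.
IMG_SIZE = 64      # hypothetical input height and width
NUM_CLASSES = 10   # hypothetical number of characters in the training set

model = tf.keras.Sequential(
    [
        # First convolution also declares the input shape of the net.
        tf.keras.layers.Conv2D(16, (3, 3), activation="relu",
                               input_shape=(IMG_SIZE, IMG_SIZE, 3)),
        tf.keras.layers.MaxPooling2D(),
        # Second convolution + pooling pair, with more filters (assumed 32).
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        # Collapse the feature maps into a 1D Tensor per image.
        tf.keras.layers.Flatten(),
        # Fully connected layers with Dropout to reduce overfitting.
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        # One unit per label to predict.
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ],
    name="SimpsonsNet",
)

model.summary()
```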
Once trained, the model needs to be dumped in SavedModel format, which is the universal serialization format for TensorFlow models. It provides a language-neutral way to save ML models that is recoverable and hermetic, and it enables higher-level systems and tools to produce, consume and transform TensorFlow models.
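As a hedged sketch of this step, the snippet below writes a small stand-in Keras model to disk as a SavedModel. The stand-in model and the temporary export directory are hypothetical; in practice the trained network and a path of your choice would be used. The check on export() is there because, as far as we know, only more recent Keras versions provide that method.

```python
import os
import tempfile

import tensorflow as tf

# Tiny stand-in model; in practice this would be the trained network.
model = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(4,)),
        tf.keras.layers.Dense(2, activation="softmax"),
    ],
    name="SimpsonsNet",
)

# Hypothetical export location inside a temporary directory.
export_dir = os.path.join(tempfile.mkdtemp(), "simpsonsnet")

if hasattr(model, "export"):
    # Recent Keras versions expose export() to write a SavedModel.
    model.export(export_dir)
else:
    # Older TF 2.x releases write a SavedModel when the path has no extension.
    model.save(export_dir)

# Show what was written to the export directory.
print(sorted(os.listdir(export_dir)))
```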
The resulting model’s directory should look more or less like the following:
assets/
assets.extra/
variables/
    variables.data-?????-of-?????
    variables.index
saved_model.pb
More information regarding the SavedModel format can be found at TensorFlow SavedModel.