Binary Classification with Neural Networks using TensorFlow & Keras ⭐️ Pt. 2

by Daniel Hernandez, March 2024


Our original model looked like this: an input layer, a hidden layer, and an output layer. It obtained these results:

loss: 0.6252 — accuracy: 0.8614

from keras import models, layers

model = models.Sequential()
# Input Layer
model.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
# Hidden Layer
model.add(layers.Dense(16, activation='relu'))
# Output Layer
model.add(layers.Dense(1, activation='sigmoid'))
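
The splits referenced below (partial_x_train, x_val, x_test, and so on) and the history object of the original model come from Part 1. For completeness, here is a minimal sketch of that setup, assuming the IMDB reviews dataset multi-hot encoded into 10,000-dimensional vectors:

import numpy as np
from keras.datasets import imdb

# Load the IMDB reviews, keeping only the 10,000 most frequent words
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

def vectorize_sequences(sequences, dimension=10000):
    # Multi-hot encode each review into a fixed-length binary vector
    results = np.zeros((len(sequences), dimension))
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.0
    return results

x_train = vectorize_sequences(train_data)
x_test = vectorize_sequences(test_data)
y_train = np.asarray(train_labels).astype('float32')
y_test = np.asarray(test_labels).astype('float32')

# Hold out 10,000 samples for validation
x_val, partial_x_train = x_train[:10000], x_train[10000:]
y_val, partial_y_train = y_train[:10000], y_train[10000:]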

Training with Regularizers 🔩

A regularizer is a technique used in machine learning to prevent overfitting by adding a penalty term to the objective function. This penalty encourages simpler models or smoother parameter values, ultimately improving generalization to unseen data.

Here we use the same layers, but L2 regularization with a strength of 0.001 is added to the input and hidden layers. This means that during training, the model will not only aim to minimize the loss on the training data but will also keep the weights of these layers small, which helps prevent overfitting.

The key point is that this penalty encourages smaller weight values by penalizing large ones.
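
Concretely, regularizers.l2(0.001) adds 0.001 times the sum of the squared kernel weights of that layer to the training loss. Keras handles this for you, but as a rough illustrative sketch of the extra term:

import tensorflow as tf

def l2_penalty(model, strength=0.001):
    # Illustrative only: Keras adds an equivalent term to the loss automatically (see model.losses)
    squared_sums = [tf.reduce_sum(tf.square(w))
                    for w in model.trainable_weights if 'kernel' in w.name]
    return strength * tf.add_n(squared_sums)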

from keras import regularizers

model_reg = models.Sequential()

# Input Layer
model_reg.add(layers.Dense(16, activation='relu', input_shape=(10000,),
                           kernel_regularizer=regularizers.l2(0.001)))

# Hidden Layer
model_reg.add(layers.Dense(16, activation='relu', kernel_regularizer=regularizers.l2(0.001)))

# Output Layer
model_reg.add(layers.Dense(1, activation='sigmoid'))

# Compile our model
model_reg.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

# Train our model
history_reg = model_reg.fit(partial_x_train,
                            partial_y_train,
                            epochs=20,
                            batch_size=512,
                            validation_data=(x_val, y_val))

Now, to evaluate these results, we can compare the validation loss values against those of our original model.

import matplotlib.pyplot as plt

# Original model loss values (history comes from training the original model in Part 1)
history_dict = history.history
loss_values = history_dict['loss']
val_loss_values = history_dict['val_loss']

# Regularized model loss values
val_loss_values_reg = history_reg.history['val_loss']

# Data Viz
fig = plt.figure(figsize=(10, 10))
epoch = range(1, len(loss_values) + 1)
plt.plot(epoch, val_loss_values_reg, 'o', label='l2 regularized')
plt.plot(epoch, val_loss_values, '--', label='original')
plt.legend()
plt.show()

We obtain a model with higher accuracy, but the loss values are still not great 🥴.

Original vs Regularization. Seems like our loss values are not good enough

Check out the model evaluation by running this code:

model_reg.evaluate(x_test, y_test)
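
evaluate returns the test loss followed by the metrics passed to compile (here, accuracy), so you can also capture and print the results:

test_loss, test_acc = model_reg.evaluate(x_test, y_test)
print(f"Regularized model - loss: {test_loss:.4f}, accuracy: {test_acc:.4f}")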

Seems like we will have to keep trying…

Training with Dropout 🔨

This technique randomly disconnects some neurons (units) in a neural network during training. It helps prevent overfitting by forcing the network to learn more robust features, as it can’t rely on any one neuron.
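
You can see this behaviour directly by applying a Dropout layer to a small tensor: during training it zeroes roughly half of the values and scales the survivors by 1/(1 - rate), while at inference it leaves the input untouched. A quick illustrative sketch:

import tensorflow as tf
from keras import layers

dropout = layers.Dropout(0.5)
x = tf.ones((1, 10))

print(dropout(x, training=True))   # roughly half the entries zeroed, the rest scaled to 2.0
print(dropout(x, training=False))  # unchanged: all ones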

We’re going to transform our original model. Here we add a Dropout layer after each of the two Dense hidden layers that randomly zeroes 50% of that layer’s outputs on every training pass.

# Model architecture
model_dropout = models.Sequential()

model_dropout.add(layers.Dense(16, activation='relu', input_shape=(10000,)))
model_dropout.add(layers.Dropout(0.5))

model_dropout.add(layers.Dense(16, activation='relu'))
model_dropout.add(layers.Dropout(0.5))

model_dropout.add(layers.Dense(1, activation='sigmoid'))

# Compile our model
model_dropout.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

# Train our model
history_dropout = model_dropout.fit(partial_x_train,
                                    partial_y_train,
                                    epochs=20,
                                    batch_size=512,
                                    validation_data=(x_val, y_val))

Compare the validation losses and check the model evaluation by running this code:

# Loss values of the dropout model
val_loss_values = history_dict['val_loss']
val_loss_values_dropout = history_dropout.history['val_loss']

fig = plt.figure(figsize=(10, 10))
epoch = range(1, len(loss_values) + 1)
plt.plot(epoch, val_loss_values_dropout, '--', label='dropout')
plt.plot(epoch, val_loss_values, '--', label='original')
plt.legend()
plt.show()

model_dropout.evaluate(x_test, y_test)

Original vs Dropout

Now we can definitely see an improvement in our validation loss values 🥳.
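
If you want to put all three models side by side on the test set, here is a quick sketch, assuming the original model from Part 1 is still in memory:

for name, m in [('original', model),
                ('l2 regularized', model_reg),
                ('dropout', model_dropout)]:
    loss, acc = m.evaluate(x_test, y_test, verbose=0)
    print(f"{name}: loss={loss:.4f}, acc={acc:.4f}")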


