Build your first Artificial Intelligence Model in 5 minutes! | by Vladimir Paredes | Aug, 2024


Image: neural network diagram created with https://alexlenail.me/NN-SVG/index.html

Let’s Design the Model

Neural network architecture is a whole topic on its own, and we will dive deeper into it in another post. For now, it is important to understand that sequential neural networks have an input layer, one or more hidden layers, and an output layer.

Input Layer: In the image above, the input layer is on the left; it is the layer with the most neurons. In our case, it has 28×28 = 784 neurons, one per pixel, because our images are 28×28 pixels.
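To make that pixel count concrete, here is a small sketch (using NumPy as a stand-in for what the Flatten layer does) of how a 28×28 image unrolls into 784 input values:

```python
import numpy as np

# A dummy 28x28 "image" standing in for one MNIST digit
image = np.zeros((28, 28))

# Flattening unrolls the grid into a single vector, one value per pixel
flat = image.reshape(-1)
print(flat.shape)  # (784,)
```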

Hidden Layers: These are the 3 layers in the middle; at the bottom of the image, you can see they are labeled “Hidden Layer”. They play the role of identifying relationships in the data. In our model, we will have 3 hidden layers with 128 neurons each, using the standard ReLU activation function (more on that in another post).

Output Layer: It is the layer on the right. You want to set the number of neurons in this layer equal to the number of categories you are trying to predict. In our case, that is 10, because we are identifying digits from 0 to 9. Our activation function will be softmax, which turns the outputs into probabilities across the 10 classes and is the standard choice for multiclass classification.
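Because the output layer is a softmax over 10 classes, each prediction is a vector of 10 probabilities, and the predicted digit is simply the index of the largest one. A minimal sketch with a made-up probability vector:

```python
import numpy as np

# Hypothetical softmax output for one image: 10 probabilities that sum to 1
probs = np.array([0.01, 0.02, 0.01, 0.85, 0.02,
                  0.02, 0.02, 0.02, 0.02, 0.01])

# The predicted class is the index with the highest probability
digit = int(np.argmax(probs))
print(digit)  # 3
```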

The code to design the neural network model specified above is:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28), name='input'),   # Input
    tf.keras.layers.Dense(128, activation='relu', name='hidden1'), # Hidden 1
    tf.keras.layers.Dense(128, activation='relu', name='hidden2'), # Hidden 2
    tf.keras.layers.Dense(128, activation='relu', name='hidden3'), # Hidden 3
    tf.keras.layers.Dense(10, activation='softmax', name='output') # Output
])

Now let’s configure the model for training (compile). Here we specify what we are trying to optimize and how to measure it:

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Optimizer: We chose ‘adam’ because it adapts the learning rate during training and is considered one of the best general-purpose optimizers for adjusting weights and biases.

Loss: Since our y-values are not one-hot encoded and we have multiple classes, setting the loss to ‘sparse_categorical_crossentropy’ does the job.

Metrics: Finally, we want to measure the ‘accuracy’ of the model at the end of each epoch (training round).
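To illustrate the “sparse” part of the loss name: sparse_categorical_crossentropy expects plain integer labels, while categorical_crossentropy would expect one-hot vectors instead. A small sketch of the two formats, using made-up labels:

```python
import numpy as np

# Integer ("sparse") labels, as sparse_categorical_crossentropy expects
y_sparse = np.array([3, 0, 7])

# The equivalent one-hot encoding, which categorical_crossentropy would need
y_onehot = np.eye(10)[y_sparse]
print(y_onehot[0])  # 1.0 at index 3, zeros elsewhere
```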

The time to train our model has finally come! We need to specify some parameters.

model.fit(x_train, y_train,                # Pass x & y training values
          epochs=100,                      # Train for up to 100 rounds
          validation_data=(x_val, y_val),  # Val data checked at epoch end
          callbacks=[                      # Early stopping
              tf.keras.callbacks.EarlyStopping(patience=5,
                                               restore_best_weights=True,
                                               verbose=1)])

Training Data: We pass the x_train & y_train we set up in Step 2.

Epochs: The maximum number of rounds you want the model to train for; early stopping may end training sooner.

Validation: At the end of each round (epoch), the model is evaluated against the validation data. This is important because if performance keeps improving on the training data but not on the validation data, the model is overfitting.
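The x_val and y_val arrays come from Step 2. If you are following along without them, one common approach is to hold out a slice of the training set; here is a sketch using dummy arrays with MNIST-like shapes in place of the real data:

```python
import numpy as np

# Dummy stand-ins with MNIST-like shapes (replace with the real arrays from Step 2)
x_train = np.zeros((60000, 28, 28))
y_train = np.zeros(60000, dtype=int)

# Hold out the last 10% of the training set for validation
split = int(0.9 * len(x_train))
x_train, x_val = x_train[:split], x_train[split:]
y_train, y_val = y_train[:split], y_train[split:]
print(x_train.shape, x_val.shape)  # (54000, 28, 28) (6000, 28, 28)
```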

Callbacks: EarlyStopping halts training if the model stops improving on the validation data. We specified patience=5 and restore_best_weights=True, which means that if the model does not improve for 5 consecutive epochs, training stops and the weights are restored to those of the best epoch.
