
In this tutorial, I show how to share neural network layer weights and define custom loss functions. The example code assumes beginner knowledge of Tensorflow 2 and the Keras API.
For a recent project, I wanted to use Tensorflow 2 / Keras to re-implement DeepKoopman, an autoencoder-based neural network architecture described in “Deep learning for universal linear embeddings of nonlinear dynamics”. My end goal was to create a user-friendly version that I could eventually extend.
DeepKoopman embeds time series data x into a low-dimensional coordinate system y in which the dynamics are linear.
The DeepKoopman schematic shows that there are three main components:
- The encoder φ, which maps the input to the latent code
- The decoder φ-inverse, which reconstructs the input from the latent code
- The linear dynamics K, which describe how the latent code evolves over time
To start building the model, we can define the three sub-models as follows:
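Here is a minimal sketch of the three sub-models. The layer sizes and dimensions below are placeholders for illustration, not the values used in the paper:

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

input_dim = 128   # dimension of each state sample x (placeholder value)
latent_dim = 2    # dimension of the latent code y (placeholder value)

# Encoder phi: x -> y
encoder = keras.Sequential(
    [
        layers.Dense(64, activation="relu", input_shape=(input_dim,)),
        layers.Dense(latent_dim),
    ],
    name="encoder",
)

# Decoder phi-inverse: y -> x_reconstructed
decoder = keras.Sequential(
    [
        layers.Dense(64, activation="relu", input_shape=(latent_dim,)),
        layers.Dense(input_dim),
    ],
    name="decoder",
)

# Linear dynamics K: y_t -> y_{t+1}, a single linear layer with no bias
linear_dynamics = keras.Sequential(
    [layers.Dense(latent_dim, use_bias=False, input_shape=(latent_dim,))],
    name="linear_dynamics",
)
```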
We can connect the sub-models and then plot the overall architecture using Keras plot_model.
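For example, with the functional API (note that plot_model requires pydot and graphviz to be installed):

```python
# Connect encoder and decoder into an autoencoder for a single time point x0
x0 = keras.Input(shape=(input_dim,), name="x0")
y0 = encoder(x0)
x0_reconstructed = decoder(y0)

autoencoder = keras.Model(inputs=x0, outputs=x0_reconstructed, name="autoencoder")

# Save a diagram of the architecture
keras.utils.plot_model(autoencoder, "autoencoder.png", show_shapes=True)
```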
At this point, we are set up to train the autoencoder component, but we haven’t yet accounted for the time series nature of the data. We still need to be able to input and compute over a second sample, x1.
In basic use cases, neural networks have a single input node and a single output node (although the corresponding tensors may be multi-dimensional). The original DeepKoopman schematic shows the encoder and decoder converting different inputs to different outputs, namely x samples from different times.
Layer sharing turns out to be quite simple in Keras. We can share layers by calling the same encoder and decoder models on a new Input.
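Continuing the sketch above, we can add a second input x1 for the next time step and reuse the same sub-models, so their weights are shared across time points:

```python
# Second input: the state one time step later
x1 = keras.Input(shape=(input_dim,), name="x1")

y1 = encoder(x1)                # same encoder weights as for x0
y1_pred = linear_dynamics(y0)   # advance the latent code one step
x1_pred = decoder(y1_pred)      # same decoder weights as for x0

model = keras.Model(
    inputs=[x0, x1],
    outputs=[x0_reconstructed, x1_pred],
    name="deep_koopman",
)
```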
To recap, in the DeepKoopman example, we want to use the same encoder φ, decoder, and linear dynamics K for each time-point. To share models, we first define the encoder, decoder, and linear dynamics models. Then, we can use the models to connect different inputs and outputs as if they were independent.
This approach of sharing layers can be helpful in other situations, too. For example, if we wanted to create neural networks with tied weights, we could call the same layer on two inputs.
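As a quick illustration (the layer type and shapes here are arbitrary), calling one layer object on two inputs ties the weights of both paths:

```python
# One Dense layer, called on two inputs: both paths use the same kernel and bias
shared_dense = layers.Dense(32, activation="relu")

input_a = keras.Input(shape=(16,))
input_b = keras.Input(shape=(16,))

tied_model = keras.Model(
    [input_a, input_b],
    [shared_dense(input_a), shared_dense(input_b)],
)
```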
So far, we have defined the connections of our neural network architecture. But we haven’t yet defined the loss function, so Tensorflow has no way to optimize the weights.
The DeepKoopman loss function is composed of three terms:
1. reconstruction accuracy: x0 vs x0_reconstructed
2. future state prediction: x1 vs x1_pred
3. linearity of dynamics: y1 vs y1_pred
Each loss is the mean squared error between two values. In a typical neural network setup, we would pass in ground-truth targets to compare against our model predictions. For example, many Tensorflow/Keras examples use something like:
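That is, something along these lines, where x_train and y_train stand in for the input and ground-truth target arrays:

```python
# Losses are declared in compile() and compared against targets passed to fit()
model.compile(optimizer="adam", loss="mse")
model.fit(x_train, y_train, epochs=10)
```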
With DeepKoopman, we know the target values for losses (1) and (2), but y1 and y1_pred do not have ground-truth values, so we cannot use the same approach to calculate loss (3). Instead, Keras offers a second interface for adding custom losses: model.add_loss().
model.add_loss() takes a tensor as input, which means that you can create arbitrarily complex computations using Keras and Tensorflow, then simply add the result as a loss.
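For instance, continuing with the tensors defined above, loss (3) can be computed directly from the internal tensors and attached to the model. Passing symbolic tensors to add_loss works with functional models in Tensorflow 2, though newer Keras versions may warn about this pattern:

```python
# Loss (3): mean squared error between the encoded y1 and the predicted y1
linearity_loss = tf.reduce_mean(tf.square(y1 - y1_pred))
model.add_loss(linearity_loss)

# Losses (1) and (2) can be attached the same way, keeping all terms in one place
model.add_loss(tf.reduce_mean(tf.square(x0 - x0_reconstructed)))
model.add_loss(tf.reduce_mean(tf.square(x1 - x1_pred)))
```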
If you want to track arbitrary metrics, you can use a similar API through model.add_metric():
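For example, to surface the linearity error in the training logs (depending on your Tensorflow version, the aggregation argument may be optional or deprecated):

```python
# Track the linearity loss as a named metric during training
model.add_metric(linearity_loss, name="linearity_error", aggregation="mean")
```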
The last step is to compile and fit the model:
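Since all of the loss terms were attached with add_loss(), compile() does not need a loss argument and fit() does not need target arrays, only the two inputs. Here, x0_train and x1_train are hypothetical arrays of consecutive time samples:

```python
model.compile(optimizer="adam")
model.fit([x0_train, x1_train], epochs=100, batch_size=32)
```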
Note: unfortunately, the model.add_loss() approach is not compatible with applying loss functions to outputs through model.compile(loss=...). The best solution for losses that include both model outputs and internal tensors may be to define a custom training loop.