TensorFlow on Arduino

by Paul Bruffett


Training and deploying a TensorFlow model to an Arduino


I am going to train a toy algorithm and deploy it for inferencing on an Arduino Nano 33 BLE Sense. I am seeking to build and test a shell using the fewest possible components, to be enhanced later.

I’ll be using the AutoMPG Dataset, training a model that will use one feature, Horsepower, to predict the vehicle’s miles per gallon. We will interact with the model using the Arduino Serial Monitor.

The training notebook is available if you want to follow along.

I won’t spend a lot of time on data preparation since it’s a fairly straightforward dataset. The one note: because I want to invoke the model and get a prediction from the Serial Monitor, we’ll only use one feature, Horsepower.

So here we’re dropping nulls, splitting the data, getting our label (MPG) and our one feature (Horsepower).
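In code, those steps look roughly like the sketch below; the column names and download URL are assumptions based on the standard UCI Auto MPG dataset, and the split details may differ from the notebook.

```python
import pandas as pd

# Load the Auto MPG dataset (URL and column names assumed from the UCI archive).
url = "http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data"
columns = ["MPG", "Cylinders", "Displacement", "Horsepower",
           "Weight", "Acceleration", "Model Year", "Origin"]
raw = pd.read_csv(url, names=columns, na_values="?",
                  comment="\t", sep=" ", skipinitialspace=True)

dataset = raw.dropna()  # drop rows with nulls (Horsepower has a few)

# 80/20 train/test split.
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)

# Label is MPG; the single feature is Horsepower.
train_labels = train_dataset.pop("MPG")
test_labels = test_dataset.pop("MPG")
train_features = train_dataset[["Horsepower"]]
test_features = test_dataset[["Horsepower"]]
```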

Next I build my model. I only need one input (Horsepower) and I am building a very simple model with one hidden layer. I also only have one output neuron to predict MPG.
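A sketch of such a model; the hidden-layer width and training settings here are illustrative choices, not necessarily the notebook’s exact values.

```python
import tensorflow as tf

# One input (Horsepower), one hidden layer, one output neuron (predicted MPG).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=[1]),
    tf.keras.layers.Dense(1),
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss="mean_squared_error")

model.fit(train_features, train_labels,
          epochs=200, validation_split=0.2, verbose=0)
```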

Finally, I’ll save the model so I can make my conversions later.
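With TensorFlow 2’s Keras API this is a one-liner (the directory name is arbitrary):

```python
# Export in SavedModel format so the TFLite converter can load it later.
model.save("mpg_model")
```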

Now we’ll convert the model. On the GPU the model used float32 numbers for its weights and biases; on a microcontroller those would run very slowly, if at all. We will quantize the model after training. (Using smaller weights and activations at training time, quantization-aware training, is a separate topic.)

We need to convert to integers in order to run on limited microcontroller hardware, and this frequently has minimal impact on accuracy. There are several options for converting a model:

  • float32 to float16: this can cut a model’s size roughly in half and dramatically speed up inferencing on some hardware. Parameters are stored as float16; on hardware without native float16 support they are dequantized and inference runs in float32.
  • int8 parameters: the model uses mixed integer/float computation where available (TensorFlow calls this dynamic range quantization).
  • int8 parameters and activations: the model executes with integer-only operations.

The TensorFlow Lite documentation includes a handy decision tree and additional detail on choosing between these options.
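As a sketch, the first two options map onto TFLiteConverter settings roughly like this (the full-integer path is shown further below):

```python
# Option 1: float32 -> float16 weights.
fp16_converter = tf.lite.TFLiteConverter.from_saved_model("mpg_model")
fp16_converter.optimizations = [tf.lite.Optimize.DEFAULT]
fp16_converter.target_spec.supported_types = [tf.float16]
fp16_model = fp16_converter.convert()

# Option 2: int8 weights with mixed computation (dynamic range quantization)
# is just the optimizations flag on its own.
dr_converter = tf.lite.TFLiteConverter.from_saved_model("mpg_model")
dr_converter.optimizations = [tf.lite.Optimize.DEFAULT]
dr_model = dr_converter.convert()
```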

We want to quantize to the last option: int8 parameters and activations, with integer-only operations enforced. To do this we have to provide a representative dataset. It’s required because the 8-bit values map to real numbers through a linear transformation, and that transformation needs a known range. For weights the range can be computed automatically, since TensorFlow can calculate it for each layer from the trained values. For activations it’s harder, because the range of each layer’s outputs isn’t evident from the parameters alone. If a chosen range is too small, values are clipped at the minimum or maximum; if it’s too large, precision is lost.

TensorFlow uses the representative dataset to perform this calculation for the activations and convert them.

Here we’re taking the test dataset, converting it to a tensor, and using that in our representative_dataset function, which yields one record at a time when called.
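A sketch of that generator, reusing the test_features name from the earlier snippets (an assumption):

```python
# Convert the test features to a float32 tensor.
test_tensor = tf.cast(test_features.to_numpy(), tf.float32)

def representative_dataset():
    # Yield one sample at a time, shaped (1, 1) to match the model input.
    for value in test_tensor:
        yield [tf.reshape(value, (1, 1))]
```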

We actually save two models here. The first just converts to TFLite format but keeps weights and activations as float32, so we can see how quantization affects accuracy. The second is saved with quantization and integer-only operations enforced (without that setting, the converter can fall back to float32 for unsupported operations). A representative dataset is provided, and the model is converted and saved.
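Putting that together, a sketch of both conversions (file names are illustrative):

```python
# 1. Plain TFLite conversion: weights and activations remain float32.
converter = tf.lite.TFLiteConverter.from_saved_model("mpg_model")
tflite_float = converter.convert()
with open("mpg_float.tflite", "wb") as f:
    f.write(tflite_float)

# 2. Full int8 quantization with integer-only operations enforced.
converter = tf.lite.TFLiteConverter.from_saved_model("mpg_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite_quant = converter.convert()
with open("mpg_quant.tflite", "wb") as f:
    f.write(tflite_quant)
```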


