Tutorial Series: ONNX


If you are into computer science, you have probably heard of TensorFlow and PyTorch or even used them before. They are among the most famous frameworks for deep learning model training. But there is another player in this world that might not have crossed your radar: ONNX.

ONNX, short for Open Neural Network Exchange, isn't as famous, but it plays an important role in the world of deep learning. You can think of it as a translator between different deep-learning tools. TensorFlow and PyTorch are both excellent at what they do, and ONNX acts as a bridge between them: it helps these frameworks talk to each other and makes it easier to share and use models across different software. In short, it simplifies model transfer and compatibility across frameworks.

ONNX is not a brand-new framework: it was born in September 2017 as an open-source project by Meta (then Facebook) and Microsoft, created to establish a standardized format for representing deep learning models.

ONNX vs TensorFlow and PyTorch
Why Use ONNX?
Why and when do we need to transfer a model to another framework?
Hands-on:
Installation
Exporting from Frameworks
Conversion Tools
Importing into Frameworks
ONNX Runtime
Inference with Imported Models
Performance Optimization
Example

Interoperability:

  • TensorFlow: TensorFlow models are native to the TensorFlow ecosystem, although there are tools available for converting them to other formats, such as TensorFlow Lite for mobile deployment and TensorFlow.js for web deployment.
  • PyTorch: PyTorch models are likewise native to the PyTorch framework; moving them elsewhere requires an export step.
  • ONNX: ONNX is specifically designed for interoperability. Models can be easily transferred between different frameworks, including TensorFlow and PyTorch.

Ease of Use:

  • TensorFlow: TensorFlow offers a comprehensive set of tools and resources for deep learning development.
  • PyTorch: PyTorch is known for its simplicity and ease of use, with an intuitive API that makes it popular among researchers and developers.
  • ONNX: While ONNX itself is not a deep learning framework, it simplifies the process of moving models between frameworks, which improves the overall ease of use.

Community and Ecosystem:

  • TensorFlow: TensorFlow has a large and active community with extensive documentation, tutorials, and pre-trained models. It is widely adopted across academia and industry.
  • PyTorch: PyTorch has seen rapid adoption, particularly in the research community, thanks to its dynamic computation graph and Pythonic design.
  • ONNX: ONNX benefits from the support of major tech companies and research institutions, with a growing ecosystem of tools and frameworks that support the format.

Performance:

  • TensorFlow and PyTorch: Both TensorFlow and PyTorch are highly optimized for performance, with support for GPU acceleration and distributed training.
  • ONNX: ONNX itself does not directly affect model performance, but it makes it possible to run models efficiently across different frameworks and to leverage platform-specific optimizations.

Model Compatibility:

  • TensorFlow and PyTorch: Models trained in TensorFlow or PyTorch can only be directly used within their respective frameworks.
  • ONNX: ONNX serves as a common exchange format, so models can be transferred between TensorFlow, PyTorch, and other supported frameworks smoothly.

TensorFlow and PyTorch are both powerful deep-learning frameworks, and ONNX complements them by providing a standardized format for model interoperability. With ONNX, developers can benefit from the strengths of both frameworks while avoiding vendor lock-in.

Some of the following features have already come up above, but it does not hurt to collect all of them in one section.

Interoperability:

ONNX facilitates interoperability between different deep learning frameworks such as TensorFlow, PyTorch, and MXNet. Models trained in one of these frameworks can be exported to ONNX format and then imported into another framework without needing to rewrite or retrain the model.

Efficiency:

By using ONNX, developers can avoid the overhead of manually converting models between frameworks. This saves time and effort, especially when the project involves multiple frameworks or when collaborating with others who use different frameworks.

Flexibility:

ONNX provides flexibility in choosing the best framework for a particular task without being locked into a specific ecosystem. It allows developers to take advantage of the unique features and optimizations offered by different frameworks while maintaining compatibility with others.

Ecosystem Support:

ONNX is supported by a wide range of deep learning frameworks, including TensorFlow, PyTorch, Caffe2, and more. This broad ecosystem support ensures that ONNX can be easily integrated into existing workflows and projects.

Community and Industry Adoption:

ONNX plays a significant role in both research and industrial projects. Many companies and research institutions are using it, which shows how well it addresses the challenges of deep learning model interoperability.

Integration with Existing Ecosystems:

Organizations or projects may have existing infrastructure built around a specific deep-learning framework. If you are adding models from another framework into your existing setup, you need to convert them to the same format as what you are already using.

Framework Selection Flexibility:

Each deep-learning framework has its special features and strengths. When you convert models between frameworks, you can take advantage of the best parts of each one in the same project. For example, you might use TensorFlow for model training and PyTorch for experimentation, then convert models between the two as needed.

Deployment Optimization:

Some frameworks or runtime environments may have better performance or compatibility with specific hardware platforms. By converting models to fit these platforms, we make sure they run efficiently and use resources effectively.

Collaboration and Knowledge Sharing:

Anyone who works with deep learning uses different software tools. When we convert models, it makes it easy for everyone to share and try out each other’s work, even if they’re using different tools. This helps us learn from each other and come up with new ideas.

Platform Independence:

When we convert models to a standard format like ONNX, it means they are not stuck with one particular software or device. Instead, they can work on lots of different systems, like different software tools, computers, or even phones. This gives us more flexibility and freedom to use our models wherever we need them.

Simply put, converting models between frameworks makes them more flexible, helps people collaborate better, improves deployment, and ensures they work well in different situations. This makes it easier to develop and use deep learning models in many different ways.

Now that we understand the importance of transferring models and utilizing the ONNX framework, it is time to learn how to use it. However, before we can dive in, we need to ensure it is installed on our system. Let’s continue with the installation.

Installation:

Like my other tutorials, I'll be using Linux (Debian) as my default operating system for the installation process. Installation is straightforward: you can install ONNX for Python using pip or conda:

pip install onnx

Optionally, you can also install ONNX Runtime, an efficient inference engine for ONNX models. It is not required, but it can improve performance when running ONNX models:

pip install onnxruntime

If you later want to convert ONNX models to TensorFlow, you should also install:

pip install onnx_tf
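
To confirm everything is in place, you can run a quick sanity check in Python (a minimal sketch; the onnxruntime lines only apply if you installed the optional runtime):

import onnx
print(onnx.__version__)   # should print the installed ONNX version

import onnxruntime        # optional, only if you installed onnxruntime
print(onnxruntime.__version__)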

Exporting from Frameworks:

If you have trained a model using a deep learning framework like TensorFlow or PyTorch, you can export it to ONNX format using the framework's tools or APIs. For example, in PyTorch, you can export a model to ONNX format using the torch.onnx.export() function, as we will do in the hands-on example below.

Conversion Tools:

If your framework doesn't have built-in support for exporting to ONNX, you can use standalone conversion tools: tf2onnx, for example, converts TensorFlow models (PyTorch, by contrast, ships its own exporter in torch.onnx.export()).
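
For instance, converting a TensorFlow SavedModel with the tf2onnx command-line tool looks roughly like this (a sketch; ./saved_model_dir is a placeholder for your own model directory):

python -m tf2onnx.convert --saved-model ./saved_model_dir --output model.onnx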

Importing into Frameworks:

Once you have an ONNX model file, you can import it into another deep-learning framework that supports ONNX. Most frameworks provide APIs or tools for importing ONNX models; for TensorFlow, for example, the onnx-tf package we installed earlier provides prepare(), which we will use in the example at the end of this tutorial.

ONNX Runtime:

Alternatively, you can use ONNX Runtime to load and run ONNX models without needing to import them into a specific framework. ONNX Runtime provides efficient execution of ONNX models across different hardware platforms.

Inference with Imported Models:

Once the model is imported into your desired framework or loaded into ONNX Runtime, you can use it to make predictions on new data. Provide input data to the model and obtain the output predictions.
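
For example, running the model we will export below (simple_nn.onnx) with ONNX Runtime could look like this (a minimal sketch, assuming the file exists and the model takes a (1, 784) float input):

import numpy as np
import onnxruntime as ort

# Load the ONNX model into an inference session
session = ort.InferenceSession("simple_nn.onnx", providers=["CPUExecutionProvider"])

# Look up the name of the model's input tensor
input_name = session.get_inputs()[0].name

# Run inference on a random input of the expected shape
x = np.random.randn(1, 784).astype(np.float32)
outputs = session.run(None, {input_name: x})
print(outputs[0].shape)  # (1, 10): one score per class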

Performance Optimization:

Experiment with different hardware configurations and optimizations (e.g., quantization) to improve the performance of your ONNX models, especially during inference.
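
As a concrete example, ONNX Runtime ships a dynamic quantization utility that converts float32 weights to 8-bit integers, which shrinks the model and often speeds up CPU inference (a sketch; the file names are placeholders matching the model we export later):

from onnxruntime.quantization import quantize_dynamic, QuantType

# Quantize the float32 weights of the exported model to 8-bit integers
quantize_dynamic(
    "simple_nn.onnx",        # input model
    "simple_nn_int8.onnx",   # quantized output model
    weight_type=QuantType.QUInt8,
)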

Now that the setup is finished, we can try it out.

Here is an example of a simple neural network model implemented in PyTorch; afterward, we will show its equivalent representation in ONNX:

import torch
import torch.nn as nn
import torch.optim as optim

# Assuming we have dataset and dataloader set up
# For demonstration purposes, let's create some random inputs and labels
inputs = torch.randn(32, 784)  # Assuming batch size of 32 and input size of 784
labels = torch.randint(0, 10, (32,))  # Assuming 10 classes and batch size of 32

# Definition of a simple neural network model
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

# Instantiate the model
model = SimpleNN()

# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Define the number of epochs
num_epochs = 10

# Train the model
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(inputs)
    loss = criterion(outputs, labels)

    # Backward pass and optimization
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Here we have a line-by-line explanation of the code above:

import torch: Imports PyTorch library.

import torch.nn as nn: Imports the neural network module from PyTorch that contains useful classes for building neural network models.

import torch.optim as optim: Imports the optimization module from PyTorch, which includes different optimization algorithms.

inputs = torch.randn(32, 784): Generates a tensor of random numbers with shape (32, 784), which means a batch of 32 inputs, each one with a size of 784 (assuming each input is a vector of length 784).

labels = torch.randint(0, 10, (32,)): Generates a tensor of random integers between 0 and 9 (inclusive) with shape (32,); this tensor represents the labels for the respective inputs. This assumes 10 classes and a batch size of 32.

class SimpleNN(nn.Module):: Defines a neural network class named SimpleNN that inherits from nn.Module, which is a base class for all neural network modules in PyTorch.

def __init__(self):: Initializes the SimpleNN class. This method is called when an object of the class is created.

super(SimpleNN, self).__init__(): Calls the constructor of the superclass which is nn.Module to initialize the neural network.

self.fc1 = nn.Linear(784, 128): Defines the first fully connected layer fc1 with an input size of 784 and an output size of 128.

self.relu = nn.ReLU(): Initializes the rectified linear unit (ReLU) activation function.

self.fc2 = nn.Linear(128, 10): Defines the second fully connected layer fc2 with an input size of 128 and an output size of 10, which corresponds to the number of classes.

def forward(self, x):: Defines the forward pass method for the neural network. This method specifies how input data flows through the network.

x = torch.flatten(x, 1): Flattens the input tensor x starting from dimension 1, collapsing every dimension except the batch dimension into one; for our already-flat (32, 784) input this is effectively a no-op.

x = self.fc1(x): Passes the flattened input through the first fully connected layer.

x = self.relu(x): Applies the ReLU activation function to the output of the first fully connected layer.

x = self.fc2(x): Passes the output of the ReLU activation through the second fully connected layer.

return x: Returns the output of the neural network.

From here our main section starts:

model = SimpleNN(): Creates an instance of the SimpleNN class, our neural network model; the __init__ method is called automatically at this point.

criterion = nn.CrossEntropyLoss(): Defines the loss function, which is cross-entropy loss, commonly used for classification tasks.

optimizer = optim.SGD(model.parameters(), lr=0.01): Initializes the optimizer, which is stochastic gradient descent (SGD) in this case, to update the parameters of the model during training.

num_epochs = 10: Specifies the number of epochs, which is the number of times the entire dataset will be passed forward and backward through the network during training.

for epoch in range(num_epochs):: Starts a loop over the specified number of epochs.

outputs = model(inputs): Performs a forward pass of the input data through the model to obtain the predicted outputs.

loss = criterion(outputs, labels): Calculates the loss between the predicted outputs and the actual labels using the specified loss function.

optimizer.zero_grad(): Clears the gradients of all optimized parameters before performing a backward pass.

loss.backward(): Computes the gradients of the loss with respect to the model parameters, enabling gradient descent optimization.

optimizer.step(): Updates the model parameters based on the computed gradients and the chosen optimization algorithm (SGD in this case).

Now we want to export the implemented PyTorch model to ONNX:

# Other libraries used in the previous PyTorch code should be here
import onnx
import onnxruntime

# The rest of the PyTorch code should be here

# Export the model to ONNX format
dummy_input = torch.randn(1, 784)  # A dummy input to trace the model
torch.onnx.export(model, dummy_input, "simple_nn.onnx", verbose=True)

After running the complete version of the code, you will get a stored file called simple_nn.onnx and output like the following in the terminal:

Exported graph: graph(%onnx::Flatten_0 : Float(1, 784, strides=[784, 1], requires_grad=0, device=cpu),
%fc1.weight : Float(128, 784, strides=[784, 1], requires_grad=1, device=cpu),
%fc1.bias : Float(128, strides=[1], requires_grad=1, device=cpu),
%fc2.weight : Float(10, 128, strides=[128, 1], requires_grad=1, device=cpu),
%fc2.bias : Float(10, strides=[1], requires_grad=1, device=cpu)):
%/Flatten_output_0 : Float(1, 784, strides=[784, 1], requires_grad=0, device=cpu) = onnx::Flatten[axis=1, onnx_name="/Flatten"](%onnx::Flatten_0), scope: __main__.SimpleNN:: # /home/shima/test.py:21:0
%/fc1/Gemm_output_0 : Float(1, 128, strides=[128, 1], requires_grad=1, device=cpu) = onnx::Gemm[alpha=1., beta=1., transB=1, onnx_name="/fc1/Gemm"](%/Flatten_output_0, %fc1.weight, %fc1.bias), scope: __main__.SimpleNN::/torch.nn.modules.linear.Linear::fc1 # /usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py:114:0
%/relu/Relu_output_0 : Float(1, 128, strides=[128, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="/relu/Relu"](%/fc1/Gemm_output_0), scope: __main__.SimpleNN::/torch.nn.modules.activation.ReLU::relu # /usr/local/lib/python3.10/dist-packages/torch/nn/functional.py:1453:0
%8 : Float(1, 10, strides=[10, 1], requires_grad=1, device=cpu) = onnx::Gemm[alpha=1., beta=1., transB=1, onnx_name="/fc2/Gemm"](%/relu/Relu_output_0, %fc2.weight, %fc2.bias), scope: __main__.SimpleNN::/torch.nn.modules.linear.Linear::fc2 # /usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py:114:0
return (%8)

But what does it say? This is the exported ONNX graph that represents the computation flow of your neural network model. Let’s break down each line:

%onnx::Flatten_0 : Float(1, 784, strides=[784, 1], requires_grad=0, device=cpu): This line defines an input tensor named onnx::Flatten_0 with a shape of (1, 784) and other properties such as strides, gradient requirement, and device.

%fc1.weight : Float(128, 784, strides=[784, 1], requires_grad=1, device=cpu): This line defines the weight tensor for the first fully connected layer (fc1). It has a shape of (128, 784) and properties similar to the input tensor.

%fc1.bias : Float(128, strides=[1], requires_grad=1, device=cpu): This line defines the bias tensor for the first fully connected layer (fc1). It has a shape of (128,) and properties similar to the weight tensor.

%fc2.weight : Float(10, 128, strides=[128, 1], requires_grad=1, device=cpu): This line defines the weight tensor for the second fully connected layer (fc2). It has a shape of (10, 128) and properties similar to the weight tensor of fc1.

%fc2.bias : Float(10, strides=[1], requires_grad=1, device=cpu): This line defines the bias tensor for the second fully connected layer (fc2). It has a shape of (10,) and properties similar to the bias tensor of fc1.

%/Flatten_output_0 : Float(1, 784, strides=[784, 1], requires_grad=0, device=cpu) = onnx::Flatten[axis=1, onnx_name="/Flatten"](%onnx::Flatten_0), scope: __main__.SimpleNN::: This line applies the Flatten operation to the input tensor (onnx::Flatten_0) along the specified axis (axis 1), resulting in an output tensor %/Flatten_output_0.

%/fc1/Gemm_output_0 : Float(1, 128, strides=[128, 1], requires_grad=1, device=cpu) = onnx::Gemm[alpha=1., beta=1., transB=1, onnx_name="/fc1/Gemm"](%/Flatten_output_0, %fc1.weight, %fc1.bias), scope: __main__.SimpleNN::/torch.nn.modules.linear.Linear::fc1: This line performs the matrix multiplication (GEMM: General Matrix Multiply) operation between the flattened input tensor and the weight tensor of fc1, adds the bias, and produces an output tensor %/fc1/Gemm_output_0.

%/relu/Relu_output_0 : Float(1, 128, strides=[128, 1], requires_grad=1, device=cpu) = onnx::Relu[onnx_name="/relu/Relu"](%/fc1/Gemm_output_0), scope: __main__.SimpleNN::/torch.nn.modules.activation.ReLU::relu: This line applies the ReLU activation function to the output tensor of fc1, producing an output tensor %/relu/Relu_output_0.

%8 : Float(1, 10, strides=[10, 1], requires_grad=1, device=cpu) = onnx::Gemm[alpha=1., beta=1., transB=1, onnx_name="/fc2/Gemm"](%/relu/Relu_output_0, %fc2.weight, %fc2.bias), scope: __main__.SimpleNN::/torch.nn.modules.linear.Linear::fc2: This line performs another matrix multiplication operation between the ReLU output tensor and the weight tensor of fc2, adds the bias, and produces the final output tensor %8.

return (%8): This line indicates that the output of the graph is the tensor %8, which represents the final output of the neural network model.
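
If you prefer to inspect the exported file programmatically instead of reading the export log, the onnx package can verify and pretty-print it (a small sketch using the file we just created):

import onnx

# Load the exported model and check that it is a valid ONNX graph
model = onnx.load("simple_nn.onnx")
onnx.checker.check_model(model)

# Print a human-readable summary of the graph
print(onnx.helper.printable_graph(model.graph))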

Let’s review what we have done until this step:

  • Implementing a simple neural network in PyTorch
  • Exporting the model into ONNX format.

Now we want to import this model into another framework, such as TensorFlow.

import onnx
from onnx_tf.backend import prepare

# Load the ONNX model and convert it to a TensorFlow representation
onnx_model = onnx.load("simple_nn.onnx")
tf_rep = prepare(onnx_model)

# Save the converted model to disk
tf_rep.export_graph("model.pb")

tf_rep.export_graph("model.pb") is an optional step; with it, we save the converted model to disk in TensorFlow's GraphDef format.
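
To double-check the result, you can load the exported file back into TensorFlow (a sketch assuming model.pb is a frozen GraphDef as above; newer onnx-tf versions may write a SavedModel directory instead, in which case tf.saved_model.load would be the right call):

import tensorflow as tf

# Read the serialized GraphDef from disk
with tf.io.gfile.GFile("model.pb", "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

# Import it into a fresh TensorFlow graph and list the imported operations
with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")
    print([op.name for op in graph.get_operations()])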


