Starting today, you can choose Inferentia 2 and Trainium 1 as additional targets when compiling your PyTorch and TensorFlow models with Amazon SageMaker Neo. SageMaker Neo is a capability of Amazon SageMaker that optimizes machine learning (ML) models for inference on SageMaker, delivering faster inference without any loss in accuracy. Amazon Elastic Compute Cloud (Amazon EC2) Inf2 instances, which are powered by Inferentia 2, deliver high performance at the lowest cost for generative artificial intelligence (AI) models, including large language models (LLMs) and vision transformers. AWS Trainium is an ML accelerator that AWS purpose-built for deep learning training of models with 100 billion or more parameters.
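As a rough illustration of selecting one of the new targets, the sketch below builds the request for a SageMaker Neo compilation job and passes it to the boto3 `create_compilation_job` call. It assumes the target identifiers are `ml_inf2` and `ml_trn1`; confirm these against the current SageMaker API reference. The helper names, job name, role ARN, S3 locations, and input shape are all hypothetical placeholders, not values from this announcement.

```python
import json

# Assumed Neo TargetDevice identifiers for the new targets; verify them
# against the SageMaker API reference before relying on them.
NEO_TARGETS = {"inferentia2": "ml_inf2", "trainium1": "ml_trn1"}


def build_compilation_request(job_name, role_arn, model_s3_uri, output_s3_uri,
                              input_shape, framework="PYTORCH",
                              framework_version=None, target="inferentia2",
                              max_runtime_seconds=900):
    """Build keyword arguments for a Neo compilation job (hypothetical helper)."""
    if target not in NEO_TARGETS:
        raise ValueError(f"unknown target: {target!r}")

    input_config = {
        "S3Uri": model_s3_uri,
        # Neo expects the input names and shapes as a JSON string.
        "DataInputConfig": json.dumps(input_shape),
        "Framework": framework,
    }
    # Only send a framework version when the caller supplies one.
    if framework_version is not None:
        input_config["FrameworkVersion"] = framework_version

    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": input_config,
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": NEO_TARGETS[target],
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_seconds},
    }


def submit_compilation_job(request):
    """Submit the request; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so building a request works without boto3

    return boto3.client("sagemaker").create_compilation_job(**request)
```

Keeping request construction separate from submission lets you inspect or log the payload before any AWS call is made. To target Trainium 1 instead, pass `target="trainium1"`.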