Julien Simon

Posted on • Originally published at julsimon.Medium on

Video: Accelerate Transformer inference with AWS Inferentia2

AWS Inferentia2 is now generally available, and I couldn’t resist testing it with BERT models and comparing results with Inferentia1.

This thing is FAST and looks very cost-effective. Check it out!
