What is USE1-AsyncInf:ml.g4dn.xlarge?

Last updated: January 09, 2025

This billing code covers asynchronous inference, in which incoming inference requests are queued and processed asynchronously. It suits workloads with large payloads or long processing times that do not need an immediate response. Pricing is based on the selected instance type, in this case ml.g4dn.xlarge.

The USE1 prefix indicates that this billing code applies to the US East (N. Virginia) region.
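To illustrate how the code breaks down, here is a minimal Python sketch that splits it into a region prefix, a feature name, and an instance type. The `<prefix>-<feature>:<instance type>` layout is inferred from this one code rather than taken from a specification. The names `parse_usage_type`, `UsageType`, and `REGION_PREFIXES` are hypothetical, and the region table lists only the prefix described on this page.

```python
from typing import NamedTuple

# Assumption: only the prefix described on this page is listed.
REGION_PREFIXES = {"USE1": "US East (N. Virginia)"}


class UsageType(NamedTuple):
    region_prefix: str
    feature: str
    instance_type: str


def parse_usage_type(code: str) -> UsageType:
    """Split a code shaped like '<prefix>-<feature>:<instance type>'."""
    # Everything after the first ':' is treated as the instance type.
    head, colon, instance_type = code.partition(":")
    # Everything before the first '-' is treated as the region prefix.
    prefix, dash, feature = head.partition("-")
    if not (colon and dash and prefix and feature and instance_type):
        raise ValueError(f"unexpected usage type format: {code!r}")
    return UsageType(prefix, feature, instance_type)


parsed = parse_usage_type("USE1-AsyncInf:ml.g4dn.xlarge")
region = REGION_PREFIXES.get(parsed.region_prefix, "unknown region")
print(parsed.feature, parsed.instance_type, region)
```

Codes that do not follow the assumed layout raise a ValueError instead of being guessed at, so an unfamiliar code is flagged rather than silently misread.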