What is AWS AsyncInf:ml?
Last updated: April 10, 2025

AsyncInf:ml covers a solution that handles inference requests by queuing them and processing them asynchronously. It suits use cases with large data payloads or models with long processing times that do not need an immediate response. Pricing is based on the selected instance type.
Line Item | Region |
---|---|
APN2-AsyncInf:ml.m4.xlarge | Asia Pacific region in Seoul, South Korea |
USE1-AsyncInf:ml.c4.xlarge | US East region in Northern Virginia |
USE1-AsyncInf:ml.g4dn.xlarge | US East region in Northern Virginia |
USW2-AsyncInf:ml.c5.2xlarge | US West region in Oregon |
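
The line items in the table appear to follow the pattern `<region prefix>-AsyncInf:<instance type>`. As a minimal sketch under that assumption, the Python snippet below splits a line item into its region and instance type. The names `REGION_NAMES`, `AsyncInfLineItem`, and `parse_async_inf_line_item` are hypothetical helpers introduced for illustration, and the region mapping covers only the three prefixes listed in the table.

```python
from typing import NamedTuple

# Region prefixes taken from the table above; any other prefix is
# reported as "unknown region" rather than guessed.
REGION_NAMES = {
    "APN2": "Asia Pacific region in Seoul, South Korea",
    "USE1": "US East region in Northern Virginia",
    "USW2": "US West region in Oregon",
}


class AsyncInfLineItem(NamedTuple):
    region_code: str
    region_name: str
    instance_type: str


def parse_async_inf_line_item(line_item: str) -> AsyncInfLineItem:
    """Split '<prefix>-AsyncInf:<instance type>' into its parts.

    Assumes the format seen in the table; raises ValueError otherwise.
    """
    # Split on the first hyphen, then on the first colon.
    region_code, dash, rest = line_item.partition("-")
    usage, colon, instance_type = rest.partition(":")
    if not dash or not colon or usage != "AsyncInf" or not instance_type:
        raise ValueError(f"not an AsyncInf line item: {line_item!r}")
    region_name = REGION_NAMES.get(region_code, "unknown region")
    return AsyncInfLineItem(region_code, region_name, instance_type)


if __name__ == "__main__":
    item = parse_async_inf_line_item("APN2-AsyncInf:ml.m4.xlarge")
    print(item.region_name, "|", item.instance_type)
```

Because pricing depends on the instance type, grouping parsed line items by `instance_type` is one possible way to see which instances drive asynchronous inference cost.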