AI observability

Monitor, record, and analyze AI request and response metrics and logs in the statistics and log modules of the AI Gateway console.

Note

When the Throttling, Caching, and Web Search policies handle exceptions, they write exception logs, so policy behavior is fully visible in the logs.

Procedure

Log on to the AI Gateway console and choose Instance. In the top menu bar, select a region, then click the target instance ID.
In the navigation pane on the left, choose Model API, then click the target API name to go to the API Details page.
Click the Statistics tab to view the apig-ai-api-dashboard.

Important
AI Gateway uses Simple Log Service (SLS) to collect, analyze, and display logs. If you have not enabled gateway log delivery, click Enable Log Delivery.

Key metrics include:
- QPS: AI requests and responses per second, shown as overall, streaming, and non-streaming.
- Request success rate: Success rate of AI requests, calculated at 1-second, 15-second, or 1-minute intervals.
- Tokens consumed/s: Tokens consumed per second, broken down into input, output, and total.
- Average request RT (ms): Average response time for AI requests at 1-second, 15-second, or 1-minute intervals. Includes non-streaming RT, streaming RT (total stream duration), and first-packet streaming RT (time to first packet).
- Cache hits and misses/s: Successful cache retrievals (hits) and failed lookups (misses) per second.
- Throttled requests/s: Rate-limited requests per second.
- Model token usage statistics: Tokens consumed by each model over a specified period.
- Consumer token usage statistics: Tokens consumed by each consumer over a specified period.
- Threat type statistics: Threats detected by Content Moderation, categorized by type and consumer.
- Risky consumer statistics: Risks attributed to specific consumers, identified through consumer authentication.
- Throttled consumer statistics: Consumers impacted by throttling.
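As a rough illustration of how interval-based metrics such as QPS, request success rate, tokens consumed/s, and average request RT relate to individual requests, the following Python sketch aggregates request records into fixed windows. This is not the gateway's implementation: the `RequestRecord` type, its field names, and the `aggregate` function are hypothetical and exist only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RequestRecord:
    # All field names are hypothetical, chosen only for this illustration.
    timestamp: float         # request start time, in seconds
    success: bool            # whether the request succeeded
    streaming: bool          # whether the response was streamed
    input_tokens: int
    output_tokens: int
    response_time_ms: float


def aggregate(records, interval_s=15):
    """Group records into fixed windows and compute per-window metrics."""
    windows = defaultdict(list)
    for r in records:
        # Align each record to the start of its window.
        windows[int(r.timestamp // interval_s) * interval_s].append(r)

    result = {}
    for start, group in sorted(windows.items()):
        total = len(group)
        tokens_in = sum(r.input_tokens for r in group)
        tokens_out = sum(r.output_tokens for r in group)
        result[start] = {
            "qps": total / interval_s,
            "streaming_qps": sum(r.streaming for r in group) / interval_s,
            "success_rate": sum(r.success for r in group) / total,
            "input_tokens_per_s": tokens_in / interval_s,
            "output_tokens_per_s": tokens_out / interval_s,
            "total_tokens_per_s": (tokens_in + tokens_out) / interval_s,
            "avg_rt_ms": sum(r.response_time_ms for r in group) / total,
        }
    return result


# Sample data: three requests in the first 15-second window, one in the second.
records = [
    RequestRecord(0.5, True, False, 100, 200, 800.0),
    RequestRecord(3.0, True, True, 50, 400, 1200.0),
    RequestRecord(14.9, False, False, 150, 0, 100.0),
    RequestRecord(16.0, True, True, 300, 300, 900.0),
]
metrics = aggregate(records, interval_s=15)
```

Changing `interval_s` to 1 or 60 mirrors the 1-second and 1-minute intervals that the dashboard offers. Shorter windows react faster to spikes but produce noisier values.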
Click the Log tab, then use SQL to query and analyze the logs. For more information, see Quick start for query and analysis.
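To give a sense of the kind of SQL aggregation that log analysis involves, the sketch below uses Python's built-in `sqlite3` module on an in-memory table to total tokens per model, similar in spirit to the model token usage statistics. It is only an analogy: SLS has its own query and analysis syntax, and the table name `ai_log`, its columns, and the sample rows are all hypothetical rather than actual gateway log fields.

```python
import sqlite3

# In-memory database standing in for a log store (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ai_log ("
    "model TEXT, consumer TEXT, input_tokens INTEGER, output_tokens INTEGER)"
)

# Hypothetical log rows: (model, consumer, input_tokens, output_tokens).
conn.executemany(
    "INSERT INTO ai_log VALUES (?, ?, ?, ?)",
    [
        ("model-a", "consumer-1", 100, 200),
        ("model-a", "consumer-2", 50, 150),
        ("model-b", "consumer-1", 300, 100),
    ],
)

# Total tokens per model, highest first.
rows = conn.execute(
    """
    SELECT model, SUM(input_tokens + output_tokens) AS total_tokens
    FROM ai_log
    GROUP BY model
    ORDER BY total_tokens DESC
    """
).fetchall()
conn.close()
```

Grouping by `consumer` instead of `model` would give the per-consumer view. In the console, use the real log field names and the SLS query syntax.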