Log analysis

更新时间:
复制 MD 格式

PolarDB-X 1.0 supports powerful SQL audit and analysis, built on Log Service. This topic describes common SQL log analysis queries and provides examples.

Background information

After you enable the SQL audit and analysis feature for PolarDB-X 1.0, you can use the Log Service query syntax for SQL audit and analysis. The query syntax allows you to quickly identify problem SQL and analyze the execution status, performance metrics, and security of your PolarDB-X 1.0 database. For more information about the Log Service query syntax, see Query syntax and functions.

Notes

In the same region, the audit logs of all PolarDB-X 1.0 databases are written to the same Logstore in Log Service. Therefore, the search page for PolarDB-X 1.0 SQL audit and analysis includes a default __topic__ filter. This ensures that your searches return only SQL logs from your PolarDB-X 1.0 database. You must append all queries provided in this topic after the existing filter condition.

For example, the first part of the query in the search box is the default filter condition, and the part after the and keyword is the appended filter condition.

On the SQL Audit and Analysis page, the query box is automatically populated with the __topic__ filter as the default condition. You can then append your query statement by using the and keyword. The Quick Analysis panel on the left lists available index fields, such as __topic__, affect_rows, and client_ip. Details of the matching log entries are displayed on the right.

Quickly identify problem SQL

You can use the following commands to quickly identify problem SQL.

  • Fuzzy search

    For example, to search for SQL statements that contain the keyword 34, run the following command:

     and sql: 34
    This query returns SQL log entries that contain the keyword 34. Each log entry includes details such as affect_rows (rows affected), response_time (response time), sql (the SQL statement), and sql_type.
  • Field search

    Based on pre-configured index fields, PolarDB-X 1.0 SQL audit and analysis also supports searching by specific fields. For example, to search for SQL statements of the Drop type, run the following command:

    and sql_type:Drop
    This query returns SQL log entries where sql_type is Drop. The sql_type field is highlighted as Drop, and the sql field shows the specific statement, such as drop table if exists bb. The timeline chart at the top of the page shows the distribution of logs over time.

    Log Service supports click-to-generate query statements.

    In the log details, clicking the value of a field (for example, Update in the sql_type field) automatically generates the corresponding filter condition in the query box (for example, and sql_type: Update).
  • Multi-condition search

    You can use keywords such as and or or to perform multi-condition searches. For example, to search for a delete operation on the row where id=34, run the following command:

    and sql:34 and sql_type: Delete
  • Numerical comparison search

    The affect_rows and response_time index fields are numeric and support comparison operators. For example, to search for INSERT statements with a response_time greater than 1 second, run the following command:

     and response_time  > 1 and sql_type: Insert

    Alternatively, to search for SQL statements that delete more than 100 rows, run the following command:

     and affect_rows  > 100 and sql_type: Delete

SQL execution analysis

You can use the following commands to analyze the execution status of your SQL statements.

  • SQL execution failure rate

    To calculate the failure rate of SQL executions, run the following command:

    | SELECT sum(case when fail = 1 then 1 else 0 end) * 1.0 / count(1) as fail_ratio

    The query results are as follows:

    Statistical chartfail_ratioSave as Alert

    If your service is sensitive to SQL error rates, you can customize alert information based on these query results. The following alert setting checks every 15 minutes for the number of logs from the past 15 minutes that have an SQL execution error rate greater than 0.01. You can also customize alerts based on your business needs.

    Alert rule nameSQL failure rate alertQuick name searchUse the current queryTrigger count1Notification typeNotification CenterNotification contentHigh SQL execution failure rate
  • Total rows queried by SELECT statements

    To calculate the total number of rows returned by all SELECT statements, run the following command:

    and sql_type: Select | SELECT sum(affect_rows)
  • SQL type distribution

    To view the distribution of different SQL statement types, run the following command:

    | SELECT  sql_type, count(sql) as times GROUP BY sql_type
  • Distribution of unique user IPs for SQL

    To view the distribution of IP addresses for unique users executing SQL statements, run the following command:

    | SELECT  user, client_ip, count(sql) as times GROUP BY user, client_ip

SQL performance analysis

You can use the following commands to view details about SQL performance.

  • Average response time for SELECT

    To calculate the average response time of SELECT statements, run the following command:

    and sql_type: Select | SELECT avg(response_time)
  • Distribution of SQL execution time

    To analyze the distribution of SQL execution times, run the following command:

    and response_time > 0 | select   case  when response_time <= 10 then '<=10 ms'  when response_time > 10 and response_time <= 100 then '10-100 ms'  when response_time > 100 and response_time <= 1000 then '100 ms-1 s'  when response_time > 1000 and response_time <= 10000  then '1 s-10 s'  when response_time > 10000 and response_time <= 60000  then '10 s-1 min'  else '>1 min' end as latency_type,  count(1) as cnt group by latency_type order by latency_type DESC
    Note The preceding query provides a distribution of SQL execution times based on predefined time intervals. You can adjust the time intervals to get more granular results.
  • Top 50 slow SQL queries

    To identify the top 50 slow SQL queries in the system, run the following command:

    SELECT date_format(from_unixtime(__time__), '%m/%d %H:%i:%s') as time, user, client_ip, client_port, sql_type, affect_rows, response_time, sql ORDER BY  response_time desc LIMIT 50

    The query result includes the execution time, username, IP address, port number, SQL type, rows affected, response time, and the SQL text.

  • Top 10 high-cost SQL templates

    In most applications, SQL statements are dynamically generated from a few templates with different parameters. You can use the following command to identify high-cost SQL templates by their template ID for analysis and optimization:

    SELECT sql_code as "SQL Template ID", round(total_time * 1.0 /sum(total_time) over() * 100, 2) as "Overall Time Ratio (%)" ,execute_times as "Execution Count", round(avg_time) as "Average Execution Time",round(avg_rows) as "Average Affected Rows", CASE WHEN length(sql) > 200 THEN  concat(substr(sql, 1, 200), '......') ELSE trim(lpad(sql, 200, ' ')) end as "Sample SQL" FROM  (SELECT sql_code, count(1) as execute_times, sum(response_time) as total_time, avg(response_time) as avg_time, avg(affect_rows) as avg_rows, arbitrary(sql) as sql FROM log GROUP BY sql_code) ORDER BY "Overall Time Ratio (%)" desc limit 10

    The result includes the SQL template ID, the template’s share of the total query time, execution count, average execution time, average rows affected, and a sample SQL statement, truncated to 200 characters.

    Note The preceding query sorts by the overall time ratio. You can also sort by average execution time or execution count to investigate issues.
  • Average transaction duration

    For SQL statements within the same transaction, the pre-configured trace_id field shares a common prefix and ends with a suffix like '-' + sequence_number. For non-transactional SQL, the trace_id does not contain a '-'. You can use this pattern to analyze the performance of transactional SQL.

    Note Transaction analysis involves prefix matching operations, which can be less efficient than other types of queries.
    • To calculate the average execution time of transactions, run the following command:
       SELECT  sum(response_time) / COUNT(DISTINCT substr(trace_id, 1, strpos(trace_id, '-') - 1)) where strpos(trace_id, '-') > 0
    • Top 10 slow transactions

      To get a list of slow transactions sorted by execution time, run the following query:

      SELECT substr(trace_id, 1, strpos(trace_id, '-') - 1) as "Transaction ID" , sum(response_time) as "Transaction Duration" where strpos(trace_id, '-') > 0 GROUP BY substr(trace_id, 1, strpos(trace_id, '-') - 1) ORDER BY "Transaction Duration" DESC LIMIT 10

      Based on the results, you can use a slow transaction ID to search for all SQL statements within that transaction and analyze the root cause. To do this, run a query similar to the following:

       and trace_id: db3226a20402000*
    • Top 10 transactions with large-batch operations

      To find transactions that perform large-batch operations, run the following query. It sorts transactions by the total number of rows affected.

      SELECT substr(trace_id, 1, strpos(trace_id, '-') - 1) as  "Transaction ID" , sum(affect_rows) as "Affected Rows" where strpos(trace_id, '-') > 0 GROUP BY substr(trace_id, 1, strpos(trace_id, '-') - 1) ORDER BY "Affected Rows" DESC LIMIT 10

SQL security analysis

You can use the following commands to analyze the security of your SQL statements.

  • Failed SQL type distribution

    To view the type distribution of failed SQL statements, run the following command:

    and fail > 0 | select sql_type, count(1) as "Error Count" group by sql_type
  • List of high-risk SQL statements

    High-risk SQL refers to DROP or TRUNCATE statements. You can add more conditions to fit your specific requirements.

    and (sql_type: Drop OR  sql_type: Truncate)
  • List of large-scale deletion SQL statements

    To list SQL statements that perform large-scale deletions, run the following command:

    and affect_rows > 100 and sql_type: Delete | SELECT date_format(from_unixtime(__time__), '%m/%d %H:%i:%s') as time, user, client_ip, client_port, affect_rows, sql ORDER BY  affect_rows desc LIMIT 50