WAF log alert configuration examples

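This topic provides typical alert configurations for querying and analyzing Web Application Firewall (WAF) logs. You can use the alert parameters described here to add monitoring charts and configure alerts in a custom WAF log dashboard.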

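Important

This topic uses the legacy Log Service alerting feature to describe the configuration parameters. If you have upgraded to the new version of Log Service alerting, combine the query statements and recommended alert parameters in this topic with the instructions in "Quickly set up log alerts" to complete the configuration.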

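The configuration parameters of legacy Log Service alerts are shown in the following figures.

[Figure: alert configuration example]
[Figure: notification content example]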

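Abnormal 4XX rate alert

Recommended alert parameters:

  • Chart name: 4XX rate (excluding blocked requests)

  • Query statement

    user_id :your Alibaba Cloud account ID
    and not real_client_ip :IP address of the blocked requests |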
    SELECT
      user_id,
      host AS "域名",
      Rate_2XX AS "2XX比例",
      Rate_3XX AS "3XX比例",
      Rate_4XX AS "4XX比例",
      Rate_5XX AS "5XX比例",
      countall / 300 AS "aveQPS",
      status_2XX,
      status_3XX,
      status_4XX,
      status_5XX,
      countall
    FROM(
        SELECT
          user_id,
          host,
          round(
            round(status_2XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_2XX,
          round(
            round(status_3XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_3XX,
          round(
            round (status_4XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_4XX,
          round(
            round(status_5XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_5XX,
          status_2XX,
          status_3XX,
          status_4XX,
          status_5XX,
          countall
        FROM(
            SELECT
              user_id,
              host,
              count_if(
                status >= 200
                and status < 300
              ) AS status_2XX,
              count_if(
                status >= 300
                and status < 400
              ) AS status_3XX,
              count_if(
                status >= 400
                and status < 500
                and status <> 444
                and status <> 405
              ) AS status_4XX,
              count_if(
                status >= 500
                and status < 600
              ) AS status_5XX,
              COUNT(*) AS countall
            FROM          log
            GROUP BY
              host,
              user_id
          )
      )
    WHERE
      countall > 120
    ORDER BY
      Rate_4XX DESC
    LIMIT
      5

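    The chart contains the fields aveQPS, 2XX比例, 3XX比例, 4XX比例, and 5XX比例, which represent the QPS of the domain and the percentage of each class of response status code. The 4XX比例 field excludes the 444 and 405 status codes that WAF returns when it blocks CC attacks and web attacks, so that the chart shows only status code changes caused by the service itself. You can combine these fields freely when you set the alert trigger condition. For example, aveQPS>10 && 2XX比例<60 means that, within the configured time range, the QPS of a domain exceeds 10 and its 2XX rate is below 60%.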

  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.countall>3000 && $0.4XX比例>80

  • Notification trigger threshold: 2

  • Notification interval: 10 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].域名}
    - Product:WAF
    - Total requests in the last 5 minutes:${Results[0].RawResults[0].countall}
    - 2XX rate:${Results[0].RawResults[0].2XX比例} %
    - 3XX rate:${Results[0].RawResults[0].3XX比例} %
    - 4XX rate:${Results[0].RawResults[0].4XX比例} %
    - 5XX rate:${Results[0].RawResults[0].5XX比例} %
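
The bucketing and two-step rounding in the query above can be checked offline with a small sketch. The function names `status_rates` and `should_fire` are hypothetical helpers for illustration, not part of Log Service, and the thresholds are simply the suggested values from the trigger condition.

```python
from collections import Counter


def status_rates(statuses):
    """Return (countall, rates) for a list of HTTP status codes.

    As in the query, 444 and 405 are left out of the 4XX bucket,
    while countall still counts every request.
    """
    countall = len(statuses)
    if countall == 0:
        return 0, {}
    buckets = Counter()
    for status in statuses:
        if 200 <= status < 300:
            buckets["2XX"] += 1
        elif 300 <= status < 400:
            buckets["3XX"] += 1
        elif 400 <= status < 500 and status not in (444, 405):
            buckets["4XX"] += 1
        elif 500 <= status < 600:
            buckets["5XX"] += 1
    # Same two-step rounding as the query: ratio to 4 places, then percent to 2.
    rates = {
        name: round(round(buckets[name] / countall, 4) * 100, 2)
        for name in ("2XX", "3XX", "4XX", "5XX")
    }
    return countall, rates


def should_fire(countall, rates, min_requests=3000, max_4xx_rate=80):
    """Evaluate the suggested condition: countall > 3000 && 4XX rate > 80."""
    return countall > min_requests and rates.get("4XX", 0) > max_4xx_rate


countall, rates = status_rates([404] * 3500 + [200] * 500 + [444] * 100)
print(countall, rates["4XX"], should_fire(countall, rates))
```

Because blocked requests (444 and 405) stay in the denominator, a burst of blocked traffic lowers the 4XX rate rather than raising it, which is the behavior the chart description calls for.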

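Abnormal 5XX rate alert

Recommended alert parameters:

  • Chart name: 5XX rate

  • Query statement

    user_id :your Alibaba Cloud account ID
    and not real_client_ip :IP address of the blocked requests |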
    select
      user_id,
      host AS "域名",
      Rate_2XX AS "2XX比例",
      Rate_3XX AS "3XX比例",
      Rate_4XX AS "4XX比例",
      Rate_5XX AS "5XX比例",
      countall AS "相对时间内访问量",
      status_2XX,
      status_3XX,
      status_4XX,
      status_5XX,
      countall
    FROM(
        SELECT
          user_id,
          host,
          round(
            round(status_2XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_2XX,
          round(
            round(status_3XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_3XX,
          round(
            round (status_4XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_4XX,
          round(
            round(status_5XX * 1.0000 / countall, 4) * 100,
            2
          ) AS Rate_5XX,
          status_2XX,
          status_3XX,
          status_4XX,
          status_5XX,
          countall
        FROM(
            SELECT
              user_id,
              host,
              count_if(
                status >= 200
                and status < 300
              ) AS status_2XX,
              count_if(
                status >= 300
                and status < 400
              ) AS status_3XX,
              count_if(
                status >= 400
                and status < 500
              ) AS status_4XX,
              count_if(
                status >= 500
                and status < 600
              ) AS status_5XX,
              COUNT(*) AS countall
            FROM          log
            GROUP BY
              host,
              user_id
          )
      )
    WHERE
      countall > 120
    ORDER BY
      Rate_5XX DESC
    LIMIT
      5

  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.countall>3000 && $0.5XX比例>80

  • Notification trigger threshold: 2

  • Notification interval: 10 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].域名}
    - Product:WAF
    - Total requests in the last 5 minutes:${Results[0].RawResults[0].countall}
    - 2XX rate:${Results[0].RawResults[0].2XX比例} %
    - 3XX rate:${Results[0].RawResults[0].3XX比例} %
    - 4XX rate:${Results[0].RawResults[0].4XX比例} %
    - 5XX rate:${Results[0].RawResults[0].5XX比例} %

Abnormal QPS alert

Recommended alert parameters:

  • Chart name: QPS TOP 5

  • Query statement

    user_id :your Alibaba Cloud account ID
    and not real_client_ip :IP address of the blocked requests |
    SELECT
      user_id,
      host,
      Rate_2XX,
      Rate_3XX,
      Rate_4XX,
      Rate_5XX,
      countall / 60 as "aveQPS",
      status_2XX,
      status_3XX,
      status_4XX,
      status_5XX,
      countall
    FROM(
        SELECT
          user_id,
          host,
          round(
            round(status_2XX * 1.0000 / countall, 4) * 100,
            2
          ) as Rate_2XX,
          round(
            round(status_3XX * 1.0000 / countall, 4) * 100,
            2
          ) as Rate_3XX,
          round(
            round (status_4XX * 1.0000 / countall, 4) * 100,
            2
          ) as Rate_4XX,
          round(
            round(status_5XX * 1.0000 / countall, 4) * 100,
            2
          ) as Rate_5XX,
          status_2XX,
          status_3XX,
          status_4XX,
          status_5XX,
          countall
        FROM(
            SELECT
              user_id,
              host,
              count_if(
                status >= 200
                and status < 300
              ) as status_2XX,
              count_if(
                status >= 300
                and status < 400
              ) as status_3XX,
              count_if(
                status >= 400
                and status < 500
                and status <> 444
                and status <> 405
              ) as status_4XX,
              count_if(
                status >= 500
                and status < 600
              ) as status_5XX,
              COUNT(*) as countall
            FROM          log
            GROUP BY
              host,
              user_id
          )
      )
    WHERE
      countall > 120
    ORDER BY
      aveQPS DESC
    LIMIT
      5

  • Query time range: 1 minute (relative)

  • Frequency: fixed interval of 1 minute

  • Trigger condition: $0.aveQPS>=50

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF
    - Average QPS over the last minute:${Results[0].RawResults[0].aveQPS}
    - Response code 2xx_rate :${Results[0].RawResults[0].Rate_2XX}%
    - Response code 3xx_rate :${Results[0].RawResults[0].Rate_3XX}%
    - Response code 4xx_rate :${Results[0].RawResults[0].Rate_4XX}%
    - Response code 5xx_rate :${Results[0].RawResults[0].Rate_5XX}%

QPS surge alert

Recommended alert parameters:

  • Chart name: QPS surge monitoring

  • Query statement

    user_id :your Alibaba Cloud account ID |
    SELECT
      t1.user_id,
      t1.now1mQPS,
      t1.past1mQPS,
      in_ratio,
      t1.host,
      t2.Rate_2XX,
      Rate_3XX,
      Rate_4XX,
      Rate_5XX,
      aveQPS
    FROM  (
        (
          SELECT
            user_id,
            round(c [1] / 60, 0) AS now1mQPS,
            round(c [2] / 60, 0) AS past1mQPS,
            round(
              round(c [1] / 60, 0) / round(c [2] / 60, 0) * 100 -100,
              0
            ) AS in_ratio,
            host
          FROM        (
              SELECT
                compare(t, 60) AS c,
                host,
                user_id
              FROM            (
                  SELECT
                    COUNT(*) AS t,
                    host,
                    user_id
                  FROM                log
                  GROUP by
                    host,
                    user_id
                )
              GROUP by
                host,
                user_id
            )
          WHERE
            c [3] > 1.1
            and (
              c [1] > 180
              or c [2] > 180
            )
        ) t1
        JOIN (
          SELECT
            user_id,
            host,
            Rate_2XX,
            Rate_3XX,
            Rate_4XX,
            Rate_5XX,
            countall / 60 AS "aveQPS",
            status_2XX,
            status_3XX,
            status_4XX,
            status_5XX,
            countall
          FROM        (
              SELECT
                user_id,
                host,
                round(
                  round(status_2XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_2XX,
                round(
                  round(status_3XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_3XX,
                round(
                  round(status_4XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_4XX,
                round(
                  round(status_5XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_5XX,
                status_2XX,
                status_3XX,
                status_4XX,
                status_5XX,
                countall
              FROM            (
                  SELECT
                    user_id,
                    host,
                    count_if(
                      status >= 200
                      and status < 300
                    ) AS status_2XX,
                    count_if(
                      status >= 300
                      and status < 400
                    ) AS status_3XX,
                    count_if(
                      status >= 400
                      and status < 500
                      and status <> 444
                      and status <> 405
                    ) AS status_4XX,
                    count_if(
                      status >= 500
                      and status < 600
                    ) AS status_5XX,
                    COUNT(*) AS countall
                  FROM                log
                  GROUP BY
                    host,
                    user_id
                )
            )
          WHERE
            countall > 1
        ) t2 on t1.host = t2.host
      )
    ORDER BY
      in_ratio DESC
    LIMIT
      5

  • Query time range: 1 minute (relative)

  • Frequency: fixed interval of 1 minute

  • Trigger condition: $0.now1mqps>50 && $0.in_ratio>300

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF
    - Average QPS over the last minute:${Results[0].RawResults[0].now1mqps}
    - QPS surge rate:${Results[0].RawResults[0].in_ratio}%
    - Response code 2xx_rate :${Results[0].RawResults[0].rate_2xx}%
    - Response code 3xx_rate :${Results[0].RawResults[0].Rate_3XX}%
    - Response code 4xx_rate :${Results[0].RawResults[0].Rate_4XX}%
    - Response code 5xx_rate :${Results[0].RawResults[0].Rate_5XX}%
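
The surge arithmetic in the t1 subquery can be reproduced in a short sketch, assuming that compare(t, 60) yields the current count, the count one window earlier, and their ratio as c[1], c[2], and c[3]. The helper names `round_half_up`, `surge_metrics`, and `surge_fires` are hypothetical and exist only for this illustration.

```python
import math


def round_half_up(value):
    """Round a non-negative number half up (Python's round() rounds half to even)."""
    return math.floor(value + 0.5)


def surge_metrics(current_count, previous_count):
    """Mirror the t1 subquery for one host.

    current_count and previous_count stand for c[1] and c[2]: the request
    totals of the current minute and of the minute before it.
    Returns None when the host is filtered out.
    """
    if previous_count <= 0:
        return None
    ratio = current_count / previous_count  # corresponds to c[3]
    if not (ratio > 1.1 and (current_count > 180 or previous_count > 180)):
        return None
    now_qps = round_half_up(current_count / 60)
    past_qps = round_half_up(previous_count / 60)
    if past_qps == 0:
        return None
    in_ratio = round_half_up(now_qps / past_qps * 100 - 100)
    return {"now1mqps": now_qps, "past1mqps": past_qps, "in_ratio": in_ratio}


def surge_fires(metrics, min_qps=50, min_ratio=300):
    """Evaluate the suggested condition: now1mqps > 50 && in_ratio > 300."""
    return metrics is not None and metrics["now1mqps"] > min_qps and metrics["in_ratio"] > min_ratio


print(surge_metrics(24000, 3000))
```

With the suggested thresholds, a host must both exceed 50 QPS and quadruple its previous QPS before a notification is sent, so low-traffic hosts with noisy ratios stay quiet.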

QPS drop alert

Recommended alert parameters:

  • Chart name: QPS drop monitoring

  • Query statement

    user_id :your Alibaba Cloud account ID |
    SELECT
      t1.user_id,
      t1.now1mQPS,
      t1.past1mQPS,
      de_ratio,
      t1.host,
      t2.Rate_2XX,
      Rate_3XX,
      Rate_4XX,
      Rate_5XX,
      aveQPS
    FROM  (
        (
          SELECT
            user_id,
            round(c [1] / 60, 0) AS now1mQPS,
            round(c [2] / 60, 0) AS past1mQPS,
            round(
              100-round(c [1] / 60, 0) / round(c [2] / 60, 0) * 100,
              2
            ) AS de_ratio,
            host
          FROM        (
              SELECT
                compare(t, 60) AS c,
                host,
                user_id
              FROM            (
                  SELECT
                    COUNT(*) AS t,
                    host,
                    user_id
                  FROM                log
                  GROUP BY
                    host,
                    user_id
                )
              GROUP BY
                host,
                user_id
            )
          WHERE
            c [3] < 0.9
            AND (
              c [1] > 180
              or c [2] > 180
            )
        ) t1
        JOIN (
          SELECT
            user_id,
            host,
            Rate_2XX,
            Rate_3XX,
            Rate_4XX,
            Rate_5XX,
            countall / 60 AS "aveQPS",
            status_2XX,
            status_3XX,
            status_4XX,
            status_5XX,
            countall
          FROM        (
              SELECT
                user_id,
                host,
                round(
                  round(status_2XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_2XX,
                round(
                  round(status_3XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_3XX,
                round(
                  round(status_4XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_4XX,
                round(
                  round(status_5XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_5XX,
                status_2XX,
                status_3XX,
                status_4XX,
                status_5XX,
                countall
              FROM            (
                  SELECT
                    user_id,
                    host,
                    count_if(
                      status >= 200
                      and status < 300
                    ) AS status_2XX,
                    count_if(
                      status >= 300
                      and status < 400
                    ) AS status_3XX,
                    count_if (
                      status >= 400
                      and status < 500
                      and status <> 444
                      and status <> 405
                    ) AS status_4XX,
                    count_if(
                      status >= 500
                      and status < 600
                    ) AS status_5XX,
                    COUNT(*) AS countall
                  FROM                log
                  GROUP BY
                    host,
                    user_id
                )
            )
          WHERE
            countall > 1
        ) t2 on t1.host = t2.host
      )
    ORDER BY
      de_ratio DESC
    LIMIT
      5

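    The chart contains fields such as now1mqps (average QPS over the current minute), past1mqps (average QPS over the previous minute), de_ratio (QPS drop rate, in percent), and host. You can use these fields as needed to set alert conditions.

  • Query time range: 1 minute (relative)

  • Frequency: fixed interval of 1 minute

  • Trigger condition: $0.now1mqps>10 && $0.de_ratio>50

  • Notification trigger threshold: 2

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF (overseas)
    - Average QPS over the last minute:${Results[0].RawResults[0].now1mqps}
    - QPS drop rate:${Results[0].RawResults[0].de_ratio}%
    - Response code 2xx_rate :${Results[0].RawResults[0].rate_2xx}%
    - Response code 3xx_rate :${Results[0].RawResults[0].Rate_3XX}%
    - Response code 4xx_rate :${Results[0].RawResults[0].Rate_4XX}%
    - Response code 5xx_rate :${Results[0].RawResults[0].Rate_5XX}%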

ACL blocking alert (5-minute window)

Recommended alert parameters:

  • Chart name: ACL rule blocks

  • Query statement

    user_id :your Alibaba Cloud account ID |
    SELECT
      user_id,
      host,
      count_if(
        final_plugin = 'waf'
        AND final_action = 'block'
      ) AS "规则防护引擎拦截量",
      count_if(
        final_plugin = 'cc'
        AND final_action = 'block'
      ) AS "CC拦截量",
      count_if(
        final_plugin = 'acl'
        AND final_action = 'block'
      ) AS "ACL拦截量",
      count_if(
        final_plugin = 'antiscan'
        AND final_action = 'block'
      ) AS "扫描防护拦截量",
      count_if(
        (final_plugin = 'waf'
        AND final_action = 'block')
        OR (final_plugin = 'cc'
        AND final_action = 'block')
        OR (final_plugin = 'acl'
        AND final_action = 'block')
        OR (final_plugin = 'antiscan'
        AND final_action = 'block')
      ) AS totalblock
    GROUP BY
      host,
      user_id
    HAVING
      (
        "ACL拦截量" >= 0
        AND "规则防护引擎拦截量" >= 0
        AND "CC拦截量" >= 0
        AND "扫描防护拦截量" >= 0
        AND totalblock > 10
      )
    ORDER BY
      "ACL拦截量" DESC
    LIMIT
      5
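
  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.totalblock>=500 && ($0.ACL拦截量>=500)

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF
    - Total blocked requests in the last 5 minutes:${Results[0].RawResults[0].totalblock}
    - ACL blocks:${Results[0].RawResults[0].ACL拦截量}
    - Protection rules engine blocks:${Results[0].RawResults[0].规则防护引擎拦截量}
    - CC blocks:${Results[0].RawResults[0].CC拦截量}
    - Scan protection blocks:${Results[0].RawResults[0].扫描防护拦截量}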
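
The per-module counting done by the count_if expressions can be sketched as follows for a single domain. The log entries are modeled as plain dictionaries with the final_plugin and final_action fields used in the query; the helper names `block_counts` and `acl_alert` are hypothetical, and grouping by host is omitted to keep the sketch short.

```python
from collections import Counter

# Protection modules that the query counts toward totalblock.
BLOCKING_PLUGINS = ("waf", "cc", "acl", "antiscan")


def block_counts(logs):
    """Count blocked requests per protection module, plus the overall total."""
    counts = Counter()
    for entry in logs:
        plugin = entry.get("final_plugin")
        if entry.get("final_action") == "block" and plugin in BLOCKING_PLUGINS:
            counts[plugin] += 1
    counts["totalblock"] = sum(counts[p] for p in BLOCKING_PLUGINS)
    return counts


def acl_alert(counts, threshold=500):
    """Evaluate the suggested condition: totalblock >= 500 && ACL blocks >= 500."""
    return counts["totalblock"] >= threshold and counts["acl"] >= threshold


sample = (
    [{"final_plugin": "acl", "final_action": "block"}] * 600
    + [{"final_plugin": "cc", "final_action": "block"}] * 50
    + [{"final_plugin": "waf", "final_action": "pass"}] * 10
)
counts = block_counts(sample)
print(counts["acl"], counts["cc"], counts["totalblock"], acl_alert(counts))
```

The same counts serve the protection rules engine, CC, and scan protection alerts; only the module named in the trigger condition and in the ORDER BY clause changes.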

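Protection rules engine blocking alert (5-minute window)

Recommended alert parameters:

  • Chart name: Protection rules engine blocks

  • Query statement

    user_id :your Alibaba Cloud account ID |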
    SELECT
      user_id,
      host,
      count_if(
        final_plugin = 'waf'
        AND final_action = 'block'
      ) AS "规则防护引擎拦截量",
      count_if(
        final_plugin = 'cc'
        AND final_action = 'block'
      ) AS "CC拦截量",
      count_if(
        final_plugin = 'acl'
        AND final_action = 'block'
      ) AS "ACL拦截量",
      count_if(
        final_plugin = 'antiscan'
        AND final_action = 'block'
      ) AS "扫描防护拦截量",
      count_if(
        (final_plugin = 'waf'
        AND final_action = 'block')
        OR (final_plugin = 'cc'
        AND final_action = 'block')
        OR (final_plugin = 'acl'
        AND final_action = 'block')
        OR (final_plugin = 'antiscan'
        AND final_action = 'block')
      ) AS totalblock
    GROUP BY
      host,
      user_id
    HAVING
      (
        "ACL拦截量" >= 0
        AND "规则防护引擎拦截量" >= 0
        AND "CC拦截量" >= 0
        AND "扫描防护拦截量" >= 0
        AND totalblock > 10
      )
    ORDER BY
      "规则防护引擎拦截量" DESC
    LIMIT
      5
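
  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.totalblock>=500 && ($0.规则防护引擎拦截量>=500)

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF
    - Total blocked requests in the last 5 minutes:${Results[0].RawResults[0].totalblock}
    - ACL blocks:${Results[0].RawResults[0].ACL拦截量}
    - Protection rules engine blocks:${Results[0].RawResults[0].规则防护引擎拦截量}
    - CC blocks:${Results[0].RawResults[0].CC拦截量}
    - Scan protection blocks:${Results[0].RawResults[0].扫描防护拦截量}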

CC blocking alert (5-minute window)

Recommended alert parameters:

  • Chart name: CC protection rule blocks

  • Query statement

    user_id :your Alibaba Cloud account ID |
    SELECT
      user_id,
      host,
      count_if(
        final_plugin = 'waf'
        AND final_action = 'block'
      ) AS "规则防护引擎拦截量",
      count_if(
        final_plugin = 'cc'
        AND final_action = 'block'
      ) AS "CC拦截量",
      count_if(
        final_plugin = 'acl'
        AND final_action = 'block'
      ) AS "ACL拦截量",
      count_if(
        final_plugin = 'antiscan'
        AND final_action = 'block'
      ) AS "扫描防护拦截量",
      count_if(
        (final_plugin = 'waf'
        AND final_action = 'block')
        OR (final_plugin = 'cc'
        AND final_action = 'block')
        OR (final_plugin = 'acl'
        AND final_action = 'block')
        OR (final_plugin = 'antiscan'
        AND final_action = 'block')
      ) AS totalblock
    GROUP BY
      host,
      user_id
    HAVING
      (
        "ACL拦截量" >= 0
        AND "规则防护引擎拦截量" >= 0
        AND "CC拦截量" >= 0
        AND "扫描防护拦截量" >= 0
        AND totalblock > 10
      )
    ORDER BY
      "CC拦截量" DESC
    LIMIT
      5
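
  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.totalblock>=500 && ($0.CC拦截量>=500)

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF
    - Total blocked requests in the last 5 minutes:${Results[0].RawResults[0].totalblock}
    - ACL blocks:${Results[0].RawResults[0].ACL拦截量}
    - Protection rules engine blocks:${Results[0].RawResults[0].规则防护引擎拦截量}
    - CC blocks:${Results[0].RawResults[0].CC拦截量}
    - Scan protection blocks:${Results[0].RawResults[0].扫描防护拦截量}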

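Scan protection blocking alert (5-minute window)

Recommended alert parameters:

  • Chart name: Scan protection blocks

  • Query statement

    user_id :your Alibaba Cloud account ID |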
    SELECT
      user_id,
      host,
      count_if(
        final_plugin = 'waf'
        AND final_action = 'block'
      ) AS "规则防护引擎拦截量",
      count_if(
        final_plugin = 'cc'
        AND final_action = 'block'
      ) AS "CC拦截量",
      count_if(
        final_plugin = 'acl'
        AND final_action = 'block'
      ) AS "ACL拦截量",
      count_if(
        final_plugin = 'antiscan'
        AND final_action = 'block'
      ) AS "扫描防护拦截量",
      count_if(
        (final_plugin = 'waf'
        AND final_action = 'block')
        OR (final_plugin = 'cc'
        AND final_action = 'block')
        OR (final_plugin = 'acl'
        AND final_action = 'block')
        OR (final_plugin = 'antiscan'
        AND final_action = 'block')
      ) AS totalblock
    GROUP BY
      host,
      user_id
    HAVING
      (
        "ACL拦截量" >= 0
        AND "规则防护引擎拦截量" >= 0
        AND "CC拦截量" >= 0
        AND "扫描防护拦截量" >= 0
        AND totalblock > 10
      )
    ORDER BY
      "扫描防护拦截量" DESC
    LIMIT
      5
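
  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.totalblock>=500 && ($0.扫描防护拦截量>=500)

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF (overseas)
    - Total blocked requests in the last 5 minutes:${Results[0].RawResults[0].totalblock}
    - ACL blocks:${Results[0].RawResults[0].ACL拦截量}
    - Protection rules engine blocks:${Results[0].RawResults[0].规则防护引擎拦截量}
    - CC blocks:${Results[0].RawResults[0].CC拦截量}
    - Scan protection blocks:${Results[0].RawResults[0].扫描防护拦截量}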

Per-IP attack volume alert

Recommended alert parameters:

  • Chart name: Attacks per IP

  • Query statement

    user_id :your Alibaba Cloud account ID |
    SELECT
      user_id,
      real_client_ip,
      concat(
        'ACL拦截量:',
        cast(aclblock AS varchar(10)),
        ' ',
        '规则防护引擎拦截量:',
        cast(wafblock AS varchar(10)),
        ' ',
        'CC拦截量:',
        cast(ccblock AS varchar(10))
      ) AS blockNum,
      totalblock,
      allRequest
    FROM  (
        SELECT
          user_id,
          real_client_ip,
          count_if(
            final_plugin = 'acl'
            AND final_action = 'block'
          ) AS aclblock,
          count_if(
            final_plugin = 'waf'
            AND final_action = 'block'
          ) AS wafblock,
          count_if(
            final_plugin = 'cc'
            AND final_action = 'block'
          ) AS ccblock,
          count_if(
            (
              final_plugin = 'acl'
              AND final_action = 'block'
            )
            OR (
              final_plugin = 'waf'
              AND final_action = 'block'
            )
            OR (
              final_plugin = 'cc'
              AND final_action = 'block'
            )
          ) AS totalblock,
          COUNT(*) AS allRequest
        FROM      log
        GROUP BY
          user_id,
          real_client_ip
        HAVING
          totalblock > 1
        ORDER BY
          totalblock DESC
        LIMIT
          5
      )

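    The chart contains the fields real_client_ip (attacking IP address), blockNum (a string that combines the ACL, protection rules engine, and CC block counts), totalblock (total number of blocked requests), and allRequest (total number of requests). You can use these fields as needed to set alert conditions.

  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.totalblock>=500

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Product:WAF
    - Top 3 attacking IPs in the last 5 minutes:
    - ${Results[0].RawResults[0].real_client_ip} (${Results[0].RawResults[0].blockNum})
    - ${Results[0].RawResults[1].real_client_ip} (${Results[0].RawResults[1].blockNum})
    - ${Results[0].RawResults[2].real_client_ip} (${Results[0].RawResults[2].blockNum})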
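
As a rough illustration of how the per-IP ranking and the blockNum summary come together, the sketch below aggregates dictionary-style log entries by real_client_ip. The helper name `top_attack_ips` and the English labels inside the summary string are assumptions made for this example; the sample addresses come from the documentation range 192.0.2.0/24.

```python
from collections import defaultdict


def top_attack_ips(logs, limit=3):
    """Rank client IPs by blocked requests, keeping only IPs with totalblock > 1."""
    per_ip = defaultdict(lambda: {"acl": 0, "waf": 0, "cc": 0, "all": 0})
    for entry in logs:
        stats = per_ip[entry["real_client_ip"]]
        stats["all"] += 1
        plugin = entry.get("final_plugin")
        if entry.get("final_action") == "block" and plugin in ("acl", "waf", "cc"):
            stats[plugin] += 1
    rows = []
    for ip, stats in per_ip.items():
        totalblock = stats["acl"] + stats["waf"] + stats["cc"]
        if totalblock > 1:
            rows.append({
                "real_client_ip": ip,
                "totalblock": totalblock,
                "allRequest": stats["all"],
                # Summary string in the spirit of the blockNum column.
                "blockNum": f"ACL: {stats['acl']} WAF: {stats['waf']} CC: {stats['cc']}",
            })
    rows.sort(key=lambda row: row["totalblock"], reverse=True)
    return rows[:limit]


sample = (
    [{"real_client_ip": "192.0.2.1", "final_plugin": "acl", "final_action": "block"}] * 5
    + [{"real_client_ip": "192.0.2.2", "final_plugin": "cc", "final_action": "block"}] * 3
    + [{"real_client_ip": "192.0.2.2", "final_plugin": "waf", "final_action": "pass"}] * 2
    + [{"real_client_ip": "192.0.2.3", "final_plugin": "waf", "final_action": "block"}]
)
for row in top_attack_ips(sample):
    print(row["real_client_ip"], row["blockNum"], row["totalblock"], row["allRequest"])
```

The notification template reads the first three rows of the result, so it is worth confirming that the query can return at least three IPs before relying on RawResults[1] and RawResults[2].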

Per-IP attacked domain count alert

Recommended alert parameters:

  • Chart name: Attacked domains per IP

  • Query statement

    user_id :your Alibaba Cloud account ID |
    SELECT
      user_id,
      real_client_ip,
      totalblock,
      domainnum
    FROM  (
        SELECT
          user_id,
          real_client_ip,
          COUNT(*) AS totalblock,
          COUNT(DISTINCT host) AS domainnum
        FROM      log
        WHERE
          final_action = 'block'
        GROUP BY
          user_id,
          real_client_ip
      )
    WHERE
      totalblock > 1
    ORDER BY
      domainnum DESC
    LIMIT
      5

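    The chart contains fields such as real_client_ip (attacking IP address), totalblock (total number of blocked requests), and domainnum (number of domains attacked by the IP address). You can combine these fields freely when you set the alert trigger condition. For example, totalblock>500 && domainnum>5 means that an IP address sent more than 500 attack requests within the time range and attacked more than 5 domains.

  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 1 minute

  • Trigger condition: $0.domainnum>=10

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Product:WAF
    - Attacking IP:${Results[0].RawResults[0].real_client_ip}
    - Number of attacked domains:${Results[0].RawResults[0].domainnum}
    - Total attack requests in the last 5 minutes:${Results[0].RawResults[0].totalblock}
    - Please investigate and handle this promptly.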
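
The three fields described for this chart (real_client_ip, totalblock, domainnum) can be derived from raw log entries as in the sketch below. It assumes that a request counts as an attack when final_action is block, which is an interpretation rather than something the field list states; `attacked_domains_per_ip` and `domain_alert` are hypothetical helper names.

```python
from collections import Counter, defaultdict


def attacked_domains_per_ip(logs):
    """Return rows of real_client_ip, totalblock, and domainnum, most domains first."""
    hosts = defaultdict(set)
    blocks = Counter()
    for entry in logs:
        # Assumption: only blocked requests count as attacks.
        if entry.get("final_action") == "block":
            ip = entry["real_client_ip"]
            hosts[ip].add(entry["host"])
            blocks[ip] += 1
    rows = [
        {"real_client_ip": ip, "totalblock": blocks[ip], "domainnum": len(hosts[ip])}
        for ip in blocks
    ]
    rows.sort(key=lambda row: row["domainnum"], reverse=True)
    return rows


def domain_alert(row, min_domains=10):
    """Evaluate the suggested condition: domainnum >= 10."""
    return row["domainnum"] >= min_domains


sample = []
for i in range(12):
    # One IP probing twelve different domains, two blocked requests each.
    sample += [{"real_client_ip": "192.0.2.1", "host": f"site{i}.example.com",
                "final_action": "block"}] * 2
sample += [{"real_client_ip": "192.0.2.2", "host": "site0.example.com",
            "final_action": "block"}] * 3
sample += [{"real_client_ip": "192.0.2.2", "host": "site1.example.com",
            "final_action": "pass"}]
rows = attacked_domains_per_ip(sample)
print(rows[0], domain_alert(rows[0]))
```

An IP that spreads a modest number of blocked requests across many domains is often a scanner, which is why this alert keys on domainnum rather than on totalblock alone.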

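Abnormal 5-minute average latency alert

Recommended alert parameters:

  • Chart name: Average latency monitoring

  • Query statement

    user_id :your Alibaba Cloud account ID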
    and not upstream_status :504
    and not upstream_addr :'-'
    and request_time_msec < 5000
    and upstream_status :200
    and not ua_browser :bot |
    SELECT
      user_id,
      host,
      upstream_time,
      request_time,
      requestnum
    FROM  (
        SELECT
          user_id,
          host,
          round(avg(upstream_response_time), 2) * 1000 AS upstream_time,
          round(avg(request_time_msec), 2) AS request_time,
          COUNT(*) AS requestnum
        FROM      log
        GROUP BY
          host,
          user_id
      )
    WHERE
      requestnum > 30
    ORDER BY
      request_time DESC
    LIMIT
      5

  • Query time range: 5 minutes (relative)

  • Frequency: fixed interval of 5 minutes

  • Trigger condition: $0.request_time>1000 && $0.requestnum>30

  • Notification trigger threshold: 2

  • Notification interval: 10 minutes

  • Notification content

    - [Time]:${FireTime}
    - [Uid]:${Results[0].RawResults[0].user_id}
    - Domain:${Results[0].RawResults[0].host}
    - Product:WAF (overseas)
    - [Trigger condition]:${condition}
    - Top 3 latencies in the last 5 minutes (milliseconds)
    - Host1:${Results[0].RawResults[0].host} Delay_time:${Results[0].RawResults[0].upstream_time}
    - Host2:${Results[0].RawResults[1].host} Delay_time:${Results[0].RawResults[1].upstream_time}
    - Host3:${Results[0].RawResults[2].host} Delay_time:${Results[0].RawResults[2].upstream_time}
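
The filtering and averaging steps of the latency query can be approximated offline as follows. The sketch treats the search conditions as simple equality checks on dictionary fields, which is a simplification of how Log Service matches search terms, and the helper names `average_latency` and `latency_alert` are hypothetical.

```python
from collections import defaultdict


def average_latency(logs, min_requests=30, limit=5):
    """Average request_time_msec per host over requests that pass the filters."""
    samples = defaultdict(list)
    for entry in logs:
        if (
            entry["upstream_status"] == 200        # successful origin responses only
            and entry["upstream_addr"] != "-"      # the request reached an origin server
            and entry["request_time_msec"] < 5000  # drop extreme outliers
            and entry["ua_browser"] != "bot"       # ignore crawler traffic
        ):
            samples[entry["host"]].append(entry["request_time_msec"])
    rows = [
        {
            "host": host,
            "request_time": round(sum(values) / len(values), 2),
            "requestnum": len(values),
        }
        for host, values in samples.items()
        if len(values) > min_requests
    ]
    rows.sort(key=lambda row: row["request_time"], reverse=True)
    return rows[:limit]


def latency_alert(row, max_latency_ms=1000, min_requests=30):
    """Evaluate the suggested condition: request_time > 1000 && requestnum > 30."""
    return row["request_time"] > max_latency_ms and row["requestnum"] > min_requests


def make_entry(host, msec, browser="chrome"):
    """Build a hypothetical log entry with only the fields used above."""
    return {"host": host, "upstream_status": 200, "upstream_addr": "203.0.113.10:80",
            "request_time_msec": msec, "ua_browser": browser}


sample = (
    [make_entry("a.example.com", 1500)] * 40
    + [make_entry("a.example.com", 6000)] * 5
    + [make_entry("b.example.com", 200)] * 40
    + [make_entry("c.example.com", 4000)] * 10
)
for row in average_latency(sample):
    print(row, latency_alert(row))
```

Hosts with 30 or fewer qualifying requests are dropped before ranking, so a single slow request on a quiet domain does not trigger the alert.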

Traffic drop alert

Recommended alert parameters:

  • Chart name: Traffic drop monitoring

  • Query statement

    user_id :your Alibaba Cloud account ID |
    SELECT
      t1.user_id,
      t1.now1mQPS,
      t1.past1mQPS,
      de_ratio,
      t2.Rate_2XX,
      Rate_3XX,
      Rate_4XX,
      Rate_5XX,
      aveQPS
    FROM  (
        (
          SELECT
            user_id,
            round(c [1] / 60, 0) AS now1mQPS,
            round(c [2] / 60, 0) AS past1mQPS,
            round(
              100-round(c [1] / 60, 0) / round(c [2] / 60, 0) * 100,
              2
            ) AS de_ratio
          FROM        (
              SELECT
                compare(t, 60) AS c,
                user_id
              FROM            (
                  SELECT
                    COUNT(*) AS t,
                    user_id
                  FROM                log
                  GROUP BY
                    user_id
                )
              GROUP BY
                user_id
            )
          WHERE
            c [3] < 0.9
            AND (
              c [1] > 180
              or c [2] > 180
            )
        ) t1
        JOIN (
          SELECT
            user_id,
            Rate_2XX,
            Rate_3XX,
            Rate_4XX,
            Rate_5XX,
            countall / 60 AS "aveQPS",
            status_2XX,
            status_3XX,
            status_4XX,
            status_5XX,
            countall
          FROM        (
              SELECT
                user_id,
                round(
                  round(status_2XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_2XX,
                round(
                  round(status_3XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_3XX,
                round(
                  round(status_4XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_4XX,
                round(
                  round(status_5XX * 1.0000 / countall, 4) * 100,
                  2
                ) AS Rate_5XX,
                status_2XX,
                status_3XX,
                status_4XX,
                status_5XX,
                countall
              FROM            (
                  SELECT
                    user_id,
                    count_if(
                      status >= 200
                      AND status < 300
                    ) AS status_2XX,
                    count_if(
                      status >= 300
                      AND status < 400
                    ) AS status_3XX,
                    count_if (
                      status >= 400
                      AND status < 500
                      AND status <> 444
                      AND status <> 405
                    ) AS status_4XX,
                    count_if(
                      status >= 500
                      AND status < 600
                    ) AS status_5XX,
                    COUNT(*) AS countall
                  FROM                log
                  GROUP BY
                    user_id
                )
            )
          WHERE
            countall > 0
        ) t2 ON t1.user_id = t2.user_id
      )
    ORDER BY
      de_ratio DESC
    LIMIT
      5

  • Query time range: 1 minute (relative)

  • Frequency: fixed interval of 1 minute

  • Trigger condition: $0.de_ratio>50 && $0.now1mqps>20

  • Notification trigger threshold: 1

  • Notification interval: 5 minutes

  • Notification content

    - [Time]:${FireTime}
    - [UID]:${Results[0].RawResults[0].user_id}
    - Product:WAF
    - Average QPS over the last minute:${Results[0].RawResults[0].now1mqps}
    - [Trigger condition (drop rate & QPS)]:${condition}
    - QPS drop rate:${Results[0].RawResults[0].de_ratio}%
    - Response code 2xx_rate :${Results[0].RawResults[0].rate_2xx}%
    - Response code 3xx_rate :${Results[0].RawResults[0].Rate_3XX}%
    - Response code 4xx_rate :${Results[0].RawResults[0].Rate_4XX}%
    - Response code 5xx_rate :${Results[0].RawResults[0].Rate_5XX}%