High disk usage and read-only indexes

When disk usage on a node in an Alibaba Cloud Elasticsearch cluster exceeds 85%, Elasticsearch starts restricting shard allocation, and at 95% it blocks writes to protect data integrity. This topic explains the root cause, how to restore write access quickly, and how to prevent recurrence.

Important

Disclaimer: This topic may contain information about third-party products. Such information is only for reference. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.

Symptoms

  • Write requests fail with: FORBIDDEN/12/index read-only / allow delete (api)

  • Cluster health status is red. Run GET /_cat/nodes?v to check whether nodes have rejoined the cluster. Run GET /_cat/allocation?v to check shard allocation.

    Note

    Red cluster health means primary shards are unavailable. Data may be at risk.

  • Kibana returns an internal server error when creating ingest pipelines or enrolling Beats

  • Cluster or Kibana monitoring shows disk usage approaching 100%
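
To tell this failure apart from other write errors in client code, one option is to look for the block message in the error body. The following is a minimal sketch, not an official client API: the function name is hypothetical, and the sample payload (including the cluster_block_exception type and the 403 status) is an assumed illustration of what such a response can look like.

```python
import json

# Marker text taken from the error message shown in the symptoms above.
READ_ONLY_MARKER = "index read-only / allow delete"


def is_read_only_block_error(response_body: str) -> bool:
    """Return True if an error body reports the read-only / allow-delete block."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        # Not JSON (for example a proxy error page): fall back to a text search.
        return READ_ONLY_MARKER in response_body
    # Assumed shape: {"error": {"reason": "..."}}; other shapes use the raw text.
    error = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(error, dict):
        return READ_ONLY_MARKER in str(error.get("reason", ""))
    return READ_ONLY_MARKER in response_body


# Illustrative sample only; real responses may carry additional fields.
sample = json.dumps({
    "error": {
        "type": "cluster_block_exception",
        "reason": "blocked by: [FORBIDDEN/12/index read-only / allow delete (api)];",
    },
    "status": 403,
})
print(is_read_only_block_error(sample))
```

Matching on the message text rather than on the status code keeps the check tied to the symptom described in this topic.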

Root cause

Disk usage thresholds

Elasticsearch monitors disk usage continuously and enforces three watermark thresholds:

  • 85% — low watermark: Elasticsearch stops allocating new shards to this node.

  • 90% — high watermark: Elasticsearch relocates existing shards to nodes with more free disk space.

  • 95% — flood stage: Elasticsearch sets the read_only_allow_delete attribute on every index that has a shard on the affected node, blocking writes to those indexes while still allowing deletes.
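
The three thresholds can be read as a simple classification of a node's disk usage. The sketch below assumes the percentages listed above and treats a threshold as crossed only when usage is strictly greater than it. The function and constant names are hypothetical, and a cluster whose watermark settings have been changed would need different values.

```python
# Thresholds as listed in this topic (percent of disk used).
LOW_WATERMARK = 85.0
HIGH_WATERMARK = 90.0
FLOOD_STAGE = 95.0


def watermark_stage(disk_used_percent: float) -> str:
    """Map a node's disk usage to the watermark stage it has crossed."""
    if disk_used_percent > FLOOD_STAGE:
        return "flood_stage"  # indexes on the node become read-only
    if disk_used_percent > HIGH_WATERMARK:
        return "high"         # shards are relocated away from the node
    if disk_used_percent > LOW_WATERMARK:
        return "low"          # no new shards are allocated to the node
    return "ok"


for usage in (70.0, 87.5, 92.0, 96.3):
    print(usage, watermark_stage(usage))
```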

Quick fix (10–15 minutes)

  1. Delete old or unused indexes to free disk space.

    Warning

    Deleted data cannot be restored. To preserve data, consider increasing storage capacity instead.

    curl -u "<username>:<password>" -X DELETE "http://<host>:<port>/<index-name>"

    • <host> is your cluster's internal or public endpoint. Configure the access allowlist before running this command.

    • If the cluster is unresponsive, trigger a forced restart and run this command during the restart.

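  2. Remove the read-only lock. In Elasticsearch versions earlier than 7.4, freeing disk space does not automatically lift the write block. Later versions release the block once disk usage falls below the high watermark, but clearing it explicitly is harmless. Clear it by setting index.blocks.read_only_allow_delete to null: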

    PUT /_all/_settings
    {
       "index.blocks.read_only_allow_delete": null
    }

  3. Verify cluster health by running GET /_cluster/health. If the status is still red, run GET /_cat/allocation?v to check for unassigned shards.

  4. If unassigned shards remain, run GET /_cluster/allocation/explain to identify the cause. If the output shows that allocation retries are exhausted, run POST /_cluster/reroute?retry_failed=true.

  5. If the cluster health status remains red after all recovery steps, contact Alibaba Cloud technical support.
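
The non-destructive part of this procedure (steps 2 to 4) can be scripted. The following sketch only builds the HTTP requests in order and prints them, so that they can be reviewed before anything is sent. The endpoint in BASE_URL is a placeholder, the helper names are hypothetical, and authentication is left out for brevity. Deleting indexes is deliberately excluded because it cannot be undone.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:9200"  # placeholder endpoint, replace with your own


def build_request(base_url: str, method: str, path: str,
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request against the cluster endpoint."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(base_url.rstrip("/") + path, data=data,
                                  headers=headers, method=method)


def recovery_requests(base_url: str) -> list:
    """Return the requests for steps 2 to 4, in the order described above."""
    return [
        # Step 2: clear the read-only block on all indexes (None becomes JSON null).
        build_request(base_url, "PUT", "/_all/_settings",
                      {"index.blocks.read_only_allow_delete": None}),
        # Step 3: inspect shard allocation per node.
        build_request(base_url, "GET", "/_cat/allocation?v"),
        # Step 4: ask why a shard is unassigned, then retry failed allocations.
        build_request(base_url, "GET", "/_cluster/allocation/explain"),
        build_request(base_url, "POST", "/_cluster/reroute?retry_failed=true"),
    ]


for req in recovery_requests(BASE_URL):
    print(req.get_method(), req.full_url)
```

Each request can be sent with urllib.request.urlopen(req) once the endpoint and credentials are in place. In practice the reroute call is only worth sending if the allocation explain output shows exhausted retries.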

Prevention

Enable disk usage monitoring and configure alerts to fire when usage exceeds 80%. Route alerts to your operations team so they can act before the flood stage is reached. For setup instructions, see Configure monitoring and alerting.
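
If you also want a lightweight check of your own alongside the built-in alerting, the 80% rule can be applied to the output of GET /_cat/allocation?format=json. The sketch below assumes each row carries the node and disk.percent fields as strings, and that rows without a disk percentage (such as the one for unassigned shards) should be skipped. The function name and sample rows are hypothetical.

```python
ALERT_THRESHOLD_PERCENT = 80.0  # alert threshold recommended in this topic


def nodes_over_threshold(rows, threshold=ALERT_THRESHOLD_PERCENT):
    """Return (node, percent) pairs above the threshold, highest usage first."""
    flagged = []
    for row in rows:
        raw = row.get("disk.percent")
        if raw is None:
            continue  # rows without disk figures, such as unassigned shards
        percent = float(raw)
        if percent > threshold:
            flagged.append((row.get("node", "unknown"), percent))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical rows shaped like parsed _cat/allocation?format=json output.
sample_rows = [
    {"node": "es-node-1", "disk.percent": "91"},
    {"node": "es-node-2", "disk.percent": "62"},
    {"node": "es-node-3", "disk.percent": "84"},
    {"node": "UNASSIGNED", "disk.percent": None},
]
print(nodes_over_threshold(sample_rows))
```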