Introduction

This article records a case where Elasticsearch errors occurred due to disk pressure caused by Docker containers and images, along with the investigation and resolution methods. We hope this serves as a reference for those facing similar issues.

Problem Occurrence

The following error occurred in a running Elasticsearch instance.

{}"}"e,sr"""trtrpaoyehtrpaau"esss:"oe":n":{":":5s"0e"q3aaurleclrhy_s"ph,haarsdes_efxaeicluetdi"o,n_exception",

Initial investigation revealed that indices were in a closed state, and insufficient disk space was suspected.
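Closed or read-only indices after disk pressure are consistent with Elasticsearch's disk watermarks: by default, shard allocation degrades above 85%/90% usage, and indices are forced read-only at the 95% flood stage. A quick check of the filesystem backing the data directory (a sketch; `/var/lib` is an assumption here, so point it at your actual `path.data` mount):

```shell
#!/bin/sh
# How full is the filesystem holding the Elasticsearch data directory?
# /var/lib is an assumed location -- adjust to your path.data mount.
USED=$(df -P /var/lib | awk 'NR==2 { gsub("%", "", $5); print $5 }')
if [ "$USED" -ge 95 ]; then
    echo "CRITICAL: ${USED}% used -- above the default flood-stage watermark"
elif [ "$USED" -ge 85 ]; then
    echo "WARNING: ${USED}% used -- above the default low watermark"
else
    echo "OK: ${USED}% used"
fi
```

If the flood-stage watermark was crossed, freeing disk space alone is not always enough: the `index.blocks.read_only_allow_delete` block may need to be cleared afterwards.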

Investigating Disk Usage

Checking Root Directory Usage

First, we checked the overall disk usage of the system.

sudo du -h --max-depth=1 / | sort -hr | head -n 20

Output:

60G     /
50G     /var
4.7G    /usr
2.1G    /home
1.2G    /opt

It was found that the /var directory was abnormally large at 50GB.

Detailed Investigation of /var Directory

sudo du -h --max-depth=1 /var | sort -hr

Output:

50G     /var
49G     /var/lib
342M    /var/log
240M    /var/cache
128M    /var/spool

Since /var/lib occupied nearly all the capacity, we investigated further.

sudo du -h --max-depth=1 /var/lib | sort -hr

Output:

49G     /var/lib
49G     /var/lib/docker
296M    /var/lib/snapd
158M    /var/lib/apt

Root cause identified: Docker data was occupying 49GB.

Analyzing Docker Disk Usage

We checked the detailed usage of Docker.

docker system df

Output:

TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          38        5         39.8GB    35.99GB (90%)
Containers      52        4         10.44MB   0B (0%)
Local Volumes   2         1         646MB     32.57kB (0%)
Build Cache     19        0         2.972GB   2.97GB (100%)

Analysis Results

  • Images: 33 out of 38 (approximately 36GB) were unused
  • Build Cache: All 3GB were deletable
  • Containers: almost no reclaimable space (0B)
  • Volumes: Mostly in use
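These numbers can be reconciled with the space we expected to get back: roughly 36GB of unused images plus roughly 3GB of build cache, with containers and volumes contributing essentially nothing. A trivial sanity check of that arithmetic:

```shell
# Expected recovery from the analysis above:
# ~36 GB unused images + ~3 GB build cache (containers/volumes ~0).
awk 'BEGIN { printf "expected recovery: ~%d GB\n", 36 + 3 }'
```

This matches the roughly 39GB actually freed by the cleanup described below.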

Performing Cleanup

Bulk Cleanup Command

We deleted unused resources in bulk with the following command.

docker system prune -a --volumes

This command deletes:

  • Stopped containers
  • Unused images (including untagged ones with the -a option)
  • Unused networks
  • Unused volumes (with the --volumes option)
  • Build cache
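`docker system prune -a --volumes` is aggressive, so a staged, more conservative cleanup may be preferable in some environments. A sketch (the 168h cutoff and the dry-run wrapper are assumptions for illustration, not part of the original incident response):

```shell
#!/bin/sh
# Staged cleanup sketch. DRY_RUN=1 (the default) only prints the commands
# that would run; set DRY_RUN=0 to actually execute them.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "would run: $*"; else "$@"; fi
}
run docker container prune -f                       # stopped containers only
run docker image prune -a -f --filter "until=168h"  # unused images older than 7 days
run docker builder prune -f                         # build cache
```

Volumes are deliberately left out of this sketch: unlike images and build cache, deleted volume data cannot be rebuilt.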

Results

sudo du -h --max-depth=1 / | sort -hr | head -n 20

Approximately 39GB of free disk space was recovered.

Prevention Measures

Configuring Docker Log Rotation

To prevent Docker container logs from accumulating indefinitely, we edited /etc/docker/daemon.json.

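A typical configuration of this kind looks like the following (the exact thresholds used in this incident are not preserved; `10m` and `3` are common illustrative values):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```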

Configuration explanation:

  • max-size: Maximum size of a single log file
  • max-file: Number of log files to retain
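Together, the two settings bound per-container log usage at max-size × max-file. A quick sanity check of that bound, assuming the illustrative values 10m and 3:

```shell
# Worst-case log disk usage under rotation (values assumed for illustration).
MAX_SIZE_MB=10
MAX_FILE=3
CONTAINERS=52   # illustrative count; check with: docker ps -aq | wc -l
echo "per-container cap: $((MAX_SIZE_MB * MAX_FILE)) MB"
echo "worst case across $CONTAINERS containers: $((MAX_SIZE_MB * MAX_FILE * CONTAINERS)) MB"
```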

Applying the Configuration

sudo du -h --max-depth=1 / | sort -hr | head -n 20

Considering Periodic Cleanup

In production environments, automating periodic cleanup can also be considered.

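As a sketch, a weekly cron job could run a filtered prune (the schedule and flags here are assumptions for illustration, not the setup from this incident):

```
# /etc/cron.d/docker-cleanup (example): every Sunday at 03:00, remove
# unused Docker data older than a week.
0 3 * * 0 root docker system prune -af --filter "until=168h" > /var/log/docker-prune.log 2>&1
```

Omitting `--volumes` from the automated job is deliberate: unattended volume deletion risks destroying data that cannot be regenerated.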

Results and Lessons Learned

Resolution Results

  • Elasticsearch errors were resolved
  • Disk usage was reduced from 60GB to 21GB
  • System stability improved

Lessons Learned

  1. Regular monitoring: disk usage needs to be checked routinely, not only after errors appear
  2. Docker operations management: Unused resources tend to accumulate, especially in development environments
  3. Importance of log management: Log rotation configuration is essential
  4. Preventive maintenance: Periodic cleanup before problems occur is effective

Summary

In environments using Docker, images, containers, and build cache tend to accumulate, making regular cleanup important. We recommend implementing proper operational management using the investigation methods and solutions introduced in this article.

These steps restored stable server operation. We hope this helps others facing similar issues.


Reference Command List

sudo du -h --max-depth=1 / | sort -hr | head -n 20
sudo du -h --max-depth=1 /var | sort -hr
sudo du -h --max-depth=1 /var/lib | sort -hr
docker system df
docker system prune -a --volumes
sudo systemctl restart docker