Introduction
This article records a case where Elasticsearch errors occurred due to disk pressure caused by Docker containers and images, along with the investigation and resolution methods. We hope this serves as a reference for those facing similar issues.
Problem Occurrence
A running Elasticsearch instance began returning errors.
Initial investigation showed that the indices were in a closed state, and we suspected insufficient disk space.
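The commands used for this first check are not preserved in this article. As a rough sketch, index state and per-node disk usage can be read from Elasticsearch's `_cat` APIs. The `ES_URL` variable, its default of `http://localhost:9200`, and the function name are assumptions for illustration.

```shell
# Base URL of the cluster (assumed; adjust host, port, and auth as needed).
ES_URL="${ES_URL:-http://localhost:9200}"

es_quick_check() {
  # One row per index; the status column reads "open" or "close".
  curl -s "$ES_URL/_cat/indices?v&h=index,status,health"
  # Disk used and available on each data node.
  curl -s "$ES_URL/_cat/allocation?v"
}
```

Calling `es_quick_check` on the affected host prints both tables, which is usually enough to tell whether disk space is the likely cause.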
Investigating Disk Usage
Checking Root Directory Usage
First, we checked the overall disk usage of the system.
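The original command is not reproduced here. A minimal sketch of this kind of check, assuming a Linux host with `df`, `du`, and `sort` available; the `largest_dirs` helper name is hypothetical.

```shell
# Usage per mounted filesystem, in human-readable units.
df -h

# List the largest immediate subdirectories of a path, biggest first.
# -x keeps du on one filesystem; sizes are in KiB so a numeric sort works.
largest_dirs() {
  local dir="${1:-/}"
  local count="${2:-10}"
  du -xk -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

Running `largest_dirs /` as root lists the top-level directories, with the first row being the total for the path itself. The same helper can then be pointed at `/var` and `/var/lib` to narrow things down.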
The output showed that the /var directory was abnormally large at 50GB.
Detailed Investigation of /var Directory
Breaking down /var by subdirectory showed that /var/lib occupied nearly all of the capacity, so we investigated further.
Root cause identified: Docker data under /var/lib was occupying 49GB.
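The manual descent above (root, then /var, then /var/lib) can also be expressed as a loop that keeps following the largest subdirectory. This is only a sketch under the assumption of a POSIX-like shell with `du`, `sort`, `awk`, and `cut`; the `drill_down` name is hypothetical.

```shell
# Repeatedly descend into the largest subdirectory, printing "KiB<TAB>path"
# for each level visited.
drill_down() {
  local dir="${1:-/}"
  local levels="${2:-3}"
  local line
  while [ "$levels" -gt 0 ]; do
    # Largest immediate child of $dir; the row for $dir itself is skipped.
    line=$(du -xk -d 1 "$dir" 2>/dev/null | sort -rn |
      awk -F '\t' -v self="$dir" '$2 != self { print; exit }')
    # Stop when the current directory has no subdirectories.
    [ -n "$line" ] || break
    printf '%s\n' "$line"
    dir=$(printf '%s\n' "$line" | cut -f2-)
    levels=$((levels - 1))
  done
}
```

For example, `drill_down / 3` run as root prints one line per level, which makes a single dominant directory tree easy to spot.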
Analyzing Docker Disk Usage
We checked the detailed usage of Docker.
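Docker reports its own disk accounting through `docker system df`. The sketch below simply groups the summary and verbose forms; the wrapper name `docker_disk_report` is hypothetical.

```shell
docker_disk_report() {
  # Totals for images, containers, local volumes, and build cache,
  # with a RECLAIMABLE column for each type.
  docker system df
  # Per-image, per-container, and per-volume detail.
  docker system df -v
}
```

The RECLAIMABLE column in the summary is the quickest indicator of how much space a cleanup could free.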
Analysis Results
- Images: 33 out of 38 (approximately 36GB) were unused
- Build Cache: The full 3GB was reclaimable
- Containers: Most were active and not eligible for deletion
- Volumes: Mostly in use
Performing Cleanup
Bulk Cleanup Command
We deleted unused resources in bulk with the following command.
This command deletes:
- Stopped containers
- Unused images (with the -a option, all images not used by any container rather than only dangling ones)
- Unused networks
- Unused volumes (with the --volumes option)
- Build cache
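The options listed above match the documented behavior of `docker system prune -a --volumes`. A minimal sketch that wraps that command with before and after usage reports follows; the function name is hypothetical, and the operation is destructive.

```shell
prune_with_report() {
  echo "== Before =="
  docker system df
  # -a: also remove images not used by any container (not only dangling ones)
  # --volumes: also remove unused volumes
  # Docker asks for confirmation before deleting anything.
  docker system prune -a --volumes
  echo "== After =="
  docker system df
}
```

Comparing the two reports shows how much space each resource type gave back.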
Results
Approximately 39GB of free disk space was recovered.
Prevention Measures
Configuring Docker Log Rotation
To prevent Docker container logs from accumulating indefinitely, we edited /etc/docker/daemon.json.
Configuration explanation:
- max-size: Maximum size of a single log file
- max-file: Number of log files to retain
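The values used in the original configuration are not preserved here. As an illustration only, the sketch below writes a `json-file` log rotation config with assumed limits of `10m` per file and `3` files. It targets a scratch path by default so it can be tried safely; the `DAEMON_JSON` variable is hypothetical.

```shell
# Scratch location by default; the real file is /etc/docker/daemon.json.
DAEMON_JSON="${DAEMON_JSON:-${TMPDIR:-/tmp}/daemon.json.example}"

# Note: log-opts values must be strings, including the numeric max-file.
cat > "$DAEMON_JSON" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF

cat "$DAEMON_JSON"
```

On a real host, these keys should be merged into any existing /etc/docker/daemon.json rather than overwriting it, since the file may already hold other daemon settings.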
Applying the Configuration
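After editing the file, the Docker daemon must be restarted for the settings to take effect. The new log options apply only to containers created afterwards; existing containers keep their previous logging configuration until they are recreated.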
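A minimal sketch of applying the change, assuming a systemd-managed host where Docker runs as the `docker` service; the function name is hypothetical.

```shell
apply_docker_config() {
  # Restart the daemon so it rereads /etc/docker/daemon.json.
  sudo systemctl restart docker
  # Print the default logging driver now in effect.
  docker info --format '{{.LoggingDriver}}'
}
```

Restarting the daemon interrupts running containers unless live restore is configured, so this step is best scheduled for a quiet period.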
Considering Periodic Cleanup
In production environments, automating periodic cleanup can also be considered.
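The schedule and options actually considered are not preserved here. One possible shape, offered as a sketch: a routine that removes only resources older than an assumed 7 days and leaves volumes untouched. The function name, the retention window, and the script path in the crontab comment are all assumptions.

```shell
# Cleanup routine intended to be run from cron or a systemd timer.
periodic_docker_cleanup() {
  # -f skips the confirmation prompt, which unattended runs need.
  # until=168h limits deletion to resources created more than 7 days ago.
  # Volumes are left alone on purpose, since they may hold data.
  docker system prune -af --filter "until=168h"
}

# Example crontab entry (script path is an assumption), Sundays at 03:00:
# 0 3 * * 0 /usr/local/bin/docker-cleanup.sh
```

Keeping a retention window avoids deleting images that were pulled recently but are not yet running, which a bare `prune -a` would remove.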
Results and Lessons Learned
Resolution Results
- Elasticsearch errors were resolved
- Disk usage was reduced from 60GB to 21GB
- System stability improved
Lessons Learned
- Importance of regular monitoring: Regular monitoring of disk usage is necessary
- Docker operations management: Unused resources tend to accumulate, especially in development environments
- Importance of log management: Log rotation configuration is essential
- Preventive maintenance: Periodic cleanup before problems occur is effective
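As one way to act on the monitoring point above, the sketch below reports when a filesystem crosses a usage threshold. The helper names and the default limit of 80% are assumptions; a real setup would more likely feed an existing monitoring system.

```shell
# Print the used percentage of the filesystem holding a path, without "%".
disk_usage_percent() {
  # df -P prints one line per filesystem; column 5 is the capacity, e.g. 42%.
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return non-zero and print a warning when usage reaches the limit.
check_disk() {
  local path="${1:-/}"
  local limit="${2:-80}"
  local used
  used=$(disk_usage_percent "$path")
  if [ "$used" -ge "$limit" ]; then
    echo "WARNING: $path is ${used}% full (limit ${limit}%)"
    return 1
  fi
  echo "OK: $path is ${used}% full"
}
```

Because `check_disk` returns a non-zero status on a breach, it can be called from cron and combined with whatever alerting is already in place.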
Summary
In environments using Docker, images, containers, and build cache tend to accumulate, making regular cleanup important. We recommend implementing proper operational management using the investigation methods and solutions introduced in this article.
With these measures, we restored stable server operation. We hope this helps others facing similar issues.