Data volume reduction
Evaluating Data Volume in ClickHouse
Download and execute the script to evaluate the volume of data in ClickHouse:
wget https://docs.mitigator.ru/collector/v24.10/maintenance/reduce_db/utilization.sql && \
docker-compose exec -T clickhouse clickhouse-client -mn < utilization.sql
Running this script may take a long time.
This script calculates the following metrics:
min_date
- the earliest timestamp in the database;
flow_quantity
- the number of flow records stored in the database;
single_flow_size
- the average size of one flow record, in bytes;
max_flow_size
- the maximum volume available for storing flow records, in bytes;
max_flow_quantity
- the maximum number of flow records that can be stored in the database;
max_flow_rate
- the observed maximum rate of writing flow records, per second;
avg_flow_rate
- the observed average rate of writing flow records, per second;
ttl_for_max_flow_rate
- the recommended time-to-live (TTL) for records at the observed maximum write rate;
ttl_for_avg_flow_rate
- the recommended TTL for records at the observed average write rate.
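The recommended TTL values follow from the capacity and rate metrics: the number of records that fit in the available space, divided by the write rate, gives how long a record can be retained. A minimal sketch of that relationship, using made-up numbers (the values below are hypothetical, not from a real installation):

```shell
# Hypothetical values, as they might appear in the utilization.sql output
max_flow_quantity=100000000000  # how many flow records fit in the available space
avg_flow_rate=3000              # average flow records written per second

# TTL in seconds = capacity / write rate; divide by 86400 to convert to days
ttl_days=$(( max_flow_quantity / avg_flow_rate / 86400 ))
echo "recommended TTL at average rate: ${ttl_days} days"
```

With these numbers the result is 385 days, which is why a default record TTL on the order of one year is plausible for such an installation.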
Based on these calculated metrics, choose new data compression periods and record TTLs.
Parameter | Calculation Method | Default Value |
---|---|---|
First Compression Period | Desired period of working with operational data; must be less than the recommended TTL at maximum write rate | 1 month |
Second Compression Period | 0.5 * Recommended TTL at average write rate | 6 months |
Record TTL | Recommended TTL at average write rate | 1 year |
Procedure for setting recompression and Time-to-Live periods
Evaluate the data volume in ClickHouse and choose new recompression and TTL periods.
Check that the number of flow records stored in the database (flow_quantity) does not exceed the maximum number that can be stored (max_flow_quantity). If it is exceeded, delete the old records manually.
Download the script for setting recompression and TTL periods:
wget https://docs.mitigator.ru/collector/v24.10/maintenance/reduce_db/volume_reduction.sql
Edit the script to specify new recompression and TTL periods for all tables.
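The script applies ClickHouse TTL expressions to the tables. As an illustration of the kind of statement involved (the intervals and compression codecs below are assumptions for the sake of the example; keep the expressions from the downloaded volume_reduction.sql and change only the interval values):

```sql
-- Hypothetical example: recompress after 1 and 6 months, delete after 1 year.
-- The codecs shown are assumptions; use the ones from the downloaded script.
ALTER TABLE collector.flows MODIFY TTL
    toDateTime(timestamp_sec) + INTERVAL 1 MONTH RECOMPRESS CODEC(ZSTD(1)),
    toDateTime(timestamp_sec) + INTERVAL 6 MONTH RECOMPRESS CODEC(ZSTD(6)),
    toDateTime(timestamp_sec) + INTERVAL 1 YEAR;
```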
Run the script:
docker-compose exec -T clickhouse clickhouse-client -mn < volume_reduction.sql
The new recompression and TTL periods will be applied within 4 hours. During application, there may be increased CPU load and disk usage. This will not affect processing of new data.
Manual data deletion in ClickHouse
To reduce the size of the database, you can delete old records by running the following commands:
docker-compose exec clickhouse clickhouse-client --query="ALTER TABLE collector.flows DELETE WHERE timestamp_sec < toUnixTimestamp('1970-01-01');" && \
docker-compose exec clickhouse clickhouse-client --query="ALTER TABLE collector.rate_metrics DELETE WHERE date < toDateTime('1970-01-01');" && \
docker-compose exec clickhouse clickhouse-client --query="ALTER TABLE collector.snmp_counters DELETE WHERE date < toDateTime('1970-01-01');"
where 1970-01-01 is the desired cutoff date: records with timestamps earlier than this date will be deleted.
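When choosing the cutoff, it can be convenient to derive the date from a retention window instead of writing it by hand. A small sketch using GNU date, with a hypothetical 365-day window:

```shell
# Compute the cutoff date for a 365-day retention window (GNU date syntax)
retention_days=365
cutoff=$(date -u -d "${retention_days} days ago" +%F)

# Substitute ${cutoff} for 1970-01-01 in the ALTER TABLE ... DELETE queries above
echo "delete records older than ${cutoff}"
```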