Because keeping 3 years of server logs in a hot database will bankrupt you.
Application logs and metrics are highly valuable for the first 7 days when you are actively debugging incidents. After that, they are almost never queried, but compliance teams often require you to keep them for 1 to 7 years. You must aggressively move old logs out of expensive "Hot" indexing systems (like Datadog or Elasticsearch) into "Cold" object storage (like AWS S3 Glacier).
Data is migrated automatically based on its age. Hot storage keeps data on fast SSDs and indexes every field. Cold storage drops the search indexes, packs the data into compressed chunks (e.g., gzip archives or Parquet files), and moves them onto slow, cheap media such as magnetic disk or tape.
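To make the "compress into chunks" step concrete, here is a minimal sketch using only Python's standard gzip module. The log lines are synthetic and the helper name compress_chunk is made up for illustration; real logs will compress at a different ratio.

```python
import gzip

def compress_chunk(lines):
    """Join log lines into one chunk and gzip it. Returns (raw, compressed)."""
    raw = "\n".join(lines).encode("utf-8")
    return raw, gzip.compress(raw)

# Synthetic, highly repetitive log lines (an assumption for this demo).
sample_lines = [
    f"2024-01-01T00:00:{i % 60:02d}Z INFO request_id={i} status=200 path=/api/items"
    for i in range(10_000)
]

raw, packed = compress_chunk(sample_lines)
print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes")
print(f"ratio: {len(raw) / len(packed):.1f}x")

# Reading anything back means decompressing the whole chunk first,
# which is one reason cold data is slow to search.
restored = gzip.decompress(packed)
```

The chunk is opaque once compressed: there is no per-field index left, so finding one log line means pulling and decompressing the whole file.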
# Typical Data Lifecycle Policy
from datetime import date

def evaluate_log_retention(log_index, s3):
    # log_index and s3 stand in for your own index and object-storage clients.
    age_days = (date.today() - log_index.creation_date).days
    if age_days < 7:
        # Keep in Elasticsearch on fast NVMe SSDs ($$$)
        # Allows sub-second search during active incidents.
        log_index.tier = "HOT"
    elif age_days < 30:
        # Move to slower HDDs, reduce index replication ($)
        # Search takes ~5 seconds, but cheaper.
        log_index.tier = "WARM"
    elif age_days < 2555:  # 7 years
        # Compress to .gz files and send to S3 Glacier (¢)
        # Search requires a manual "restore" job taking 12+ hours.
        s3.upload(log_index.compress())
        log_index.delete_from_elasticsearch()
    else:
        # Compliance period is over. Delete forever to avoid liability.
        s3.delete(log_index.s3_path)
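The age thresholds in the policy above can also be pulled out into a pure function, which makes the boundaries easy to check without touching Elasticsearch or S3. This is only a sketch: the names tier_for_age and tier_for_index and the "DELETE" label are invented here, and the thresholds simply mirror the 7/30/2555-day values above.

```python
from datetime import date

# Thresholds mirror the policy above; adjust to your own compliance rules.
HOT_DAYS = 7
WARM_DAYS = 30
RETENTION_DAYS = 7 * 365  # 2555 days

def tier_for_age(age_days):
    """Return the storage tier name for data of the given age in days."""
    if age_days < HOT_DAYS:
        return "HOT"
    if age_days < WARM_DAYS:
        return "WARM"
    if age_days < RETENTION_DAYS:
        return "COLD"
    return "DELETE"

def tier_for_index(creation_date, today):
    """Compute the tier from a creation date, with 'today' passed in for testability."""
    return tier_for_age((today - creation_date).days)

today = date(2024, 6, 30)
for created in (date(2024, 6, 28), date(2024, 6, 10), date(2022, 6, 30), date(2016, 1, 1)):
    print(created, tier_for_index(created, today))
```

Passing today in explicitly, rather than reading the clock inside the function, keeps the decision deterministic and lets you test the exact day an index crosses from one tier to the next.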
The cost savings are monumental (cold storage is often 100x cheaper per GB). The trade-off is retrieval time. If auditors ask for a log from 2 years ago, an engineer has to trigger a data restoration job and may wait 12 to 24 hours before they can even run a search query.
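A rough back-of-the-envelope sketch of the savings follows. The prices and the ingest volume are illustrative placeholders, not real vendor pricing; the 100x gap between hot and cold just mirrors the ratio quoted above. It also ignores compression and retrieval fees.

```python
# Illustrative prices in dollars per GB-month. These are NOT real vendor
# prices; the 100x hot/cold gap just mirrors the ratio quoted in the text.
PRICE_PER_GB_MONTH = {"HOT": 1.00, "WARM": 0.25, "COLD": 0.01}

# Days of data each tier holds at steady state under a 7/30/2555-day policy.
DAYS_IN_TIER = {"HOT": 7, "WARM": 30 - 7, "COLD": 2555 - 30}

def monthly_cost_tiered(gb_per_day):
    """Monthly storage bill when data moves through HOT, WARM, and COLD."""
    return sum(
        gb_per_day * DAYS_IN_TIER[tier] * PRICE_PER_GB_MONTH[tier]
        for tier in DAYS_IN_TIER
    )

def monthly_cost_all_hot(gb_per_day):
    """Monthly storage bill when all 2555 days of data stay in the hot tier."""
    return gb_per_day * 2555 * PRICE_PER_GB_MONTH["HOT"]

gb_per_day = 100  # hypothetical ingest volume
tiered = monthly_cost_tiered(gb_per_day)
all_hot = monthly_cost_all_hot(gb_per_day)
print(f"tiered: ${tiered:,.2f}/month, all hot: ${all_hot:,.2f}/month")
print(f"all-hot is {all_hot / tiered:.0f}x more expensive")
```

Under these made-up numbers the overall bill drops by roughly 67x rather than the full 100x, because the most recent 30 days still sit in the more expensive hot and warm tiers.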