Best Practices for Snapshots and Data Retention
1. Introduction
The environment delivered by CloudFerro includes block volumes, shared file systems, and object storage compatible with S3. Within such an infrastructure, proper planning and implementation of a data protection policy becomes particularly important.
The purpose of this document is to support the client in preparing and implementing an effective data backup and retention policy in the environment provided by CloudFerro. The document presents best practices for snapshots, backups, and archiving in order to:
ensure business continuity for systems and applications,
meet business objectives expressed through RPO/RTO requirements,
align processes with regulatory requirements (e.g., GDPR, audits),
optimize data storage costs while maintaining high availability and security.
CloudFerro, as an infrastructure provider, offers technical mechanisms that enable the implementation of data protection policies. This document serves as a guide to help apply them consciously and effectively.
2. Requirements Analysis
Before implementing backup policies, it is necessary to define the basic assumptions:
RPO/RTO objectives: Define the maximum acceptable data loss (RPO) and the permissible downtime (RTO). RTO is the maximum time allowed to restore a service, while RPO defines the maximum window of data that may be lost. Backup strategies must match these objectives — data with a low RPO requires very frequent backups, while a tight RTO may call for instant recovery mechanisms.
Data classification: Divide data into categories (critical, important, standard) based on business significance. For instance, databases and key production volumes should be treated as the highest protection priority, suggesting frequent and tightly controlled backups.
Regulatory requirements: Define retention periods imposed by law. In financial or medical sectors, data is often required to be stored for 7–10 years. At the same time, under GDPR, personal data should not be kept longer than necessary and must be deleted after the retention period expires.
Data growth forecast and cost analysis: Estimate the rate of data growth and associated costs. Large datasets (e.g., logs or archives) can be moved to cheaper storage tiers (e.g., cold storage) and deduplicated to reduce expenses.
Isolation strategy: Plan environment separation. It is recommended to use separate cloud projects/accounts and distinct access permissions for backups. For example, storing backups in a separate AWS account protects against accidental deletion or credential compromise. Such a logical “air gap” increases resilience against ransomware and other threats.
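To make the requirements analysis actionable, the resulting data classes and objectives can be recorded as a small, machine-readable tier map. The sketch below is illustrative only; the class names and RPO/RTO values are assumptions to be replaced by the outcome of a real business impact analysis:

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass(frozen=True)
class ProtectionTier:
    rpo: timedelta               # maximum acceptable data loss
    rto: timedelta               # maximum acceptable downtime
    backup_frequency: timedelta  # must not exceed the RPO window

# Placeholder values, not recommendations.
TIERS = {
    "critical": ProtectionTier(rpo=timedelta(minutes=15), rto=timedelta(hours=1),
                               backup_frequency=timedelta(minutes=15)),
    "important": ProtectionTier(rpo=timedelta(hours=24), rto=timedelta(hours=8),
                                backup_frequency=timedelta(hours=24)),
    "standard": ProtectionTier(rpo=timedelta(days=1), rto=timedelta(days=2),
                               backup_frequency=timedelta(days=1)),
}

def meets_rpo(tier_name: str, time_since_last_backup: timedelta) -> bool:
    """A schedule satisfies the RPO if no more data than the RPO window can be lost."""
    return time_since_last_backup <= TIERS[tier_name].rpo
```

Encoding the tiers this way lets monitoring tools flag schedules that drift out of compliance with the agreed objectives.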
3. Snapshots – Best Practices
3.1 Block Volumes
Block volume snapshots in OpenStack (Cinder service) are fast, point-in-time copies of volume state that can be used for data recovery or to create new volumes. They should primarily be treated as short-term protection during operations such as system upgrades, virtual machine migrations, or application configuration changes.
To ensure data consistency, suspend write operations to the volume before creating a snapshot — particularly when the volume is used by a database or a high I/O application. In practice, this may mean unmounting the file system, freezing processes, or creating a database checkpoint.
Snapshot creation in OpenStack can be automated by integrating Cinder with orchestration mechanisms (e.g., Heat, Mistral) or configuration management tools (e.g., Ansible, Puppet). A typical sequence includes:
Pause application operations (I/O freeze, database checkpoint).
Create the Cinder snapshot.
Resume operations and validate the application.
This approach enables consistent, application-level snapshots that can safely be used for rollback or rapid recovery.
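As a minimal sketch, the three-step sequence above can be expressed as a command generator to be wired into an Ansible playbook or a cron job. The volume name, snapshot name, and mountpoint are placeholders; `fsfreeze` quiesces an ext4/XFS file system, and the snapshot call assumes the standard OpenStack CLI:

```python
def snapshot_commands(volume: str, snapshot_name: str, mountpoint: str) -> list[str]:
    """Ordered shell commands for a consistent Cinder snapshot.

    --force is required when the volume is attached to a running instance.
    """
    return [
        # 1. Pause I/O so the file system is in a consistent state.
        f"fsfreeze --freeze {mountpoint}",
        # 2. Take the point-in-time Cinder snapshot.
        f"openstack volume snapshot create --volume {volume} --force {snapshot_name}",
        # 3. Resume normal operations.
        f"fsfreeze --unfreeze {mountpoint}",
    ]
```

A database workload would additionally need its own checkpoint (e.g., `FLUSH TABLES WITH READ LOCK` for MySQL) before the freeze step, as noted above.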
Recommendation: In production environments, snapshots should be kept for a short period (e.g., a few days) and limited in number. Snapshots do not replace full backups — they are a fast recovery tool during administrative operations.
4. Backup and Retention in Object Storage (S3)
4.1 Strategies
Long-term backups are usually stored in object storage — both full copies and incremental deltas (e.g., log files, database snapshots). Backups should be stored in separate resources (distinct buckets/accounts/projects) and encrypted with dedicated keys. This backup environment should be isolated from production: for example, storing backups in a different cloud account protects against data loss caused by credential compromise or operator error. All application backups should be logically consistent: for example, a database dump should be accompanied by its transaction logs to enable point-in-time recovery.
4.2 Retention
Recommended backup retention periods in S3 can be summarized as follows:
Daily backups – typically kept for 1–2 months (e.g., 30–62 days) for quick restoration of recently modified data.
Monthly backups – retained for several years (e.g., 3–7 years) to meet audit requirements.
Yearly archives – oldest copies that meet compliance obligations (e.g., financial data for 7–10+ years). Cold storage tiers (e.g., Glacier or Deep Archive) should be used here to minimize costs and ensure durability.
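The schedule above can be encoded so that expiry dates are computed rather than tracked by hand. The windows below mirror the examples in this section (62 days, 7 years, 10 years) and are illustrative:

```python
from datetime import date, timedelta

# Illustrative windows matching the example schedule above.
RETENTION = {
    "daily": timedelta(days=62),
    "monthly": timedelta(days=7 * 365),
    "yearly": timedelta(days=10 * 365),
}

def expiry_date(backup_kind: str, created: date) -> date:
    """Date after which a backup of the given kind may be deleted."""
    return created + RETENTION[backup_kind]
```

Computed expiry dates can drive both automated deletion (GDPR retention limits) and lifecycle-rule configuration.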
4.3 Protection Mechanisms
To secure backups in S3, the following mechanisms should be implemented:
Bucket versioning: Enable S3 Versioning to retain all object versions and allow recovery of any previous file version. This makes it possible to reverse accidental deletions or overwrites.
Lifecycle policies and storage classes: Configure S3 lifecycle rules to automatically transition older backups to cheaper storage classes (e.g., Standard → Infrequent Access → Glacier). This optimizes costs — archival data should be stored in Glacier, following cloud backup optimization best practices.
Encryption: Enforce encryption in transit (TLS/HTTPS) and at rest for all backups. Each backup bucket should require object encryption upon upload.
Access control (RBAC): Apply the principle of least privilege — only designated administrators should have access to backups. IAM policies should strictly limit deletion or configuration modification rights. As experts note, “Immutable backups, encryption, and RBAC keep data safe from tampering.”
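The versioning and lifecycle mechanisms above are configured declaratively. Below is a hedged sketch in the AWS-style lifecycle format; the bucket name, prefix, and day thresholds are assumptions, and an S3-compatible endpoint may support only a subset of these features:

```python
# Lifecycle configuration in the structure accepted by the S3 API
# (e.g., boto3's put_bucket_lifecycle_configuration).
lifecycle_config = {
    "Rules": [
        {
            "ID": "backup-tiering",
            "Status": "Enabled",
            "Filter": {"Prefix": "backups/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # Standard -> Infrequent Access
                {"Days": 90, "StorageClass": "GLACIER"},      # Infrequent Access -> cold tier
            ],
            "Expiration": {"Days": 3650},  # delete after roughly 10 years
        }
    ]
}

# Applying it requires credentials; shown commented out for orientation only:
# import boto3
# s3 = boto3.client("s3")
# s3.put_bucket_versioning(Bucket="backup-bucket",
#                          VersioningConfiguration={"Status": "Enabled"})
# s3.put_bucket_lifecycle_configuration(Bucket="backup-bucket",
#                                       LifecycleConfiguration=lifecycle_config)
```

Keeping the configuration as data makes it easy to review in version control and to validate (e.g., that transitions are ordered and expiry follows the last transition).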
5. Retention Matrix
The following table presents an example of backup parameters for different data classes:
| Data Class | Snapshots – Frequency | Snapshots – Retention | Daily Backup | Monthly Backup | Archive |
|---|---|---|---|---|---|
| Critical | Before changes + daily | 7–14 days | ~62 days | 84 months (7 years) | 7–10 years |
| Important | Daily | 7–30 days | 30–62 days | 36–60 months (3–5 years) | 5–7 years |
| Standard | Daily | 7–14 days | 30 days | 12–24 months | Optional |
This table is illustrative — actual retention periods may vary depending on company policy and legal requirements. For example, certain financial or medical data may require archiving for up to a decade.
6. Security and Compliance
Backup protection must account for both security and legal compliance:
Logical air gap: Use environment separation, e.g., store backups in separate accounts/projects with independent credentials. This ensures that backups remain intact in case of production environment compromise or failure.
Ransomware resilience: Implement immutable backup mechanisms and monitor for anomalies. As experts point out, “Immutable copies, encryption, and RBAC” are key to ransomware protection. Restrict access to essential personnel only and deploy detection systems for unauthorized backup changes.
Privacy and GDPR: Backups should not contain unnecessary personal data. Apply encryption and anonymization/deletion procedures once retention ends. Under GDPR, personal data in backups must be protected against unauthorized access and retained only as long as necessary.
7. Operations and Testing
Backup policies must be supported by continuous operations and regular testing:
Monitoring and reporting: Use tools to monitor backup and snapshot status. Enable logging of all backup operations and set alerts for failed copies. This ensures full visibility and quick issue detection.
Recovery testing (DR drills): Regularly verify the usefulness of backups. For critical data, perform restoration exercises quarterly; for important data, biannually; and for standard data, at least annually. These drills confirm RPO/RTO compliance and uncover potential gaps.
Runbooks and procedures: Prepare detailed backup/restore instructions — checklists for snapshots, archiving scripts, etc. Audit access to backup environments regularly. Proper documentation enables quick incident response and ensures adherence to policies.
Cost optimization: Use deduplication and compression where possible. Utilize storage classes so that older data moves to cheaper tiers (e.g., S3 lifecycle Deep Archive). Exclude temporary or non-critical files from backups to reduce storage costs.
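The drill cadence recommended above (quarterly for critical, biannual for important, annual for standard data) can be checked programmatically. A small sketch; the interval values simply encode those recommendations:

```python
from datetime import date, timedelta

# Intervals matching the drill cadence recommended above.
DRILL_INTERVAL = {
    "critical": timedelta(days=91),    # roughly quarterly
    "important": timedelta(days=182),  # roughly every six months
    "standard": timedelta(days=365),   # at least annually
}

def drill_overdue(data_class: str, last_drill: date, today: date) -> bool:
    """True when the next restoration exercise for this data class is due."""
    return today - last_drill >= DRILL_INTERVAL[data_class]
```

Hooked into the monitoring described above, this turns missed DR drills into alerts rather than audit findings.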
8. Implementation Checklist
Define data classes and RPO/RTO goals – perform a business impact analysis.
Develop and approve snapshot policies (frequency, consistency).
Implement backup tools (full + incremental/differential) for all resources.
Create isolated backup resources.
Configure lifecycle and retention policies (e.g., automatic Cold transitions).
Set roles and permissions (RBAC) and rotate access keys.
Monitor the backup system and report on SLA compliance.
Regularly test data restoration (DR drills).
Review and update policies quarterly or annually to match business and technology changes.
9. Final Recommendations
Snapshots as short-term protection: Do not treat snapshots as long-term backups. Delete them promptly once the operation they were created for has completed. Following best practices, they should not be kept longer than a few days.
Long-term backups in S3: Use object storage (e.g., S3) with versioning and lifecycle rules enabled for archival purposes. This allows backups to be securely stored for years.
Environment isolation: Separate production and backup environments at the account/project, network, and key levels. Even if production is compromised, the backup remains safe.
Regular testing and monitoring: Continuously perform data recovery tests and compare results with RPO/RTO objectives. Verify that data growth and restore speed remain within expectations. Only real recovery drills confirm the effectiveness of a backup strategy.
Cost optimization: Choose storage classes based on data age and remove unnecessary copies. Apply compression and deduplication to prevent excessive backup costs.