qvest-task/docs/backup-strategy.md
aviyadeveloper b8eb8e991c
feat: implement disaster recovery with automated restore
- Create restore.sh for automated S3 backup recovery
  - Fetches backups, stops services, restores database/data/config, restarts & validates
- Successfully tested on production system
- Document procedures in backup-strategy.md
- Add Test 6: Full backup/restore cycle with disaster simulation
- Rename test-update.sh → test-integration.sh
2026-06-11 19:27:49 +02:00


Backup Strategy

Overview

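This strategy follows the 3-2-1 rule: keep 3 copies of the data, on 2 different storage types, with at least 1 copy offsite. Note that a given backup moves from S3 Standard to S3 Glacier under the lifecycle rules described below, so copies 2 and 3 cover different date ranges rather than duplicating the same backup.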

| Copy | Location    | Type           | Retention |
|------|-------------|----------------|-----------|
| 1    | EC2 (EBS)   | Block Storage  | Live      |
| 2    | S3 Standard | Object Storage | 30 days   |
| 3    | S3 Glacier  | Cold Storage   | 90 days   |

What is Backed Up

  1. PostgreSQL Database (database-*.sql.gz) - All application data, users, and repository metadata
  2. Gitea Data (gitea-data-*.tar.gz) - Git repositories, LFS objects, attachments, SSH keys
  3. Configuration (config-*.tar.gz) - docker-compose.yml, nginx configs, .env, scripts
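
The naming patterns above suggest that one backup run produces three artifacts sharing a common suffix. A minimal sketch of deriving those names, assuming the `*` in each pattern is a `YYYYMMDD_HHMMSS` timestamp (the shape used in the restore example in this document) and that it is taken in UTC; the helper name `backup_artifacts` is hypothetical and not part of backup.sh:

```bash
# Hypothetical helper: list the three artifact names for one backup run.
# Assumes the wildcard in each pattern is a YYYYMMDD_HHMMSS timestamp.
backup_artifacts() {
  local ts="$1"
  printf '%s\n' \
    "database-${ts}.sql.gz" \
    "gitea-data-${ts}.tar.gz" \
    "config-${ts}.tar.gz"
}

# Names a run started right now would produce (UTC is an assumption).
backup_artifacts "$(date -u +%Y%m%d_%H%M%S)"
```

Keeping one timestamp across all three files is what lets a restore pick a consistent database, data, and configuration set.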

Backup Schedule

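| Type       | Frequency      | Time      | Script                       |
|------------|----------------|-----------|------------------------------|
| Automated  | Daily          | 02:00 UTC | /opt/gitea/scripts/backup.sh |
| Pre-Update | Before updates | Variable  | Called by update scripts     |
| Manual     | On-demand      | N/A       | Run backup.sh manually       |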

Location: s3://qvest-task-backups/backups/
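
The daily 02:00 UTC run could be scheduled in several ways. The sketch below assumes cron is the scheduler, that the entry lives in `/etc/cron.d`, and that the host clock is set to UTC; none of this is confirmed by the document, and the function name and the file name `gitea-backup` are hypothetical:

```bash
# Hypothetical: emit an /etc/cron.d entry for the daily automated backup.
# Fields: minute hour day-of-month month day-of-week user command.
# Assumes the host clock runs in UTC, so "0 2" means 02:00 UTC.
backup_cron_entry() {
  printf '%s\n' '0 2 * * * root /opt/gitea/scripts/backup.sh'
}

backup_cron_entry
```

To install it, the output could be piped to `sudo tee /etc/cron.d/gitea-backup`. A manual run is the same script invoked directly with `sudo /opt/gitea/scripts/backup.sh`.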

Retention & Lifecycle

Day 1-30:  S3 Standard (instant access)
Day 31-90: S3 Glacier (retrieval: minutes to hours)
Day 91+:   Automatically deleted

Lifecycle rules are managed by Terraform (terraform/storage.tf). S3 versioning is enabled, and noncurrent object versions expire after 30 days.
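
To make the schedule concrete, the following sketch maps a backup's age to the tier it should be in. It assumes age 0 corresponds to "Day 1", so the transition happens at 30 days of age and deletion at 90; the real thresholds live in terraform/storage.tf, which is not shown here, and the function name `backup_tier` is hypothetical:

```bash
# Hypothetical helper: map a backup's age in whole days to its expected tier.
# Assumes age 0 is "Day 1": ages 0-29 are Standard, 30-89 Glacier, 90+ deleted.
backup_tier() {
  local age_days="$1"
  if [ "$age_days" -lt 30 ]; then
    echo "STANDARD"
  elif [ "$age_days" -lt 90 ]; then
    echo "GLACIER"
  else
    echo "EXPIRED"
  fi
}

backup_tier 45
```

S3 applies lifecycle rules asynchronously, so an object may sit in its previous tier for a while after crossing a threshold. A backup that has moved to Glacier also needs a restore request (for example via `aws s3api restore-object`) before it can be downloaded.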

Restore Procedures

Quick Restore

```bash
# List available backups
sudo /opt/gitea/scripts/restore.sh

# Restore specific backup
sudo /opt/gitea/scripts/restore.sh <timestamp>
# Example: sudo /opt/gitea/scripts/restore.sh 20260611_164408
```
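
Since a mistyped timestamp would point at a backup that does not exist, a wrapper might check the argument's shape before calling the script. This is a sketch only: the `YYYYMMDD_HHMMSS` format is inferred from the example above, `valid_timestamp` is a hypothetical helper, and whether restore.sh performs its own validation is not stated here:

```bash
# Hypothetical check: accept only timestamps shaped like 20260611_164408
# (8 digits, underscore, 6 digits). Uses a POSIX case pattern, no regex engine.
valid_timestamp() {
  case "$1" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9])
      return 0 ;;
    *)
      return 1 ;;
  esac
}

ts="20260611_164408"
if valid_timestamp "$ts"; then
  echo "timestamp looks valid: ${ts}"
else
  echo "invalid timestamp: ${ts}" >&2
fi
```

The check covers only the shape of the string; it does not confirm that a backup with that timestamp exists in S3.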

The script will:

  1. Prompt for confirmation
  2. Download backups from S3
  3. Stop services
  4. Restore database, data, and configuration
  5. Restart and verify services
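
The five steps above can be outlined as a dry-run plan. This is not the project's restore.sh: it only prints the commands such a script might run. Everything beyond the bucket location and the artifact names is an assumption, including flat object keys under `backups/`, the compose file path, services named `db` and `gitea`, a PostgreSQL user and database both named `gitea`, and the target directories:

```bash
# Hypothetical outline of the five steps above; NOT the project's restore.sh.
# Assumed: flat keys under backups/, compose file at /opt/gitea/docker-compose.yml,
# services "db" and "gitea", PostgreSQL user/database "gitea", data in /opt/gitea/data.
BUCKET="s3://qvest-task-backups/backups"
COMPOSE="docker compose -f /opt/gitea/docker-compose.yml"
WORK="/tmp/gitea-restore"

# Print, in order, the commands a restore of this shape might run.
restore_plan() {
  local ts="$1" f
  echo "# 1. confirm"
  echo "read -r -p 'Restore ${ts}? [y/N] ' answer"
  echo "# 2. download"
  for f in "database-${ts}.sql.gz" "gitea-data-${ts}.tar.gz" "config-${ts}.tar.gz"; do
    echo "aws s3 cp ${BUCKET}/${f} ${WORK}/${f}"
  done
  echo "# 3. stop the application (database stays up for the import)"
  echo "${COMPOSE} stop gitea"
  echo "# 4. restore database, data, configuration"
  echo "gunzip -c ${WORK}/database-${ts}.sql.gz | ${COMPOSE} exec -T db psql -U gitea gitea"
  echo "tar -xzf ${WORK}/gitea-data-${ts}.tar.gz -C /opt/gitea/data"
  echo "tar -xzf ${WORK}/config-${ts}.tar.gz -C /opt/gitea"
  echo "# 5. restart and verify"
  echo "${COMPOSE} up -d"
  echo "${COMPOSE} ps"
}

restore_plan 20260611_164408
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere; the ordering (download before stopping services, so downtime starts as late as possible) mirrors the list above.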

Disaster Recovery Scenarios

Database Corruption

Solution: Database-only restore

Repository Deletion

Solution: Full restore (database + data must match)

Complete Instance Failure

Solution: Rebuild infrastructure + restore
Steps:

  1. terraform apply
  2. ansible-playbook site.yml
  3. sudo /opt/gitea/scripts/restore.sh <timestamp> (on the rebuilt instance)
  4. Update DNS if needed
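
The rebuild steps above could be chained so that a failure in one stops the rest. The wrapper below is a sketch under several assumptions: the Terraform code sits in `./terraform` (as the path terraform/storage.tf suggests), `site.yml` is in the current directory, the restore is triggered over SSH, and `GITEA_HOST` is a placeholder hostname. It defaults to a dry run that only prints the commands:

```bash
# Hypothetical wrapper around the steps above; stops at the first failure.
# Assumed: Terraform code in ./terraform, site.yml in the current directory,
# and GITEA_HOST as a placeholder for the rebuilt instance.
DRY_RUN="${DRY_RUN:-1}"                        # 1 = only print the commands
GITEA_HOST="${GITEA_HOST:-gitea.example.com}"

# Echo the command, and execute it unless DRY_RUN=1.
step() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

rebuild() {
  local ts="$1"
  step terraform -chdir=terraform apply &&
    step ansible-playbook site.yml &&
    step ssh -t "$GITEA_HOST" sudo /opt/gitea/scripts/restore.sh "$ts" &&
    echo "Now update DNS if the instance address changed."
}

rebuild 20260611_164408
```

The `-t` flag on `ssh` allocates a terminal so that the confirmation prompt from restore.sh can be answered interactively.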

Security

  • Encryption: S3 server-side AES-256 encryption enabled
  • Access: EC2 IAM role with S3FullAccess (consider scoping the policy to the backup bucket only)
  • Data Sensitivity: Backups contain passwords, SSH keys, and API tokens, so S3 bucket access must be restricted

⚠️ Note: The .env file with secrets is included in config backups. Restrict access to the S3 bucket accordingly.
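
As one way to tighten the access noted above, the sketch below prints an IAM policy document limited to the backup bucket and its `backups/` prefix. The action list is an assumption about what the backup and restore scripts need (listing, upload, download, and Glacier restore requests); the function name is hypothetical, and the project's actual policy is not shown in this document:

```bash
# Hypothetical: print an IAM policy scoped to one bucket's backups/ prefix.
# The action list is an assumption about what backup.sh and restore.sh need.
scoped_backup_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:RestoreObject"],
      "Resource": "arn:aws:s3:::${bucket}/backups/*"
    }
  ]
}
EOF
}

scoped_backup_policy qvest-task-backups
```

`s3:ListBucket` applies to the bucket itself while the object actions apply to keys under it, which is why the policy needs two statements with different resources.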

Document History

Version Date Changes
1.0 2026-06-11 Initial backup strategy