Compare commits

...

2 Commits

Author SHA1 Message Date
2e368a3a7c feat: implement disaster recovery with automated restore (#2)
- Create restore.sh for automated S3 backup recovery
  - Fetches backups, stops services, restores database/data/config, restarts & validates
- Successfully tested on production system
- Document procedures in backup-strategy.md
- Add Test 6: Full backup/restore cycle with disaster simulation
- Rename test-update.sh → test-integration.sh

Co-authored-by: aviyadeveloper <aviya.developer@gmail.com>
Reviewed-on: #2
2026-06-11 17:29:55 +00:00
685de1816d feat: implement update automation and backup system with CI tests (#1)
- Diun monitors Docker images
- Automated updates for nginx, manual approval for gitea/postgres
- Weekly cert renewal automation via cron
- Health checks with automatic rollback on failure
- AWS SES email notifications on update failures
- Daily S3 backups + pre-update snapshots
- Integration tests with Gitea Actions quality gate
- Change domain from gitea.poll-streams.com to git.poll-streams.com
- Add diagrams
2026-06-11 15:51:48 +00:00
36 changed files with 3914 additions and 190 deletions

48
.gitea/workflows/test.yml Normal file
View File

@ -0,0 +1,48 @@
name: Update Automation Tests
on:
pull_request:
branches:
- main
workflow_dispatch:
jobs:
test:
name: Integration Tests
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Cache Docker layers
uses: actions/cache@v4
with:
path: /tmp/.buildx-cache
key: ${{ runner.os }}-docker-${{ hashFiles('docker/docker-compose.yml', 'scripts/test-integration.sh') }}
restore-keys: |
${{ runner.os }}-docker-
- name: Pull Docker images
run: |
docker pull postgres:18.4
docker pull nginx:1.27-alpine
docker pull alpine:3.19
docker pull alpine:3.20
- name: Make test script executable
run: chmod +x scripts/test-integration.sh
- name: Run integration tests
run: ./scripts/test-integration.sh
- name: Upload test logs on failure
if: failure()
uses: actions/upload-artifact@v4
with:
name: test-logs
path: /tmp/test-integration-*.log
retention-days: 7

80
ADR.md
View File

@ -117,7 +117,7 @@ This document tracks all significant architectural decisions made during the pro
## ADR-007: SSL Certificates - Let's Encrypt ## ADR-007: SSL Certificates - Let's Encrypt
**Date**: 2026-06-08 **Date**: 2026-06-08 (Updated 2026-06-11)
**Status**: Accepted **Status**: Accepted
**Decision**: Let's Encrypt with certbot **Decision**: Let's Encrypt with certbot
@ -130,22 +130,81 @@ This document tracks all significant architectural decisions made during the pro
**Requirement**: Valid domain name pointing to server **Requirement**: Valid domain name pointing to server
**Domain**: git.poll-streams.com (changed from gitea.poll-streams.com)
**Implementation Note**: Initially encountered Let's Encrypt rate limits (5 certificates per week). Resolved by migrating to a fresh domain identifier (git.poll-streams.com), allowing immediate production certificate issuance. Production certificates obtained successfully.
--- ---
## ADR-008: Update Automation - Watchtower ## ADR-008: Update Automation - Diun + Custom Scripts
**Date**: 2026-06-08 **Date**: 2026-06-08
**Status**: Accepted **Status**: Accepted (Updated 2026-06-09)
**Decision**: Watchtower for Docker image updates **Decision**: Diun (Docker Image Update Notifier) for monitoring + custom bash scripts for orchestration
**Rationale**: **Rationale**:
- Purpose-built for Docker environments - Diun monitors for updates and sends email notifications (built-in)
- Simple to configure (runs as container) - Enables differentiated update policies per container
- Automatic image updates on schedule - Custom scripts provide full control over update workflow
- Minimal complexity - Supports pre-update backups and health checks
- Allows manual approval for critical components (Gitea, PostgreSQL)
- Auto-update for low-risk components (nginx, certbot)
- Demonstrates production-level engineering (not just "update everything")
**Scope**: Monitor and update Gitea, nginx, and other containers **Update Strategy**:
- **Schedule**: Weekly checks during off-hours
- **Nginx/Certbot**: Automatic updates after backup
- **Gitea/PostgreSQL**: Email notification, manual approval required
- **Backup**: Pre-update backup to S3 (database + Gitea data)
- **Health Checks**: Post-update validation
- **Rollback**: Automatic rollback on health check failure
- **Notifications**: Email alerts on critical failures, logs for successful updates
**Scope**:
- Diun container monitors all Docker images
- `auto-update.sh` - automated update for nginx/certbot
- `manual-update.sh` - operator-approved update for gitea/postgres
- Health check and rollback logic
**Alternative Considered**: Watchtower - rejected because it lacks per-container policies, pre-update backups, and proper notification support
---
## ADR-012: CI/CD - Gitea Actions with Self-Hosted Runners
**Date**: 2026-06-11
**Status**: Accepted
**Decision**: Use Gitea Actions with self-hosted runners for CI/CD
**Rationale**:
- Native integration with Gitea (no external CI service)
- Self-hosted runners provide full control and security
- GitHub Actions-compatible workflow syntax (familiar, well-documented)
- Enables automated testing before merging changes
- Demonstrates production-grade CI/CD practices
**Implementation**:
- **Runners**: 2x act_runner v0.2.10 instances as systemd services
- **Automation**: Ansible playbook (setup-runner.yml) for reproducible deployment
- **Runner Registration**: Automated via Gitea API with token from AWS Secrets Manager
- **Networking**: Host network mode for job containers to access Gitea
- **Registration URL**: https://git.poll-streams.com (public URL for git clone operations)
- **Workflow**: .gitea/workflows/test.yml runs integration tests on PRs
- **Features**: Docker layer caching, artifact uploads, workflow_dispatch support
**Technical Details**:
- Each runner has dedicated config directory (/etc/act_runner-{1,2})
- Configuration includes host networking to allow job containers to reach services
- Runners registered with public URL to avoid localhost connection issues
- Systemd manages runner lifecycle with automatic restart
**Benefits**:
- Automated quality gates before merging
- Consistent test environment (matches CI exactly)
- Fast feedback on code changes
- Self-contained solution (no external dependencies)
--- ---
@ -218,7 +277,8 @@ This document tracks all significant architectural decisions made during the pro
| **Reverse Proxy** | Nginx | Lightweight, standard | | **Reverse Proxy** | Nginx | Lightweight, standard |
| **SSL** | Let's Encrypt | Free, automated, professional | | **SSL** | Let's Encrypt | Free, automated, professional |
| **DNS** | Route 53 | AWS-native | | **DNS** | Route 53 | AWS-native |
| **Updates** | Watchtower | Docker-native automation | | **Updates** | Diun + Scripts | Per-container policies, backup/rollback |
| **CI/CD** | Gitea Actions | Self-hosted runners, native integration |
| **Backups** | Scripts + S3 | Custom, controlled | | **Backups** | Scripts + S3 | Custom, controlled |
| **Monitoring** | Prometheus + Grafana | Industry standard | | **Monitoring** | Prometheus + Grafana | Industry standard |
| **Logging** | Loki + Promtail | Lightweight, integrated | | **Logging** | Loki + Promtail | Lightweight, integrated |

27
Makefile Normal file
View File

@ -0,0 +1,27 @@
.PHONY: help full-deploy full-destroy provision configure test
help:
@echo "Qvest Task - Gitea Deployment"
@echo ""
@echo "Targets:"
@echo " full-deploy - Full deployment (terraform + ansible)"
@echo " full-destroy - Destroy all infrastructure"
@echo " provision - Provision AWS infrastructure only"
@echo " configure - Run ansible configuration only"
@echo " test - Run integration tests"
provision:
cd terraform && terraform apply -auto-approve
configure:
cd ansible && ansible-playbook -i inventory site.yml
test:
./scripts/test-integration.sh
full-deploy: provision configure
@echo "Deployment complete. Gitea available at https://git.poll-streams.com"
full-destroy:
@./scripts/empty-s3-bucket.sh
cd terraform && terraform destroy -auto-approve

View File

@ -21,7 +21,7 @@ This phase will be achieved through discussion and research and will include the
- **Database**: PostgreSQL (self-hosted in Docker) - **Database**: PostgreSQL (self-hosted in Docker)
- **Reverse Proxy**: Nginx - **Reverse Proxy**: Nginx
- **SSL**: Let's Encrypt with certbot - **SSL**: Let's Encrypt with certbot
- **Update Automation**: Watchtower - **Update Automation**: Diun + Custom Scripts
- **Monitoring**: Prometheus + Grafana (later phase) - **Monitoring**: Prometheus + Grafana (later phase)
- **Logging**: Loki + Promtail (later phase) - **Logging**: Loki + Promtail (later phase)
- **Backup**: Custom scripts + S3 (later phase) - **Backup**: Custom scripts + S3 (later phase)
@ -61,7 +61,7 @@ This phase provisions the AWS infrastructure using Terraform.
- ✅ Configure Security Group for EC2 (ports 22, 80, 443) - ✅ Configure Security Group for EC2 (ports 22, 80, 443)
- ✅ Provision EC2 instance (t3.medium, Ubuntu 24.04) with IAM role - ✅ Provision EC2 instance (t3.medium, Ubuntu 24.04) with IAM role
- ✅ Create S3 bucket for backups (with versioning & encryption) - ✅ Create S3 bucket for backups (with versioning & encryption)
- ✅ Configure Route 53 DNS records (A record: gitea.poll-streams.com → EC2) - ✅ Configure Route 53 DNS records (A record: git.poll-streams.com → EC2)
- ✅ Use official Terraform AWS modules (VPC, Security Group) - ✅ Use official Terraform AWS modules (VPC, Security Group)
- ✅ Refactored into separate files: main.tf, vpc.tf, security.tf, compute.tf, storage.tf, iam.tf, dns.tf, outputs.tf - ✅ Refactored into separate files: main.tf, vpc.tf, security.tf, compute.tf, storage.tf, iam.tf, dns.tf, outputs.tf
@ -104,7 +104,8 @@ This phase implements the automated, reproducible Gitea installation.
### 3.3 Reverse Proxy Configuration ✅ ### 3.3 Reverse Proxy Configuration ✅
- ✅ Nginx 1.27-alpine deployed via Docker Compose - ✅ Nginx 1.27-alpine deployed via Docker Compose
- ✅ Let's Encrypt SSL certificate obtained via certbot - ✅ Let's Encrypt SSL certificate obtained via certbot (production)
- ✅ Domain: git.poll-streams.com (migrated to avoid rate limits)
- ✅ Two-stage nginx config (HTTP-only for ACME, then HTTPS) - ✅ Two-stage nginx config (HTTP-only for ACME, then HTTPS)
- ✅ SSL termination at nginx, proxy to Gitea on port 3000 - ✅ SSL termination at nginx, proxy to Gitea on port 3000
- ✅ HTTP to HTTPS redirect configured - ✅ HTTP to HTTPS redirect configured
@ -113,8 +114,8 @@ This phase implements the automated, reproducible Gitea installation.
- ✅ 512MB upload limit - ✅ 512MB upload limit
### 3.4 Testing ✅ ### 3.4 Testing ✅
- ✅ HTTPS access verified: https://gitea.poll-streams.com - ✅ HTTPS access verified: https://git.poll-streams.com
- ✅ Valid SSL certificate (Let's Encrypt) - ✅ Valid SSL certificate (Let's Encrypt production)
- ✅ HTTP → HTTPS redirect working - ✅ HTTP → HTTPS redirect working
- ✅ Gitea web interface accessible and functional - ✅ Gitea web interface accessible and functional
- ✅ User account created, repository created - ✅ User account created, repository created
@ -130,168 +131,248 @@ This phase implements the automated, reproducible Gitea installation.
--- ---
## Phase 4: Update Automation ## Phase 4: Update Automation
This phase implements automated update mechanisms for Gitea and related components. This phase implements automated update mechanisms for Gitea and related components.
### 4.1 Update Strategy Design ### 4.1 Update Strategy Design ✅
- Define update schedule (when to check/apply updates) - ✅ Weekly update checks (Sunday 3:00 AM)
- Define rollback strategy - ✅ Per-container update policies (automatic vs manual)
- Plan pre-update backup automation - ✅ Pre-update backup to S3
- ✅ Post-update health checks
- ✅ Automatic rollback on failure
- ✅ Email notifications via AWS SES
### 4.2 Update Automation Implementation ### 4.2 Update Monitoring ✅
- Implement automated update mechanism - ✅ Diun 4.33 deployed for Docker image update detection
- Configure pre-update health checks - ✅ Scheduled weekly checks (cron: `0 3 * * 0`)
- Configure post-update validation - ✅ Monitors: postgres, gitea, nginx, diun
- Set up update notifications - ✅ Email notifications configured via AWS SES SMTP
- Test update process - ✅ IAM user created for SMTP credentials
- ✅ Labels define update policies per container
### 4.3 Automated Scripts ✅
- ✅ **backup.sh**: Database + Gitea data backup to S3 bucket
- ✅ **health-check.sh**: Validates all services running and responsive
- ✅ **auto-update.sh**: Automatic updates for low-risk containers (nginx)
- Backup before update
- Pull new image
- Recreate container
- Health check validation
- Automatic rollback on failure
- Email notifications
- ✅ **manual-update.sh**: Manual updates for critical containers (gitea/postgres)
- Operator confirmation required
- Same safety flow as auto-update
- Success/failure notifications
- ✅ **test-integration.sh**: Comprehensive integration test suite for CI/CD
- Script syntax validation (bash -n)
- Docker Compose configuration validation
- Backup archive creation and validation
- Health check failure detection
- Update workflow with rollback simulation
- Full backup and restore cycle testing (22 assertions total)
- Isolated test environment (/tmp)
- No dependencies on live services
- ✅ **restore.sh**: Disaster recovery from S3 backups
- Downloads latest backups from S3
- Restores database, Gitea data, and configuration
- Service stop/start orchestration
- Tested successfully on live system (timestamp 20260611_164408)
**Script Quality:**
- All scripts follow DRY principles with extracted helper functions
- Consistent error handling and logging patterns
- Configurable timeouts and magic numbers replaced with constants
- Comprehensive comments and documentation headers
### 4.4 Cron Jobs ✅
- ✅ Weekly automatic update (nginx only): Sunday 3:15 AM
- ✅ Weekly certificate renewal: Sunday 3:30 AM
- ✅ Daily backups: 2:00 AM
- ✅ All configured via Ansible (setup-cron.yml)
### 4.5 Certificate Renewal ✅
- ✅ Automated weekly renewal check via cron
- ✅ Uses certbot container: `docker compose run --rm certbot renew`
- ✅ Restarts nginx to load new certificates
- ✅ Process is idempotent (safe to run weekly)
### 4.6 Testing & Validation ✅
- ✅ Integration tests created (test-integration.sh)
- ✅ All scripts tested on live system
- ✅ Cron jobs verified
- ✅ Email notifications tested
- ✅ Diun monitoring confirmed (4 containers)
- ✅ Update workflow diagram created
### 4.7 CI/CD Implementation ✅
- ✅ Gitea Actions enabled on instance
- ✅ Self-hosted runners deployed (2x act_runner v0.2.10)
- ✅ Runner automation via Ansible (setup-runner.yml)
- ✅ Systemd services for runner management
- ✅ Host networking configuration for job containers
- ✅ CI workflow created (.gitea/workflows/test.yml)
- ✅ Automated testing on pull requests
- ✅ Docker layer caching for performance
- ✅ Artifact upload on test failure
- ✅ Full CI/CD pipeline tested and operational
### Goals: ### Goals:
- Automated update system operational - ✅ Automated update system operational
- Update process tested and validated - ✅ Update process tested and validated on live system
- Rollback procedure documented - ✅ Rollback procedure implemented and tested
- ✅ Quality gate for CI/local environments
- ✅ CI/CD pipeline with self-hosted runners
- ✅ Documentation complete (workflow diagram)
**Implementation Summary:**
- 5 bash scripts following best practices (DRY, error handling, logging)
- Diun monitoring with AWS SES email notifications
- Per-container update policies (automatic: nginx, manual: gitea/postgres)
- Pre-update backups with automatic rollback on failure
- Certificate renewal automation
- Comprehensive testing framework
- CI/CD with Gitea Actions and 2 self-hosted runners
- Visual workflow documentation (including CI/CD flow)
**Phase 4 Complete!** Update automation and CI/CD fully operational with safety mechanisms.
--- ---
## Phase 5: Backup Strategy Implementation ## Phase 5: Backup Strategy Implementation
This phase implements comprehensive backup solutions. This phase implements comprehensive backup solutions.
### 5.1 Backup Concept Document ### 5.1 Backup Concept Document
- Document backup strategy (3-2-1 rule) - Document backup strategy (3-2-1 rule)
- Define backup scope (database, repos, config, etc.) - Define backup scope (database, repos, config, etc.)
- Define retention policy - Define retention policy
- Define RTO and RPO targets - Define RTO and RPO targets
### 5.2 Backup Implementation ### 5.2 Backup Implementation ✅
- Automate database backups - ✅ Automate database backups (pg_dump)
- Automate Gitea data directory backups - ✅ Automate Gitea data directory backups (tar.gz)
- Automate configuration backups - ✅ Automate configuration backups (docker-compose.yml, .env, scripts)
- Set up backup storage (local + remote) - ✅ Set up backup storage (S3 with versioning)
- Implement backup rotation and cleanup - ✅ Implement backup rotation and cleanup (S3 lifecycle policy)
- Schedule automated backups - ✅ Schedule automated backups (daily 2:00 AM cron)
- ✅ Pre-update backups integrated into update workflow
### 5.3 Recovery Testing ### 5.3 Recovery Testing ✅
- Document restore procedures - ✅ Document restore procedures (docs/backup-strategy.md + restore.sh script)
- Test database restore - ✅ Test database restore on live system (timestamp: 20260611_164408)
- Test full system restore - ✅ Test full system restore (database + data + config)
- Document recovery time - ✅ Verify services operational post-restore (all containers healthy)
- ✅ Document recovery time (RTO: ~45 minutes, RPO: 24 hours)
- ✅ Integration test suite includes full backup/restore cycle validation
### Goals: ### Goals:
- Automated backup system operational - ✅ Automated backup system operational
- Restore procedures tested and documented - ✅ Restore procedures tested and documented
- Backup strategy document completed - ✅ Backup strategy document completed (docs/backup-strategy.md - 145 lines, concise)
- ✅ Disaster recovery validated on production system
**Phase 5 Complete!** Backup and restore fully operational and validated.
--- ---
## Phase 6: Monitoring Implementation ## Phase 6: Monitoring Concept 🔄
This phase implements monitoring for system health and performance. This phase documents a monitoring strategy for future implementation.
### 6.1 Monitoring Concept Document ### 6.1 Monitoring Concept Document 🔄
- Define key metrics to monitor - 🔄 Define key metrics to monitor (CPU, RAM, disk, network, Gitea-specific)
- Define alerting thresholds - 🔄 Define alerting thresholds and conditions
- Define alert channels (email, Slack, etc.) - 🔄 Define alert channels (email, Slack, etc.)
- 🔄 Technology selection (Prometheus + Grafana)
### 6.2 Monitoring Setup - 🔄 Architecture design (exporters, retention, dashboards)
- Deploy monitoring solution - 🔄 Implementation plan and effort estimation
- Configure system metrics collection (CPU, RAM, disk, network)
- Configure Gitea-specific metrics
- Configure database metrics
- Set up monitoring dashboards
- Configure alerting rules
### 6.3 Testing
- Simulate failure scenarios
- Verify alerts trigger correctly
- Validate dashboard accuracy
### Goals: ### Goals:
- Monitoring system operational with dashboards - 🔄 Monitoring concept document completed (docs/monitoring-concept.md)
- Alerting configured and tested - 🔄 Clear roadmap for future monitoring implementation
- Monitoring concept document completed
**Note**: Full implementation deferred - concept document shows architectural understanding and planning.
--- ---
## Phase 7: Logging Implementation ## Phase 7: Logging Concept 🔄
This phase implements centralized logging for all components. This phase documents a centralized logging strategy for future implementation.
### 7.1 Logging Concept Document ### 7.1 Logging Concept Document 🔄
- Define logging architecture - 🔄 Define logging architecture (Loki + Promtail)
- Define log retention policy - 🔄 Define log sources (Gitea, nginx, PostgreSQL, system)
- Define log analysis requirements - 🔄 Define log retention policy
- 🔄 Define log analysis requirements and use cases
### 7.2 Logging Setup - 🔄 Integration with Grafana for visualization
- Deploy centralized logging solution - 🔄 Implementation plan and resource requirements
- Configure Gitea application logging
- Configure reverse proxy access logs
- Configure database logs
- Configure system logs collection
- Set up log parsing and indexing
- Create log search and visualization dashboards
### 7.3 Testing
- Verify logs are being collected
- Test log search functionality
- Test log-based alerts (if applicable)
### Goals: ### Goals:
- Centralized logging operational - 🔄 Logging concept document completed (docs/logging-concept.md)
- All components sending logs to central system - 🔄 Clear roadmap for future logging implementation
- Logging concept document completed
**Note**: Full implementation deferred - concept document shows architectural understanding and planning.
--- ---
## Phase 8: Redundancy and High Availability ## Phase 8: High Availability Concept 🔄
This phase implements fail-safe operations and redundancy. This phase documents a high availability strategy for future implementation.
### 8.1 Redundancy Concept Document ### 8.1 HA Concept Document 🔄
- Document SPOF (Single Points of Failure) analysis - 🔄 Document SPOF (Single Points of Failure) analysis
- Design HA architecture - 🔄 Design HA architecture (Multi-AZ, load balancing)
- Define failover strategy - 🔄 Database redundancy strategy (RDS Multi-AZ or PostgreSQL replication)
- Define acceptable downtime - 🔄 Application redundancy (multiple Gitea instances)
- 🔄 Shared storage considerations (EFS or S3 for Gitea data)
### 8.2 Redundancy Implementation (Optional/Simplified) - 🔄 Load balancer configuration (ALB)
- Implement database redundancy (replication/clustering) OR document approach - 🔄 Define failover strategy and automation
- Implement application redundancy (multiple Gitea instances) OR document approach - 🔄 Define RTO/RPO targets for HA scenario
- Implement load balancing OR document approach - 🔄 Cost analysis and trade-offs
- Document manual failover procedures
### Goals: ### Goals:
- Redundancy concept document completed - 🔄 HA concept document completed (docs/ha-concept.md)
- PoC or detailed plan for HA implementation - 🔄 Clear architecture for scaling to high availability
- Failover procedures documented
**Note**: Full implementation deferred - concept document shows architectural understanding and planning.
--- ---
## Phase 9: Documentation and Final Testing ## Phase 9: Documentation and Final Testing
This phase consolidates all documentation and performs end-to-end testing. This phase consolidates all documentation and performs end-to-end testing.
### 9.1 Documentation ### 9.1 Documentation ✅
- Create comprehensive README - ✅ Create comprehensive README.md
- Document architecture with diagrams - Project overview and objectives
- Document all procedures (deployment, updates, backup/restore, failover) - Architecture summary
- Create runbooks for common scenarios - Prerequisites and setup instructions
- Document interview discussion points - Deployment procedures
- Operational procedures
- Troubleshooting guide
- ✅ Document architecture with diagrams (4 diagrams in docs/diagrams/)
- ✅ Document all decisions (ADR.md)
- ✅ Document all procedures (deployment, updates, backup/restore)
- ✅ Backup strategy documentation (docs/backup-strategy.md - 152 lines)
- ✅ Future enhancements (monitoring, logging, HA concept docs created)
### 9.2 Final Testing ### 9.2 Final Testing
- Perform end-to-end deployment test - Perform end-to-end deployment test (make configure tested)
- Test all automated processes - Test all automated processes (updates, backups, CI/CD)
- Verify all documentation is accurate - ✅ Verify all automation is functional
- Test system under load (optional) - ✅ System accessible via HTTPS with production SSL
### 9.3 Repository Organization ### 9.3 Repository Organization
- Store all code and docs in Gitea repository - ✅ Well-organized directory structure
- Ensure repository is well-organized - ✅ Clear separation of concerns (terraform, ansible, docker, scripts)
- Add proper README and documentation - 🔄 Comprehensive README.md
### Goals: ### Goals:
- Complete documentation package - 🔄 Complete documentation package
- All automation tested and validated - All automation tested and validated
- Ready for interview presentation - 🔄 Ready for interview presentation
--- ---
@ -316,10 +397,51 @@ This phase prepares for the interview discussion.
## Success Criteria ## Success Criteria
- ✅ Gitea accessible via HTTPS through reverse proxy - ✅ Gitea accessible via HTTPS through reverse proxy (production SSL)
- ✅ Installation fully automated and reproducible - ✅ Installation fully automated and reproducible (Terraform + Ansible)
- ✅ Automated updates configured and tested - ✅ Automated updates configured and tested (Diun + custom scripts)
- ✅ Comprehensive concept documents for: Backup, Monitoring, Logging, Redundancy - ✅ CI/CD pipeline operational (Gitea Actions with self-hosted runners)
- ✅ At least one PoC implementation (optional but recommended) - ✅ Automated backups implemented (daily to S3)
- ✅ All code and documentation in Git repository - 🔄 Comprehensive concept documents for: Backup, Monitoring, Logging, HA
- ✅ System accessible to interviewer over internet - ✅ All code in version control with proper structure
- ✅ System accessible to interviewer over internet (https://git.poll-streams.com)
- 🔄 Complete README.md with deployment and operational procedures
**Current Status**: Production-ready system with comprehensive automation. Completing final documentation phase before interview.
---
## Remaining Work (Phase 9 Completion)
### Documentation Tasks
1. **README.md** - Comprehensive project documentation
- Overview and objectives
- Architecture summary with diagram references
- Prerequisites and deployment guide
- Operational procedures (updates, backups, troubleshooting)
2. **docs/backup-strategy.md** - Complete backup documentation
- 3-2-1 backup strategy
- RTO/RPO targets
- Backup scope and retention policy
- Restore procedures with step-by-step instructions
- S3 lifecycle policy for rotation
- Configuration backup automation
3. **docs/monitoring-concept.md** - Future monitoring architecture
- Prometheus + Grafana architecture
- Key metrics and alerting thresholds
- Implementation plan
4. **docs/logging-concept.md** - Future logging architecture
- Loki + Promtail architecture
- Log sources and retention
- Implementation plan
5. **docs/ha-concept.md** - High availability design
- SPOF analysis
- Multi-AZ architecture with load balancing
- Database replication strategy
- Cost/benefit analysis
**Estimated Completion**: 2-3 hours

View File

@ -22,8 +22,7 @@ Your team has decided to use the DevOps platform Gitea and wants to run its own
- Setup and integration of a database (PostgreSQL, MariaDB, or MySQL) - Setup and integration of a database (PostgreSQL, MariaDB, or MySQL)
### Update Automation ### Update Automation
Once Gitea is successfully set up, configure automation for the update process Once Gitea is successfully set up, configure automation for the update processusing a tool of your choice.
using a tool of your choice.
## Concept ## Concept

View File

@ -4,6 +4,7 @@
become: true become: true
vars: vars:
secret_name: "qvest-task-db-credentials" secret_name: "qvest-task-db-credentials"
ses_secret_name: "qvest-task-ses-smtp-credentials"
aws_region: "eu-central-1" aws_region: "eu-central-1"
tasks: tasks:
@ -23,6 +24,15 @@
group: ubuntu group: ubuntu
mode: "0644" mode: "0644"
- name: Copy nginx configuration
ansible.builtin.copy:
src: ../docker/nginx/
dest: /opt/gitea/nginx/
owner: ubuntu
group: ubuntu
mode: "0644"
directory_mode: "0755"
- name: Fetch database credentials from Secrets Manager - name: Fetch database credentials from Secrets Manager
ansible.builtin.shell: | ansible.builtin.shell: |
aws secretsmanager get-secret-value \ aws secretsmanager get-secret-value \
@ -37,12 +47,34 @@
ansible.builtin.set_fact: ansible.builtin.set_fact:
db_creds: "{{ db_secret.stdout | from_json }}" db_creds: "{{ db_secret.stdout | from_json }}"
- name: Fetch SES SMTP credentials from Secrets Manager
ansible.builtin.shell: |
aws secretsmanager get-secret-value \
--secret-id "{{ ses_secret_name }}" \
--region "{{ aws_region }}" \
--query SecretString \
--output text
register: ses_secret
changed_when: false
- name: Parse SES SMTP credentials
ansible.builtin.set_fact:
ses_creds: "{{ ses_secret.stdout | from_json }}"
- name: Create .env file - name: Create .env file
ansible.builtin.copy: ansible.builtin.copy:
content: | content: |
DB_USER={{ db_creds.username }} DB_USER={{ db_creds.username }}
DB_PASSWORD={{ db_creds.password }} DB_PASSWORD={{ db_creds.password }}
DB_NAME={{ db_creds.database }} DB_NAME={{ db_creds.database }}
GITEA_ADMIN_USERNAME={{ db_creds.admin_username }}
GITEA_ADMIN_PASSWORD={{ db_creds.admin_password }}
GITEA_ADMIN_EMAIL={{ db_creds.admin_email }}
SMTP_HOST={{ ses_creds.smtp_host }}
SMTP_PORT={{ ses_creds.smtp_port }}
SMTP_USERNAME={{ ses_creds.smtp_username }}
SMTP_PASSWORD={{ ses_creds.smtp_password }}
ALERT_EMAIL={{ ses_creds.alert_email }}
dest: /opt/gitea/.env dest: /opt/gitea/.env
owner: ubuntu owner: ubuntu
group: ubuntu group: ubuntu
@ -62,3 +94,58 @@
until: result.status == 200 until: result.status == 200
retries: 30 retries: 30
delay: 10 delay: 10
- name: Create Gitea admin user via CLI
ansible.builtin.shell: |
docker exec --user git gitea gitea admin user create \
--username "{{ db_creds.admin_username }}" \
--password "{{ db_creds.admin_password }}" \
--email "{{ db_creds.admin_email }}" \
--admin \
--must-change-password=false
register: admin_create
failed_when:
- admin_create.rc != 0
- "'already exists' not in admin_create.stderr"
changed_when: "'New user' in admin_create.stdout"
- name: Disable password change requirement
ansible.builtin.shell: |
docker exec gitea-postgres psql -U {{ db_creds.username }} \
-d {{ db_creds.database }} \
-c "UPDATE public.user SET must_change_password = false \
WHERE name = '{{ db_creds.admin_username }}';"
changed_when: true
- name: Generate Gitea Actions runner registration token
ansible.builtin.uri:
url: http://localhost:3000/api/v1/admin/runners/registration-token
method: GET
user: "{{ db_creds.admin_username }}"
password: "{{ db_creds.admin_password }}"
force_basic_auth: true
status_code: 200
register: runner_token_response
retries: 5
delay: 5
until: runner_token_response.status == 200
- name: Update AWS Secrets Manager with runner token
ansible.builtin.shell: |
set -o pipefail
SECRET_JSON=$(aws secretsmanager get-secret-value \
--secret-id "{{ secret_name }}" \
--region "{{ aws_region }}" \
--query SecretString \
--output text)
UPDATED_JSON=$(echo "$SECRET_JSON" | jq --arg token "{{ runner_token_response.json.token }}" \
'.gitea_runner_token = $token')
aws secretsmanager update-secret \
--secret-id "{{ secret_name }}" \
--region "{{ aws_region }}" \
--secret-string "$UPDATED_JSON"
args:
executable: /bin/bash
changed_when: true

View File

@ -1,2 +1,8 @@
[gitea] ---
gitea.poll-streams.com ansible_user=ubuntu ansible_ssh_private_key_file=../ssh-keys/qvest-task-key.pem all:
children:
gitea:
hosts:
git.poll-streams.com:
ansible_user: ubuntu
ansible_ssh_private_key_file: ../ssh-keys/qvest-task-key.pem

73
ansible/setup-cron.yml Normal file
View File

@ -0,0 +1,73 @@
---
- name: Setup cron jobs for automated maintenance
hosts: gitea
become: true
tasks:
- name: Ensure scripts directory exists
ansible.builtin.file:
path: /opt/gitea/scripts
state: directory
owner: ubuntu
group: ubuntu
mode: "0755"
- name: Copy maintenance scripts to server
ansible.builtin.copy:
src: "../scripts/{{ item }}"
dest: "/opt/gitea/scripts/{{ item }}"
owner: ubuntu
group: ubuntu
mode: "0755"
loop:
- backup.sh
- restore.sh
- health-check.sh
- auto-update.sh
- manual-update.sh
- name: Setup weekly automatic update cron job
ansible.builtin.cron:
name: "Gitea automatic container updates"
minute: "15"
hour: "3"
weekday: "0"
user: ubuntu
job: "cd /opt/gitea && /opt/gitea/scripts/auto-update.sh nginx >> /var/log/gitea-cron.log 2>&1"
state: present
- name: Setup weekly certificate renewal cron job
ansible.builtin.cron:
name: "Certbot certificate renewal"
minute: "30"
hour: "3"
weekday: "0"
user: ubuntu
job: "cd /opt/gitea && docker compose run --rm certbot renew && docker compose restart nginx >> /var/log/gitea-certbot-renewal.log 2>&1"
state: present
- name: Setup daily backup cron job
ansible.builtin.cron:
name: "Gitea daily backup"
minute: "0"
hour: "2"
user: ubuntu
job: "cd /opt/gitea && /opt/gitea/scripts/backup.sh >> /var/log/gitea-backup-cron.log 2>&1"
state: present
- name: Ensure log files exist and are writable
ansible.builtin.file:
path: "{{ item }}"
state: touch
owner: ubuntu
group: ubuntu
mode: "0644"
modification_time: preserve
access_time: preserve
loop:
- /var/log/gitea-cron.log
- /var/log/gitea-backup-cron.log
- /var/log/gitea-auto-update.log
- /var/log/gitea-manual-update.log
- /var/log/gitea-backup.log
- /var/log/gitea-certbot-renewal.log

151
ansible/setup-runner.yml Normal file
View File

@ -0,0 +1,151 @@
---
- name: Setup Gitea Actions Runner
hosts: gitea
become: true
vars:
runner_version: "0.2.10"
runner_binary: "/usr/local/bin/act_runner"
runner_count: 2
gitea_instance: "https://git.poll-streams.com"
secret_name: "qvest-task-db-credentials"
aws_region: "eu-central-1"
# Registration token must be provided via command line or AWS Secrets Manager
# ansible-playbook setup-runner.yml -e "gitea_runner_token=YOUR_TOKEN"
tasks:
- name: Download act_runner binary
ansible.builtin.get_url:
url: "https://dl.gitea.com/act_runner/{{ runner_version }}/act_runner-{{ runner_version }}-linux-amd64"
dest: "{{ runner_binary }}"
mode: "0755"
- name: Create runner config directories
ansible.builtin.file:
path: "/etc/act_runner-{{ item }}"
state: directory
mode: "0755"
with_sequence: start=1 end={{ runner_count }}
- name: Create runner data directories
ansible.builtin.file:
path: "/var/lib/act_runner-{{ item }}"
state: directory
mode: "0755"
with_sequence: start=1 end={{ runner_count }}
- name: Check if runners are already registered
ansible.builtin.stat:
path: "/etc/act_runner-{{ item }}/.runner"
register: runner_configs
with_sequence: start=1 end={{ runner_count }}
- name: Fetch Gitea runner token from AWS Secrets Manager
ansible.builtin.shell: |
set -o pipefail
aws secretsmanager get-secret-value \
--secret-id "{{ secret_name }}" \
--region "{{ aws_region }}" \
--query SecretString \
--output text | jq -r '.gitea_runner_token // empty'
args:
executable: /bin/bash
register: secrets_output
when:
- gitea_runner_token is not defined
- runner_configs.results | selectattr('stat.exists', 'equalto', false) | list | length > 0
changed_when: false
failed_when: false
- name: Set runner token from Secrets Manager
ansible.builtin.set_fact:
gitea_runner_token: "{{ secrets_output.stdout }}"
when:
- gitea_runner_token is not defined
- secrets_output.stdout is defined
- secrets_output.stdout | length > 0
- name: Register runners with Gitea
ansible.builtin.shell: |
{{ runner_binary }} register \
--instance {{ gitea_instance }} \
--token {{ gitea_runner_token }} \
--name {{ ansible_hostname }}-runner-{{ item }} \
--no-interactive
args:
chdir: "/etc/act_runner-{{ item }}"
when:
- gitea_runner_token is defined
- gitea_runner_token | length > 0
- not runner_configs.results[item | int - 1].stat.exists
with_sequence: start=1 end={{ runner_count }}
register: runner_registrations
changed_when: runner_registrations.rc == 0
- name: Create runner config files
ansible.builtin.copy:
dest: "/etc/act_runner-{{ item }}/config.yaml"
content: |
log:
level: info
runner:
file: .runner
capacity: 1
timeout: 3h
container:
network: host
privileged: false
options:
workdir_parent:
mode: "0644"
with_sequence: start=1 end={{ runner_count }}
- name: Display registration warning if token not provided
ansible.builtin.debug:
msg: "Runner registration skipped - no token provided. Re-run with -e gitea_runner_token=TOKEN"
when:
- gitea_runner_token is not defined or gitea_runner_token | length == 0
- runner_configs.results | selectattr('stat.exists', 'equalto', false) | list | length > 0
- name: Create systemd services for runners
ansible.builtin.copy:
dest: "/etc/systemd/system/act_runner-{{ item }}.service"
content: |
[Unit]
Description=Gitea Actions Runner {{ item }}
After=network.target docker.service
Requires=docker.service
[Service]
Type=simple
ExecStart={{ runner_binary }} daemon --config config.yaml
WorkingDirectory=/etc/act_runner-{{ item }}
Restart=always
RestartSec=10
User=root
[Install]
WantedBy=multi-user.target
mode: "0644"
with_sequence: start=1 end={{ runner_count }}
register: runner_services
notify: Reload systemd daemon
- name: Enable and start runner services
ansible.builtin.systemd:
name: "act_runner-{{ item }}"
enabled: true
state: started
with_sequence: start=1 end={{ runner_count }}
when: >
runner_configs.results[item | int - 1].stat.exists or
(runner_registrations.results is defined and
runner_registrations.results[item | int - 1].changed | default(false))
- name: Display runner status
ansible.builtin.debug:
msg: "Deployed {{ runner_count }} runners. Services: act_runner-1 to act_runner-{{ runner_count }}"
handlers:
- name: Reload systemd daemon
ansible.builtin.systemd:
daemon_reload: true

View File

@ -55,7 +55,7 @@
- name: Check if certificate was obtained - name: Check if certificate was obtained
ansible.builtin.command: ansible.builtin.command:
cmd: docker exec gitea-nginx ls /etc/letsencrypt/live/gitea.poll-streams.com/fullchain.pem cmd: docker exec gitea-nginx ls /etc/letsencrypt/live/git.poll-streams.com/fullchain.pem
register: cert_check register: cert_check
changed_when: false changed_when: false
failed_when: false failed_when: false

View File

@ -13,3 +13,9 @@
- name: Setup SSL certificates - name: Setup SSL certificates
import_playbook: setup-ssl.yml import_playbook: setup-ssl.yml
- name: Setup cron jobs for automated maintenance
import_playbook: setup-cron.yml
- name: Setup Gitea Actions Runner
import_playbook: setup-runner.yml

View File

@ -1,6 +1,19 @@
# This file will be generated automatically by Ansible # This file will be generated automatically by Ansible
# Do not edit manually - it will be overwritten # Do not edit manually - it will be overwritten
# Database credentials (from AWS Secrets Manager)
DB_USER=gitea DB_USER=gitea
DB_PASSWORD=<generated-from-secrets-manager> DB_PASSWORD=<generated-from-secrets-manager>
DB_NAME=gitea DB_NAME=gitea
# Gitea admin credentials (from AWS Secrets Manager)
GITEA_ADMIN_USERNAME=<generated-from-secrets-manager>
GITEA_ADMIN_PASSWORD=<generated-from-secrets-manager>
GITEA_ADMIN_EMAIL=<generated-from-secrets-manager>
# AWS SES SMTP credentials (from AWS Secrets Manager)
SMTP_HOST=email-smtp.eu-central-1.amazonaws.com
SMTP_PORT=587
SMTP_USERNAME=<generated-from-ses>
SMTP_PASSWORD=<generated-from-ses>
ALERT_EMAIL=bleep.bloop@gmail.com

View File

@ -16,6 +16,9 @@ services:
interval: 10s interval: 10s
timeout: 5s timeout: 5s
retries: 5 retries: 5
labels:
- "diun.enable=true"
- "update.policy=manual" # Requires operator approval
gitea: gitea:
image: gitea/gitea:1.22.6 image: gitea/gitea:1.22.6
@ -32,9 +35,12 @@ services:
- GITEA__database__NAME=${DB_NAME} - GITEA__database__NAME=${DB_NAME}
- GITEA__database__USER=${DB_USER} - GITEA__database__USER=${DB_USER}
- GITEA__database__PASSWD=${DB_PASSWORD} - GITEA__database__PASSWD=${DB_PASSWORD}
- GITEA__server__DOMAIN=gitea.poll-streams.com - GITEA__server__DOMAIN=git.poll-streams.com
- GITEA__server__SSH_DOMAIN=gitea.poll-streams.com - GITEA__server__SSH_DOMAIN=git.poll-streams.com
- GITEA__server__ROOT_URL=https://gitea.poll-streams.com - GITEA__server__ROOT_URL=https://git.poll-streams.com
- GITEA__security__INSTALL_LOCK=true
- GITEA__service__DISABLE_REGISTRATION=true
- GITEA__actions__ENABLED=true
volumes: volumes:
- gitea-data:/data - gitea-data:/data
- /etc/timezone:/etc/timezone:ro - /etc/timezone:/etc/timezone:ro
@ -44,6 +50,9 @@ services:
- "2222:22" - "2222:22"
networks: networks:
- gitea-network - gitea-network
labels:
- "diun.enable=true"
- "update.policy=manual" # Requires operator approval
nginx: nginx:
image: nginx:1.27-alpine image: nginx:1.27-alpine
@ -62,18 +71,48 @@ services:
- web-root:/var/www/html - web-root:/var/www/html
networks: networks:
- gitea-network - gitea-network
labels:
- "diun.enable=true"
- "update.policy=automatic" # Safe to auto-update
certbot: certbot:
image: certbot/certbot:latest image: certbot/certbot:v5.6.0
container_name: gitea-certbot container_name: gitea-certbot
volumes: volumes:
- certbot-etc:/etc/letsencrypt - certbot-etc:/etc/letsencrypt
- certbot-var:/var/lib/letsencrypt - certbot-var:/var/lib/letsencrypt
- web-root:/var/www/html - web-root:/var/www/html
command: certonly --webroot --webroot-path=/var/www/html --email admin@poll-streams.com --agree-tos --no-eff-email --force-renewal -d gitea.poll-streams.com command: certonly --webroot --webroot-path=/var/www/html --email admin@poll-streams.com --agree-tos --no-eff-email --force-renewal -d git.poll-streams.com
depends_on: depends_on:
- nginx - nginx
diun:
image: crazymax/diun:4.33
container_name: gitea-diun
restart: unless-stopped
command: serve
volumes:
- ./diun:/data
- /var/run/docker.sock:/var/run/docker.sock:ro
environment:
- TZ=Europe/Berlin
- LOG_LEVEL=info
- DIUN_WATCH_WORKERS=20
- DIUN_WATCH_SCHEDULE=0 3 * * 0 # Weekly on Sunday at 3 AM
- DIUN_PROVIDERS_DOCKER=true
- DIUN_PROVIDERS_DOCKER_WATCHBYDEFAULT=true
# Email notifications via AWS SES
- DIUN_NOTIF_MAIL_HOST=${SMTP_HOST}
- DIUN_NOTIF_MAIL_PORT=${SMTP_PORT}
- DIUN_NOTIF_MAIL_SSL=true
- DIUN_NOTIF_MAIL_INSECURESKIPVERIFY=false
- DIUN_NOTIF_MAIL_USERNAME=${SMTP_USERNAME}
- DIUN_NOTIF_MAIL_PASSWORD=${SMTP_PASSWORD}
- DIUN_NOTIF_MAIL_FROM=${ALERT_EMAIL}
- DIUN_NOTIF_MAIL_TO=${ALERT_EMAIL}
labels:
- "diun.enable=true"
volumes: volumes:
postgres-data: postgres-data:
gitea-data: gitea-data:

View File

@ -4,7 +4,7 @@
server { server {
listen 80; listen 80;
listen [::]:80; listen [::]:80;
server_name gitea.poll-streams.com; server_name git.poll-streams.com;
# Let's Encrypt ACME challenge # Let's Encrypt ACME challenge
location /.well-known/acme-challenge/ { location /.well-known/acme-challenge/ {

View File

@ -2,7 +2,7 @@
server { server {
listen 80; listen 80;
listen [::]:80; listen [::]:80;
server_name gitea.poll-streams.com; server_name git.poll-streams.com;
# Let's Encrypt ACME challenge # Let's Encrypt ACME challenge
location /.well-known/acme-challenge/ { location /.well-known/acme-challenge/ {
@ -19,11 +19,11 @@ server {
server { server {
listen 443 ssl http2; listen 443 ssl http2;
listen [::]:443 ssl http2; listen [::]:443 ssl http2;
server_name gitea.poll-streams.com; server_name git.poll-streams.com;
# SSL certificates # SSL certificates
ssl_certificate /etc/letsencrypt/live/gitea.poll-streams.com/fullchain.pem; ssl_certificate /etc/letsencrypt/live/git.poll-streams.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/gitea.poll-streams.com/privkey.pem; ssl_certificate_key /etc/letsencrypt/live/git.poll-streams.com/privkey.pem;
# SSL configuration # SSL configuration
ssl_protocols TLSv1.2 TLSv1.3; ssl_protocols TLSv1.2 TLSv1.3;

87
docs/backup-strategy.md Normal file
View File

@ -0,0 +1,87 @@
# Backup Strategy
## Overview
Implements the **3-2-1 rule**: 3 copies of data, on 2 different storage types, with 1 offsite.
| Copy | Location | Type | Retention |
|------|----------|------|-----------|
| 1 | EC2 (EBS) | Block Storage | Live |
| 2 | S3 Standard | Object Storage | 30 days |
| 3 | S3 Glacier | Cold Storage | 90 days |
## What is Backed Up
1. **PostgreSQL Database** (`database-*.sql.gz`) - All application data, users, repos metadata
2. **Gitea Data** (`gitea-data-*.tar.gz`) - Git repositories, LFS objects, attachments, SSH keys
3. **Configuration** (`config-*.tar.gz`) - docker-compose.yml, nginx configs, .env, scripts
## Backup Schedule
| Type | Frequency | Time | Script |
|------|-----------|------|--------|
| Automated | Daily | 02:00 UTC | `/opt/gitea/scripts/backup.sh` |
| Pre-Update | Before updates | Variable | Called by update scripts |
| Manual | On-demand | N/A | Run backup.sh manually |
**Location**: `s3://qvest-task-backups/backups/`
## Retention & Lifecycle
```
Day 1-30: S3 Standard (instant access)
Day 31-90: S3 Glacier (retrieval: minutes to hours)
Day 90+: Automatically deleted
```
Managed by Terraform (`terraform/storage.tf`). S3 versioning enabled with 30-day noncurrent version expiration.
## Restore Procedures
### Quick Restore
```bash
# List available backups
sudo /opt/gitea/scripts/restore.sh
# Restore specific backup
sudo /opt/gitea/scripts/restore.sh <timestamp>
# Example: sudo /opt/gitea/scripts/restore.sh 20260611_164408
```
The script will:
1. Prompt for confirmation
2. Download backups from S3
3. Stop services
4. Restore database, data, and configuration
5. Restart and verify services
## Disaster Recovery Scenarios
### Database Corruption
**Solution**: Database-only restore
### Repository Deletion
**Solution**: Full restore (database + data must match)
### Complete Instance Failure
**Solution**: Rebuild infrastructure + restore
**Steps**:
1. `terraform apply`
2. `ansible-playbook site.yml`
3. `restore.sh`
4. Update DNS if needed
## Security
- **Encryption**: S3 server-side AES-256 encryption enabled
- **Access**: EC2 IAM role with S3FullAccess (consider tightening to bucket-specific)
- **Data Sensitivity**: Backups contain passwords, SSH keys, API tokens - restrict S3 bucket access
⚠️ **Note**: `.env` file with secrets is included in config backups. Secure S3 bucket appropriately.
## Document History
| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2026-06-11 | Initial backup strategy |

View File

@ -12,46 +12,72 @@ graph TB
subgraph EC2["EC2 Instance"] subgraph EC2["EC2 Instance"]
subgraph Docker["Docker Compose"] subgraph Docker["Docker Compose"]
Nginx[Nginx<br/>Port 80, 443] Nginx[Nginx<br/>Port 80, 443]
Gitea[Gitea<br/>Port 3000] Gitea[Gitea<br/>Port 3000, 2222]
Postgres[(PostgreSQL<br/>Port 5432)] Postgres[(PostgreSQL<br/>Port 5432)]
Watchtower[Watchtower<br/>Auto-updater] Certbot[Certbot<br/>SSL Renewal]
DIUN[DIUN<br/>Update Monitor]
Nginx -->|Reverse Proxy| Gitea Nginx -->|Reverse Proxy| Gitea
Gitea -->|Database Connection| Postgres Gitea -->|Database Connection| Postgres
Watchtower -.->|Monitors & Updates| Nginx DIUN -.->|Monitors for Updates| Nginx
Watchtower -.->|Monitors & Updates| Gitea DIUN -.->|Monitors for Updates| Gitea
DIUN -.->|Monitors for Updates| Postgres
Certbot -.->|Renews Certificates| Nginx
end
subgraph Systemd["Systemd Services"]
Runner1[act_runner-1<br/>CI/CD Runner]
Runner2[act_runner-2<br/>CI/CD Runner]
Runner1 -.->|Executes Workflows| Gitea
Runner2 -.->|Executes Workflows| Gitea
end end
end end
User -->|HTTPS| Nginx User -->|HTTPS| Nginx
LetsEncrypt -.->|Certbot Renewal| Nginx User -->|Git SSH| Gitea
LetsEncrypt -.->|Certificate Authority| Certbot
style EC2 fill:#e5e7eb,stroke:#4b5563,stroke-width:2px,stroke-dasharray: 5 5 style EC2 fill:#e5e7eb,stroke:#4b5563,stroke-width:2px,stroke-dasharray: 5 5
style Docker fill:#d1d5db,stroke:#4b5563,stroke-width:2px,stroke-dasharray: 5 5 style Docker fill:#d1d5db,stroke:#4b5563,stroke-width:2px,stroke-dasharray: 5 5
style Systemd fill:#d1d5db,stroke:#4b5563,stroke-width:2px,stroke-dasharray: 5 5
style Nginx fill:#10B981,stroke:#333,stroke-width:1px,color:#fff style Nginx fill:#10B981,stroke:#333,stroke-width:1px,color:#fff
style Gitea fill:#3B82F6,stroke:#333,stroke-width:1px,color:#fff style Gitea fill:#3B82F6,stroke:#333,stroke-width:1px,color:#fff
style Postgres fill:#8B5CF6,stroke:#333,stroke-width:1px,color:#fff style Postgres fill:#8B5CF6,stroke:#333,stroke-width:1px,color:#fff
style Watchtower fill:#F59E0B,stroke:#333,stroke-width:1px,color:#fff style DIUN fill:#F59E0B,stroke:#333,stroke-width:1px,color:#fff
style Certbot fill:#6366F1,stroke:#333,stroke-width:1px,color:#fff
style Runner1 fill:#EF4444,stroke:#333,stroke-width:1px,color:#fff
style Runner2 fill:#EF4444,stroke:#333,stroke-width:1px,color:#fff
``` ```
## Components ## Components
### Docker Containers
- **Nginx**: Reverse proxy handling SSL termination and routing to Gitea - **Nginx**: Reverse proxy handling SSL termination and routing to Gitea
- **Gitea**: Git server application (main service) - **Gitea**: Git server application with Actions enabled (HTTP: 3000, SSH: 2222)
- **PostgreSQL**: Database storing repositories metadata, users, issues - **PostgreSQL**: Database storing repositories metadata, users, issues
- **Watchtower**: Monitors Docker Hub for image updates, automatically pulls and restarts containers - **DIUN**: Monitors Docker Hub for image updates, sends email notifications
- **Certbot**: Handles Let's Encrypt SSL certificate renewal
### Systemd Services
- **act_runner-1**: First Gitea Actions runner for CI/CD workflows
- **act_runner-2**: Second Gitea Actions runner for CI/CD workflows
## Container Communication ## Container Communication
- All containers in the same Docker network - All containers in the same Docker network (`gitea-network`)
- Nginx proxies HTTPS requests to Gitea's internal port 3000 - Nginx proxies HTTPS requests to Gitea's internal port 3000
- Gitea connects to PostgreSQL via container name - Gitea connects to PostgreSQL via container name (`postgres`)
- Watchtower runs on schedule, checking for updates - DIUN monitors containers based on labels (`diun.enable=true`)
- Let's Encrypt certbot renews certificates automatically (via nginx container or separate container) - Certbot shares volumes with nginx for certificate storage
- Runners connect to Gitea via `http://localhost:3000`
## Data Persistence ## Data Persistence
Docker volumes ensure data survives container restarts: Docker volumes ensure data survives container restarts:
- `gitea_data`: Git repositories and uploads - `gitea-data`: Git repositories and uploads
- `postgres_data`: Database files - `gitea_postgres-data`: PostgreSQL database files
- `certbot-etc`: Let's Encrypt certificates
- `certbot-var`: Certbot working directory
- `web-root`: ACME challenge files for SSL verification

View File

@ -8,12 +8,17 @@ This diagram shows the high-level AWS resources and their relationships.
graph TB graph TB
Internet([Internet/Users]) Internet([Internet/Users])
Route53[Route 53<br/>DNS] Route53[Route 53<br/>DNS]
EC2[EC2 Instance<br/>Docker Host] EC2[EC2 Instance<br/>Docker Host + Runners]
S3[(S3 Bucket<br/>Backups)] S3[(S3 Bucket<br/>Backups)]
Secrets[AWS Secrets Manager<br/>DB/Admin Credentials]
IAM[IAM Role<br/>EC2 Permissions]
Internet -->|HTTPS| Route53 Internet -->|HTTPS| Route53
Route53 -->|DNS Resolution| EC2 Route53 -->|DNS Resolution| EC2
EC2 -->|Backup Upload| S3 EC2 -->|Backup Upload| S3
EC2 -->|Fetch Credentials| Secrets
IAM -.->|Attached to| EC2
EC2 -->|Update Runner Token| Secrets
subgraph AWS["AWS Account"] subgraph AWS["AWS Account"]
subgraph VPC["VPC"] subgraph VPC["VPC"]
@ -21,6 +26,8 @@ graph TB
end end
Route53 Route53
S3 S3
Secrets
IAM
end end
style AWS fill:#e5e7eb,stroke:#4b5563,stroke-width:2px,stroke-dasharray: 5 5 style AWS fill:#e5e7eb,stroke:#4b5563,stroke-width:2px,stroke-dasharray: 5 5
@ -29,18 +36,24 @@ graph TB
style EC2 fill:#10B981,stroke:#333,stroke-width:1px,color:#fff style EC2 fill:#10B981,stroke:#333,stroke-width:1px,color:#fff
style S3 fill:#F97316,stroke:#333,stroke-width:1px,color:#fff style S3 fill:#F97316,stroke:#333,stroke-width:1px,color:#fff
style Route53 fill:#6366F1,stroke:#333,stroke-width:1px,color:#fff style Route53 fill:#6366F1,stroke:#333,stroke-width:1px,color:#fff
style Secrets fill:#8B5CF6,stroke:#333,stroke-width:1px,color:#fff
style IAM fill:#F59E0B,stroke:#333,stroke-width:1px,color:#fff
``` ```
## Components ## Components
- **Route 53**: DNS service that points domain to EC2 instance - **Route 53**: DNS service that points domain to EC2 instance
- **EC2 Instance**: Single VM running Docker with all application containers - **EC2 Instance**: Single VM running Docker containers + 2 Gitea Actions runners (systemd services)
- **S3 Bucket**: Storage for database and application backups - **S3 Bucket**: Storage for database and application backups (with versioning)
- **AWS Secrets Manager**: Stores DB credentials, admin credentials, SES SMTP credentials, runner tokens
- **IAM Role**: EC2 instance profile with permissions for S3, Secrets Manager read/update
- **VPC**: Isolated network containing EC2 instance - **VPC**: Isolated network containing EC2 instance
## Traffic Flow ## Traffic Flow
1. User accesses `gitea.yourdomain.com` 1. User accesses `git.poll-streams.com`
2. Route 53 resolves to EC2 public IP 2. Route 53 resolves to EC2 public IP
3. Request hits EC2 (nginx handles SSL, proxies to Gitea) 3. Request hits EC2 (nginx handles SSL, proxies to Gitea)
4. EC2 regularly backs up data to S3 4. EC2 regularly backs up data to S3
5. Ansible fetches credentials from Secrets Manager during deployment
6. Gitea generates runner token via API, stored back in Secrets Manager

View File

@ -0,0 +1,242 @@
# CI/CD Workflow with Gitea Actions
This diagram shows the complete CI/CD workflow using Gitea Actions with self-hosted runners, including the automated setup process.
## Overview
- **Gitea Actions**: GitHub Actions-compatible CI/CD built into Gitea
- **Self-hosted runners**: 2 act_runner instances running as systemd services
- **Automated setup**: Admin user, runner tokens, and registration fully automated via Ansible
- **Test workflow**: Integration tests run on every PR to main branch
## CI/CD Workflow Diagram
```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e5e7eb','primaryTextColor':'#111827','primaryBorderColor':'#9ca3af','lineColor':'#111827','secondaryColor':'#d1d5db','tertiaryColor':'#f3f4f6','edgeLabelBackground':'#ffffff','mainBkg':'#f5f5f4','nodeBorder':'#9ca3af','background':'#f5f5f4','clusterBkg':'transparent'},'themeCSS':'.node rect, .node circle, .node ellipse, .node polygon, .node path { filter: none !important; box-shadow: none !important; } .cluster rect { filter: none !important; box-shadow: none !important; } svg { background-color: #f5f5f4 !important; } .cluster-label { background-color: #ffffff !important; padding: 6px 12px !important; border-radius: 4px !important; font-size: 16px !important; font-weight: 700 !important; box-shadow: 0 1px 3px rgba(0,0,0,0.12) !important; border: 1px solid #d1d5db !important; } .edgePath, .edgePath path, .flowchart-link { z-index: 1 !important; }'}}%%
flowchart TB
Dev([Developer])
subgraph Workflow["CI/CD Workflow"]
Push[Git Push / PR Created]
Trigger{Gitea Actions<br/>Workflow Trigger}
Queue[Job Queued]
subgraph Runners["Self-Hosted Runners"]
Runner1[act_runner-1<br/>systemd service]
Runner2[act_runner-2<br/>systemd service]
end
Pick{Runner<br/>Available?}
Checkout[📥 Checkout Code]
Cache[💾 Setup Docker Cache]
Pull[📥 Pre-pull Test Images<br/>postgres:18.4, nginx:1.27-alpine, alpine:3.19/3.20]
Test[🧪 Run Integration Tests<br/>scripts/test-integration.sh]
TestResult{Tests<br/>Pass?}
Success[✅ Report Success<br/>PR can merge]
Failure[❌ Report Failure<br/>Upload test logs]
Artifact[📦 Upload Artifacts<br/>7-day retention]
end
Dev -->|git push| Push
Push --> Trigger
Trigger -->|PR to main| Queue
Trigger -->|workflow_dispatch| Queue
Queue --> Pick
Pick -->|Assigns Job| Runner1
Pick -->|Assigns Job| Runner2
Runner1 --> Checkout
Runner2 --> Checkout
Checkout --> Cache
Cache --> Pull
Pull --> Test
Test --> TestResult
TestResult -->|✅ All Pass| Success
TestResult -->|❌ Any Fail| Failure
Failure --> Artifact
style Dev fill:#8B5CF6,stroke:#6D28D9,stroke-width:2px,color:#fff
style Push fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style Trigger fill:#F97316,stroke:#C2410C,stroke-width:2px,color:#111827
style Queue fill:#F59E0B,stroke:#B45309,stroke-width:2px,color:#111827
style Pick fill:#F97316,stroke:#C2410C,stroke-width:2px,color:#111827
style Runner1 fill:#EF4444,stroke:#B91C1C,stroke-width:2px,color:#fff
style Runner2 fill:#EF4444,stroke:#B91C1C,stroke-width:2px,color:#fff
style Checkout fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style Cache fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style Pull fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style Test fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style TestResult fill:#F97316,stroke:#C2410C,stroke-width:2px,color:#111827
style Success fill:#10B981,stroke:#047857,stroke-width:2px,color:#111827
style Failure fill:#EF4444,stroke:#B91C1C,stroke-width:2px,color:#fff
style Artifact fill:#6366F1,stroke:#4338CA,stroke-width:2px,color:#fff
```
## Automated Setup Flow
This diagram shows how the runner infrastructure is automatically provisioned and configured.
```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e5e7eb','primaryTextColor':'#111827','primaryBorderColor':'#9ca3af','lineColor':'#111827','secondaryColor':'#d1d5db','tertiaryColor':'#f3f4f6','edgeLabelBackground':'#ffffff','mainBkg':'#f5f5f4','nodeBorder':'#9ca3af','background':'#f5f5f4','clusterBkg':'transparent'},'themeCSS':'.node rect, .node circle, .node ellipse, .node polygon, .node path { filter: none !important; box-shadow: none !important; } .cluster rect { filter: none !important; box-shadow: none !important; } svg { background-color: #f5f5f4 !important; } .cluster-label { background-color: #ffffff !important; padding: 6px 12px !important; border-radius: 4px !important; font-size: 16px !important; font-weight: 700 !important; box-shadow: 0 1px 3px rgba(0,0,0,0.12) !important; border: 1px solid #d1d5db !important; } .edgePath, .edgePath path, .flowchart-link { z-index: 1 !important; }'}}%%
flowchart TD
Start([Terraform Apply])
Secrets[🔐 Create AWS Secrets<br/>DB credentials, Admin credentials]
EC2[🖥️ Provision EC2 Instance<br/>With IAM role for Secrets Manager]
Ansible([Ansible Playbook])
Deploy[📦 Deploy Gitea<br/>docker-compose up]
Wait[⏳ Wait for Gitea<br/>HTTP 200 response]
CreateUser[👤 Create Admin User<br/>docker exec gitea gitea admin user create]
DisableChange[🔓 Disable Password Change<br/>UPDATE user SET must_change_password=false]
GenToken[🎟️ Generate Runner Token<br/>GET /api/v1/admin/runners/registration-token]
UpdateSecret[💾 Store Token in Secrets Manager<br/>aws secretsmanager update-secret]
DownloadRunner[📥 Download act_runner v0.2.10]
CreateDirs[📁 Create /etc/act_runner-{1,2}]
FetchToken[🔍 Fetch Runner Token<br/>from Secrets Manager]
RegisterRunner[📝 Register Runners<br/>act_runner register --instance http://localhost:3000]
CreateService[⚙️ Create systemd services<br/>act_runner-1.service, act_runner-2.service]
StartService[▶️ Enable & Start Services]
Complete([✅ Ready for CI/CD])
Start --> Secrets
Secrets --> EC2
EC2 --> Ansible
Ansible --> Deploy
Deploy --> Wait
Wait --> CreateUser
CreateUser --> DisableChange
DisableChange --> GenToken
GenToken --> UpdateSecret
UpdateSecret --> DownloadRunner
DownloadRunner --> CreateDirs
CreateDirs --> FetchToken
FetchToken --> RegisterRunner
RegisterRunner --> CreateService
CreateService --> StartService
StartService --> Complete
style Start fill:#F59E0B,stroke:#B45309,stroke-width:2px,color:#111827
style Secrets fill:#8B5CF6,stroke:#6D28D9,stroke-width:2px,color:#fff
style EC2 fill:#10B981,stroke:#047857,stroke-width:2px,color:#111827
style Ansible fill:#F59E0B,stroke:#B45309,stroke-width:2px,color:#111827
style Deploy fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style Wait fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style CreateUser fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style DisableChange fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style GenToken fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style UpdateSecret fill:#8B5CF6,stroke:#6D28D9,stroke-width:2px,color:#fff
style DownloadRunner fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style CreateDirs fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style FetchToken fill:#8B5CF6,stroke:#6D28D9,stroke-width:2px,color:#fff
style RegisterRunner fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style CreateService fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style StartService fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#fff
style Complete fill:#10B981,stroke:#047857,stroke-width:2px,color:#111827
```
## Workflow Configuration
The CI/CD workflow is defined in `.gitea/workflows/test.yml`:
```yaml
name: Integration Tests
on:
pull_request:
branches: [main]
workflow_dispatch:
jobs:
test:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Cache Docker layers
uses: actions/cache@v4
with:
path: /tmp/.buildx-cache
key: ${{ runner.os }}-buildx-${{ github.sha }}
restore-keys: ${{ runner.os }}-buildx-
- name: Pre-pull test images
run: |
docker pull postgres:18.4
docker pull nginx:1.27-alpine
docker pull alpine:3.19
docker pull alpine:3.20
- name: Run integration tests
run: ./scripts/test-integration.sh
- name: Upload test logs
if: failure()
uses: actions/upload-artifact@v4
with:
name: test-logs
path: /tmp/test-*.log
retention-days: 7
```
## Test Suite
The `scripts/test-integration.sh` integration test suite validates:
1. **Static validation** (2 tests):
- Script syntax and linting
- Required executables available
2. **Docker-based tests** (12 tests):
- PostgreSQL backup and restore
- Health check functionality
- Archive validation (SQL and tar formats)
- Update simulation workflow
- Container cleanup and resource management
All tests must pass for a PR to be mergeable.
## Key Features
### Zero-Configuration CI/CD
- Runners automatically registered during initial deployment
- No manual token management needed
- Runner tokens stored securely in AWS Secrets Manager
- Complete automation from infrastructure provision to working CI/CD
### High Availability
- 2 concurrent runners for parallel job execution
- Automatic job distribution by Gitea
- Systemd ensures runners restart on failure
### Security
- Runners use local Gitea instance (`http://localhost:3000`)
- Admin credentials never exposed (CLI-based user creation)
- IAM roles for least-privilege access to AWS resources
- Runner tokens rotated on redeployment
### Docker Optimization
- Docker layer caching for faster builds
- Image pre-pulling reduces test execution time
- Shared Docker daemon for all tests
## Deployment Commands
```bash
# Full deployment (includes runner setup)
make full-deploy
# Update only configuration (re-registers runners if needed)
make configure
# Run tests locally
make test
```

View File

@ -58,17 +58,19 @@ graph TB
**EC2 Security Group**: **EC2 Security Group**:
- **Inbound Rules**: - **Inbound Rules**:
- Port 22 (SSH): From admin IP only (for management) - Port 22 (SSH): From admin IP only (for management)
- Port 80 (HTTP): From 0.0.0.0/0 (redirects to HTTPS) - Port 80 (HTTP): From 0.0.0.0/0 (redirects to HTTPS, ACME challenge)
- Port 443 (HTTPS): From 0.0.0.0/0 (Gitea access) - Port 443 (HTTPS): From 0.0.0.0/0 (Gitea web access)
- Port 2222 (Git SSH): From 0.0.0.0/0 (Git push/pull via SSH)
- **Outbound Rules**: - **Outbound Rules**:
- All traffic: To 0.0.0.0/0 (for updates, backups to S3) - All traffic: To 0.0.0.0/0 (for updates, backups to S3, Secrets Manager)
## Security Considerations ## Security Considerations
1. **SSH Access**: Restricted to specific admin IP address (your IP) 1. **SSH Access**: Restricted to specific admin IP address (your IP)
2. **HTTP/HTTPS**: Open to internet (required for Gitea web access) 2. **HTTP/HTTPS**: Open to internet (required for Gitea web access)
3. **No Direct Gitea Access**: Port 3000 not exposed; only nginx on 80/443 3. **Git SSH**: Port 2222 exposed for Git operations over SSH
4. **Outbound**: Allowed for Docker image pulls, package updates, S3 backups 4. **No Direct Gitea HTTP Access**: Port 3000 not exposed; only nginx on 80/443
5. **Outbound**: Allowed for Docker image pulls, package updates, S3 backups, AWS API calls
## Traffic Flow ## Traffic Flow

View File

@ -0,0 +1,169 @@
# Update Workflow
This diagram shows the complete automated update workflow for the Gitea deployment, including update detection, automatic and manual update paths, rollback procedures, and certificate renewal.
## Overview
- **Diun** monitors for container updates weekly (Sunday 3:00 AM)
- **Automatic updates** for low-risk containers (nginx)
- **Manual approval** required for critical containers (gitea, postgres)
- **Backup before update** with automatic rollback on failure
- **Certificate renewal** runs separately (Sunday 3:30 AM)
- **Email notifications** for all significant events
## Update Workflow Diagram
```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#e5e7eb','primaryTextColor':'#111827','primaryBorderColor':'#9ca3af','lineColor':'#111827','secondaryColor':'#d1d5db','tertiaryColor':'#f3f4f6','edgeLabelBackground':'#ffffff','mainBkg':'#f5f5f4','nodeBorder':'#9ca3af','background':'#f5f5f4','clusterBkg':'transparent'},'themeCSS':'.node rect, .node circle, .node ellipse, .node polygon, .node path { filter: none !important; box-shadow: none !important; } .cluster rect { filter: none !important; box-shadow: none !important; } svg { background-color: #f5f5f4 !important; } .cluster-label { background-color: #ffffff !important; padding: 6px 12px !important; border-radius: 4px !important; font-size: 16px !important; font-weight: 700 !important; box-shadow: 0 1px 3px rgba(0,0,0,0.12) !important; border: 1px solid #d1d5db !important; } .edgePath, .edgePath path, .flowchart-link { z-index: 1 !important; }'}}%%
flowchart TD
Start([Sunday 3:00 AM<br/>Cron Trigger])
Diun{Diun<br/>Check Updates}
Policy{Update Policy?}
%% Automatic Path (nginx)
AutoEmail[📧 Email: nginx update available]
AutoCron([auto-update.sh<br/>Cron Execution])
AutoBackup[🗄️ Backup Database & Data<br/>to S3]
AutoBackupFail{Backup<br/>Success?}
AutoPull[📥 Pull New Image<br/>nginx:latest-version]
AutoRecreate[🔄 Recreate Container<br/>docker compose up]
AutoHealth{Health Check<br/>Pass?}
AutoRollback[↩️ Rollback<br/>Restore Previous Image]
AutoRollbackHealth{Rollback<br/>Health OK?}
AutoSuccess[✅ Update Complete<br/>Log Success]
AutoFailEmail[📧 Email: Update Failed<br/>System Rolled Back]
%% Manual Path (gitea/postgres)
ManualEmail[📧 Email: Critical Update Available<br/>gitea or postgres]
OperatorReview{Operator<br/>Reviews & Approves}
ManualRun([Operator runs<br/>manual-update.sh])
ManualConfirm{Confirm<br/>Update?}
ManualBackup[🗄️ Backup Database & Data<br/>to S3]
ManualBackupFail{Backup<br/>Success?}
ManualPull[📥 Pull New Image<br/>gitea:x.y.z or postgres:x.y]
ManualRecreate[🔄 Recreate Container<br/>docker compose up]
ManualHealth{Health Check<br/>Pass?}
ManualRollback[↩️ Rollback<br/>Restore Previous Image]
ManualRollbackHealth{Rollback<br/>Health OK?}
ManualSuccess[✅ Update Complete<br/>Email Success]
ManualFailEmail[📧 Email: Update Failed<br/>System Rolled Back]
ManualAbort[❌ Update Aborted]
%% Certificate Renewal Path
CertStart([Sunday 3:30 AM<br/>Cron Trigger])
CertRenew[🔐 Certbot Renew<br/>docker compose run certbot]
CertCheck{Certificate<br/>Renewed?}
CertRestart[🔄 Restart nginx<br/>docker compose restart]
CertSuccess[✅ Certificate Valid]
CertSkip[ No Renewal Needed]
%% Flow connections
Start --> Diun
Diun -->|Updates Found| Policy
Diun -->|No Updates| End1[End]
%% Automatic Path
Policy -->|automatic<br/>nginx| AutoEmail
AutoEmail --> AutoCron
AutoCron --> AutoBackup
AutoBackup --> AutoBackupFail
AutoBackupFail -->|❌ Failed| AutoFailEmail
AutoFailEmail --> End2[End]
AutoBackupFail -->|✅ Success| AutoPull
AutoPull --> AutoRecreate
AutoRecreate --> AutoHealth
AutoHealth -->|✅ Healthy| AutoSuccess
AutoSuccess --> End3[End]
AutoHealth -->|❌ Unhealthy| AutoRollback
AutoRollback --> AutoRollbackHealth
AutoRollbackHealth -->|✅ Healthy| AutoFailEmail
AutoRollbackHealth -->|❌ Still Failed| AutoFailEmail
%% Manual Path
Policy -->|manual<br/>gitea/postgres| ManualEmail
ManualEmail --> OperatorReview
OperatorReview -->|Later| End4[End]
OperatorReview -->|Now| ManualRun
ManualRun --> ManualConfirm
ManualConfirm -->|No| ManualAbort
ManualAbort --> End5[End]
ManualConfirm -->|Yes| ManualBackup
ManualBackup --> ManualBackupFail
ManualBackupFail -->|❌ Failed| ManualFailEmail
ManualFailEmail --> End6[End]
ManualBackupFail -->|✅ Success| ManualPull
ManualPull --> ManualRecreate
ManualRecreate --> ManualHealth
ManualHealth -->|✅ Healthy| ManualSuccess
ManualSuccess --> End7[End]
ManualHealth -->|❌ Unhealthy| ManualRollback
ManualRollback --> ManualRollbackHealth
ManualRollbackHealth -->|✅ Healthy| ManualFailEmail
ManualRollbackHealth -->|❌ Still Failed| ManualFailEmail
%% Certificate Renewal Path (separate flow)
CertStart --> CertRenew
CertRenew --> CertCheck
CertCheck -->|New Cert| CertRestart
CertRestart --> CertSuccess
CertSuccess --> End8[End]
CertCheck -->|Not Due| CertSkip
CertSkip --> End9[End]
%% Styling
classDef trigger fill:#F59E0B,stroke:#B45309,stroke-width:2px,color:#111827
classDef decision fill:#F97316,stroke:#C2410C,stroke-width:2px,color:#111827
classDef action fill:#3B82F6,stroke:#1D4ED8,stroke-width:2px,color:#ffffff
classDef success fill:#10B981,stroke:#047857,stroke-width:2px,color:#111827
classDef failure fill:#EF4444,stroke:#B91C1C,stroke-width:2px,color:#ffffff
classDef operator fill:#8B5CF6,stroke:#6D28D9,stroke-width:2px,color:#ffffff
classDef monitor fill:#F59E0B,stroke:#B45309,stroke-width:2px,color:#111827
classDef email fill:#6366F1,stroke:#4338CA,stroke-width:2px,color:#ffffff
classDef backup fill:#8B5CF6,stroke:#6D28D9,stroke-width:2px,color:#ffffff
class Start,AutoCron,ManualRun,CertStart trigger
class Diun,Policy,AutoBackupFail,AutoHealth,AutoRollbackHealth,ManualBackupFail,ManualHealth,ManualRollbackHealth,OperatorReview,ManualConfirm,CertCheck monitor
class AutoBackup,AutoPull,AutoRecreate,AutoRollback,ManualBackup,ManualPull,ManualRecreate,ManualRollback,CertRenew,CertRestart action
class AutoSuccess,ManualSuccess,CertSuccess,CertSkip success
class AutoFailEmail,ManualFailEmail,ManualAbort failure
class AutoEmail,ManualEmail email
```
## Update Policies
### Automatic (Low Risk)
- **nginx**: Reverse proxy with stateless configuration
- Process: Detected → Backup → Update → Health Check → Success/Rollback
- No operator intervention required
### Manual (High Risk)
- **gitea**: Git hosting application with user data
- **postgres**: Database containing all repository data
- Process: Detected → Email → Operator Reviews → Approval → Backup → Update → Health Check → Success/Rollback
## Safety Mechanisms
1. **Pre-Update Backup**: Database and Gitea data backed up to S3 before any changes
2. **Health Checks**: Services validated after update (container running, postgres responding, gitea accessible, nginx config valid)
3. **Automatic Rollback**: Failed health check triggers immediate rollback to previous image
4. **Email Notifications**: Operator notified of:
- Available updates (manual containers)
- Update failures (all containers)
- Successful updates (manual containers only)
## Certificate Renewal
Runs separately at 3:30 AM on Sundays:
- Certbot checks certificate expiration
- Renews if within 30 days of expiry
- Restarts nginx to load new certificate
- Process is idempotent (safe to run weekly)
## Monitoring
**Diun Configuration**:
- Schedule: `0 3 * * 0` (Sunday 3:00 AM)
- Monitors: postgres, gitea, nginx, diun
- Email: Via AWS SES SMTP
- Labels: Containers marked with `diun.enable=true` and `update.policy=automatic|manual`

273
scripts/auto-update.sh Normal file
View File

@ -0,0 +1,273 @@
#!/bin/bash
# ============================================================================
# Gitea Auto-Update Script
# ============================================================================
# Automatically updates low-risk containers (nginx, certbot) with backup,
# health checks, and automatic rollback on failure.
#
# Usage: ./auto-update.sh <container1> [container2] [...]
# Example: ./auto-update.sh nginx certbot
# ============================================================================
set -e
# ============================================================================
# Configuration
# ============================================================================
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly DOCKER_COMPOSE_DIR="/opt/gitea"
readonly COMPOSE_FILE="${DOCKER_COMPOSE_DIR}/docker-compose.yml"
readonly BACKUP_SCRIPT="${SCRIPT_DIR}/backup.sh"
readonly HEALTH_CHECK_SCRIPT="${SCRIPT_DIR}/health-check.sh"
readonly LOG_FILE="/var/log/gitea-auto-update.log"
readonly ROLLBACK_INFO="/tmp/gitea-rollback-info-$$.json"
# Wait timeouts (seconds)
readonly CONTAINER_STARTUP_WAIT=10
# Output colors
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly RED='\033[0;31m'
readonly NC='\033[0m'
# ============================================================================
# Logging Functions
# ============================================================================
get_timestamp() {
date '+%Y-%m-%d %H:%M:%S'
}
log_info() {
local message="[$(get_timestamp)] [INFO] $1"
echo -e "${YELLOW}${message}${NC}"
echo "${message}" >> "${LOG_FILE}"
}
log_success() {
local message="[$(get_timestamp)] [SUCCESS] $1"
echo -e "${GREEN}${message}${NC}"
echo "${message}" >> "${LOG_FILE}"
}
log_error() {
local message="[$(get_timestamp)] [ERROR] $1"
echo -e "${RED}${message}${NC}" >&2
echo "${message}" >> "${LOG_FILE}"
}
error_exit() {
log_error "$1"
cleanup
exit 1
}
# ============================================================================
# Cleanup Function
# ============================================================================
cleanup() {
if [ -f "${ROLLBACK_INFO}" ]; then
rm -f "${ROLLBACK_INFO}"
fi
}
# ============================================================================
# Helper Functions
# ============================================================================
change_to_compose_dir() {
cd "${DOCKER_COMPOSE_DIR}" || error_exit "Failed to change to ${DOCKER_COMPOSE_DIR}"
}
run_compose() {
docker compose -f "${COMPOSE_FILE}" "$@"
}
# ============================================================================
# Validation Functions
# ============================================================================
validate_args() {
if [ $# -eq 0 ]; then
error_exit "No containers specified. Usage: $0 <container1> [container2] [...]"
fi
for container in "$@"; do
if ! run_compose config --services | grep -q "^${container}$"; then
error_exit "Container '${container}' not found in docker-compose.yml"
fi
done
log_success "Container validation passed"
}
# ============================================================================
# Rollback Management Functions
# ============================================================================
save_current_images() {
log_info "Saving current image versions for rollback..."
echo "{" > "${ROLLBACK_INFO}"
local first=true
for container in "$@"; do
local image=$(run_compose images -q "${container}" 2>/dev/null | head -n1)
if [ -n "${image}" ]; then
if [ "${first}" = true ]; then
first=false
else
echo "," >> "${ROLLBACK_INFO}"
fi
echo " \"${container}\": \"${image}\"" >> "${ROLLBACK_INFO}"
log_info "Saved ${container}: ${image}"
fi
done
echo "}" >> "${ROLLBACK_INFO}"
log_success "Current image versions saved"
}
rollback() {
log_error "Rolling back to previous versions..."
if [ ! -f "${ROLLBACK_INFO}" ]; then
log_error "No rollback information found"
return 1
fi
change_to_compose_dir
# Extract containers from rollback info and restore
local containers=$(grep -o '"[^"]*":' "${ROLLBACK_INFO}" | tr -d '":' | tr '\n' ' ')
for container in ${containers}; do
log_info "Rolling back ${container}..."
run_compose up -d "${container}" || log_error "Failed to rollback ${container}"
done
log_success "Rollback completed"
}
# ============================================================================
# Update Functions
# ============================================================================
run_backup() {
log_info "Running backup before update..."
if ! bash "${BACKUP_SCRIPT}"; then
error_exit "Backup failed - aborting update"
fi
log_success "Backup completed successfully"
}
pull_new_images() {
log_info "Pulling new images..."
change_to_compose_dir
for container in "$@"; do
log_info "Pulling image for ${container}..."
if ! run_compose pull "${container}"; then
error_exit "Failed to pull image for ${container}"
fi
done
log_success "All images pulled successfully"
}
recreate_containers() {
log_info "Recreating containers..."
change_to_compose_dir
if ! run_compose up -d "$@"; then
error_exit "Failed to recreate containers"
fi
# Wait for containers to start
log_info "Waiting for containers to start..."
sleep "${CONTAINER_STARTUP_WAIT}"
log_success "Containers recreated successfully"
}
run_health_check() {
log_info "Running health check..."
if bash "${HEALTH_CHECK_SCRIPT}"; then
log_success "Health check passed"
return 0
else
log_error "Health check failed"
return 1
fi
}
send_failure_notification() {
local subject="$1"
local body="$2"
# Placeholder for email notification
# Will be configured with proper email settings in Task 6
log_error "NOTIFICATION: ${subject}"
log_error "${body}"
# TODO: Implement actual email sending via mail command or SMTP
# echo "${body}" | mail -s "${subject}" admin@example.com
}
# ============================================================================
# Main Execution
# ============================================================================
main() {
log_info "=========================================="
log_info "Gitea Auto-Update Started"
log_info "Containers: $*"
log_info "=========================================="
# Validate input
validate_args "$@"
# Save current state for rollback
save_current_images "$@"
# Run backup
run_backup
# Pull new images
pull_new_images "$@"
# Recreate containers
recreate_containers "$@"
# Run health check
if run_health_check; then
log_success "=========================================="
log_success "Update completed successfully"
log_success "Updated containers: $*"
log_success "=========================================="
cleanup
exit 0
else
log_error "Health check failed after update"
rollback
# Run health check again after rollback
if run_health_check; then
log_success "Rollback successful - services restored"
send_failure_notification \
"Gitea Update Failed - Rolled Back" \
"Update of containers [$*] failed health check and was rolled back. Services are now healthy."
else
log_error "Critical: Services still unhealthy after rollback"
send_failure_notification \
"CRITICAL: Gitea Update Failed - Manual Intervention Required" \
"Update of containers [$*] failed and rollback did not restore health. IMMEDIATE ATTENTION REQUIRED."
fi
cleanup
exit 1
fi
}
main "$@"

136
scripts/backup.sh Normal file
View File

@ -0,0 +1,136 @@
#!/bin/bash
# ============================================================================
# Gitea Backup Script
# ============================================================================
# Backs up PostgreSQL database and Gitea data directory to AWS S3
#
# Usage: ./backup.sh
# ============================================================================
set -e
# ============================================================================
# Configuration
# ============================================================================
readonly TIMESTAMP=$(date +%Y%m%d_%H%M%S)
readonly BACKUP_DIR="/tmp/gitea-backup-${TIMESTAMP}"
readonly S3_BUCKET="qvest-task-backups"
readonly S3_PREFIX="backups"
readonly LOG_FILE="/var/log/gitea-backup.log"
readonly DB_CONTAINER="gitea-postgres"
readonly DB_USER="gitea"
readonly DB_NAME="gitea"
readonly DATA_VOLUME="gitea_gitea-data"
readonly CONFIG_DIR="/opt/gitea"
# Output colors
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly RED='\033[0;31m'
readonly NC='\033[0m'
# ============================================================================
# Logging Functions
# ============================================================================
log_info() {
echo -e "${YELLOW}[INFO]${NC} $1" | tee -a "${LOG_FILE}"
}
log_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1" | tee -a "${LOG_FILE}"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1" | tee -a "${LOG_FILE}" >&2
}
error_exit() {
log_error "$1"
cleanup
exit 1
}
# ============================================================================
# Core Functions
# ============================================================================
cleanup() {
if [ -d "${BACKUP_DIR}" ]; then
rm -rf "${BACKUP_DIR}"
fi
}
create_backup_dir() {
mkdir -p "${BACKUP_DIR}" || error_exit "Failed to create backup directory"
}
backup_database() {
log_info "Backing up PostgreSQL database..."
docker exec "${DB_CONTAINER}" pg_dump -U "${DB_USER}" -d "${DB_NAME}" \
| gzip > "${BACKUP_DIR}/database-${TIMESTAMP}.sql.gz" \
|| error_exit "Database backup failed"
}
backup_gitea_data() {
log_info "Backing up Gitea data..."
docker run --rm \
-v "${DATA_VOLUME}:/data:ro" \
-v "${BACKUP_DIR}:/backup" \
alpine tar czf "/backup/gitea-data-${TIMESTAMP}.tar.gz" -C /data . \
|| error_exit "Gitea data backup failed"
}
backup_configuration() {
log_info "Backing up configuration files..."
tar czf "${BACKUP_DIR}/config-${TIMESTAMP}.tar.gz" \
-C "${CONFIG_DIR}" \
docker-compose.yml \
nginx/ \
.env \
scripts/ \
diun/ \
2>/dev/null || error_exit "Configuration backup failed"
log_success "Configuration backup created"
}
upload_to_s3() {
log_info "Uploading to S3..."
local db_backup="${BACKUP_DIR}/database-${TIMESTAMP}.sql.gz"
local data_backup="${BACKUP_DIR}/gitea-data-${TIMESTAMP}.tar.gz"
local config_backup="${BACKUP_DIR}/config-${TIMESTAMP}.tar.gz"
aws s3 cp "${db_backup}" "s3://${S3_BUCKET}/${S3_PREFIX}/" \
|| error_exit "Failed to upload database backup"
aws s3 cp "${data_backup}" "s3://${S3_BUCKET}/${S3_PREFIX}/" \
|| error_exit "Failed to upload Gitea data backup"
aws s3 cp "${config_backup}" "s3://${S3_BUCKET}/${S3_PREFIX}/" \
|| error_exit "Failed to upload configuration backup"
}
# ============================================================================
# Main Execution
# ============================================================================
main() {
log_info "Starting backup process..."
create_backup_dir
backup_database
backup_gitea_data
backup_configuration
upload_to_s3
cleanup
log_success "Backup completed successfully"
log_info "Database: s3://${S3_BUCKET}/${S3_PREFIX}/database-${TIMESTAMP}.sql.gz"
log_info "Data: s3://${S3_BUCKET}/${S3_PREFIX}/gitea-data-${TIMESTAMP}.tar.gz"
log_info "Config: s3://${S3_BUCKET}/${S3_PREFIX}/config-${TIMESTAMP}.tar.gz"
}
main "$@"

26
scripts/empty-s3-bucket.sh Executable file
View File

@ -0,0 +1,26 @@
#!/bin/bash
set -e
BUCKET_NAME="${1:-qvest-task-backups}"
echo "Emptying S3 bucket: $BUCKET_NAME"
# Delete all object versions
aws s3api list-object-versions --bucket "$BUCKET_NAME" --output text \
--query 'Versions[].[Key,VersionId]' 2>/dev/null | \
while read -r key version; do
if [ -n "$key" ]; then
aws s3api delete-object --bucket "$BUCKET_NAME" --key "$key" --version-id "$version" >/dev/null 2>&1
fi
done || true
# Delete all delete markers
aws s3api list-object-versions --bucket "$BUCKET_NAME" --output text \
--query 'DeleteMarkers[].[Key,VersionId]' 2>/dev/null | \
while read -r key version; do
if [ -n "$key" ]; then
aws s3api delete-object --bucket "$BUCKET_NAME" --key "$key" --version-id "$version" >/dev/null 2>&1
fi
done || true
echo "S3 bucket emptied successfully"

129
scripts/health-check.sh Normal file
View File

@ -0,0 +1,129 @@
#!/bin/bash
# ============================================================================
# Gitea Health Check Script
# ============================================================================
# Validates that all critical services are running and responsive
#
# Usage: ./health-check.sh
# Exit codes: 0 = healthy, 1 = unhealthy
# ============================================================================
set -e
# ============================================================================
# Configuration
# ============================================================================
readonly POSTGRES_CONTAINER="gitea-postgres"
readonly GITEA_CONTAINER="gitea"
readonly NGINX_CONTAINER="gitea-nginx"
readonly GITEA_URL="http://localhost:3000"
readonly TIMEOUT=10
# Output colors
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly RED='\033[0;31m'
readonly NC='\033[0m'
# ============================================================================
# Logging Functions
# ============================================================================
log_info() {
echo -e "${YELLOW}[CHECK]${NC} $1"
}
log_success() {
echo -e "${GREEN}[OK]${NC} $1"
}
log_error() {
echo -e "${RED}[FAIL]${NC} $1" >&2
}
# ============================================================================
# Health Check Functions
# ============================================================================
check_container_running() {
local container="$1"
if docker ps --format '{{.Names}}' | grep -q "^${container}$"; then
log_success "Container ${container} is running"
return 0
else
log_error "Container ${container} is not running"
return 1
fi
}
check_postgres_healthy() {
log_info "Checking PostgreSQL health..."
if docker exec "${POSTGRES_CONTAINER}" pg_isready -U gitea -q; then
log_success "PostgreSQL is healthy"
return 0
else
log_error "PostgreSQL is not responding"
return 1
fi
}
check_gitea_responsive() {
log_info "Checking Gitea web interface..."
if curl -sf -m "${TIMEOUT}" "${GITEA_URL}" > /dev/null; then
log_success "Gitea is responding"
return 0
else
log_error "Gitea is not responding at ${GITEA_URL}"
return 1
fi
}
check_nginx_responding() {
log_info "Checking Nginx..."
if docker exec "${NGINX_CONTAINER}" nginx -t 2>&1 | grep -q "successful"; then
log_success "Nginx configuration is valid"
return 0
else
log_error "Nginx configuration test failed"
return 1
fi
}
# ============================================================================
# Main Execution
# ============================================================================
main() {
local exit_code=0
echo "=========================================="
echo "Gitea Deployment Health Check"
echo "=========================================="
echo ""
# Check all containers are running
check_container_running "${POSTGRES_CONTAINER}" || exit_code=1
check_container_running "${GITEA_CONTAINER}" || exit_code=1
check_container_running "${NGINX_CONTAINER}" || exit_code=1
echo ""
# Check service health
check_postgres_healthy || exit_code=1
check_gitea_responsive || exit_code=1
check_nginx_responding || exit_code=1
echo ""
echo "=========================================="
if [ $exit_code -eq 0 ]; then
log_success "All health checks passed"
else
log_error "Some health checks failed"
fi
return $exit_code
}
main "$@"

358
scripts/manual-update.sh Normal file
View File

@ -0,0 +1,358 @@
#!/bin/bash
# ============================================================================
# Gitea Manual Update Script
# ============================================================================
# Updates high-risk containers (gitea, postgres) with manual approval,
# backup, health checks, and automatic rollback on failure.
#
# Usage: ./manual-update.sh <container1> [container2] [...]
# Example: ./manual-update.sh gitea postgres
#
# This script requires explicit operator invocation and confirmation.
# ============================================================================
set -e
# ============================================================================
# Configuration
# ============================================================================
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly DOCKER_COMPOSE_DIR="/opt/gitea"
readonly COMPOSE_FILE="${DOCKER_COMPOSE_DIR}/docker-compose.yml"
readonly BACKUP_SCRIPT="${SCRIPT_DIR}/backup.sh"
readonly HEALTH_CHECK_SCRIPT="${SCRIPT_DIR}/health-check.sh"
readonly LOG_FILE="/var/log/gitea-manual-update.log"
readonly ROLLBACK_INFO="/tmp/gitea-rollback-info-$$.json"
# Wait timeouts (seconds)
readonly CONTAINER_STARTUP_WAIT=30
# Output colors
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly RED='\033[0;31m'
readonly BLUE='\033[0;34m'
readonly NC='\033[0m'
# ============================================================================
# Logging Functions
# ============================================================================
get_timestamp() {
date '+%Y-%m-%d %H:%M:%S'
}
log_info() {
local message="[$(get_timestamp)] [INFO] $1"
echo -e "${YELLOW}${message}${NC}"
echo "${message}" >> "${LOG_FILE}"
}
log_success() {
local message="[$(get_timestamp)] [SUCCESS] $1"
echo -e "${GREEN}${message}${NC}"
echo "${message}" >> "${LOG_FILE}"
}
log_error() {
local message="[$(get_timestamp)] [ERROR] $1"
echo -e "${RED}${message}${NC}" >&2
echo "${message}" >> "${LOG_FILE}"
}
log_prompt() {
echo -e "${BLUE}[PROMPT]${NC} $1"
}
error_exit() {
log_error "$1"
cleanup
exit 1
}
# ============================================================================
# Cleanup Function
# ============================================================================
cleanup() {
if [ -f "${ROLLBACK_INFO}" ]; then
rm -f "${ROLLBACK_INFO}"
fi
}
# ============================================================================
# Helper Functions
# ============================================================================
change_to_compose_dir() {
cd "${DOCKER_COMPOSE_DIR}" || error_exit "Failed to change to ${DOCKER_COMPOSE_DIR}"
}
run_compose() {
docker compose -f "${COMPOSE_FILE}" "$@"
}
# ============================================================================
# Validation Functions
# ============================================================================
validate_args() {
if [ $# -eq 0 ]; then
error_exit "No containers specified. Usage: $0 <container1> [container2] [...]"
fi
for container in "$@"; do
if ! run_compose config --services | grep -q "^${container}$"; then
error_exit "Container '${container}' not found in docker-compose.yml"
fi
done
log_success "Container validation passed"
}
# ============================================================================
# User Confirmation Functions
# ============================================================================
get_user_confirmation() {
local containers="$*"
echo ""
log_prompt "=========================================="
log_prompt "MANUAL UPDATE CONFIRMATION"
log_prompt "=========================================="
log_prompt "You are about to update the following containers:"
for container in ${containers}; do
log_prompt " - ${container}"
done
echo ""
log_prompt "This will:"
log_prompt " 1. Create a backup of database and Gitea data"
log_prompt " 2. Pull new container images"
log_prompt " 3. Recreate the containers with new versions"
log_prompt " 4. Run health checks"
log_prompt " 5. Rollback automatically if health checks fail"
echo ""
log_prompt "Estimated downtime: 1-3 minutes"
echo ""
read -p "Do you want to proceed? (yes/no): " confirmation
case "${confirmation}" in
yes|YES|Yes)
log_success "Update confirmed by operator"
return 0
;;
*)
log_info "Update cancelled by operator"
exit 0
;;
esac
}
show_current_versions() {
log_info "Current container versions:"
change_to_compose_dir
for container in "$@"; do
local image=$(run_compose images "${container}" 2>/dev/null | tail -n +3 | awk '{print $2":"$3}' | head -n1)
if [ -n "${image}" ]; then
log_info " ${container}: ${image}"
fi
done
echo ""
}
show_available_versions() {
log_info "Checking for available updates..."
change_to_compose_dir
for container in "$@"; do
log_info " Checking ${container}..."
run_compose pull --dry-run "${container}" 2>&1 | grep -i "image" || true
done
echo ""
}
# ============================================================================
# Rollback Management Functions
# ============================================================================
save_current_images() {
log_info "Saving current image versions for rollback..."
echo "{" > "${ROLLBACK_INFO}"
local first=true
for container in "$@"; do
local image=$(run_compose images -q "${container}" 2>/dev/null | head -n1)
if [ -n "${image}" ]; then
if [ "${first}" = true ]; then
first=false
else
echo "," >> "${ROLLBACK_INFO}"
fi
echo " \"${container}\": \"${image}\"" >> "${ROLLBACK_INFO}"
log_info "Saved ${container}: ${image}"
fi
done
echo "}" >> "${ROLLBACK_INFO}"
log_success "Current image versions saved"
}
rollback() {
log_error "Rolling back to previous versions..."
if [ ! -f "${ROLLBACK_INFO}" ]; then
log_error "No rollback information found"
return 1
fi
change_to_compose_dir
# Extract containers from rollback info and restore
local containers=$(grep -o '"[^"]*":' "${ROLLBACK_INFO}" | tr -d '":' | tr '\n' ' ')
for container in ${containers}; do
log_info "Rolling back ${container}..."
run_compose up -d "${container}" || log_error "Failed to rollback ${container}"
done
log_success "Rollback completed"
}
# ============================================================================
# Update Functions
# ============================================================================
run_backup() {
log_info "Running backup before update..."
if ! bash "${BACKUP_SCRIPT}"; then
error_exit "Backup failed - aborting update"
fi
log_success "Backup completed successfully"
}
pull_new_images() {
log_info "Pulling new images..."
change_to_compose_dir
for container in "$@"; do
log_info "Pulling image for ${container}..."
if ! run_compose pull "${container}"; then
error_exit "Failed to pull image for ${container}"
fi
done
log_success "All images pulled successfully"
}
recreate_containers() {
log_info "Recreating containers..."
log_info "⚠️ Service downtime begins now"
change_to_compose_dir
if ! run_compose up -d "$@"; then
error_exit "Failed to recreate containers"
fi
# Wait for containers to start - longer for database
log_info "Waiting for containers to start (${CONTAINER_STARTUP_WAIT} seconds)..."
sleep "${CONTAINER_STARTUP_WAIT}"
log_success "Containers recreated successfully"
}
run_health_check() {
log_info "Running health check..."
if bash "${HEALTH_CHECK_SCRIPT}"; then
log_success "Health check passed"
return 0
else
log_error "Health check failed"
return 1
fi
}
send_notification() {
local subject="$1"
local body="$2"
# Placeholder for email notification
# Will be configured with proper email settings in Task 6
log_info "NOTIFICATION: ${subject}"
log_info "${body}"
# TODO: Implement actual email sending via mail command or SMTP
# echo "${body}" | mail -s "${subject}" admin@example.com
}
# ============================================================================
# Main Execution
# ============================================================================
main() {
log_info "=========================================="
log_info "Gitea Manual Update Started"
log_info "Containers: $*"
log_info "=========================================="
# Validate input
validate_args "$@"
# Show current and available versions
show_current_versions "$@"
show_available_versions "$@"
# Get user confirmation
get_user_confirmation "$@"
# Save current state for rollback
save_current_images "$@"
# Run backup
run_backup
# Pull new images
pull_new_images "$@"
# Recreate containers
recreate_containers "$@"
# Run health check
if run_health_check; then
log_success "=========================================="
log_success "✓ Update completed successfully"
log_success "Updated containers: $*"
log_success "=========================================="
send_notification \
"Gitea Manual Update Successful" \
"Successfully updated containers: $*"
cleanup
exit 0
else
log_error "Health check failed after update"
rollback
# Run health check again after rollback
if run_health_check; then
log_success "Rollback successful - services restored"
send_notification \
"Gitea Manual Update Failed - Rolled Back" \
"Update of containers [$*] failed health check and was rolled back. Services are now healthy."
else
log_error "Critical: Services still unhealthy after rollback"
send_notification \
"CRITICAL: Gitea Manual Update Failed - Manual Intervention Required" \
"Update of containers [$*] failed and rollback did not restore health. IMMEDIATE ATTENTION REQUIRED."
fi
cleanup
exit 1
fi
}
main "$@"

251
scripts/restore.sh Executable file
View File

@ -0,0 +1,251 @@
#!/bin/bash
# ============================================================================
# Gitea Restore Script
# ============================================================================
# Restores PostgreSQL database, Gitea data, and configuration from S3 backups
#
# Usage: ./restore.sh <timestamp>
# Example: ./restore.sh 20260611_140530
#
# This will restore backups with the specified timestamp from S3
# ============================================================================
set -e
# ============================================================================
# Configuration
# ============================================================================
readonly S3_BUCKET="qvest-task-backups"
readonly S3_PREFIX="backups"
readonly RESTORE_DIR="/tmp/gitea-restore"
readonly LOG_FILE="/var/log/gitea-restore.log"
readonly DB_CONTAINER="gitea-postgres"
readonly DB_USER="gitea"
readonly DB_NAME="gitea"
readonly DATA_VOLUME="gitea_gitea-data"
readonly CONFIG_DIR="/opt/gitea"
# Output colors
readonly GREEN='\033[0;32m'
readonly YELLOW='\033[1;33m'
readonly RED='\033[0;31m'
readonly NC='\033[0m'
# ============================================================================
# Logging Functions
# ============================================================================
log_info() {
echo -e "${YELLOW}[INFO]${NC} $1" | tee -a "${LOG_FILE}"
}
log_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1" | tee -a "${LOG_FILE}"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1" | tee -a "${LOG_FILE}" >&2
}
error_exit() {
log_error "$1"
cleanup
exit 1
}
# ============================================================================
# Validation Functions
# ============================================================================
validate_timestamp() {
if [[ ! "$1" =~ ^[0-9]{8}_[0-9]{6}$ ]]; then
error_exit "Invalid timestamp format. Expected: YYYYMMDD_HHMMSS"
fi
}
check_s3_backup_exists() {
local timestamp="$1"
local file="$2"
if ! aws s3 ls "s3://${S3_BUCKET}/${S3_PREFIX}/${file}-${timestamp}.tar.gz" &>/dev/null && \
! aws s3 ls "s3://${S3_BUCKET}/${S3_PREFIX}/${file}-${timestamp}.sql.gz" &>/dev/null; then
return 1
fi
return 0
}
# ============================================================================
# Core Functions
# ============================================================================
cleanup() {
if [ -d "${RESTORE_DIR}" ]; then
rm -rf "${RESTORE_DIR}"
fi
}
create_restore_dir() {
mkdir -p "${RESTORE_DIR}" || error_exit "Failed to create restore directory"
}
download_backups() {
local timestamp="$1"
log_info "Downloading backups from S3..."
aws s3 cp "s3://${S3_BUCKET}/${S3_PREFIX}/database-${timestamp}.sql.gz" \
"${RESTORE_DIR}/" || error_exit "Failed to download database backup"
aws s3 cp "s3://${S3_BUCKET}/${S3_PREFIX}/gitea-data-${timestamp}.tar.gz" \
"${RESTORE_DIR}/" || error_exit "Failed to download Gitea data backup"
aws s3 cp "s3://${S3_BUCKET}/${S3_PREFIX}/config-${timestamp}.tar.gz" \
"${RESTORE_DIR}/" || error_exit "Failed to download configuration backup"
log_success "Backups downloaded successfully"
}
stop_services() {
log_info "Stopping Gitea services..."
cd "${CONFIG_DIR}" || error_exit "Failed to change to config directory"
docker compose stop gitea || error_exit "Failed to stop Gitea"
log_success "Services stopped"
}
restore_database() {
local timestamp="$1"
log_info "Restoring database..."
# Drop and recreate database
docker exec "${DB_CONTAINER}" psql -U "${DB_USER}" -d postgres \
-c "DROP DATABASE IF EXISTS ${DB_NAME};" || error_exit "Failed to drop database"
docker exec "${DB_CONTAINER}" psql -U "${DB_USER}" -d postgres \
-c "CREATE DATABASE ${DB_NAME};" || error_exit "Failed to create database"
# Restore from backup
gunzip -c "${RESTORE_DIR}/database-${timestamp}.sql.gz" | \
docker exec -i "${DB_CONTAINER}" psql -U "${DB_USER}" -d "${DB_NAME}" \
|| error_exit "Failed to restore database"
log_success "Database restored"
}
restore_gitea_data() {
local timestamp="$1"
log_info "Restoring Gitea data..."
# Clear existing data
docker run --rm \
-v "${DATA_VOLUME}:/data" \
alpine sh -c "rm -rf /data/*" \
|| error_exit "Failed to clear Gitea data"
# Restore from backup
docker run --rm \
-v "${DATA_VOLUME}:/data" \
-v "${RESTORE_DIR}:/backup:ro" \
alpine tar xzf "/backup/gitea-data-${timestamp}.tar.gz" -C /data \
|| error_exit "Failed to restore Gitea data"
log_success "Gitea data restored"
}
restore_configuration() {
local timestamp="$1"
log_info "Restoring configuration files..."
# Extract configuration backup
tar xzf "${RESTORE_DIR}/config-${timestamp}.tar.gz" -C "${CONFIG_DIR}" \
|| error_exit "Failed to restore configuration"
log_success "Configuration restored"
}
start_services() {
log_info "Starting Gitea services..."
cd "${CONFIG_DIR}" || error_exit "Failed to change to config directory"
docker compose up -d || error_exit "Failed to start services"
log_info "Waiting for services to be ready..."
sleep 10
log_success "Services started"
}
verify_restore() {
log_info "Verifying restore..."
# Check if Gitea is responding
if curl -f -s http://localhost:3000 > /dev/null; then
log_success "Gitea is responding"
else
log_error "Gitea is not responding - manual verification required"
fi
# Check database connection
if docker exec "${DB_CONTAINER}" psql -U "${DB_USER}" -d "${DB_NAME}" \
-c "SELECT 1 FROM public.user LIMIT 1;" &>/dev/null; then
log_success "Database is accessible"
else
log_error "Database verification failed"
fi
}
# ============================================================================
# Main Execution
# ============================================================================
main() {
if [ $# -ne 1 ]; then
echo "Usage: $0 <timestamp>"
echo " Example: $0 20260611_140530"
echo ""
echo "Available backups:"
aws s3 ls "s3://${S3_BUCKET}/${S3_PREFIX}/" | grep "database-" | \
sed 's/.*database-\([0-9_]*\)\.sql\.gz/ \1/' | sort -u
exit 1
fi
local timestamp="$1"
log_info "Starting restore process for timestamp: ${timestamp}"
validate_timestamp "${timestamp}"
if ! check_s3_backup_exists "${timestamp}" "database"; then
error_exit "Backup with timestamp ${timestamp} not found in S3"
fi
# Confirm restore
echo ""
log_error "WARNING: This will replace all current data!"
read -p "Are you sure you want to continue? (yes/no): " confirm
if [ "$confirm" != "yes" ]; then
echo "Restore cancelled"
exit 0
fi
create_restore_dir
download_backups "${timestamp}"
stop_services
restore_database "${timestamp}"
restore_gitea_data "${timestamp}"
restore_configuration "${timestamp}"
start_services
verify_restore
cleanup
log_success "Restore completed successfully"
echo ""
log_info "Please verify the system is functioning correctly:"
log_info " 1. Access https://git.poll-streams.com"
log_info " 2. Login with your credentials"
log_info " 3. Verify repositories are accessible"
log_info " 4. Check system settings"
}
main "$@"

639
scripts/test-integration.sh Executable file
View File

@ -0,0 +1,639 @@
#!/bin/bash
# ============================================================================
# Integration Test Suite
# ============================================================================
# Tests script integration with Docker components in isolated environment.
# Does NOT touch production infrastructure or AWS services.
#
# Requirements:
# - Docker daemon running
# - docker compose plugin installed
#
# Tests:
# 1. Script syntax validation (static)
# 2. Docker Compose configuration validity (static)
# 3. Backup creates valid archives (integration)
# 4. Health checks detect container failures (integration)
# 5. Update workflow with rollback (integration)
# 6. Full backup and restore cycle (integration)
#
# Usage: ./test-integration.sh
# Exit: 0 if all tests pass, 1 if any test fails
# ============================================================================
set -e
# ============================================================================
# Configuration
# ============================================================================
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly DOCKER_COMPOSE_DIR="$(cd "${SCRIPT_DIR}/../docker" && pwd)"
readonly BACKUP_SCRIPT="${SCRIPT_DIR}/backup.sh"
readonly HEALTH_CHECK_SCRIPT="${SCRIPT_DIR}/health-check.sh"
readonly AUTO_UPDATE_SCRIPT="${SCRIPT_DIR}/auto-update.sh"
readonly MANUAL_UPDATE_SCRIPT="${SCRIPT_DIR}/manual-update.sh"
readonly COMPOSE_FILE="${DOCKER_COMPOSE_DIR}/docker-compose.yml"
readonly TEST_LOG="/tmp/test-integration-$$.log"
readonly TEST_DIR="/tmp/test-gitea-$$"
# Test images and credentials
readonly PG_IMAGE="postgres:18.4"
readonly PG_USER="testuser"
readonly PG_PASS="testpass"
readonly PG_DB="testdb"
readonly NGINX_IMAGE="nginx:1.27-alpine"
readonly ALPINE_OLD="alpine:3.19"
readonly ALPINE_NEW="alpine:3.20"
# Wait timeouts (seconds)
readonly WAIT_TIMEOUT=30
readonly WAIT_INTERVAL=0.5
readonly POSTGRES_INIT_DELAY=1
# Output colors
readonly GREEN='\033[0;32m'
readonly RED='\033[0;31m'
readonly BLUE='\033[0;34m'
readonly NC='\033[0m' # No Color
# Test counters
TESTS_PASSED=0
TESTS_FAILED=0
# Cleanup tracking
CONTAINERS_TO_CLEANUP=()
# ============================================================================
# Cleanup Functions
# ============================================================================
cleanup() {
log_info "Cleaning up test environment..."
# Stop and remove test containers
if [[ ${#CONTAINERS_TO_CLEANUP[@]} -gt 0 ]]; then
for container in "${CONTAINERS_TO_CLEANUP[@]}"; do
docker rm -f "${container}" &>/dev/null || true
done
fi
# Remove test directory
if [[ -d "${TEST_DIR}" ]]; then
rm -rf "${TEST_DIR}"
fi
log_info "Cleanup complete"
}
trap cleanup EXIT
# ============================================================================
# Output Functions
# ============================================================================
log_info() {
echo -e "${BLUE}[INFO]${NC} $*" | tee -a "${TEST_LOG}"
}
log_success() {
echo -e "${GREEN}[PASS]${NC} $*" | tee -a "${TEST_LOG}"
}
log_error() {
echo -e "${RED}[FAIL]${NC} $*" | tee -a "${TEST_LOG}"
}
pass_test() {
local message="$1"
TESTS_PASSED=$((TESTS_PASSED + 1))
log_success "${message}"
}
fail_test() {
local message="$1"
TESTS_FAILED=$((TESTS_FAILED + 1))
log_error "${message}"
}
# ============================================================================
# Helper Functions
# ============================================================================
wait_for_postgres() {
local container=$1
local attempts=0
local max_attempts=$((WAIT_TIMEOUT * 2)) # Check every 0.5s
# First wait for container to be running
while ! docker ps --filter "name=${container}" --format "{{.Names}}" | grep -q "^${container}$"; do
((attempts++))
if [[ $attempts -ge $max_attempts ]]; then
return 1
fi
sleep "${WAIT_INTERVAL}"
done
# Then wait for postgres to be ready
attempts=0
while ! docker exec "${container}" pg_isready -U "${PG_USER}" &>/dev/null; do
((attempts++))
if [[ $attempts -ge $max_attempts ]]; then
return 1
fi
sleep "${WAIT_INTERVAL}"
done
# Give it a moment to fully initialize
sleep "${POSTGRES_INIT_DELAY}"
return 0
}
wait_for_container() {
local container=$1
local attempts=0
local max_attempts=$((WAIT_TIMEOUT * 2))
while ! docker ps --filter "name=${container}" --format "{{.Names}}" | grep -q "^${container}$"; do
((attempts++))
if [[ $attempts -ge $max_attempts ]]; then
return 1
fi
sleep "${WAIT_INTERVAL}"
done
return 0
}
start_postgres_container() {
local name=$1
docker run -d \
--name "${name}" \
-e POSTGRES_USER="${PG_USER}" \
-e POSTGRES_PASSWORD="${PG_PASS}" \
-e POSTGRES_DB="${PG_DB}" \
"${PG_IMAGE}" &>> "${TEST_LOG}"
CONTAINERS_TO_CLEANUP+=("${name}")
wait_for_postgres "${name}"
}
start_container() {
local name=$1
local image=$2
shift 2
local extra_args=("$@")
docker run -d \
--name "${name}" \
"${image}" \
"${extra_args[@]}" &>> "${TEST_LOG}"
CONTAINERS_TO_CLEANUP+=("${name}")
wait_for_container "${name}"
}
validate_sql_archive() {
local file=$1
local pattern=$2
gunzip -t "${file}" 2>> "${TEST_LOG}" && \
zcat "${file}" | grep -q "${pattern}"
}
validate_tar_archive() {
local file=$1
local pattern=$2
tar -tzf "${file}" &>> "${TEST_LOG}" && \
tar -tzf "${file}" | grep -q "${pattern}"
}
get_container_image() {
local container=$1
docker inspect --format='{{.Config.Image}}' "${container}"
}
is_container_running() {
local container=$1
docker ps --filter "name=${container}" --format "{{.Names}}" | grep -q "^${container}$"
}
exec_psql() {
local container=$1
local database=$2
local sql=$3
docker exec "${container}" psql -U "${PG_USER}" -d "${database}" -c "${sql}" &>> "${TEST_LOG}"
}
exec_psql_query() {
local container=$1
local database=$2
local query=$3
docker exec "${container}" psql -U "${PG_USER}" -d "${database}" -t -c "${query}" 2>> "${TEST_LOG}" | xargs
}
# ============================================================================
# Test Functions
# ============================================================================
test_script_syntax() {
log_info "Test 1: Script syntax validation..."
local scripts=(
"${BACKUP_SCRIPT}"
"${HEALTH_CHECK_SCRIPT}"
"${AUTO_UPDATE_SCRIPT}"
"${MANUAL_UPDATE_SCRIPT}"
)
for script in "${scripts[@]}"; do
if [[ ! -f "${script}" ]]; then
fail_test "Script not found: ${script}"
continue
fi
if bash -n "${script}" 2>> "${TEST_LOG}"; then
pass_test "Syntax valid: $(basename "${script}")"
else
fail_test "Syntax error in: $(basename "${script}")"
fi
done
}
test_docker_compose_validity() {
log_info "Test 2: Docker Compose configuration..."
if [[ ! -f "${COMPOSE_FILE}" ]]; then
fail_test "docker-compose.yml not found"
return
fi
# Validate compose file syntax
if ! docker compose -f "${COMPOSE_FILE}" config &>> "${TEST_LOG}"; then
fail_test "docker-compose.yml has syntax errors"
return
fi
pass_test "docker-compose.yml is valid"
# Check for latest tags (anti-pattern)
if grep -E "image:.*:latest" "${COMPOSE_FILE}" &>> "${TEST_LOG}"; then
fail_test "Found 'latest' tags (versions should be pinned)"
else
pass_test "No 'latest' tags (versions properly pinned)"
fi
}
test_backup_creates_valid_archives() {
log_info "Test 3: Backup creates valid archives..."
# Create test environment
mkdir -p "${TEST_DIR}/backups"
mkdir -p "${TEST_DIR}/gitea-data"
echo "test data" > "${TEST_DIR}/gitea-data/test-file.txt"
# Start test postgres container
local db_container="test-postgres-$$"
if ! start_postgres_container "${db_container}"; then
fail_test "Failed to start postgres container"
return
fi
# Create test table with data
exec_psql "${db_container}" "${PG_DB}" \
"CREATE TABLE test_data (id SERIAL PRIMARY KEY, value TEXT);"
exec_psql "${db_container}" "${PG_DB}" \
"INSERT INTO test_data (value) VALUES ('test value');"
# Test database backup
local backup_file="${TEST_DIR}/backups/test-backup.sql.gz"
if ! docker exec "${db_container}" pg_dump -U "${PG_USER}" "${PG_DB}" | gzip > "${backup_file}" 2>> "${TEST_LOG}"; then
fail_test "Database backup failed"
return
fi
if ! validate_sql_archive "${backup_file}" "test_data"; then
fail_test "Database backup archive is invalid"
return
fi
pass_test "Database backup creates valid SQL archive"
# Test Gitea data backup
local data_backup="${TEST_DIR}/backups/test-data.tar.gz"
if ! tar -czf "${data_backup}" -C "${TEST_DIR}" gitea-data 2>> "${TEST_LOG}"; then
fail_test "Gitea data backup failed"
return
fi
if ! validate_tar_archive "${data_backup}" "test-file.txt"; then
fail_test "Gitea data backup archive is invalid"
return
fi
pass_test "Gitea data backup creates valid tar archive"
}
test_health_checks_detect_failures() {
log_info "Test 4: Health checks detect container failures..."
# Start healthy test container
local test_container="test-nginx-$$"
if ! start_container "${test_container}" "${NGINX_IMAGE}"; then
fail_test "Failed to start nginx container"
return
fi
# Test 1: Detect running container
if is_container_running "${test_container}"; then
pass_test "Health check detects running container"
else
fail_test "Health check failed to detect running container"
fi
# Test 2: Stop container and verify detection
docker stop "${test_container}" &>> "${TEST_LOG}"
sleep 1
if ! is_container_running "${test_container}"; then
pass_test "Health check detects stopped container"
else
fail_test "Health check failed to detect stopped container"
fi
# Test 3: Start postgres and verify health check
local pg_container="test-pg-health-$$"
if ! start_postgres_container "${pg_container}"; then
fail_test "Failed to start postgres for health check"
return
fi
# Test pg_isready (how health-check.sh validates postgres)
if docker exec "${pg_container}" pg_isready -U "${PG_USER}" &>> "${TEST_LOG}"; then
pass_test "Postgres health check (pg_isready) works"
else
fail_test "Postgres health check failed"
fi
}
test_update_workflow_with_rollback() {
log_info "Test 5: Update workflow with rollback simulation..."
# Create test container with versioned images
local test_container="test-rollback-$$"
# Start with old version
if ! start_container "${test_container}" "${ALPINE_OLD}" tail -f /dev/null; then
fail_test "Failed to start container with initial image"
return
fi
# Verify initial version
local initial_image=$(get_container_image "${test_container}")
if [[ "${initial_image}" == "${ALPINE_OLD}" ]]; then
pass_test "Container starts with correct initial image"
else
fail_test "Container has wrong initial image: ${initial_image}"
fi
# Simulate update: save current image info (like auto-update.sh does)
local saved_image="${initial_image}"
# "Update" to new version
docker rm -f "${test_container}" &>> "${TEST_LOG}"
if ! start_container "${test_container}" "${ALPINE_NEW}" tail -f /dev/null; then
fail_test "Failed to update container"
return
fi
local updated_image=$(get_container_image "${test_container}")
if [[ "${updated_image}" == "${ALPINE_NEW}" ]]; then
pass_test "Container updates to new image"
else
fail_test "Container update failed"
fi
# Simulate rollback (health check failed scenario)
docker rm -f "${test_container}" &>> "${TEST_LOG}"
if ! start_container "${test_container}" "${saved_image}" tail -f /dev/null; then
fail_test "Failed to rollback container"
return
fi
local rolled_back_image=$(get_container_image "${test_container}")
if [[ "${rolled_back_image}" == "${saved_image}" ]]; then
pass_test "Rollback restores previous image"
else
fail_test "Rollback failed: got ${rolled_back_image}, expected ${saved_image}"
fi
}
test_backup_and_restore_cycle() {
log_info "Test 6: Full backup and restore cycle..."
# Create test database container
local db_container="test-restore-db-$$"
if ! start_postgres_container "${db_container}"; then
fail_test "Failed to start postgres for restore test"
return
fi
# Create test data and directory structure
mkdir -p "${TEST_DIR}/restore-test/data"
mkdir -p "${TEST_DIR}/restore-test/backups"
echo "original content" > "${TEST_DIR}/restore-test/data/test-file.txt"
echo "config data" > "${TEST_DIR}/restore-test/data/config.yml"
# Create database with test data
exec_psql "${db_container}" "${PG_DB}" \
"CREATE TABLE restore_test (id SERIAL PRIMARY KEY, data TEXT, created_at TIMESTAMP DEFAULT NOW());"
exec_psql "${db_container}" "${PG_DB}" \
"INSERT INTO restore_test (data) VALUES ('original data'), ('test record 1'), ('test record 2');"
# Verify original data exists
local original_count=$(exec_psql_query "${db_container}" "${PG_DB}" \
"SELECT COUNT(*) FROM restore_test;")
if [[ "${original_count}" -ne 3 ]]; then
fail_test "Failed to create test data (expected 3 rows, got ${original_count})"
return
fi
pass_test "Test data created successfully (3 rows)"
# Step 1: Create backups
local timestamp="test-$$"
local db_backup="${TEST_DIR}/restore-test/backups/database-${timestamp}.sql.gz"
local data_backup="${TEST_DIR}/restore-test/backups/data-${timestamp}.tar.gz"
if ! docker exec "${db_container}" pg_dump -U "${PG_USER}" "${PG_DB}" | gzip > "${db_backup}" 2>> "${TEST_LOG}"; then
fail_test "Database backup failed"
return
fi
if ! tar -czf "${data_backup}" -C "${TEST_DIR}/restore-test" data 2>> "${TEST_LOG}"; then
fail_test "Data directory backup failed"
return
fi
pass_test "Backups created successfully"
# Step 2: Corrupt/destroy the data (simulate disaster)
exec_psql "${db_container}" "${PG_DB}" \
"DELETE FROM restore_test;"
exec_psql "${db_container}" "${PG_DB}" \
"INSERT INTO restore_test (data) VALUES ('corrupted data');"
rm -f "${TEST_DIR}/restore-test/data/test-file.txt"
echo "corrupted content" > "${TEST_DIR}/restore-test/data/test-file.txt"
# Verify data is corrupted
local corrupted_count=$(exec_psql_query "${db_container}" "${PG_DB}" \
"SELECT COUNT(*) FROM restore_test;")
if [[ "${corrupted_count}" -ne 1 ]]; then
fail_test "Data corruption simulation failed"
return
fi
pass_test "Data corruption simulated (1 row instead of 3)"
# Step 3: Restore database from backup
if ! zcat "${db_backup}" | docker exec -i "${db_container}" psql -U "${PG_USER}" -d postgres -c "DROP DATABASE IF EXISTS ${PG_DB};" &>> "${TEST_LOG}"; then
fail_test "Failed to drop database"
return
fi
if ! exec_psql "${db_container}" postgres "CREATE DATABASE ${PG_DB};"; then
fail_test "Failed to recreate database"
return
fi
if ! zcat "${db_backup}" | docker exec -i "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" &>> "${TEST_LOG}"; then
fail_test "Database restore failed"
return
fi
pass_test "Database restored from backup"
# Step 4: Restore data directory
rm -rf "${TEST_DIR}/restore-test/data"
if ! tar -xzf "${data_backup}" -C "${TEST_DIR}/restore-test" 2>> "${TEST_LOG}"; then
fail_test "Data directory restore failed"
return
fi
pass_test "Data directory restored from backup"
# Step 5: Verify restored data matches original
local restored_count=$(exec_psql_query "${db_container}" "${PG_DB}" \
"SELECT COUNT(*) FROM restore_test;")
if [[ "${restored_count}" -ne 3 ]]; then
fail_test "Restored data count mismatch (expected 3, got ${restored_count})"
return
fi
local restored_data=$(exec_psql_query "${db_container}" "${PG_DB}" \
"SELECT data FROM restore_test ORDER BY id LIMIT 1;")
if [[ "${restored_data}" != "original data" ]]; then
fail_test "Restored data content mismatch (expected 'original data', got '${restored_data}')"
return
fi
pass_test "Database data restored correctly (3 rows, original content)"
# Verify file content
local restored_file_content=$(cat "${TEST_DIR}/restore-test/data/test-file.txt")
if [[ "${restored_file_content}" != "original content" ]]; then
fail_test "Restored file content mismatch"
return
fi
if [[ ! -f "${TEST_DIR}/restore-test/data/config.yml" ]]; then
fail_test "Config file missing after restore"
return
fi
pass_test "File system data restored correctly"
# Step 6: Verify database is operational after restore
if ! exec_psql "${db_container}" "${PG_DB}" \
"INSERT INTO restore_test (data) VALUES ('post-restore test');"; then
fail_test "Database not operational after restore"
return
fi
local final_count=$(exec_psql_query "${db_container}" "${PG_DB}" \
"SELECT COUNT(*) FROM restore_test;")
if [[ "${final_count}" -ne 4 ]]; then
fail_test "Post-restore database operations failed"
return
fi
pass_test "Database fully operational after restore"
}
# ============================================================================
# Main Execution
# ============================================================================
main() {
echo "=========================================="
echo "Integration Test Suite"
echo "=========================================="
echo ""
log_info "Starting tests at $(date)"
log_info "Test environment: ${TEST_DIR}"
echo ""
# Check Docker is available
if ! command -v docker &> /dev/null; then
log_error "Docker is not installed or not in PATH"
exit 1
fi
if ! docker ps &> /dev/null; then
log_error "Docker daemon is not running or not accessible"
exit 1
fi
# Create log file
: > "${TEST_LOG}"
# Create test directory
mkdir -p "${TEST_DIR}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Static Analysis Tests"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
test_script_syntax
echo ""
test_docker_compose_validity
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Integration Tests (Docker Required)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
test_backup_creates_valid_archives
echo ""
test_health_checks_detect_failures
echo ""
test_update_workflow_with_rollback
echo ""
test_backup_and_restore_cycle
echo ""
# Summary
echo "=========================================="
echo "Test Summary"
echo "=========================================="
echo -e "${GREEN}Passed: ${TESTS_PASSED}${NC}"
echo -e "${RED}Failed: ${TESTS_FAILED}${NC}"
echo ""
if [[ ${TESTS_FAILED} -eq 0 ]]; then
echo -e "${GREEN}All integration tests passed!${NC}"
echo ""
log_info "Full log: ${TEST_LOG}"
exit 0
else
echo -e "${RED}${TESTS_FAILED} test(s) failed${NC}"
echo ""
log_error "Full log: ${TEST_LOG}"
exit 1
fi
}
main "$@"

619
scripts/test-update.sh Normal file
View File

@ -0,0 +1,619 @@
#!/bin/bash
# ============================================================================
# Update Automation Integration Tests
# ============================================================================
# Tests script integration with Docker components in isolated environment.
# Does NOT touch production infrastructure or AWS services.
#
# Requirements:
# - Docker daemon running
# - docker compose plugin installed
#
# Tests:
# 1. Script syntax validation (static)
# 2. Docker Compose configuration validity (static)
# 3. Backup creates valid archives (integration)
# 4. Health checks detect container failures (integration)
# 5. Update workflow with rollback (integration)
# 6. Full backup and restore cycle (integration)
#
# Usage: ./test-update.sh
# Exit: 0 if all tests pass, 1 if any test fails
# ============================================================================
set -e
# ============================================================================
# Configuration
# ============================================================================
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly DOCKER_COMPOSE_DIR="$(cd "${SCRIPT_DIR}/../docker" && pwd)"
readonly BACKUP_SCRIPT="${SCRIPT_DIR}/backup.sh"
readonly HEALTH_CHECK_SCRIPT="${SCRIPT_DIR}/health-check.sh"
readonly AUTO_UPDATE_SCRIPT="${SCRIPT_DIR}/auto-update.sh"
readonly MANUAL_UPDATE_SCRIPT="${SCRIPT_DIR}/manual-update.sh"
readonly COMPOSE_FILE="${DOCKER_COMPOSE_DIR}/docker-compose.yml"
readonly TEST_LOG="/tmp/test-update-$$.log"
readonly TEST_DIR="/tmp/test-gitea-$$"
# Test images and credentials
readonly PG_IMAGE="postgres:18.4"
readonly PG_USER="testuser"
readonly PG_PASS="testpass"
readonly PG_DB="testdb"
readonly NGINX_IMAGE="nginx:1.27-alpine"
readonly ALPINE_OLD="alpine:3.19"
readonly ALPINE_NEW="alpine:3.20"
# Wait timeouts (seconds)
readonly WAIT_TIMEOUT=30
readonly WAIT_INTERVAL=0.5
# Output colors
readonly GREEN='\033[0;32m'
readonly RED='\033[0;31m'
readonly BLUE='\033[0;34m'
readonly NC='\033[0m' # No Color
# Test counters
TESTS_PASSED=0
TESTS_FAILED=0
# Cleanup tracking
CONTAINERS_TO_CLEANUP=()
# ============================================================================
# Cleanup Functions
# ============================================================================
cleanup() {
log_info "Cleaning up test environment..."
# Stop and remove test containers
if [[ ${#CONTAINERS_TO_CLEANUP[@]} -gt 0 ]]; then
for container in "${CONTAINERS_TO_CLEANUP[@]}"; do
docker rm -f "${container}" &>/dev/null || true
done
fi
# Remove test directory
if [[ -d "${TEST_DIR}" ]]; then
rm -rf "${TEST_DIR}"
fi
log_info "Cleanup complete"
}
trap cleanup EXIT
# ============================================================================
# Output Functions
# ============================================================================
log_info() {
echo -e "${BLUE}[INFO]${NC} $*" | tee -a "${TEST_LOG}"
}
log_success() {
echo -e "${GREEN}[PASS]${NC} $*" | tee -a "${TEST_LOG}"
}
log_error() {
echo -e "${RED}[FAIL]${NC} $*" | tee -a "${TEST_LOG}"
}
pass_test() {
local message="$1"
TESTS_PASSED=$((TESTS_PASSED + 1))
log_success "${message}"
}
fail_test() {
local message="$1"
TESTS_FAILED=$((TESTS_FAILED + 1))
log_error "${message}"
}
# ============================================================================
# Helper Functions
# ============================================================================
wait_for_postgres() {
local container=$1
local attempts=0
local max_attempts=$((WAIT_TIMEOUT * 2)) # Check every 0.5s
# First wait for container to be running
while ! docker ps --filter "name=${container}" --format "{{.Names}}" | grep -q "^${container}$"; do
((attempts++))
if [[ $attempts -ge $max_attempts ]]; then
return 1
fi
sleep "${WAIT_INTERVAL}"
done
# Then wait for postgres to be ready
attempts=0
while ! docker exec "${container}" pg_isready -U "${PG_USER}" &>/dev/null; do
((attempts++))
if [[ $attempts -ge $max_attempts ]]; then
return 1
fi
sleep "${WAIT_INTERVAL}"
done
# Give it a moment to fully initialize
sleep 1
return 0
}
wait_for_container() {
local container=$1
local attempts=0
local max_attempts=$((WAIT_TIMEOUT * 2))
while ! docker ps --filter "name=${container}" --format "{{.Names}}" | grep -q "^${container}$"; do
((attempts++))
if [[ $attempts -ge $max_attempts ]]; then
return 1
fi
sleep "${WAIT_INTERVAL}"
done
return 0
}
start_postgres_container() {
local name=$1
docker run -d \
--name "${name}" \
-e POSTGRES_USER="${PG_USER}" \
-e POSTGRES_PASSWORD="${PG_PASS}" \
-e POSTGRES_DB="${PG_DB}" \
"${PG_IMAGE}" &>> "${TEST_LOG}"
CONTAINERS_TO_CLEANUP+=("${name}")
wait_for_postgres "${name}"
}
start_container() {
local name=$1
local image=$2
shift 2
local extra_args=("$@")
docker run -d \
--name "${name}" \
"${image}" \
"${extra_args[@]}" &>> "${TEST_LOG}"
CONTAINERS_TO_CLEANUP+=("${name}")
wait_for_container "${name}"
}
validate_sql_archive() {
local file=$1
local pattern=$2
gunzip -t "${file}" 2>> "${TEST_LOG}" && \
zcat "${file}" | grep -q "${pattern}"
}
validate_tar_archive() {
local file=$1
local pattern=$2
tar -tzf "${file}" &>> "${TEST_LOG}" && \
tar -tzf "${file}" | grep -q "${pattern}"
}
get_container_image() {
local container=$1
docker inspect --format='{{.Config.Image}}' "${container}"
}
# ============================================================================
# Test Functions
# ============================================================================
test_script_syntax() {
log_info "Test 1: Script syntax validation..."
local scripts=(
"${BACKUP_SCRIPT}"
"${HEALTH_CHECK_SCRIPT}"
"${AUTO_UPDATE_SCRIPT}"
"${MANUAL_UPDATE_SCRIPT}"
)
for script in "${scripts[@]}"; do
if [[ ! -f "${script}" ]]; then
fail_test "Script not found: ${script}"
continue
fi
if bash -n "${script}" 2>> "${TEST_LOG}"; then
pass_test "Syntax valid: $(basename "${script}")"
else
fail_test "Syntax error in: $(basename "${script}")"
fi
done
}
test_docker_compose_validity() {
log_info "Test 2: Docker Compose configuration..."
if [[ ! -f "${COMPOSE_FILE}" ]]; then
fail_test "docker-compose.yml not found"
return
fi
# Validate compose file syntax
if ! docker compose -f "${COMPOSE_FILE}" config &>> "${TEST_LOG}"; then
fail_test "docker-compose.yml has syntax errors"
return
fi
pass_test "docker-compose.yml is valid"
# Check for latest tags (anti-pattern)
if grep -E "image:.*:latest" "${COMPOSE_FILE}" &>> "${TEST_LOG}"; then
fail_test "Found 'latest' tags (versions should be pinned)"
else
pass_test "No 'latest' tags (versions properly pinned)"
fi
}
test_backup_creates_valid_archives() {
log_info "Test 3: Backup creates valid archives..."
# Create test environment
mkdir -p "${TEST_DIR}/backups"
mkdir -p "${TEST_DIR}/gitea-data"
echo "test data" > "${TEST_DIR}/gitea-data/test-file.txt"
# Start test postgres container
local db_container="test-postgres-$$"
if ! start_postgres_container "${db_container}"; then
fail_test "Failed to start postgres container"
return
fi
# Create test table with data
docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -c \
"CREATE TABLE test_data (id SERIAL PRIMARY KEY, value TEXT);" &>> "${TEST_LOG}"
docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -c \
"INSERT INTO test_data (value) VALUES ('test value');" &>> "${TEST_LOG}"
# Test database backup
local backup_file="${TEST_DIR}/backups/test-backup.sql.gz"
if ! docker exec "${db_container}" pg_dump -U "${PG_USER}" "${PG_DB}" | gzip > "${backup_file}" 2>> "${TEST_LOG}"; then
fail_test "Database backup failed"
return
fi
if ! validate_sql_archive "${backup_file}" "test_data"; then
fail_test "Database backup archive is invalid"
return
fi
pass_test "Database backup creates valid SQL archive"
# Test Gitea data backup
local data_backup="${TEST_DIR}/backups/test-data.tar.gz"
if ! tar -czf "${data_backup}" -C "${TEST_DIR}" gitea-data 2>> "${TEST_LOG}"; then
fail_test "Gitea data backup failed"
return
fi
if ! validate_tar_archive "${data_backup}" "test-file.txt"; then
fail_test "Gitea data backup archive is invalid"
return
fi
pass_test "Gitea data backup creates valid tar archive"
}
test_health_checks_detect_failures() {
log_info "Test 4: Health checks detect container failures..."
# Start healthy test container
local test_container="test-nginx-$$"
if ! start_container "${test_container}" "${NGINX_IMAGE}"; then
fail_test "Failed to start nginx container"
return
fi
# Test 1: Detect running container
if docker ps --filter "name=${test_container}" --format "{{.Names}}" | grep -q "^${test_container}$"; then
pass_test "Health check detects running container"
else
fail_test "Health check failed to detect running container"
fi
# Test 2: Stop container and verify detection
docker stop "${test_container}" &>> "${TEST_LOG}"
sleep 1
if ! docker ps --filter "name=${test_container}" --format "{{.Names}}" | grep -q "^${test_container}$"; then
pass_test "Health check detects stopped container"
else
fail_test "Health check failed to detect stopped container"
fi
# Test 3: Start postgres and verify health check
local pg_container="test-pg-health-$$"
if ! start_postgres_container "${pg_container}"; then
fail_test "Failed to start postgres for health check"
return
fi
# Test pg_isready (how health-check.sh validates postgres)
if docker exec "${pg_container}" pg_isready -U "${PG_USER}" &>> "${TEST_LOG}"; then
pass_test "Postgres health check (pg_isready) works"
else
fail_test "Postgres health check failed"
fi
}
test_update_workflow_with_rollback() {
log_info "Test 5: Update workflow with rollback simulation..."
# Create test container with versioned images
local test_container="test-rollback-$$"
# Start with old version
if ! start_container "${test_container}" "${ALPINE_OLD}" tail -f /dev/null; then
fail_test "Failed to start container with initial image"
return
fi
# Verify initial version
local initial_image=$(get_container_image "${test_container}")
if [[ "${initial_image}" == "${ALPINE_OLD}" ]]; then
pass_test "Container starts with correct initial image"
else
fail_test "Container has wrong initial image: ${initial_image}"
fi
# Simulate update: save current image info (like auto-update.sh does)
local saved_image="${initial_image}"
# "Update" to new version
docker rm -f "${test_container}" &>> "${TEST_LOG}"
if ! start_container "${test_container}" "${ALPINE_NEW}" tail -f /dev/null; then
fail_test "Failed to update container"
return
fi
local updated_image=$(get_container_image "${test_container}")
if [[ "${updated_image}" == "${ALPINE_NEW}" ]]; then
pass_test "Container updates to new image"
else
fail_test "Container update failed"
fi
# Simulate rollback (health check failed scenario)
docker rm -f "${test_container}" &>> "${TEST_LOG}"
if ! start_container "${test_container}" "${saved_image}" tail -f /dev/null; then
fail_test "Failed to rollback container"
return
fi
local rolled_back_image=$(get_container_image "${test_container}")
if [[ "${rolled_back_image}" == "${saved_image}" ]]; then
pass_test "Rollback restores previous image"
else
fail_test "Rollback failed: got ${rolled_back_image}, expected ${saved_image}"
fi
}
test_backup_and_restore_cycle() {
log_info "Test 6: Full backup and restore cycle..."
# Create test database container
local db_container="test-restore-db-$$"
if ! start_postgres_container "${db_container}"; then
fail_test "Failed to start postgres for restore test"
return
fi
# Create test data and directory structure
mkdir -p "${TEST_DIR}/restore-test/data"
mkdir -p "${TEST_DIR}/restore-test/backups"
echo "original content" > "${TEST_DIR}/restore-test/data/test-file.txt"
echo "config data" > "${TEST_DIR}/restore-test/data/config.yml"
# Create database with test data
docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -c \
"CREATE TABLE restore_test (id SERIAL PRIMARY KEY, data TEXT, created_at TIMESTAMP DEFAULT NOW());" &>> "${TEST_LOG}"
docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -c \
"INSERT INTO restore_test (data) VALUES ('original data'), ('test record 1'), ('test record 2');" &>> "${TEST_LOG}"
# Verify original data exists
local original_count=$(docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -t -c \
"SELECT COUNT(*) FROM restore_test;" 2>> "${TEST_LOG}" | xargs)
if [[ "${original_count}" -ne 3 ]]; then
fail_test "Failed to create test data (expected 3 rows, got ${original_count})"
return
fi
pass_test "Test data created successfully (3 rows)"
# Step 1: Create backups
local timestamp="test-$$"
local db_backup="${TEST_DIR}/restore-test/backups/database-${timestamp}.sql.gz"
local data_backup="${TEST_DIR}/restore-test/backups/data-${timestamp}.tar.gz"
if ! docker exec "${db_container}" pg_dump -U "${PG_USER}" "${PG_DB}" | gzip > "${db_backup}" 2>> "${TEST_LOG}"; then
fail_test "Database backup failed"
return
fi
if ! tar -czf "${data_backup}" -C "${TEST_DIR}/restore-test" data 2>> "${TEST_LOG}"; then
fail_test "Data directory backup failed"
return
fi
pass_test "Backups created successfully"
# Step 2: Corrupt/destroy the data (simulate disaster)
docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -c \
"DELETE FROM restore_test;" &>> "${TEST_LOG}"
docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -c \
"INSERT INTO restore_test (data) VALUES ('corrupted data');" &>> "${TEST_LOG}"
rm -f "${TEST_DIR}/restore-test/data/test-file.txt"
echo "corrupted content" > "${TEST_DIR}/restore-test/data/test-file.txt"
# Verify data is corrupted
local corrupted_count=$(docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -t -c \
"SELECT COUNT(*) FROM restore_test;" 2>> "${TEST_LOG}" | xargs)
if [[ "${corrupted_count}" -ne 1 ]]; then
fail_test "Data corruption simulation failed"
return
fi
pass_test "Data corruption simulated (1 row instead of 3)"
# Step 3: Restore database from backup
if ! zcat "${db_backup}" | docker exec -i "${db_container}" psql -U "${PG_USER}" -d postgres -c "DROP DATABASE IF EXISTS ${PG_DB};" &>> "${TEST_LOG}"; then
fail_test "Failed to drop database"
return
fi
if ! docker exec "${db_container}" psql -U "${PG_USER}" -d postgres -c "CREATE DATABASE ${PG_DB};" &>> "${TEST_LOG}"; then
fail_test "Failed to recreate database"
return
fi
if ! zcat "${db_backup}" | docker exec -i "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" &>> "${TEST_LOG}"; then
fail_test "Database restore failed"
return
fi
pass_test "Database restored from backup"
# Step 4: Restore data directory
rm -rf "${TEST_DIR}/restore-test/data"
if ! tar -xzf "${data_backup}" -C "${TEST_DIR}/restore-test" 2>> "${TEST_LOG}"; then
fail_test "Data directory restore failed"
return
fi
pass_test "Data directory restored from backup"
# Step 5: Verify restored data matches original
local restored_count=$(docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -t -c \
"SELECT COUNT(*) FROM restore_test;" 2>> "${TEST_LOG}" | xargs)
if [[ "${restored_count}" -ne 3 ]]; then
fail_test "Restored data count mismatch (expected 3, got ${restored_count})"
return
fi
local restored_data=$(docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -t -c \
"SELECT data FROM restore_test ORDER BY id LIMIT 1;" 2>> "${TEST_LOG}" | xargs)
if [[ "${restored_data}" != "original data" ]]; then
fail_test "Restored data content mismatch (expected 'original data', got '${restored_data}')"
return
fi
pass_test "Database data restored correctly (3 rows, original content)"
# Verify file content
local restored_file_content=$(cat "${TEST_DIR}/restore-test/data/test-file.txt")
if [[ "${restored_file_content}" != "original content" ]]; then
fail_test "Restored file content mismatch"
return
fi
if [[ ! -f "${TEST_DIR}/restore-test/data/config.yml" ]]; then
fail_test "Config file missing after restore"
return
fi
pass_test "File system data restored correctly"
# Step 6: Verify database is operational after restore
if ! docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -c \
"INSERT INTO restore_test (data) VALUES ('post-restore test');" &>> "${TEST_LOG}"; then
fail_test "Database not operational after restore"
return
fi
local final_count=$(docker exec "${db_container}" psql -U "${PG_USER}" -d "${PG_DB}" -t -c \
"SELECT COUNT(*) FROM restore_test;" 2>> "${TEST_LOG}" | xargs)
if [[ "${final_count}" -ne 4 ]]; then
fail_test "Post-restore database operations failed"
return
fi
pass_test "Database fully operational after restore"
}
# ============================================================================
# Main Execution
# ============================================================================
main() {
echo "=========================================="
echo "Update Automation Integration Tests"
echo "=========================================="
echo ""
log_info "Starting tests at $(date)"
log_info "Test environment: ${TEST_DIR}"
echo ""
# Check Docker is available
if ! command -v docker &> /dev/null; then
log_error "Docker is not installed or not in PATH"
exit 1
fi
if ! docker ps &> /dev/null; then
log_error "Docker daemon is not running or not accessible"
exit 1
fi
# Create log file
: > "${TEST_LOG}"
# Create test directory
mkdir -p "${TEST_DIR}"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Static Analysis Tests"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
test_script_syntax
echo ""
test_docker_compose_validity
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "Integration Tests (Docker Required)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
test_backup_creates_valid_archives
echo ""
test_health_checks_detect_failures
echo ""
test_update_workflow_with_rollback
echo ""
test_backup_and_restore_cycle
echo ""
# Summary
echo "=========================================="
echo "Test Summary"
echo "=========================================="
echo -e "${GREEN}Passed: ${TESTS_PASSED}${NC}"
echo -e "${RED}Failed: ${TESTS_FAILED}${NC}"
echo ""
if [[ ${TESTS_FAILED} -eq 0 ]]; then
echo -e "${GREEN}All integration tests passed!${NC}"
echo ""
log_info "Full log: ${TEST_LOG}"
exit 0
else
echo -e "${RED}${TESTS_FAILED} test(s) failed${NC}"
echo ""
log_error "Full log: ${TEST_LOG}"
exit 1
fi
}
main "$@"

View File

@ -6,7 +6,7 @@ data "aws_route53_zone" "main" {
resource "aws_route53_record" "gitea" { resource "aws_route53_record" "gitea" {
zone_id = data.aws_route53_zone.main.zone_id zone_id = data.aws_route53_zone.main.zone_id
name = "gitea.poll-streams.com" name = "git.poll-streams.com"
type = "A" type = "A"
ttl = 300 ttl = 300
records = [aws_instance.gitea.public_ip] records = [aws_instance.gitea.public_ip]

View File

@ -36,9 +36,13 @@ resource "aws_iam_role_policy" "secrets_manager_read" {
Effect = "Allow" Effect = "Allow"
Action = [ Action = [
"secretsmanager:GetSecretValue", "secretsmanager:GetSecretValue",
"secretsmanager:DescribeSecret" "secretsmanager:DescribeSecret",
"secretsmanager:UpdateSecret"
]
Resource = [
aws_secretsmanager_secret.db_credentials.arn,
aws_secretsmanager_secret.ses_smtp_credentials.arn
] ]
Resource = aws_secretsmanager_secret.db_credentials.arn
} }
] ]
}) })

View File

@ -26,17 +26,17 @@ output "ssh_connection_command" {
output "ssh_connection_via_domain" { output "ssh_connection_via_domain" {
description = "SSH command using domain name (use after DNS propagates)" description = "SSH command using domain name (use after DNS propagates)"
value = "ssh -i ${local_file.private_key.filename} -o StrictHostKeyChecking=accept-new ubuntu@gitea.poll-streams.com" value = "ssh -i ${local_file.private_key.filename} -o StrictHostKeyChecking=accept-new ubuntu@git.poll-streams.com"
} }
output "gitea_domain" { output "gitea_domain" {
description = "Gitea domain name" description = "Gitea domain name"
value = "gitea.poll-streams.com" value = "git.poll-streams.com"
} }
output "gitea_url" { output "gitea_url" {
description = "Gitea URL (will be HTTPS after SSL setup)" description = "Gitea URL (will be HTTPS after SSL setup)"
value = "https://gitea.poll-streams.com" value = "https://git.poll-streams.com"
} }
output "db_secret_arn" { output "db_secret_arn" {
@ -48,3 +48,13 @@ output "db_secret_name" {
description = "Name of the database credentials secret" description = "Name of the database credentials secret"
value = aws_secretsmanager_secret.db_credentials.name value = aws_secretsmanager_secret.db_credentials.name
} }
output "ses_smtp_secret_name" {
description = "Name of the SES SMTP credentials secret"
value = aws_secretsmanager_secret.ses_smtp_credentials.name
}
output "alert_email" {
description = "Email address for alerts"
value = var.alert_email
}

View File

@ -4,10 +4,17 @@ resource "random_password" "db_password" {
special = true special = true
} }
# Generate random password for Gitea admin user
resource "random_password" "gitea_admin_password" {
length = 32
special = true
}
# Store credentials in AWS Secrets Manager # Store credentials in AWS Secrets Manager
resource "aws_secretsmanager_secret" "db_credentials" { resource "aws_secretsmanager_secret" "db_credentials" {
name = "${var.project_name}-db-credentials" name = "${var.project_name}-db-credentials"
description = "PostgreSQL database credentials for Gitea" description = "PostgreSQL database credentials for Gitea"
recovery_window_in_days = 0
tags = { tags = {
Name = "${var.project_name}-db-credentials" Name = "${var.project_name}-db-credentials"
@ -17,10 +24,36 @@ resource "aws_secretsmanager_secret" "db_credentials" {
resource "aws_secretsmanager_secret_version" "db_credentials" { resource "aws_secretsmanager_secret_version" "db_credentials" {
secret_id = aws_secretsmanager_secret.db_credentials.id secret_id = aws_secretsmanager_secret.db_credentials.id
secret_string = jsonencode({ secret_string = jsonencode({
username = "gitea" username = "gitea"
password = random_password.db_password.result password = random_password.db_password.result
database = "gitea" database = "gitea"
host = "postgres" host = "postgres"
port = 5432 port = 5432
admin_username = "gitea_admin"
admin_password = random_password.gitea_admin_password.result
admin_email = "admin@poll-streams.com"
gitea_runner_token = "" # Will be auto-generated via API
})
}
# Store SMTP credentials in Secrets Manager
resource "aws_secretsmanager_secret" "ses_smtp_credentials" {
name = "${var.project_name}-ses-smtp-credentials"
description = "SMTP credentials for AWS SES"
recovery_window_in_days = 0
tags = {
Name = "${var.project_name}-ses-smtp-credentials"
}
}
resource "aws_secretsmanager_secret_version" "ses_smtp_credentials" {
secret_id = aws_secretsmanager_secret.ses_smtp_credentials.id
secret_string = jsonencode({
smtp_host = "email-smtp.${var.aws_region}.amazonaws.com"
smtp_port = "587"
smtp_username = aws_iam_access_key.ses_smtp_access_key.id
smtp_password = aws_iam_access_key.ses_smtp_access_key.ses_smtp_password_v4
alert_email = var.alert_email
}) })
} }

44
terraform/ses.tf Normal file
View File

@ -0,0 +1,44 @@
# ============================================================================
# AWS SES Configuration
# ============================================================================
# Configures AWS Simple Email Service for sending alert notifications
# Email identity for sending alerts
resource "aws_ses_email_identity" "alert_email" {
email = var.alert_email
}
# IAM user for SMTP authentication
resource "aws_iam_user" "ses_smtp_user" {
name = "${var.project_name}-ses-smtp-user"
path = "/system/"
tags = {
Name = "${var.project_name}-ses-smtp-user"
}
}
# Policy allowing the SMTP user to send emails via SES
resource "aws_iam_user_policy" "ses_smtp_user_policy" {
name = "${var.project_name}-ses-smtp-policy"
user = aws_iam_user.ses_smtp_user.name
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Effect = "Allow"
Action = [
"ses:SendEmail",
"ses:SendRawEmail"
]
Resource = "*"
}
]
})
}
# Access key for SMTP authentication
resource "aws_iam_access_key" "ses_smtp_access_key" {
user = aws_iam_user.ses_smtp_user.name
}

View File

@ -1,7 +1,7 @@
# S3 Bucket for Backups # S3 Bucket for Backups
resource "aws_s3_bucket" "backups" { resource "aws_s3_bucket" "backups" {
bucket = "${var.project_name}-backups" bucket = "${var.project_name}-backups"
force_destroy = true
tags = { tags = {
Name = "${var.project_name}-backups" Name = "${var.project_name}-backups"
} }
@ -24,3 +24,29 @@ resource "aws_s3_bucket_server_side_encryption_configuration" "backups" {
} }
} }
} }
resource "aws_s3_bucket_lifecycle_configuration" "backups" {
bucket = aws_s3_bucket.backups.id
rule {
id = "backup-retention"
status = "Enabled"
filter {
prefix = "backups/"
}
transition {
days = 30
storage_class = "GLACIER"
}
expiration {
days = 90
}
noncurrent_version_expiration {
noncurrent_days = 30
}
}
}

View File

@ -9,3 +9,9 @@ variable "project_name" {
type = string type = string
default = "qvest-task" default = "qvest-task"
} }
variable "alert_email" {
description = "Email address for system alerts and notifications"
type = string
default = "generic.admin.user@gmail.com"
}