Roadmap
This roadmap outlines the project's key milestones and features in incremental steps, giving development and deployment a clear structure.
Phase 1: Conceptualization and Planning
This phase consists of discussion and research only; no code is implemented. It includes the following steps:
1.1 Requirements Analysis
- Define the scope and requirements of the project
- Identify constraints and non-functional requirements
- Determine host environment (cloud provider, VPS, or local)
1.2 Technology Selection ✅
Decisions documented in ADR.md
- Cloud: AWS
- Infrastructure as Code: Terraform
- Configuration Management: Ansible (kept minimal)
- Application Deployment: Docker + Docker Compose
- Database: PostgreSQL (self-hosted in Docker)
- Reverse Proxy: Nginx
- SSL: Let's Encrypt with certbot
- Update Automation: Diun + Custom Scripts
- Monitoring: Prometheus + Grafana (later phase)
- Logging: Loki + Promtail (later phase)
- Backup: Custom scripts + S3 (later phase)
1.3 Architecture Design ✅
- ✅ Overall system architecture designed
- ✅ Network topology planned (VPC, subnets, security groups)
- ✅ Three architecture diagrams created in docs/diagrams/
1.4 Project Structure ✅
- Directory structure planned (will create incrementally per phase)
- Documentation structure in place (docs/diagrams/)
- Naming conventions: lowercase, hyphens for files, descriptive names
Goals:
- ✅ A clear, complete roadmap for the project is available in this file
- ✅ Technology stack documented with rationale (see ADR.md)
- ✅ Architecture diagrams created (3 diagrams in docs/diagrams/)
- ✅ Project structure planned
Phase 1 Complete! Ready to begin Phase 2 (Infrastructure Setup).
Phase 2: Infrastructure Setup
This phase provisions the AWS infrastructure using Terraform.
2.1 Terraform Backend Setup ✅
- Configure AWS CLI and credentials locally
- Set up Terraform backend (S3 bucket for state storage)
- Initialize Terraform working directory
2.2 Core Infrastructure ✅
- ✅ Create VPC with single public subnet
- ✅ Set up Internet Gateway
- ✅ Configure Security Group for EC2 (ports 22, 80, 443)
- ✅ Provision EC2 instance (t3.medium, Ubuntu 24.04) with IAM role
- ✅ Create S3 bucket for backups (with versioning & encryption)
- ✅ Configure Route 53 DNS records (A record: git.poll-streams.com → EC2)
- ✅ Use official Terraform AWS modules (VPC, Security Group)
- ✅ Refactored into separate files: main.tf, vpc.tf, security.tf, compute.tf, storage.tf, iam.tf, dns.tf, outputs.tf
2.3 Security Configuration ✅
- ✅ Configure SSH key-based authentication (Ed25519, generated via Terraform)
- ✅ SSH access from anywhere (0.0.0.0/0) - security via key-based auth
- ✅ Apply IAM policies (AmazonS3FullAccess for EC2 backups)
- ✅ Security group follows least privilege (only 22, 80, 443 inbound; all outbound)
- ✅ Encrypted EBS root volume (30GB gp3)
Goals: ✅
- ✅ AWS infrastructure fully defined in Terraform code
- ✅ EC2 instance provisioned and accessible via SSH
- ✅ S3 backup bucket created
- ✅ Domain DNS configured and resolving
- ✅ Infrastructure can be destroyed and recreated with terraform apply
Phase 2 Complete! Ready to begin Phase 3 (Automated Gitea Deployment).
Phase 3: Automated Gitea Deployment
This phase implements the automated, reproducible Gitea installation.
3.1 Database Setup ✅
- ✅ PostgreSQL 18.4 deployed via Docker Compose
- ✅ Database credentials stored in AWS Secrets Manager
- ✅ Random password generation via Terraform
- ✅ Volume mounted at /var/lib/postgresql (PostgreSQL 18+ requirement)
- ✅ Health checks configured with pg_isready
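The pg_isready health check can also be used from a script that waits for the database before continuing. The following is a minimal sketch, not the project's actual configuration: the Compose service name `postgres`, the database user `gitea`, and the function name `wait_for_postgres` are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: wait until PostgreSQL accepts connections, probing with pg_isready.
# Assumptions (not taken from the roadmap): the Compose service is named
# "postgres" and the database user is "gitea".

wait_for_postgres() {
  local attempts="${1:-10}"   # maximum number of probes
  local delay="${2:-3}"       # seconds to sleep between probes
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # pg_isready exits 0 once the server accepts connections.
    if docker compose exec -T postgres pg_isready -U gitea >/dev/null 2>&1; then
      echo "postgres ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "postgres not ready after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_postgres 20 2` would probe up to 20 times, two seconds apart, and return non-zero if the database never becomes ready.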
3.2 Gitea Installation ✅
- ✅ Gitea 1.22.6 deployed via Docker Compose
- ✅ Ansible playbooks created: setup-system.yml, deploy-gitea.yml, setup-ssl.yml, site.yml
- ✅ Docker + AWS CLI installation automated
- ✅ Gitea configured with environment variables (database, domain, ROOT_URL)
- ✅ SSH git access on port 2222
- ✅ Volumes for persistent data
3.3 Reverse Proxy Configuration ✅
- ✅ Nginx 1.27-alpine deployed via Docker Compose
- ✅ Let's Encrypt SSL certificate obtained via certbot (production)
- ✅ Domain: git.poll-streams.com (migrated from gitea.poll-streams.com to avoid rate limits)
- ✅ Two-stage nginx config (HTTP-only for ACME, then HTTPS)
- ✅ SSL termination at nginx, proxy to Gitea on port 3000
- ✅ HTTP to HTTPS redirect configured
- ✅ Security headers (HSTS, X-Frame-Options, etc.)
- ✅ WebSocket support for real-time features
- ✅ 512MB upload limit
3.4 Testing ✅
- ✅ HTTPS access verified: https://git.poll-streams.com
- ✅ Valid SSL certificate (Let's Encrypt production)
- ✅ HTTP → HTTPS redirect working
- ✅ Gitea web interface accessible and functional
- ✅ User account created, repository created
- ✅ Git push via HTTPS tested successfully
- ✅ Full deployment reproducible via ansible-playbook site.yml
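The HTTP to HTTPS redirect check above can be repeated from the command line. This is a rough sketch assuming curl is installed; the function name `check_https_redirect` is hypothetical and is not one of the project's scripts.

```shell
#!/usr/bin/env bash
# Sketch: confirm that plain HTTP answers with a redirect to an https:// URL.
# Assumption: curl is available; check_https_redirect is a hypothetical name.

check_https_redirect() {
  local domain="$1"
  local result status location
  # -s: silent, -o /dev/null: discard the body,
  # -w: print the status code and the redirect target.
  result=$(curl -s -o /dev/null -w '%{http_code} %{redirect_url}' "http://${domain}/")
  status=${result%% *}
  location=${result#* }

  case "$status" in
    301|302|307|308) ;;
    *) echo "FAIL: expected a redirect, got HTTP $status" >&2; return 1 ;;
  esac

  case "$location" in
    https://*) echo "OK: http://${domain}/ redirects to ${location}" ;;
    *) echo "FAIL: redirect target is not HTTPS: ${location}" >&2; return 1 ;;
  esac
}
```

For this deployment the call would be `check_https_redirect git.poll-streams.com`.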
Goals: ✅
- ✅ Gitea running and accessible via HTTPS through reverse proxy
- ✅ Installation fully automated and reproducible
- ✅ Production-grade deployment with SSL
Phase 3 Complete! Gitea is fully deployed, secured with SSL, and accessible from the internet.
Phase 4: Update Automation ✅
This phase implements automated update mechanisms for Gitea and related components.
4.1 Update Strategy Design ✅
- ✅ Weekly update checks (Sunday 3:00 AM)
- ✅ Per-container update policies (automatic vs manual)
- ✅ Pre-update backup to S3
- ✅ Post-update health checks
- ✅ Automatic rollback on failure
- ✅ Email notifications via AWS SES
4.2 Update Monitoring ✅
- ✅ Diun 4.33 deployed for Docker image update detection
- ✅ Scheduled weekly checks (cron: 0 3 * * 0)
- ✅ Monitors: postgres, gitea, nginx, diun
- ✅ Email notifications configured via AWS SES SMTP
- ✅ IAM user created for SMTP credentials
- ✅ Labels define update policies per container
4.3 Automated Scripts ✅
- ✅ backup.sh: Database + Gitea data backup to S3 bucket
- ✅ health-check.sh: Validates all services running and responsive
- ✅ auto-update.sh: Automatic updates for low-risk containers (nginx)
  - Backup before update
  - Pull new image
  - Recreate container
  - Health check validation
  - Automatic rollback on failure
  - Email notifications
- ✅ manual-update.sh: Manual updates for critical containers (gitea/postgres)
  - Operator confirmation required
  - Same safety flow as auto-update
  - Success/failure notifications
- ✅ test-update.sh: Quality gate for CI/local validation
  - Validates script syntax
  - Checks required functions
  - Verifies control flow logic
  - Tests error handling patterns
  - No live services required
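The safety flow shared by auto-update.sh and manual-update.sh (backup, pull, recreate, health check, rollback, notify) can be sketched as below. This is an illustration only, not the project's code. The function names are hypothetical, `notify` stands in for the SES email step, and rollback by re-tagging the previously running image is an assumed mechanism; the real scripts may do this differently.

```shell
#!/usr/bin/env bash
# Sketch of the update flow: backup, pull, recreate, health check,
# rollback, notify. Everything below is illustrative.

# Hypothetical wrappers; the real scripts may be invoked differently.
run_backup()   { ./backup.sh; }
health_check() { ./health-check.sh; }
notify()       { echo "NOTIFY: $*"; }   # stand-in for the SES email step

# Image ID and image reference of the container currently backing a service.
running_image_id()  { docker inspect --format '{{.Image}}' "$(docker compose ps -q "$1")"; }
running_image_ref() { docker inspect --format '{{.Config.Image}}' "$(docker compose ps -q "$1")"; }

update_service() {
  local svc="$1" prev_id image_ref

  # Remember the running image so it can be restored later.
  prev_id=$(running_image_id "$svc") || return 1
  image_ref=$(running_image_ref "$svc") || return 1

  run_backup || { notify "backup failed, update of $svc aborted"; return 1; }
  docker compose pull "$svc" || { notify "pull failed for $svc"; return 1; }
  docker compose up -d --no-deps "$svc"

  if health_check; then
    notify "$svc updated"
    return 0
  fi

  # Assumed rollback: point the tag back at the old image and recreate.
  docker tag "$prev_id" "$image_ref"
  docker compose up -d --no-deps "$svc"
  notify "update of $svc failed health check, rolled back"
  return 1
}
```

With this shape, a manual variant would only need an operator confirmation prompt before calling `update_service gitea`; the remaining steps stay identical.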
4.4 Cron Jobs ✅
- ✅ Weekly automatic update (nginx only): Sunday 3:15 AM
- ✅ Weekly certificate renewal: Sunday 3:30 AM
- ✅ Daily backups: 2:00 AM
- ✅ All configured via Ansible (setup-cron.yml)
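Expressed as crontab entries, the schedule above would look roughly like the output of this sketch. The script directory `/opt/gitea/scripts` and the wrapper name `renew-certs.sh` are assumptions made for illustration; the entries actually generated by setup-cron.yml may differ.

```shell
#!/usr/bin/env bash
# Sketch: print crontab entries matching the documented schedule.
# Assumptions: scripts live under /opt/gitea/scripts (hypothetical path) and
# renew-certs.sh is a hypothetical wrapper around the certbot renewal.

print_crontab() {
  local dir="${1:-/opt/gitea/scripts}"
  # Fields: minute hour day-of-month month day-of-week command
  # Line 1: daily backup at 2:00 AM
  # Line 2: nginx auto-update, Sunday 3:15 AM
  # Line 3: certificate renewal, Sunday 3:30 AM
  printf '%s\n' \
    "0 2 * * * ${dir}/backup.sh" \
    "15 3 * * 0 ${dir}/auto-update.sh" \
    "30 3 * * 0 ${dir}/renew-certs.sh"
}
```

Running `print_crontab` prints the three entries for review. Piping the output into `crontab -` would replace the current user's crontab, so it is worth checking the result first.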
4.5 Certificate Renewal ✅
- ✅ Automated weekly renewal check via cron
- ✅ Uses certbot container: docker compose run --rm certbot renew
- ✅ Restarts nginx to load new certificates
- ✅ Process is idempotent (safe to run weekly)
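The renewal steps above fit in a few lines of shell. This is a minimal sketch under the assumption that the Compose project directory is passed as an argument; the function name `renew_certificates` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: weekly certificate renewal followed by an nginx restart.
# Assumption: the first argument is the directory holding the Compose file.

renew_certificates() {
  local compose_dir="${1:-.}"
  (
    cd "$compose_dir" || exit 1
    # certbot renew only renews certificates that are close to expiry,
    # which is why running it every week is safe.
    docker compose run --rm certbot renew || exit 1
    # Restart nginx so it loads any renewed certificate files.
    docker compose restart nginx
  )
}
```

The subshell keeps the `cd` from leaking into the caller, and a failed renewal stops the function before nginx is restarted.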
4.6 Testing & Validation ✅
- ✅ Integration tests created (test-update.sh)
- ✅ All scripts tested on live system
- ✅ Cron jobs verified
- ✅ Email notifications tested
- ✅ Diun monitoring confirmed (4 containers)
- ✅ Update workflow diagram created
4.7 CI/CD Implementation ✅
- ✅ Gitea Actions enabled on instance
- ✅ Self-hosted runners deployed (2x act_runner v0.2.10)
- ✅ Runner automation via Ansible (setup-runner.yml)
- ✅ Systemd services for runner management
- ✅ Host networking configuration for job containers
- ✅ CI workflow created (.gitea/workflows/test.yml)
- ✅ Automated testing on pull requests
- ✅ Docker layer caching for performance
- ✅ Artifact upload on test failure
- ✅ Full CI/CD pipeline tested and operational
Goals:
- ✅ Automated update system operational
- ✅ Update process tested and validated on live system
- ✅ Rollback procedure implemented and tested
- ✅ Quality gate for CI/local environments
- ✅ CI/CD pipeline with self-hosted runners
- ✅ Documentation complete (workflow diagram)
Implementation Summary:
- 5 bash scripts following best practices (DRY, error handling, logging)
- Diun monitoring with AWS SES email notifications
- Per-container update policies (automatic: nginx, manual: gitea/postgres)
- Pre-update backups with automatic rollback on failure
- Certificate renewal automation
- Comprehensive testing framework
- CI/CD with Gitea Actions and 2 self-hosted runners
- Visual workflow documentation (including CI/CD flow)
Phase 4 Complete! Update automation and CI/CD fully operational with safety mechanisms.
Phase 5: Backup Strategy Implementation
This phase implements comprehensive backup solutions.
5.1 Backup Concept Document
- Document backup strategy (3-2-1 rule)
- Define backup scope (database, repos, config, etc.)
- Define retention policy
- Define RTO and RPO targets
5.2 Backup Implementation
- Automate database backups
- Automate Gitea data directory backups
- Automate configuration backups
- Set up backup storage (local + remote)
- Implement backup rotation and cleanup
- Schedule automated backups
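Rotation and cleanup for the local backup copies could work along these lines. This is only a sketch for a phase that is still planned: the file name pattern `gitea-backup-YYYYMMDD-HHMMSS.tar.gz`, the default of 7 retained files, and the function name `rotate_backups` are all assumptions. For the remote copies, S3 lifecycle rules can expire old objects without any script.

```shell
#!/usr/bin/env bash
# Sketch: keep only the newest N local backup archives.
# Assumptions: archives are named gitea-backup-YYYYMMDD-HHMMSS.tar.gz so that
# sorting by name equals sorting by age; keeping 7 files is an example value.

rotate_backups() {
  local dir="$1" keep="${2:-7}"
  local total=0 excess f

  # Count the existing archives (the -e test skips an unmatched pattern).
  for f in "$dir"/gitea-backup-*.tar.gz; do
    [ -e "$f" ] && total=$((total + 1))
  done

  excess=$((total - keep))

  # Globs expand in sorted order, so the oldest archives come first.
  for f in "$dir"/gitea-backup-*.tar.gz; do
    [ "$excess" -gt 0 ] || break
    [ -e "$f" ] || continue
    rm -f -- "$f"
    echo "removed $(basename "$f")"
    excess=$((excess - 1))
  done
}
```

For example, `rotate_backups /var/backups/gitea 7` (a hypothetical directory) would delete everything except the seven most recent archives and print one line per removed file.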
5.3 Recovery Testing
- Document restore procedures
- Test database restore
- Test full system restore
- Document recovery time
Goals:
- Automated backup system operational
- Restore procedures tested and documented
- Backup strategy document completed
Phase 6: Monitoring Implementation
This phase implements monitoring for system health and performance.
6.1 Monitoring Concept Document
- Define key metrics to monitor
- Define alerting thresholds
- Define alert channels (email, Slack, etc.)
6.2 Monitoring Setup
- Deploy monitoring solution
- Configure system metrics collection (CPU, RAM, disk, network)
- Configure Gitea-specific metrics
- Configure database metrics
- Set up monitoring dashboards
- Configure alerting rules
6.3 Testing
- Simulate failure scenarios
- Verify alerts trigger correctly
- Validate dashboard accuracy
Goals:
- Monitoring system operational with dashboards
- Alerting configured and tested
- Monitoring concept document completed
Phase 7: Logging Implementation
This phase implements centralized logging for all components.
7.1 Logging Concept Document
- Define logging architecture
- Define log retention policy
- Define log analysis requirements
7.2 Logging Setup
- Deploy centralized logging solution
- Configure Gitea application logging
- Configure reverse proxy access logs
- Configure database logs
- Configure system logs collection
- Set up log parsing and indexing
- Create log search and visualization dashboards
7.3 Testing
- Verify logs are being collected
- Test log search functionality
- Test log-based alerts (if applicable)
Goals:
- Centralized logging operational
- All components sending logs to central system
- Logging concept document completed
Phase 8: Redundancy and High Availability
This phase implements fail-safe operations and redundancy.
8.1 Redundancy Concept Document
- Document single point of failure (SPOF) analysis
- Design HA architecture
- Define failover strategy
- Define acceptable downtime
8.2 Redundancy Implementation (Optional/Simplified)
- Implement database redundancy (replication/clustering) OR document approach
- Implement application redundancy (multiple Gitea instances) OR document approach
- Implement load balancing OR document approach
- Document manual failover procedures
Goals:
- Redundancy concept document completed
- PoC or detailed plan for HA implementation
- Failover procedures documented
Phase 9: Documentation and Final Testing
This phase consolidates all documentation and performs end-to-end testing.
9.1 Documentation
- Create comprehensive README
- Document architecture with diagrams
- Document all procedures (deployment, updates, backup/restore, failover)
- Create runbooks for common scenarios
- Document interview discussion points
9.2 Final Testing
- Perform end-to-end deployment test
- Test all automated processes
- Verify all documentation is accurate
- Test system under load (optional)
9.3 Repository Organization
- Store all code and docs in Gitea repository
- Ensure repository is well-organized
- Add proper README and documentation
Goals:
- Complete documentation package
- All automation tested and validated
- Ready for interview presentation
Phase 10: Interview Preparation
This phase prepares for the interview discussion.
10.1 Preparation
- Review all concept documents
- Prepare to explain technology choices
- Prepare architecture diagrams for presentation
- Prepare to demonstrate the system
- List lessons learned and trade-offs made
- Prepare improvement suggestions
Goals:
- Ready to discuss all aspects of the implementation
- Demo environment functional and accessible
- Confident in technology choices and concepts
Success Criteria
- ✅ Gitea accessible via HTTPS through reverse proxy
- ✅ Installation fully automated and reproducible
- ✅ Automated updates configured and tested
- ✅ Comprehensive concept documents for: Backup, Monitoring, Logging, Redundancy
- ✅ At least one PoC implementation (optional but recommended)
- ✅ All code and documentation in Git repository
- ✅ System accessible to interviewer over internet