Architecture Decision Records (ADR)
This document tracks all significant architectural decisions made during the project, including rationale and trade-offs.
ADR-001: Cloud Provider - AWS
Date: 2026-06-08
Status: Accepted
Decision: Use Amazon Web Services (AWS)
Rationale:
- Industry-standard cloud provider with comprehensive service portfolio
- Access to managed services when beneficial
- Strong ecosystem and community support
- Terraform has excellent AWS provider support
ADR-002: Infrastructure as Code - Terraform
Date: 2026-06-08
Status: Accepted
Decision: Use Terraform for infrastructure provisioning
Rationale:
- Declarative approach (aligns with project philosophy)
- Industry standard for cloud infrastructure
- Excellent AWS provider
- State management enables reproducibility
Scope: VPC, EC2, Security Groups, S3, Route 53
ADR-003: Configuration Management - Ansible
Date: 2026-06-08
Status: Accepted
Decision: Use Ansible for system configuration (kept minimal)
Rationale:
- Avoids user-data scripts, which proved hard to debug in past experience
- Idempotent - can re-run if setup fails
- Real-time output visibility via SSH
- Professional separation of concerns: Terraform (infra) → Ansible (config) → Docker (apps)
Scope: Install Docker, configure system basics, set up the firewall
Philosophy: Keep Ansible simple - no fancy roles or complexity
Alternative Considered: User-data scripts - rejected due to debugging difficulty and one-shot nature
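The idempotency argument above can be illustrated with a minimal Python sketch. It is not taken from the project's playbooks; the function name `ensure_line` and the file `example.conf` are hypothetical. The point is only that an idempotent step checks the current state first, so re-running it after a failed setup is safe.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in `path`; return True only if the file changed."""
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False  # desired state already reached, so do nothing
    # Keep existing content and add a newline first if the file lacks a trailing one.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + line + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "example.conf"  # hypothetical config file
        print(ensure_line(target, "setting = on"))  # first run changes the file
        print(ensure_line(target, "setting = on"))  # re-run is a no-op
```

A one-shot user-data script would typically append the line on every run; the check-then-change pattern is what makes a re-run harmless.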
ADR-004: Application Deployment - Docker + Docker Compose
Date: 2026-06-08
Status: Accepted
Decision: Use Docker with Docker Compose for application orchestration
Rationale:
- Fully declarative (docker-compose.yml)
- Easy to test locally (dev/prod parity)
- Simple version control and updates
- Gitea has official Docker images
- Portable and reproducible
Scope: Gitea, nginx, PostgreSQL, monitoring stack (later)
ADR-005: Database - Self-Hosted PostgreSQL in Docker
Date: 2026-06-08
Status: Accepted
Decision: PostgreSQL container, not RDS
Rationale:
- Simpler architecture (everything in docker-compose.yml)
- Shows ability to build and manage backups ourselves
- More control over configuration
- Cost-effective
- PostgreSQL is Gitea's recommended database
Trade-offs:
- Pros: Greater control, cost-effective, simpler architecture
- Cons: Requires custom backup automation and testing
Backup Strategy: Custom scripts with pg_dump to S3 (detailed in backup phase)
Future Consideration: For higher availability requirements or larger scale, RDS would provide managed backups, point-in-time recovery, and Multi-AZ deployment
ADR-006: Reverse Proxy - Nginx
Date: 2026-06-08
Status: Accepted
Decision: Nginx as reverse proxy
Rationale:
- Lightweight and performant
- Simple configuration for basic proxying
- Industry standard
- Works well in Docker
Scope: SSL termination, proxy to Gitea, HTTP→HTTPS redirect
ADR-007: SSL Certificates - Let's Encrypt
Date: 2026-06-08 (Updated 2026-06-11)
Status: Accepted
Decision: Let's Encrypt with certbot
Rationale:
- Free, automated, trusted certificates
- Trusted by all major browsers (no certificate warnings)
- Auto-renewal reduces operational burden
- Industry-standard solution for SSL/TLS
Requirement: Valid domain name pointing to server
Domain: git.poll-streams.com (changed from gitea.poll-streams.com)
Implementation Note: Initially hit the Let's Encrypt rate limit of 5 certificates per week for the same set of hostnames. Resolved by switching to a new hostname (git.poll-streams.com), which had no issuance history counted against it, allowing immediate production certificate issuance. Production certificates were obtained successfully.
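The rate-limit arithmetic can be sketched as follows. This is a simplified sliding-window model written for illustration, assuming the limit of 5 per week is counted per hostname; it is not how Let's Encrypt implements its limits, and the function name `issuances_left` is hypothetical.

```python
from datetime import datetime, timedelta

WEEKLY_LIMIT = 5            # limit quoted in this ADR
WINDOW = timedelta(days=7)  # "per week", modelled as a sliding window (assumption)


def issuances_left(history, hostname, now):
    """Count how many more certificates `hostname` could get at time `now`.

    `history` is a list of (hostname, issued_at) tuples.
    """
    recent = [
        issued_at
        for name, issued_at in history
        if name == hostname and now - issued_at < WINDOW
    ]
    return max(0, WEEKLY_LIMIT - len(recent))


if __name__ == "__main__":
    now = datetime(2026, 6, 11, 12, 0)
    # Five issuances for the old hostname within the last week.
    history = [("gitea.poll-streams.com", now - timedelta(days=d)) for d in range(5)]
    print(issuances_left(history, "gitea.poll-streams.com", now))
    print(issuances_left(history, "git.poll-streams.com", now))
```

Under this model the old hostname has no issuances left until the oldest entry ages out of the window, while a hostname with no history starts with the full allowance.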
ADR-008: Update Automation - Diun + Custom Scripts
Date: 2026-06-08 (Updated 2026-06-09)
Status: Accepted
Decision: Diun (Docker Image Update Notifier) for monitoring + custom bash scripts for orchestration
Rationale:
- Diun monitors for updates and sends email notifications (built-in)
- Enables differentiated update policies per container
- Custom scripts provide full control over update workflow
- Supports pre-update backups and health checks
- Allows manual approval for critical components (Gitea, PostgreSQL)
- Auto-update for low-risk components (nginx, certbot)
- Demonstrates production-level engineering (not just "update everything")
Update Strategy:
- Schedule: Weekly checks during off-hours
- Nginx/Certbot: Automatic updates after backup
- Gitea/PostgreSQL: Email notification, manual approval required
- Backup: Pre-update backup to S3 (database + Gitea data)
- Health Checks: Post-update validation
- Rollback: Automatic rollback on health check failure
- Notifications: Email alerts on critical failures, logs for successful updates
Scope:
- Diun container monitors all Docker images
- auto-update.sh - automated update for nginx/certbot
- manual-update.sh - operator-approved update for gitea/postgres
- Health check and rollback logic
Alternative Considered: Watchtower - rejected because it lacks per-container policies, pre-update backups, and proper notification support
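The update workflow described in this ADR (per-container policy, pre-update backup, health check, rollback) can be sketched in Python. The actual scripts are bash and are not shown in this document, so everything here is illustrative: the `POLICIES` table, the `run_update` function, its return strings, and the choice to treat unknown services as manual are all assumptions.

```python
from typing import Callable

AUTO, MANUAL = "auto", "manual"

# Policy per container, as listed under "Update Strategy".
POLICIES = {"nginx": AUTO, "certbot": AUTO, "gitea": MANUAL, "postgres": MANUAL}


def run_update(
    service: str,
    approved: bool,
    backup: Callable[[], None],
    update: Callable[[], None],
    healthy: Callable[[], bool],
    rollback: Callable[[], None],
) -> str:
    """Apply one update according to the service's policy and report the outcome."""
    policy = POLICIES.get(service, MANUAL)  # unknown services default to manual (assumption)
    if policy == MANUAL and not approved:
        return "awaiting-approval"  # operator has only been notified so far
    backup()   # pre-update backup always runs before any change
    update()
    if healthy():
        return "updated"
    rollback()  # health check failed, restore the previous version
    return "rolled-back"
```

Passing the backup, update, health-check, and rollback steps in as callables keeps the ordering logic testable without touching Docker or S3.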
ADR-012: CI/CD - Gitea Actions with Self-Hosted Runners
Date: 2026-06-11
Status: Accepted
Decision: Use Gitea Actions with self-hosted runners for CI/CD
Rationale:
- Native integration with Gitea (no external CI service)
- Self-hosted runners provide full control and security
- GitHub Actions-compatible workflow syntax (familiar, well-documented)
- Enables automated testing before merging changes
- Demonstrates production-grade CI/CD practices
Implementation:
- Runners: 2x act_runner v0.2.10 instances as systemd services
- Automation: Ansible playbook (setup-runner.yml) for reproducible deployment
- Runner Registration: Automated via Gitea API with token from AWS Secrets Manager
- Networking: Host network mode for job containers to access Gitea
- Registration URL: https://git.poll-streams.com (public URL for git clone operations)
- Workflow: .gitea/workflows/test.yml runs integration tests on PRs
- Features: Docker layer caching, artifact uploads, workflow_dispatch support
Technical Details:
- Each runner has dedicated config directory (/etc/act_runner-{1,2})
- Configuration includes host networking to allow job containers to reach services
- Runners registered with public URL to avoid localhost connection issues
- Systemd manages runner lifecycle with automatic restart
Benefits:
- Automated quality gates before merging
- Consistent test environment (matches CI exactly)
- Fast feedback on code changes
- Self-contained solution (no external dependencies)
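As a rough illustration of the runner layout described under Technical Details, the following Python sketch renders one systemd unit per runner. The real deployment uses the Ansible playbook, which is not shown here. The binary path, the `config.yaml` file name, the unit names, and the `Restart=always` setting are assumptions; only the `/etc/act_runner-{1,2}` directories and the use of systemd with automatic restart come from this ADR.

```python
UNIT_TEMPLATE = """\
[Unit]
Description=Gitea Actions runner {index}
After=network-online.target docker.service
Wants=network-online.target

[Service]
ExecStart={binary} daemon --config {config_dir}/config.yaml
WorkingDirectory={config_dir}
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
"""


def render_unit(index: int, binary: str = "/usr/local/bin/act_runner") -> str:
    """Render the unit file text for runner number `index`."""
    config_dir = f"/etc/act_runner-{index}"  # dedicated config directory per runner
    return UNIT_TEMPLATE.format(index=index, binary=binary, config_dir=config_dir)


# One unit per runner; the unit file names are hypothetical.
units = {f"act_runner-{i}.service": render_unit(i) for i in (1, 2)}
```

Giving each runner its own directory and unit lets systemd restart one instance without affecting the other.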
ADR-009: Monitoring - Prometheus + Grafana
Date: 2026-06-08
Status: Accepted (implementation later)
Decision: Prometheus for metrics, Grafana for visualization
Rationale:
- Industry standard monitoring stack
- Powerful querying with PromQL
- Rich visualization and alerting capabilities
- Strong community and pre-built dashboards
Note: To be implemented in a later phase
ADR-010: Logging - Loki + Promtail
Date: 2026-06-08
Status: Accepted (implementation later)
Decision: Loki for log aggregation, Promtail for collection
Rationale:
- Lightweight compared to ELK stack
- Integrates with Grafana (single pane of glass)
- Good fit for Docker environments
Note: To be implemented in a later phase
ADR-011: Backup Strategy - Custom Scripts + S3
Date: 2026-06-08
Status: Accepted (implementation later)
Decision: Bash scripts with pg_dump and AWS S3
Rationale:
- Simple and maintainable
- Full control over backup process and scheduling
- S3 is designed for 99.999999999% (11 nines) object durability
- Easy to test and validate restore procedures
Scope:
- Database backups (pg_dump)
- Gitea repository data
- Configuration files
- Automated scheduling with cron
Note: Details to be designed in the backup phase
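Since the details are still to be designed, the following is only a sketch of how a database backup step might be assembled, written in Python rather than the planned bash. The key layout, the `/tmp` staging directory, and the function names `backup_key` and `backup_commands` are assumptions; the sketch builds the `pg_dump` and `aws s3 cp` command lines without running them.

```python
from datetime import datetime, timezone


def backup_key(database: str, now: datetime, prefix: str = "postgres") -> str:
    """Build a timestamped S3 object key so backups sort chronologically."""
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{prefix}/{database}/{database}-{stamp}.dump"


def backup_commands(database: str, bucket: str, now: datetime, local_dir: str = "/tmp"):
    """Return the dump and upload commands as argument lists (nothing is executed)."""
    key = backup_key(database, now)
    local_path = f"{local_dir}/{key.rsplit('/', 1)[-1]}"
    # Custom format is compressed and can be restored selectively with pg_restore.
    dump = ["pg_dump", "--format=custom", f"--file={local_path}", database]
    upload = ["aws", "s3", "cp", local_path, f"s3://{bucket}/{key}"]
    return dump, upload
```

Returning argument lists keeps the step easy to test; a wrapper could pass each list to `subprocess.run(..., check=True)` so a failed dump stops the upload.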
Technology Stack Summary
| Layer | Technology | Rationale |
|---|---|---|
| Cloud | AWS | Industry standard |
| Infrastructure | Terraform | Declarative IaC |
| Configuration | Ansible (minimal) | System setup, avoids user-data |
| Compute | EC2 | Flexible VM hosting |
| Application | Docker Compose | Declarative orchestration |
| Database | PostgreSQL (Docker) | Self-managed, shows control |
| Reverse Proxy | Nginx | Lightweight, standard |
| SSL | Let's Encrypt | Free, automated, professional |
| DNS | Route 53 | AWS-native |
| Updates | Diun + Scripts | Per-container policies, backup/rollback |
| CI/CD | Gitea Actions | Self-hosted runners, native integration |
| Backups | Scripts + S3 | Custom, controlled |
| Monitoring | Prometheus + Grafana | Industry standard |
| Logging | Loki + Promtail | Lightweight, integrated |
Core Principles
- Simplicity First: Avoid overengineering
- Declarative Over Imperative: Terraform, Docker Compose
- Infrastructure as Code: Everything version-controlled
- Show Control: Build things ourselves where it demonstrates skill
- Professional: Production-grade practices