Self-Hosting Mastery

v1.0.0

name: afrexai-self-hosting-mastery
description: Complete self-hosting and homelab operating system. Deploy, secure, monitor, and maintain self-hosted services with production-grade reliability. Use when setting up home servers, Docker infrastructure, reverse proxies, backups, monitoring, or evaluating self-hosted alternatives to SaaS.

Self-Hosting Mastery

Complete system for building and operating reliable self-hosted infrastructure — from first server to multi-node homelab.

Phase 1: Infrastructure Assessment

Server Profile YAML

server_profile:
  name: ""
  hardware:
    cpu: ""              # e.g., "Intel i5-12400" or "Raspberry Pi 5"
    ram_gb: 0
    storage:
      - device: ""       # e.g., "/dev/sda"
        type: ""         # ssd | hdd | nvme
        size_gb: 0
        role: ""         # boot | data | backup
    network: ""          # 1gbe | 2.5gbe | 10gbe
  os: ""                 # debian | ubuntu | proxmox | unraid | truenas
  location: ""           # home | closet | rack | colo | vps
  power:
    ups: false
    wattage_idle: 0
    wattage_load: 0
    monthly_cost_estimate: ""  # electricity
  network:
    public_ip: ""        # static | dynamic | cgnat
    domain: ""
    dns_provider: ""     # cloudflare | duckdns | custom
    isp_ports_open: true # some ISPs block 80/443
  goals:
    - ""                 # media server, smart home, dev environment, etc.
  budget_monthly: ""     # electricity + domain + any VPS
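
The wattage fields are what feed monthly_cost_estimate. A rough sketch of that arithmetic follows, assuming a constant draw around the clock and a 30-day month. The function name and the example price of 0.30 per kWh are illustrative and not part of the profile.

```shell
#!/bin/sh
# Estimate monthly electricity cost from average draw in watts.
# Assumptions (not from the profile above): 24 h/day, 30 days/month,
# and a flat price per kWh in your local currency.
monthly_power_cost() {
    watts=$1
    price_per_kwh=$2
    awk -v w="$watts" -v p="$price_per_kwh" \
        'BEGIN { printf "%.2f\n", w / 1000 * 24 * 30 * p }'
}

# Example: a mini PC idling at 15 W with electricity at 0.30 per kWh
monthly_power_cost 15 0.30   # prints 3.24
```

Run it once with wattage_idle and once with wattage_load to get a lower and upper bound for the estimate.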

Hardware Decision Matrix

| Budget | RAM | Storage | Good For | Example Hardware |
| --- | --- | --- | --- | --- |
| $0 | 4-8GB | 64GB+ | Pi-hole, AdGuard, small tools | Raspberry Pi 4/5 |
| $50-150 | 8-16GB | 256GB+ | Docker host, 5-10 services | Used SFF PC (Dell Optiplex, Lenovo Tiny) |
| $150-400 | 16-32GB | 1TB+ | NAS + services, media server | Mini PC (Intel NUC, Beelink) |
| $400-800 | 32-64GB | 4TB+ | Full homelab, VMs + containers | Used enterprise (Dell R720, HP DL380) |
| $800+ | 64GB+ | 10TB+ | Multi-node, Proxmox cluster | Multiple nodes, dedicated NAS |

Self-Host vs SaaS Decision

Ask before self-hosting anything:

  1. Data sensitivity — Does keeping data local matter? (passwords, health, finance = yes)
  2. Reliability need — Can you tolerate occasional downtime? (email = risky, media = fine)
  3. Maintenance budget — Do you have 2-4 hours/month for updates?
  4. Skill level — Can you debug Docker/networking issues?
  5. Cost comparison — Is the SaaS < $10/mo? Often not worth self-hosting for trivial savings.

Always self-host: Password manager, DNS/ad-blocking, VPN, bookmarks, notes
Usually self-host: Media server, file sync, photo backup, monitoring, git
Think twice: Email (deliverability hell), calendar (sync complexity), chat (uptime expectations)
Rarely worth it: Search engine (resource hungry), social media (no network effect)


Phase 2: OS & Virtualization

OS Selection Guide

| OS | Best For | Learning Curve | Notes |
| --- | --- | --- | --- |
| Debian 12 | Docker-only host | Low | Stable, minimal, just works |
| Ubuntu Server 24.04 | Beginners, wide docs | Low | More packages, snap controversy |
| Proxmox VE | VMs + containers | Medium | Free, enterprise features, ZFS |
| Unraid | NAS + Docker + VMs | Medium | $59-129, great UI, parity array |
| TrueNAS Scale | ZFS NAS + Docker | Medium | Free, ZFS-first, apps improving |
| NixOS | Reproducible configs | High | Declarative, steep learning curve |

Proxmox Quick Setup

# Post-install essentials
# 1. Remove enterprise repo (if no subscription)
#    "bookworm" matches Proxmox VE 8; use the Debian codename of your release
sed -i 's/^deb/#deb/' /etc/apt/sources.list.d/pve-enterprise.list
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" > /etc/apt/sources.list.d/pve-no-subscription.list
# Proxmox expects full-upgrade (dist-upgrade), not plain "apt upgrade"
apt update && apt full-upgrade -y

# 2. Create a Docker LXC (lightweight container)
# Download template: Datacenter → Storage → CT Templates → Download → debian-12
# Create CT: 2 cores, 2GB RAM, 32GB disk, bridge vmbr0
# Enable Options → Features → nesting (plus keyctl for unprivileged CTs) so Docker can run
# Inside CT: install Docker
apt install -y curl
curl -fsSL https://get.docker.com | sh

# 3. Enable IOMMU for GPU passthrough (if needed)
# Edit /etc/default/grub: GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
# (Intel CPUs; on AMD the IOMMU is typically enabled by default)
# update-grub && reboot

VM vs LXC vs Docker Decision

| Factor | VM | LXC | Docker |
| --- | --- | --- | --- |
| Isolation | Full (own kernel) | Partial (shared kernel) | Process-level |
| Overhead | High (1-2GB base) | Low (50-200MB) | Minimal |
| Use when | Different OS, GPU passthrough, untrusted workloads | Dedicated service host, ZFS datasets | Most services |
| Avoid when | RAM-constrained | Need Windows, custom kernel | Stateful databases (use LXC/VM) |

Rule: Docker for 90% of services. LXC for Docker hosts or isolated environments. VM for Windows, different kernel needs, or GPU passthrough.


Phase 3: Docker Infrastructure

Docker Compose Project Structure

/opt/stacks/           # or ~/docker/
├── traefik/
│   ├── docker-compose.yml
│   ├── .env
│   ├── config/
│   │   └── traefik.yml
│   └── data/
│       ├── acme.json          # chmod 600
│       └── dynamic/
├── monitoring/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
├── media/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
├── productivity/
│   ├── docker-compose.yml
│   ├── .env
│   └── config/
└── scripts/
    ├── backup.sh
    ├── update-all.sh
    └── health-check.sh

Docker Compose Best Practices

# Template: production-grade service
services:
  app:
    image: vendor/app:1.2.3           # ALWAYS pin version
    container_name: app               # Explicit name
    restart: unless-stopped           # Auto-restart
    networks:
      - proxy                         # Traefik network
      - internal                      # Backend network
    volumes:
      - ./config:/config              # Bind mount for config
      - app-data:/data                # Named volume for data
    environment:
      - TZ=Europe/London              # Always set timezone
      - PUID=1000                     # Match host user
      - PGID=1000
    env_file:
      - .env                          # Secrets in .env (gitignored)
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.app.rule=Host(`app.example.com`)"
      - "traefik.http.routers.app.tls.certresolver=letsencrypt"
      - "traefik.http.services.app.loadbalancer.server.port=8080"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]  # curl must exist inside the image
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          memory: 512M               # Prevent OOM cascades
    security_opt:
      - no-new-privileges:true        # Security hardening
    read_only: true                   # Where possible
    tmpfs:
      - /tmp

volumes:
  app-data:

networks:
  proxy:
    external: true
  internal:

Docker Security Checklist

  • Pin all image versions (never :latest in production)
  • Set restart: unless-stopped on all services
  • Use .env files for secrets (never hardcode in compose)
  • Set memory limits on all containers
  • Use security_opt: no-new-privileges:true
  • Use read_only: true where possible + tmpfs for /tmp
  • Create separate Docker networks per stack
  • Never expose database ports to 0.0.0.0
  • Run containers as non-root (PUID/PGID or user:)
  • Enable Docker content trust: export DOCKER_CONTENT_TRUST=1
  • Prune unused images monthly: docker system prune -af (volumes are only pruned when you add --volumes; check what is unused first)
  • Use named volumes (not anonymous) for all persistent data
  • Set TZ environment variable on every container
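
The first checklist item is easy to check mechanically. The sketch below scans a compose file for image references that carry no tag or use :latest. The function name find_unpinned_images is hypothetical, and the parser is a simple line match that assumes the usual `image: name:tag` form on one line rather than full YAML parsing.

```shell
#!/bin/sh
# find_unpinned_images FILE
# Prints every "image:" value that has no tag or ends in :latest.
# The tag is looked for only after the last "/", so a registry port
# such as localhost:5000/app is not mistaken for a tag.
find_unpinned_images() {
    awk '
        $1 == "image:" {
            img = $2
            n = split(img, parts, "/")
            last = parts[n]
            if (last !~ /:/ || last ~ /:latest$/) print img
        }
    ' "$1"
}

# Usage (path follows the /opt/stacks layout shown earlier):
#   find_unpinned_images /opt/stacks/media/docker-compose.yml
```

Empty output means every image in that file is pinned to a tag or digest.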

Phase 4: Reverse Proxy & SSL

Reverse Proxy Selection

| Proxy | Best For | SSL | Config Style | Learning Curve |
| --- | --- | --- | --- | --- |
| Traefik | Docker-native, auto-discovery | Auto (ACME) | Labels + YAML | Medium |
| Caddy | Simplicity, auto-SSL | Auto (built-in) | Caddyfile | Low |
| Nginx Proxy Manager | GUI preference | Auto (UI) | Web UI | Very Low |
| Nginx (manual) | Maximum control | Manual/certbot | Config files | High |

Recommendation: Traefik for Docker power users. Caddy for simplicity. Nginx Proxy Manager (NPM) for beginners.

Traefik Production Config

# traefik/config/traefik.yml
api:
  dashboard: true
  insecure: false

entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure
          scheme: https
  websecure:
    address: ":443"
    http:
      tls:
        certResolver: letsencrypt

certificatesResolvers:
  letsencrypt:
    acme:
      email: you@example.com
      storage: /data/acme.json
      # Use DNS challenge if ISP blocks port 80
      # dnsChallenge:
      #   provider: cloudflare
      httpChallenge:
        entryPoint: web

providers:
  docker:
    exposedByDefault: false    # Explicit opt-in per service
    network: proxy
  file:
    directory: /data/dynamic
    watch: true

log:
  level: WARN

accessLog:
  filePath: /data/access.log
  bufferingSize: 100

Cloudflare Tunnel (Zero Port Forwarding)

For CGNAT connections or ISPs that block ports: expose services without opening any inbound firewall ports.

# cloudflared/docker-compose.yml
services:
  cloudflared:
    image: cloudflare/cloudflared:2024.1.0
    container_name: cloudflared
    restart: unless-stopped
    command: tunnel run
    environment:
      - TUNNEL_TOKEN=${CF_TUNNEL_TOKEN}
    networks:
      - proxy

networks:
  proxy:
    external: true                    # Shared network that Traefik also joins

When to use Cloudflare Tunnel vs port forwarding:

  • CGNAT (no public IP) → Tunnel (port forwarding cannot work; a VPN such as Tailscale covers private access)
  • ISP blocks 80/443 → Tunnel or DNS challenge + non-standard ports
  • Security-first → Tunnel (no open ports)
  • Performance-first → Direct (lower latency)
  • Private access only (LAN plus your own devices) → Neither (use Tailscale/WireGuard)

Phase 5: Essential Services Stack

Tier 1 — Deploy First (Foundation)

| Service | Purpose | Image | RAM | Notes |
| --- | --- | --- | --- | --- |
| Traefik/Caddy | Reverse proxy + SSL | traefik:v3.0 | 64MB | Gateway to everything |
| Pi-hole/AdGuard | DNS + ad blocking | pihole/pihole | 128MB | Network-wide ad blocking |
| Authelia/Authentik | SSO + 2FA | authelia/authelia | 128MB | Protect services without built-in auth |
| Uptime Kuma | Monitoring | louislam/uptime-kuma | 128MB | Know when things break |
| Watchtower | Auto-updates | containrrr/watchtower | 32MB | Optional; some prefer manual |

Tier 2 — Core Services

| Service | Purpose | Alternative | RAM |
| --- | --- | --- | --- |
| Vaultwarden | Password manager | Bitwarden | 64MB |
| Nextcloud | File sync + office | Seafile (lighter) | 512MB |
| Immich | Photo backup | PhotoPrism | 1-4GB |
| Jellyfin | Media server | Plex (less free) | 512MB-2GB |
| Paperless-ngx | Document management | - | 256MB |
| Home Assistant | Smart home | - | 512MB |

Tier 3 — Power User

| Service | Purpose | RAM |
| --- | --- | --- |
| Gitea/Forgejo | Git hosting | 256MB |
| n8n | Workflow automation | 256MB |
| Grafana + Prometheus | Metrics & dashboards | 512MB |
| Tandoor | Recipe management | 256MB |
| Mealie | Meal planning | 128MB |
| Linkwarden/Hoarder | Bookmark manager | 256MB |
| Stirling PDF | PDF tools | 512MB |
| IT-Tools | Developer utilities | 64MB |

RAM Planning

Total RAM needed ≈ OS base (1-2GB) + sum of service RAM + 20% headroom
Example 16GB server:
  OS + Docker:     2 GB
  Traefik:         0.1 GB
  Pi-hole:         0.1 GB
  Authelia:        0.1 GB
  Uptime Kuma:     0.1 GB
  Vaultwarden:     0.1 GB
  Nextcloud:       0.5 GB
  Immich:          2.0 GB
  Jellyfin:        1.0 GB
  Paperless:       0.3 GB
  Home Assistant:  0.5 GB
  ──────────────────────
  Total:           6.8 GB → 8.2 GB with headroom
  Available:       ~7.8 GB free for more services
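
The same calculation can be scripted so it is easy to rerun when services change. This is a minimal sketch of the formula above. The function name ram_plan and its argument order are assumptions made for illustration, and the per-service figures are the estimates from the tables rather than measured values.

```shell
#!/bin/sh
# ram_plan TOTAL_GB OS_GB SERVICE_GB...
# Prints RAM used, RAM used plus 20% headroom, and what remains of TOTAL_GB.
ram_plan() {
    total=$1
    shift
    # Remaining arguments: OS base followed by each service estimate (GB)
    printf '%s\n' "$@" | awk -v total="$total" '
        { used += $1 }
        END {
            planned = used * 1.2          # 20% headroom
            printf "used=%.1f planned=%.1f free=%.1f\n", used, planned, total - planned
        }'
}

# The 16 GB example: OS + Docker at 2 GB, then the ten services in order
ram_plan 16 2 0.1 0.1 0.1 0.1 0.1 0.5 2.0 1.0 0.3 0.5
# prints: used=6.8 planned=8.2 free=7.8
```

Once the stack is running, replace the estimates with figures from docker stats for a more realistic plan.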

Phase 6: Networking & DNS

DNS Architecture

Internet → Cloudflare DNS → Your Public IP → Router → Server
                                                        ↓
                                             Reverse Proxy (Traefik)
                                                        ↓
                                     ┌──────────────────┼──────────────────┐
                                     ↓                  ↓                  ↓
                                app.domain.com   files.domain.com   media.domain.com

Split DNS (Access Services Locally Without Hairpin NAT)

# Pi-hole/AdGuard: Local DNS rewrites
# Point *.home.example.com → 192.168.1.100 (server LAN IP)
# External: Cloudflare points to public IP
# Result: LAN traffic stays local, external goes through internet

VPN for Remote Access

| Solution | Type | Best For | Complexity |
| --- | --- | --- | --- |
| Tailscale | Mesh VPN | Easiest setup, multi-device | Very Low |
| WireGuard | Point-to-point | Performance, full control | Medium |
| Headscale | Self-hosted Tailscale | Privacy, no vendor lock | Medium-High |

Recommendation: Start with Tailscale (free for 3 users). Move to Headscale when you want full control.

Firewall Rules (UFW)

# Default deny incoming
ufw default deny incoming
ufw default allow outgoing

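# Allow SSH (non-default port; must match Port in sshd_config)
ufw allow 2222/tcp comment 'SSH'

# Allow HTTP/HTTPS for reverse proxy
# Note: ports published by Docker bypass UFW because Docker writes its own iptables rules.
# Leave internal services unpublished or bind them to 127.0.0.1.
ufw allow 80/tcp comment 'HTTP redirect'
ufw allow 443/tcp comment 'HTTPS'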

# Allow local network for discovery
ufw allow from 192.168.1.0/24 comment 'LAN'

# Enable
ufw enable

Phase 7: Backup Strategy

3-2-1 Rule Implementation

3 copies:  Live data + Local backup + Remote backup
2 media:   SSD/HDD (server) + External drive or NAS
1 offsite: Cloud (Backblaze B2, Wasabi) or second location

Backup Script Template

#!/bin/bash
# /opt/stacks/scripts/backup.sh
set -euo pipefail

BACKUP_DIR="/mnt/backup/docker"
STACKS_DIR="/opt/stacks"
DATE=$(date +%Y-%m-%d_%H%M)
RETENTION_DAYS=30

log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1"; }

# 1. Stop services that need consistent backups
#    The EXIT trap restarts the database even if a later step fails under set -e
start_db() { (cd "$STACKS_DIR/productivity" && docker compose start db); }
trap start_db EXIT
log "Stopping database services..."
(cd "$STACKS_DIR/productivity" && docker compose stop db)

# 2. Backup Docker volumes
log "Backing up volumes..."
mkdir -p "$BACKUP_DIR/volumes" "$BACKUP_DIR/configs"   # tar fails if these are missing
for vol in $(docker volume ls -q); do
    docker run --rm \
        -v "$vol":/source:ro \
        -v "$BACKUP_DIR/volumes":/backup \
        alpine tar czf "/backup/${vol}_${DATE}.tar.gz" -C /source .
done

# 3. Backup compose files and configs
log "Backing up configs..."
tar czf "$BACKUP_DIR/configs/stacks_${DATE}.tar.gz" \
    --exclude='*.log' \
    --exclude='node_modules' \
    "$STACKS_DIR"

# 4. Restart services
log "Restarting services..."
start_db
trap - EXIT

# 5. Cleanup old backups
log "Cleaning up backups older than ${RETENTION_DAYS} days..."
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +$RETENTION_DAYS -delete

# 6. Sync to remote (Backblaze B2 example)
# rclone sync "$BACKUP_DIR" b2:my-backups/docker/ --transfers 4

# 7. Verify
BACKUP_SIZE=$(du -sh "$BACKUP_DIR" | cut -f1)
log "Backup complete. Total size: $BACKUP_SIZE"

# 8. Send notification (optional)
# curl -s "https://ntfy.sh/my-backups" -d "Backup complete: $BACKUP_SIZE"

Backup Schedule

| What | Frequency | Retention | Method |
| --- | --- | --- | --- |
| Docker volumes | Daily 3 AM | 30 days | Script + cron |
| Compose files + configs | Daily 3 AM | 90 days | Script + cron |
| Database dumps | Every 6 hours | 7 days | pg_dump/mysqldump |
| Full disk image | Monthly | 3 months | Clonezilla/dd |
| Offsite sync | Daily 5 AM | 60 days | rclone to B2/Wasabi |

Backup Verification (Monthly)

  • Pick a random backup from last week
  • Restore to a test VM/container
  • Verify data integrity (check file counts, DB row counts)
  • Time the restore process (document RTO)
  • Log results in backup-verification.md
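
Before a full restore test, a quick structural check catches truncated or corrupt archives. The sketch below assumes the .tar.gz files produced by the backup script. The function name verify_archive and the minimum-entry threshold are illustrative, and the check does not replace restoring the data and inspecting it.

```shell
#!/bin/sh
# verify_archive ARCHIVE MIN_ENTRIES
# Succeeds only if the gzipped tar can be read end to end and lists at
# least MIN_ENTRIES entries. This checks structure, not application data.
verify_archive() {
    archive=$1
    min_entries=$2
    # Listing forces tar to read the whole archive; corruption makes it fail
    listing=$(tar tzf "$archive" 2>/dev/null) || return 1
    count=$(printf '%s\n' "$listing" | grep -c . || true)
    [ "$count" -ge "$min_entries" ]
}

# Usage (file name is a placeholder for one of your backups):
#   verify_archive /mnt/backup/docker/volumes/<volume>_<date>.tar.gz 10 \
#       && echo "archive readable" || echo "archive FAILED"
```

A passing result only shows the archive is intact; row counts and file counts still need the restore step from the checklist.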

Phase 8: Monitoring & Alerting

Monitoring Stack (Docker Compose)

# monitoring/docker-compose.yml
services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    restart: unless-stopped
    volumes:
      - uptime-data:/app/data
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.uptime.rule=Host(`status.example.com`)"

  prometheus:
    image: prom/prometheus:v2.49.0
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./config/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.retention.time=30d'

  grafana:
    image: grafana/grafana:10.3.0
    container_name: grafana
    restart: unless-stopped
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}

  node-exporter:
    image: prom/node-exporter:v1.7.0
    container_name: node-exporter
    restart: unless-stopped
    pid: host
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.49.0
    container_name: cadvisor
    restart: unless-stopped
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro

volumes:
  uptime-data:
  prometheus-data:
  grafana-data:

Alert Rules

| Metric | Warning | Critical | Action |
| --- | --- | --- | --- |
| Disk usage | >80% | >90% | Cleanup or expand |
| RAM usage | >85% | >95% | Identify memory leak, add RAM |
| CPU sustained | >80% 5min | >95% 5min | Check runaway process |
| Container restart | >2/hour | >5/hour | Check logs, fix root cause |
| SSL cert expiry | <14 days | <3 days | Renew cert |
| Backup age | >26 hours | >48 hours | Check backup script/cron |
| Service down | >2 min | >10 min | Investigate, restart |
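
For hosts without a full Prometheus setup, the thresholds can be applied from a cron job. This is a small sketch for the "higher is worse" rows (disk, RAM, backup age). The function name classify is hypothetical, values must be integers, and SSL expiry would need the comparison inverted because there a lower number is worse.

```shell
#!/bin/sh
# classify VALUE WARN CRIT -> prints ok, warning, or critical.
# Matches the table semantics: strictly greater than the threshold.
classify() {
    value=$1 warn=$2 crit=$3
    if [ "$value" -gt "$crit" ]; then
        echo critical
    elif [ "$value" -gt "$warn" ]; then
        echo warning
    else
        echo ok
    fi
}

# Disk usage of / as an integer percentage (POSIX df output format)
disk_pct=$(df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }')
echo "disk: $(classify "$disk_pct" 80 90)"
```

The same function covers backup age when given hours since the last archive, for example classify 27 26 48.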

Notification Channels

| Channel | Service | Best For |
| --- | --- | --- |
| Push notification | ntfy (self-hostable; ntfy.sh is the public instance) | Mobile alerts |
| Chat | Discord/Slack webhook | Team alerts |
| Email | Uptime Kuma built-in | Formal notifications |
| Dashboard | Grafana + Uptime Kuma | Visual monitoring |

Phase 9: Security Hardening

Server Hardening Checklist

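# 1. SSH hardening
# /etc/ssh/sshd_config (comments kept on their own lines)
# Change default port (must match the UFW rule)
Port 2222
# No root SSH
PermitRootLogin no
# Key-only
PasswordAuthentication no
MaxAuthTries 3
AllowUsers yourusername
# Validate, then apply (keep your current session open while testing):
# sshd -t && systemctl restart ssh

# 2. Install fail2ban
# Point the sshd jail at port 2222 in /etc/fail2ban/jail.local
apt install fail2ban -y
systemctl enable fail2ban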

# 3. Automatic security updates
apt install unattended-upgrades -y
dpkg-reconfigure -plow unattended-upgrades

# 4. Disable unused services
systemctl list-unit-files --state=enabled
# Disable anything you don't need

Authentication Architecture

Internet → Traefik → Authelia/Authentik → Service
                         ↓
                    Check: authenticated?
                    Yes → Forward to service
                    No → Redirect to login page + 2FA

Authelia (lightweight, YAML config): good for smaller setups
Authentik (full IdP, web UI): good for many users/services, SAML/OIDC

Security Scoring (0-100)

| Dimension | Weight | Score Guide |
| --- | --- | --- |
| SSH hardened (keys, non-root, non-22) | 15 | 0=default, 15=fully hardened |
| Firewall active (deny-by-default) | 15 | 0=none, 15=UFW/iptables configured |
| Reverse proxy (no direct port exposure) | 15 | 0=ports exposed, 15=all behind proxy |
| SSL/TLS on all services | 10 | 0=HTTP, 10=HTTPS everywhere |
| Auth on all public services | 15 | 0=open, 15=SSO/2FA on everything |
| Container security (non-root, limits) | 10 | 0=default, 10=hardened |
| Auto-updates enabled | 10 | 0=manual, 10=automated |
| Secrets management (.env, not hardcoded) | 10 | 0=in compose, 10=.env + restricted perms |

Score: 0-40 = Vulnerable, 41-70 = Acceptable, 71-90 = Good, 91-100 = Hardened
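
Adding up the dimensions and mapping the total to a band can be sketched as follows. The function name security_band is hypothetical. It expects the eight per-dimension scores in table order and does not itself enforce the per-dimension maximums.

```shell
#!/bin/sh
# security_band SCORE... -> sums the per-dimension scores, prints "TOTAL BAND".
# Dimension maximums in table order: 15 15 15 10 15 10 10 10 (total 100).
security_band() {
    total=0
    for s in "$@"; do
        total=$((total + s))
    done
    if   [ "$total" -le 40 ]; then band=Vulnerable
    elif [ "$total" -le 70 ]; then band=Acceptable
    elif [ "$total" -le 90 ]; then band=Good
    else                           band=Hardened
    fi
    echo "$total $band"
}

# Example: SSH, firewall, proxy, and TLS done; no SSO, default containers,
# manual updates, secrets kept in .env with restricted permissions
security_band 15 15 15 10 0 0 0 10   # prints: 65 Acceptable
```

In this example, adding SSO/2FA alone (15 points) would lift the total to 80 and into the Good band.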


Phase 10: Maintenance & Updates

Update Strategy

Option A: Manual (Recommended for critical services)

#!/bin/bash
# Update script: /opt/stacks/scripts/update-all.sh
# With pinned image tags, bump the tag in each compose file first;
# pull only fetches what the file names.
set -euo pipefail

STACKS_DIR="/opt/stacks"
LOG="/var/log/docker-updates.log"

for stack in "$STACKS_DIR"/*/; do
    if [ -f "$stack/docker-compose.yml" ]; then
        echo "[$(date)] Updating $(basename "$stack")..." | tee -a "$LOG"
        cd "$stack"
        docker compose pull 2>&1 | tee -a "$LOG"
        docker compose up -d 2>&1 | tee -a "$LOG"
    fi
done

docker image prune -f | tee -a "$LOG"
echo "[$(date)] Update complete" | tee -a "$LOG"

Option B: Watchtower (Automated — use with caution)

services:
  watchtower:
    image: containrrr/watchtower:1.7.1
    container_name: watchtower
    restart: unless-stopped
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      - WATCHTOWER_SCHEDULE=0 0 4 * * MON  # Monday 4 AM
      - WATCHTOWER_CLEANUP=true
      - WATCHTOWER_NOTIFICATIONS=shoutrrr
      - WATCHTOWER_NOTIFICATION_URL=discord://webhook   # Placeholder; shoutrrr format is discord://token@id
      - WATCHTOWER_LABEL_ENABLE=true    # Only update labeled containers
    # Add label to containers: com.centurylinklabs.watchtower.enable=true

Weekly Maintenance Checklist

  • Check Uptime Kuma for any downtime events
  • Review disk usage (df -h)
  • Check container health (docker ps --filter health=unhealthy)
  • Review fail2ban bans (fail2ban-client status)
  • Check backup logs (last successful backup)
  • Review Docker logs for errors (docker logs --since 168h <container>; --since takes hours/minutes, not days)
  • Prune unused resources (docker system prune -f)

Monthly Maintenance

  • Update all container images (read changelogs first!)
  • Update host OS (apt update && apt upgrade)
  • Test a backup restore
  • Review and rotate secrets/passwords
  • Check SSL certificate expiry dates
  • Review Grafana dashboards for trends
  • Clean up unused Docker networks/volumes

Phase 11: Advanced Patterns

Multi-Node Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Node 1    │     │   Node 2    │     │   Node 3    │
│ (Proxy/DNS) │─────│ (Services)  │─────│   (NAS)     │
│ Traefik     │     │ Apps        │     │ TrueNAS     │
│ Pi-hole     │     │ Databases   │     │ NFS/SMB     │
│ Authelia    │     │ Media       │     │ Backup      │
└─────────────┘     └─────────────┘     └─────────────┘
       ↑                   ↑                   ↑
       └───────── Tailscale Mesh ──────────────┘

Docker Compose Includes (Compose v2.20+)

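# Pull whole Compose files (their services, networks, volumes) into this project
include:
  - path: ../common/traefik-labels.yml
  - path: ../common/logging.yml

services:
  app:
    image: vendor/app:1.2.3
    # include does not merge settings into this service;
    # share per-service settings with extends or YAML anchors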

GitOps for Homelab

homelab-configs/           # Git repo
├── .github/
│   └── workflows/
│       └── deploy.yml     # CI: lint + push to server
├── stacks/
│   ├── traefik/
│   ├── monitoring/
│   └── media/
├── scripts/
└── README.md

Workflow: Edit compose locally → commit → push → CI deploys to server
Tools: Flux/ArgoCD (overkill), or simple git pull && docker compose up -d via webhook

Hardware Redundancy

| Component | Solution | Cost |
| --- | --- | --- |
| Power | UPS (APC Back-UPS 600VA+) | $60-150 |
| Storage | RAID1/ZFS mirror (not RAID0!) | 2x disk cost |
| Network | Dual NIC, managed switch | $30-100 |
| Server | Second node (cold spare or active) | $100-400 |

Rule: RAID is NOT backup. It protects against disk failure only, not ransomware/deletion/corruption.


Phase 12: Troubleshooting

Common Issues Decision Tree

Service not accessible?
├── Can you ping the server? → No → Network/firewall issue
├── Is the container running? (`docker ps`) → No → Check logs: `docker logs <name>`
├── Is the port exposed? (`docker port <name>`) → No → Check compose ports/networks
├── Is Traefik routing? (Check Traefik dashboard) → No → Check labels, network
├── Is DNS resolving? (`dig app.example.com`) → No → Check DNS provider
└── SSL error? → Check acme.json permissions (chmod 600), cert resolver logs

Docker Debug Commands

# Container not starting
docker logs <name> --tail 50
docker inspect <name> | jq '.[0].State'

# Network issues
docker network ls
docker network inspect <network>
docker exec <name> ping other-container

# Resource issues
docker stats                          # Live resource usage
docker system df                      # Disk usage
docker volume ls -f dangling=true     # Orphaned volumes

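# Nuclear options (use carefully)
docker compose down && docker compose up -d    # Full restart
# Deletes unused images, networks, build cache, and unused volumes
# (named volumes too on older Docker releases). A stack that is down has
# no volumes in use, so never run this right after "down".
docker system prune -af --volumes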

Performance Optimization

| Symptom | Likely Cause | Fix |
| --- | --- | --- |
| Slow file access | HDD for database | Move DB to SSD |
| High CPU at idle | Monitoring too frequent | Increase scrape intervals |
| OOM kills | No memory limits | Set deploy.resources.limits.memory |
| Slow Nextcloud | Missing Redis cache | Add Redis container |
| Jellyfin buffering | No hardware transcoding | Enable GPU passthrough |
| Slow Docker builds | No layer caching | Use multi-stage + .dockerignore |

Service Configuration Quick Reference

Vaultwarden (Password Manager)

services:
  vaultwarden:
    image: vaultwarden/server:1.30.5
    container_name: vaultwarden
    restart: unless-stopped
    volumes:
      - vaultwarden-data:/data
    environment:
      - SIGNUPS_ALLOWED=false       # Disable after creating your account
      - WEBSOCKET_ENABLED=true       # Legacy setting; since 1.29 WebSockets share the main HTTP port
      - ADMIN_TOKEN=${ADMIN_TOKEN}  # Generate: openssl rand -base64 48
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.vault.rule=Host(`vault.example.com`)"

volumes:
  vaultwarden-data:

Immich (Photo Backup)

# Use their official docker-compose.yml from:
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
# Key settings:
# - Set UPLOAD_LOCATION to a large storage mount
# - Enable hardware transcoding if GPU available
# - Set IMMICH_MACHINE_LEARNING_URL for face detection

Paperless-ngx (Document Management)

services:
  broker:
    image: redis:7                  # Paperless-ngx needs a Redis broker
    container_name: paperless-broker
    restart: unless-stopped

  paperless:
    image: ghcr.io/paperless-ngx/paperless-ngx:2.4
    container_name: paperless
    restart: unless-stopped
    depends_on:
      - broker
    volumes:
      - paperless-data:/usr/src/paperless/data
      - paperless-media:/usr/src/paperless/media
      - ./consume:/usr/src/paperless/consume  # Drop PDFs here
      - ./export:/usr/src/paperless/export
    environment:
      - PAPERLESS_REDIS=redis://broker:6379
      - PAPERLESS_OCR_LANGUAGE=eng
      - PAPERLESS_TIME_ZONE=Europe/London
      - PAPERLESS_ADMIN_USER=${ADMIN_USER}
      - PAPERLESS_ADMIN_PASSWORD=${ADMIN_PASS}

volumes:
  paperless-data:
  paperless-media:

Homelab Quality Rubric (0-100)

| Dimension | Weight | 0 (Poor) | 50 (Decent) | 100 (Excellent) |
| --- | --- | --- | --- | --- |
| Security | 20% | Default passwords, open ports | Firewall + SSL | Hardened SSH, SSO/2FA, no-new-privileges |
| Backups | 20% | None | Local only, untested | 3-2-1, automated, verified monthly |
| Monitoring | 15% | None | Uptime Kuma only | Full stack: metrics + logs + alerts |
| Documentation | 10% | Nothing written | README per stack | GitOps, full runbook, diagrams |
| Updates | 10% | Never updated | Manual quarterly | Scheduled weekly, changelogs reviewed |
| Reliability | 10% | Frequent crashes | Mostly stable | UPS, auto-restart, health checks |
| Performance | 10% | Slow, OOM kills | Adequate | Resource limits, SSD, HW transcoding |
| Scalability | 5% | Single machine, no plan | Compose organized | Multi-node ready, IaC |
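
The rubric reduces to a weighted average. The sketch below assumes each dimension is scored 0-100 and combines the scores with the weights from the table. The function name rubric_score is hypothetical, exactly eight scores are expected in table order, and the result is rounded down to a whole number.

```shell
#!/bin/sh
# rubric_score S1..S8 -> weighted total (0-100), rounded down.
# Order: Security, Backups, Monitoring, Documentation, Updates,
# Reliability, Performance, Scalability.
rubric_score() {
    weights="20 20 15 10 10 10 10 5"   # percentages from the table, sum to 100
    sum=0
    for w in $weights; do
        sum=$((sum + w * $1))
        shift
    done
    echo $((sum / 100))
}

# Example: excellent backups and reliability, decent elsewhere, no scaling plan
rubric_score 50 100 50 50 50 100 50 0   # prints 62
```

Because Security and Backups carry 20% each, improving either one moves the total twice as far as the same gain in a 10% dimension.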

10 Self-Hosting Mistakes

| # | Mistake | Fix |
| --- | --- | --- |
| 1 | Using :latest tag | Pin versions: image:1.2.3 |
| 2 | No backups | 3-2-1 backup rule, test restores |
| 3 | Exposing ports directly | Everything behind reverse proxy |
| 4 | Default passwords | Change immediately, use password manager |
| 5 | No monitoring | Uptime Kuma minimum, Grafana for depth |
| 6 | RAID = backup mentality | RAID protects disks, not data |
| 7 | Over-engineering day 1 | Start small, add complexity as needed |
| 8 | No documentation | Document every service, every port, every cron |
| 9 | Ignoring updates | Security patches matter, schedule updates |
| 10 | Running as root | Non-root containers, restricted SSH |

Natural Language Commands

| Say | Agent Does |
| --- | --- |
| "Set up a new service" | Guide through compose file creation with security best practices |
| "Audit my homelab security" | Run through security scoring checklist |
| "Plan my backup strategy" | Design 3-2-1 backup plan for your setup |
| "What should I self-host?" | Assess needs and recommend services by tier |
| "My container keeps crashing" | Walk through troubleshooting decision tree |
| "Help me set up Traefik" | Generate production Traefik config with SSL |
| "Compare NAS options" | Compare TrueNAS vs Unraid vs DIY for your needs |
| "Optimize my Docker setup" | Review compose files for security and performance |
| "Set up monitoring" | Deploy Uptime Kuma + Prometheus + Grafana stack |
| "Plan a hardware upgrade" | Assess current usage, recommend hardware by budget |
| "Migrate from cloud to self-hosted" | Plan migration with data export and service mapping |
| "Set up remote access" | Compare and deploy VPN/Tailscale for secure remote access |
