Antora Kubernetes Deployment Troubleshooting

This guide covers common issues and solutions when deploying Antora documentation sites on Kubernetes using the init container pattern with NGINX.

Common Issues and Solutions

Git Credentials Setup (HTTPS PAT)

If your Antora playbook points at private HTTPS repositories, Antora reads credentials from the GIT_CREDENTIALS environment variable or from the git credential store. Create a Kubernetes Secret (do NOT store real tokens in Git) and reference it from the deployment via valueFrom.secretKeyRef.

PowerShell example (create secret from literal)
# Replace USERNAME, YOUR_PAT, and <git-host> with your Git username, personal access token, and Git server hostname
kubectl create secret generic git-credentials `
   --from-literal=GIT_CREDENTIALS='https://USERNAME:YOUR_PAT@<git-host>' `
   -n <namespace>

PowerShell example (create from a temporary file, which keeps the token off the kubectl command line)
Set-Content -Path C:\Temp\git-cred.txt -Value "https://USERNAME:YOUR_PAT@<git-host>"
kubectl create secret generic git-credentials --from-file=GIT_CREDENTIALS=C:\Temp\git-cred.txt -n <namespace>
Remove-Item C:\Temp\git-cred.txt

Best practices
  • Use a minimal-scope or fine-grained token restricted to repository read access

  • Prefer short-lived tokens or rotate them frequently

  • Do not check the token into Git; use SealedSecrets or External Secrets for GitOps workflows

  • Ensure Kubernetes RBAC limits access to this Secret to the service account used by the deployment
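
On Linux or macOS the same Secret can be created from Bash. The following is a minimal sketch, not part of the original setup: the function names git_credentials_url and create_git_credentials_secret are hypothetical, and the host argument stands for your Git server. It prompts for the token rather than taking it as an argument, so the token does not land in shell history, and it removes the temporary file afterwards.

```shell
#!/usr/bin/env bash

# Compose the credential URL in the form https://USERNAME:TOKEN@HOST.
# Hypothetical helper; special characters in the token may need URL-encoding.
git_credentials_url() {
  printf 'https://%s:%s@%s' "$1" "$2" "$3"
}

# Prompt for the token, write the URL to a private temp file, create the
# Secret from that file, then delete the file.
create_git_credentials_secret() {
  local namespace="$1" user="$2" host="$3" token tmp status
  printf 'Personal access token: ' >&2
  read -rs token                  # -s (Bash) suppresses terminal echo
  printf '\n' >&2
  tmp="$(mktemp)" || return 1     # mktemp creates a file only the owner can read
  git_credentials_url "$user" "$token" "$host" > "$tmp"
  kubectl create secret generic git-credentials \
    --from-file=GIT_CREDENTIALS="$tmp" -n "$namespace"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Usage, after sourcing the file: create_git_credentials_secret <namespace> USERNAME <git-host>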


Issue: Pods Not Starting

Symptoms:

  • Pods stuck in Pending or Init:Error state

  • Deployment shows 0/N ready pods

Diagnosis:

kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>

Common Causes:

  1. Insufficient resources

    • Look for: FailedScheduling events

    • Solution: Increase cluster capacity or reduce resource requests

  2. Image pull errors

    • Look for: ImagePullBackOff or ErrImagePull

    • Solution: Check image names and registry access

  3. ConfigMap not found

    • Look for: MountVolume.SetUp failed

    • Solution: Ensure ConfigMaps are created before deployment

      # Apply ConfigMaps and Deployment from the k8s/ directory
      kubectl apply -n <namespace> -f k8s/
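
The lookup described above can be scripted. This sketch collects the event reasons and container waiting reasons for one pod and prints the matching cause from the list. The function names are hypothetical, and the FailedMount reason is assumed to be the event reason that accompanies the MountVolume.SetUp failed message.

```shell
#!/usr/bin/env bash

# Translate an event reason or container waiting reason into the matching
# cause from the list above. Returns 1 for reasons it does not know.
pod_start_hint() {
  case "$1" in
    FailedScheduling)
      echo "Insufficient resources: increase cluster capacity or reduce resource requests" ;;
    ImagePullBackOff|ErrImagePull)
      echo "Image pull error: check image names and registry access" ;;
    FailedMount)
      echo "Volume mount failed: ensure ConfigMaps are created before the deployment" ;;
    *)
      return 1 ;;
  esac
}

# Gather the distinct reasons recorded for a pod and print a hint for each
# one that pod_start_hint recognizes.
diagnose_pod() {
  local namespace="$1" pod="$2"
  local waiting='{.state.waiting.reason}{"\n"}'
  {
    kubectl get events -n "$namespace" \
      --field-selector "involvedObject.name=$pod" \
      -o jsonpath='{range .items[*]}{.reason}{"\n"}{end}'
    kubectl get pod "$pod" -n "$namespace" \
      -o jsonpath="{range .status.initContainerStatuses[*]}${waiting}{end}{range .status.containerStatuses[*]}${waiting}{end}"
  } | sort -u | while read -r reason; do
    if [ -n "$reason" ]; then
      pod_start_hint "$reason" || true
    fi
  done
}
```

Usage, after sourcing the file: diagnose_pod <namespace> <pod-name>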

Issue: Init Container Fails (Antora Generation Error)

Symptoms:

  • Pods in Init:Error or Init:CrashLoopBackOff

  • NGINX container never starts

Diagnosis:

kubectl logs -n <namespace> <pod-name> -c antora-build

Common Causes:

  1. Git repository not accessible

    • Error message: fatal: could not read from remote repository

    • Solutions:

      • Check repository URL in your Antora playbook

      • For private repos, add Git credentials

      • Verify network connectivity from cluster

  2. Invalid playbook configuration

    • Error message: playbook validation failed or similar

    • Solution: Validate your playbook YAML syntax

      # Test locally if you have Antora installed
      antora --fetch antora-playbook.yml

  3. Invalid content structure

    • Error message: antora.yml not found or component errors

    • Solution: Ensure your Git repo has correct structure:

      docs/
      ├── antora.yml
      └── modules/ROOT/...

  4. Branch not found

    • Error message: the remote branch does not exist

    • Solution: Verify branch name in playbook matches your Git repo
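
To scan a long init container log for the messages listed above, a small filter can help. This is a sketch under assumptions: classify_antora_log is a hypothetical name, and the patterns are simply the messages quoted in this section, so the wording in your Antora version may differ.

```shell
#!/usr/bin/env bash

# Read init container log text on stdin and print a likely cause for each
# known error message found. Prints nothing when no pattern matches.
classify_antora_log() {
  local log
  log="$(cat)"
  if printf '%s\n' "$log" | grep -qiF 'could not read from remote repository'; then
    echo "Git repository not accessible: check the URL, credentials, and network"
  fi
  if printf '%s\n' "$log" | grep -qiF 'playbook validation failed'; then
    echo "Invalid playbook configuration: validate the playbook YAML"
  fi
  if printf '%s\n' "$log" | grep -qiF 'antora.yml not found'; then
    echo "Invalid content structure: check for antora.yml and modules/ROOT"
  fi
  if printf '%s\n' "$log" | grep -qiF 'remote branch does not exist'; then
    echo "Branch not found: compare branch names in the playbook and the repository"
  fi
  return 0
}
```

Usage, after sourcing the file: kubectl logs -n <namespace> <pod-name> -c antora-build | classify_antora_log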


Issue: NGINX Shows 404 Errors

Symptoms:

  • Pod is running but all pages return 404

  • Port-forward works but shows "404 Not Found"

Diagnosis:

kubectl exec -n <namespace> <pod-name> -c nginx -- ls -la /usr/share/nginx/html

Common Causes:

  1. Content not generated

    • Check init container logs for errors

    • Content directory is empty or has wrong structure

  2. Wrong output directory

    • Antora output should match NGINX root

    • Playbook should have: output.dir: /antora/build/site

    • NGINX should mount from shared volume

  3. Missing index file

    • Solution: Ensure your Antora site has an index page

    • Check start_page in playbook points to existing page

Fix:

# Restart to regenerate content
kubectl rollout restart deployment/<deployment-name> -n <namespace>

# Check if content is generated
kubectl exec -n <namespace> <pod-name> -c nginx -- ls -la /usr/share/nginx/html
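
Instead of reading the directory listing by eye, the missing-index case can be checked directly. A minimal sketch, assuming the container is named nginx and serves /usr/share/nginx/html as in the commands above; the function names are hypothetical.

```shell
#!/usr/bin/env bash

# Return 0 when index.html exists in the NGINX web root of the given pod.
site_has_index() {
  local namespace="$1" pod="$2"
  kubectl exec -n "$namespace" "$pod" -c nginx -- \
    test -f /usr/share/nginx/html/index.html
}

# Report the result in words and return 1 when the index page is missing.
check_site() {
  if site_has_index "$1" "$2"; then
    echo "index.html present"
  else
    echo "index.html missing: check init container logs and output.dir"
    return 1
  fi
}
```

Usage, after sourcing the file: check_site <namespace> <pod-name>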

Issue: Content Not Updating After Git Push

Symptoms:

  • Made changes to Git repository

  • Site still shows old content

Cause:

Content is generated only when pods start (init container runs once).

Solution:

# Force pod recreation to regenerate content
kubectl rollout restart deployment/<deployment-name> -n <namespace>

# OR delete pods manually
kubectl delete pods -n <namespace> -l app=<app-label>

To avoid manual restarts, set up a CronJob that rebuilds on a schedule, or use webhooks for event-driven rebuilds.
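
A scheduled job does not have to restart the pods on every run. The sketch below restarts the deployment only when the branch tip has moved since the last run. It is an illustration under assumptions: the function names and the state file are hypothetical, the job's container must provide git and kubectl, its service account needs permission to patch the deployment, and private repositories need Git credentials for git ls-remote.

```shell
#!/usr/bin/env bash

# Print the commit SHA at the tip of a branch on a remote repository.
remote_head() {
  git ls-remote "$1" "refs/heads/$2" | cut -f1
}

# Restart the deployment only if the branch tip differs from the SHA stored
# in the state file; record the new SHA after a successful restart.
rebuild_if_changed() {
  local repo="$1" branch="$2" deployment="$3" namespace="$4" state_file="$5"
  local current previous=""
  current="$(remote_head "$repo" "$branch")"
  if [ -z "$current" ]; then
    echo "could not resolve $branch on $repo" >&2
    return 1
  fi
  if [ -f "$state_file" ]; then
    previous="$(cat "$state_file")"
  fi
  if [ "$current" = "$previous" ]; then
    echo "unchanged at $current"
    return 0
  fi
  kubectl rollout restart "deployment/$deployment" -n "$namespace" || return 1
  printf '%s\n' "$current" > "$state_file"
}
```

Usage, after sourcing the file: rebuild_if_changed <repo-url> <branch> <deployment-name> <namespace> <state-file>. The state file must live on storage that survives between runs.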

Issue: Service Not Accessible

Symptoms:

  • Cannot access site via port-forward or Ingress

  • Connection refused or timeout

Diagnosis:

# Check service
kubectl get svc -n <namespace>

# Check endpoints
kubectl get endpoints -n <namespace>

# Check pod readiness
kubectl get pods -n <namespace> -o wide

Common Causes:

  1. Pods not ready

    • Check for readiness probe failures in the pod events (kubectl describe pod <pod-name> -n <namespace>)

    • Ensure NGINX is serving content on the configured port

  2. Service selector mismatch

    • Service selector must match pod labels

    • Check: kubectl describe svc <service-name> -n <namespace>

  3. Port-forward to wrong resource

    • Use: kubectl port-forward -n <namespace> svc/<service-name> 8080:80

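    • Not: kubectl port-forward -n <namespace> deploy/<deployment-name> (no port mapping given; when targeting a Deployment or pod directly, the remote port must be the container port rather than the Service port)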
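
A selector mismatch shows up as a Service with no endpoint addresses. The following sketch prints both side by side; service_backends is a hypothetical name, and the exact format of the printed selector depends on your kubectl version.

```shell
#!/usr/bin/env bash

# Print the Service selector and the pod IPs currently behind the Service.
# Returns 1 when the Service has no ready endpoint addresses, which usually
# means the selector matches no ready pods.
service_backends() {
  local namespace="$1" service="$2" selector addresses
  selector="$(kubectl get svc "$service" -n "$namespace" \
    -o jsonpath='{.spec.selector}')"
  addresses="$(kubectl get endpoints "$service" -n "$namespace" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"
  echo "selector: $selector"
  if [ -z "$addresses" ]; then
    echo "endpoints: none (compare the selector with: kubectl get pods -n $namespace --show-labels)"
    return 1
  fi
  echo "endpoints: $addresses"
}
```

Usage, after sourcing the file: service_backends <namespace> <service-name>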


Issue: Ingress Not Working

Symptoms:

  • Port-forward works but external access fails

  • Domain shows 404 or connection refused

Diagnosis:

kubectl get ingress -n <namespace>
kubectl describe ingress <ingress-name> -n <namespace>

Common Causes:

  1. Ingress controller not installed

    • Solution: Install ingress-nginx or your preferred controller

      # Example for ingress-nginx
      kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml

  2. Wrong ingressClassName

    • Check your ingress controller’s class name

    • Update ingressClassName in your ingress manifest

  3. DNS not configured

    • Ensure your domain points to ingress controller’s external IP

      kubectl get svc -n ingress-nginx

  4. TLS certificate issues

    • Check cert-manager is installed and configured

    • Verify certificate is issued
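
The ingressClassName check can be automated by comparing the value in the Ingress with the IngressClass objects installed in the cluster. A minimal sketch with a hypothetical function name:

```shell
#!/usr/bin/env bash

# Compare the ingressClassName set on an Ingress with the IngressClass
# objects present in the cluster. Returns 1 on a missing or unknown class.
check_ingress_class() {
  local namespace="$1" ingress="$2" wanted classes
  wanted="$(kubectl get ingress "$ingress" -n "$namespace" \
    -o jsonpath='{.spec.ingressClassName}')"
  classes="$(kubectl get ingressclass \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')"
  if [ -z "$wanted" ]; then
    echo "no ingressClassName set; the cluster needs a default IngressClass"
    return 1
  fi
  if printf '%s\n' "$classes" | grep -qxF -- "$wanted"; then
    echo "ingressClassName '$wanted' matches an installed IngressClass"
  else
    echo "ingressClassName '$wanted' not found among installed classes"
    return 1
  fi
}
```

Usage, after sourcing the file: check_ingress_class <namespace> <ingress-name>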


Issue: High Memory Usage

Symptoms:

  • Pods being OOMKilled

  • Frequent pod restarts

Diagnosis:

kubectl top pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace> | grep -A 5 "State"

Solutions:

  1. Increase memory limits

    resources:
      limits:
        memory: "1Gi"  # Increase from 512Mi

  2. Optimize Antora build

    • Reduce number of versions/branches in playbook

    • Use --fetch flag

  3. For NGINX

    • Usually not needed for static content

    • Enable memory-efficient caching if required
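
Before raising limits, it helps to know which container was killed: the Antora init container or NGINX. This sketch lists every container in a pod whose previous termination reason was OOMKilled. The function name is hypothetical, and it reads only the lastState field, so a container that was killed but has not yet been restarted is not reported.

```shell
#!/usr/bin/env bash

# Print the name of every container (init or regular) in a pod whose
# previous termination reason was OOMKilled.
oom_killed_containers() {
  local namespace="$1" pod="$2"
  local row='{.name}{"\t"}{.lastState.terminated.reason}{"\n"}'
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath="{range .status.initContainerStatuses[*]}${row}{end}{range .status.containerStatuses[*]}${row}{end}" |
    awk -F '\t' '$2 == "OOMKilled" { print $1 }'
}
```

Usage, after sourcing the file: oom_killed_containers <namespace> <pod-name>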


Issue: Slow Site Generation

Symptoms:

  • Init container takes very long to complete

  • Pods slow to start

Diagnosis:

kubectl logs -n <namespace> <pod-name> -c antora-build
# Look for timing information

Solutions:

  1. Large repository

    • Use shallow clones if possible

    • Reduce number of branches fetched

  2. Multiple large repositories

    • Consider caching Git repos

    • Increase CPU limits for init container

  3. Slow network

    • Check cluster network connectivity

    • Consider mirroring Git repos closer to cluster


Issue: Permission Denied Errors

Symptoms:

  • Init container fails with permission errors

  • NGINX cannot read files

Diagnosis:

kubectl logs -n <namespace> <pod-name> -c antora-build
kubectl logs -n <namespace> <pod-name> -c nginx

Solutions:

  1. emptyDir volume permissions

    • Usually automatic, but can add fsGroup to pod spec:

      securityContext:
        fsGroup: 101  # nginx group

  2. Read-only filesystem

    • Check volume mount settings

    • Ensure site-content volume is read-write for init container


Diagnostic Commands Reference

# View all resources
kubectl get all -n <namespace>

# Detailed pod information
kubectl describe pod <pod-name> -n <namespace>

# Init container logs
kubectl logs -n <namespace> <pod-name> -c antora-build

# NGINX container logs (live)
kubectl logs -n <namespace> <pod-name> -c nginx -f

# Previous container logs (if pod restarted)
kubectl logs -n <namespace> <pod-name> -c nginx --previous

# Events (last 20)
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

# Resource usage
kubectl top pods -n <namespace>

# Execute commands in container
kubectl exec -it -n <namespace> <pod-name> -c nginx -- sh

# Port forward for testing
kubectl port-forward -n <namespace> svc/<service-name> 8080:80

# Validate manifests
kubectl apply --dry-run=client -f k8s/deployment.yaml

# Force rollout
kubectl rollout restart deployment/<deployment-name> -n <namespace>

# Check rollout status
kubectl rollout status deployment/<deployment-name> -n <namespace>

# Rollback if needed
kubectl rollout undo deployment/<deployment-name> -n <namespace>

Quick Health Check Script

#!/bin/bash
# Namespace to inspect; defaults to "default" when no argument is given
NAMESPACE="${1:-default}"

echo "=== Namespace ==="
kubectl get ns "$NAMESPACE"

echo -e "\n=== Pods ==="
kubectl get pods -n "$NAMESPACE" -o wide

echo -e "\n=== Service ==="
kubectl get svc -n "$NAMESPACE"

echo -e "\n=== Endpoints ==="
kubectl get endpoints -n "$NAMESPACE"

echo -e "\n=== ConfigMaps ==="
kubectl get configmap -n "$NAMESPACE"

echo -e "\n=== Recent Events ==="
kubectl get events -n "$NAMESPACE" --sort-by='.lastTimestamp' | tail -10

echo -e "\n=== Resource Usage ==="
# kubectl top needs metrics-server; fall back to a message when it is absent
kubectl top pods -n "$NAMESPACE" 2>/dev/null || echo "Metrics not available"

Save this as health-check.sh and run with: ./health-check.sh <namespace>

Getting Additional Help

If you’re still stuck:

  1. Collect debug information:

    kubectl get all -n <namespace> -o yaml > debug-output.yaml
    kubectl describe deployment/<deployment-name> -n <namespace> >> debug-output.yaml
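    # --tail=-1 returns full logs; with a label selector kubectl otherwise shows only the last 10 lines per container
    kubectl logs -n <namespace> -l app=<app-label> --all-containers --tail=-1 >> debug-output.yaml
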
  2. Check documentation:

Known Limitations

  1. Content updates require pod restart - Not automatic on Git push (use CronJob or webhooks)

  2. emptyDir volumes - Content lost on pod deletion (regenerated on restart)

  3. No built-in search - Requires additional integration (e.g., Lunr.js, Algolia)

  4. No authentication by default - All content is public unless you add OAuth/authentication layer