Antora Kubernetes Deployment Troubleshooting

This guide covers common issues and solutions when deploying Antora documentation sites on Kubernetes using the init container pattern with NGINX.

Common Issues and Solutions

Git Credentials Setup (HTTPS PAT)

If your Antora playbook points at private HTTPS repositories, Antora reads credentials from the GIT_CREDENTIALS environment variable or from the git credential store. Create a Kubernetes Secret (do NOT store real tokens in Git) and expose it to the Antora init container as the GIT_CREDENTIALS environment variable via valueFrom.secretKeyRef in the Deployment.

PowerShell example (create secret from literal)

# Replace USERNAME, YOUR_PAT, and <git-host> with your Git username, personal access token, and Git server hostname
kubectl create secret generic git-credentials `
   --from-literal=GIT_CREDENTIALS='https://USERNAME:YOUR_PAT@<git-host>' `
   -n <namespace>

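PowerShell example (create from a temporary file to keep the token off the kubectl command line)

# Note: this Set-Content line may itself be recorded in PowerShell history; create the file in an editor to avoid that
Set-Content -Path C:\Temp\git-cred.txt -Value "https://USERNAME:YOUR_PAT@<git-host>"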
kubectl create secret generic git-credentials --from-file=GIT_CREDENTIALS=C:\Temp\git-cred.txt -n <namespace>
Remove-Item C:\Temp\git-cred.txt
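
For Bash users, the same temporary-file approach might look like the sketch below. It is not part of the original setup: the function names build_git_credential and create_git_secret are hypothetical, git.example.com in the usage comment stands in for your Git host, and the token is assumed to contain no characters that need URL encoding.

```shell
# Hypothetical helpers; the names are illustrative, not part of Antora or kubectl.

# Compose the credential URL in the https://USER:TOKEN@HOST form used above.
build_git_credential() {
  local user="$1" token="$2" host="$3"
  printf 'https://%s:%s@%s' "$user" "$token" "$host"
}

# Write the credential to a private temp file, create the Secret from it,
# and remove the file afterwards whether or not kubectl succeeded.
create_git_secret() {
  local namespace="$1" user="$2" token="$3" host="$4"
  local tmp status=0
  tmp=$(mktemp) || return 1
  build_git_credential "$user" "$token" "$host" > "$tmp"
  kubectl create secret generic git-credentials \
    --from-file=GIT_CREDENTIALS="$tmp" -n "$namespace" || status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage (Bash; the token is read without echo, so it is not typed on a command line):
#   read -rs -p 'PAT: ' PAT && create_git_secret <namespace> USERNAME "$PAT" git.example.com
```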

Best practices

Use a minimal-scope or fine-grained token restricted to repository read access
Prefer short-lived tokens or rotate them frequently
Do not check the token into Git; use SealedSecrets or External Secrets for GitOps workflows
Ensure Kubernetes RBAC limits access to this Secret to the service account used by the deployment

Issue: Pods Not Starting

Symptoms:

Pods stuck in Pending or Init:Error state
Deployment shows 0/N ready pods

Diagnosis:

kubectl get pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace>

Common Causes:

Insufficient resources
- Look for: FailedScheduling events
- Solution: Increase cluster capacity or reduce resource requests
Image pull errors
- Look for: ImagePullBackOff or ErrImagePull
- Solution: Check image names and registry access
ConfigMap not found
- Look for: MountVolume.SetUp failed
- Solution: Ensure ConfigMaps are created before deployment
  # Apply ConfigMaps and Deployment from the k8s/ directory
  kubectl apply -n <namespace> -f k8s/
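
The three causes above can be told apart by scanning the pod description for the strings listed under "Look for". The following is a minimal sketch of that triage; classify_pod_start_failure is a hypothetical helper name, and the returned labels are illustrative. A volume mount failure can also come from a missing Secret, so treat the result as a hint rather than a verdict.

```shell
# Hypothetical helper: read `kubectl describe pod` output on stdin and
# print a label for the first known failure signature it finds.
classify_pod_start_failure() {
  local text
  text=$(cat)
  if printf '%s\n' "$text" | grep -q 'FailedScheduling'; then
    echo "insufficient-resources"
  elif printf '%s\n' "$text" | grep -Eq 'ImagePullBackOff|ErrImagePull'; then
    echo "image-pull"
  elif printf '%s\n' "$text" | grep -Fq 'MountVolume.SetUp failed'; then
    echo "volume-mount"
  else
    echo "unknown"
  fi
}

# Usage:
#   kubectl describe pod <pod-name> -n <namespace> | classify_pod_start_failure
```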

Issue: Init Container Fails (Antora Generation Error)

Symptoms:

Pods in Init:Error or Init:CrashLoopBackOff
NGINX container never starts

Diagnosis:

kubectl logs -n <namespace> <pod-name> -c antora-build

Common Causes:

Git repository not accessible
- Error message: fatal: could not read from remote repository
- Solutions:
  - Check repository URL in your Antora playbook
  - For private repos, add Git credentials
  - Verify network connectivity from cluster
Invalid playbook configuration
- Error message: playbook validation failed or similar
- Solution: Validate your playbook YAML syntax
  # Test locally if you have Antora installed
  antora --fetch antora-playbook.yml
Invalid content structure
- Error message: antora.yml not found or component errors
- Solution: Ensure your Git repo has correct structure:
  docs/
  ├── antora.yml
  └── modules/ROOT/...
Branch not found
- Error message: the remote branch does not exist
- Solution: Verify branch name in playbook matches your Git repo
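
Repository access and branch names can be checked outside Antora with git ls-remote. The sketch below assumes git is installed where you run it; check_remote_branch is a hypothetical helper name. Run from a workstation it only proves the repository and branch exist, so to test connectivity from the cluster it would need to run in a pod that has git available.

```shell
# Hypothetical helper: report whether a branch exists on a remote.
# `git ls-remote --exit-code` exits 0 when a ref matches, 2 when the remote
# answered but no ref matched, and with another non-zero status otherwise
# (for example on connection or authentication failure).
check_remote_branch() {
  local url="$1" branch="$2" status=0
  git ls-remote --exit-code "$url" "refs/heads/$branch" >/dev/null 2>&1 || status=$?
  case $status in
    0) echo "ok" ;;
    2) echo "branch-not-found"; return 2 ;;
    *) echo "repo-unreachable"; return 1 ;;
  esac
}

# Usage:
#   check_remote_branch <repository-url> <branch>
```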

Issue: NGINX Shows 404 Errors

Symptoms:

Pod is running but all pages return 404
Port-forward works but shows "404 Not Found"

Diagnosis:

kubectl exec -n <namespace> <pod-name> -c nginx -- ls -la /usr/share/nginx/html

Common Causes:

Content not generated
- Check init container logs for errors
- Content directory is empty or has wrong structure
Wrong output directory
- Antora output should match NGINX root
- Playbook should have: output.dir: /antora/build/site
- NGINX should mount from shared volume
Missing index file
- Solution: Ensure your Antora site has an index page
- Check that site.start_page in the playbook points to an existing page

Fix:

# Restart to regenerate content
kubectl rollout restart deployment/<deployment-name> -n <namespace>

# Check if content is generated
kubectl exec -n <namespace> <pod-name> -c nginx -- ls -la /usr/share/nginx/html
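
Listing the directory shows what is there, but a scripted check is easier to reuse after each restart. The sketch below assumes the container is named nginx and the web root is /usr/share/nginx/html, as in the commands above; check_site_index is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if index.html exists and is non-empty
# in the NGINX root of the given pod.
check_site_index() {
  local namespace="$1" pod="$2" root="${3:-/usr/share/nginx/html}"
  if kubectl exec -n "$namespace" "$pod" -c nginx -- test -s "$root/index.html"; then
    echo "index.html present"
  else
    echo "index.html missing or empty"
    return 1
  fi
}

# Usage:
#   check_site_index <namespace> <pod-name>
```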

Issue: Content Not Updating After Git Push

Symptoms:

Made changes to Git repository
Site still shows old content

Cause:

Content is generated only when pods start (init container runs once).

Solution:

# Force pod recreation to regenerate content
kubectl rollout restart deployment/<deployment-name> -n <namespace>

# OR delete pods manually
kubectl delete pods -n <namespace> -l app=<app-label>

To automate rebuilds, set up a CronJob that restarts the deployment on a schedule, or use webhooks for event-driven rebuilds.
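
One way to avoid needless restarts is to poll the repository and restart only when the branch head has moved. The following is a rough sketch of such a script, not a tested production job: rebuild_if_changed is a hypothetical name, the state file is assumed to live on storage that survives between runs, and whatever runs it (a CronJob, a CI runner, a workstation) is assumed to have git, kubectl, and permission to restart the deployment.

```shell
# Hypothetical helper: restart the deployment only when the remote branch
# points at a commit different from the one recorded in the state file.
rebuild_if_changed() {
  local repo="$1" branch="$2" state_file="$3" namespace="$4" deployment="$5"
  local latest previous
  # git ls-remote prints "<sha><TAB><ref>"; keep the sha.
  latest=$(git ls-remote "$repo" "refs/heads/$branch" | cut -f1)
  [ -n "$latest" ] || { echo "lookup-failed"; return 1; }
  previous=$(cat "$state_file" 2>/dev/null || true)
  if [ "$latest" = "$previous" ]; then
    echo "unchanged"
    return 0
  fi
  kubectl rollout restart "deployment/$deployment" -n "$namespace" >/dev/null || return 1
  printf '%s\n' "$latest" > "$state_file"
  echo "restarted"
}

# Usage:
#   rebuild_if_changed <repository-url> <branch> /var/lib/antora/last-sha <namespace> <deployment-name>
```

The state file path in the usage line is only an example. Recording the commit after a successful restart means a failed restart is retried on the next run.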

Issue: Service Not Accessible

Symptoms:

Cannot access site via port-forward or Ingress
Connection refused or timeout

Diagnosis:

# Check service
kubectl get svc -n <namespace>

# Check endpoints
kubectl get endpoints -n <namespace>

# Check pod readiness
kubectl get pods -n <namespace> -o wide

Common Causes:

Pods not ready
- Check for readiness probe failures in the pod events (kubectl describe pod <pod-name> -n <namespace>)
- Ensure NGINX is serving content on the configured port
Service selector mismatch
- Service selector must match pod labels
- Check: kubectl describe svc <service-name> -n <namespace>
Port-forward to wrong resource
- Use: kubectl port-forward -n <namespace> svc/<service-name> 8080:80
- Not: kubectl port-forward -n <namespace> deploy/<deployment-name>
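
Both "pods not ready" and a selector mismatch show up the same way: the Service has no ready endpoint addresses. A minimal sketch of that check follows; service_has_endpoints is a hypothetical helper name, and it relies on the Endpoints object queried by the diagnosis commands above.

```shell
# Hypothetical helper: print the ready endpoint IPs behind a Service,
# or fail if there are none.
service_has_endpoints() {
  local namespace="$1" service="$2" ips
  ips=$(kubectl get endpoints "$service" -n "$namespace" \
    -o jsonpath='{.subsets[*].addresses[*].ip}') || return 2
  if [ -n "$ips" ]; then
    echo "ready endpoints: $ips"
  else
    echo "no ready endpoints"
    return 1
  fi
}

# Usage:
#   service_has_endpoints <namespace> <service-name>
```

If this reports no ready endpoints while the pods are Running and Ready, compare the Service selector with the pod labels.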

Issue: Ingress Not Working

Symptoms:

Port-forward works but external access fails
Domain shows 404 or connection refused

Diagnosis:

kubectl get ingress -n <namespace>
kubectl describe ingress <ingress-name> -n <namespace>

Common Causes:

Ingress controller not installed

Solution: Install ingress-nginx or your preferred controller

# Example for ingress-nginx (pinned to controller v1.8.1; check for a release that matches your cluster version)
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml

Wrong ingressClassName
- Check your ingress controller’s class name
- Update ingressClassName in your ingress manifest
DNS not configured
- Ensure your domain points to ingress controller’s external IP
  kubectl get svc -n ingress-nginx
TLS certificate issues
- Check cert-manager is installed and configured
- Verify certificate is issued
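
The ingressClassName check can be scripted by comparing the value on the Ingress with the IngressClass resources present in the cluster. This is a sketch under the assumption that your controller registers an IngressClass; check_ingress_class is a hypothetical helper name. An Ingress without ingressClassName may still be served if the cluster defines a default class, which this sketch does not inspect.

```shell
# Hypothetical helper: verify that the Ingress refers to an IngressClass
# that exists in the cluster.
check_ingress_class() {
  local namespace="$1" ingress="$2" wanted available
  wanted=$(kubectl get ingress "$ingress" -n "$namespace" \
    -o jsonpath='{.spec.ingressClassName}') || return 2
  available=$(kubectl get ingressclass \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}') || return 2
  if [ -z "$wanted" ]; then
    echo "ingressClassName not set"
    return 1
  fi
  # -F: literal match, -x: whole line, -q: quiet
  if printf '%s\n' "$available" | grep -Fxq "$wanted"; then
    echo "class '$wanted' exists"
  else
    echo "class '$wanted' not found"
    return 1
  fi
}

# Usage:
#   check_ingress_class <namespace> <ingress-name>
```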

Issue: High Memory Usage

Symptoms:

Pods being OOMKilled
Frequent pod restarts

Diagnosis:

kubectl top pods -n <namespace>
kubectl describe pod <pod-name> -n <namespace> | grep -A 5 "State"
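
To confirm an OOM kill without reading the full description, the termination reason can be pulled from the pod status. The sketch below checks the last terminated state of both init and regular containers; pod_was_oom_killed is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if any container in the pod was last
# terminated with reason OOMKilled.
pod_was_oom_killed() {
  local namespace="$1" pod="$2" reasons
  reasons=$(kubectl get pod "$pod" -n "$namespace" -o \
    jsonpath='{.status.initContainerStatuses[*].lastState.terminated.reason} {.status.containerStatuses[*].lastState.terminated.reason}') || return 2
  # Pad with spaces so the match is on a whole word.
  case " $reasons " in
    *" OOMKilled "*) echo "OOMKilled"; return 0 ;;
    *) echo "no OOM kill recorded"; return 1 ;;
  esac
}

# Usage:
#   pod_was_oom_killed <namespace> <pod-name>
```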

Solutions:

Increase memory limits

resources:
  limits:
    memory: "1Gi"  # Increase from 512Mi

Optimize Antora build
- Reduce number of versions/branches in playbook
- Use --fetch flag
For NGINX
- Extra memory is usually not needed for static content
- Enable memory-efficient caching if required

Issue: Slow Site Generation

Symptoms:

Init container takes very long to complete
Pods slow to start

Diagnosis:

kubectl logs -n <namespace> <pod-name> -c antora-build
# Look for timing information

Solutions:

Large repository
- Use shallow clones if possible
- Reduce number of branches fetched
Multiple large repositories
- Consider caching Git repos
- Increase CPU limits for init container
Slow network
- Check cluster network connectivity
- Consider mirroring Git repos closer to cluster

Issue: Permission Denied Errors

Symptoms:

Init container fails with permission errors
NGINX cannot read files

Diagnosis:

kubectl logs -n <namespace> <pod-name> -c antora-build
kubectl logs -n <namespace> <pod-name> -c nginx

Solutions:

emptyDir volume permissions
- Usually automatic, but can add fsGroup to pod spec:
  securityContext:
    fsGroup: 101 # nginx group
Read-only filesystem
- Check volume mount settings
- Ensure site-content volume is read-write for init container

Diagnostic Commands Reference

# View all resources
kubectl get all -n <namespace>

# Detailed pod information
kubectl describe pod <pod-name> -n <namespace>

# Init container logs
kubectl logs -n <namespace> <pod-name> -c antora-build

# NGINX container logs (live)
kubectl logs -n <namespace> <pod-name> -c nginx -f

# Previous container logs (if pod restarted)
kubectl logs -n <namespace> <pod-name> -c nginx --previous

# Events (last 20)
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

# Resource usage
kubectl top pods -n <namespace>

# Execute commands in container
kubectl exec -it -n <namespace> <pod-name> -c nginx -- sh

# Port forward for testing
kubectl port-forward -n <namespace> svc/<service-name> 8080:80

# Validate manifests
kubectl apply --dry-run=client -f k8s/deployment.yaml

# Force rollout
kubectl rollout restart deployment/<deployment-name> -n <namespace>

# Check rollout status
kubectl rollout status deployment/<deployment-name> -n <namespace>

# Rollback if needed
kubectl rollout undo deployment/<deployment-name> -n <namespace>

Quick Health Check Script

#!/bin/bash
# Namespace to inspect; defaults to "default" when no argument is given
NAMESPACE="${1:-default}"

echo "=== Namespace ==="
kubectl get ns "$NAMESPACE"

echo -e "\n=== Pods ==="
kubectl get pods -n "$NAMESPACE" -o wide

echo -e "\n=== Service ==="
kubectl get svc -n "$NAMESPACE"

echo -e "\n=== Endpoints ==="
kubectl get endpoints -n "$NAMESPACE"

echo -e "\n=== ConfigMaps ==="
kubectl get configmap -n "$NAMESPACE"

echo -e "\n=== Recent Events ==="
kubectl get events -n "$NAMESPACE" --sort-by='.lastTimestamp' | tail -10

echo -e "\n=== Resource Usage ==="
# kubectl top needs metrics-server; fall back to a message if it is missing
kubectl top pods -n "$NAMESPACE" 2>/dev/null || echo "Metrics not available"

Save this as health-check.sh, make it executable (chmod +x health-check.sh), and run it with: ./health-check.sh <namespace>

Getting Additional Help

If you’re still stuck:

Collect debug information:

kubectl get all -n <namespace> -o yaml > debug-output.yaml
kubectl describe deployment/<deployment-name> -n <namespace> >> debug-output.yaml
kubectl logs -n <namespace> -l app=<app-label> --all-containers >> debug-output.yaml

Check documentation:

Known Limitations

Content updates require pod restart - Not automatic on Git push (use CronJob or webhooks)
emptyDir volumes - Content lost on pod deletion (regenerated on restart)
No built-in search - Requires additional integration (e.g., Lunr.js, Algolia)
No authentication by default - All content is public unless you add OAuth/authentication layer