User Guide

This guide provides comprehensive instructions for using kube-changejob to automate workflows based on Kubernetes resource changes.

Introduction
Getting Started
Basic Usage
Advanced Usage
Monitoring and Debugging
Best Practices
Troubleshooting

Introduction

kube-changejob is a Kubernetes operator that automatically triggers jobs when specified resources change. It continuously monitors resources and creates jobs based on your defined templates when changes are detected.

Key Concepts

ChangeTriggeredJob (CTJ): The custom resource that defines what to watch and what to run
Resource Watching: Monitoring Kubernetes resources for changes
Trigger Conditions: Rules that determine when to create jobs
Job Template: The specification for jobs to create when triggered
Cooldown Period: Minimum time between job triggers
History Management: Automatic cleanup of old jobs

Getting Started

Prerequisites

Before using kube-changejob, ensure you have:

A Kubernetes cluster (v1.29 or later)
kubectl configured to access your cluster
Appropriate RBAC permissions to create ChangeTriggeredJobs
cert-manager installed (for webhooks)

Installing kube-changejob

See the Installation Guide for detailed installation instructions.

Quick install:

kubectl apply -k github.com/nusnewob/kube-changejob/config/default

Verifying Installation

Check that the controller is running:

kubectl get pods -n kube-changejob-system
kubectl get crd changetriggeredjobs.triggers.changejob.dev

Basic Usage

Creating Your First ChangeTriggeredJob

Let’s create a simple ChangeTriggeredJob that triggers when a ConfigMap changes:

Create a ConfigMap to watch:

kubectl create configmap my-config --from-literal=key=value

Create a ChangeTriggeredJob:

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: my-first-trigger
  namespace: default
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox:latest
              command: ["sh", "-c", "echo 'ConfigMap changed at:' $(date)"]
          restartPolicy: Never
  resources:
    - apiVersion: v1
      kind: ConfigMap
      name: my-config
      namespace: default

Apply the configuration:

kubectl apply -f changetriggeredjob.yaml

Update the ConfigMap to trigger the job:

kubectl patch configmap my-config -p '{"data":{"key":"newvalue"}}'

Wait for the controller to pick up the change (the default poll interval and the default cooldown are both 60s), then check for created jobs:

kubectl get jobs -l changejob.dev/owner=my-first-trigger
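
Instead of re-running that command by hand, a small polling loop can wait for the first job to appear. This is a minimal sketch: `wait_for_job` is a hypothetical helper (not part of kube-changejob), and the 120s timeout and 5s interval are arbitrary defaults.

```shell
# Poll until at least one job owned by the given ChangeTriggeredJob exists.
# Usage: wait_for_job <owner> [namespace] [timeout_seconds] [interval_seconds]
# Example: wait_for_job my-first-trigger default 120 5
wait_for_job() {
  local owner="$1" ns="${2:-default}" timeout="${3:-120}" interval="${4:-5}"
  local waited=0 found
  while [ "$waited" -le "$timeout" ]; do
    # -o name prints one "job.batch/<name>" line per matching job
    found=$(kubectl get jobs -n "$ns" -l "changejob.dev/owner=$owner" -o name)
    if [ -n "$found" ]; then
      echo "$found"
      return 0
    fi
    sleep "$interval"
    waited=$(( waited + interval ))
  done
  echo "no jobs for $owner after ${timeout}s" >&2
  return 1
}
```

The function prints the job names and returns 0 on success, so it can gate a follow-up step such as `kubectl logs`.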

Viewing Status

Check the status of your ChangeTriggeredJob:

# Get basic info
kubectl get changetriggeredjobs

# Get detailed status
kubectl describe changetriggeredjob my-first-trigger

# View status in YAML format
kubectl get changetriggeredjob my-first-trigger -o yaml

Key status fields to check:

status.lastTriggeredTime: When the last job was created
status.lastJobName: Name of the most recent job
status.lastJobStatus: Status of the last job (Active/Succeeded/Failed)
status.conditions: Current state conditions
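
The scalar status fields above can also be shown side by side with kubectl's `custom-columns` output. A sketch, assuming the field paths listed above; `ctj_overview` is a hypothetical helper name and the column headers are arbitrary.

```shell
# List ChangeTriggeredJobs with the key status fields as table columns.
# Usage: ctj_overview [namespace]   (no argument: all namespaces)
ctj_overview() {
  local columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,LAST_TRIGGERED:.status.lastTriggeredTime,LAST_JOB:.status.lastJobName,JOB_STATUS:.status.lastJobStatus'
  if [ -n "${1:-}" ]; then
    kubectl get changetriggeredjobs -n "$1" -o "custom-columns=$columns"
  else
    kubectl get changetriggeredjobs -A -o "custom-columns=$columns"
  fi
}
```

Fields that are not yet set (for example before the first trigger) show up as `<none>`.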

Viewing Created Jobs

List jobs created by a ChangeTriggeredJob:

# List all jobs with the owner label
kubectl get jobs -l changejob.dev/owner=my-first-trigger

# View job logs
kubectl logs job/<job-name>

# Describe a specific job
kubectl describe job <job-name>

Advanced Usage

Watching Multiple Resources

You can watch multiple resources and control when to trigger:

Trigger on Any Change (OR Logic)

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: multi-resource-any
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: sync
              image: my-sync-tool:latest
              command: ["sync"]
          restartPolicy: Never
  resources:
    - apiVersion: v1
      kind: ConfigMap
      name: app-config
      namespace: default
    - apiVersion: v1
      kind: Secret
      name: app-secret
      namespace: default
  condition: Any # Trigger if ConfigMap OR Secret changes
  cooldown: 120s

Trigger on All Changes (AND Logic)

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: multi-resource-all
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: sync
              image: my-sync-tool:latest
              command: ["sync", "--coordinated"]
          restartPolicy: Never
  resources:
    - apiVersion: v1
      kind: ConfigMap
      name: app-config
      namespace: default
    - apiVersion: v1
      kind: Secret
      name: app-secret
      namespace: default
  condition: All # Trigger only if BOTH ConfigMap AND Secret change
  cooldown: 300s

Watching Specific Fields

Instead of watching the entire resource, you can monitor specific fields:

Watch Deployment Image Changes

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: image-watcher
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: notify
              image: notification-tool:latest
              env:
                - name: WEBHOOK_URL
                  value: "https://hooks.slack.com/..."
          restartPolicy: Never
  resources:
    - apiVersion: apps/v1
      kind: Deployment
      name: web-app
      namespace: default
      fields:
        - "spec.template.spec.containers[*].image"
  cooldown: 30s

Watch ConfigMap Data Only

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: config-data-watcher
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: reload
              image: reload-tool:latest
              command: ["reload-config"]
          restartPolicy: Never
  resources:
    - apiVersion: v1
      kind: ConfigMap
      name: app-config
      namespace: default
      fields:
        - "data" # Only watch the data field

Watch Multiple Fields

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: multi-field-watcher
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: check
              image: checker:latest
          restartPolicy: Never
  resources:
    - apiVersion: apps/v1
      kind: Deployment
      name: app
      namespace: default
      fields:
        - "spec.replicas"
        - "spec.template.spec.containers[*].image"
        - "spec.template.spec.containers[*].resources"

Watching Cluster-Scoped Resources

You can also watch cluster-scoped resources such as Nodes and ClusterRoles:

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: node-watcher
  namespace: kube-changejob-system
spec:
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: cluster-reader
          containers:
            - name: alert
              image: alert-tool:latest
              command: ["alert", "Node changed"]
          restartPolicy: Never
  resources:
    - apiVersion: v1
      kind: Node
      name: worker-1
      # No namespace field for cluster-scoped resources
  cooldown: 300s

Customizing Job Templates

Job with Environment Variables

spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: processor
              image: processor:latest
              env:
                - name: CONFIG_NAME
                  value: "app-config"
                - name: LOG_LEVEL
                  value: "info"
          restartPolicy: Never

Job with ConfigMap/Secret Mounts

spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: processor
              image: processor:latest
              volumeMounts:
                - name: config
                  mountPath: /config
                - name: secret
                  mountPath: /secret
                  readOnly: true
          volumes:
            - name: config
              configMap:
                name: app-config
            - name: secret
              secret:
                secretName: app-secret
          restartPolicy: Never

Job with Resource Limits

spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: heavy-processor
              image: processor:latest
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "500m"
                limits:
                  memory: "512Mi"
                  cpu: "1000m"
          restartPolicy: Never

Job with Service Account

spec:
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: my-job-sa
          containers:
            - name: k8s-client
              image: k8s-client:latest
              command: ["kubectl", "get", "pods"]
          restartPolicy: Never

Adjusting Cooldown Period

Control how often jobs can be triggered:

spec:
  # Short cooldown for frequent updates
  cooldown: 30s

  # Or longer cooldown for expensive operations
  # cooldown: 15m

  # Or very long cooldown
  # cooldown: 1h

Managing Job History

Configure how many historical jobs to keep:

spec:
  # Keep more jobs for debugging
  history: 10

  # Or keep fewer to save resources
  # history: 3

  # Minimum is 1
  # history: 1

Real-World Use Cases

1. Configuration Synchronization

Trigger sync jobs when configuration changes:

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: config-sync
  namespace: production
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: sync
              image: config-sync-tool:v1.0
              command: ["sync-config"]
              env:
                - name: TARGET_SYSTEMS
                  value: "system1,system2,system3"
          restartPolicy: Never
  resources:
    - apiVersion: v1
      kind: ConfigMap
      name: app-config
      namespace: production
  cooldown: 300s
  history: 5

2. Deployment Notifications

Send notifications when deployments change:

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: deployment-notifier
  namespace: production
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: notify
              image: notification-service:latest
              env:
                - name: SLACK_WEBHOOK
                  valueFrom:
                    secretKeyRef:
                      name: slack-webhook
                      key: url
                - name: MESSAGE
                  value: "Production deployment updated"
          restartPolicy: Never
  resources:
    - apiVersion: apps/v1
      kind: Deployment
      name: web-app
      namespace: production
      fields:
        - "spec.template.spec.containers[*].image"
  cooldown: 60s

3. Backup Automation

Trigger backups when data changes:

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: backup-trigger
  namespace: database
spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: backup-tool:latest
              command: ["backup", "--incremental"]
              volumeMounts:
                - name: backup-storage
                  mountPath: /backups
          volumes:
            - name: backup-storage
              persistentVolumeClaim:
                claimName: backup-pvc
          restartPolicy: OnFailure
      backoffLimit: 3
  resources:
    - apiVersion: v1
      kind: Secret
      name: database-credentials
      namespace: database
    - apiVersion: v1
      kind: ConfigMap
      name: database-config
      namespace: database
  condition: Any
  cooldown: 3600s # 1 hour
  history: 24 # Keep the last 24 jobs (at least 24 hours' worth, given the 1-hour cooldown)

4. Validation Pipeline

Run validation when resources are updated:

apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: policy-validator
  namespace: compliance
spec:
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: validator-sa
          containers:
            - name: validate
              image: policy-validator:latest
              command: ["validate-policies"]
          restartPolicy: Never
  resources:
    - apiVersion: v1
      kind: ConfigMap
      name: security-policies
      namespace: compliance
  cooldown: 120s

Monitoring and Debugging

Checking ChangeTriggeredJob Status

# List all ChangeTriggeredJobs
kubectl get ctj -A

# Get detailed status
kubectl describe ctj my-trigger

# Watch for changes
kubectl get ctj my-trigger -w

# View full status
kubectl get ctj my-trigger -o jsonpath='{.status}' | jq

Monitoring Created Jobs

# List jobs for a specific ChangeTriggeredJob
kubectl get jobs -l changejob.dev/owner=my-trigger

# Watch job creation
kubectl get jobs -l changejob.dev/owner=my-trigger -w

# View job status
kubectl describe job <job-name>

# Get job logs
kubectl logs job/<job-name>

# Follow logs in real-time
kubectl logs -f job/<job-name>

Checking Resource Hashes

View the current hash state of watched resources:

kubectl get ctj my-trigger -o jsonpath='{.status.resourceHashes}' | jq

Viewing Controller Logs

Debug controller behavior:

# View controller logs
kubectl logs -n kube-changejob-system \
  deployment/kube-changejob-controller-manager

# Follow logs
kubectl logs -f -n kube-changejob-system \
  deployment/kube-changejob-controller-manager

# Filter for specific ChangeTriggeredJob
kubectl logs -n kube-changejob-system \
  deployment/kube-changejob-controller-manager | \
  grep "my-trigger"

Checking Conditions

Understand the current state through conditions:

kubectl get ctj my-trigger -o jsonpath='{.status.conditions}' | jq

Common condition types:

Available: Is the CTJ functioning correctly?
Progressing: Is work in progress?
Degraded: Are there any issues?
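
In scripts, `kubectl wait` can block on one of these conditions instead of polling. A sketch, assuming the conditions follow the usual Kubernetes `type`/`status` convention so that `--for=condition=Available` matches; `wait_ctj_available` is a hypothetical helper.

```shell
# Block until the ChangeTriggeredJob reports Available=True, or time out.
# Usage: wait_ctj_available <name> [namespace] [timeout]
# Example: wait_ctj_available my-trigger default 60s
wait_ctj_available() {
  local name="$1" ns="${2:-default}" timeout="${3:-60s}"
  kubectl wait "changetriggeredjob/$name" -n "$ns" \
    --for=condition=Available --timeout="$timeout"
}
```

`kubectl wait` exits non-zero on timeout, which makes the helper usable as a gate in CI steps.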

Best Practices

1. Use Appropriate Cooldown Periods

Short cooldown (30s-60s): For frequently changing resources where immediate response is needed
Medium cooldown (5m-15m): For most use cases, balances responsiveness and resource usage
Long cooldown (1h+): For expensive operations or batch processing

2. Watch Specific Fields

Instead of watching entire resources, monitor only relevant fields:

# Good: Watch only what matters
fields:
  - "spec.template.spec.containers[*].image"
# Avoid: Watching everything when you only care about specific fields
# (omitting fields or using ["*"] watches everything)

3. Use Meaningful Names

Choose descriptive names for ChangeTriggeredJobs:

# Good
name: nginx-config-sync
name: database-backup-trigger
name: deployment-image-notifier

# Avoid
name: trigger1
name: test
name: my-ctj

4. Set Appropriate History Limits

Balance debugging needs with resource usage:

# Development/debugging
history: 10

# Production (most cases)
history: 5

# High-volume triggers
history: 3

5. Use Resource Limits in Job Templates

Prevent runaway jobs:

spec:
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: processor
              resources:
                limits:
                  memory: "512Mi"
                  cpu: "500m"

6. Implement Proper Error Handling

Use appropriate restart policies:

spec:
  jobTemplate:
    spec:
      backoffLimit: 3 # Retry failed jobs
      template:
        spec:
          containers:
            - name: processor
              # ...
          restartPolicy: OnFailure # Retry on failure

7. Use Labels and Annotations

Organize and document your ChangeTriggeredJobs:

metadata:
  name: my-trigger
  labels:
    app: myapp
    environment: production
    team: platform
  annotations:
    description: "Syncs configuration to external systems"
    owner: "platform-team@example.com"
    runbook: "https://wiki.example.com/runbooks/config-sync"

8. Monitor Job Success Rates

Regularly check job statuses:

# List jobs with no successful completions (failed or still running)
kubectl get jobs -l changejob.dev/owner=my-trigger --field-selector status.successful=0

# Check the status of the most recent job
kubectl get ctj my-trigger -o jsonpath='{.status.lastJobStatus}'
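
For an actual tally across the retained jobs, the standard Job conditions `Complete` and `Failed` can be counted. A sketch only: `job_outcomes` and `summarize_outcomes` are hypothetical helpers, and the numbers cover only jobs still kept by the `history` setting.

```shell
# Print one tab-separated line per job: name, Complete status, Failed status.
# Usage: job_outcomes <owner> [namespace]
job_outcomes() {
  local owner="$1" ns="${2:-default}"
  kubectl get jobs -n "$ns" -l "changejob.dev/owner=$owner" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="Complete")].status}{"\t"}{.status.conditions[?(@.type=="Failed")].status}{"\n"}{end}'
}

# Read job_outcomes lines on stdin and print totals.
# Jobs with neither condition set (still running) count only toward total.
summarize_outcomes() {
  awk -F'\t' '
    $2 == "True" { ok++ }
    $3 == "True" { failed++ }
    NF > 0       { total++ }
    END { printf "total=%d succeeded=%d failed=%d\n", total, ok, failed }
  '
}

# Example: job_outcomes my-trigger default | summarize_outcomes
```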

9. Test in Non-Production First

Always test new ChangeTriggeredJobs in development:

# Development
apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: config-sync-dev
  namespace: development
spec:
  cooldown: 30s  # Shorter cooldown for testing
  # ...

# Production (deploy after testing)
apiVersion: triggers.changejob.dev/v1alpha
kind: ChangeTriggeredJob
metadata:
  name: config-sync-prod
  namespace: production
spec:
  cooldown: 300s  # Longer cooldown for stability
  # ...

10. Document Your Triggers

Use annotations to document purpose and behavior:

metadata:
  annotations:
    description: |
      Triggers backup jobs when database credentials change.
      Backs up data to S3 with encryption enabled.
    trigger-frequency: "Expected: 1-2 times per week"
    on-call: "database-team@example.com"

Troubleshooting

Jobs Not Being Created

Problem: ChangeTriggeredJob exists but no jobs are created when resources change.

Solutions:

Check if cooldown period has elapsed:

kubectl get ctj my-trigger -o jsonpath='{.status.lastTriggeredTime}'
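
The timestamp alone does not say how much of the cooldown is left. The remainder can be computed from it. This sketch assumes GNU `date` (for the `-d` flag) and that you pass the cooldown in seconds; `cooldown_remaining` is a hypothetical helper, not part of kube-changejob.

```shell
# Print the seconds left in the cooldown (0 if it has already elapsed).
# Usage: cooldown_remaining <last_triggered_rfc3339> <cooldown_seconds> [now_rfc3339]
# Example:
#   last=$(kubectl get ctj my-trigger -o jsonpath='{.status.lastTriggeredTime}')
#   cooldown_remaining "$last" 300
cooldown_remaining() {
  local last="$1" cooldown="$2" now="${3:-}"
  local last_s now_s remaining
  last_s=$(date -u -d "$last" +%s) || return 1
  if [ -n "$now" ]; then
    now_s=$(date -u -d "$now" +%s) || return 1
  else
    now_s=$(date -u +%s)
  fi
  remaining=$(( last_s + cooldown - now_s ))
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo "$remaining"
}
```

The optional third argument exists only to make the calculation reproducible; omit it to compare against the current time.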

Verify resources are actually changing:

# Check current hashes
kubectl get ctj my-trigger -o jsonpath='{.status.resourceHashes}' | jq

# Force a change (--overwrite allows the command to be repeated)
kubectl annotate configmap my-config test=value-$(date +%s) --overwrite

Check controller logs:

kubectl logs -n kube-changejob-system deployment/kube-changejob-controller-manager

Verify resource exists and is accessible:

kubectl get configmap my-config -n default

Jobs Failing Immediately

Problem: Created jobs fail immediately or repeatedly.

Solutions:

Check job logs:

kubectl logs job/<job-name>

Check job events:

kubectl describe job <job-name>

Verify image exists and is pullable:

kubectl run test --image=<your-image> --rm -it --restart=Never -- echo "test"

Check resource limits:

# Add or adjust limits
resources:
  limits:
    memory: "512Mi"
    cpu: "500m"

“All” Condition Not Triggering

Problem: With condition: All, jobs never trigger.

Solutions:

Verify all resources have changed since last trigger:

kubectl get ctj my-trigger -o jsonpath='{.status.resourceHashes}' | jq

Check each resource individually:

kubectl get configmap my-config
kubectl get secret my-secret

After all resources have changed, wait for the next poll interval (default 60s).

Webhook Validation Errors

Problem: Getting errors when creating/updating ChangeTriggeredJob.

Common Errors and Solutions:

Error: resource kind does not exist

Solution: Check that the resource kind is valid and properly capitalized:

# Correct
kind: ConfigMap

# Incorrect
kind: configmap

Error: namespace is required for namespaced resources

Solution: Add namespace for namespaced resources:

resources:
  - apiVersion: v1
    kind: ConfigMap
    name: my-config
    namespace: default # Add this

Error: namespace must not be set for cluster-scoped resources

Solution: Remove namespace for cluster-scoped resources:

resources:
  - apiVersion: v1
    kind: Node
    name: worker-1
    # Remove namespace field
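
All three errors can be checked up front with `kubectl api-resources`, which lists each kind with its exact capitalization and whether it is namespaced. A sketch, assuming the default (non-wide) table output where the last two columns are NAMESPACED and KIND; `kind_scope` is a hypothetical helper. If a kind exists in several API groups, it reports the first match.

```shell
# Print "namespaced" or "cluster-scoped" for a kind, or fail if it is unknown.
# The match is case-sensitive, mirroring the webhook's expectation.
# Usage: kind_scope <Kind>
# Example: kind_scope ConfigMap
kind_scope() {
  local kind="$1"
  kubectl api-resources --no-headers | awk -v kind="$kind" '
    $NF == kind {
      print ($(NF-1) == "true" ? "namespaced" : "cluster-scoped")
      found = 1
      exit
    }
    END {
      if (!found) {
        print "unknown kind: " kind > "/dev/stderr"
        exit 1
      }
    }
  '
}
```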

Too Many Jobs Being Created

Problem: Jobs are created too frequently.

Solutions:

Increase cooldown period:

spec:
  cooldown: 300s # Increase from default 60s

Use “All” condition instead of “Any”:

spec:
  condition: All # Require all resources to change

Watch specific fields instead of entire resources:

spec:
  resources:
    - apiVersion: v1
      kind: ConfigMap
      name: my-config
      fields:
        - "data.important-key" # Only watch this field

Permission Errors

Problem: Jobs fail with permission errors or ChangeTriggeredJob can’t watch resources.

Solutions:

For job execution, add appropriate service account:

spec:
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: my-job-sa

For controller watching resources, check controller RBAC:

kubectl get clusterrole kube-changejob-manager-role -o yaml

For user creating ChangeTriggeredJobs, verify permissions:

kubectl auth can-i create changetriggeredjobs
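
The same `kubectl auth can-i` check works for the service account a job runs as, by impersonating it with `--as`. A sketch: `sa_can_i` is a hypothetical helper, `my-job-sa` is the example account from above, and your own user needs permission to impersonate service accounts for this to work.

```shell
# Ask the API server whether a service account may perform an action.
# Usage: sa_can_i <namespace> <service-account> <verb> <resource>
# Example: sa_can_i default my-job-sa list pods
sa_can_i() {
  local ns="$1" sa="$2" verb="$3" resource="$4"
  kubectl auth can-i "$verb" "$resource" -n "$ns" \
    --as="system:serviceaccount:${ns}:${sa}"
}
```

The command prints `yes` or `no`; a `no` here usually means the account is missing a Role or RoleBinding in that namespace.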

Next Steps

Review the API Reference for detailed specification
Check Examples for more use cases
Learn about Configuration options
Read the Installation Guide for deployment options
Learn the Release Process to create and manage releases

Getting Help

If you encounter issues:

Check the Troubleshooting section
Review controller logs
Check GitHub Issues
Open a new issue with details about your problem
