Canary checker is a Kubernetes-native platform for monitoring health across applications and infrastructure using both passive and active (synthetic) mechanisms.
- Batteries Included - 35+ built-in check types
 - Kubernetes Native - Health checks (or canaries) are CRDs that reflect health via the status field, making them compatible with GitOps, Flux Health Checks, Argo, Helm, etc.
 - Secret Management - Leverage K8s secrets and configmaps for authentication and connection details
 - Prometheus - Prometheus-compatible metrics are exposed at /metrics. A Grafana dashboard is also available.
 - Dependency Free - Runs an embedded Postgres instance by default; can also be configured to use an external database.
 - JUnit Export (CI/CD) - Export health check results to JUnit format for integration into CI/CD pipelines
 - JUnit Import (k6/newman/puppeteer/etc) - Use any container that creates JUnit test results
 - Scriptable - Go templates, JavaScript and CEL can be used to:
   - Evaluate whether a check is passing and the severity to use when failing
   - Extract a user-friendly error message
   - Transform and filter check responses into individual check results
   - Extract custom metrics
 
 - Multi-Modal - While designed as a Kubernetes Operator, canary checker can also run as a CLI and a server without K8s
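
As a rough illustration of the Scriptable feature listed above, the sketch below attaches a CEL pass/fail expression and a Go template message to an HTTP check. The `test` and `display` field names, the `code` variable, and the canary name are assumptions made for illustration and are not confirmed by this README; check them against the docs before use.

```yaml
# Sketch only: `test`, `display` and the `code` variable are assumed names.
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: http-scripted # hypothetical name
spec:
  interval: 30
  http:
    - name: scripted-check
      url: https://httpbin.flanksource.com/status/200
      test:
        # CEL expression that decides whether the check passes
        expr: "code == 200"
      display:
        # Go template that builds a user-friendly message
        template: "HTTP status was {{.code}}"
```

Keeping the pass/fail logic in an expression means the same HTTP request can serve several different health definitions without changing the probe itself.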
 
- Install canary checker with Helm
 
helm repo add flanksource https://flanksource.github.io/charts
helm repo update
helm install \
  canary-checker \
  flanksource/canary-checker \
  -n canary-checker \
  --create-namespace \
  --wait
- Create a new check
 
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: http-check
spec:
  interval: 30
  http:
    - name: basic-check
      url: https://httpbin.flanksource.com/status/200
    - name: failing-check
      url: https://httpbin.flanksource.com/status/500
- Run the check locally (Optional)

wget https://github.com/flanksource/canary-checker/releases/latest/download/canary-checker_linux_amd64 \
  -O canary-checker && chmod +x canary-checker
./canary-checker run canary.yaml
- Apply the check
 
kubectl apply -f canary.yaml
- Check the health status
 
kubectl get canary
NAME         INTERVAL   STATUS   LAST CHECK   UPTIME 1H        LATENCY 1H   LAST TRANSITIONED
http-check   30         Passed   13s          18/18 (100.0%)   480ms        13s

See fixtures for more examples and docs for more comprehensive documentation.
Run simple HTTP/DNS/ICMP probes or more advanced full test suites using JMeter, K6, Playwright or Postman.
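
A simple probe of the kind mentioned above might look like the following sketch, which combines a DNS lookup and an ICMP ping in one canary. The field names (`server`, `query`, `querytype`, `minrecords`, `endpoint`, `thresholdMillis`, `packetLossThreshold`, `packetCount`) and all values are assumptions for illustration; the fixtures and docs are the authoritative reference.

```yaml
# Sketch only: field names and values below are assumed, not taken from this README.
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: simple-probes # hypothetical name
spec:
  interval: 30
  dns:
    - name: dns-lookup
      server: 8.8.8.8
      port: 53
      query: example.com
      querytype: A
      minrecords: 1 # fail if fewer records are returned
  icmp:
    - name: ping
      endpoint: example.com
      thresholdMillis: 600 # fail if the round trip is slower than this
      packetLossThreshold: 10
      packetCount: 2
```

The full test suite example that follows uses the same Canary resource shape, only with a `junit` check in place of the simple probes.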
# Run a container that executes a playwright test, and then collect the
# JUnit formatted test results from the /tmp folder
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: playwright-junit
spec:
  interval: 120
  junit:
    - testResults: "/tmp/"
      name: playwright-junit
      spec:
        containers:
          - name: playwright
            image: ghcr.io/flanksource/canary-playwright:latest

Verify that infrastructure is fully operational by deploying new pods, spinning up new EC2 instances and pushing/pulling from Docker and Helm repositories.
# Schedule a new pod with an ingress and then time how long it takes to schedule, be ready, respond to an http request and finally be cleaned up.
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: pod-check
spec:
  interval: 30
  pod:
    - name: golang
      spec: |
        apiVersion: v1
        kind: Pod
        metadata:
          name: hello-world-golang
          namespace: default
          labels:
            app: hello-world-golang
        spec:
          containers:
            - name: hello
              image: quay.io/toni0/hello-webserver-golang:latest
      port: 8080
      path: /foo/bar
      scheduleTimeout: 20000
      readyTimeout: 10000
      httpTimeout: 7000
      deleteTimeout: 12000
      ingressTimeout: 10000
      deadline: 60000
      httpRetryInterval: 200
      expectedContent: bar
      expectedHttpStatuses: [200, 201, 202]

Check that batch file processes are functioning correctly by checking the age and size of files in local file systems, SFTP, SMB, S3 and GCS.
# Checks that a recent DB backup has been uploaded
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: folder-check
spec:
  schedule: 0 22 * * *
  folder:
    - path: s3://database-backups/prod
      name: prod-backup
      maxAge: 1d
      minSize: 10gb

Aggregate alerts and recommendations from Prometheus, AWS CloudWatch, Dynatrace, etc.
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: alertmanager-check
spec:
  schedule: "*/5 * * * *"
  alertmanager:
    - url: alertmanager.monitoring.svc
      alerts:
        - .*
      ignore:
        - KubeScheduler.*
        - Watchdog
      transform:
        # for each alert, transform it into a new check
        javascript: |
          var out = _.map(results, function(r) {
            return {
              name: r.name,
              labels: r.labels,
              icon: 'alert',
              message: r.message,
              description: r.message,
            }
          })
          JSON.stringify(out);

Export custom metrics from the result of any check, making it possible to replace various other Prometheus exporters that collect metrics via HTTP, SQL, etc.
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: exchange-rates
spec:
  schedule: "@every 1h"
  http:
    - name: exchange-rates
      url: https://api.frankfurter.app/latest?from=USD&to=GBP,EUR,ILS
      metrics:
        - name: exchange_rate
          type: gauge
          value: result.json.rates.GBP
          labels:
            - name: "from"
              value: "USD"
            - name: to
              value: GBP

Canary checker is ideal for building platforms: developers can include health checks for their applications in whatever tooling they prefer, with secret management that uses native Kubernetes constructs.
apiVersion: v1
kind: Secret
metadata:
  name: basic-auth
stringData:
  user: john
  pass: doe
---
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: http-basic-auth-configmap
spec:
  http:
    - url: https://httpbin.flanksource.com/basic-auth/john/doe
      username:
        valueFrom:
          secretKeyRef:
            name: basic-auth
            key: user
      password:
        valueFrom:
          secretKeyRef:
            name: basic-auth
            key: pass

Canary checker comes with a built-in dashboard by default.
There is also a Grafana dashboard, or you can build your own using the exposed metrics.
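
To build a custom dashboard, Prometheus first has to scrape the metrics endpoint. The fragment below is a minimal sketch of a Prometheus scrape job for that purpose. The `/metrics` path comes from the feature list; the service address and port (`canary-checker.canary-checker.svc:8080`) are assumptions based on the namespace used in the Helm install and may differ in your cluster.

```yaml
# Sketch only: the target address and port are assumed, adjust to your deployment.
scrape_configs:
  - job_name: canary-checker
    metrics_path: /metrics
    scrape_interval: 30s
    static_configs:
      - targets:
          - canary-checker.canary-checker.svc:8080
```

Clusters that run the Prometheus Operator would typically express the same intent through a ServiceMonitor instead of a static scrape job.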
If you have any questions about canary checker:
- Read the docs
 - Invite yourself to the CNCF community Slack and join the #canary-checker channel.
 - Check out the YouTube playlist.
 - File an issue (we provide user support via GitHub Issues, so don't worry whether your issue is a bug or not).
 
Your feedback is always welcome!
| Protocol | Status | Checks |
|---|---|---|
| HTTP(s) | GA | Response body, headers and duration |
| DNS | GA | Response and duration |
| Ping/ICMP | GA | Duration and packet loss |
| TCP | GA | Port is open and connectable |
| Data Sources | | |
| SQL (MySQL, Postgres, SQL Server) | GA | Ability to login, results, duration, health exposed via stored procedures |
| LDAP | GA | Ability to login, response time |
| Elasticsearch / OpenSearch | GA | Ability to login, response time, size of search results |
| Mongo | Beta | Ability to login, results, duration |
| Redis | GA | Ability to login, results, duration |
| Prometheus | GA | Ability to login, results, duration |
| Alerts | | |
| Prometheus Alert Manager | GA | Pending and firing alerts |
| AWS CloudWatch Alarms | GA | Pending and firing alarms |
| Dynatrace Problems | Beta | Problems detected |
| DevOps | | |
| Git | GA | Query Git and GitHub repositories via SQL |
| Azure DevOps | Beta | |
| Integration Testing | | |
| JMeter | Beta | Runs and checks the result of a JMeter test |
| JUnit / BYO | Beta | Run a pod that saves JUnit test results |
| K6 | Beta | Runs K6 tests that export JUnit via a container |
| Newman | Beta | Runs Newman / Postman tests that export JUnit via a container |
| Playwright | Beta | Runs Playwright tests that export JUnit via a container |
| File Systems / Batch | | |
| Local Disk / NFS | GA | Check folders for files that are: too few/many, too old/new, too small/large |
| S3 | GA | Check contents of AWS S3 Buckets |
| GCS | GA | Check contents of Google Cloud Storage Buckets |
| SFTP | GA | Check contents of folders over SFTP |
| SMB / CIFS | GA | Check contents of folders over SMB/CIFS |
| Config | | |
| AWS Config | GA | Query AWS config using SQL |
| AWS Config Rule | GA | AWS Config Rules that are firing, custom AWS Config queries |
| Config DB | GA | Custom config queries for Mission Control Config DB |
| Kubernetes Resources | GA | Kubernetes resources that are missing or are in a non-ready state |
| Backups | | |
| GCP Databases | GA | Backup freshness |
| Restic | Beta | Backup freshness and integrity |
| Infrastructure | | |
| EC2 | GA | Ability to launch new EC2 instances |
| Kubernetes Ingress | GA | Ability to schedule and then route traffic via an ingress to a pod |
| Docker/Containerd | Deprecated | Ability to push and pull containers via docker/containerd |
| Helm | Deprecated | Ability to push and pull helm charts |
| S3 Protocol | GA | Ability to read/write/list objects on an S3 compatible object store |
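
As one illustration of a row in the table above, the TCP check ("Port is open and connectable") could be declared roughly as follows. This is a sketch under assumptions: the `endpoint` and `thresholdMillis` field names, the canary name, and the target address are illustrative and should be verified against the docs.

```yaml
# Sketch only: field names and the endpoint are assumed for illustration.
apiVersion: canaries.flanksource.com/v1
kind: Canary
metadata:
  name: tcp-check # hypothetical name
spec:
  interval: 30
  tcp:
    - name: database-port
      endpoint: postgres.default.svc:5432 # hypothetical host:port
      thresholdMillis: 1000 # fail if connecting takes longer than this
```
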
See CONTRIBUTING.md
Thank you to all our contributors!
Canary Checker core (the code in this repository) is licensed under Apache 2.0 and accepts contributions via GitHub pull requests after signing a CLA.
The UI (Dashboard) is free to use with canary checker under a license exception of Flanksource UI.
