Using a debug handler with Sensu Go

A monitoring infrastructure is not much use if you’re not sure it’s working correctly. To verify that it is, you may need to debug or test specific pieces of it. In those cases, a debug handler can be a great tool to have in your Sensu Go toolbox. It can be used for multiple purposes, including:

  • Creating sample events for testing, configuring, or debugging handlers
  • Testing new checks and/or check hooks to see the events they create and what’s being passed to a handler
  • Testing the effects of a mutator
  • Ensuring that metrics are being extracted to be sent to a metrics handler

Let’s look at the first use case: creating sample events in order to test handlers. Maybe you’ve decided to write your own handler, or you want to figure out the best set of arguments for an existing one. You could run countless iterations of updating the handler definition and then forcing a check to send an event to that handler. But wouldn’t it be easier to just have a few sample events to pipe into the handler?


Given that Sensu Go events are represented as JSON, the best tool for the job is going to be jq. We will create a simple handler that uses jq to pretty print our event(s) into a file for us.

{ 
  "type": "Handler", 
  "api_version": "core/v2", 
  "metadata": { 
    "name": "debug", 
    "namespace": "default" 
  }, 
  "spec": { 
    "command": "jq . >> /tmp/events.json", 
    "env_vars": null, 
    "filters": null, 
    "handlers": null, 
    "runtime_assets": null, 
    "timeout": 0, 
    "type": "pipe" 
  }
}
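If jq is not installed on the host that runs the handler, the same idea can be sketched in Python. This is only an illustration, not part of Sensu: the function name `handle` and the script layout are assumptions. It relies on the fact that a pipe handler receives the event as JSON on standard input.

```python
import json
import sys


def handle(stream=None, path="/tmp/events.json"):
    """Read one event from stream and append it, pretty-printed, to path."""
    # A Sensu pipe handler receives the event as JSON on stdin.
    stream = sys.stdin if stream is None else stream
    event = json.load(stream)
    with open(path, "a", encoding="utf-8") as out:
        json.dump(event, out, indent=2)
        out.write("\n")  # keep consecutive events on separate lines
    return event


# When saved as a script and used as the handler command,
# call handle() at the bottom of the file.
```

The output file has the same shape as the one produced by `jq . >> /tmp/events.json` (one pretty-printed JSON document after another), so the jq queries used later in this post should work on it unchanged.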

You will now want to assign this handler to a check. Given that this handler writes events to the local file system, I would suggest not assigning it to a check that generates a large number of events. Also, notice that there are no filters on this handler, so every event generated by the check will be written to your file. You could fine-tune this, if desired, by adding filters.

{ 
  "type": "CheckConfig", 
  "api_version": "core/v2", 
  "metadata": { 
    "name": "http", 
    "namespace": "default" 
  }, 
  "spec": { 
    "check_hooks": null, 
    "command": "check-http.rb -u http://webserver.example.com -q 'Welcome to CentOS'", 
    "env_vars": null, 
    "handlers": [ 
      "debug" 
    ], 
    "high_flap_threshold": 0, 
    "interval": 30, 
    "low_flap_threshold": 0, 
    "output_metric_format": "nagios_perfdata", 
    "output_metric_handlers": null, 
    "proxy_entity_name": "", 
    "publish": true, 
    "round_robin": false, 
    "runtime_assets": [ 
      "sensu/sensu-ruby-runtime", 
      "sensu-plugins/sensu-plugins-http" 
    ], 
    "stdin": false, 
    "subdue": null, 
    "subscriptions": [ 
      "linux" 
    ], 
    "timeout": 10, 
    "ttl": 0 
  }
}

After attaching this handler to a check, you should start seeing events appear in your file. If desired, you could inject a failure for your check so that you are capturing different event scenarios (e.g., passing or failing).

You will want to manage how long you have this debug handler active in order to limit the number of events in your file. You can run sensuctl edit check <check-name> to quickly remove the handler from your check once you feel you have collected a sufficient number of events.

After you’ve collected some events in your output file, you can extract single events to continue your handler testing. You can use the jq commands below, for example, to capture the first “passing” and “failing” events in your debug file:

$ jq -s '[.[] | select (.check.state == "passing")] | .[0]' /tmp/events.json > passing.json

$ jq -s '[.[] | select (.check.state == "failing")] | .[0]' /tmp/events.json > failing.json  
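Note that the debug file is a sequence of JSON documents rather than one JSON array, which is why the commands above use `-s` (slurp). For readers who would rather script this step, here is a rough Python equivalent; `iter_events` and `first_with_state` are hypothetical helper names, not part of Sensu or jq.

```python
import json


def iter_events(text):
    """Yield each JSON document from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        event, pos = decoder.raw_decode(text, pos)
        yield event


def first_with_state(events, state):
    """Return the first event whose check.state matches, or None."""
    for event in events:
        if event.get("check", {}).get("state") == state:
            return event
    return None


sample = '{"check": {"state": "passing"}}\n{"check": {"state": "failing", "status": 2}}'
failing = first_with_state(iter_events(sample), "failing")
print(failing["check"]["status"])  # prints 2
```

Returning `None` when nothing matches mirrors what the jq filter does, since `.[0]` on an empty array yields `null`.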

Now with these events handy, you can use them to test your handler.

$ cat passing.json | sensu-slack-handler --channel '#testalerts' --timeout 20 --username 'sensu'

$ cat failing.json | sensu-slack-handler --channel '#testalerts' --timeout 20 --username 'sensu'  
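The same piping approach works for a handler you write yourself. As a stand-in for such a handler, the sketch below builds a one-line summary from an event; `format_alert` is a made-up name, and a real script would wrap it with something like `json.load(sys.stdin)` to read the piped event.

```python
def format_alert(event):
    """Build a one-line summary from a Sensu Go event dict."""
    check = event.get("check", {})
    entity = event.get("entity", {}).get("metadata", {}).get("name", "unknown")
    name = check.get("metadata", {}).get("name", "unknown")
    state = check.get("state", "unknown")
    # Check output usually ends with a newline; strip it for a single line.
    output = (check.get("output") or "").strip()
    return f"[{state}] {entity}/{name}: {output}"
```

Running a function like this against both passing.json and failing.json is a quick way to confirm that the handler copes with each state before it is attached to a live check.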

The second use case from our list, testing new checks, will reuse a lot of the same tooling from above. You would assign the debug handler to your new check, allow some events to be created (remembering to remove the debug handler after a short time), and then use jq to query the output.

$ jq '{name: .check.metadata.name, state: .check.state, status: .check.status, out: .check.output}' /tmp/events.json
{ 
  "name": "http", 
  "state": "passing", 
  "status": 0, 
  "out": "CheckHttp OK: 200, found /Welcome to CentOS/ in 4833 bytes\n"
}
{ 
  "name": "http", 
  "state": "failing", 
  "status": 2, 
  "out": "CheckHttp CRITICAL: Request error: Failed to open TCP connection to agent:80 (Connection refused - connect(2) for \"172.28.128.12\" port 80)\n"
}
{ 
  "name": "http", 
  "state": "passing", 
  "status": 0, 
  "out": "CheckHttp OK: 200, found /Welcome to CentOS/ in 4833 bytes\n"
}  

As you can see above, you can easily use jq to output just the fields of interest (see the events reference for the full list of fields). In fact, if those (or other) fields were the only ones of interest, you could change the debug handler command itself to log them exclusively.

{ 
  "type": "Handler", 
  "api_version": "core/v2", 
  "metadata": { 
    "name": "debug", 
    "namespace": "default" 
  }, 
  "spec": { 
    "command": "jq '{name: .check.metadata.name, state: .check.state, status: .check.status, out: .check.output}' >> /tmp/events.json", 
    "env_vars": null, 
    "filters": null, 
    "handlers": null, 
    "runtime_assets": null, 
    "timeout": 0, 
    "type": "pipe" 
  }
}  

For the third use case, imagine you want to create a mutator to add some content to your event before it reaches the handler. This debug handler pattern would help in determining that the mutator is behaving as expected.

Below is the definition for a simple mutator that will add some metadata to the event based on the check that’s being run. No, this isn’t an overly useful example, but the point here is to show using a debug handler to make sure your mutator is acting as expected. And since a mutator is meant to transform the event data prior to sending it to a handler, our example will make use of jq to continue illustrating its power.

{ 
  "type": "Mutator", 
  "api_version": "core/v2", 
  "metadata": { 
    "name": "team-mutator", 
    "namespace": "default", 
    "labels": null, 
    "annotations": null 
  }, 
  "spec": { 
    "command": "jq 'if .check.metadata.name == \"http\" then .metadata.team = \"webOps\" else .metadata.team = \"Ops\" end'", 
    "timeout": 0, 
    "env_vars": [], 
    "runtime_assets": [] 
  }
}
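To make the intent of the jq filter explicit, here is the same logic as a small Python function. It is only a reading aid under the assumption that the event has the usual `check.metadata.name` field; `add_team` is a hypothetical name.

```python
def add_team(event):
    """Mirror the team-mutator jq filter: tag the event with a team."""
    check_name = event.get("check", {}).get("metadata", {}).get("name")
    team = "webOps" if check_name == "http" else "Ops"
    # Copy rather than modify in place, so the input event stays untouched.
    mutated = dict(event)
    mutated["metadata"] = {**(event.get("metadata") or {}), "team": team}
    return mutated
```

Every event from the `http` check should therefore carry `"team": "webOps"` in its top-level metadata, and every other event `"team": "Ops"`, which is what the debug handler output can be checked against.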

We then add this mutator to our debug handler:

{ 
  "type": "Handler", 
  "api_version": "core/v2", 
  "metadata": { 
    "name": "debug", 
    "namespace": "default" 
  }, 
  "spec": { 
    "mutator": "team-mutator", 
    "command": "jq . >> /tmp/events.json", 
    "env_vars": null, 
    "filters": null, 
    "handlers": null, 
    "runtime_assets": null, 
    "timeout": 0, 
    "type": "pipe" 
  }
}  

Finally, add the debug handler to some checks and confirm that the data is being transformed before it is written to our events.json output file.

$ jq '{name: .check.metadata.name, state: .check.state, status: .check.status, out: .check.output, team: .metadata.team}' /tmp/events.json
{ 
  "name": "http",
  "state": "passing", 
  "status": 0, 
  "out": "CheckHttp OK: 200, found /Welcome to CentOS/ in 4833 bytes\n", 
  "team": "webOps"
}
{ 
  "name": "cpu",
  "state": "passing", 
  "status": 0, 
  "out": "CheckCPU TOTAL OK: total=0.2 user=0.2 nice=0.0 system=0.0 idle=99.8 iowait=0.0 irq=0.0 softirq=0.0 steal=0.0 guest=0.0 guest_nice=0.0\n", 
  "team": "Ops"
}
{ 
  "name": "http",
  "state": "failing", 
  "status": 2, 
  "out": "CheckHttp CRITICAL: Request error: Failed to open TCP connection to agent:80 (Connection refused - connect(2) for \"172.28.128.12\" port 80)\n", 
  "team": "webOps"
}

Our last use case is ensuring that metrics are being extracted and sent to a metrics handler. Again, set up the debug handler as you did in our first use case, but this time name the output file event_metrics.json.

{ 
  "type": "Handler", 
  "api_version": "core/v2", 
  "metadata": { 
    "name": "debug", 
    "namespace": "default" 
  }, 
  "spec": { 
    "command": "jq . >> /tmp/event_metrics.json", 
    "env_vars": null, 
    "filters": null, 
    "handlers": null, 
    "runtime_assets": null, 
    "timeout": 0, 
    "type": "pipe" 
  }
}

Add this debug handler to the output_metric_handlers list for your check:


{ 
  "type": "CheckConfig", 
  "api_version": "core/v2", 
  "metadata": { 
    "name": "metrics-cpu", 
    "namespace": "default" 
  }, 
  "spec": { 
    "check_hooks": null, 
    "command": "metrics-cpu.rb", 
    "env_vars": null, 
    "handlers": [], 
    "high_flap_threshold": 0, 
    "interval": 30, 
    "low_flap_threshold": 0, 
    "output_metric_format": "graphite_plaintext", 
    "output_metric_handlers": [ 
      "influxdb", 
      "debug" 
    ], 
    "proxy_entity_name": "", 
    "publish": true, 
    "round_robin": false, 
    "runtime_assets": [ 
      "sensu/sensu-ruby-runtime", 
      "sensu-plugins/sensu-plugins-cpu-checks" 
    ], 
    "stdin": false, 
    "subdue": null, 
    "subscriptions": [ 
      "linux" 
    ], 
    "timeout": 10, 
    "ttl": 0 
  }
}  

And, as before, allow some events to be generated and collected into your output file and then disable the debug logging by editing the check and removing the debug handler.

You can ensure that metrics are being collected by extracting the first event from the output file with the following jq command:


$ jq -s '.[0].metrics.points' /tmp/event_metrics.json
[ 
  { 
    "name": "agent.cpu.total.user", 
    "value": 7357, 
    "timestamp": 1576530759, 
    "tags": null 
  }, 
  { 
    "name": "agent.cpu.total.nice", 
    "value": 2, 
    "timestamp": 1576530759, 
    "tags": null 
  },
[additional metrics deleted for brevity]
]  

Or you can simply count the metric points in each captured event using this jq command:

$ jq '.metrics.points | length' /tmp/event_metrics.json
27
27
27
27
27
27
27  


A count of zero (0) in the above output would mean that no metrics were extracted from the output of the check command.
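If several checks write to the same debug file, it can help to see which of them produced no metric points at all. The sketch below assumes the events have already been loaded as a list of dicts (for example, by saving the output of `jq -s . /tmp/event_metrics.json` and reading it with `json.load`); both function names are hypothetical.

```python
def metric_point_counts(events):
    """Return a (check name, number of metric points) pair for each event."""
    counts = []
    for event in events:
        name = event.get("check", {}).get("metadata", {}).get("name", "unknown")
        # "metrics" or "points" may be missing or null when nothing was extracted.
        points = (event.get("metrics") or {}).get("points") or []
        counts.append((name, len(points)))
    return counts


def checks_without_metrics(events):
    """Return the sorted names of checks that had an event with zero points."""
    return sorted({name for name, count in metric_point_counts(events) if count == 0})
```

A missing or null `metrics` field is treated as zero points here, which matches how jq reports the length of `null`.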

Hopefully the examples above have allowed you to see the value of a debug handler (as well as jq). We encourage you to give these examples a try in your environment. If you don’t have an existing Sensu Go environment, download the latest version, try our quick start guide, or check out our online course. Finally, we invite you to join us on Discourse, where you can ask questions (and get answers!) from the Sensu Community.
