Monitoring pfSense with Nagios Using SSH – part 3

Configuring the checks on Nagios XI

This is the third and final part of the series on monitoring pfSense with Nagios XI using SSH. If you missed either of the previous parts, the links are below.
Note: If you’re configuring this on Nagios Core, scroll down to the bottom of this page for the example commands.cfg and services.cfg files.

Part 1: Setting up password-less SSH
Part 2: Downloading and testing the checks

15Dec2017 – Originally posted
9May2018 – Added uptime and CPU temperature check as well as a Nagios Core example
11May2018 – Modified the check_pf_mem plugin
1June2018 – Added Nagios Core services.cfg and commands.cfg examples
29Oct2018 – check_ping changed to check_icmp
7Jan2019 – added info about ‘username is reserved’ error
16Apr2020 – added info about limiting sudo commands
6Apr2022 – Added check_pf_gw_status (contributed by Manuel Gayer)

Finally, let’s configure the checks on Nagios XI. Go to the SSH Proxy wizard. I like to change the OS to FreeBSD, but all that really does is change the icon in the web interface.

[Screenshot: Nagios pfSense SSH Proxy]

Change the host name to whatever you’d like. In my example, I chose pfSense-home. Uncheck the two other remote commands so that only the check_disk remote command stays selected, then change that remote command to the text below. Pay attention to the path, because the default Nagios entry swaps the order of libexec and nagios! I recommend changing the display name to “Disk – Root” so that when you monitor other partitions, they all sort together in the web GUI.

/usr/local/libexec/nagios/check_disk / -w 20% -c 5%

[Screenshot: Nagios pfSense SSH Proxy Step 2]

Answer the remaining questions/screens as you see fit. Once the configuration changes are applied and the service check has run, you should see the new check in your Nagios service details.

[Screenshot: Nagios pfSense initial service detail]

Great start! But where are the rest of the checks from part 2? In Nagios XI, you can add more in one of two ways. Either a) go to the Core Config Manager (CCM) and copy configs or b) go *back* through the wizard and copy/paste each of the lines below. I prefer the second method because it involves far less mouse clicking. If you opt for the CCM copy method, don’t forget to ‘apply configuration’ at the end!

Obviously, you will need to omit or change lines to meet the needs of your firewall/environment. For instance, if you use a VPN, you will need to change the IP address and name. You will also need to change the interface names if you want to monitor those. If you use a Windows server for DHCP or DNS, don’t add the service monitors for dhcpd or unbound (DNS).

Also note the two entries that start with ‘sudo’. If you receive a “remote command execution failed” or permissions error on those checks, sudo is the likely cause. If you need help configuring sudo on pfSense, refer to part 1 of this series. If you would like more detail on the individual checks, refer to part 2.

Remote command | Display name
/usr/local/libexec/nagios/check_disk /var/run -w 20% -c 5% | Disk – VarRun
/usr/local/libexec/nagios/check_icmp -H -w 80,10% -c 150,40% | Ping to OpenDNS
/usr/local/libexec/nagios/check_ntp_time -H | Variation
/usr/local/libexec/nagios/check_load -w 3,2.8,2.6 -c 10,7,5 -r | Load
/usr/local/libexec/nagios/check_procs -w 200 -c 400 | Total Processes
/usr/local/libexec/nagios/check_swap -w 90% -c 40% | Swap Usage
/usr/local/libexec/nagios/check_pf_cpu_temp -w 75 -c 90 | CPU Temperature
/usr/local/libexec/nagios/check_pf_cpu -w 85 -c 95 | CPU Usage
/usr/local/libexec/nagios/check_pf_mem -w 90 -c 95 | Memory Usage
/usr/local/libexec/nagios/check_pf_interface -i em1_vlan5 -name DEVICES | Interface DEVICES
sudo /usr/local/libexec/nagios/check_pf_ipsec_tunnel -e <IP address> -name DallasTX | VPN to DallasTX
sudo /usr/local/libexec/nagios/check_pf_state_table -w 60 -c 90 | State Table
/usr/local/libexec/nagios/check_pf_services -name snort | Service: snort
/usr/local/libexec/nagios/check_pf_services -name pinger | Service: pinger
/usr/local/libexec/nagios/check_pf_services -name dhcpd | Service: dhcpd
/usr/local/libexec/nagios/check_pf_services -name unbound | Service: unbound-DNS
/usr/local/libexec/nagios/check_pf_gw_status -G WAN_DHCP -w 60,5 -c 200,20 | Gateway Status
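
Before pasting a line into the wizard, it can help to run the same remote command by hand from the Nagios server. Here is a minimal sketch that prints the check_by_ssh command line for a given remote command. The wrap_check function is made up for illustration, and both the plugin directory and the 192.0.2.1 address are assumptions that you will need to replace with your own values.

```shell
#!/bin/sh
# Hypothetical helper: print the check_by_ssh command line for one remote check.
# PLUGIN_DIR and PFSENSE_HOST are assumptions; adjust both for your install.
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/nagios/libexec}"
PFSENSE_HOST="${PFSENSE_HOST:-192.0.2.1}"

wrap_check() {
    # $1 is the full remote command, exactly as entered in the wizard
    printf "%s/check_by_ssh -H %s -C '%s'\n" "$PLUGIN_DIR" "$PFSENSE_HOST" "$1"
}

# Print test commands for one regular check and one sudo check
wrap_check "/usr/local/libexec/nagios/check_disk /var/run -w 20% -c 5%"
wrap_check "sudo /usr/local/libexec/nagios/check_pf_state_table -w 60 -c 90"
```

Run each printed line as the user that holds the password-less SSH key. If a check works here but fails in the web interface, the remote command text in the wizard is the first thing to compare.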

So what does the final result look like (see below)? Beautiful! Now that is how you monitor a firewall! A reader was kind enough to send me their Nagios Core screenshot as well.

If there are particular checks you would like to see added, let me know and I’ll add them. Better yet, write them up and/or add them to the GitHub repo and I’ll give you credit!

I’ve included the Nagios XI services config file so you can download it to compare checks. I’ve also included examples of the Nagios Core services.cfg and commands.cfg files so Core users would have a better idea of how to configure this solution as well.

Nagios XI – Download services example
Nagios Core – Download commands.cfg example
Nagios Core – Download services.cfg example
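
For Core users who want a rough idea of the shape before downloading, here is a minimal sketch that prints one service definition per check. It assumes a single generic check_by_ssh command in commands.cfg that takes the whole remote command as $ARG1$ (shown in the comment). The emit_service function is hypothetical, and the downloadable files may be laid out differently.

```shell
#!/bin/sh
# Hypothetical generator for Nagios Core service definitions.
# Assumes commands.cfg defines a generic command along these lines:
#   define command {
#       command_name check_by_ssh
#       command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C '$ARG1$'
#   }
emit_service() {
    # $1 = host_name, $2 = service_description, $3 = remote command
    printf 'define service {\n'
    printf '    use                 generic-service\n'
    printf '    host_name           %s\n' "$1"
    printf '    service_description %s\n' "$2"
    printf '    check_command       check_by_ssh!%s\n' "$3"
    printf '}\n'
}

emit_service "pfSense-home" "CPU Usage" "/usr/local/libexec/nagios/check_pf_cpu -w 85 -c 95"
```

Keeping the command definition generic means each new check is only a new service block. The remote command must not contain a ‘!’, because Nagios uses that character to separate arguments.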

Nagios XI example

[Screenshot: Nagios pfSense services detail]

Nagios Core example

[Screenshot: Nagios Core pfSense monitoring]

28 thoughts on “Monitoring pfSense with Nagios Using SSH – part 3”

  1. Dallas, excellent write-up. This is working perfectly. By the way, thanks for taking the time to write the additional monitoring checks. Saved me a ton of time.

  2. Excellent,

    Is it mandatory to use the proxy? Is it possible to use another user other than Nagios? for example nagios2?

    Could you explain how to do it in Nagios Core? I tried copying the text services but I do not know what I have to replace with: check_command check_xi_by_ssh

    1. It is kind of a confusing name. The SSH proxy simply means you are checking a service via SSH, i.e. not checking the SSH service itself. In that sense, it really isn’t a proxy as you would think in terms of a web or email proxy that uses a go between service/box to access something else. Hopefully that makes sense.

      You can name the user whatever you like. After you get the initial password-less SSH configured, you never need to use the username again anyway.

      Nagios Core uses the check_by_ssh command. Here is a check_by_ssh example for the check_pf_uptime command. Note: if you have additional arguments, just enclose them in the single quotes as well. Also, make sure you don’t nest single quotes inside an argument. However, you can use double quotes inside single quotes and vice versa. I’ll see if I can round up a commands file and a corresponding services file to post here for Core users in the next few days.
      ./check_by_ssh -H -C '/usr/local/libexec/nagios/check_pf_uptime'
      — OUTPUT —
      WARNING – 18 Hours, 26 Minutes

  3. I am having problems declaring check_by_ssh in commands.cfg, and as a result the service in the host file does not work.
    # 'check_by_ssh' command definition
    define command {
    command_name check_by_ssh
    command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/usr/local/libexec/nagios/check_pf_cpu -w $ARG1$ -c $ARG2$"

    define service {
    service_description CPU Usage
    use generic-service
    host_name PFSENSE2
    check_command check_by_ssh!-C "/usr/local/libexec/nagios/check_pf_cpu -w 80 -c 95"

    Can you help me? Many thanks!

  4. I have problems declaring the commands to use it as a service.
    # 'check_by_ssh' command definition
    define command {
    command_name check_by_ssh
    command_line $USER1$/check_by_ssh -H $HOSTADDRESS$ -C "/usr/local/libexec/nagios/check_pf_cpu -w $ARG1$ -c $ARG2$"

    and service:

    define service {
    service_description CPU Usage
    use generic-service
    host_name PFSENSE2
    check_command check_by_ssh!-C "/usr/local/libexec/nagios/check_pf_cpu -w 80 -c 95"

    1. Jordi, there are several ways to configure this and none of them are necessarily wrong. IMO, your commands.cfg entry is too restrictive because it hard-codes one plugin; it should simply be a general check_by_ssh command. The services config would then define the various checks using that more general command. In your examples above, commands.cfg expects the warning/critical values as $ARG1$ and $ARG2$, but services.cfg never passes them separately; it passes the whole remote command as a single argument. This morning I added downloadable commands.cfg and services.cfg files at the end of part 3 for Nagios Core users. I would recommend downloading them and comparing them against what you have. Holler if you need anything else!

  5. Thanks for your posting all steps,
    I have done all the steps but I get an error. Could you please tell me whether there is some option for the firewall in nagios.cfg, like the ones for adding Windows or a switch, where we allow the config file for Windows and the switch?

    1. I’m not quite sure what you are asking. At the end of part 3, I have a sample commands.cfg and services.cfg. You should also be able to run the check_by_ssh command below to verify the password-less SSH is configured and Nagios can communicate with the pfSense. If that isn’t what you were looking for, feel free to rephrase your question and I’d be happy to help out!
      ./check_by_ssh -H $HOSTADDRESS$ -C "/usr/local/libexec/nagios/check_pf_version"

  6. Hi Dallas,
    Very good page and nice scripts, all of which I implemented over NRPE to monitor my pfSense box running the latest 2.4.3 release. Thanks for sharing it all!

    The only “glitch” I couldn’t fix is the output of the check_pf_state_table script. When run straight from the command line on the pfSense box, it works fine and prints the outcome correctly:
    OK – PF state table: 1152 ( 0% full – limit: 197000) | current_states=1152;state_limit=197000;percent_used=0).

    But when this same command is being triggered by my Nagios server (nagios core 4.2.4) over NRPE, Nagios will only display this:

    “- PF state table: ( % full – limit: )”

    in the “status information” column. I tried editing the “check_pf_state_table” shell script, but no success so far. Other shell scripts work well and display correctly under status information (for example, check_pf_cpu).
    Any idea what might cause this?

    1. Hey Nicolas! Thanks for the feedback! It sounds like you almost have it. It’s been a while since I used NRPE, but I’m guessing it is a permissions issue. Is ‘sudo’ enabled for that check? NRPE on pfSense should have a checkbox next to each command that allows that particular check to run with additional privileges. I don’t know if you need to install the separate sudo pfSense package or if it comes with the NRPE package, but that would be something else to check. I’d love to hear back when you figure it out!

      — Update —
      Curiosity got the better of me… I installed NRPEv2 and configured a new check for the state table. Without checking ‘sudo’ the values were empty as you described. Clicking ‘sudo’ and hitting save allowed me to re-run the exact same check from Nagios without any issues. I’ve included my before and after below.
      $ /usr/local/nagios/libexec/check_nrpe -H -c check_pf_state_table -p 5666
      – PF state table: ( % full – limit: ) | current_states=;state_limit=;percent_used=
      $ /usr/local/nagios/libexec/check_nrpe -H -c check_pf_state_table -p 5666
      OK – PF state table: 592 ( 0% full – limit: 200000) | current_states=592;state_limit=200000;percent_used=0

      1. Hi Dallas! Added the sudo package to pfsense, ticked the sudo box for that check in pfsense/nrpe and added ‘nagios’ user in sudoers file (/usr/local/etc/sudoers), restarted nrpe2 and tried again but still no success! 😮
        When I run this from my nagios box:
        ./check_nrpe -H -c check_pf_state_table -p 5666

        I now get: NRPE: Unable to read output

        If I untick the sudo box, I get
        – PF state table: ( % full – limit: ) | current_states=;state_limit=;percent_used=

        any idea is welcome :-)?

        1. I think you almost have it! FWIW, I’ve never manually added the nagios user to the sudoers file via the command line because the pfSense sudo package has a graphical interface (System -> sudo). Assuming your syntax is correct, it appears you are also missing the hostname or IP address in the example above; my previous response to you had its “bracket” IP address “bracket” removed, so I apologize for that. Also, I’m not sure whether the error you are receiving comes from the Nagios web app, but if you test from the command line on the Nagios system, you often get a little more information when an argument is missing. Good luck!
          # /usr/local/nagios/libexec/check_nrpe -H <IP address> -c check_pf_state_table -p 5666

  7. Hello Dallas,

    Thank you very much for putting this together. Your documentation is very well written, and the work is great.

    We have Snort set-up to do IP blocking based on events. I would like to have Nagios display the current contents of the snort2c pf table.

    I am a System Admin, and we all know that means I don’t code worth a (redacted). What do you think would be the best approach to have a ‘check_pf_snort2c’ type of thing added?
    Thanks again for the great work and documentation!

    1. Hey Vince! Happy to hear it helped you! Quite honestly, I’m a horrible coder too. 😉

      In your case, you simply want to display the snort2c table? If you don’t need alert logic, displaying that would be very straightforward as the command ‘pfctl -t snort2c -T show’ will show everything in the table.
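
      As a rough illustration of that idea, here is a minimal sketch of a check that reports how many addresses are in the table without any alert logic. The count_entries and report functions and the output wording are made up for this example and are not part of the existing check_pf plugins.

      ```shell
      # Hypothetical check_pf_snort2c sketch: report how many addresses are in the table.
      count_entries() {
          # Count non-blank lines on stdin (pfctl prints one address per line)
          grep -c '[^[:space:]]' || true
      }

      report() {
          # $1 = number of blocked addresses; always OK because there is no alert logic
          echo "OK - snort2c table: $1 blocked addresses | blocked=$1"
      }

      # On pfSense this would be fed by: pfctl -t snort2c -T show
      # (reading the table needs root, so expect to run it through sudo)
      if command -v pfctl >/dev/null 2>&1; then
          report "$(pfctl -t snort2c -T show 2>/dev/null | count_entries)"
      fi
      ```

      Adding warning and critical thresholds later would only mean comparing the count against two numbers before choosing the status word and exit code.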

  8. Hi
    Thanks for a nice plugin.
    I have made a fresh install on pfSense 2.4.4 and all checks except three work as they should. On State Table and VPN to xxx I get the following response:
    Remote command execution failed: sudo: no tty present and no askpass program specified
    Uptime works, but it is in Warning status even though the status information shows the correct uptime.
    Any suggestions?

    1. Thanks for the feedback Peter! I believe that error indicates you have the sudo plugin installed, however, you may not have the Nagios user added to the sudo/root group (System -> sudo) or have ‘no password’ checked. Try the same commands from the command line (pfSense or Nagios) to see if they work there. Uptime will come back as a warning if the uptime is under 24 hours. More than anything, uptime serves as a visual indicator if the system unexpectedly reboots. Hope that helps!

      1. Thanks for the feedback.
        It seems that there is a small bug in the GUI for sudo: if you create a new row, select/fill in the values and hit save, it removes the tick on “No password”. So I just ticked it again and saved it, and now it works like it should.
        Once again, thanks for the work you put into it.

  9. So, I really appreciated the detailed writeup here. With only about a month of experience in Nagios and none on pfSense, I was able to set up a reasonable slate of monitoring in a couple of days.

    I am using Nagios Core 4.5, so it is all in text files. We don’t have any specific point-to-point VPNs set up and I haven’t looked at how best to watch our OpenVPN, but the other scripts you provided were really helpful.

    What I was missing, though, was a way to watch our load balancing configs. We use these in front of our website. So I wrote one in csh that does the trick (the only shell I like less than csh is Bourne shell).

    Anyway, I’d like to upload it or send it to you. I’m a lifer infrastructure guy and have never had a use for git so I am ill equipped to send it there without a little help. Let me know if you’re interested and I’ll be happy to send it along.

    1. Hey Don! My apologies for my late reply. I’ve had success monitoring OpenVPN as a service. The only gotcha is that you need to pay attention to the server or client id and add it to the check. I’ve included the example from the code below. I would love to add the load balance monitoring to the git repo. I’ll shoot you an email shortly. Thanks for offering!

      echo "Note: If openvpn is the service getting checked, two options must be"
      echo " specified -- the option "server" followed by the server id."
      echo "Example: -name openvpn server 1"

  10. Hello! When I use the check_pf commands, the pnp4nagios graph doesn’t display the warning and critical levels, but for other checks, for example check_load, it does show them. What can I do? Is it a bug, or am I using the wrong pnp4nagios template?

    1. I’m not completely familiar with the pnp4nagios graph or templates. However, you won’t see warnings and criticals on the graph unless a “limit” is explicitly provided/returned in the performance data. For example, the state table has the “state_limit” returned so you know at a glance how close your states come to said limit. I hope that makes sense!
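
      To make that concrete, here is a minimal sketch of plugin output that carries thresholds in the performance data, assuming the standard Nagios perfdata layout of label=value;warn;crit;min;max. The perfdata helper and the cpu_usage label are made up for illustration and are not taken from the check_pf plugins.

      ```shell
      # Hypothetical helper: format one perfdata item as label=value;warn;crit;min;max
      perfdata() {
          # $1 label, $2 value, $3 warn, $4 crit, $5 min, $6 max
          printf '%s=%s;%s;%s;%s;%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
      }

      # Thresholds in percent, matching the "-w 85 -c 95" used for the CPU check in the article
      echo "OK - CPU usage: 12% | $(perfdata cpu_usage 12% 85 95 0 100)"
      ```

      A graphing tool can only draw warning and critical lines when those fields after the value are filled in, so a plugin that prints just label=value gives it nothing to plot.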

  11. Hi Dallas,

    Thank you for your great work! We have been using your plugins successfully for a few years, but with the latest version of pfSense, 2.5, the check_pf_cpu script throws an error.

    Any thoughts?



  12. Hi, so I’ve been following this guide from part 1 through 2 and I managed to get everything right.
    However, I am stuck on part 3. I am fairly new to all of this; my Nagios runs on CentOS and it’s Nagios Core. Could someone explain to a newbie how to add all of the CFG files so the checks show up in Nagios Core?

    1. There are service and config example files on the bottom of that same page. Take a look at those and see if it helps. It’s been a while since I’ve used core, but the concept of adding them is the same as any others you would add to core. Holler if that doesn’t make sense!
