
Monitoring pfSense with Nagios Using SSH – part 2
Downloading and testing the checks
In part 1, we set up password-less SSH. Now that we have a secure connection between the systems, we are quite a bit closer to securely running check commands using the SSH proxy on Nagios XI or check_by_ssh on Nagios Core.
Changelog
15Dec2017 – Originally posted
9May2018 – Added uptime and CPU temperature check as well as a Nagios Core example
11May2018 – Modified the check_pf_mem plugin
1June2018 – Added Nagios Core services.cfg and commands.cfg examples
29Oct2018 – check_ping changed to check_icmp
7Jan2019 – added info about ‘username is reserved’ error
16Apr2020 – added info about limiting sudo commands
6Apr2022 – Added check_pf_gw_status (contributed by Manuel Gayer)
First though, we need to get the various plugins on the pfSense box. We are going to use a handful of custom scripts, but we’ll also use some pre-compiled executables. You *could* compile them yourself by downloading them from https://nagios-plugins.org/downloads/, but I would not recommend it. Instead, grab pre-compiled versions from freshports.org. This is easy in FreeBSD. You just need to be at a command line on the pfSense system. If you are still logged in from the last section, remember that you are signed in as the Nagios user. In that case, use ‘sudo’ prior to the ‘pkg install nagios-plugins’ command as shown below. After replying ‘y’ to the ‘proceed with this action’ question the command will pull the files down and place them in the package’s preferred directory.
# sudo pkg install nagios-plugins
Updating pfSense-core repository catalogue...
pfSense-core repository is up to date.
Updating pfSense repository catalogue...
pfSense repository is up to date.
All repositories are up to date.
The following 1 package(s) will be affected (of 0 checked):
New packages to be INSTALLED:
    nagios-plugins: 2.2.1_5,1 [pfSense]
Number of packages to be installed: 1
The process will require 2 MiB more space.
366 KiB to be downloaded.
Proceed with this action? [y/N]: y
Excellent! Now the pre-compiled plugins can be found in the ‘/usr/local/libexec/nagios’ directory. Give your newly installed plugins a test run by typing in the command below. If all goes well, you should receive some output specifying your current number of processes.
# /usr/local/libexec/nagios/check_procs
PROCS OK: 67 processes | procs=67;;;0;
So that’s great, but those files aren’t exactly specific to pfSense. What about monitoring items such as services, VPNs, etc.? I have created custom scripts for those checks, which are freely available on GitHub. You can easily download these to your pfSense firewall using the curl and unzip commands below. Make sure you run these commands on your pfSense system.
# curl -LO https://github.com/oneoffdallas/pfsense-nagios-checks/archive/master.zip
# sudo unzip -j master.zip -d /usr/local/libexec/nagios/
# sudo chmod +x /usr/local/libexec/nagios/check_pf_*
If everything went as planned, you can run the command below and get some output back.
# /usr/local/libexec/nagios/check_pf_version
Current version: 2.4.2-RELEASE / Mon Nov 20 08:12:56 CST 2017
Various Checks Explained
Below, I’ve also gone through a fair amount of effort explaining the plugins I recommend as well as some default values for pfSense that should work in most cases. If you want to test out some of them on your own pfSense, just run the commands (not their output) from the /usr/local/libexec/nagios directory (unless you want to type in the full path).
If you want to just go with my recommendations and don’t want the full explanations, then head over to part 3.
Go to Part 3: Configuring the checks on Nagios
Off the shelf FreeBSD checks
./check_icmp -H 208.67.222.222 -w 80,10% -c 150,40%
OK - 208.67.222.222: rta 12.252ms, lost 0%|rta=12.252ms;80.000;150.000;0; pl=0%;10;40;; rtmax=12.353ms;;;; rtmin=12.146ms;;;;
So this isn’t ping “to” the pfSense. Instead, this is ping “from” the pfSense to another system. This is useful if you are monitoring the internet connection of a firewall (local or remote). It’s also useful if you stack a few of these together, i.e. you can compare your pings to OpenDNS (in the example) vs. pings to vendor XYZ. If the pings to vendor XYZ increase and the corresponding data to OpenDNS does not, your vendor probably has an issue on their hands… Even better, now you’ll have the data to show them! Assuming you have decent internet, this setup should work. It will create a warning if the average roundtrip exceeds 80ms or the packet loss exceeds 10%. Likewise, it will cause a critical alert if the roundtrip exceeds 150ms or the packet loss exceeds 40%. Note: This used to be check_ping instead of check_icmp. Functionally, they do the same thing. That said, check_icmp is faster and better because it doesn’t call the ping command from the distribution, i.e. there is no output to parse.
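To make the threshold pairs a little more concrete, here is a minimal sketch of how I understand “-w 80,10% -c 150,40%” to be judged: either value crossing its limit is enough to raise the alert. The icmp_status function is a hypothetical helper written for illustration, not part of check_icmp, and the limits are hard-coded to match the example above.

```shell
#!/bin/sh
# Sketch only: icmp_status is a made-up helper, not part of check_icmp.
# Usage: icmp_status <average_rta_ms> <packet_loss_pct>
icmp_status() {
    awk -v rta="$1" -v pl="$2" 'BEGIN {
        # Critical if EITHER the roundtrip or the loss crosses its limit.
        if (rta >= 150 || pl >= 40)     print "CRITICAL"
        else if (rta >= 80 || pl >= 10) print "WARNING"
        else                            print "OK"
    }'
}

icmp_status 12.252 0   # fast and lossless
icmp_status 95 0       # slow but lossless
icmp_status 20 50      # fast but lossy
```

The second call shows why a slow but lossless link still ends up in a warning state.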
./check_ntp_time -H time.google.com
NTP OK: Offset -0.006043791771 secs|offset=-0.006044s;60.000000;120.000000;
This check tests the firewall time against an NTP time source and produces a warning if the offset is greater than 60 seconds and a critical if it is greater than 120 seconds. As a best practice, I prefer to use a time source different than the one I use in the pfSense web GUI. Note: This check can have issues caused by UDP packets not returning properly or in time. If this check produces a fair number of false positives, either try a different time source or simply increase the timeout to 30 seconds. Increasing the timeout can be done by adding “-t 30” in the Core Config Manager (Configure -> Core Config Manager -> NTP service). This change can be made after the service is configured. Even with the occasional issue, I leave this check enabled because it is important for your firewall/logs to have the correct time. It also wouldn’t hurt to only check every few hours instead of every 5 minutes to further eliminate false positives.
./check_disk -w 20% -c 5% -p /
DISK OK - free space: / 23066 MB (90.29% inode=99%);| /=2478MB;22212;26376;0;27765
The disk thresholds might be the opposite of what you expect because they are based on free space rather than used space. The check warns if less than 20% of the disk is free and produces a critical alert if less than 5% of the disk is free. You’ll also note I’m only checking the root (/) partition. The other partition I would recommend checking in a standard pfSense setup is the /var/run partition using the command below.
./check_disk -w 20% -c 5% -p /var/run
DISK OK - free space: /var/run 3 MB (96.75% inode=97%);| /var/run=0MB;2;2;0;3
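The warning and critical numbers in the root partition’s performance data (22212 and 26376 out of 27765 MB) line up with the free-space percentages converted into used megabytes. Here is a minimal sketch of that arithmetic, assuming the result is simply rounded down; used_threshold is a name I made up for illustration and is not part of check_disk.

```shell
#!/bin/sh
# Sketch only: used_threshold is a hypothetical helper, not part of check_disk.
# Usage: used_threshold <total_mb> <minimum_free_pct>
# Prints the used-space figure (in MB) at which the alert would trigger.
used_threshold() {
    echo $(( $1 * (100 - $2) / 100 ))
}

used_threshold 27765 20   # warning point for -w 20%
used_threshold 27765 5    # critical point for -c 5%
```

With the 27765 MB root partition from the example, this prints 22212 and 26376, the same values that appear after “/=2478MB” in the output.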
./check_load -w 3,2.8,2.6 -c 10,7,5 -r
OK - load average: 0.21, 0.17, 0.15|load1=0.210;3.000;10.000;0; load5=0.170;2.800;7.000;0; load15=0.150;2.600;5.000;0;
Load is a funny beast and lots of folks have different opinions on it. That’s because load can mean different things to different systems. Load is based on how busy your CPU, disk, and other resources are. Personally, on most *nix installs I prefer to keep the load under 3 so that is what I recommend here as well. The warning of 3,2.8,2.6 applies to the load average over the last 1 minute, 5 minutes, and 15 minutes respectively. You might also take note of the ‘-r’, which divides the load by the number of processors.
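As a rough illustration of what dividing by the number of processors means, here is a small sketch. The per_cpu_load function is hypothetical and only mirrors the idea behind ‘-r’; on FreeBSD the processor count could be read with ‘sysctl -n hw.ncpu’, but the example passes it in by hand so it runs anywhere.

```shell
#!/bin/sh
# Sketch only: per_cpu_load is a made-up helper that mimics the idea of -r.
# Usage: per_cpu_load <load_average> <cpu_count>
per_cpu_load() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# A raw load of 6.00 looks alarming against a warning threshold of 3,
# but spread over 4 processors it is only 1.50 per processor.
per_cpu_load 6.00 4
```

In other words, with ‘-r’ the same thresholds can be reused on firewalls with different core counts.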
./check_procs -w 200 -c 400
PROCS OK: 64 processes | procs=64;200;400;0;
This checks the number of processes. On a somewhat busy system with IDS enabled, I found 200 was a good warning state and it did a good job of letting me know if something went haywire or jobs were hanging. If you find yourself getting some false positives and your system regularly sits around 200, go ahead and bump it up a bit.
./check_swap -w 90% -c 40%
SWAP OK - 92% free (1879 MB out of 2047 MB) |swap=1879MB;1843;819;0;2047
Swap is another funny one and arguably optional if you are monitoring everything else. Surprisingly, the pre-configured, ARM-based Netgate/pfSense systems don’t even come with swap enabled, so there is some debate on whether it is necessary at all. Maybe this is from a flawed line of thinking, but I still prefer having swap. When swap is enabled, I also like watching it because if you are using it, your system ran out of RAM at some point. So you would assume the warning threshold would be 100% free… Unfortunately, you would be wrong. If you happen to use swap, the safest way to clear it back out is a reboot and I am a fan of seeing my uptime climb. And no, “swapoff -a && swapon -a” is not always the best or safest route. Also, using swap on rare occasions isn’t necessarily a bad thing. So instead, I warn when free swap drops below 90% and leave it at that. Add more RAM if you are frequently dropping below that mark. Incidentally, if you find yourself using a lot of swap, a memory increase would also help in other areas (and stats) including the load due to disk I/O. It’s just one of those things that can affect other metrics, which leads to red herrings when troubleshooting <- Yes, I’m speaking from experience!
Custom pfSense checks
So those are the standard checks Nagios provides for FreeBSD and while they are helpful, they are seriously lacking when it comes to monitoring pfSense and firewall-specific functionality. I mean, what about VPN tunnels, interfaces, state tables, and services? That is where the power of Nagios and custom scripts comes in!
Load is fantastic because it can give wonderful indicators of how a system is doing. But if the load is high, is it the CPU, the memory, the disk I/O, or perhaps even a combination of all 3? Thus, in addition to monitoring load, I recommend monitoring the CPU and memory as well using the commands below.
./check_pf_cpu -w 85 -c 95
OK - CPU Usage = 0%|CPU=0;;;;
Percentage of CPU used. It creates a warning if CPU is above 85% usage and critical if it is above 95%.
./check_pf_mem -w 90 -c 95
Memory OK - 67.2% (1443078144 kB) used |pct=67.2
Percentage of memory used. It creates a warning if the memory is above 90% usage and critical if it is above 95%.
./check_pf_services -name snort
OK - snort service is running
./check_pf_services -name dpinger
OK - dpinger service is running
./check_pf_services -name dhcpd
OK - dhcpd service is running
./check_pf_services -name squid
OK - squid service is running
Ever had snort stop unexpectedly? What about unbound, dhcpd, or any other pfSense service? Now you can monitor all of them! Just specify the name of the service as shown in any one of the examples.
./check_pf_services -name pfb_dnsbl
OK - pfb_dnsbl service is running
./check_pf_services -name pfb_filter
OK - pfb_filter service is running
If you have followed any of my other posts, you know I’m a huge fan of the pfBlockerNG package. If you use it, you should also monitor its associated services — pfb_dnsbl and pfb_filter. If you don’t use the software, you can exclude these checks… But I would highly recommend adding it!
./check_pf_interface -i em1_vlan6
OK - em1_vlan6 up and active
./check_pf_interface -i em1_vlan6 -name LAN
OK - LAN(em1_vlan6) up and active
Check whether your interfaces are up. This is extremely helpful on a firewall with multiple interfaces. You can see the names of all interfaces via the ‘ifconfig’ command or by going to Interfaces -> Assignments from the web GUI. The naming on VLANs is a little odd so take note of that. Not a fan of the default name or want it to match what you have in the web interface? No problem! Use the ‘-name’ and it will instead provide a friendlier name of your choosing.
./check_pf_ipsec_tunnel -e <IP address or hostname of remote>
OK - IPSEC VPN tunnel to <IP address of remote> - ESTABLISHED 70 seconds ago
./check_pf_ipsec_tunnel -e <IP address or hostname of remote> -name DallasTX
OK - IPSEC VPN tunnel to DallasTX - ESTABLISHED 3 minutes ago
Ever have a VPN go down and not know about it for a while? Not anymore! Once again, if you’re not a fan of the default VPN name based on IP address or if you want something more descriptive because you are monitoring numerous tunnels, you can use the ‘-name’ switch to provide a friendlier name of your choosing. On a side note, I always enjoyed calling vendors (who controlled the device on the other end) to let them know a VPN was down! If you have pfSense firewalls on both ends of the IPSEC tunnel and you’re monitoring both of them with Nagios, checking the tunnel from both ends will just double up your alerts, so checking it from one side is enough.
./check_pf_state_table -w 60 -c 90
OK - PF state table: 315 ( 0% full - limit: 98000) | current_states=315;state_limit=98000;percent_used=0
This checks the percentage of the state table in use. From first-hand experience, if your state table fills up you’re going to have a bad day and your firewall will do some wonky things that are nearly impossible to pin down.
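For anyone curious how the percentage relates to the two numbers in the output, here is a minimal sketch of the arithmetic. It is not the code from check_pf_state_table; the state_table_status function, the rounding down, and the “greater than or equal” comparisons are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: state_table_status is a hypothetical helper, not the real plugin.
# Usage: state_table_status <current_states> <state_limit> <warn_pct> <crit_pct>
state_table_status() {
    pct=$(( $1 * 100 / $2 ))   # integer percentage, rounded down
    if [ "$pct" -ge "$4" ]; then
        echo "CRITICAL - ${pct}% full"
    elif [ "$pct" -ge "$3" ]; then
        echo "WARNING - ${pct}% full"
    else
        echo "OK - ${pct}% full"
    fi
}

state_table_status 315 98000 60 90     # the quiet firewall from the example
state_table_status 70000 98000 60 90   # a much busier table
```

The first call reports 0% full, matching the example output, while the second lands at 71% and would trip the warning.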
./check_pf_version
Current version: 2.4.2-RELEASE / Mon Nov 20 08:12:56 CST 2017
Some time ago, this check compared the local version against the latest for your branch on the web. Unfortunately, some code changed and I haven’t circled back to uncover the reason. Instead, the check now returns the currently installed version and build date.
./check_pf_uptime
OK - 127 Days, 15 Hours, 39 Minutes
./check_pf_uptime
WARNING - 38 Minutes
Uptime was a suggestion by a reader of this blog. For all intents and purposes, one of the other checks will likely error out first. However, there is a chance it will be missed if a) you’re not looking at the alert screen or b) the “only alert after X number of failed checks” setting is too high. Thus, if all of the other checks missed a possible downtime and/or reboot, this check notifies with a warning if the system has been up for less than a day. You could change this rather easily within the script if one day is too long or too short.
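The idea behind the one-day rule can be sketched in a few lines. This is not the actual check_pf_uptime script; uptime_status is a hypothetical function that takes an already formatted uptime string and simply looks for the word “Day” in it.

```shell
#!/bin/sh
# Sketch only: uptime_status is a made-up helper, not the real check_pf_uptime.
# Usage: uptime_status "<formatted uptime>"
uptime_status() {
    case "$1" in
        *[Dd]ay*) echo "OK - $1" ;;        # at least one full day of uptime
        *)        echo "WARNING - $1" ;;   # no days yet, so a recent reboot
    esac
}

uptime_status "127 Days, 15 Hours, 39 Minutes"
uptime_status "38 Minutes"
```

Changing the window to something other than one day would mean parsing the hours or minutes as numbers instead of just matching on the word.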
./check_pf_cpu_temp -w 75 -c 90
OK - CPU temperature = 59 (C)|TEMP=59;;;;
CPU temperature was another reader suggestion. CPU temperature (in degrees Celsius) is displayed on the initial firewall screen so why not monitor it via Nagios as well? I tested this with several versions of pfSense and it worked without issue, including on the ARM processor. It does only monitor the first CPU core, but all of the cores should be within a degree or two of one another anyway.
Note: This will not work in virtual environments.
./check_pf_gw_status -G WAN_DHCP -w 60,5 -c 200,30
GATEWAY WAN_DHCP OK - Status = online, Packet loss = 0.0%, RTT = 29.987ms, Monitor=8.8.8.8|rtt=29.987ms;60.000;200.000;0.000 pl=0.0%;5;30;0;100
The gateway status check was contributed by Manuel Gayer via GitHub. It is a great check whether you are monitoring one or multiple interfaces. From the command line, the ‘-l’ option lists the gateway names so you can easily copy/paste the name. The general format is the gateway name followed by the maximum RTT (in milliseconds) and packet loss (in percent) for the warning and critical checks. Generally speaking, -w 60,5 and -c 200,30 should be sufficient for both checks. However, keep in mind that you should adjust the settings depending on your ISP and what gateway you are monitoring, i.e. your ISP’s gateway or a custom IP such as the Google IP indicated above (System -> Routing -> Edit -> Monitor IP -> Save).
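If the comma-separated pairs look cryptic, this small sketch shows how a value such as “60,5” breaks down into its RTT and packet loss parts. The split_threshold function is hypothetical and only illustrates the format; it is not taken from check_pf_gw_status.

```shell
#!/bin/sh
# Sketch only: split_threshold is a made-up helper that illustrates the format.
# Usage: split_threshold <max_rtt_ms>,<max_loss_pct>
split_threshold() {
    rtt=${1%%,*}    # everything before the comma
    loss=${1#*,}    # everything after the comma
    echo "rtt=${rtt}ms loss=${loss}%"
}

split_threshold 60,5     # the warning pair from the example
split_threshold 200,30   # the critical pair from the example
```

The same two pairs show up again in the performance data of the example output as rtt=…;60.000;200.000 and pl=…;5;30.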
Go to Part 3: Configuring the checks on Nagios

Dallas Haselhorst has worked as an IT and information security consultant for over 20 years. During that time, he has owned his own businesses and worked with companies in numerous industries. Dallas holds several industry certifications and when not working or tinkering in tech, he may be found attempting to mold his daughters into card carrying nerds and organizing BSidesKC.
30 thoughts on “Monitoring pfSense with Nagios Using SSH – part 2”
Hello,
Thank you for your work.
I tried to implement your checks with icinga 2, but its not possible to check over ssh because the parameters (-c, -w) coming in alphabetic order, and your plugins show the help page if -w isnt in first position.
I ended up in editing the plugins, and change -w and -c, but maybe there is a better fix. 🙂
Lukas, you’re right. I’m not familiar with Icinga… Is it unable to specify the order of arguments? The order, -w followed by -c, has been somewhat of a standard in Nagios so I never really paid attention to it. It would require a bit of re-tooling, but I might take a look at it in the future. Thanks for the feedback!
Sure it is.
Check the “Order” statement:
https://icinga.com/docs/icinga2/latest/doc/09-object-types/#checkcommand
Stefan, thanks for the article on the order statement. That makes perfect sense.
hi dallas .. hope your good, i’m new using nagios so i don’t know much , i got to monitor some system here in venezuela and i need some help … i’m using nagios core and i got it installed and running , i haven’t setted anything yet …
what would be the next step ??? configuring the server ??? really need some help
When configuring pfSense monitoring over SSH, it’s easiest to follow the same order as the guide. Basically, set up password-less SSH and make sure it is working first. Follow that with testing the scripts from the command line of the Nagios system (example below). Finally, add all of the services. Keep in mind there is an example commands.cfg and services.cfg at the end of part 3, which should help as well. Hope that helps!
./check_by_ssh -H <IP address or hostname of pfSense> -C "/usr/local/libexec/nagios/check_pf_version"
hi man amazing guide … but im new at this … can you detail a little bit more about how to download the pre compiled pluggins,
i’m stuck in that part, i mean what do i have to do on freshports to get the pluggins? and after that, how can i install them? … excuse me but i’m new in linux but i’m really exited about nagios on pfsense , hope you can help me dude
Hey Brad! Thanks for stopping by! You need to be at the command line (SSH) prompt on pfSense. If you are signed in as the nagios user on pfSense, use sudo before the rest of the command. If you are logged in as root, then you can skip the sudo part and just use ‘pkg install nagios-plugins’ to download/install the pre-compiled packages from freshports.org. Hope this helps!
Hello,
also monitoring using icinga2.
I would be interested in writing a check_command to monitor bandwidth, but i do not know freebsd.
Any hint about how to get the information via command line tools ?
Thanks, Stefan
I looked into this years ago and I couldn’t find much pre-built. I was still planning to build it, but I found the need wasn’t as great as I originally thought since commands like check_icmp (ping) would throw errors anyway to alert me of saturated links. I could then go into the system and use ntopng, pftop, etc. to figure out what was going on.
Nonetheless, I took a look around. There wasn’t anything in the available packages that I felt would help on the command line. I tried ‘systat -ifstat’ and it looked extremely promising, but the version with pfSense doesn’t appear to work with -b. The -b option is needed because it would gather the stats and then exit the program. I tried some of the other built-in commands, but I couldn’t find anything beyond packet/traffic counters. From the pfSense recommendations (https://www.netgate.com/docs/pfsense/monitoring/monitoring-bandwidth-usage.html), I decided to give iftop a shot. I think with a little scripting, it would work. You do have to install it first using the pkg install command below. I’ve also included some sample output from the command. It allows you to specify interfaces, it provides both send and receive rates, etc. If you decide to put a script together, let me know and I’ll happily add it to the other pfSense monitoring scripts.
: sudo pkg install iftop
: sudo iftop -t -s 1 -n -N -i em1
——————————————————————————————–
Total send rate: 60.1KB 60.1KB 60.1KB
Total receive rate: 996B 996B 996B
Total send and receive rate: 61.1KB 61.1KB 61.1KB
——————————————————————————————–
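As a starting point for that scripting, here is a rough sketch that pulls the send and receive totals out of iftop’s text output. It works on the sample lines above rather than a live capture, iftop_rate is a made-up function name, and turning values such as 60.1KB into plain numbers for thresholds is left out.

```shell
#!/bin/sh
# Sample lines copied from the iftop output above. A real script would
# pipe "sudo iftop -t -s 1 -n -N -i em1" into iftop_rate instead.
sample='Total send rate: 60.1KB 60.1KB 60.1KB
Total receive rate: 996B 996B 996B
Total send and receive rate: 61.1KB 61.1KB 61.1KB'

# iftop_rate is a hypothetical helper, not an existing plugin.
# Usage: <iftop text output> | iftop_rate <send|receive>
# Prints the first of the three rate columns for the chosen direction.
iftop_rate() {
    awk -v dir="$1" '$1 == "Total" && $2 == dir && $3 == "rate:" { print $4 }'
}

printf '%s\n' "$sample" | iftop_rate send
printf '%s\n' "$sample" | iftop_rate receive
```

Matching on the third field keeps the combined “send and receive” line from being picked up by mistake.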
Hello,
i’m getting
“shared object libdl.so.1 not found”
when I run ./check_procs
./check_pf_* works though :\
That would lead me to think the nagios-plugins didn’t install correctly. You might try re-running the ‘sudo pkg install nagios-plugins’ to see if that can correct it. Are you on the latest version of pfSense? I’ve run this install on both ARM and x86-based in the past few weeks so can I ask what architecture you are running? FWIW, if check_procs is the only thing that doesn’t work, I wouldn’t worry about it too much. Years ago there were some runaway/stale processes from an installed package so I added it and I haven’t seen that issue since.
Thank you for your quick reply,
i’m running pfsense 2.4.2 x86 – i’ve tried to remove the pkg & install it again, but it didnt work – could quite possibly be the version.
Cheers!! & Thanks again
I could definitely see the outdated version (2.4.2) causing problems with Nagios. Best of luck!
Hi Dallas,
When i run the test command in the Pfsense local machine, it works fine. but after i apply to the Nagios XI, first error it shows plugin time out and after i add the -t 30.
run check command this is the output
[nagios@nagios.domain.local ~]$ /usr/local/nagios/libexec/check_by_ssh -H 10.0.0.1 -C "/usr/local/libexec/nagios/check_pf_uptime" -t 30
any suggestions?
Hey Tim! Is password-less SSH working for the nagios user? When logged in as nagios on the Nagios XI box, you should be able to type ‘ssh 10.0.0.1’ (minus the quotes) and get to the pfSense command line. The only other thing I can think of is sometimes the quotes get messed up with copy/paste. Maybe try replacing the double quotes with single quotes. Hope this helps!
yes is password-less. i can ssh from my nagios box to pfsense by using ssh 10.0.0.1 -p 22181. (as i reply below).
BTW, thanks for your quick reply.
BTW, just a little bit additional information about the ssh port i used in the pfsense. I used other port for ssh, like 22181. but when i configure the ssh proxy in nagios, i don’t have options to choose the port. is it the problem?
Hi Dallas,
Thanks very much. that is the problem when i change it back to port 22. it works.
If you don’t mind i connect you in the linkedin. and i am heading to security areas. if possible, can you point me to the right direction.
Regards,
Tim
Glad to see you figured it out. Yes, feel free to connect up on LinkedIn and message me. I’m happy to help!
Hi Dallas,
I’m trying to change the warning limit on the uptime check I tried ./check_pf_uptime -t 300 but this gives me the same result I would like it to not give me a warning for the entire day as this is to long could you guide me on how to change this on the script?
Hey Gareth! Sorry I missed this. The uptime check is a *really* simple and dirty script that does not accept any inputs. Instead, it looks to see whether days is specified in the uptime; if it is not, the system has rebooted in the last 24 hours. Unfortunately, I didn’t have a need to do much past that. If you update it to accept time constraints, feel free to share it and I’ll happily include it for others. Thanks!
hi, can anybody add check for openvpn tunnels?
Hey kuldeep! You can check that the OpenVPN client or server is running via the services script. An example of this is found in the script itself (below). In addition to checking the service, you could also use the check_icmp script to ping a host on the other side. In the walkthrough, there are 2 types of ICMP checks discussed. One is a ping *to* the firewall and the check_icmp below is run *from* the firewall, which is why we can use it to test our tunnel. Hope this helps!
Note: If openvpn is the service getting checked, two options must be
specified — the option server followed by the server id.
Example: check_pf_services.sh -name openvpn server 1
./check_pf_services -name openvpn server 1
OK - openvpn service is running
./check_icmp -H 10.0.0.2 -w 80,10% -c 150,40%
OK - 10.0.0.2: rta 45.995ms, lost 0%|rta=45.995ms;80.000;150.000;0; pl=0%;10;40;; rtmax=46.190ms;;;; rtmin=45.841ms;;;;
Thank you so much! The previous IT person set this up so when I updated PFsense Nagios dumped it. This write up (with a little help deleting the bad key from another site) really helped. You rock!
Excellent! Thanks for the feedback and I’m so happy it worked for you Juston!
Thanks for these scripts.
I have been seeing the following error in the PHP logs of pfsense:
/usr/local/libexec/nagios/check_pf_version: WARNING: Could not mark subsystem: pkg dirty
This is because nagios does not have the ability to write to /var/run/pkg.dirty when running get_system_pkg_version()
I fixed this for myself by editing /usr/local/libexec/nagios/check_pf_version
and replacing:
require_once("pkg-utils.inc");
$system_pkg_version = get_system_pkg_version();
With:
require_once("pkg-utils.inc");
global $g;
if (file_exists("{$g['varrun_path']}/pkg.dirty")) {
    $system_pkg_version = get_system_pkg_version(false, false);
} else {
    shell_exec("sudo touch " . "{$g['varrun_path']}/pkg.dirty");
    $system_pkg_version = get_system_pkg_version(false, false);
    shell_exec("sudo rm " . "{$g['varrun_path']}/pkg.dirty");
}
Not sure if this is the right way to fix this, but it’s resolved the PHP warnings filling my syslog server
Thanks for the feedback Daniel! I know we continued the conversation on GitHub, but the referenced script has been updated. I appreciate your help!
Hello, love this article. Just wanted to inform that the ./check_pf_cpu_temp does not work in the updated RC 23.01 of pfsense. Hope this gets to the right person who could fix the issue so I could start using this check again. Thanks
Hey Mike! A ticket was submitted on GitHub regarding this particular issue (and the version check). We have not tested as 23.01 is still a release candidate, however, we will take a look at this soon. In the meantime, if you happen to find the error in the code, please feel free to submit the updates there.
https://github.com/oneoffdallas/pfsense-nagios-checks/issues/9