All checks on that page are done every 5 minutes. The page is also set to refresh every 5 min, so you don’t need to refresh it manually. You will get the odd thing going off once in a while, but if it stays off for an extended period of time it’s worth investigating.
It contains the following sections:
###Summary
Contains a summary of NMS-monitored devices, ports (interfaces) and services. Busy expanding this to include all OSPF routers.
###CTWUG Sites
List of CTWUG sites and their status.
###DNS Server (172.18.1.1)
DNS servers, their status and the result of an SOA lookup for each zone and view. These are servers responding on 172.18.1.1. This also lets you visually compare the SOA serials across the different servers for a zone to spot problems.
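The serial comparison could also be done automatically. Below is a rough Python sketch of the idea, not the code behind the page. It assumes the SOA serials have already been looked up and collected into a dict per zone. The function name, the input shape and the "most common serial is the reference" rule are all assumptions made for illustration.

```python
from collections import Counter


def find_serial_mismatches(serials_by_server):
    """Return the servers whose SOA serial differs from the most common one.

    serials_by_server is a hypothetical input: a dict mapping a server
    name to the SOA serial it returned, or None if the lookup failed.
    """
    answered = {s: v for s, v in serials_by_server.items() if v is not None}
    if not answered:
        # No server answered at all, so every server needs a look.
        return dict(serials_by_server)
    # Assumption: the serial reported by the most servers is the correct one.
    reference, _ = Counter(answered.values()).most_common(1)[0]
    # Servers that failed the lookup (None) are reported as mismatches too.
    return {s: v for s, v in serials_by_server.items() if v != reference}


zone_serials = {"ns1": 2016051201, "ns2": 2016051201, "ns3": 2016051105}
print(find_serial_mismatches(zone_serials))  # {'ns3': 2016051105}
```

A majority vote is a simplification. Right after a zone update the servers that are still behind may briefly be the majority, so a mismatch is only worth chasing if it persists over a few checks.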
###Tunnels
List of all tunnels. Note each tunnel appears twice because we check both ends.
###IRC Servers
Status of IRC servers.
###NTP Servers (172.18.1.1)
Status of time servers responding on 172.18.1.1.
###SSL Certificates
Status of SSL certificates for CTWUG sites. We are using Let’s Encrypt, which is free but has only 90 days validity. These renewals are mostly automated, but this bit should pick up if something goes wrong.
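For anyone curious what a certificate expiry check can look like, here is a minimal Python sketch using only the standard library. It is not the check the status page runs. The function names and the 20-day warning threshold are assumptions chosen for illustration.

```python
import socket
import ssl
import time


def fetch_not_after(hostname, port=443, timeout=10):
    """Return the notAfter field of the certificate a host presents."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Days left before a notAfter date such as 'Jan  5 09:34:43 2018 GMT'."""
    now = time.time() if now is None else now
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def needs_attention(days_left, warn_days=20):
    """True if the certificate is close to expiry (threshold is an assumption)."""
    return days_left < warn_days
```

Usage would be something like `needs_attention(days_until_expiry(fetch_not_after("example.org")))`. Note that with the default context an already expired certificate fails the handshake with an `ssl.SSLCertVerificationError`, so a real check would need to catch that and report it as expired.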
Howsit, I had a look myself. Was wondering if there is any way to make this viewable on the forum page itself, on the side.
Maybe only list:
hostname, site, status and time … If I’m correct it could squeeze in on the main forum page on the left. Keep in mind I’m not a web guru, it’s just a thought.
Thanks. Don’t really want to clutter the forum up. Plus there is way too much information for a simple summary.
To do a forum update I might make a notification bot that posts to a Status Alerts thread automatically when things are down. Think that would be more useful and easier to do.
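The core of such a bot would be to post only when something changes state, not on every poll. A rough Python sketch of that logic follows. The function name and the dict-of-booleans input are hypothetical, and the actual posting to the forum is left out.

```python
def status_changes(previous, current):
    """Return alert lines for checks whose status changed between two polls.

    previous and current are hypothetical inputs: dicts mapping a check
    name to True (up) or False (down).
    """
    alerts = []
    for name, is_up in sorted(current.items()):
        # Assumption: a check not seen in the previous poll was up before.
        was_up = previous.get(name, True)
        if was_up and not is_up:
            alerts.append(f"DOWN: {name}")
        elif not was_up and is_up:
            alerts.append(f"RECOVERED: {name}")
    return alerts


last_poll = {"dns": True, "irc": False}
this_poll = {"dns": False, "irc": True, "ntp": True}
print(status_changes(last_poll, this_poll))  # ['DOWN: dns', 'RECOVERED: irc']
```

Comparing against the previous poll keeps the thread quiet while something stays down, so it gets one post when it breaks and one when it recovers.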
I will also add status link right at the top of the page on the far right next to Finances.
Haven’t seen the code, but looking at the page right now, most stuff is green, except for 1 node where there are 2 sensors showing red.
Maybe a good idea is to summarize the good stuff and make it expandable in case you want to look at it, and make the red stuff stand out at the top, below the summary, so that you don’t need to scroll down.
That way you don’t have to scroll or search if something is down; it can be the first thing you see when you visit the page. If you then want to add it to the forums or main website in a small widget, it’s shorter to say “all monitored stuff = OK” or show what stuff is not OK.
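As a rough illustration of the widget text, here is a small Python sketch. It is only a guess at the shape of the data: the function name and the dict of check names to up/down booleans are assumptions.

```python
def widget_summary(checks):
    """Build a one-line status text from a dict of check name -> True/False."""
    down = sorted(name for name, is_up in checks.items() if not is_up)
    if not down:
        return f"All {len(checks)} monitored checks OK"
    return f"{len(down)} of {len(checks)} checks down: " + ", ".join(down)


print(widget_summary({"dns": True, "ntp": True}))
# All 2 monitored checks OK
print(widget_summary({"dns": True, "tunnel-a": False, "irc": False}))
# 2 of 3 checks down: irc, tunnel-a
```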
Yeah most of it is green. It’s a simple PHP page with SQL queries to pull the data.
All of it is from NMS so I don’t want to go overboard and redesign an interface. Any summary that I do do will essentially be the summary table at the top. I reproduce it below.
| | Total | Up | Down | Ignored | Disabled |
|---|---|---|---|---|---|
| Devices | 203 | 196 | 7 | 0 | 0 |
| Ports | 855 | 656 | 106 | 8 | 75 |
| Services | 70 | 70 | 0 | 0 | 0 |
The Services are all services (DNS, HTTP, SSL, NTP, etc.). The tunnels are essentially included in the ports count. Might want to show that in a bit more detail in a summary, though the summary would break if no tunnels are available in any case. Devices are mainly OSPF routers (but also the wugpi/servers themselves).
I will probably drop the ignored and disabled columns. That would be a short and sweet network summary. It tells me all services are 100%. Most devices are up. And a bunch of ports are down, though maybe not more than usual.
I’d also like people to start disabling unused ports, as ports that are not disabled are counted as down if nothing is connected to them. It also lightens the polling load, as disabled ports don’t need to be polled.
@MDE that’s a good idea I never replied to. Everything on that page is checked every 5 min though. The only checks that are down sometimes are the tunnels, and there I show the poll time (mainly because I’m still tuning things).
The other checks are always active and run on a cron. The only normal case where they would be down for longer than a couple of minutes is when the server running them is down, but that would take down the status page itself too.