SYSMON
======

A number of monitoring solutions exist in the marketplace. I built
SYSMON in 2001, and I hope most people will judge its architecture
elegant. Other systems may offer more features or be more mature,
but I hope the simplicity, robustness and distributed design of this
one are not easily matched.

SYSMON is designed for an internet company that operates POSIX
environments: data is collected at one or more client sites and sent
"home" to the company network, where it is refined and evaluated.

The system employs several small daemon processes with the following
names and functions:

WATCHER:
this process starts a configurable set of sensors and keeps them
alive. The sensors are typically shell scripts. They must
produce output in a strict and simple format. One WATCHER
process runs on each machine to be monitored.
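
The keep-alive behaviour described above can be sketched roughly as
follows. This is an illustration in Python, not SYSMON's own code: the
class name Watcher, its methods and the name-to-command mapping are all
hypothetical, and the sensor output is discarded here, whereas a real
watcher would read it.

```python
import subprocess
import sys


class Watcher:
    """Start a set of sensor commands and restart any that exit."""

    def __init__(self, commands):
        # commands: hypothetical mapping of sensor name -> argv list
        self.commands = commands
        self.procs = {}

    def start(self, name):
        # Output is discarded in this sketch; a real watcher would read it.
        self.procs[name] = subprocess.Popen(
            self.commands[name], stdout=subprocess.DEVNULL
        )

    def start_all(self):
        for name in self.commands:
            self.start(name)

    def restart_dead(self):
        """Restart every sensor whose process has exited; return their names."""
        restarted = []
        for name, proc in list(self.procs.items()):
            if proc.poll() is not None:  # the process has terminated
                self.start(name)
                restarted.append(name)
        return restarted

    def stop_all(self):
        for proc in self.procs.values():
            if proc.poll() is None:
                proc.terminate()
            proc.wait()
```

A long-running watcher would call restart_dead() in a loop with a short
sleep between passes.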

COLLECTOR:
this process accepts network input from WATCHERs and
aggregates it. It writes the aggregated data to its standard
output stream.

SENDER:
this process reads its input from the standard input stream,
so you couple the COLLECTOR and the SENDER back to back.
The SENDER filters different items for different
destinations, groups them into packets and delivers them in
encrypted and authenticated form.

The neat trick here is that the client (that is, the business
partner purchasing your services) may not wish to provide all of
the data to the service provider. The monitoring system
accommodates this: you send the provider only the data the
provider is allowed to have, and at the same time you send
everything to the client.
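
The per-destination filtering and packet grouping might look something
like the sketch below. It is written under assumptions: the item key
layout "host.sensor", the glob-style rules and the function names
route and packets are invented for illustration, and encryption and
delivery are left out.

```python
from fnmatch import fnmatchcase


def route(items, rules):
    """Return {destination: [items]} for items matching each destination.

    items: list of (key, value) pairs; the "host.sensor" key is hypothetical.
    rules: {destination: [glob patterns]}; an item may go to several places.
    """
    out = {dest: [] for dest in rules}
    for key, value in items:
        for dest, patterns in rules.items():
            if any(fnmatchcase(key, p) for p in patterns):
                out[dest].append((key, value))
    return out


def packets(items, size):
    """Group items into packets of at most `size` entries."""
    return [items[i:i + size] for i in range(0, len(items), size)]


items = [("web1.load", "0.42"), ("web1.billing", "17"), ("db1.load", "1.10")]
rules = {"provider": ["*.load"], "client": ["*"]}
routed = route(items, rules)
```

Here the provider receives only the two load readings, while the client
receives all three items.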

RECEIVER:
this is a CGI script. It accepts the packets, authenticates them,
validates their syntax, checks for items the database has not seen
before and automatically stores everything the packets contain.
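
The text above does not say which authentication scheme SYSMON uses.
As one possible illustration, a shared-secret HMAC-SHA256 tag lets a
receiver reject packets that were altered or that come from a party
without the key. The function names sign and verify are hypothetical,
and the sketch covers authentication only, not encryption.

```python
import hashlib
import hmac


def sign(secret, payload):
    """Return a hex HMAC-SHA256 tag for payload (both arguments are bytes)."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(secret, payload, tag):
    """Check the tag in constant time; True only if payload is untampered."""
    return hmac.compare_digest(sign(secret, payload), tag)
```

The sender would attach sign(secret, payload) to each packet, and the
receiver would drop any packet for which verify(...) is false.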

REFINERY:
this is a daemon that performs database queries on behalf of a
distributor. Its configuration file specifies SQL queries on the
database and lists the inputs that must be present in the database.
Because the protocol is dynamic, a REFINERY configured with an
encompassing list of queries provides only the answers that are
possible at a given time. Since the RECEIVER may create tables,
that set may change.
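
The idea of answering only what is currently possible can be sketched
as follows. The assumptions are considerable: SQLite stands in for
whatever database SYSMON uses, "inputs" is read as "tables that must
exist", and the query names, table names and functions are made up.

```python
import sqlite3


def existing_tables(conn):
    """Return the set of table names currently present in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {name for (name,) in rows}


def answerable(queries, tables):
    """queries: {name: (sql, [required tables])}; keep those that can run."""
    return {
        name: sql
        for name, (sql, needs) in queries.items()
        if set(needs) <= tables
    }


def run_answerable(conn, queries):
    """Run every query whose required tables exist; skip the rest."""
    ready = answerable(queries, existing_tables(conn))
    return {name: conn.execute(sql).fetchall() for name, sql in ready.items()}


# Hypothetical configuration: two queries, each with the table it needs.
QUERIES = {
    "max_load": ("SELECT MAX(value) FROM cpu_load", ["cpu_load"]),
    "full_disks": ("SELECT host FROM disk_usage WHERE pct > 90", ["disk_usage"]),
}
```

With only cpu_load present, run_answerable returns an answer for
max_load alone; once disk_usage is created, the same configuration
starts answering full_disks as well.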

SERVER:
this daemon saves resources by aggregating and caching queries,
and it provides an access control mechanism.
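
How the SERVER caches is not spelled out above. One simple possibility
is a time-limited cache in front of the query function, sketched here.
The class QueryCache, its ttl parameter and the injectable clock are
assumptions for illustration, and access control is not shown.

```python
import time


class QueryCache:
    """Serve repeated queries from memory until their entry expires."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self.fetch = fetch      # function that really runs the query
        self.ttl = ttl          # seconds an answer stays valid
        self.clock = clock      # replaceable for testing
        self.entries = {}       # query -> (expiry time, result)

    def get(self, query):
        now = self.clock()
        hit = self.entries.get(query)
        if hit is not None and hit[0] > now:
            return hit[1]       # still fresh: no backend work
        result = self.fetch(query)
        self.entries[query] = (now + self.ttl, result)
        return result
```

Several viewers asking the same question within the ttl then cost a
single backend query.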

Thus, in a total of 6000 lines of code, you get a monitoring solution
in keeping with the UNIX philosophy: keep components small and
orthogonal, make data and configuration files human-readable, and
pass streams between processes through anonymous pipes.
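
Coupling two processes through an anonymous pipe, as a shell does for
"producer | consumer", can be sketched like this. The two commands are
small Python stand-ins, not the real COLLECTOR and SENDER, and the
line format they exchange is invented.

```python
import subprocess
import sys

# Stand-ins for the real programs (hypothetical output format).
PRODUCER = [sys.executable, "-c",
            "print('web1.load 0.42'); print('db1.load 1.10')"]
CONSUMER = [sys.executable, "-c",
            "import sys\nfor line in sys.stdin: sys.stdout.write('sent ' + line)"]


def run_pipeline(producer, consumer):
    """Connect producer stdout to consumer stdin; return consumer output."""
    first = subprocess.Popen(producer, stdout=subprocess.PIPE)
    second = subprocess.Popen(
        consumer, stdin=first.stdout, stdout=subprocess.PIPE, text=True
    )
    # Close our copy so the consumer sees end-of-file when the producer exits.
    first.stdout.close()
    output, _ = second.communicate()
    first.wait()
    return output
```

Neither process needs to know about the other; each only reads its
standard input or writes its standard output.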

I wish you a successful deployment of the solution, and if you write
any sensors, please pass them back to the community!

Regards,

Oliver Seidel