Network categories of System Monitoring

compliances, networking

January 5, 2011

Monitoring System Configuration Changes. This category includes monitoring for changes in hardware and software configurations that can be caused by an operating system upgrade, patches applied to the system, changes to kernel parameters, or the installation of a new software application.

The root cause of system problems can often be traced back to an inappropriate hardware or software configuration change. Therefore, it is important to keep accurate records of these changes, because the problem that a change causes may remain latent for a long period before it surfaces. Adding or removing hardware devices typically requires the system to be restarted, so configuration changes can be tracked indirectly (in other words, remote monitoring tools would notice system status changes).

However, software configuration changes, or the installation of a new application, are not tracked in this way, so reporting tools are needed. Also, more systems are becoming capable of adding hardware components online, so hardware configuration tracking is becoming increasingly important.
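As a rough illustration of such a reporting tool, the sketch below records a checksum for each watched configuration file and compares two snapshots taken at different times. The function names `snapshot` and `diff_snapshots` are hypothetical, and the choice of SHA-256 and of which files to watch are assumptions, not something the categories above prescribe.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each existing file path to the SHA-256 digest of its contents."""
    digests = {}
    for p in paths:
        path = Path(p)
        if path.is_file():
            digests[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(old, new):
    """Return (added, removed, changed) file paths between two snapshots."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, changed
```

Storing each snapshot with a timestamp would give the record of changes that helps when a problem surfaces long after the change that caused it.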

Monitoring System Faults. After ensuring that the configuration is correct, the first thing to monitor is the overall condition of the system.

Is the system up?
Can you talk to it, ping it, run a command?

If not, a fault may have occurred. Detecting system problems ranges from determining whether the system is up to determining whether it is behaving properly. If the system either isn’t up or is up but not behaving properly, then you must determine which system component or application is having a problem.
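The most basic form of the "can you talk to it" check can be sketched as a TCP connection attempt. This is a minimal example under stated assumptions: the helper name `is_reachable` is hypothetical, and the host and port to probe (for instance an SSH port) depend on what the monitored system actually exposes.

```python
import socket


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A successful connection only shows that the system is up and listening. Deciding whether it is behaving properly still requires a check at the application level.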

Monitoring System Resource Utilization. For an application to run correctly, it may need certain system resources, such as a share of CPU time or I/O bandwidth that it is entitled to use during a time interval. Other examples include the number of open files or sockets, message segments, and system semaphores that an application holds. Usually an application (and the operating system) has fixed limits for each of these resources, so monitoring their use is important. If they are exhausted, the system may no longer function properly. Another aspect of resource utilization is studying the amount of resources that an application has used. You may not want a given workload to use more than a certain amount of CPU time or a fixed amount of disk space. Some resource management tools, such as quota, can help with this.
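For disk space, a utilization check might look like the sketch below. It uses the standard `shutil.disk_usage` call. The function names and the 90 percent threshold are illustrative assumptions, and a real limit would come from the quota or policy in force.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def over_threshold(percent_used, threshold=90.0):
    """True when utilization has reached the alert threshold."""
    return percent_used >= threshold
```

The same pattern (read current use, compare it with a fixed limit) applies to other bounded resources such as open files or semaphores, although the way to read their current use differs by operating system.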

Monitoring System Performance. Monitoring the performance of system resources can help to indicate problems with the operation of the system. Bottlenecks in one area usually impact system performance in another area. CPU, memory, and disk I/O bandwidth are the important resources to watch for performance bottlenecks. To establish baselines, you should monitor the system during typical usage periods. Understanding what is “normal” helps to identify when system resources are scarce during particular periods (for example, “rush hours”). Resource management tools are available that can help you to allocate system resources among applications and users.
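One simple way to turn "normal" into something a tool can test is to summarize samples collected during typical usage and flag later readings that fall far outside them. The sketch below assumes a mean and standard deviation baseline with a cutoff of three standard deviations. Both the function names and that cutoff are illustrative choices, not a recommendation from the text above.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize samples from typical usage as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_abnormal(value, baseline, k=3.0):
    """True when value lies more than k standard deviations from the mean."""
    avg, sd = baseline
    return abs(value - avg) > k * sd
```

For example, with CPU utilization samples of 40, 42, 38, 41 and 39 percent, a later reading of 41 would be treated as normal and a reading of 95 as abnormal. Separate baselines per time of day would be needed to account for rush hours.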

Monitoring System Security. A system’s availability can be impacted through unauthorized use. Performance and resource controls are not useful if the system is used for the wrong purposes. The value of security tools is often overstated, but in small doses they can be useful rather than harmful. For example, it is easy to monitor for world-writable files and wrong permissions on home directories and key system directories. There is no reason not to implement that. In many cases, static security monitoring (of configuration settings) can be adapted from a hardening package such as Titan.
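The world-writable file check mentioned above can be sketched in a few lines. This is a minimal example that assumes a POSIX system. The function name `find_world_writable` is hypothetical, and the directories to scan are left to the caller.

```python
import os
import stat


def find_world_writable(root):
    """Yield paths of regular files under root that any user may write to."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # file vanished or cannot be examined; skip it
            # S_IWOTH is the "write by others" permission bit.
            if stat.S_ISREG(mode) and mode & stat.S_IWOTH:
                yield path
```

Using `os.lstat` means symbolic links are examined as links rather than followed, so they are never reported as regular files.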

Monitoring System Logs. This is an integral area that overlaps with every area described above but still deserves to be treated separately. System logs provide a wealth of information about the health of the system, most of which is never used because it is buried in the noise and because the regular syslog daemon has outlived its usefulness. Usually, log monitoring is done along with the consolidation of log streams on a dedicated log server. Few people understand that the flow of messages to a central log server represents a decent distributed monitoring system and that, instead of reinventing the wheel, it is possible to enhance it by writing probes that write messages to syslog.
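A probe of this kind can be quite small. The sketch below uses Python's standard `logging.handlers.SysLogHandler` to send a probe result to a syslog server over UDP. The logger name, the message layout, the 90 percent threshold, and the default server address are all assumptions made for illustration.

```python
import logging
import logging.handlers


def make_probe_logger(name, address=("localhost", 514)):
    """Return a logger that forwards records to a syslog server over UDP."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=address)
    # Prefix each message with the probe name so it is easy to filter on.
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger


def report_disk_probe(logger, mount, percent_used, threshold=90.0):
    """Log one probe result; use WARNING when the threshold is reached."""
    level = logging.WARNING if percent_used >= threshold else logging.INFO
    logger.log(level, "mount=%s used=%.1f%%", mount, percent_used)
```

Run periodically (for example from cron), such a probe places its findings in the same message flow that the central log server already collects.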
