
Last week I mentioned that if a service must meet a certain SLA, it can't depend on components with a lesser SLA.

My networking friends balked and said that this isn't a valid rule for networks. I think that violations of this rule are so rare they are hard to imagine. Or, better stated, networking people do this so naturally that it is hard to imagine violating this rule.

However, here are three examples from my experience:

Run, run, run, dead.

I assume you have some kind of automated monitoring system that watches over your servers, networks and services. Service monitoring is important to a functioning system. It isn't a service if it isn't monitored. If there is no monitoring then you're just running software.

Monitoring "is it down?" is reactionary. It is better than no monitoring at all, but all it tells you is that there is already a problem. Monitoring is better when it predicts the future and prevents problems.

An analog radio (one with an old-fashioned vacuum tube) sounds great at first, but you hear more static when the tube starts to wear out. Then the tube dies and you hear nothing. If you change the tube when it starts to degrade, you'll never have a dead radio. (Assuming, of course, that you change the tube when your favorite radio show isn't on.)

A transistor radio, on the other hand, is digital. It plays and plays and plays and then stops. Now, during your favorite song, you have to repair it.

At Bell Labs someone called this the "run run run dead" syndrome of digital electronics.

How can we monitor computers and networks in a way that makes it more like analog electronics? There are some simple tricks we can use when monitoring to be "more like analog."
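One such trick (my sketch of the general idea, not a method from the post): instead of alerting when a resource is exhausted, track recent samples, fit a trend line, and alert while there is still time to act. The sample data, the 100 GB capacity, and the 48-hour warning window below are all illustrative assumptions.

```python
def hours_until_full(samples, capacity):
    """samples: list of (hour, amount_used) pairs, oldest first.
    Fit a least-squares line through the samples and project when
    usage will reach capacity. Returns hours from the latest sample,
    or None if usage is flat or shrinking."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    slope = (sum((t - mean_t) * (u - mean_u) for t, u in samples)
             / sum((t - mean_t) ** 2 for t, _ in samples))
    if slope <= 0:
        return None  # not growing; no projected fill time
    latest_t, latest_u = samples[-1]
    return (capacity - latest_u) / slope

# Hourly disk-usage samples in GB (hypothetical data).
samples = [(0, 70), (1, 72), (2, 74), (3, 76)]
eta = hours_until_full(samples, capacity=100)
if eta is not None and eta < 48:
    print(f"disk projected full in {eta:.0f} hours")  # → 12 hours
```

The reactive check fires when the disk is already full; this one fires while the "static" is still building, like the wearing tube, so the fix can happen before anyone notices.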

Posted by Tom Limoncelli