Last week I mentioned that if you have a service that requires a certain SLA, it can't depend on things with a lesser SLA.
My networking friends balked and said that this isn't a valid rule for networks. I think that violations of this rule are so rare they are hard to imagine. Or, better stated, networking people do this so naturally that it is hard to imagine violating this rule.
However, here are 3 from my experience:
- A company whose internet connection is a DSL modem. The modem is in the hallway near the computer room, but not in the computer room. As a result, when someone knocks the modem over, the company's website is down (the website depends on the modem). Improvement: move the modem into the computer room.
- A computer room with excellent UPS and power infrastructure... but the router isn't on the UPS for weird historical reasons (it depends on external power). Improvement: move the router onto the UPS.
- An excellent computer room with fine ethernet switches... but the router is in the lab one room over. Each VLAN has its own physical cable that runs to that other room. I was told, "the researchers are doing some experiments on the router so they wanted it in their lab". Improvement: move the router into the computer room.
3 true stories.
It is surprising how often this is ignored.
In reality, the downtime of each serial dependency that is necessary to meet an SLA must be added together. To make that problem easy, I tend to assume that there are about 10 serial dependencies for a typical application stack (switching, routing, firewalls, load balancers, app servers, database servers, SAN, power, cooling, etc).
Then the seat-of-the-pants rough calc is decimal-shiftingly easy: ten layers means roughly ten times the downtime of any one layer, which costs you one nine. To meet three nines, you need to build each technology layer to about four nines. To meet four nines, you need to build each layer to about five nines, etc.
It's obviously more complicated than that, but it's a good place to start.
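To make the rough calc concrete, here is a minimal sketch in Python. It assumes the layers fail independently and that the service is up only when every layer is up, so the availabilities multiply (which, for small failure rates, is nearly the same as adding up the downtime). The function names `serial_availability` and `nines` are made up for this illustration, and the default of 10 layers is just the rule-of-thumb figure from above.

```python
import math


def serial_availability(layer_availability: float, layers: int = 10) -> float:
    """Availability of a chain in which every layer must be up.

    Assumes independent failures, so the per-layer availabilities multiply.
    """
    return layer_availability ** layers


def nines(availability: float) -> float:
    """Express an availability as a count of nines, e.g. 0.999 -> about 3."""
    return -math.log10(1.0 - availability)


if __name__ == "__main__":
    # Four nines per layer and five nines per layer, ten layers each.
    for per_layer in (0.9999, 0.99999):
        total = serial_availability(per_layer)
        print(f"{per_layer} per layer, 10 layers: "
              f"{total:.6f} overall ({nines(total):.2f} nines)")
```

Under those assumptions, ten layers at four nines each come out at roughly three nines overall, and ten layers at five nines each come out at roughly four, which is the decimal shift described above. Correlated failures, redundancy, and layers of unequal quality all change the answer.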