
April showers bring May Flowers... but May brings...

April showers bring May flowers. What does May bring? Three-day weekends that make A/C units fail!

This is a good time to call your A/C maintenance folks and have them do a check-up on your units. Check for loose or worn belts and other problems. If you've added more equipment since last summer, your unit may now be underpowered. Remember that if your computers consume 50 kW of power, your A/C units should be using about the same (or more) to cool those computers. That's the laws of physics speaking; I didn't invent that rule. The energy it takes to create heat equals the energy required to remove that much heat.
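A/C capacity is usually quoted in BTU/hr or "tons" rather than kW, so it helps to convert before talking to the maintenance folks. The sketch below is only an illustration: it assumes that essentially all the electrical power your equipment draws ends up as heat in the room, and the function name `heat_load` is made up for this example.

```python
# Convert an equipment power draw into the heat load the A/C must remove.
# Assumption: all electrical power drawn by the equipment becomes heat.

WATTS_TO_BTU_PER_HOUR = 3.412    # 1 W is roughly 3.412 BTU/hr
BTU_PER_HOUR_PER_TON = 12_000    # 1 ton of refrigeration = 12,000 BTU/hr


def heat_load(equipment_kw: float) -> dict:
    """Return the heat load for a given equipment power draw in kW."""
    btu_per_hour = equipment_kw * 1000 * WATTS_TO_BTU_PER_HOUR
    tons = btu_per_hour / BTU_PER_HOUR_PER_TON
    return {"kw": equipment_kw, "btu_per_hour": btu_per_hour, "tons": tons}


if __name__ == "__main__":
    load = heat_load(50)  # the 50 kW example from the text
    print(f"{load['kw']} kW = {load['btu_per_hour']:,.0f} BTU/hr "
          f"= {load['tons']:.1f} tons of cooling")
```

If the number you get is close to (or above) the rated capacity of your unit, that is a sign the unit has no headroom for a hot weekend.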

Why do A/C units often fail on a 3-day weekend? During the week the office building has its own A/C. The computer room's A/C only has to remove the heat generated by the equipment in the room. On the weekends the building's A/C is powered off and now the 6 sides (4 walls, floor and ceiling) of the computer room are getting hot. Heat seeps in. Now the computer room's A/C unit has more work to do.

A 3-day weekend is 84 hours (Friday 6pm until Tuesday 6am). That's a lot of time to be running continuously. Belts wear out. Underpowered units overheat and die. Unlike a home A/C unit which turns on for a few minutes out of every hour, a computer-room A/C unit ("industrial unit") runs 12-24 hours out of every day. Industrial cooling costs more because it is an entirely different beast. Try waving your arms for 5 minutes per hour vs. 18 hours a day.
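The 84-hour figure is easy to check, and the same arithmetic shows how much longer a holiday weekend is than a normal one. This is just a sketch; the dates are arbitrary examples of a Friday and the following Monday and Tuesday, and `unattended_hours` is a name invented here.

```python
from datetime import datetime


def unattended_hours(leave: datetime, back: datetime) -> float:
    """Hours between the last person leaving and the first person returning."""
    return (back - leave).total_seconds() / 3600


if __name__ == "__main__":
    friday_6pm = datetime(2024, 5, 24, 18, 0)   # example Friday
    monday_6am = datetime(2024, 5, 27, 6, 0)    # normal weekend
    tuesday_6am = datetime(2024, 5, 28, 6, 0)   # 3-day weekend

    print("Normal weekend:", unattended_hours(friday_6pm, monday_6am), "hours")
    print("3-day weekend: ", unattended_hours(friday_6pm, tuesday_6am), "hours")
```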

Many countries have a 3-day weekend in May. By the 2nd or 3rd day the A/C unit is working as hard as it would on a typical summer day. If it is about to break, this is the weekend it will break.

To prevent a cooling emergency, make sure that your monitoring system is also watching the heat and humidity of your room. There are many SNMP-accessible units for less than $100. Dell recommends that machines not run in a room that is hotter than 35°C. I generally recommend that your monitoring system alert you at 33°C; if you see no sign of it improving on its own in the next 30 minutes, start powering down machines. If that doesn't help, power them all off. (The Practice of System and Network Administration has tips about creating a "shutdown list".)

Having the ability to remotely power off machines can save you a trip to the office. Most Linux systems have a "poweroff" command that is like "halt" but does the right thing to tell the motherboard to literally power off. If the server doesn't have that feature (because you bought it in the 1840s?), shutting it down and leaving it sitting at a "press any key to boot" prompt often generates little heat compared to a machine that is actively processing.

If powering off the non-critical machines isn't enough, shut down critical equipment but not the equipment involved in letting you access the monitoring systems (usually the network equipment). That way you can bring things back up remotely. Of course, as a last resort you'll need to power off those bits of equipment too.
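The escalation policy described above (alert at 33°C, wait 30 minutes for improvement, then work down a shutdown list with the network gear last) can be sketched as a small planning function. Treat this as one possible reading of the advice, not a finished tool: the host names and the `SHUTDOWN_STAGES` list are hypothetical, "no improvement" is interpreted here as "the temperature is not lower than it was at the start of the 30-minute window", and the function only decides what to do; it does not actually power anything off.

```python
ALERT_C = 33.0        # alert threshold suggested in the text
GRACE_MINUTES = 30    # how long to wait for the room to improve on its own

# Hypothetical shutdown list, least critical first. Network gear goes last
# so that remote access to the monitoring system survives as long as possible.
SHUTDOWN_STAGES = [
    ("non-critical", ["build01", "test01"]),
    ("critical", ["db01", "mail01"]),
    ("network (last resort)", ["core-switch"]),
]


def plan_actions(readings):
    """Given (minute, temp_c) readings in time order, return (minute, action) pairs."""
    actions = []
    window_start = None   # minute the current 30-minute window began
    window_temp = None    # temperature at the start of that window
    stage = 0             # index of the next shutdown stage to apply

    for minute, temp_c in readings:
        if temp_c < ALERT_C:
            window_start = None          # room is back under the threshold
            continue
        if window_start is None:
            window_start, window_temp = minute, temp_c
            actions.append((minute, f"alert: room at {temp_c}C"))
            continue
        if minute - window_start >= GRACE_MINUTES:
            not_improving = temp_c >= window_temp
            if not_improving and stage < len(SHUTDOWN_STAGES):
                name, hosts = SHUTDOWN_STAGES[stage]
                actions.append((minute, f"power off {name}: {', '.join(hosts)}"))
                stage += 1
            window_start, window_temp = minute, temp_c  # start a new window

    return actions


if __name__ == "__main__":
    sample = [(0, 31.0), (10, 33.5), (20, 34.0), (40, 34.2), (70, 34.5), (100, 34.5)]
    for minute, action in plan_actions(sample):
        print(f"t+{minute:3d} min  {action}")
```

In a real setup the readings would come from your SNMP temperature sensor and each "power off" action would trigger a remote "poweroff" on the listed hosts. Keeping the list in order of criticality is the point: the decision about what goes down first should be made calmly in advance, not at 3am on a holiday.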

Having a cooling emergency? Cooling units can be rented on an emergency basis to help you through a failed cooling unit, or to supplement a cooling unit that is underpowered. There are many companies looking to help you out with a rental unit.

If you have a small room that needs to be cooled (a telecom closet that now has a rack of machines), I've had good luck with a $300-600 unit available at Walmart. For $300-600 it isn't great, but I can buy one in less than an hour without having to wait for management to approve the purchase. Heck, for that price you can buy two and still be below the spending limit of a typical IT manager. The Sunpentown 1200 and the Amcor 12000E are models that one can purchase for about $600 that re-evaporate any water condensation and exhaust it with the hot air. Not having to empty a bucket of water every day is worth the extra cost. The unit is intended for home use, so don't try to use it as a permanent solution. (Not that I didn't use one for more than a year at a previous employer. Ugh.) It has one flaw... after a power outage it defaults to being off. I guess that is typical of a consumer unit. Be sure to put a big sign on it that explains exactly what to do to turn it back on after a power outage. (The sign I made says step by step what buttons to press, and what color each LED should be if it is running properly. I then had a non-system administrator test the process.)

In summary: test your A/C units now. Monitor them, especially on the weekends. Be ready with a backup plan if your A/C unit breaks. Do all this and you can prevent an expensive and painful meltdown.

Posted by Tom Limoncelli in Best of Blog, Technical Tips

5 Comments

Some nice environment monitoring units I've used in the past are Weather Ducks from http://www.itwatchdogs.com, which use the old 1-wire protocol for their sensors.

Excellent advice, Tom. However, as a former grad student who taught an energy efficient buildings class, I need to correct your physics :)

A typical modern small air conditioner can remove 4x the energy that it uses (this figure is called the COP, coefficient of performance). Large datacenter-sized units may be able to work twice as efficiently.

For small server rooms this works great!
But big datacenters often end up having air conditioning consume half of their power, despite having cooling units that are far more capable than this. This is because most datacenters have notoriously poor airflow and end up with poorly distributed cooling, so they need to be run at lower temperature settings.
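The COP arithmetic in this comment can be sketched in a few lines. The COP of 4 comes from the comment; the COP of 8 is only an assumption standing in for "twice as efficiently", and the function names are invented for illustration.

```python
def cooling_power_kw(heat_kw: float, cop: float) -> float:
    """Electrical power the A/C needs to remove heat_kw of heat at a given COP."""
    return heat_kw / cop


def effective_cop(heat_kw: float, cooling_kw: float) -> float:
    """COP actually achieved, given the heat removed and the power spent on cooling."""
    return heat_kw / cooling_kw


if __name__ == "__main__":
    heat = 50.0  # kW of equipment heat, the example figure from the post
    print("Small unit, COP 4:", cooling_power_kw(heat, 4), "kW")
    print("Large unit, COP 8 (assumed):", cooling_power_kw(heat, 8), "kW")
    # If cooling draws as much power as the equipment, the effective COP is 1.
    print("Effective COP when cooling uses 50 kW:", effective_cop(heat, 50.0))
```

The last line shows the gap described above: a datacenter that spends as much power on cooling as on its equipment is getting an effective COP of about 1, far below what the units themselves are capable of.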

Also, a quick fix for emergency cooling is to have big fans on hand which you can use to circulate office AC into a server room.

Regards,
Andrew

Or, hey, you could have your HVAC unit fry itself at the end of April, just before the last couple of weeks of classes, when all the seniors are desperately working to complete their theses, Clinic projects, and other work.

On the plus side, I think I now have a pretty good case for spending a bunch of money on monitoring equipment that might have been a hard sell otherwise....

Would that I saw your post on Friday. I mark you Cassandra.

Anyway, read this:

Environmental Issues in the datacenter

By the way, I hope you don't mind that I linked back to you
