
How Healthy is your Data Center?
By Michael Fluegeman and Darrell Gardner
ABOUT US
Michael Fluegeman, P.E., Sr. Consultant/Team Lead
– 25 years of data center experience
– Licensed Electrical Engineer
– Planning, Programming, Assessments and Design Concepts
– Construction, Commissioning and Operations Support
Darrell Gardner, Principal Consultant, Mgr. Strategic Initiatives
– 27 years of IT experience
– Managed migration of thousands of servers
– Manage and operate Co-Lo hosting facility
WHAT CONCERNS HAVE YOU PACING THE FLOOR?
How much room do you have to expand?
What type of power problem do you have?
Are you running out of floor space? Power? Cooling?
Do you have hot spots?
Are you concerned about cooling failures?
Do you know when circuits are near overload?
Are you concerned about power failures?
Is your data cabling a maze of confusion?
WHAT CONCERNS HAVE YOU PACING THE FLOOR?
Is your fire protection adequate?
Could an overhead water leak rain on your parade?
Are the EPO buttons begging to be pushed?
Have you made significant energy efficiency progress?
Do you know what all those old servers are doing?
Are you ready for next generation technology?
Healthy, Functional, Dire Straits
WHAT IS A HEALTHY DATA CENTER?
Meets business requirements, present & future
Is as available as needed
Is cost effective to operate
What are:
– Elements of good health?
– Symptoms of poor health?
WHAT IS A HEALTHY DATA CENTER?
Perfect health is rare
Enjoys a strong alliance between IT and Facilities
Health is short-lived (unless properly managed & maintained)
ELEMENTS OF GOOD HEALTH
Comprehensive monitoring
Structured cabling
Power, capacity & reliability
Cooling where you need it
Telecom, bandwidth & reliability
Maintenance & Operations
Change Management policies
ELEMENTS OF GOOD HEALTH
As-built facility drawings
Inventory & equipment
Efficient space plan
Space & capacity for growth
Fire/Life-Safety
Green IT strategies
Energy efficient facility
SYMPTOMS OF POOR HEALTH
Incomplete monitoring
Unstructured cabling
Power maxed out or not reliable
Hot spots or not enough cooling
Many single point failure items
Service requires shutdown
Telecom inadequate
Poor maintenance
Operations lacking capabilities
Unmanaged change
Drawings out-of-date or missing
Equipment inventory out-of-date
Space plan not efficient
Growth constraints
Inadequate fire protection
Water leak risk
High risk of false EPO shutdown
IT approach not updated to be Green
Facility not updated to be Green
DEFINITIONS
Redundancy
– What level of backup, and backup to the backup, do we need?
– What component or system is more likely to fail?
– What failures can we not tolerate?
Tier Levels
– Classifications of reliability, availability & resiliency
Change Management
– Documented processes to deal with changes
REDUNDANCY
N: the number of units or capacity to handle the load
N+1: N units, plus one extra unit in case of unit failure
N+2: N units, plus two extra redundant units
N+1 or N+2: may be one system with overall failure potential (Single Point Failure, SPF)
2N: two complete systems, or 100% redundancy
2(N+1): two complete systems, with each system N+1 redundant
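As an illustration of the notation above, the short Python sketch below counts spare units and checks whether a single system still carries its load after unit failures. The class name, method names and the 200 kW unit size are assumptions made for the example; they do not come from the presentation.

```python
import math
from dataclasses import dataclass


@dataclass
class System:
    """One power or cooling system built from identical units (hypothetical model)."""
    units_installed: int
    unit_capacity_kw: float

    def units_needed(self, load_kw: float) -> int:
        """N: the number of units required to carry the load."""
        return math.ceil(load_kw / self.unit_capacity_kw)

    def spare_units(self, load_kw: float) -> int:
        """x in 'N+x': installed units beyond the N that are needed."""
        return self.units_installed - self.units_needed(load_kw)

    def survives(self, load_kw: float, failed_units: int) -> bool:
        """True if the units left after failures can still carry the load."""
        remaining = self.units_installed - failed_units
        return remaining * self.unit_capacity_kw >= load_kw


ups = System(units_installed=3, unit_capacity_kw=200.0)
print(ups.spare_units(400.0))   # spare units for a 400 kW load
print(ups.survives(400.0, 1))   # one unit failed
print(ups.survives(400.0, 2))   # two units failed
```

Three 200 kW units serving 400 kW come out as N+1 here: the load survives one unit failure but not two. A 2N design could be modeled as two such systems, each able to carry the full load alone.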
REDUNDANCY
TIER LEVELS
IT tier levels
– Tier 1 most important applications, then Tier 2, Tier 3 least
Facility Tier levels (Uptime Institute)
– Tier IV most robust & redundant facility, then Tier III, Tier II, Tier I least
Availability percentages; expected downtime
– 99.99%, 99.9%, 99%, 98%
Criticality Levels™ (Syska Hennessy Group)
– Criticality levels 0 (least) to 10 (most) robust
PlanNet IT & Facility categories
TIER LEVELS
IT tier levels
– Tier 1 most important applications, then Tier 2, Tier 3 least
– Dependent upon business continuity plan
Tier 1
– No downtime (Facility 99.99% or better)
Tier 2
– 2 hour return to operations
Tier 3
– 2-3 day return to operations
TIER LEVELS
90% Availability (No 9s, Tier 0; TUI does not cover this)
98% Availability (One 9, Tier I, 20-40 hours downtime)
99% Availability (Two 9s, Tier II, 10-25 hours downtime)
99.9% Availability (Three 9s, Tier III, 1-15 hours downtime)
99.99% Availability (Four 9s, Tier IV, ¼-1 hours downtime)
99.999% Availability (Five 9s, Tier IV, 20-120 seconds downtime)
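For reference, the straight arithmetic conversion from an availability percentage to hours of downtime per year can be sketched as below. This is only the textbook calculation over an 8,760-hour year; the downtime ranges quoted on the slide are the presenters' own figures, and the arithmetic result need not match them. The function name is an assumption for the example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours per year of downtime implied by an availability percentage."""
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


for pct in (90.0, 98.0, 99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% -> {annual_downtime_hours(pct):.2f} hours/year")
```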
TIER LEVELS: 90% Availability (No 9s, Tier 0; TUI does not cover this)
Non-redundant utility power source
No UPS; no short-term power ride-through
No standby generator; no long-term power ride-through
House (comfort) cooling (not precision; no humidity control), on 24/7
Telecom service as needed, no redundancy
Wet-pipe sprinklers
Simple EPO
Data center not in separate room; in mixed use area
No additional security for data center
No additional facility monitoring for data center
No additional facility operations staff for data center
TIER LEVELS: 98% Availability (One 9, Tier I)
Non-redundant utility power source
Non-redundant UPS, without external maintenance bypass
Non-redundant standby generator; non-redundant ATS
Precision cooling, on 24/7, with N+1 redundancy or temporary unit during service
Telecom service as needed, no redundancy
Dry-pipe (pre-action) sprinklers
Simple EPO
Minimal additional security for data center
Minimal additional facility monitoring for data center
Minimal additional facility operations staff for data center
Data center may not be located in ideal building or ideal location in building; vulnerable to upper floor plumbing leaks, external problems, etc.
EXAMPLE: 98% AVAILABILITY
TIER LEVELS: 99% Availability (Two 9s, Tier II)
Non-redundant utility power source
Redundant UPS (at least internally N+1), with external maintenance bypass
2N redundant power distribution to equipment racks. A/B power from UPS to racks. (Not required by TUI.)
Load circuit power monitoring at PDU, RPP or plug strip
Non-redundant standby generator but redundant ATS or ATS w/ bypass isolation (TUI requires redundant generator but allows single ATS/switchgear)
Provision to connect data center power distribution directly to generator without shutdown, for total UPS isolation (input and output) for service and load testing on utility. (Not required by TUI.)
Load bank breakers and connections at each UPS and generator for low-risk testing
Precision cooling with N+1 redundancy (N+2 or more for large rooms), N+1 heat rejection (chillers, cooling towers, condensing units, etc.)
Telecom service with multiple carriers but no redundancy
Dry-pipe (pre-action) sprinklers with VESDA or gaseous suppression
EPO system with protective cover, time delay, strobe/siren or other means to reduce erroneous shutdown
Additional security for data center
Additional facility monitoring for data center
Additional trained facility operations staff for data center
Data center may not be located in ideal building or ideal location in building; vulnerable to upper floor plumbing leaks, external problems, etc.
REDUNDANCY ENHANCEMENTS
TIER LEVELS: 99.9% Availability (Three 9s, Tier III)
Non-redundant utility power source but may have redundant onsite transformers (TUI requires fully redundant & diverse)
Redundant UPS (at least full N+1 with external maintenance bypass; prefer 2N to critical loads). 3-to-make-2, 4-to-make-3, etc. configurations are N+1 for hardware but are actually 2N at the critical load; they require high monitoring and operations expertise.
2N redundant power distribution to equipment racks. A/B power from UPS to racks
Load circuit power monitoring at PDU, RPP or plug strip
Redundant standby generators with redundant ATS/switchgear (TUI requires redundant generators but not redundant automatic transfer switchgear)
Roll-up connection for back-up generator rental as needed for extended service
Provision to connect data center power distribution directly to generator without shutdown, for total UPS isolation (input and output) for service and load testing on utility. This is inherent with full 2N power systems but must be added to N+1 systems.
Permanent load bank & switchboard with connections to each UPS and generator for low-risk testing
Precision cooling with N+1 redundancy (N+2 or more for large rooms), N+1 heat rejection (chillers, cooling towers, pumps, condensing units, etc.)
TIER LEVELS: 99.9% Availability (Three 9s, Tier III) continued
Roll-up connection for back-up chiller rental as needed for extended service
For water-cooled equipment: water storage tank, well, etc. for backup water supply
Redundant power to critical cooling to eliminate SPFs, i.e., redundant chillers, CRAC units, etc. are powered by different switchboards and ATSs
Redundant controls for HVAC equipment to eliminate SPFs. Drives, etc. need to operate in default position, not off, after return from power failure or loss of remote control signal.
Telecom service with multiple carriers, redundant pathways to building from diverse COs, redundant building entry points and pathways
Dry-pipe (pre-action) sprinklers with VESDA or gaseous suppression (gaseous needed if not staffed 24/7)
Dual A/B EPO system with protective covers. Erroneous EPO operation only shuts down redundant power and cooling; no data center shutdown. (Not required by TUI.)
Dedicated security for data center
Dedicated NOC / monitoring for data center
Trained facility operations staff for data center 24/7
Data center located in ideal building or ideal location in building; less vulnerable to upper floor plumbing leaks, external problems, etc.
REDUNDANCY: 99.9% AVAILABILITY
REDUNDANCY: 99.9% AVAILABILITY
TIER LEVELS: 99.99% Availability (Four 9s, Tier IV)
Redundant utility power source with ATO or manual transfer to alternate feed (may be from same substation), redundant onsite transformers (TUI requires fully redundant & diverse)
2N redundant UPS {TUI requires 2(N+1)}
2N redundant power distribution to equipment racks. A/B power from UPS to racks; no static transfer switch PDUs
Load circuit power monitoring at PDU, RPP or plug strip
2N redundant standby generator with redundant ATS/switchgear {TUI requires 2(N+1) redundant generators}
Roll-up connection for back-up generator rental as needed for extended service
Permanent load bank & switchboard with connections to each UPS and generator for low-risk testing
Precision cooling with N+1 redundancy (N+2 or more for large rooms), N+1 heat rejection (chillers, cooling towers, pumps, condensing units, etc.)
CW piping ladder-loop sectionalized with automated valves to eliminate SPFs
Roll-up connection for back-up chiller rental as needed for extended service
TIER LEVELS: 99.99% Availability (Four 9s, Tier IV) continued
For water-cooled equipment: water storage tank, well, etc. for backup water supply
Redundant power to critical cooling to eliminate SPFs, i.e., redundant chillers, CRAC units, etc. are powered by different switchboards and ATSs
Redundant controls for HVAC equipment to eliminate SPFs. Drives, etc. need to operate in default position, not off, after return from power failure or loss of remote control signal.
Telecom service with multiple carriers, redundant pathways to building from diverse COs, redundant building entry points and pathways
Dry-pipe (pre-action) sprinklers with VESDA and gaseous suppression
Dual A/B EPO system with protective covers. Erroneous EPO operation only shuts down redundant power and cooling; no data center shutdown. (Not required by TUI.)
Dedicated security for data center
Dedicated NOC / monitoring for data center
Trained facility operations staff for data center 24/7
Data center located in ideal building and in ideal location in building; not vulnerable to upper floor plumbing leaks, external problems, etc. Single-tenant, no floors above data center
REDUNDANCY: 99.99% AVAILABILITY
TIER LEVELS: 99.999% Availability (Five 9s, Tier IV)
Redundant utility power sources from separate substations, diverse routing, redundant onsite transformers
2(N+1) redundant UPS
2N redundant power distribution to equipment racks. A/B power from UPS to racks; no static transfer switch PDUs
Load circuit power monitoring at PDU, RPP or plug strip
2(N+1) redundant standby generator with redundant ATS/switchgear
Permanent load bank & switchboard with connections to each UPS, generator, & UPS/generator parallel system for low-risk testing
Precision cooling with N+1 redundancy (N+2 or more for large rooms), 2N heat rejection (chillers, cooling towers, pumps, condensing units, etc.)
2N redundant CW piping ladder-loop sectionalized with automated valves to eliminate SPFs
For water-cooled equipment: water storage tank, well, etc. for backup water supply
TIER LEVELS: 99.999% Availability (Five 9s, Tier IV) continued
Redundant power to critical cooling to eliminate SPFs, i.e., redundant chillers, CRAC units, etc. are powered by different switchboards and ATSs
Redundant controls for HVAC equipment to eliminate SPFs. Drives, etc. need to operate in default position, not off, after return from power failure or loss of remote control signal.
Telecom service with multiple carriers, redundant pathways to building from diverse COs, redundant building entry points and pathways
Dry-pipe (pre-action) sprinklers with VESDA and gaseous suppression
Dual A/B EPO system with protective covers. Erroneous EPO operation only shuts down redundant power and cooling; no data center shutdown.
Dedicated high security for data center
Dedicated NOC / monitoring for data center
Trained facility operations staff for data center 24/7
Data center located in bunker-type building and in ideal location in building; not vulnerable to upper floor plumbing leaks, external problems, etc. Single-tenant, no floors above data center.
TIER LEVELS: Load density (Watts/SF or Watts/rack)
Related to available power & cooling capacity, access floor height, ceiling height, hot/cold aisle configuration and containment, or other cooling approaches.
– Very low: 30 W/SF
– Low: 50-75 W/SF or 1 kW/cabinet
– Medium: 75-125 W/SF or 2 kW/cabinet
– High: 125-200 W/SF or 4 kW/cabinet
– Very high: 200-500 W/SF or 6-15 kW/cabinet
– Ultra high: 500 W/SF or 15 kW/cabinet
MEDIUM DENSITY
TIER LEVELS: Floor & ceiling load capacity
Low: 75 lbs/SF
Medium: 75-150 lbs/SF
High: 150-200 lbs/SF
Very high: 200 lbs/SF
TIER LEVELS: Load growth capability (power & cooling capacity)
None: systems loaded to 85% of capacity with no provisions to increase capacity.
Moderate: systems loaded to 65% of capacity or higher with provisions to increase capacity with no downtime and minimal risk during upgrade.
High: systems loaded to 50% or less.
TIER LEVELS: Energy efficiency
Poor: very old data centers with lightly loaded power and cooling equipment, significant hot/cold air mixing.
Moderate: newer and right-sized power and cooling equipment, hot/cold aisle separation with little short-circuiting, drives on HVAC equipment.
High: extensive use of air side and/or water side economizers on HVAC, highest efficiency HVAC equipment, ducted hot and cold air with no short-circuiting, close-coupled cooling, efficient redundant UPS configurations (not full 2N), flywheel/off-line UPS designs, higher voltage power distribution.
FLOOR PLAN
IT: things to think about
– Layout with security
– Access to racks
– Pre-production
– Production systems
FLOOR PLAN – NON OPTIMIZED
FLOOR PLAN – OPTIMIZED / HIGH DENSITY
BREAK!!!
– Q&A
– Please be back in 10 minutes
WELCOME BACK!!!
– Quick Review
CHANGE MANAGEMENT
Deteriorating health of data centers occurs over time, not initially.
CHANGE MANAGEMENT
IT Service Management
– Guidance and best practices
Change Control
– Actual controls and processes put in place
Change Control is a subset of Change Management.
CHANGE MANAGEMENT
Documented, detailed processes
Follow-through diligence
IT equipment changes
– Power, circuiting, cooling needs, drawing & schedule updates
– Release Management (application releases)
Facility changes
– Maintenance
Personnel changes
– Training
IT SERVICE MANAGEMENT
Develop your own (if it doesn’t exist).
– What is the goal? The objective of Change Management in this context is to ensure that standardized methods and procedures are used for efficient and prompt handling of all changes to controlled IT infrastructure, in order to minimize the number and impact of any related incidents upon service.
– Use common sense in implementing change as it applies to the goal.
CHANGE CONTROL
Change Control: procedures used to ensure that changes are introduced in a controlled, coordinated manner
What it is NOT
– Good intentions without change control = OUTAGE. Disruption of service!
CHANGE CONTROL
Minimal disruption to services
Reduction in back-out activities
Economic utilization of resources involved in implementing change
CHANGE MANAGEMENT
IT Service Management
– ITIL: Information Technology Infrastructure Library
– Enterprise Computing Institute's library
– ISO/IEC 20000 standard (previously BS 15000)
– Framework for ICT Technical Support (FITS), which is based on ITIL
– IBM Tivoli Unified Process (ITUP)
– enhanced Telecom Operations Map (eTOM)
– Microsoft Operations Framework
INVENTORY & EQUIPMENT
You have some sort of inventory of equipment that is in the data center.
CMDB. This will tie equipment back to applications and owners as well as physical characteristics.
What are the warranties for all equipment?
What is the life expectancy of the equipment?
What licenses for software and OSs are running?
Are you in compliance with vendor requirements?
SAN Health Check, etc.
CMDB
CMDB: Configuration Management Database
– Auto discovery
– Configurable attributes
– Technical
– Ownership
– Relationship
– Continued updating as part of your Change Management control
This is where we (IT) fail the most.
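To make the attribute groups above concrete, here is a minimal sketch of what one CMDB record might hold, with a technical, an ownership and a relationship attribute. All field names, host names and values are hypothetical; a real CMDB product will define its own schema.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigurationItem:
    """One CMDB record (hypothetical field names)."""
    name: str
    technical: dict = field(default_factory=dict)    # e.g. rack, power draw
    owner: str = ""                                  # ownership attribute
    application: str = ""                            # application it supports
    depends_on: list = field(default_factory=list)   # relationship attribute


def items_for_application(cmdb, application):
    """Names of every item recorded as supporting the given application."""
    return [ci.name for ci in cmdb if ci.application == application]


def items_missing_owner(cmdb):
    """Names of items with no owner recorded: candidates for follow-up."""
    return [ci.name for ci in cmdb if not ci.owner]


cmdb = [
    ConfigurationItem("web01", {"rack": "A3", "watts": 350}, "Web team", "storefront", ["db01"]),
    ConfigurationItem("db01", {"rack": "A4", "watts": 600}, "DBA team", "storefront"),
    ConfigurationItem("old07", {"rack": "C1", "watts": 400}),
]
print(items_for_application(cmdb, "storefront"))
print(items_missing_owner(cmdb))
```

Even a structure this small can answer the earlier question of what the old servers are doing: any record with no owner or application is flagged for review.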
CMDB
Purchase off-the-shelf systems
– Asset Management as well as CMDB
– HP, Symantec, SunView, FrontRange, Aldon, Rackwise, etc.
– Google “CMDB software”
Build your own
– http://www.cmdb.info/pd1/html/index.php?name=Sections&req=viewarticle&artid=2&page=1
Use Excel
– Show me
Whatever you do, use something!
CMDB - EXCEL
DATA CENTER ASSESSMENT
Identify areas that pose the most risk
– Failures about to happen
– Limitations for future technology
Develop a plan to address issues
– Options
– Costs
– Implementation phases, risks
– Priorities
DATA CENTER ASSESSMENT
Identify areas that pose the most risk
– Application level failures, including lack of redundancy
– Discrepancies between tier of application and redundancy of equipment / facilities
  – Single power supply on server
  – Single PDU in a rack
Develop a plan to address issues
MONITORING
Local & remote monitoring
UPS kVA vs. kW
– UPS rated in kVA but typically overloads on kW!
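The kVA versus kW point can be shown with a small calculation. The sketch below assumes a UPS rated 100 kVA / 80 kW and a load with a power factor near unity; both figures and the function name are illustrative assumptions, not data from the presentation.

```python
def ups_loading(rating_kva, rating_kw, load_kva, load_power_factor):
    """Return (kVA fraction, kW fraction) of the UPS rating in use."""
    load_kw = load_kva * load_power_factor  # real power drawn by the load
    return load_kva / rating_kva, load_kw / rating_kw


kva_used, kw_used = ups_loading(
    rating_kva=100.0, rating_kw=80.0, load_kva=85.0, load_power_factor=0.98
)
print(f"kVA loading: {kva_used:.0%}, kW loading: {kw_used:.0%}")
```

In this example the UPS looks comfortably loaded on kVA while already exceeding its kW rating, which is why both values are worth monitoring.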
KVA & KW: GRAPHIC DISPLAY
MONITORING
Circuit loading
– Monitor at circuit breaker or plug strip
– Prevent overload and breakers tripping
– Track and trend loads
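A minimal sketch of tracking and trending one circuit follows. The 80% limit is an assumption borrowed from the deck's later figure for the usable capacity of circuit breakers, and the function name, breaker size and readings are invented for illustration.

```python
def circuit_report(breaker_amps, readings_amps, limit=0.80):
    """Summarize a series of amp readings for one circuit, oldest first."""
    peak = max(readings_amps)
    utilization = peak / breaker_amps
    return {
        "peak_amps": peak,
        "utilization": utilization,
        "over_limit": utilization > limit,                 # flag before the breaker trips
        "change_amps": readings_amps[-1] - readings_amps[0],  # crude trend over the period
    }


report = circuit_report(breaker_amps=20.0, readings_amps=[10.0, 12.0, 17.0])
print(report)
```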
PLUG STRIP MONITORING
MONITORING
UPS battery monitors
– Monitors don’t replace service & load testing!
Power monitoring
– Key to energy efficiency improvement
SNMP interface in each piece of equipment
– Auto shut down software for critical systems
Alarms (Minor and Major)
MONITORING
How low do you go?
– Servers
– Applications
  – App layer
  – Middle tier
  – Data layer
– Ping, Power & Pipe?
Bandwidth monitoring (switch port)
Local vs. remote monitoring
Which devices to monitor
CABLING ARCHITECTURAL DESIGN
Growth and capacity
– Patch panels (at each rack/row?)
– Switch ports: does the layout of the room allow for server switches and core switching?
– Under floor vs. above racks
Rack elevations
– Room for growth?
– Divided by applications?
UNSTRUCTURED CABLING
STRUCTURED CABLING
PHYSICAL SECURITY
Where are your check points (policies/procedures)?
– Environmental
  – Guards, bollards, enclosures, fences, landscaping, architectural
– Building
  – Guards, man trap, sign-in procedures, access control, CCTV (DVR), intrusion detection (also applies to building perimeter)
– Room
  – CCTV (DVR), access control, rack access control
– Server
  – Firewall, intrusion detection
DRAWINGS: FACILITY
Assessments should begin with drawings
Full A, S & MEP accurate and as-built in CAD
– Any full discipline set w/ architectural backgrounds
– Any as-built prints, redlines, sketches
Piecemeal drawings
– Site plan
– Floor plans
– Single line power diagram
– Mechanical schedules
– Panel schedules
– Spot check as-built accuracy
DRAWINGS: FACILITY
What, no drawings?
Make a thorough search first. Ask for redlines, sketches, etc.
Reverse engineer the site
– Start with the facility engineer (drawings might be in his head)
– Requires engineers, electricians, mechanical service personnel
Facility drawings will be needed to properly operate the data center!
DRAWINGS: IT
Often separate from facility drawings
Kept by IT dept.
Often in Visio format
Floor plan
Rack elevations
Structured cabling design
Network diagrams
Context diagrams (application)
BREAK!!!
– Q&A
– Please be back in 10 minutes
WELCOME BACK!!!
– Quick Review
CAPACITIES, CONDITIONS, NEEDS & GROWTH: Major Support Equipment Ratings & Condition
Power starts with utility and goes to the load
– Utility service transformer(s), upstream transmission
– Standby power {generator(s)}
  – Fuel run time
  – Parallel switchgear
– Main and transfer switchgear {ATS(s)}
  – Note fault current ratings where available
– All distribution panels in critical path
  – Mechanical switchboards
STANDBY DIESEL GENERATOR
AUTOMATIC TRANSFER SWITCH
CAPACITIES, CONDITIONS, NEEDS & GROWTH: Major Support Equipment Ratings & Condition (2)
Power starts with utility and goes to the load
– UPS system(s)
  – Batteries, bypass panels, parallel switchgear
  – Redundant configurations
– UPS output panel(s)
– PDUs, # of PDUs
  – Panels, breaker spares/spaces, subfeeds
– Rack/cabinet power distribution
  – A/B power, rack high-speed transfer switches
UPS
PDU – UPS POWER DISTRIBUTION
CAPACITIES, CONDITIONS, NEEDS & GROWTH: What, no load data available?
Utility bills may provide utility loads
Generator service records may provide load data
UPS service records may provide load data
Meter loads
– Snapshot
– 7 day load survey w/ peak demand
– 30 day load survey w/ peak demand
– Summer afternoon loads typically the highest
GENERATOR METER
CAPACITIES, CONDITIONS, NEEDS & GROWTH: Major Support Equipment Ratings & Condition (3)
CRAC/CRAH units, # of units
– Configuration
– Airflow, degree of supply & return isolation
UPS cooling system
Outside heat rejection
– Chillers
– Cooling towers
– Condensing units
CRAC, CRAH UNIT
CAPACITIES, CONDITIONS, NEEDS & GROWTH: Fire / Life-Safety
– Sprinklers: wet or dry-pipe
– Gaseous: FM-200, Halon, Inergen, Ecaro25, etc.
  – Zones
– VESDA
– EPO system: false activation prevention
– Signage
– Code clearances
– Exiting, egress
FIRE SUPPRESSION
UPS Flameout
FM200 PUTTING OUT THE FIRE
EPO
CAPACITIES, CONDITIONS, NEEDS & GROWTH: Power needs & growth
Determine max usable load for existing equipment
– Switchgear & circuit breakers typically 80%
– UPS, PDUs, cooling typically 90%
– Determine max usable load to maintain redundancy
  – Rack PDU, circuit panel, UPS, generator, etc.
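The two rules above (a derating percentage, and capacity held back for redundancy) can be combined in one line of arithmetic. The sketch below is an illustration under assumed numbers: three 200 kW UPS modules in an N+1 arrangement, derated to 90%. The function name and parameters are hypothetical.

```python
def max_usable_load_kw(unit_kw, units_installed, redundant_units, derating):
    """Largest load that keeps `redundant_units` in reserve at the given derating."""
    working_units = units_installed - redundant_units
    return unit_kw * working_units * derating


with_redundancy = max_usable_load_kw(200.0, 3, redundant_units=1, derating=0.90)
without_redundancy = max_usable_load_kw(200.0, 3, redundant_units=0, derating=0.90)
print(with_redundancy, without_redundancy)
```

The gap between the two results is the load that could be added only by giving up N+1 redundancy.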
CAPACITIES, CONDITIONS, NEEDS & GROWTH
Determine growth needs
– 25% minimum growth capability is best practice
– Load history often predicts future growth needs
– Understand what may change from past and present to future
  – 1U, 2U servers changing to blades?
  – Consolidating, virtualizing?
WHERE DOES THE POWER GO?
All power into the data center turns into: HEAT!
CAPACITIES, CONDITIONS, NEEDS & GROWTH: Cooling needs & growth
Cooling capacity based on power capacity
– Additional heat gain from UPS, transformers, etc.
– Additional heat from outside, sun, air leaks
– Humidity control based on outside air intake
Cooling needs for concentrated high density
– Where are the hot spots?
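Because the electrical load ends up as heat, a first estimate of cooling capacity follows from unit conversion: 1 kW is about 3,412 BTU/hr and one ton of cooling is 12,000 BTU/hr. In the sketch below, `extra_heat_fraction` is a hypothetical allowance for the additional gains listed above (UPS, transformers, outside heat); its value would have to come from a site survey.

```python
BTU_PER_HOUR_PER_KW = 3412.14      # 1 kW expressed in BTU/hr
BTU_PER_HOUR_PER_TON = 12000.0     # 1 ton of refrigeration in BTU/hr


def cooling_required(it_load_kw, extra_heat_fraction=0.0):
    """Return (BTU/hr, tons) of heat to remove for a given electrical load."""
    heat_kw = it_load_kw * (1.0 + extra_heat_fraction)
    btu_per_hour = heat_kw * BTU_PER_HOUR_PER_KW
    return btu_per_hour, btu_per_hour / BTU_PER_HOUR_PER_TON


btu, tons = cooling_required(100.0)
print(f"{btu:,.0f} BTU/hr, {tons:.1f} tons")
```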
HOT SPOTS - THERMAL IMAGING
LOAD DENSITY
Is your cooling system capable of cooling your load density?
– Today?
– Tomorrow?
– Next year?
LOAD DENSITY
Obsolete: 30 W/SF
Today: 75 W/SF; 1.2 kW/rack
Tomorrow: 125 W/SF; 2 kW/rack
Soon: 200 W/SF; 3.25 kW/rack
Near future: 500 W/SF; 8 kW/rack
Future: 1,500 W/SF; 24 kW/rack
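The W/SF and kW/rack pairs above are consistent with roughly 16 SF of floor area per rack (for example, 1,200 W divided by 75 W/SF). That 16 SF figure is inferred from the numbers on the slide rather than stated by the presenters; with it, the conversion can be sketched as:

```python
SF_PER_RACK = 16.0  # inferred from the slide's pairs, not stated in the deck


def kw_per_rack(watts_per_sf, sf_per_rack=SF_PER_RACK):
    """Convert a room-level density in W/SF to an average kW per rack."""
    return watts_per_sf * sf_per_rack / 1000.0


for density in (75, 125, 200, 500, 1500):
    print(f"{density} W/SF -> {kw_per_rack(density):.2f} kW/rack")
```

At 200 W/SF this gives 3.2 kW/rack, slightly under the slide's 3.25, so the slide values appear to be rounded.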
LOAD DENSITY LOW-MEDIUM
Obsolete: 30 W/SF
Today: 75 W/SF; 1.2 kW/rack
Tomorrow: 125 W/SF; 2 kW/rack
– Limits of traditional non-ducted perimeter cooling (CRAC unit)
LOAD DENSITY HIGH
Soon: 200 W/SF; 3.25 kW/rack
– Requires ducting or close-coupled cooling
Near future: 500 W/SF; 8 kW/rack
– Requires high-performance cooling means; difficult to make redundant
Future: 1,500 W/SF; 24 kW/rack
– May require liquid/refrigerant cooling
HOT AISLE COLD AISLE
ENERGY EFFICIENCY / GREENING
Immediate OpEx reduction
Short & long-term CapEx reduction
– Delay or eliminate power & cooling capacity upgrades
– Stay within building block sizes
Postpone data center obsolescence
– Delay maxing out capacities
Address corporate mandates
ENERGY EFFICIENCY / GREENING
Green to the EXTREME!!!
ENERGY EFFICIENCY / GREENING
Biggest gains from reducing IT power draw
– A 1 watt reduction of IT draw can reduce overall power for UPS & cooling by almost 3 watts!
Next opportunity is cooling
– Many incremental opportunities
– Radically different design options
– Facility geographic location
Power presents less opportunity
Lighting
GREEN IT STRATEGIES
Virtualize
– Reduce hardware demand
Power down non-critical systems
Decommission unneeded systems
Tier storage
– Move to tape as needed, based upon history requirements
Upgrade server power supply efficiency
– Energy Star compliance
– Dual speed fans, drives that auto spin down
Liquid cooled servers
ENERGY EFFICIENCY: COOLING
Conflicting opinions
Is high-density or low-density more efficient?
Perimeter cooling or close-coupled cooling?
Access floor or all overhead?
Cooler climates? Drier climates? Underground?
Free cooling: outside air (air side economizer)
ENERGY EFFICIENCY: COOLING
Incremental opportunities
Air flow from cooling supply, through equipment racks and back to cooling return
– How much mixing of hot and cold (short circuiting)?
– Hot aisle / cold aisle configuration
  – Short circuiting at end of aisles
  – Short circuiting through racks. Use of blanking plates
  – Ducted cool air supply
  – Ducted cool air return
  – Contained hot aisle and/or cold aisle
ENERGY EFFICIENCY: COOLING
Incremental opportunities
Access floor
– Look for supply air blockage under floor
– Perf tile location, number, % opening
– Blocking of unwanted floor openings
ENERGY EFFICIENCY: COOLING
Incremental opportunities
Equipment near end-of-life
– Simply replacing with new improves efficiency
– Premium for higher efficiency
ENERGY EFFICIENCY: COOLING
Bigger opportunities
Retrofit with variable speed drives
– Pumps, fans, air-handlers
Replace components with different design
– Plug fan technology for CRAC units
– Outside air free cooling
– Replace air-cooled outside equipment w/ water-cooled
– Add close-coupled cooling for high density
  – In-row
  – Overhead
ENERGY EFFICIENCY: COOLING
Bigger opportunities
Relaxed ASHRAE standards
– No longer necessary to maintain tight temperature & humidity control
– Allows for more efficient chilled water operation at higher temperature and away from dew point
– Allows for many more days of free cooling
– Free cooling generally requires use of external air handlers in lieu of CRAC units in the room
ENERGY EFFICIENCY: POWER
Higher voltages
– Medium voltage utility, generators and UPS for large sites
– Higher rack voltage, 208V or 240V
  – 3-phase rack plug strips
More efficient UPS, transformers
– Avoid K-factor transformers
– Avoid excessive harmonic filters. 10% THD OK
ENERGY EFFICIENCY: POWER
Consider line-interactive UPS
Consider flywheel-based rotary back-up
Avoid full 2N redundancy for large systems except for the highest reliability needs
– 4-to-make-3 or 3-to-make-2 configurations are cost-effective and efficient
  – 2N redundant at the load
  – Generic to different UPS vintages & vendors
  – Requires better monitoring and operation management
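One common reading of "X-to-make-(X-1)" is that the critical load is spread across all X UPS modules, and any X-1 of them must be able to carry it alone after a failure. Under that assumption, which is not spelled out on the slide, the highest safe loading per module in normal operation is (X-1)/X. The function name below is hypothetical.

```python
def max_module_loading(modules_installed):
    """Highest normal loading per module that still survives one module failure."""
    if modules_installed < 2:
        raise ValueError("need at least two modules for any redundancy")
    return (modules_installed - 1) / modules_installed


for modules in (2, 3, 4):
    print(f"{modules} modules: up to {max_module_loading(modules):.0%} each")
```

Two modules (full 2N) can each be loaded only to 50%, while 3-to-make-2 and 4-to-make-3 allow about 67% and 75%. Under this reading, the higher utilization is where the cost and efficiency benefit comes from.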
ENERGY EFFICIENCY: POWER
Flywheel/line-interactive UPS
3-TO-MAKE-2 REDUNDANT CONFIG
ENERGY EFFICIENCY: UTILITY REBATES
Many electric utilities have cash available
Limited time opportunity
Virtualization payback
Be sure to research before beginning upgrades
– Utility typically requires before & after survey!
– Utility wants to see less efficient equipment permanently removed from location
– Difficult to earn rebates for new construction
– Difficult to combine efficiency improvements & growth
ENERGY EFFICIENCY TOP 10 LIST
1. Software – Virtualize & Manage Apps
2. Server Hardware – More Efficient, Blades
3. Power Management Features – Enable
4. Decommission Legacy Servers
5. Consolidate & Tier Storage
6. Manage Airflow
7. Manage CRAC Units
8. HVAC Configuration & Component Efficiency
9. UPS Efficiency – Line Interactive Designs
10. UPS Redundant Configurations – Avoid Full 2N
BREAK!!!
– Q&A
– Please be back in 10 minutes
WELCOME BACK!!!
– Quick Review
AVAILABILITY AND RELIABILITY
Look for potential Single Point Failures (SPF)
– “single bullet” theory
– Failures that are more likely to happen
  – UPS battery failure when utility fails
  – Circuit breaker nuisance tripping
– Power supplies on single corded servers
– Any failure from NIC to switch for single attached interfaces
– Application failures (without monitoring)
AVAILABILITY AND RELIABILITY
Start at the critical load and work back upstream
Redundant and robust components & monitoring cost less at the load and deliver higher reliability.
Large systems that are redundant and robust are expensive and may not offer significant improvement.
AVAILABILITY AND RELIABILITY
Start at the critical load and work back upstream
Use dual-corded loads with 100% i