Transcription

Platform Computing
IBM Platform Computing Cloud Service
Ready to use Platform LSF & Symphony clusters in the SoftLayer cloud
February 25, 2014
© 2014 IBM Corporation

Agenda
- Mapping clients' needs to cloud technologies
- Addressing your pain points
- Introducing IBM Platform Computing Cloud Service
- Product features and benefits
- Use cases
- Performance benchmarks

HPC cloud characteristics and economics differ from those of general-purpose computing
- High-end hardware and special-purpose devices (e.g. GPUs) are typically used to supply the needed processing, memory, network, and storage capabilities.
- The performance requirements of technical computing and service-oriented workloads mean that performance may be impacted in a virtualized cloud environment, especially when latency or I/O is a constraint.
- HPC cluster/grid utilization is usually in the 70-90% range, removing a major potential advantage of a public cloud service provider for stable workload volumes.
[Figure: Primary HPC workloads, divided into HPC workloads recommended for private cloud and HPC workloads with best potential for virtualized public and hybrid cloud]

IBM's HPC cloud strategy provides a flexible approach to address a variety of client needs
- Private clouds: Evolve existing infrastructure to HPC cloud to enhance responsiveness, flexibility, and cost effectiveness.
- Hybrid clouds: Enable an integrated approach to improve HPC cost and capability.
- Public clouds: Access additional HPC capacity with a variable cost model.
[Callout on slide: 60%]
Based on HPC cloud's potential impact, organizations are evolving their infrastructures to enable private cloud deployments, exploring hybrid clouds, and considering public clouds.

Are you experiencing any of these pain points?
- Unable to meet business objectives (delay to market, etc.)
- Existing resources insufficient to meet peak compute demand
  - Long run times on existing cluster or grid
  - No access to local technical computing resources (workstation users)
- Technical resources expensive and time consuming to acquire
- The skills and staff to architect and manage a technical computing infrastructure can be difficult to acquire
[Figure: charts labeled Planned Project, Planned Daily Cycle (24 x 365), Financial, and Life Sciences]

IBM Platform Computing Cloud Service
Making the cloud work for you
- Build
  - Complete, ready to run clusters in the cloud
  - Add additional capacity in hours instead of months
- Manage
  - Seamless workload management, on premise and in the cloud
  - Transparent user experience
- Support
  - 24X7 cloud operation support
  - Access to technical computing expertise when you need it
- Protect
  - Data encryption, dedicated physical machines and network
  - Security through physical isolation
Complete, end to end dynamic cloud solution

Ready to use Platform LSF & Platform Symphony clusters in the cloud
[Diagram: solution stack, top to bottom]
- Client and ISV applications
- IBM Platform Computing Cloud Service (SaaS): IBM Platform LSF, IBM Platform Symphony
- SoftLayer, an IBM Company: infrastructure
- 24X7 CloudOps support

Dedicated physical and virtual machine infrastructure as a service
- 13 data centers
- 17 network PoPs
- Global private network
- Bare metal and virtual machines
- 190,000 servers
- 21,000 customers
- 22,000,000 domains

Ready to use Platform LSF & Platform Symphony clusters in the cloud
Differentiators (AWS, RAX, and IBM positioned from the low end to the high end)
- Workload I/O intensity: low intensity workloads to high intensity workloads
- Control (APIs, hardware / network configurability): low degree of control and customization to high degree of control and customization
- Integrated platform of multiple architectures: single platform to seamless integration
IBM advantages
- SoftLayer's architecture outperforms equivalent AWS instances by 50% for high I/O workloads
- SoftLayer offers hundreds of hardware configurations vs. 14 for AWS
- 2,000 APIs for SoftLayer vs. 60 for AWS and none for RAX
- Unified integration & control panel for multiple cloud architectures
- RAX requires paid bridge, different control interfaces

Non-shared physical machines for added security and performance
- Dedicated and isolated compute environment
- All machine instances are dedicated to the client
- Each cluster is isolated on a VLAN
- Only the VPN gateway has an addressable interface
- All customer data at rest is encrypted on shared file systems
- When machine instances are decommissioned, the disks are scrubbed using DoD-approved methods

Optimal performance for technical computing apps
[Charts: EDA Benchmark (IBM-MESA); Industrial Manufacturing Benchmark – Structural Mechanics]
Note: Benchmark results were obtained by IBM and have not yet been externally audited or validated.

Run and supported by a dedicated, 24X7 HPC Cloud Operations Team
CloudOps functions
- Pre-provisioning: Provide guidance to the client on how to enable VPN, multi-cluster settings, and security settings in the client's on-premise environment
- One-time setup testing: Extensive testing of the cluster prior to release to the client
- Extensive testing of the cluster on every flex-up event prior to release to the client
- Email alerts prior to flex-down and cluster shutdown operations
- Email alerts in case of any overage (compute hours, download bandwidth)
- Provide billing details of monthly usage, including overage details
- Provide support under IBM SLA by experts highly experienced in Platform Computing products
Value: quality, peace of mind, and minimum disruption to business
- Extensive quality checks ensure minimum loss of usage hours and minimum disruption
- Proactive alerts ensure that in-progress critical jobs are not killed during flex-downs, cluster shutdowns, and overages
- Highly trained and experienced support ensures smooth on-boarding and minimal disruption

Industry-leading workload management
- 20 years managing distributed scale-out systems with 2000 customers in many industries
- High performance workload management combined with intelligent resource scheduling engine
- Unmatched scalability (small clusters to global grids) and production-proven reliability
- Heterogeneous: manages System x and Power plus 3rd party systems, virtual and bare metal, accelerators / GPU, cloud, etc.
- Shared services for both compute and data intensive workloads
- Integrated solutions with vertical reference architectures
Callouts
- 23 of 30 largest commercial enterprises
- 60% of top financial services companies
- Over 5M CPUs under management

IBM Platform LSF
Overview
Powerful workload management for demanding, distributed and mission-critical high performance computing environments.
Key capabilities
- Powerful
  - Policy and resource-aware scheduling
  - Resource consolidation for optimal performance
  - Advanced self-management
- Flexible
  - Heterogeneous platform support
  - Policy-driven automation
  - CLI, web services, APIs
- Scalable
  - Thousands of concurrent users and jobs
  - Virtualized pool of shared resources
  - Flexible control, multiple policies
Client benefits
- Optimal utilization: reduced infrastructure cost
- Robust capabilities: improved productivity
- High throughput: faster time to results

IBM Platform Symphony
Overview
Low-latency grid management platform for distributed computing and analytics with sophisticated resource sharing.
Key capabilities
- Accelerates service-oriented applications
- Extreme app scalability and throughput with very low latency
- Compute and data-intensive applications on a single platform
- Sophisticated, hierarchical resource sharing
- Open and flexible: choice of OS, frameworks and languages
Client benefits
- Increase performance and analytic result quality
- Reduce IT costs: increase utilization, simplify application onboarding, reduce administration costs
Callouts
- Low latency / high throughput: sub-millisecond, 17,000 tasks per second
- Large scale: 10k cores per application, 40k cores per grid
- Efficient shared services
- Heterogeneous & open: Linux, Windows, AIX, C/C++, C#, Java, Excel, Python, R

Use case 1 – hybrid cluster
The problem
- Existing resources cannot meet peak demand
- Resources are expensive and time consuming to acquire
- Skills to architect and manage clusters are difficult to find
- Fixed or reduced budgets
- On-premise constraints in space, cooling and power
The solution
- Fully functioning IBM Platform LSF or Symphony clusters are provisioned on the SoftLayer cloud and connected to the on-premise cluster, expanding capacity as needed
- Leverage MultiCluster capability for managed forwarding of jobs from the on-premise cluster to the off-premise cluster
The value
- Access to additional compute capacity on a temporary basis as needed
- Near-zero wait times
- Reduce costs by paying for only what is used
- Pay for additional capacity as an operating expense
- Fully supported, end-to-end solution, from the on-premise to the on-cloud clusters
- Expected and reliable performance from running technical computing workloads on physical machines
- Transparent access to cloud resources; the end user experience does not change
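The overflow idea behind the hybrid cluster can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how Platform LSF MultiCluster decides what to forward: the `Job` type, the `place_jobs` function, and the first-fit rule are all hypothetical and exist only to show jobs that do not fit on premise being sent to the cloud cluster.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    slots: int  # cores requested by the job


def place_jobs(jobs, on_premise_free_slots):
    """Fill on-premise slots in submission order; forward whatever does not fit.

    Hypothetical first-fit rule for illustration only; a real scheduler
    applies its own configured policies.
    """
    local, forwarded = [], []
    free = on_premise_free_slots
    for job in jobs:
        if job.slots <= free:
            local.append(job)
            free -= job.slots
        else:
            forwarded.append(job)
    return local, forwarded


# Made-up queue: three jobs competing for 24 free on-premise slots.
queue = [Job("sim-a", 16), Job("sim-b", 32), Job("sim-c", 8)]
local, forwarded = place_jobs(queue, on_premise_free_slots=24)
print([j.name for j in local])      # jobs kept on premise
print([j.name for j in forwarded])  # jobs forwarded to the cloud cluster
```

In this toy run the 32-slot job is the only one forwarded, which mirrors the slide's point that the cloud cluster absorbs demand the on-premise cluster cannot meet.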

Use case 2 – stand-alone cluster in the cloud
The problem
- New and emerging need for technical computing
- Skills to architect and manage clusters are difficult to find
- Resources are expensive and time consuming to acquire
- Inconsistent demand does not justify the investment
The solution
- Fully functioning Platform LSF and Symphony clusters are provisioned on the SoftLayer cloud, providing resources as needed
The value
- Market-leading Platform LSF and Platform Symphony software
- Access to technical computing resources on a temporary basis without the need to acquire, install and configure the infrastructure and cluster software
- Keep costs low by paying for only what is used
- Pay for capacity as an operating expense
- Fully supported solution
- Expected and reliable performance from running workloads on physical machines

Is IBM Platform Computing Cloud Service a good fit for you?
Business pain points
- Are you experiencing lost profit due to missed deadlines?
- Do you experience pressure to convert your compute environment capital expense to operational expense?
- Have you ever missed a deadline or delayed a project because technical computing resource procurement took too long?
Technology pain points
- Do your users ever scale back their analyses to lower fidelity or less accuracy in order to fit them into the local compute environment or a time window?
- Do you regularly, occasionally, or permanently have fewer resources (CPUs, disk, memory, etc.) than you would like to service your users' compute demand?
- Do you experience a large variance in compute resource utilization?
- Have you reached, or will you reach, the capacity of your datacenter(s), and do you need a plan to grow beyond that capacity?
- Are your customers asking you for cloud licenses for Platform LSF or Platform Symphony?

IBM Platform Computing Cloud Service
Making the Cloud Work for You
[Diagram: IBM Hybrid Cloud, on premise and on SmartCloud, powered by Software & Systems. Labels in slide order: Unmatched Capabilities; Cloud Leadership; Policy-driven Workload Management; Expertise from Client Engagements; Unmatched Expertise; Analytics, Technical Computing, Software, Services and ISV Partnerships; Consolidation; Supporting heterogeneous IBM and non-IBM infrastructure]

Thank You

SoftLayer and Amazon EC2 Products
[Table: configuration and hourly rate (USD) for the benchmarked SoftLayer and EC2 machines. Only fragments survive: SoftLayer, physical, 1.85; SoftLayer, virtual, 0.88; virtual, 2.40; CPU column listing Intel Xeon E5 processors, with model numbers and clock speeds lost]

Memory Bandwidth
[Charts: STREAM TRIAD (higher is better) and STREAM Price Performance (higher is better) for SL PM, SL VM, EC2 CCI2, EC2 2XL, and SL PM (ded)]

CPU Performance
[Charts: SuperPI elapsed time (lower is better) and SuperPI Price-Performance, throughput per dollar (higher is better), for SL PM, SL VM, EC2 CCI2, EC2 2XL, and SL PM (ded)]

Network Bandwidth
[Chart: openMPI bandwidth (Mbits/s) versus message size (bytes) for SL VM, EC2 2XL, EC2 CCI2, SL PM, and SL PM (ded)]

Network Latency
[Chart: openMPI latency (lower is better), MPI 2-node runs for SL VM, EC2 2XL, EC2 CCI2, SL PM, and SL PM (ded)]

Input / Output Performance
[Charts: I/O Bandwidth - WRITE and I/O Bandwidth - READ, in kB/sec (higher is better), versus I/O file size (factor of memory size) for SL VM, EC2 2XL, EC2 CCI2, SL PM, and SL PM Ded]

Software Compilation
[Charts: Software Compile Performance, elapsed time in seconds (lower is better), and Software Compile Price-Performance, runs / $ (higher is better), for SL VM, SL PM, EC2 2XL, EC2 CCI, and SL PM Ded]

Life Science (BWA)
Life Sciences Benchmark (BWA): elapsed time in seconds (lower is better)
  SL PM (ded)   20846.481
  SL PM         26509.368
  SL VM         25897.44
  EC2 CCI2      22442.7
  EC2 2XL       37491
Life Sciences Benchmark (BWA) Price Performance: $ / run (lower is better)
  SL PM (ded)   22.21
  SL PM          7.79
  SL VM          6.33
  EC2 CCI2      14.96
  EC2 2XL        6.04
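The price-performance values on this slide appear to follow from elapsed time multiplied by an hourly rate. The sketch below is a minimal check under two assumptions: the rate is prorated by the second, and the surviving rates from the product comparison slide (0.88 USD and 2.40 USD) belong to the SoftLayer virtual machine and the EC2 CCI2 instance respectively. The function name `cost_per_run` is illustrative and does not come from the deck.

```python
def cost_per_run(elapsed_seconds: float, hourly_rate_usd: float) -> float:
    """Cost of one benchmark run, assuming the hourly rate is prorated by the second."""
    return elapsed_seconds / 3600.0 * hourly_rate_usd


# Elapsed times are read from the BWA chart labels; the rate pairing is an assumption.
sl_vm = cost_per_run(25897.44, 0.88)
ec2_cci2 = cost_per_run(22442.7, 2.40)

print(f"SL VM:    {sl_vm:.2f} USD per run")
print(f"EC2 CCI2: {ec2_cci2:.2f} USD per run")
```

Under these assumptions the two results round to 6.33 and 14.96, which match the charted values for SL VM and EC2 CCI2. The remaining three platforms cannot be checked this way because their rates did not survive in the transcription.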

EDA Benchmark (IBM-MESA)
[Charts: EDA - IBM Mesa, elapsed time in seconds (lower is better), and EDA - IBM Mesa - Price-Performance, runs / $ (higher is better), for SL PM (ded), SL PM, SL VM, EC2 2XL, and EC2 CCI2]

Provisioning Time
[Chart: provisioning time in seconds, logarithmic axis (lower is better), for SL PM, SL VM, EC2 CCI2, EC2 2XL, and SL PM Ded]

Industrial Manufacturing – Structural Mechanics
[Charts: speedup (relative to EC2 2XL) versus number of CPUs, in four panels: One Node - S6, One Node - S4D, Two Nodes - S6, and Two Nodes - S4D; series SL PM, EC2 CCI2, SL VM, EC2 2XL, and SL PM (ded)]
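The charts report speedup relative to EC2 2XL. The deck does not spell out the formula; a minimal sketch, assuming the usual definition (baseline elapsed time divided by the measured elapsed time), looks like this. The function name and the timings are hypothetical and are not taken from the slides.

```python
def speedup(baseline_seconds: float, measured_seconds: float) -> float:
    """Speedup of a run relative to a baseline: values above 1.0 mean faster than baseline."""
    if baseline_seconds <= 0 or measured_seconds <= 0:
        raise ValueError("elapsed times must be positive")
    return baseline_seconds / measured_seconds


# Hypothetical timings for illustration only (not benchmark data).
baseline = 1200.0  # seconds on the baseline platform at one CPU count
measured = 300.0   # seconds on another platform at the same CPU count
print(speedup(baseline, measured))
```

By this definition the baseline platform always scores 1.0 against itself, so curves above 1.0 on the charts indicate platforms that finished faster than EC2 2XL.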

Industrial Manufacturing – CFD
[Charts: OpenFoam Speedup Backplane and OpenFoam Speedup Ethernet (higher is better), speedup (relative to EC2 2XL) versus number of cores, for SL PM (ded), SL PM, SL VM, EC2 CCI2, and EC2 2XL]
