
An Oracle White Paper
July 2012

Garmin International Inc.
Oracle Exadata Database Machine
Technical Case Study
Oracle Exadata Database Machine Technical Case Study – Garmin International Inc.

Contents

Executive Overview
Intended Audience
Introduction
Oracle Exadata Database Machine
Garmin Exadata Database Machine Deployment Architecture
Migration
Garmin Production Experience with Exadata
  Scaling Garmin Connect
Garmin Using Exadata Database Machine for Database Consolidation
  Planning for consolidation
  Managing the consolidated deployment
Garmin Exadata Database Machine High Availability Configuration
  Configuration and use of Oracle RAC
  Configuration and use of Oracle Automatic Storage Management (ASM)
  Configuration and use of Data Guard
  Configuration and use of Oracle Recovery Manager (RMAN)
Challenges
Summary
Appendix
  Technical White Papers
  My Oracle Support Notes
  Acronyms
Executive Overview

Garmin designs, manufactures and markets global positioning system (GPS) navigation and communications equipment for the automobile/mobile, outdoor, fitness, marine and aviation markets. It is a leader in every market it serves. For market reach, Garmin has its primary company-owned distribution centers in the United States, the United Kingdom, Australia, and Taiwan. Garmin International is headquartered in Olathe, Kansas, with manufacturing facilities in the US and Taiwan.

Garmin has successfully deployed Oracle Exadata Database Machines for consolidating several mission critical applications, including the E-Business Suite, covering order management, manufacturing and supply chain planning, and financials; and Garmin Connect, a widely successful application with over 1 billion logged miles, where thousands of customers around the world can track and analyze personal workout data. Garmin migrated and consolidated these applications from dedicated Sun SPARC and Intel SMP systems onto Exadata Database Machine. These applications have demonstrated improved performance compared to their previous systems:

• Garmin Connect enjoys sufficient system capacity for a workload that has more than tripled: one year ago, the rate of workout uploads to Garmin Connect was 1 million per week; now that rate is 1 million per day.
• Garmin has enjoyed a 20% to 50% performance gain in their Advanced Supply Chain Planning (ASCP) reporting cycles.
• Garmin’s top 20 critical concurrent batch jobs now run on average 46% faster.
• Month-end related jobs now run on average 67% faster.
• Manufacturing resource planning (MRP) processes run significantly faster. To quote Garmin, “Each day at 19:00 M-F we run the Garmin TSO/PMA Certification Set program. Before Exadata, the program took an average of 5 hours and 2 minutes to run. After Exadata, the program regularly takes 1 hour 25 minutes.
By my calculations, that is a 475% improvement.”

Along with the improved batch performance, online users enjoy consistent response times, and note especially how much faster the cursor moves between fields.
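Improvement percentages depend on the convention used, so it can help to state both the speedup factor and the reduction in elapsed time. The following is a minimal sketch, not part of Garmin's tooling; the helper names `to_seconds`, `speedup`, and `time_reduction_pct` are hypothetical, and the two run times are the ones quoted above for the TSO/PMA Certification Set program.

```python
def to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(before: str, after: str) -> float:
    """How many times faster the job runs (old time divided by new time)."""
    return to_seconds(before) / to_seconds(after)


def time_reduction_pct(before: str, after: str) -> float:
    """Percentage of the original elapsed time that was saved."""
    old, new = to_seconds(before), to_seconds(after)
    return 100.0 * (old - new) / old


# Run times quoted in the text: 5 hours 2 minutes before, 1 hour 25 minutes after.
before, after = "05:02:00", "01:25:00"
print(f"speedup factor: {speedup(before, after):.2f}x")
print(f"elapsed time reduced by: {time_reduction_pct(before, after):.1f}%")
```

The same helpers accept any pair of HH:MM:SS values, so they can also be applied to other before/after run times reported in this paper.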
Garmin has also benefitted from Exadata’s pre-configured Engineered Systems and support model, reflected by their experience of having new Exadata Database Machines installed and ready to load and test within 7 days of arriving on-site.

Garmin has utilized Oracle Maximum Availability Architecture (MAA) best practices for Exadata Database Machine to achieve their objectives for high availability (HA) and data protection. This is critical given the nature of the applications deployed in their Exadata environment, from their customer-facing Garmin Connect web site to their manufacturing and order fulfillment processing.

This technical case study describes Garmin’s implementation of Exadata Database Machine.
Intended Audience

Readers of this paper are assumed to have experience with Oracle Database 11g technologies, familiarity with the Oracle Maximum Availability Architecture (MAA) framework, and a general technical understanding of Oracle Exadata Database Machine. When referenced in this paper, in-depth background on these topics will be deferred, as they are covered in other documentation and technical white papers available on the Oracle Technology Network [1]. This paper will provide configuration details and benefits specific to the deployment being discussed. See the Appendix for a list of recommended technology white papers and acronyms used in this paper.

Introduction

With multiple commercial and retail distribution channels, Garmin needed to meet the demand for their products, especially during the August through December timeframe, the ramp up for the Christmas season.

Garmin faced serious challenges with its previous architecture based upon mixed platforms:

• Legacy systems lacked fault tolerance, leading to difficulty meeting availability SLAs.
• Vertical SMP scaling was not projected to meet cost and performance objectives.
• Mixed platforms created re-work for implementation, leading to longer maintenance windows and higher costs. There was no standardization across legacy systems.
• Systems were expensive to purchase and difficult to deploy: long lead times and significant effort were required to acquire, build up, integrate, and deploy new systems.
• Hardware and software components from multiple vendors were complex to support. No one supplier was accountable at a system level.

Oracle Exadata Database Machine

Garmin needed to reduce its existing and long term costs based on growth of ERP and web applications, while improving quality of service for its mission critical applications, improving performance, scalability, and availability for both back-office and cloud-based processing.

[1] exadata/index.html
Garmin chose Exadata Database Machine due to its superior performance and availability, and because it could fundamentally change their strategic IT focus away from building systems, to developing, consolidating, and supporting application services.

Using an Exadata system resulted in Garmin realizing the following advantages relative to their previous environment:

• Cost-effective horizontal scaling to achieve performance SLAs
• Integrated high availability features allowing availability SLAs to be met
• Cost reduction, both by consolidating workloads to fewer servers, and by migrating to the Engineered System solution.

Garmin Exadata Database Machine Deployment Architecture

At present, Garmin has two Exadata Database Machines. One hosts five production databases. The other hosts physical standby databases for their mission-critical applications, as well as test, development, and quality assurance databases for the five production environments. Production applications’ databases deployed on Exadata Database Machine include:

• Garmin Connect, a customer-facing personal fitness web application with extreme availability requirements and rapidly growing disk requirements
• Orbit, an implementation of Oracle E-Business Suite 11i for order fulfillment, manufacturing, inventory and warehouse management, and financials
• PLAN, an implementation of Oracle E-Business Suite 11i and Demantra for Oracle Advanced Supply Chain Planning (ASCP)
• Hyperion 11.1.2, for analytic reporting during period-end close
• RubyTW, a custom Quest Shareplex application used for managing manufacturing processes for the Taiwan manufacturing facility

To support these applications, Garmin installed an Exadata Database Machine V2 Half Rack with high performance disk in production, and an Exadata Database Machine V2 Half Rack with high capacity disk for their physical standby, QA, test, and development environments.
They expanded the physical memory of all Exadata Database Machine compute nodes to 144 GB using the available memory expansion kit.

Garmin’s Connect application contains fitness and workout data along with waypoint data for distance traveled. The waypoints are stored as BLOBs in the Connect database. The popularity of this application is resulting in rapidly growing storage requirements. As the waypoint data ages, it is not accessed as often but must still be accessible. Garmin has implemented an Information Lifecycle Management (ILM) program, adding two stand-alone high-capacity Exadata Storage Servers configured as a separate ASM disk group for holding this aged data. This allows the high performance
disk to be used by the frequently-accessed data of all the production environments, while maintaining online access to the bulky, infrequently accessed waypoint data.

To accommodate the physical standby databases along with test and development databases, the half rack TEST Exadata Database Machine is configured with seven high capacity Exadata Storage Servers. Each storage server has twelve 2 TB High Capacity SAS disks. Garmin has chosen to run their test, development and project application databases on the TEST database machine that also hosts the physical standby databases of their production databases. Should any or all of the primary production databases be lost, they will fail over to their respective standby databases. At that time, Garmin will shut down any non-production application databases.

The two Exadata Database Machines are located in two separate buildings at Garmin headquarters. Each database machine is a separate isolated cluster, and the two are connected with a custom built high speed 10GE network. This custom network utilizes two dedicated Linux servers, each connecting to one of the two database machines via InfiniBand. These two Linux servers handle routing the traffic between the InfiniBand network and the custom 10GE network that runs between the two buildings. This allows for high transfer throughput for redo and archive logs to the physical standby databases, enabling them to stay in sync should the primary database machine fail or be down for maintenance.

All application and mid-tier components connect over gigabit Ethernet to Exadata. This includes the 11i E-Business Suite concurrent managers for Orbit and PLAN. As Parallel Concurrent Processing is not implemented at this time, the concurrent manager is running on node 4 of their production Exadata Database Machine.

FIGURE 1 – EXADATA DATABASE MACHINE DEPLOYMENT ARCHITECTURE
Migration

Prior to migrating to Exadata Database Machine, Garmin’s application databases were deployed on large dedicated SMP servers, spanning a mix of Sun Solaris and Linux platforms using SAN storage. These various systems hosted mission-critical applications for customer web presence (Garmin Connect), order fulfillment, warehouse management, manufacturing, and financials. All of these applications were running on Oracle Database 10g Release 2.

It was important to migrate Garmin Connect to Exadata Database Machine with as little downtime as possible. To accomplish this, Garmin upgraded the Connect database from 10g Release 2 to 11g Release 2 (version 11.2.0.1) while it was still on their legacy system. They instantiated a physical standby on Exadata Database Machine, then performed a planned switchover to the physical standby database. This strategy enabled Garmin to migrate Garmin Connect to Exadata Database Machine with only minutes of downtime. Garmin followed best practices for Data Guard configuration found in the MAA paper, “Oracle Data Guard: Disaster Recovery for Exadata Database Machine” [2].

All other databases were migrated from their legacy environments to Oracle databases running on Exadata Storage Servers using Oracle Data Pump. The E-Business Suite application databases for Orbit and PLAN were upgraded from 10g Release 2 to 11g Release 2 (11.2.0.1) as part of the migration, guided by the MAA best practice paper, “Migrating Oracle E-Business Suite to Exadata Database Machine Using Oracle Data Pump.”

[3] ilability/maa-ebs-dbm-datapump-167285.pdf
Garmin Production Experience with Exadata

Prior to migrating to Exadata Database Machine, the following points were observed by Garmin:

• Garmin Connect was near maximum CPU capacity running on a single large SMP server.
• Garmin Connect was near maximum capacity for I/O (IOPS and throughput).
• Orbit was operating at high CPU usage and was nearing its maximum capacity.
• Orbit had reached its I/O capacity (IOPS and throughput).
• Weekly production backups led to noticeably poor performance every Tuesday.

On Exadata Database Machine, where Garmin Connect and Orbit are both running:

• Peak CPU usage across all nodes on the cluster was observed to be 50%.
• I/O utilization is around 15%, with peaks up to 30%.
• Flash cache on Exadata Database Machine satisfies 93% of all physical I/O requests for Garmin Connect.
• Flash cache on Exadata Database Machine satisfies overall 70% of all physical I/O requests for Orbit.
• No more “Terrible Tuesday”: production backups are offloaded to the physical standby databases.

Performance is also improved significantly in the following areas:

• Online users of OLTP E-Business Suite 11i applications experience consistent and stable performance.
• E-Business Suite Advanced Supply Chain Planning (ASCP) batch runs now complete 20% to 50% faster.
• Demantra Shipping and Booking History processes run 30% faster.
• Demantra ASCP upload processes now complete 60% faster.
• Several of the month-end batch jobs now run 67% faster.
The following table lists a few specific examples of performance improvements for critical business processes. The improvement is expressed as a factor: how many times faster the process now runs.

TABLE 1. BUSINESS PROCESS PERFORMANCE COMPARISON

PROCESS NAME                         PRE-EXADATA TIME   POST-EXADATA TIME   BY X TIMES FASTER
                                     (HH:MM:SS)         (HH:MM:SS)
Critical Concurrent Jobs
  Communication for Receiving Event  00:35:54           00:00:20            107
  Communication for Shipping Event   00:35:27           00:00:21            107
  Report Set                         01:11:43           00:00:44            96
Month End Processing
  Use Tax Liability Report           00:03:33           00:00:30            7
  Aging - 4 Buckets Report           00:35:15           00:03:02            11
  Detail Invoice Ship-To Report      07:31:53           03:24:01            2

• Substantial cost savings from consolidation: Garmin has now replaced 12 dedicated production servers by moving to Exadata Database Machine.
• Prior to Exadata, Garmin had no fault tolerant solutions in place for any of the applications. They now have an architecture in place to meet their SLA targets of 99.5% uptime for Orbit and PLAN (considering unplanned outages), and 99.5% uptime for Connect.
  o Orbit and PLAN E-Business Suite applications are now deployed on Oracle Real Application Clusters (Oracle RAC), increasing their availability compared to their non-Oracle RAC single SMP server.
  o Orbit and Connect both have reduced downtime for some planned maintenance activities by using Oracle RAC and Oracle Data Guard.

Scaling Garmin Connect

Garmin personal fitness devices allow their customers to record and analyze their performance during workouts, monitoring workout characteristics such as heart rate, distance, time, pace, elevation, etc., while recording waypoints marking the geographic location of the workout. Customers can then upload their workout files to Garmin Connect, where the information is available for personal reporting and analysis as well as exploring, e.g., searching for activities and courses uploaded by other users or
searching by location. The Garmin Connect application has seen significant demand and growth, evidenced by large increases both in the number of users registered and the number of workouts uploaded by users.

Garmin Connect now benefits from the Exadata architecture by scaling CPU and memory with Oracle RAC and faster I/O with high bandwidth InfiniBand, making use of all available disks as well as the faster I/O provided by flash cache, reducing I/O latency for random reads.

One important and frequently used feature in Connect is the Connect Explore function, which provides functionality for searching activities and courses from other users as well as keyword searches of specific locations. Exercising this functionality results in a database query using GPS coordinates that must be serviced with low response times. To reduce the query response time, the table being queried was pinned into the Exadata Storage Server flash cache.

Prior to migrating to Exadata Database Machine, the Garmin Connect application experienced poor, erratic performance. After the migration, not only are they meeting their performance expectations, they are also able to support more than 2 times the number of users and achieve much higher transaction throughput. For example, prior to Exadata, Garmin Connect was peaking at 200,000 page visits per hour. After moving to Exadata, the Connect usage has grown to well over 750,000 page visits per hour with much more system capacity still available.

But even better performance could be gained. The Garmin Connect application is relatively COMMIT intensive, with a rate averaging above 85 user commits per second. With Oracle Database version 11.2.0.1, the version available when the application was first migrated, Garmin Connect experienced the database wait event “log file sync” averaging around 17ms per commit.
It was also observed that this wait event had other downstream impact, such as causing buffer busy waits, leading to some performance impact.

Smart Flash Logging is a feature introduced in the Exadata storage software release 11.2.2.4.0, which can be utilized with database version 11.2.0.2 and higher. Smart Flash Logging reduces redo log I/O write latency by performing parallel log writes to both the flash log (a small portion of the Smart Flash Cache) and physical disk storage. The write to the flash log typically completes first, at which time control is returned to the application.

Garmin has since upgraded the Garmin Connect database to 11.2.0.3 and now takes advantage of Exadata Smart Flash Logging. Once Garmin Connect was upgraded to 11.2.0.3, log file sync went from an average of 17ms to 1.92ms per user commit due to Smart Flash Logging. The buffer busy wait event disappeared from the top wait events, allowing the application to scale even further.

One of the Garmin Connect DBAs noted, “Connect wouldn’t be running today if it weren’t for Exadata.”
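The "issue both writes, acknowledge on the first completion" behavior described in this section can be illustrated with a small simulation. This is a sketch of the general idea only, not Exadata's implementation: the function names are hypothetical, and the latencies are stand-in values passed by the caller rather than measurements.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def simulated_write(target: str, latency_s: float) -> str:
    """Pretend to write a redo record to `target`, taking `latency_s` seconds."""
    time.sleep(latency_s)
    return target


def first_acknowledged(flash_latency_s: float, disk_latency_s: float) -> str:
    """Issue both writes in parallel and return whichever target finishes first."""
    pool = ThreadPoolExecutor(max_workers=2)
    futures = [
        pool.submit(simulated_write, "flash", flash_latency_s),
        pool.submit(simulated_write, "disk", disk_latency_s),
    ]
    # Control returns to the caller as soon as either write has completed.
    done, _pending = wait(futures, return_when=FIRST_COMPLETED)
    winner = next(iter(done)).result()
    # Do not block on the slower write; it finishes in the background.
    pool.shutdown(wait=False)
    return winner


if __name__ == "__main__":
    # Illustrative latencies only: a fast flash write and a slower disk write.
    print(first_acknowledged(flash_latency_s=0.02, disk_latency_s=0.2))
```

Because the caller waits only for the faster of the two writes, an occasional slow write on one device does not delay the commit, which is consistent with the drop in "log file sync" time reported above.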
Garmin Using Exadata Database Machine for Database Consolidation

Consolidation of existing application databases involves two high-level tasks to be completed for a successful deployment:

• Planning for consolidation
• Managing the consolidated deployment

Planning for consolidation

When planning for consolidation, Garmin considered the following:

• High Availability SLA
• Planned maintenance windows
• Storage usage and expected growth
• Statistics on CPU, memory and I/O usage
• Workload behavior such as peak utilization and processing calendar (e.g., batch jobs for daily planning cycles, month-end close, and other required business processes)
• Workload growth (new users, increase in transaction volumes, etc.)

As mentioned earlier, Garmin has successfully consolidated five application databases onto Exadata Database Machine. Garmin performed assessments for each of these application databases to determine their resource needs (CPU, memory, I/O) as they vary over the processing calendar, along with their storage requirements. Expected growth was also factored into the planning process for future resource needs and storage growth.

Managing the consolidated deployment

Garmin utilized a number of strategies for managing their application databases consolidated on Exadata Database Machine: load distribution across Oracle RAC nodes, configuration adjustments, various load management tools, and the Oracle Enterprise Manager suite.

Based on CPU and memory requirements, Garmin has decided where specific application database instances are to be placed on their production half rack Exadata Database Machine:

• Garmin Connect instances running on nodes 1 and 2
• Orbit instances running on all 4 nodes with the concurrent manager assigned to node 4
• PLAN instances running on nodes 3 and 4 with the concurrent manager assigned to node 3
• Hyperion instances running on nodes 3 and 4
• RubyTW instances running on nodes 1 and 2

Note that all databases are configured with Oracle RAC but not all are configured with Oracle Data Guard. See the following illustration for a graphical view of the layout.

FIGURE 2 – GARMIN PRODUCTION CONSOLIDATE LAYOUT

To use memory efficiently and avoid swapping, Garmin configured Linux hugepages on all compute nodes. Hugepages is a Linux kernel feature, enabled through a kernel parameter, that allows all processes that access the Oracle instance SGA to share the page tables. Linux hugepages are configured during installation, but the ideal setting will vary based on application database requirements. For more information on configuring Linux hugepages, see My Oracle Support note 361323.1. For a script that will compute the recommended hugepages setting, see My Oracle Support note 401749.1.

In addition to spreading instances and workload tasks across different Exadata Database Machine compute nodes, Garmin is using these tools to manage resources:
• Instance caging is being tested on Orbit to ensure sufficient CPU resources are available for Garmin Connect on node 1. For more information on instance caging, see the "Database Instance Caging: A Simple Approach to Server Consolidation" white paper [4].
• Exadata I/O Resource Manager (IORM), which is only available on Exadata, is used to manage and prioritize I/O bandwidth for some reports that are run from Orbit.
• Garmin is in the process of implementing Database Resource Manager (DBRM) for managing both CPU and I/O resources for specific database services for each of their applications.

As application workload resource needs grow or change, and new applications are added, these tools are essential to managing the array of applications running at Garmin.

In addition, Oracle Enterprise Manager Grid Control is used to monitor and manage Garmin’s Exadata systems. Oracle Enterprise Manager provides a complete view of Exadata, including:

• Oracle Database
• Database services
• Grid infrastructure: ASM storage manager and Oracle Clusterware
• Exadata storage cell
• All hardware components and operating environments present in the physical rack

To maintain overall health of the system, Garmin makes extensive use of the health check tool, exachk, to automate the collection and analysis of data regarding key software, hardware, and firmware versions and configuration best practices specific to Exadata Database Machine. The exachk output provides an extensive review and crosscheck of an Exadata Database Machine installation, checking patch levels, reporting network or system faults, and validating a wide variety of configuration best practices. Garmin has followed the recommendations from the output of the exachk tool to ensure proper system configuration and smooth operations.
For more information on exachk, refer to My Oracle Support note 1070954.1.

All test and development databases that support production, as well as the physical standby databases, have been consolidated onto their second (TEST/DEV) Exadata Database Machine Half Rack system.

Garmin Exadata Database Machine High Availability Configuration

One major benefit to Garmin’s migration to Exadata Database Machine has been the availability of a robust toolkit to provide availability across a variety of scenarios for their mission critical

[4] 166854.pdf
The following sections discuss implementation details for the major high availability components deployed in Garmin’s Exadata Database Machine environments.

Configuration and use of Oracle RAC

All Garmin production databases on Exadata Database Machine have been deployed using Oracle RAC. Oracle RAC active-active clustering enables the cost-effective scale-out architecture and high availability sought by Garmin.

Garmin’s applications connect to Oracle using database services. Each database service is available on as many Oracle RAC nodes as needed to meet performance SLAs. Connect-time load balancing ensures equal utilization of all nodes hosting a particular service.

Some of Garmin’s applications connect using Single Client Access Name (SCAN), which provides a level of abstraction between application clients and the physical configuration of the Oracle RAC database. For example, the Garmin Connect application uses SCAN to provide seamless connections across Oracle RAC instances, and to automatically reconnect should an Oracle RAC instance fail. Garmin has tested the reconnection time for Garmin Connect on Oracle RAC should one of the Oracle RAC instances fail, and has found that the reconnect time is 30 seconds. For more information on SCAN, see the overview white paper: Single Client Access Name (SCAN) [5].

Oracle RAC also provides Garmin high availability during many types of planned maintenance. Oracle Grid Infrastructure and Oracle RAC rolling upgrade best practices enable certain maintenance (hardware, firmware and OS maintenance, one-off patches, critical patch updates) to be performed on one node at a time in a rolling fashion, while database services remain available on all other nodes of the Oracle RAC cluster.
Here is how Garmin maintains availability during planned maintenance:

  o Disable the database service on node 1 and allow traffic to continue on node 2.
  o Once client traffic has been migrated from node 1 and new client requests are redirected elsewhere, its database home can be patched.
  o After patching, the instance is brought back up along with its services, and traffic is routed back to the patched node.
  o This process is repeated for the remaining nodes.

For more information on performing Grid Infrastructure and Oracle RAC patching in a rolling fashion, please see My Oracle Support note

[5] se/clustering/overview/scan-129069.pdf
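The rolling maintenance steps listed above can be sketched as a simple loop. This is an illustrative model only, not Garmin's procedure or an Oracle utility: `rolling_patch` and the `patch_node` callback are hypothetical names, and the real service relocation and patching commands are deliberately left out.

```python
from typing import Callable, List


def rolling_patch(nodes: List[str], patch_node: Callable[[str], None]) -> List[str]:
    """Patch nodes one at a time while the service stays up on the others.

    `patch_node` is a caller-supplied placeholder for the real patching work.
    Returns the nodes in the order they were patched.
    """
    serving = set(nodes)   # nodes currently offering the database service
    patched = []
    for node in nodes:
        # Step 1: disable the service on this node so clients use the others.
        serving.discard(node)
        if not serving:
            raise RuntimeError("rolling patch needs at least one other serving node")
        # Step 2: with traffic drained, patch this node's database home.
        patch_node(node)
        # Step 3: restart the instance and its services, then route traffic back.
        serving.add(node)
        patched.append(node)
        # Step 4: the loop repeats the process for the remaining nodes.
    return patched


if __name__ == "__main__":
    rolling_patch(["node1", "node2"], lambda node: print(f"patching {node}"))
```

The check inside the loop captures the key property of the approach: a node is only taken out of service while at least one other node continues to serve clients.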
Configuration and use of Oracle Automatic Storage Management (ASM)

Garmin utilizes ASM, the integrated file system and volume manager used with Exadata Database Machine, to implement a cost-effective high-performance storage grid. Garmin has followed MAA Exadata Database Machine best practices for ASM.

• Garmin defined their DATA disk groups (user data) first when configuring their disk farm, which places these disk groups on the outside edges of the disks for optimal performance. The RECO disk groups (recovery data) were defined next, placing them more to the inside of the disks. Finally, the Database File System disk group (DBFS) was defined, placing it on the innermost portion of each disk.
• Garmin uses a 75/25 split of storage capacity between DATA and RECO disk groups respectively.
• Garmin has followed the MAA best practice to have all ASM disk groups span all storage cells to achieve the highest I/O performance for all applications. This does not include the two stand-alone Exadata storage cells used for the less active data for Garmin Connect; here, the disk group is defined across all the high capacity storage cells provisioned for the historical data.
• Exadata storage cells are patched in a rolling fashion (one cell at a time). Applying the patches on their TEST system enables all changes to be tested on the standby/QA/dev/test machine before applying them to the production system. Rolling them across the TEST database machine also provides a good test of the process while allowing some availability to be maintained on the dev, test, and QA environments.

Garmin currently implements ASM normal redundancy (dual mirroring) for both DAT