
DREAM: Dynamic Resource Allocation for Software-defined Measurement

Masoud Moshref† Minlan Yu† Ramesh Govindan† Amin Vahdat‡
† University of Southern California ‡ Google and UC San Diego

ABSTRACT

Software-defined networks can enable a variety of concurrent, dynamically instantiated measurement tasks that provide fine-grained visibility into network traffic. Recently, there have been many proposals to configure TCAM counters in hardware switches to monitor traffic. However, the TCAM memory at switches is fundamentally limited, and the accuracy of the measurement tasks is a function of the resources devoted to them on each switch. This paper describes an adaptive measurement framework, called DREAM, that dynamically adjusts the resources devoted to each measurement task, while ensuring a user-specified level of accuracy. Since the trade-off between resource usage and accuracy can depend upon the type of tasks, their parameters, and traffic characteristics, DREAM does not assume an a priori characterization of this trade-off, but instead dynamically searches for a resource allocation that is sufficient to achieve a desired level of accuracy. A prototype implementation and simulations with three network-wide measurement tasks (heavy hitter, hierarchical heavy hitter and change detection) and diverse traffic show that DREAM can support more concurrent tasks with higher accuracy than several other alternatives.

Categories and Subject Descriptors

C.2.0 [Computer-Communication Networks]; C.2.3 [Network Operations]: Network monitoring; C.2.4 [Distributed Systems]: Network operating systems

Keywords

Software-defined Measurement; Resource Allocation

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
SIGCOMM'14, August 17–22, 2014, Chicago, Illinois, USA.
Copyright 2014 ACM 978-1-4503-2836-4/14/08 ...$15.00.
http://dx.doi.org/10.1145/2619239.2626291

1. INTRODUCTION

Today's data center and enterprise networks require expensive capital investments, yet provide surprisingly little visibility into traffic. Traffic measurement can play an important role in these networks, by permitting traffic accounting, traffic engineering, load balancing, and performance diagnosis [7, 11, 13, 19, 8], all of which rely on accurate and timely measurement of time-varying traffic at all switches in the network. Beyond that, tenant services in a multi-tenant cloud may need accurate statistics of their traffic, which requires collecting this information at all relevant switches.

Software-defined measurement [39, 25, 31] has the potential to enable concurrent, dynamically instantiated measurement tasks. In this approach, an SDN controller orchestrates these measurement tasks at multiple spatial and temporal scales, based on a global view of the network. Examples of measurement tasks include identifying flows exceeding a given threshold and flows whose volume changes significantly. In a cloud setting, each tenant can issue distinct measurement tasks.
Some cloud services have a large number of tenants [1], and cloud providers already offer simple per-tenant measurement services [2].

Unlike prior work [39, 35, 31, 29], which has either assumed specialized hardware support on switches for measurement, or has explored software-defined measurement on hypervisors, our paper focuses on TCAM-based measurement in switches. TCAM-based measurement algorithms can be used to detect heavy hitters and significant changes [31, 41, 26]. These algorithms can leverage existing TCAM hardware on switches and so have the advantage of immediate deployability. However, to be practical, we must address a critical challenge: TCAM resources on switches are fundamentally limited for power and cost reasons. Unfortunately, measurement tasks may require multiple TCAM counters, and the number of allocated counters can determine the accuracy of these tasks. Furthermore, the resources required for accurate measurement may change with traffic, and tasks may require TCAM counters allocated on multiple switches.

Contributions. In this paper, we discuss the design of a system for TCAM-based software-defined measurement, called DREAM. Users of DREAM can dynamically instantiate multiple concurrent measurement tasks (such as heavy hitter or hierarchical heavy hitter detection, or change detection) at an SDN controller, and additionally specify a flow filter (e.g., defined over 5-tuples) over which this measurement task is executed. Since the traffic for each task may need to be measured at multiple switches, DREAM needs to allocate switch resources to each task.

To do this, DREAM first leverages two important observations. First, although tasks become more accurate with more TCAM resources, there is a point of diminishing returns: beyond a certain accuracy, significantly more resources are necessary to increase the accuracy of the task. Moreover, beyond this point, the quality of the retrieved results (say, heavy hitters) is marginal (as we quantify later).
This suggests that it would be acceptable to maintain the accuracy of measurement tasks above a high (but below 100%) user-specified accuracy bound. Second, tasks need TCAM resources only on switches at which there is traffic that matches the specified flow filter, and the number of resources required depends upon the traffic volume and its distribution. This suggests that allocating just enough resources to tasks at switches and over time might provide spatial and temporal statistical multiplexing benefits.

DREAM uses both of these observations to permit more concurrent tasks than is possible with a static allocation of TCAM resources. To do this, DREAM needs to estimate the TCAM resources required to achieve the desired accuracy bound. Unfortunately, the relationship between resources and accuracy for measurement tasks cannot be characterized a priori, because it depends upon the traffic characteristics. If this relationship could be characterized, an optimization-based approach would have worked. Instead, DREAM contains a novel resource adaptation strategy for determining the right set of resources assigned to each task at each switch. This requires measurement algorithm-specific estimation of task accuracy, for which we have designed accuracy estimators for several common measurement algorithms. Using these, DREAM increases the resources allocated to a task at a switch when its global estimated accuracy is below the accuracy bound and its accuracy at the switch is also below the accuracy bound. In this manner, DREAM decouples resource allocation, which is performed locally, from accuracy estimation, which is performed globally. DREAM continuously adapts the resources allocated to tasks, since a task's accuracy and resource requirements can change with traffic. Finally, if DREAM is unable to get enough resources for a task to satisfy its accuracy bound, it drops the task.

DREAM is at a novel point in the design space: it permits multiple concurrent measurements without compromising their accuracy, and effectively maximizes resource usage.
We demonstrate through extensive experiments on a DREAM prototype (in which multiple concurrent tasks of three different types are executed) that it performs significantly better than other alternatives, especially at the tail of important performance metrics, and that these performance advantages carry over to larger scales evaluated through simulation. DREAM's satisfaction metric (the fraction of a task's lifetime that its accuracy is above the bound) is 2× better at the tail for moderate loads than an approach which allocates an equal share of resources to tasks: in DREAM, almost 95% of tasks have a satisfaction higher than 80%, but for equal allocation, 5% have a satisfaction less than 40%. At high loads, DREAM's average satisfaction is nearly 3× that of equal allocation. Some of these relative performance advantages also apply to an approach which allocates a fixed amount of resources to each task, but drops tasks that cannot be satisfied. However, this fixed allocation rejects an unacceptably high fraction of tasks: even at low load, it rejects 30% of tasks, while DREAM rejects none. Finally, these performance differences persist across a broad range of parameter settings.

2. RESOURCE-CONSTRAINED SOFTWARE-DEFINED MEASUREMENT

In this section, we motivate the fundamental challenges for real-time visibility into traffic in enterprise and data center networks. Software-defined Measurement (SDM) provides this capability by permitting a large number of dynamically instantiated, network-wide measurement tasks. These tasks often leverage flow-based counters in TCAM in OpenFlow switches. Unfortunately, the number of TCAM entries is often limited. To make SDM more practical, we propose to dynamically allocate measurement resources to tasks, by leveraging the diminishing returns in the accuracy of each task, and temporal/spatial resource multiplexing across tasks.

2.1 TCAM-based Measurement

In this paper, we focus on TCAM-based measurement tasks on hardware switches.
Other work has proposed more advanced measurement primitives like sketches [39], which are currently not available in commercial hardware switches, and it is unclear when they will be available. For this reason, our paper explicitly focuses on TCAM-based measurement, but many of the techniques proposed in this paper can be extended to sketch-based measurement (we leave such extensions to future work). In more constrained environments like data centers, it may be possible to perform measurement in software switches or hypervisors (possibly even using sketches), but this approach (a) can be compromised by malicious code on end-hosts, even in data-center settings [37, 9], (b) does not generalize to wide-area deployments of SDN [24], and (c) introduces additional constraints (like hypervisor CPU usage) [22].

Figure 1: TCAM-based task example

To understand how TCAM memory can be effectively used for measurement, consider the heavy hitter detection algorithm proposed in [26]. The key idea behind this (and other TCAM-based algorithms) is, in the absence of enough TCAM entries to monitor every flow in a switch, to selectively monitor prefixes and drill down on prefixes likely to contain heavy hitters. Figure 1 shows a two-bit prefix trie, part of the source IP prefix trie of a task that finds heavy hitter source IPs (IPs sending more than, say, 10Mbps in a measurement epoch). The number inside each node is the volume of traffic from the corresponding prefix based on the "current" set of monitored prefixes. The task reports source IPs (leaves) with volume greater than the threshold.

If the task cannot monitor every source IP in the network because of limited TCAM counters, it only monitors a subset of leaves, trading off some accuracy. It also measures a few internal nodes (IP prefixes) to guide which leaves to monitor next to maximize accuracy.
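One step of this drill-down can be sketched in a few lines. The sketch below is a simplified illustration, not the exact algorithm of [26]: prefixes are (value, length) pairs over a toy 2-bit field, the `counts` dictionary stands in for the TCAM counters read at the end of an epoch, and the merge rule is deliberately minimal.

```python
# Sketch of one epoch of TCAM-based heavy hitter drill-down (a simplified
# illustration, not the exact algorithm of [26]). A prefix is (value, length)
# over a WIDTH-bit field; `counts` holds this epoch's volume per monitored prefix.
THRESHOLD = 10  # volume that makes a leaf a heavy hitter
WIDTH = 2       # toy 2-bit field, as in Figure 1

def drill_down_step(counts, budget):
    """Report heavy leaves, then pick next epoch's monitored prefixes:
    expand heavy internal prefixes, and merge cold prefixes to stay in budget."""
    reports = [p for p, c in counts.items() if p[1] == WIDTH and c >= THRESHOLD]
    nxt = []
    for (v, l), c in counts.items():
        if l < WIDTH and c >= THRESHOLD:
            nxt += [(2 * v, l + 1), (2 * v + 1, l + 1)]  # drill down one level
        else:
            nxt.append((v, l))
    # Over budget: merge the coldest measured prefix and its sibling into
    # their parent (assumes the sibling is also monitored, as in Figure 1).
    while len(nxt) > budget:
        v, l = min((p for p in nxt if p in counts and p[1] > 0),
                   key=lambda p: counts[p])
        nxt.remove((v, l))
        if (v ^ 1, l) in nxt:
            nxt.remove((v ^ 1, l))
        nxt.append((v // 2, l - 1))
    return reports, nxt
```

On the Figure 1 scenario (monitoring 11, 10 and 0* with a heavy 0*), this reproduces the move described next in the text: expand 0* into 00 and 01, and merge 10 and 11 back into 1* to respect the 3-counter budget.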
For example, in Figure 1, suppose the task is only allowed to use 3 TCAM counters. It first decides to monitor 11, 10 and 0*. As prefix 0* sends a large amount of traffic, the task decides to drill down under prefix 0* in the next epoch to find heavy hitters, hoping that they will remain active then. However, to respect the resource constraint (3 TCAM counters), it must free a counter in the other sub-tree by monitoring prefix 1* instead of 10 and 11.

2.2 Task Diversity and Resource Limitations

While the previous sub-section described a way to measure heavy hitters at a single switch, the focus of our work is to design an SDM system that (a) permits multiple types of TCAM-based measurement tasks across multiple switches that may each contend for TCAM memory, and (b) adapts the resources required for concurrent tasks without significantly sacrificing accuracy.

SDM needs to support a large number of concurrent tasks, and dynamic instantiation of measurement tasks. In an SDN-capable WAN, network operators may wish to track traffic anomalies (heavy hitters, significant changes), and simultaneously find large flows to effect preferential routing [7], and may perform each of these tasks on different traffic aggregates. Operators may also instantiate tasks dynamically to drill down into anomalous traffic aggregates. In an SDN-capable multi-tenant data center, individual tenants might each wish to instantiate multiple measurement tasks. Modern cloud services have a large number of tenants; for example, 3 million domains used AWS in 2013 [1]. Per-tenant measurement services are already available — Amazon CloudWatch provides tenant operators very simple network usage counters per VM [2]. In the future, we anticipate tenants instantiating many measurement tasks to achieve distinct goals such as DDoS detection or better bandwidth provisioning [10, 38].

Each measurement task may need hundreds of TCAM entries for sufficient accuracy [31, 26, 41], but typical hardware switches have only a limited number of TCAM entries. There are only 1k-2k TCAM entries in switches [19, 23], and this number is not expected to increase dramatically for commodity switches because of their cost and power usage. Moreover, other management tasks such as routing and access control need TCAM entries, and this can leave fewer entries for measurement.

2.3 Dynamic Resource Allocation for SDM

Given limited resources and the need to support concurrent measurement tasks, it is important to efficiently allocate TCAM resources for measurement.

Leverage: Diminishing returns in accuracy for measurement. The accuracy of a measurement task depends on the resources allocated to it [31, 39]. For example, for heavy hitter (HH) detection, recall, the fraction of true HHs that are detected, is a measure of accuracy. Figure 2 shows the result of our HH detection algorithm on a CAIDA traffic trace [3] with a threshold of 8 Mbps (see Section 5 for implementation details).

The figure shows that more counters lead to higher recall. For example, doubling the counters from 512 to 1024 increases recall from 60% to 80% (Figure 2(a)). There is a point of diminishing returns for many measurement tasks [17, 30, 28, 27] where additional resource investment does not lead to proportional accuracy improvement. The accuracy gain becomes smaller as we double the resources; recall only improves from 82% to 92% when doubling the number of counters from 1024 to 2048, and even 8K counters are insufficient to achieve an accuracy of 99%.
Furthermore, the precise point of diminishing returns depends on the task type, its parameters (e.g., the heavy hitter threshold) and traffic [31].

Another important aspect of the relationship between accuracy and resource usage of TCAM-based algorithms is that, beyond the point of diminishing returns, additional resources yield less significant outcomes, on average. For example, the heavy hitters detected with additional resources are intuitively "less important" or "smaller" heavy hitters, and the changes detected by a change detection algorithm are smaller, by nearly a factor of 2 on average (we have empirically confirmed this).

This observation is at the core of our approach: we assert that network operators will be satisfied with operating these measurement tasks at, or slightly above, the point of diminishing returns, in exchange for being able to concurrently execute more measurement tasks.¹ At a high level, our approach permits operators to dynamically instantiate three distinct kinds of measurement tasks (discussed later) and to specify a target accuracy for each task. It then allocates TCAM counters to these tasks to enable them to achieve the specified accuracy, and adapts TCAM allocations as tasks leave or enter or as traffic changes. Finally, our approach performs admission control, because the accuracy bound is inelastic and admitting too many tasks can leave each task with fewer resources than necessary to achieve the target accuracy.

Leverage: Temporal and Spatial Resource Multiplexing. The TCAM resources required for a task depend on the properties of the monitored traffic. For example, as the number of heavy hitters increases, we need more resources to detect them.
This presents an opportunity to statistically multiplex TCAM resources across tasks on a single switch: while a heavy hitter task on a switch may see many heavy hitters at a given time, a concurrent change detection task may see fewer anomalies at the same instant, and so may need fewer resources. This dependence of TCAM resources on traffic is shown in Figure 2(a), where the recall of the HH detection task with 256 entries decreases in the presence of more HHs, and we need more resources to keep its recall above 50%. If we allocate fixed resources to each task, we would either over-provision the resource usage and support fewer tasks, or under-provision the resource usage and obtain low accuracy.

Measurement tasks also permit spatial statistical multiplexing, since a task may need resources from multiple switches. For example, we may need to find heavy hitter source IPs on flows of a prefix that come from multiple switches. Figure 2(b) shows the recall of heavy hitters found on two switches monitoring different traffic: the recall at each switch is defined as the portion of detected heavy hitters on this switch over true heavy hitters. The graph shows that with the same amount of resources, the switches exhibit different recall; conversely, different amounts of resources may be needed at different switches.

These leverage points suggest that it may be possible to efficiently use TCAM resources to permit multiple concurrent measurement tasks by (a) permitting operators² to specify desired accuracy bounds for each task, and (b) adapting the resources allocated to each task in a way that permits temporal and spatial multiplexing.

¹ Indeed, under resource constraints, less critical measurement tasks might well return very interesting/important results even well below the point of diminishing returns. We have left an exploration of this point in the design space to future work.
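Adapting resources to an accuracy bound presupposes that accuracy can be estimated online, without ground truth. For heavy hitter recall, the counters on internal prefixes that the task has not drilled into already bound how many heavy hitters it can have missed. The sketch below is our own minimal illustration of that intuition (toy 2-bit field; names and structure are assumptions, not DREAM's estimator):

```python
# Sketch of an online recall lower-bound estimator for TCAM-based heavy
# hitter detection (structure and names are ours). An un-expanded internal
# prefix of length l has 2**(WIDTH - l) leaves and measured volume c, so it
# can hide at most min(2**(WIDTH - l), c // THRESHOLD) undetected HHs.
THRESHOLD = 10
WIDTH = 2  # toy 2-bit field, as in Figure 1

def recall_lower_bound(counts, detected):
    """counts: {(value, length): volume} for monitored prefixes;
    detected: number of heavy leaves the task reported this epoch."""
    missed_max = sum(min(2 ** (WIDTH - l), c // THRESHOLD)
                     for (v, l), c in counts.items() if l < WIDTH)
    total = detected + missed_max
    return detected / total if total else 1.0
```

For instance, monitoring prefix 0* with volume 25 (less than three times the threshold of 10) bounds the missed heavy hitters under it at two, so one detected heavy hitter yields an estimated recall of at least 1/3.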
This approach presents two design challenges.

Challenge: Estimating resource usage for a task with a desired accuracy. Given a task and a target accuracy, we need to determine the resources to allocate to the task. If we knew the dependence of accuracy on resources, we could solve the resource allocation problem as an optimization problem subject to resource constraints. However, it is impossible to characterize the resource-accuracy dependence a priori, because it depends on the traffic, the task type, the measurement algorithms, and the parameters [31]. Furthermore, if we knew the current accuracy, we could compare it with the desired accuracy and increase/decrease the resource usage correspondingly. Unfortunately, it is also impossible to know the current accuracy, because we may not have the ground truth during the measurement. For example, when we run a heavy hitter detection algorithm online, we can only know the heavy hitters that the algorithm detects (which may have false positives/negatives), but require offline processing to know the real number of heavy hitters.

To address this challenge, we need to estimate accuracy and then dynamically increase or decrease resource usage until the desired accuracy is achieved. For example, to estimate recall (a measure of accuracy) for heavy hitter detection, we can compute the real number of heavy hitters by estimating the number of missed heavy hitters using the collected counters. In Figure 1, for example, the task cannot miss more than two heavy hitters by monitoring prefix 0*, because there are only two leaves under node 0* and its total volume is less than three times the threshold. In Section 5, we use similar intuitions to describe accuracy estimators for other measurement tasks.

Challenge: Spatial and Temporal Resource Adaptation.
As traffic changes over time, or as tasks enter and leave, an algorithm that continually estimates task accuracy and adapts resource allocation to match the desired accuracy (as discussed above) will also be able to achieve temporal multiplexing. In particular, such an algorithm will allocate minimal resources to measurement tasks whose traffic does not exhibit interesting phenomena (e.g., heavy hitters), freeing up resources for other tasks that may incur large changes, for example. However, this algorithm alone is not sufficient to achieve spatial multiplexing, since, for a given task, we may need to allocate different resources on different switches to achieve a desired global accuracy. For example, a task may wish to detect heavy hitters within a prefix P, but traffic for that prefix may be seen on switches A and B. If the volume of prefix P's traffic on A is much higher than on B, it may suffice to allocate a large number of TCAM resources on A, and very few TCAM resources on B. Designing an algorithm that adapts network-wide resource allocations to achieve a desired global accuracy is a challenge, especially in the presence of traffic shifts between switches. DREAM leverages both the global estimated accuracy and a measure of accuracy estimated for each switch to decide on which switch a task needs more resources in order to achieve the desired global accuracy.

² Operators may not wish to express and reason about accuracy bounds. Therefore, a deployed system may have reasonable defaults for accuracy bounds, or allow priorities instead of accuracy bounds, and translate these priorities to desired accuracy bounds. We have left an exploration of this to future work.

Figure 2: Accuracy of HH detection. Recall over time for (a) resource change, with 256, 512, 1024 and 2048 counters, and (b) spatial diversity, Switch 1 vs. Switch 2.

3. DREAM OVERVIEW

DREAM enables resource-aware software-defined measurement. It supports dynamic instantiation of measurement tasks with a specified accuracy, and automatically adapts the TCAM resources allocated to each task across multiple switches. DREAM can also be extended to other measurement primitives (like sketches) and tasks for which it is possible to estimate accuracy.

Architecture and API. DREAM implements a collection of algorithms (defined later) running on an SDN controller. Users of DREAM submit measurement tasks to the system. DREAM periodically reports measurement results to users, who can use these results to reconfigure the network, install attack defenses, or increase network capacity.
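The per-task adaptation rule described earlier (grow a task's counters on a switch only when both its global accuracy estimate and its estimate on that switch fall below the bound; give counters back once the task is globally satisfied) can be sketched as follows. The class shape, step size, and the exact decrease rule are our assumptions, not DREAM's precise algorithm:

```python
# Sketch of DREAM's per-task, per-switch adaptation rule (class shape,
# step size, and decrease rule are our assumptions): grow the allocation
# on a switch only when BOTH the global and the local accuracy estimates
# are below the bound; shrink everywhere once globally satisfied.
class Task:
    def __init__(self, bound, alloc, global_acc, local_acc):
        self.bound = bound                 # user-specified accuracy bound
        self.alloc = dict(alloc)           # switch -> allocated TCAM counters
        self.global_acc = global_acc       # estimated network-wide accuracy
        self.local_acc = dict(local_acc)   # estimated per-switch accuracy

def adapt(task, step=16):
    """One adaptation epoch over all switches the task is present on."""
    for s in task.alloc:
        if task.global_acc < task.bound and task.local_acc[s] < task.bound:
            task.alloc[s] += step                         # needs more here
        elif task.global_acc >= task.bound:
            task.alloc[s] = max(task.alloc[s] - step, 1)  # free some counters
```

Note how the rule decouples the two estimates: a switch whose local estimate already meets the bound is left alone even when the task is globally unsatisfied, which is what yields spatial multiplexing.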
A DREAM user can be a network operator, or a software component that instantiates tasks and interprets results. Our current implementation supports three types of measurement tasks, each with multiple parameters:

Heavy Hitter (HH): A heavy hitter is a traffic aggregate, identified by a packet header field, that exceeds a specified volume. For example, heavy hitter detection on source IP finds source IPs contributing large traffic in the network.

Hierarchical Heavy Hitter (HHH): Some fields in the packet header (such as the source/destination IP addresses) embed a hierarchy, and many applications such as DDoS detection require aggregates computed at various levels within a hierarchy [33]. Hierarchical heavy hitter (HHH) detection is an extension of HH detection that finds the longest prefixes that exceed a certain threshold even after excluding any HHH descendants in the prefix trie [15].

Change Detection (CD): Traffic anomalies such as attacks often correlate with significant changes in some aspect of a traffic aggregate (e.g., volume or number of connections). For example, large changes in traffic volume from source IPs have been used for anomaly detection [41].

Figure 3: DREAM System Overview

Each of these tasks takes four parameters: a flow filter specifying the traffic aggregate to consider for the corresponding phenomenon (HH, HHH or CD); a packet header field on which the phenomenon is defined (e.g., source IP address); a threshold specifying the minimum volume constituting a HH, HHH or CD; and a user-specified accuracy bound (usually expressed as a fraction). For example, if a user specifies, for a HH task, a flow filter <10/8, 12/8, *, *, *>, source IP as the packet header field, a threshold of 10Mb and an accuracy of 80%, DREAM measures, with an accuracy higher than 80%, heavy hitters in the source IP field on traffic from 10/8 to 12/8, where a heavy hitter is defined as any source IP sending more than 10Mb of traffic in a measurement epoch.
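The four task parameters can be captured in a small specification object. The sketch below (class and field names are ours, not DREAM's API) instantiates the HH example from the text:

```python
from dataclasses import dataclass

# Illustrative encoding of the four task parameters (class and field
# names are ours, not DREAM's API).
@dataclass
class TaskSpec:
    kind: str              # "HH", "HHH", or "CD"
    flow_filter: tuple     # (src prefix, dst prefix, proto, sport, dport)
    field: str             # header field the phenomenon is defined on
    threshold_mb: float    # minimum volume per measurement epoch
    accuracy_bound: float  # user-specified accuracy, as a fraction

# The HH example from the text: traffic from 10/8 to 12/8, keyed on
# source IP, threshold 10Mb, accuracy bound 80%.
hh_task = TaskSpec("HH", ("10.0.0.0/8", "12.0.0.0/8", "*", "*", "*"),
                   "source IP", 10.0, 0.80)
```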
The user does not specify the switches on which to execute the measurement task; multiple switches may see traffic matching a task's flow filter, and it is DREAM's responsibility to install measurement rules at all relevant switches.

Workflow. Figure 3 shows the DREAM workflow, illustrating both the interface to DREAM and the salient aspects of its internal operation. A user instantiates a task and specifies its parameters (step 1). Then, DREAM decides to accept or reject the task based on available resources (step 2). For each accepted task, DREAM initially configures a default number of counters at one or more switches (step 3). DREAM also creates a task object for each accepted task: this object encapsulates the resource allocation algorithms run by DREAM for each task.

Periodically, DREAM retrieves counters from switches and passes these to task objects (step 4). Task objects compute measurement results and report the results to users (step 5). In addition, each task object contains an accuracy estimator that measures current task accuracy (step 6). This estimate serves as input to the resource allocator component of DREAM, which determines the number of TCAM counters to allocate to each task and conveys this number to the corresponding task object (step 7). The task object determines how to use the allocated counters to measure traffic, and may reconfigure one or more switches (step 3). If a task is dropped for lack of resources, DREAM removes its task object and de-allocates the task's TCAM counters.

DREAM Generality. These network-wide measurement tasks have many applications in data centers and ISP networks. For example, they are used for multi-path routing [7], optical switching [12], network provisioning [21], threshold-based accounting [20], anomaly detection [42, 41, 26] and DDoS detection [33].

Furthermore, DREAM can be extended to more general measurement primitives beyond TCAMs.
Our tasks are limited by TCAM capabilities, because TCAM counters can only measure traffic volumes for specific prefixes. Moreover, TCAM-based tasks need a few epochs to drill down to the exact result. However, DREAM's key ideas — using accuracy estimators to allocate resources, and spatially multiplexing resource allocation — can be extended to other measurement primitives not currently available on commodity hardware, such as sketches. Sketches do not require controller involvement to detect events and can cover a wider range of measurement tasks than TCAMs (volume- and connection-based tasks such as Super-Spreader detection) [39]. We can augment DREAM to use sketches, since sketch accuracy depends on traffic properties and it is possible to estimate this accuracy [17]. We leave discussion of these extensions to future work.

There are two main challenges in DREAM, discussed in subsequent sections: the design of the resource allocator, and the design of task-specific accuracy estimators.

4. DYNAMIC RESOURCE ALLOCATION

DREAM allocates TCAM resources to measurement tasks on multiple switches. Let r_{i,s}(t) denote the amount of TCAM resources allocated to the i-th task on switch s at time t. Each task is also associated with an instantaneous global accuracy g_i(t). Recall that the accuracy of a task is a function of the task type, its parameters, the number of its counters per switch, and the traffic matching its flow filter on each switch.

DREAM allocates TCAM resources to maintain high average task satisfaction, which is the fraction of time during which a task's accuracy g_i(t) is greater than the operator-specified bound. More importantly, at each switch, DREAM must respect switch capacity: the sum of r_{i,s}(t) over all i must be less than the total TCAM resources at switch s, for all t.

To do this, DREAM needs a resource allocation algorithm to allocate counters to each task (i.e., the algorithm determines r_{i,s}(t)). DREAM also needs an admission control algorithm; since the accuracy bound is inelastic (Section 2), admitting tasks indiscriminately can eventually lead to zero satisfaction, as no task receives enough resources to achieve an accuracy above the specified bound.

Strawman approaches.
One approach to resource allocation is to apply a convex optimization periodically, maximizing the number of satisfied tasks by allocating r_{i,s}(t) subject to switch TCAM constraints. This optimization technique requires a characterization of the resource-accuracy curve, a function that maps target accuracy to the TCAM resources needed. The same is true for an optimization technique like simulated annealing, which requires the ability to predict the "goodness" of a neighboring state. As discussed in Section 2.3, however, it is hard to characterize this curve a priori, because it depends upon traffic characteristics and the type of task.

An alternative approach is to perform this optimization iteratively: jointly (for all tasks across all switches) optimize the increase or decrease of TCAM resources, measure the resulting accuracy, and repeat until all tasks are satisfied. However, this joint optimization is hard to scale to large numbers of switches and tasks, because the combinatorics of the problem is a function of the product of the number of switches and the number of tasks.

If the total resources required for all tasks exceed system capacity, the first approach may result in an infeasible optimization, and the second may not converge. These approaches may then need to drop tasks after having admitted
