rIoT: Enabling Seamless Context-Aware Automation in the Internet of Things

Jie Hua†, Chenguang Liu†, Tomasz Kalbarczyk†, Catherine Wright‡, Gruia-Catalin Roman‡, Christine Julien†
† Department of Electrical and Computer Engineering, University of Texas at Austin
{mich94hj, liuchg, tkalbar, c.julien}
‡ Department of Computer Science, University of New Mexico
{wrightc, gcroman}@unm.edu

Abstract—Advances in mobile computing capabilities and an increasing number of Internet of Things (IoT) devices have enriched the possibilities of the IoT but have also increased the cognitive load required of IoT users. Existing context-aware systems provide various levels of automation in the IoT. Many of these systems adaptively take decisions on how to provide services based on assumptions made a priori. The approaches are difficult to personalize to an individual’s dynamic environment, and thus today’s smart IoT spaces often demand complex and specialized interactions with the user in order to provide tailored services. We propose rIoT, a framework for seamless and personalized automation of human-device interaction in the IoT. rIoT leverages existing technologies to operate across heterogeneous devices and networks to provide a one-stop solution for device interaction in the IoT. We show how rIoT exploits similarities between contexts and employs a decision-tree-like method to adaptively capture a user’s preferences from a small number of interactions with the IoT space. We measure the performance of rIoT on two real-world data sets and a real mobile device in terms of accuracy, learning speed, and latency in comparison to two state-of-the-art machine learning algorithms.

Index Terms—pervasive computing; smart environments; device discovery and selection

I. INTRODUCTION

With recent technology advances made in the Internet-of-Things (IoT), there is a growing number of smart devices helping to build the many smart-* scenarios that people have long envisioned [40].
In scenarios like smart homes and smart offices, the plethora of these new devices has created many possibilities for automating daily tasks. At the same time, new challenges arise; a particular challenge to note is that applications demand responsive and intelligent approaches to leverage context [10] in IoT environments. In this work, we address a fundamental piece of this challenge: automating human-device interaction, by asking a simple yet unsolved question: how can contextual information be leveraged to make IoT device interaction more seamless and personalized?

To make seamlessness and personalization concrete, consider a smart home system, embedded with sensors and actuators. Smart lights adjust lighting based on indoor illumination; a smart coffee maker automatically starts coffee when the user wakes up. While these individual applications enable some automation, they do not achieve the full vision of the disappearing computer [40]. One gap that remains is directly related to the abstractions for user interaction chosen by manufacturers [2], [7]. Simply put, interactions are not seamless. At setup, a user usually needs to connect the devices based on MAC addresses, name each device, and remember the names. To interact with devices, the user either scripts the behavior in advance in arbitrary computer-friendly languages (e.g., IFTTT [29], Hydra [12]) or must recall the name defined at setup and issue commands like “set the light over the stove to bright”. Neither the scripted behavior nor the tailored commands provide a seamless interaction paradigm. We argue that a truly seamless IoT world will allow the user to interact with devices using simple and generic instructions like “turn the light on”.

On the other hand, a key selling point of IoT applications is the personalization they enable by allowing users to customize the configuration with personal preferences.
While such features are common, they are often limited and contrived [1]. For instance, although the smart coffee machine may allow the user to configure a customized time to start the coffee machine every day, such “personalization” assumes that interaction with this device is based on time. If a user wants to start coffee after returning home from jogging, which may or may not happen every day, the user cannot benefit from the “smartness” of the coffee machine. Personalization in modern IoT systems should not require a user to express her preference via manufacturer-defined assumptions.

Our solution to seamlessness and personalization is through context-awareness. Significant work has been done in context-awareness over the past decade [14], [31], [36], supporting better collecting, processing, and reasoning about context. In the IoT, a user’s context can include any information that describes the user’s situation, from location and time to ambient conditions or the presence of others [11]. We focus on utilizing collected contextual information to predict the device and associated service(s) that a user needs when the user makes a simple generic request (e.g., “turn on the light”). Unlike existing solutions, we respond immediately to user-supplied negative feedback and re-attempt the action.

We propose rIoT, a framework enabling responsive and reinforcing automation in IoT service personalization. rIoT enables context-aware automation by providing a seamless and low-effort approach to personalizing how IoT services are chosen to support a given request from a user. In contrast to the common IoT application workflow in which users must exert non-negligible effort on configuring, labeling, and specifying a device before using it, rIoT incorporates a context learning algorithm that automatically adjusts based on user intentions and the environment. This learning is a continuous process that adaptively evolves a learned model as users change their interaction preferences or the environment changes. rIoT does not rely on a priori knowledge and learns a user’s intentions only from the history of user interactions.

In summary, rIoT leverages rich contextual information to enable increased automation of user-oriented IoT device interaction. Our key research contributions are:
• We propose a context-aware learning framework, rIoT, for user-oriented IoT device selection.
• We incorporate user-configurable context abstractions to enable personalization at the per-device level.
• We devise a context-aware algorithm that learns a user’s interaction pattern with no a priori knowledge about the device, space, and user.
• We quantitatively evaluate rIoT using two real-world data sets. We show that rIoT has high accuracy and can quickly recover from environmental dynamics.

In Section II, we present an overview of the related work and key preliminaries of our proposed approach. We then present an overview of rIoT and its position in an IoT deployment. Section IV presents rIoT in detail, including the underpinning learning algorithms. We evaluate rIoT in Section V, comparing it to two alternative learning algorithms in the context of real-world contextual data. Section VI …

II. MOTIVATION AND RELATED WORK

We use state-of-the-art middleware in the IoT to motivate the gap that rIoT fills. IoT devices fall into two categories: sensors and actuators. Users make requests to actuators to take some action (e.g., turn on the lights, adjust the volume, etc.), while sensors passively collect contextual information. A device can take on both roles simultaneously, e.g., a thermostat can both sense temperature and actuate the temperature set point.

Motivating Application Scenario. Alice is a smart home enthusiast who owns several IoT actuators: a smart lock, lights, security cameras, and a stereo system. She is an early adopter who purchases solutions as they become available, so her devices are from five different manufacturers. Alice also has a networked sensor system throughout her home, provided by yet another manufacturer. The stereo system is her favorite because it supports sophisticated collaboration among all of her speakers to provide ideal sound quality. Alice must figure out how to control all of the devices to satisfy her needs while minimizing her overhead in interacting with them.

Existing Middleware Architectures for the IoT. Fig. 1 captures the architectures of the two primary control options available to Alice, given today’s technologies. The first, in Fig. 1a, is a manufacturer-oriented view in which Alice controls actuators through different manufacturer gateways and their (proprietary) applications. The obvious advantage is that manufacturers can provide comprehensive services for their devices. For Alice, this means she can enjoy the features that achieve ideal sound quality from the stereo system. On the other hand, Alice has to navigate steep and diverse learning curves associated with each of the manufacturers. Although some manufacturers allow users to define personalized automation based on primitive context information like time, they cannot leverage sensor data provided by other manufacturers [6], and as a result, they fail to fully respond to a user’s more subtle intentions due to lack of context [41]. In other words, these systems’ approaches to “personalization” do not truly reflect the user.

Fig. 1: Existing human-device interaction in the IoT. (a) Manufacturer oriented. (b) Middleware oriented.

Fig.
1b shows another option in which Alice can employ a general-purpose IoT middleware as a sort of universal gateway (e.g., IFTTT [29], Hydra [12]). The advantage is that Alice only needs to learn one control language that can also leverage contextual data collected by diverse sensors. The disadvantage is that Alice has to define all the interaction patterns by herself using a scripting language defined by the middleware. Even with current context-aware automation solutions [3], [4], [26], since the control interfaces are designed by a third party, some device features may not be supported. For example, it may not be feasible for a third-party framework to coordinate multiple speakers to provide manufacturer-designed sound effects.

Context-Awareness in the IoT. An obvious pain point is the inability of existing middleware to internalize an expressive and complete notion of context, a need that has been identified in both the research community [36] and in the industry [9]. Existing work incorporating context-awareness into IoT-like applications adopts a semantic approach [17], [24], [38], [39], where context-awareness relies on a pre-defined ontology characterizing devices, users, and their relationships. In contrast, providing users seamless experiences requires an approach that does not rely on a user having a priori knowledge about how IoT devices affect the space in which they are located. This is necessary to ensure the approach is suitable for new spaces a user encounters for the first time or for spaces in which the devices or environment are dynamic.

CA4IOT [31] is a context-aware architecture that selects sensors to provide context based on a likelihood index that captures the weighted Euclidean distance between the user’s context and the context of the sensors. However, context reasoning is based only on a pre-defined static distance function with fixed contextual inputs.
Probabilistic and association-based solutions [27], [23] provide efficient activity sensing and fluid device interaction, while other approaches use Hidden Markov Models (HMMs) to model context-awareness [4], [5], [25], [34]. These approaches either require a list of pre-defined “situations” to which they are restricted, or they make assumptions restricting the context and the environment. More recently, deep learning has provided context-aware activity recognition, interaction prediction, and smart space automation [15], [28], [33]. Despite their promise, these approaches require very large data sets for training, which makes them unsuitable for personalized approaches in which data is scarce.

Fig. 2: The overview of the rIoT framework

The aforementioned approaches inspire our work. We target a more general space with diverse devices that may dynamically change. The key challenges of interactive machines in general [37] articulate a gap between what systems can sense about the context and the user’s actual intentions. That is, no matter how many sensors we use to capture context, gaps will exist in the system’s knowledge. Therefore, unlike existing solutions, we emphasize that user feedback should be explicitly included in the decision-making process.

System Support for rIoT. Efficiently collecting context has been well studied. Through multi-device cooperation, continuous monitoring systems like CoMon [20] and Remora [18] enable context generated by sensors to be consumed by applications executing on nearby smartphones. Self-organized context neighborhoods [22], [21] built using low-end sensors have negligible communication overhead. It is exactly because of the availability of these cost-effective continuous sensing systems that rIoT’s vision of IoT personalization can seamlessly incorporate expressive context in an IoT-enabled space.

We rely on existing solutions to provide connectivity among heterogeneous IoT devices. The web-of-things [13] makes devices available as web services and thus accessible through a canonical interface. Lightweight solutions [35] opportunistically discover surrounding devices and control them through users’ personal devices.
In this work, we focus on how to utilize context to better select and control these devices.

III. AN OVERVIEW OF rIoT

In this section, we overview rIoT’s core contributions and define its underlying key concepts. We describe our algorithms in detail in the following section. Our work targets smart spaces that contain multiple rooms equipped with IoT devices. There may be one or more users sharing the space; however, we assume that requests from different users are compatible with each other (e.g., we assume that two users never simultaneously request different actions on the same devices).

In Fig. 1, we identified a trade-off between user-oriented personalization and manufacturer-oriented features. We argue that it is important to enable personalization yet retain the full capabilities of devices. As shown in Fig. 2, rIoT inserts itself between applications and IoT devices to allow applications to leverage context to automatically determine which devices and what actions on those devices best match a user’s needs and expectations. rIoT encapsulates a context builder that collects and abstracts sensor readings into high-level, usable context. rIoT’s decider uses context information, knowledge about available IoT devices, and knowledge about the user’s prior interactions to choose (i) the best device to fulfill a user’s request and (ii) the best action to take on that device.

We assume that users’ requests for IoT devices to take actions may be of varying levels of detail. At one end of the spectrum, a user may ask for a specific device to take a specific action (e.g., “turn off the kitchen light”). At the other end, the user might simply ask for the IoT to “act”. In this case, rIoT needs to determine which action on which device is most likely to satisfy the request. There are a variety of requests in between; for example, given the request “turn on the light”, rIoT knows that the right action is to “turn on” and that the type of device is “light”, but must determine which light device to act on.
While we support all levels of specificity, in this paper, we focus primarily on the least specified, i.e., situations in which the user simply says “act”, and rIoT must select the combination of device and action that best satisfies the user.

rIoT learns a local utility model for each IoT device; this model ($f_d$) captures the likelihood that a given action on that device is the “best” action to take given a snapshot of the context at the moment that the user makes a request. Conceptually, each device proposes the action on that device that has the highest utility in the given context. rIoT’s decider compiles all of the devices’ proposals and selects the one with the highest overall utility. Given rIoT’s choice of action, each device receives implicit feedback (i.e., if the device’s proposal was selected, the device receives positive feedback; otherwise, the device receives negative feedback). Thus a device can learn about the utility of its actions in the context of other co-located devices. In addition, once the action is taken, rIoT allows the user to provide explicit feedback to reinforce (either positively or negatively) rIoT’s selection. The feedback is incorporated into the device’s utility model, allowing it to learn over time based on the user’s interactions in the space.

rIoT’s architecture also allows applications to maintain access to manufacturer-specific actions. In such situations, rIoT controls the devices as a system through an external controller, as depicted on the right of Fig. 2. Rather than the individual devices proposing an action, this controller proposes for all devices it controls. This allows exposing manufacturer-specific actions as part of the rIoT decision framework.

We next define some terms that we use throughout the paper.

Definition 1. (context $c$) Practically, a context $c$ is any single piece of numerical or categorical data and can be a raw sensor reading like temperature or illumination level, or an abstract value derived from raw data, e.g., isAtHome, Cooking.

Definition 2. (context snapshot $C_t$) We define $C_t$ as a vector of context values $c_{t,i}$ that describe the user’s situation at time $t$, i.e., $C_t = (c_{t,0}, \ldots, c_{t,i}, \ldots, c_{t,n})$. We assume that the $i$th element of any snapshot is always the same type of context; $c_0$ is always the user’s identity.

Definition 3. (device $d$) A device is an actuator that can be discovered and controlled through a virtual controller.

Definition 4. (device class $T$) A class $T$ is a set of devices that have the same type, and therefore the same action interface, e.g., $d \in T_{light}$. We assume a hierarchy of classes. For example, a dimmable light is also a light, i.e., $T_{dimmable} \subset T_{light}$.

Definition 5. (action $a$) An action is performed by a device, e.g., turnOn, turnOff, etc. $A$ is the set of all actions; $A_d$ is the set of actions device $d$ can perform. We assume $A_d$ is finite.

Definition 6. (request $R$) A request $R_t$ made by the user at time $t$ is a pair of class and action, both of which are optional. Specifically, $R_t = \langle T, a \rangle$ indicates that the user wants a device $d \in T$ to do action $a$. A request’s fields can be blank, i.e., $R = \langle \emptyset, \emptyset \rangle$, which indicates that the user requests the IoT to “act”, or have only one of the two fields, e.g., $R = \langle light, \emptyset \rangle$ indicates a request for some light to take some action.

Formally, our problem statement is:

Given a user’s request $R_t$ at time $t$ and a snapshot of the context $C_t$ at the same time $t$, output a tuple $\langle d, a \rangle$ that specifies the action $a$ to be taken on device $d$ to best satisfy the request $R_t$.

IV. CONTEXT-AWARE DECISION MODELS IN rIoT

We now describe the components and processes that allow us to fill in the architecture in Fig. 2. We then describe rIoT’s contextually-driven learning algorithm.

A. rIoT Approach

As described in the overview, rIoT learns a local utility model for each IoT device. Conceptually, these models “belong” to the devices themselves, but, as Fig. 2 shows, the models are part of rIoT. The only exceptions are external models, which may contain manufacturer-proprietary information; in these cases, the rIoT local model is a proxy for the external model, which resides under the manufacturer’s purview.

Fig. 3 shows the flow of requests, context, and decisions in rIoT. The user (at the left) makes requests to rIoT’s decider, which resolves them using input from the local utility models. These models in turn rely on context snapshots generated by the context builder. Given a decision, the user may accept the proposal (providing implicit positive feedback) or reject (providing explicit negative feedback). This information is used to update the local models. In the case of negative feedback, the decider repeats the decision process and makes a new proposal to satisfy the original request.

Fig. 3: The overview of data flow in rIoT

1) Building Context Snapshots: The context builder translates raw sensor data into contextual data, which the local utility models use for learning. We rely on four generic context abstractions: (1) the numeric context allows the context builder to capture standard numerical values, e.g., temperature, pressure; (2) the cyclic numeric context captures context types that are numerical but “roll over” on some predictable schedule, e.g., time, day of the week; (3) the N-dimensional vector context captures context values that are represented by a tuple of values, e.g., location coordinates; and (4) the categorical context captures labeled values, e.g., human activity, binary data. Depending on the type of context and the available sensors, the context builder assembles the higher-level values out of the raw sensor data. rIoT leverages existing work in context construction to implement the context builder.

2) Context Distance Functions: The devices’ local utility models will propose actions to take in a given context based on the feedback they have received about prior actions in the same or similar contexts. To judge the similarity of two contexts, rIoT relies on context distance functions. We first define $dist(c, c')$, or the distance between two contexts (e.g., the distance between two locations, the distance between two temperature values, etc.). Primitive context types typically have easily defined distance functions (e.g., geometric distance, absolute value, cyclical distance). rIoT imposes one additional constraint on any $dist(c, c')$, i.e., that the distance is normalized to the range $[0, 1]$. With this simple definition of contextual distance, we build a distance function $dist(C_a, C_b, W)$ that captures the distance between two context snapshots.

Definition 7. (context snapshot distance $dist(C_a, C_b, W)$) This distance is computed using the Manhattan distance [19]; the vector $W$ weights the elemental values of the snapshots:

\[ dist(C_a, C_b, W) = \sum_{i=0}^{n} w_i \cdot dist(c_{a,i}, c_{b,i}) \]

where $C_a = (c_{a,0}, c_{a,1}, \ldots, c_{a,n})$, $C_b = (c_{b,0}, c_{b,1}, \ldots, c_{b,n})$, $W = (w_0, w_1, \ldots, w_n)$, and $0 \le w_i \le 1$.

Because the weight vector $W$ is an input to the function, each local model can use a different distance function for context snapshots, enabling personalization. For instance, some users may have a strict daily routine based on the clock, in which case a difference in time means more for these users than a difference in other context types. An interaction with lights is more likely based on location, while requests for a remote camera may depend more on suspicious sounds or movements.

Because context values can be continuous, it can be useful to discretize context snapshots into buckets. When we do so, we need to ask whether a bucket “contains” a context snapshot. We define $contain(c_l, c_u, c)$ for the first three elemental context types as $c_l \le c \le c_u$ and as $c \in (c_l \cup c_u)$ for categorical context. Definition 8 extends this to context snapshots.
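The elemental distances and the weighted snapshot distance of Definition 7 can be sketched in a few lines of Python. This is an illustrative sketch only, not the paper's implementation: the function names, the normalization bounds (`lo`, `hi`), the cycle length (`period`), and the example snapshot layout in `DISTS` are all assumptions made for the example.

```python
def numeric_dist(a, b, lo, hi):
    """Absolute difference normalized to [0, 1]; lo/hi are assumed value bounds."""
    return min(abs(a - b) / (hi - lo), 1.0)


def cyclic_dist(a, b, period):
    """Distance on a cycle (e.g., hour of day); opposite points are 1 apart."""
    d = abs(a - b) % period
    return min(d, period - d) / (period / 2)


def categorical_dist(a, b):
    """Assumed 0/1 distance for labeled values."""
    return 0.0 if a == b else 1.0


def snapshot_dist(ca, cb, weights, dists):
    """Weighted Manhattan distance between two snapshots (Definition 7).

    dists[i] is the elemental distance function for the i-th context type.
    """
    return sum(w * d(x, y) for w, d, x, y in zip(weights, dists, ca, cb))


# Hypothetical snapshot layout: (hour of day, temperature in C, activity label).
DISTS = (
    lambda a, b: cyclic_dist(a, b, period=24.0),
    lambda a, b: numeric_dist(a, b, lo=0.0, hi=50.0),
    categorical_dist,
)
```

Passing a different `weights` vector per device is what lets one local model emphasize time while another emphasizes location or activity, as the paragraph on the weight vector describes.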

Definition 8. (context snapshot contains $contain(C_a, C_b, C_x)$) This function simply requires the contain function for all of the elements of the snapshot to be true:

\[ contain(C_a, C_b, C_x) = \begin{cases} \text{FALSE}, & \text{if } \exists i \text{ s.t. } contain(c_{a,i}, c_{b,i}, c_{x,i}) = \text{FALSE} \\ \text{TRUE}, & \text{otherwise} \end{cases} \]

3) Defining and Using Local Utility Models: rIoT’s local utility models capture the suitability of devices’ actions to requests from users in certain context states.

Definition 9. (local model $f_d$) $f_d : C \times R \to A \times \mathbb{R}$ maps a request and context snapshot onto an action the device can take and a utility value, $u \in [0, 1]$. The utility captures the likelihood that the action is the “right” one given the request and context. $f_d(C, R)$ results in a proposal $P_d = \langle d, a, u \rangle$.

When a user requests an action, rIoT captures the context and requests a proposal from each device’s local utility model. rIoT’s objective is to output the final decision $P_{R_t, C_t} = \langle d, a, u \rangle$, which is the winning proposal, i.e., the proposal with the maximum utility across all of the devices’ proposals.

rIoT’s key challenge is therefore how to compute the $f_d$ models for each device $d$. Because we do not want to make any assumptions about or place any constraints on the environment in which rIoT operates, our approach is to fit the $f_d$ models using each user’s history of interactions with each device, leveraging similarities among the contexts of the interactions.

B. Context Learning in rIoT

Imagine that Alice has four IoT lighting devices in her home. Depending on the context, a “turn light on” request may indicate a desire to turn on any one of these lights. For instance, when Alice awakens at 6:30am and says “turn light on”, she intends to illuminate the bedroom. While cooking in the kitchen (regardless of the time of day), a “turn light on” request should control the kitchen light. And when Alice is reading anywhere in her home, “turn light on” should affect the light closest to her current location.
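Returning to Definitions 8 and 9, the containment test and the decider's choice of the winning proposal can be sketched as below. This is a hedged illustration rather than rIoT's code: `Proposal`, `elem_contains`, `snapshot_contains`, and `decide` are hypothetical names, categorical bounds are assumed to be sets of labels, and ties between proposals are broken arbitrarily (first maximum wins), which the text does not specify.

```python
from typing import Iterable, NamedTuple, Optional


class Proposal(NamedTuple):
    """A device's proposal P_d = <d, a, u>."""
    device: str
    action: str
    utility: float  # u in [0, 1]


def elem_contains(lo, hi, c, categorical=False):
    """Elemental contain: interval test, or label membership for categorical
    context (bounds are then assumed to be sets of labels)."""
    if categorical:
        return c in (lo | hi)
    return lo <= c <= hi


def snapshot_contains(c_min, c_max, c_x, categorical_flags):
    """Definition 8: TRUE only if every element is contained."""
    return all(
        elem_contains(lo, hi, c, cat)
        for lo, hi, c, cat in zip(c_min, c_max, c_x, categorical_flags)
    )


def decide(proposals: Iterable[Proposal]) -> Optional[Proposal]:
    """Pick the proposal with the maximum utility across all devices."""
    proposals = list(proposals)
    if not proposals:
        return None
    return max(proposals, key=lambda p: p.utility)
```

For example, with a bedroom light proposing `turnOn` at utility 0.8 and a kitchen light proposing `turnOn` at 0.3, `decide` returns the bedroom light's proposal.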
Alice’s context snapshots may contain, for example, time (captured as a cyclic numeric context), Alice’s location (captured in a coordinate system), and activity (categorical context).

The goal of rIoT is to learn each local utility function $f_d(C, R)$ based on the user’s interactions and feedback. However, learning this function directly can be challenging. Context can be continuous, and the target function may be non-linear and non-continuous. Therefore, rather than learning $f_d(C, R)$ directly, we introduce a piecewise function set as an approximation, based on a concept we call state:

Definition 10. (state $S$) A state is defined by three context snapshots, $S = (C_{min}, C_{max}, C_{mid})$. We use two functions to compare states: $contain(S, C_x) = contain(C_{min}, C_{max}, C_x)$ determines whether a state contains a given context snapshot, while $dist(S, C_x, W) = dist(C_{mid}, C_x, W)$ computes how far a context snapshot is from the state. We define the radius of a state as $r_S = dist(C_{max}, C_{min})/2$.

The set of states that discretizes the space of context snapshots is not defined a priori but is learned over time. The learned states need not cover the entire space of possible context snapshots, and different devices can have different relevant states. In our scenario, Alice’s lights may learn states defined by ranges of time and activity labels; her living room lights may not learn anything about states in the very early morning, though her bedroom lights will.

We use this concept to approximate each device’s local utility model by combining a utility learned for each state.

Definition 11. (local utility of a state, $\hat{f}_{d,a}(S)$) Each function $\hat{f}_{d,a} : S \to \mathbb{R}$ captures the utility $u$ of taking action $a$ on $d$ when the context is contained by $S$. The default value is 0.5, which means that action $a$ has a 50% likelihood of being a good choice.

Given a request $R = \langle T, a_{req} \rangle$ in which $T$ is a (potentially empty: $T = \emptyset$) device type and $a_{req}$ is a (potentially empty: $a_{req} = \emptyset$) requested action, device $d$’s local utility function $f_{d,a}(C, R)$ can then be approximated as:

\[ A_R = \begin{cases} A_d, & \text{if } a_{req} = \emptyset \\ \{a_{req}\}, & \text{otherwise} \end{cases} \]

\[ u_{max} = \max_{a \in A_R,\; S \,:\, contain(S, C)} \hat{f}_{d,a}(S) \]

\[ f_{d,a}(C, \langle T, a_{req} \rangle) = \begin{cases} \langle a_{max}, u_{max} \rangle, & \text{if } (T = \emptyset \lor d \in T) \land (a_{req} = \emptyset \lor a_{req} \in A_d) \\ \langle a_{req}, 0 \rangle, & \text{otherwise} \end{cases} \]

where $a_{max}$ is the action that attains $u_{max}$.

Algorithm 1: Computing the Local Model
 1  $\mathcal{S}$: set of known states, initially empty
 2  $f_d(C, R)$: set of piecewise local functions, initially empty
 3  Function ONRECEIVEREQUEST: $R = \langle T, a \rangle$, $C_t$
 4    if ($T \ne \emptyset \land d \notin T$) or ($a_{req} \ne \emptyset \land a_{req} \notin A_d$) then
 5      return $\langle a_{req}, 0 \rangle$
 6    end
 7    if $a = \emptyset$ then let $A_R = A_d$ else let $A_R = \{a\}$
 8    if $\nexists S_i \in \mathcal{S}$ s.t. $contain(S_i, C_t) = \text{TRUE}$ then
 9      $S_{new} = (C_t - r, C_t + r, C_t)$
10      if $\mathcal{S} = \emptyset$ then $\forall a \in A_d$, $\hat{f}_{d,a}(S_{new}) = 0.5$
11      else $\forall a \in A_d$, initialize $\hat{f}_{d,a}(S_{new})$ from neighborhood
12      $\mathcal{S} = \mathcal{S} \cup \{S_{new}\}$
13      $f_d(C, R) = f_d(C, R) \cup \{\hat{f}_{d,a}(S_{new}) : a \in A_d\}$
14    end
15    let $S_R = \{S_i : contain(S_i, C_t) = \text{TRUE}\}$
16    let $u_{max} = \max_{a \in A_R,\, S_i \in S_R} \hat{f}_{d,a}(S_i)$
17    let $a_{max} = a$ s.t. $\hat{f}_{d,a}(S_i) = u_{max}$
18    return $P_d = \langle d, a_{max}, u_{max} \rangle$
19  end
20  Function ONFEEDBACK: $P = \langle d, a, u \rangle$, $C_t$, feedback
21    for $S_i \in \mathcal{S}$ s.t. $contain(S_i, C_t) = \text{TRUE}$ do
22      if feedback is positive then
23        $\hat{f}_{d,a}(S_i) \leftarrow \hat{f}_{d,a}(S_i) + reward$
24      else
25        $\hat{f}_{d,a}(S_i) \leftarrow \hat{f}_{d,a}(S_i) - reward$
26      end
27    end
28  end

Algorithm 1 shows our approach. When receiving a request $R = \langle T, a \rangle$, if $C_t$ is not “contained” in any state, rIoT will create a state with a default radius $r$ around $C_t$ (Line 9). This new state’s utility is the default (Line 10) or initialized based on other “nearby” states (Line 11; explained in detail below). Once $C_t$ is “contained” in a known state, rIoT will output the action that has the highest utility of all actions and is compatible with the request $R$ (Lines 15-18). When receiving feedback from the user, rIoT takes the prior proposal $P$ and the prior context, $C_t$, and updates the local model. The reward is computed using the Sigmoid function; we translate the utility from $(0, 1)$ to $(-\infty, \infty)$, increment or decrement it by a constant reward, and then translate it back to $(0, 1)$. This adjusts the utility more slowly when it is close to 0 or 1.

1) Initializing Models from Nearby States: We assume that a user will have similar behaviors in similar contexts [31]. Therefore, when initializing a new state $S$, rIoT computes the initial utility values based on utility values for nearby states. For example, if Alice’s actions routinely trigger the bedroom light at 6:30am, even the first time she requests a light at 7:00am, it is likely that she also wants the bedroom light. To capture this “nearness” in rIoT, the first time a user makes a request in a new state, we use the k learned states that are closest to the new context and us

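The feedback update described for Algorithm 1 (translate the utility from $(0, 1)$ to an unbounded scale, shift it by a constant reward, and translate it back) can be sketched as follows. This is one plausible reading of that description, not the authors' code: the function names are hypothetical, the logit is assumed as the inverse of the Sigmoid, and the default `reward=0.5` is an arbitrary placeholder because the text does not give the constant's value.

```python
import math


def logit(u):
    """Map a utility in (0, 1) onto (-inf, inf); inverse of the sigmoid."""
    return math.log(u / (1.0 - u))


def sigmoid(x):
    """Map a real value back onto (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def update_utility(u, positive, reward=0.5):
    """Shift the utility by a constant reward on the unbounded scale.

    reward is an assumed constant; u must lie strictly between 0 and 1.
    """
    if not 0.0 < u < 1.0:
        raise ValueError("utility must be strictly between 0 and 1")
    shifted = logit(u) + (reward if positive else -reward)
    return sigmoid(shifted)
```

Because the shift is constant on the unbounded scale, the same feedback moves a utility of 0.5 further than a utility of 0.95, which matches the stated behavior of adjusting more slowly near 0 or 1.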