
Report from Dr Johnny Ryan – Behavioural advertising and personal data

Background and expertise
How personal data are used in behavioural online advertising
How personal data are “broadcast”
Concerns about these practices (news reports, NGO investigations, regulatory consideration etc.)
Correspondence with the industry on this matter to date
Appendices
Appendix 1. What personal data are shared in OpenRTB bid requests?
Appendix 2. What personal data are shared in Google’s proprietary bid requests?
Appendix 3. Selected data tables from OpenRTB bid request specification documents
Appendix 4. Selected data tables from Google (“Authorised Buyer”) RTB bid request specification documents

Background and expertise

My name is Johnny Ryan. I am the Chief Policy and Industry Relations Officer for Brave, a privacy-focussed Internet browser.

I have worked on both sides of the ad tech and publisher divide. Before I joined Brave I was responsible for research and analysis at PageFair, an advertising technology company. In that role, I participated in standards setting working groups for the ad tech industry. In a previous role, before PageFair, I worked at The Irish Times, a newspaper, where I was the Chief Innovation Officer.

I have had other roles, in academia and in policy. I am the author of two books on Internet issues. One is a history of the technology, which has featured on the reading list at Harvard and Stanford. The other was the most cited source in the European Commission’s impact assessment that decided against pursuing Web censorship across the European Union. I am a Fellow of the Royal Historical Society, and a member of the World Economic Forum’s expert network on media, entertainment and information.

I have a PhD from the University of Cambridge, where I studied the spread of militant memes on the Web.

My expert commentary on the online media and advertising industry has appeared in The New York Times, The Economist, The Financial Times, Wired, Le Monde, NPR, Advertising Age, Fortune, Business Week, the BBC, Sky News, and various others.

How personal data are used in behavioural online advertising

Every time a “behaviourally” targeted advert is served to a person visiting a website, the system that selects what advert[1] to show that person broadcasts their personal data to hundreds or thousands of companies.

These personal data include the URL of every page a user is visiting, their IP address (from which geographical position may be inferred), details of their device, and various unique IDs that may have been stored about the user previously to help build up a long term profile about him or her.

[1] This system is known as “Real-time bidding”, or sometimes referred to as “programmatic” (which simply means automatic) advertising.

This system is a relatively recent development in online media. Only as recently as December 2010 did a consortium[2] of advertising technology (“AdTech”) companies agree the methodology for this approach to tracking and advertising. Before this, online advertising was placed by far simpler ad networks that sold ad slots on websites, or by highly lucrative direct sales deals by publishers.[3]

As detailed below, despite the grace period leading up to the GDPR, the AdTech industry has built no adequate controls to enforce data protection among the many companies that receive data.

How personal data are “broadcast”

A large part of the online media and advertising industry uses a system called “RTB”, which stands for “real time bidding”. There are two versions of RTB.

• “OpenRTB” is used by most significant companies in the online media and advertising industry.
• “Authorized Buyers” is Google’s proprietary RTB system. It was recently rebranded from “DoubleClick Ad Exchange” (known as “AdX”) to “Authorized Buyers”.[4]

Note that Google uses both OpenRTB and its own proprietary “Authorized Buyers” system.[5]

[2] The consortium included DataXu, MediaMath, Turn, Admeld, PubMatic, and The Rubicon Project. See a note on the history of OpenRTB in “OpenRTB API Specification Version 2.4, final draft”, IAB Tech Lab, March 2016 (URL: nRTB-APISpecification-Version-2-4-FINAL.pdf), p. 2-3.
[3] Only in 2006 did the first “ad exchange” emerge, and enable ad networks to auction space on their clients’ websites to prospective buyers. A pioneer was Right Media, which was bought by Yahoo!. “RMX Direct: alternative ad networks battle for your blog”, Tech Crunch, 12 August 2006 (URL: alternative-ad-networks-battle-for-yourblog/? ga 29047).
[4] “Introducing Authorized Buyers”, Authorized Buyers, Google (URL: 70822, retrieved 24 August 2018).
[5] “OpenRTB Integration”, Authorized Buyers, Google (URL: rs/rtb/openrtb-guide, retrieved 24 August 2018).

The OpenRTB specification documents are publicly available from the New York based IAB TechLab.[6] The “Authorized Buyers” specification documents are publicly available from Google.

Both sets of documents reveal that every time a person loads a page on a website that uses real-time bidding advertising, personal data about them are broadcast to tens, or hundreds, of companies. Here is a sample of the personal data broadcast.

• What you are reading or watching
• Your location (OpenRTB also includes full IP address)
• Description of your device
• Unique tracking IDs or a “cookie match” to allow advertising technology companies to try to identify you the next time you are seen, so that a long-term profile can be built or consolidated with offline data about you
• Your IP address (depending on the version of “RTB” system)
• Data broker segment ID, if available. This could denote things like your income bracket, age and gender, habits, social media influence, ethnicity, sexual orientation, religion, political leaning, etc. (depending on the version of “RTB” system)

These data show what the person is watching and reading, and can include, or be matched with, data brokers’ segment IDs that categorise what kind of people they are.

A more complete summary of the personal data in OpenRTB bid requests, which are used by all RTB advertising companies, including Google, is provided for your convenience in Appendix 1.

A summary of the personal data in Google’s proprietary bid requests is provided in Appendix 2.

Relevant excerpts from the OpenRTB “AdCOM” specification documents are presented in Appendix 3, and excerpts from Google’s proprietary RTB specification documents are provided in Appendix 4.

How it works

A diagram of the flow of information is provided below.

In summary, the broadcast of these personal data under RTB is referred to as an “RTB bid request”. This is generally broadcast widely, since the objective is to solicit bids from companies that might want to show an ad to the person who has just loaded the webpage. An RTB bid request is broadcast on behalf of websites by companies known as “supply side platforms” (SSPs) and by “ad exchanges”.

The diagram below shows how personal data are broadcast in bid requests to multiple Demand Side Platforms (DSPs), which then decide whether to place bids for the opportunity to show an ad to the person in question. The DSP acts on behalf of an advertiser, and decides when to bid based on the profile of the person that the advertiser has instructed it to target.

Sometimes, Data Management Platforms (DMPs), of which Cambridge Analytica is a notorious example, can perform a “sync” that uses these personal data to contribute to their existing profiles of the person. It is worth noting that this sync would not be possible without the initial bid request.

The overriding commercial incentive for many ad tech companies is to share as much data with as many partners as possible, and to share it with partner or parent companies that run data brokerages. Clearly, releasing personal data into such an environment carries high risk.

Despite this high risk, RTB establishes no control over what happens to these personal data once an SSP or ad exchange broadcasts a “bid request”. Even if bid request traffic is secure, there are no technical measures that prevent the recipient of a bid request from, for example, combining the data with other data to create a profile, or from selling the data on. In other words, there is no data protection.

[6] The IAB is the standards body and trade lobby group of the global advertising technology industry. All significant ad tech companies are members. The IAB has local franchises across the globe. Its standards-setting organisation is IAB TechLab.
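The broadcast described in this section can be pictured with a small sketch. It is an illustration under stated assumptions, not a description of any real SSP, exchange, or DSP: the class `Dsp`, the function `broadcast`, the field names, and all values are hypothetical, and no network traffic is involved. The point it tries to show is that every recipient ends up holding a full copy of the request, whether or not it wins the auction.

```python
from dataclasses import dataclass, field
from typing import Dict, List

BidRequest = Dict[str, object]


@dataclass
class Dsp:
    """Hypothetical demand side platform that receives bid requests."""
    name: str
    max_bid: float
    received: List[BidRequest] = field(default_factory=list)

    def handle(self, request: BidRequest) -> float:
        # The DSP keeps a copy of the request whether or not it wins.
        self.received.append(dict(request))
        return self.max_bid


def broadcast(request: BidRequest, dsps: List[Dsp]) -> str:
    """Send the same bid request to every DSP; return the highest bidder."""
    bids = {dsp.name: dsp.handle(request) for dsp in dsps}
    return max(bids, key=lambda name: bids[name])


# Illustrative values only (reserved example domain and documentation IP).
request: BidRequest = {
    "url": "https://news.example/health/some-article",
    "ip": "203.0.113.7",
    "user_id": "exchange-user-123",
}
dsps = [Dsp("dsp-a", 0.8), Dsp("dsp-b", 1.2), Dsp("dsp-c", 0.5)]
winner = broadcast(request, dsps)
```

In this sketch only one DSP wins, yet all three retain the URL, IP address, and user ID. Nothing in the request itself limits what a recipient does with its copy afterwards.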

IAB Europe’s own documentation for its “GDPR Transparency & Consent Framework” says that a company that receives personal data should only share these data with other companies if it has “a justified basis for relying on that Vendor’s having a legal basis for processing the personal data”.[7] In other words, the industry is adopting a “trust everyone” approach to the protection of very intimate data once they are broadcast.

There are no technical measures in place to adequately protect the data. I note that IAB Europe recently announced that it is developing a tool, in collaboration with an organisation called The Media Trust, that will attempt to determine whether the “consent management platforms” (CMPs) that participate in the IAB Europe Framework are complying with the Framework’s policies. According to IAB Europe’s press release, the tool “validates whether a CMP’s code conforms to the technical specifications and protocols detailed in the IAB Europe Transparency & Consent Framework”.[8]

But the tool, which is currently only in beta, will be inadequate to protect intimate personal data broadcast in bid requests. This is because, even if it could police all web-based data transmission,[9] it would still have no way of knowing whether, for example, a company had set up a continuous server to server transfer of personal data to other companies.

Once the personal data are released in a bid request to a large number of companies, the game is over. In other words, once DSPs receive personal data they can freely trade these personal data with business partners, however they wish.

This is particularly egregious since the data concerned are very likely to be “special categories” of personal data. The personal data in question reveal what a person is watching online, and often reveal specific location. These alone would reveal a person’s sexual orientation, religious belief, political leaning, or ethnicity. In addition, a bid request may carry a “segment ID” that denotes what category of person a data broker or other long-term profiler has discovered a person fits in to.

[7] “IAB Europe Transparency & Consent Framework – Policies”, IAB Europe, 25 April 2018 (URL: s/legal/currenttcfpolicyFINAL.pdf), p. 7.
[8] “IAB Europe Press Release: IAB Europe CMP Validator Helps CMPs Align with Transparency & Consent Framework”, IAB Europe, 12 September 2018 (URL: ).
[9] See “Data compliance”, The Media Trust website (URL: ).

Moreover, the industry concerned is aware of the shortcomings of this approach, and has continued to pursue it regardless.

RTB bid requests do not necessarily need to contain personal data. If all industry actors agreed, and amended the standard under the stewardship of the IAB, then bid requests that contain no personal data could be passed between ad tech companies to target relevant advertising by general context. This, however, would prevent these companies and their business partners from building profiles of people, which would have a revenue implication. The industry is currently finalising a new RTB specification (OpenRTB 3.0), which continues to broadcast personal data without protection in the same way as previous versions of the OpenRTB system. Tables from OpenRTB 3.0 that show the personal data in question are presented for your convenience in Appendix 3.

Online advertising that uses this approach will continue to disseminate details about what every person is reading or watching in a constant broadcast to a large number of companies. These personal data are not protected. This dissemination is continuous, happening on virtually every website, every single time a person loads a page.

This is a widespread and troubling practice. The scope of the industry affects the fundamental rights of virtually every person that uses the Internet in Europe.

Concerns about these practices (news reports, NGO investigations, regulatory consideration etc.)

Survey data over several years demonstrate a general and widespread concern about these practices. The UK Information Commissioner’s Office’s own survey, published in August 2018, reports that 53% of British adults are concerned about “online activity being tracked”.[10]

In 2017, GFK was commissioned by IAB Europe (the AdTech industry’s own trade body) to survey 11,000 people across the EU about their attitudes to online media and advertising. GFK reported that only “20% would be happy for their data to be shared with third parties for advertising purposes”.[11] This tallies closely with a survey that GFK conducted in the United States in 2014, which found that “7 out of 10 Baby Boomers [born after 1969], and 8 out of 10 Pre-Boomers [born before 1969], distrust marketers and advertisers with their data”.[12]

In 2016 a Eurobarometer survey of 26,526 people across the European Union found that:

“Six in ten (60%) respondents have already changed the privacy settings on their Internet browser and four in ten (40%) avoid certain websites because they are worried their online activities are monitored. Over one third (37%) use software that protects them from seeing online adverts and more than a quarter (27%) use software that prevents their online activities from being monitored”.[13]

This corresponds with an earlier Eurobarometer survey of similar scale in 2011, which found that “70% of Europeans are concerned that their personal data held by companies may be used for a purpose other than that for which it was collected”.[14]

The same concerns arise in the United States. In May 2015, the Pew Research Center reported that:

“76% of [United States] adults say they are “not too confident” or “not at all confident” that records of their activity maintained by the online advertisers who place ads on the websites they visit will remain private and secure.”[15]

In fact, respondents were less confident in the online advertising industry keeping personal data about them private than in any other category of data processor, including social media platforms, search engines, and credit card companies. 50% said that no information should be shared with “online advertisers”.[16]

[10] “Information rights strategic plan: trust and confidence”, Harris Interactive for the Information Commissioner’s Office, August 2018, p. 21.
[11] “Europe online: an experience driven by advertising. Summary results”, IAB Europe, September 2017 (URL: 2017/09/EuropeOnline FINAL.pdf), p. 7.
[12] “GFK survey on data privacy and trust: data highlights”, GFK, July 2015, p. 29.
[13] “Eurobarometer: e-Privacy (Eurobarometer 443)”, European Commission, December 2016 (URL: FLASH/surveyKy/2124), p. 5, 36-7.
[14] “Special Eurobarometer 359: attitudes on data protection and electronic identity in the European Union”, European Commission, June 2011, p. 2.
[15] Mary Madden and Lee Rainie, “Americans’ view about data collection and security”, Pew Research Center, May 2015 (URL: .15 FINAL.pdf), p. 7.
[16] Mary Madden and Lee Rainie, “Americans’ view about data collection and security”, Pew Research Center, May 2015 (URL: .15 FINAL.pdf), p. 25.

In a succession of surveys, large majorities express concern about ad tech. The UK’s Royal Statistical Society published research on trust in data and attitudes toward data use and data sharing in 2014, and found that:

“the public showed very little support for “online retailers looking at your past pages and sending you targeted advertisements”, which 71% said should not happen”.[17]

Similar results have appeared in the marketing industry’s own research. RazorFish, an advertising agency, conducted a study of 1,500 people in the UK, US, China, and Brazil in 2014, and found that 77% of respondents thought it was an invasion of privacy when advertising targeted them on mobile.[18]

These concerns are manifest in how people now behave online. The enormous growth of adblocking (to 615 million active devices by the start of 2017)[19] across the globe demonstrates the concern that Internet users have about being tracked and profiled by ad tech companies. One industry commentator has called this the “biggest boycott in history”.[20]

Concern about the misuse of personal data in online behavioural advertising is not confined to the public. Reputable advertisers, who pay for campaigns online, are concerned about it too. In January 2018, the CEO of the World Federation of Advertisers, Stephan Loerke, wrote an opinion piece in AdAge attacking the current system as a “data free-for-all” where “each ad being served involved data that had been touched by up to fifty companies according to programmatic experts Labmatik”.[21]

[17] “The data trust deficit: trust in data and attitudes toward data use and data sharing”, Royal Statistical Society, July 2014, p. 5.
[18] Stephen Lepitak, “Three quarters of mobile users see targeted adverts as invasion of privacy, says Razorfish global research”, The Drum, 30 June 2014 (URL: n-privacy-says-razorfish).
[19] “The state of the blocked web: 2017 global adblock report”, PageFair, January (URL: ir-2017-Adblock-Report.pdf).
[20] Doc Searls, “Beyond ad blocking – the biggest boycott in human history”, Doc Searls Weblog, 28 September 2015.
[21] Stephan Loerke, “GDPR data-privacy rules signal a welcome revolution”, AdAge, 25 January 2018 (URL: -a-revolution/312074/).

Correspondence with the industry on this matter to date

On 16 January 2018 I wrote to representatives of the IAB Europe working group (via IAB UK) to privately give feedback on a private draft of the IAB-led industry response to GDPR. I highlighted the following.

First, bid requests would leak personal data among many parties without any protection. This would infringe Article 5 of the GDPR.

Second, a lack of granularity and informed choice in the IAB’s consent framework arose from the conflation of many separate purposes under a small number of nebulous purposes, and inadequate information. This would render consent invalid.

Although I was thanked for my input, I received no substantive response.

On 21 February 2018, in a video call, I raised concerns about the leakage of personal data in bid requests with the coordinator of the IAB TechLab working group responsible for designing an update to the new OpenRTB specification.

But when the IAB published its GDPR “framework” in March I learned that none of these concerns had been addressed. On 20 March 2018, I published my original feedback in an open letter. This is online at roblems/.

On 4 September 2018 I wrote a detailed letter to the IAB and to IAB TechLab on behalf of Brave, to highlight critical data protection flaws in OpenRTB 3, an update to the RTB specification on which the IAB has solicited feedback. I set out in detail the acute hazard of broadcasting the personal data of a website visitor in bid requests, every time that the visitor loads a page. The letter I sent is available at he-beta-OpenRTB-3.0specification-.pdf.

On 5 September 2018, the IAB responded with a four line email that rejected the matter:

Feedback on the beta OpenRTB 3.0 specification
From: *@iabtechlab.com
Date: Wed, Sep 5, 2018 at 6:46 PM
To: Johnny Ryan *@brave.com, OpenMedia [email protected]
Cc: *@iabtechlab.com, *@iabtechlab.com

Johnny,

Thank you for submitting this feedback to the OpenRTB working group; your feedback has been shared with OpenRTB and Tech Lab leadership. It is (and always has been) the responsibility of companies themselves to be aware of any and all relevant laws and regulations, and to adjust their platforms and practices to be compliant. In this case, any implementer of OpenRTB who should also be complying with GDPR could do so perhaps by using the Transparency and Consent Framework to communicate consumer consent and/or legitimate interest. OpenRTB represents protocol, not policy.

Thank you,
Jennifer & OpenRTB working group

Jennifer Derke
Director of Product, Automation/Programmatic
IAB Tech Lab
San Francisco, CA

APPENDICES

Appendix 1. What personal data are shared in OpenRTB bid requests?

This summary list is incomplete. Other fields may contain personal data.[22]

“Site”[23]
• The specific URL that a visitor is loading, which shows what they are reading or watching.

“Device”[24]
• Operating system and version.
• Browser software and version.
• IP address.
• Device manufacturer, model, and version.
• Height, width, and ratio of screen.
• Whether JavaScript is supported.
• The version of Flash supported by the browser.
• Language settings.
• Carrier / ISP.
• Type of connection, if mobile.
• Network connection type.
• Hardware device ID (hashed).
• MAC address of the device (hashed).

“User”[25]
• An Ad Exchange’s unique personal identifier for the visitor to the website. (This may rotate, but the specification says that it “must be stable long enough to serve reasonably as the basis for frequency capping and retargeting.”[26])
• Advertiser’s “buyeruid”, a unique personal identifier for the data subject.
• The website visitor’s year of birth, if known.
• The website visitor’s gender, if known.
• The website visitor’s interests.
• Additional data about the website visitor, if available from a data broker.[27] (These may include the “segment”[28] category previously decided by the data broker, based on the broker’s previous profiling of this particular person.)

“Geo”[29]
• Location latitude and longitude.
• Zip/postal code.

[22] For example, thirty eight of the data fields in the specification contain the phrase “optional vendor-specific extensions”.
[23] “Object: site” in “AdCOM Specification v1.0, Beta Draft”, IAB TechLab, 24 July 2018 (URL: -site-).
[24] “Object: device” in ibid.
[25] “Object: user” in ibid.
[26] ibid.
[27] “Object: data” in ibid.
[28] “Object: segment” in ibid.
[29] “Object: geo” in ibid.
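To make the field groups in this appendix concrete, the following sketch arranges them as a nested structure and lists every leaf field by its dotted path. It is illustrative only: apart from “buyeruid”, which is quoted above, the key names are simplified stand-ins chosen for this example and are not verified against the AdCOM or OpenRTB specifications, and every value is invented.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical bid request covering the four groups listed in this appendix.
example_bid_request: Dict[str, Dict[str, object]] = {
    "site": {"page": "https://news.example/health/some-article"},
    "device": {
        "os": "ExampleOS 1.0",
        "browser": "ExampleBrowser 2.0",
        "ip": "203.0.113.7",
        "language": "en",
        "carrier": "Example Telecom",
    },
    "user": {
        "id": "exchange-user-123",
        "buyeruid": "buyer-user-456",
        "yob": 1980,
        "gender": "F",
    },
    "geo": {"lat": 53.35, "lon": -6.26, "zip": "D02"},
}


def leaf_fields(obj: Dict[str, object], prefix: str = "") -> Iterator[Tuple[str, object]]:
    """Yield (dotted path, value) pairs for every non-dict value."""
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            yield from leaf_fields(value, prefix=path + ".")
        else:
            yield path, value


fields = dict(leaf_fields(example_bid_request))
```

Iterating over `fields` gives one entry per item of data that a recipient of this hypothetical request would hold, for example `site.page`, `device.ip`, `user.buyeruid`, and `geo.lat`.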

Appendix 2. What personal data are shared in Google’s proprietary bid requests?

“Publisher”[30]
• The specific URL that a visitor is loading, which shows what they are reading or watching. Note that sometimes publishers using Google’s system prevent their URL from being shared.[31]

“Device”
• Operating system and version.
• Browser software and version (some data may be partially redacted).[32]
• Device manufacturer, model, and version.
• Height, width, and ratio of screen.
• Language settings.
• Carrier.
• Type of connection, if mobile.
• Hardware device IDs[33] (in “some circumstances”, Google may impose “special constraints” on this. These constraints are not defined).[34]

“User”
• The Google ID of the website visitor. (May be subject to some form of undefined “special constraints” in “some circumstances”.)[35]
• Google’s “Cookie Match Service” results, which enable a recipient to determine if the website visitor is a person they already have a profile of, and to combine their existing data with new data in the bid request.[36] (May be subject to some form of undefined “special constraints” in “some circumstances”.)[37]
• The website visitor’s interests.
• Whether the website visitor is present on a particular “user list” of targeted people (which may be a category previously decided by an advertiser, or the data broker they acquired the data from, based on the broker’s previous profiling of this particular person).

“Location”
• Location latitude and longitude.
• Zip/postal code, or postal code prefix if a full post code is unavailable.
• Whether the user is present within a small “hyper local” area.

[30] All items in this appendix are drawn from “Authorized Buyers Real-Time Bidding Proto”, Google, 5 September 2018 (URL: b/realtime-biddingguide).
[31] “Set your mobile app inventory to Anonymous or Branded in Ad Exchange”, Google Ad Manager Help (URL: 9?hl en).
[32] “Certain data may be redacted or replaced”, see “user agent” in “Authorized Buyers Real-Time Bidding Proto”, Google, 5 September 2018 (URL: /realtime-bidding-guide).
[33] Some fields (such as advertising id) are sent encrypted, but recipients can decrypt using keys that Google gives them when they set up their accounts, or are sent using standard encrypted SSL web connections. See “Decrypt Advertising ID”, Authorized Buyers, Google (URL: rs/rtb/response-guide/decrypt-advertising-id).
[34] “In some circumstances there are special constraints on what can be done with user data for an ad request”. Google vaguely states that in such a case, “user-related data will not be sent unfettered”. User ID, Android or Apple device advertising ID, and “cookie match” data can be affected. See “User Data Treatments”, Authorized Buyers, Google (URL: rs/rtb/user data treatments).
[35] ibid.
[36] “Cookie Matching”, Google, 5 September 2018 (URL: /cookie-guide?hl en).
[37] See note 36.
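The effect of a cookie match, as summarised in this appendix, can be sketched from the recipient’s side. This is a simplified illustration and not Google’s API or data format: `match_table`, `profiles`, `record_bid_request`, and all identifiers are hypothetical names invented for the example. It assumes only what the appendix states, namely that a match lets a recipient recognise a visitor it already has a profile of and add the new bid request data to that profile.

```python
from typing import Dict, List, Optional

# Hypothetical table a recipient built during earlier cookie matching:
# the exchange's ID for a visitor -> the recipient's own profile ID.
match_table: Dict[str, str] = {"exchange-id-111": "buyer-profile-A"}

# Hypothetical profiles the recipient already holds.
profiles: Dict[str, Dict[str, List[str]]] = {
    "buyer-profile-A": {"pages_seen": ["https://shop.example/shoes"]},
}


def record_bid_request(exchange_user_id: str, page_url: str) -> Optional[str]:
    """Add the page in a bid request to an existing profile, if matched.

    Returns the recipient's profile ID, or None when the visitor is unknown.
    """
    profile_id = match_table.get(exchange_user_id)
    if profile_id is None:
        return None
    profiles[profile_id]["pages_seen"].append(page_url)
    return profile_id


matched = record_bid_request("exchange-id-111", "https://news.example/health/some-article")
unmatched = record_bid_request("exchange-id-999", "https://news.example/sport")
```

In this sketch the matched visitor’s profile grows with each bid request received, which is the long-term profile building that the report is concerned with.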

Appendix 3. Selected data tables from OpenRTB bid request specification documents

The following tables are copied from AdCOM specification v1, which is part of the OpenRTB 3.0 specification.[38] This defines what data can be included in a bid request. Only selected tables relevant to website bid requests are included here. URLs of the specific part of the specification from where the tables are taken are presented above each table.

[Tables not reproduced. Surviving labels and URL fragments, in order: md#object--site-; ect--user-; ment-; Device; ce-; Location]

[38] “AdCOM Specification v1.0, Beta Draft”, IAB TechLab, 24 July 2018 (URL: au/AdCOM/blob/master/AdCOM%20BETA%201.0.md).

Appendix 4. Selected data tables from Google (“Authorised Buyer”) RTB bid request specification documents

The following tables are copied from Google’s RTB documentation.[39] This defines what data can be included in a bid request. Only selected tables relevant to website bid requests are included here. URLs of the specific part of the specification from where the tables are taken are presented above each table.

[Tables not reproduced. Surviving labels, in order: User; Publisher; Location; Device]

[39] “Authorized Buyers Real-Time Bidding Proto”, Google, 5 September 2018 (URL: rs/rtb/realtime-bidding-guide).
