Criteo, a French surveillance marketing giant

Aggressive tracking across all your devices, a major security flaw, doublespeak and a lack of consent, yet Criteo still enjoys impunity

Published by Pixel de Tracking on May 22, 2020

Criteo, an advertising giant whose business model is based on your surveillance

Little known to the general public, Criteo is a French success story that has become the world leader in retargeting. According to LinkedIn, the company has more than 3,000 employees, and it has been listed on NASDAQ since October 2013.

How does Criteo work? You only need to visit a partner e-commerce site (for example Fnac) and then a media site (for example Lemonde.fr) to be bombarded with advertisements showing the products you just viewed, along with suggestions:

example

Retargeting advertisements such as those sold by Criteo are the most intrusive on the web: they make you realize that your purchasing habits hold no secrets for advertisers. And if we look at the typology of advertising players, Criteo is among those least respectful of your privacy:

  • Criteo is an ad network: the more it knows about you, the more money it earns.
  • The data collected by Criteo is very personal: the products you view and the purchases you make.
  • Unlike an e-retailer such as Amazon, which has a storefront and a direct relationship with its customers, Criteo operates out of sight. You don't know you're being tracked, and you didn't choose it.

Revenues put at risk by web browsers

So you might say, Criteo is an old-school ad company, one that grew when mobile wasn't yet dominant, when retargeting was solely based on third-party cookies.

Even if you don't use an adblocker, there is a good chance that your browser is already blocking Criteo tracking: Safari (with Intelligent Tracking Prevention), Firefox, Brave and even Edge have recently taken strong action against trackers. Chrome is very late, but it will ban third-party cookies in less than 2 years. And you can always delete third-party cookies via your browser settings to start afresh against adtech companies.

These browser measures protect you against invasive tracking by adtech companies, which can no longer correctly identify you, as Criteo explains to its investors (slide 7):

Criteo Browsers

The protections put in place by browsers are excellent news for users, but they are an existential threat to Criteo, whose R&D is therefore investing in workarounds.

A history of bypassing browser protections

Criteo doesn't say so in its investor presentation, but it has experience circumventing browser protections. If we go back to the first release of Safari's Intelligent Tracking Prevention in September 2017, here is what Criteo told investors in November 2017:

Apple’s Intelligent Tracking Prevention feature, or ITP, was released on mobile on September 19, 2017. We believe our solution for Safari users currently allows us to mitigate about half of the potential impact from ITP. In the third quarter, ITP had a minimal net negative impact on our Revenue ex-TAC of less than $1 million. Given our expectations of the roll out of Apple’s iOS11 and our coverage of Safari users, we expect ITP to have a net negative impact on our Revenue ex-TAC in the fourth quarter of between 8% and 10% relative to our base case projections for the quarter. We will continue to improve and deploy our solution for Safari users over the coming quarters.

Criteo had already implemented a workaround, explained in this article:

What we’ve developed is a privacy-friendly solution, which is connecting on a [non-cookie] identifier that allows the transfer of information between websites and our servers

Note the "privacy-friendly" (!). Apple reacted quickly, however, and shipped an ITP update in early December 2017 that made the Criteo bypass inoperable. Here is Criteo's new communication to investors in December 2017:

Earlier this month, Apple launched a new version of its mobile operating system, iOS 11.2, which disables the solution that some companies in the advertising ecosystem, including Criteo, currently use to reach Safari users. As a result, we believe the projected 9%-13% ITP net negative impact on Criteo’s 2018 Revenue ex-TAC relative to our pre-ITP base case projections, communicated on November 1, 2017, is no longer valid. We are focused on developing an alternative sustainable solution for the long term, built on our best-in-class user privacy standards, aligning the interests of Apple users, publishers and advertisers. This solution is still under development and its effectiveness cannot be assessed at this early stage. Should it not mitigate any ITP impact, we believe the ITP net negative impact on Criteo’s 2018 Revenue ex-TAC, relative to our pre-ITP base case projections, would become approximately 22%.

Apple deserves credit for regularly improving ITP (the cat-and-mouse game continued after 2017), which is now at version 2.3. It is nevertheless surprising that Criteo has never been sanctioned for these circumventions, and that it communicates about them without shame.

Regulations to better protect the privacy of Internet users

Regulations are also evolving to better protect the privacy of Internet users (GDPR and ePrivacy in Europe, CCPA in California), and Criteo finds itself increasingly under pressure. Privacy International filed a complaint against Criteo, Quantcast and Tapad in November 2018 for violation of the GDPR, and the CNIL began investigating the complaint against Criteo in March 2020. Criteo also mentions these regulations in its “Identification” deck for investors (slide 8), without dwelling on the legal risks:

Regulation

The stock market is not fooled: although Criteo is still valued at 600 million euros, its value has fallen sharply over the last 5 years:

Criteo Nasdaq

Criteo has understood these two trends (technical, via browsers, and regulatory): it is now turning to a much more insidious and generalized form of surveillance via the “Criteo Shopper Graph”, while engaging in doublespeak about respect for privacy.

Criteo Shopper Graph: the massive database of personal data leaked to Criteo

Still in its “Identification” presentation for investors (slide 19), Criteo clearly states its goal: to identify you without the need for third-party cookies. Today, 50% of its business reportedly still depends on third-party cookies:

Third-party cookies

On its website, Criteo highlights “Criteo Shopper Graph”, its user database. How many people are in this database? Here are some figures put forward by Criteo to present the “Criteo Shopper Graph”:

By default, Criteo does not mix the personal data collected from one of its clients with that of its other clients. But a customer who wants access to the Shopper Graph must accept this mixing. For example, if Fnac wishes to access the Shopper Graph to improve the profitability of its advertising campaigns, it has to accept that Criteo mixes your personal data collected on the Fnac site with your personal data collected from other Criteo customers who have subscribed to the Shopper Graph.

And the offer works well: Criteo indicates that 75% of its customers participate:

Criteo Shopper Graph participation

How does Criteo build this massive database of personal data? Criteo explains on its website that the Shopper Graph aggregates 3 data sources:

Criteo Shopper Graph

  • The Identity Graph lets Criteo recognize you on all your devices.
  • The Interest Map lets Criteo know your various purchasing intentions.
  • Measurement Data lets Criteo collect details of your various purchases.

Let's take a closer look at the Identity Graph, Criteo's monitoring engine.

The Identity Graph, or how Criteo recognizes you on all your devices

From the user's point of view, the first step is identifying you when you browse e-commerce sites or apps. Criteo collects the different identifiers of the same user on each device used, and is then able to link them together to determine that they belong to one and the same person:

Identifiers

We see here that Criteo collects several types of identifiers and associates them with the Criteo id to better track you:

  • A cookie identifier for each of your web or mobile browsers.
  • A CRM id for each customer, when you are logged in to that customer's site or app.
  • A mobile identifier (IDFA on Apple, Android Advertising Id on Android).
  • The hash of your e-mail address (a unique fingerprint that does not by itself let Criteo recover the e-mail address), when the customer or Criteo partner passes on the information.
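To see why a hashed e-mail address still works as a persistent identifier, here is a minimal sketch in Python. It assumes the address is trimmed and lower-cased before being hashed with SHA-256; the normalization step, the function name and the sample addresses are assumptions for illustration, not Criteo's documented pipeline. Two sites that hash the same address obtain the same value, so the hash can be joined across sites without the address itself ever being exchanged.

```python
import hashlib


def hash_email(email: str) -> str:
    """Return the SHA-256 hex digest of a normalized e-mail address.

    The normalization (trim + lower-case) is an assumption for illustration.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# The same person logs in on two unrelated sites (hypothetical addresses).
hash_seen_by_site_a = hash_email("Jane.Doe@example.com")
hash_seen_by_site_b = hash_email("  jane.doe@example.com ")
hash_of_someone_else = hash_email("john.doe@example.com")

# Same input, same hash: the value identifies the person on every site.
same_person = hash_seen_by_site_a == hash_seen_by_site_b
different_person = hash_seen_by_site_a != hash_of_someone_else
```

The hash only changes if you change your e-mail address, so clearing cookies or resetting a mobile advertising identifier has no effect on it.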

This tracking is particularly invasive and disrespectful of your privacy because Criteo manages to link your different identifiers together (hence the "Graph"), and because some of these identifiers are difficult (mobile identifiers) or almost impossible (CRM id, hashed e-mail address) to reset. One example among others: the Fnac application leaks the hash of your e-mail address to Criteo and provides no way to opt out of this tracking.

Once Criteo has recognized you, the second step is to collect all your purchasing intentions (the Interest Map):

Purchase intentions

Regardless of the device or e-commerce site you use, Criteo collects:

  • Products viewed.
  • Products added to cart.
  • Products purchased.

Criteo customers can also transmit their own customer list to Criteo, via email addresses or user identifiers such as IDFA, Android Advertising ID or Criteo Id. Criteo then indicates the percentage of users in the list who already belong to the Shopper Graph.

Then, the 3rd step is to recognize you in order to target you with those famously intrusive advertisements when you browse a media site or app:

Editor - advertising

Note here that Criteo has privileged partnerships with numerous media outlets, which often give it preferential access to advertising inventory (see in particular its Direct Bidder) and also let it collect permanent identifiers (e-mail addresses, logins). For media with which it has no privileged partnership, Criteo buys via RTB (programmatic) but must pay a fee to the SSPs (intermediaries).

The 4th step is to measure whether you click on the ad (Criteo pays publishers for each ad display, and is paid by the advertiser not per purchase but per click on the ad), then to measure whether you buy from the advertiser (Measurement Data). But tracking doesn't stop at your "online" behavior: Criteo can also collect your "offline" purchases if its clients pass the information on:

Offline purchases

Criteo does not mention it in its presentation of the Shopper Graph, but its “Identification” presentation for investors tells us that it also collects your personal data via partnerships (slide 27):

Data sources

As you can see, Criteo reports partnerships with Liveramp, Oracle and publishers in order to retrieve identification data. You can learn more about the partnership with Liveramp in this commercial video. On its website, Criteo also mentions other partnerships that allow it to link your different identifiers:

Criteo partner identifiers

Criteo leaks the identifiers of its Shopper Graph to its customers

Criteo not only tracks 75% of the world's buyers, it also offers the customers of its Shopper Graph (those who agree to share your personal data) the ability to retrieve, free of charge and for their own use, user identifiers from this graph (the Criteo Ids). Here are the options offered by Criteo to leak personal data to its customers:

Share Criteo Id

To summarize:

  • From your smartphone, you saw a lamp on the website of the e-Retailer ABC.
  • You return to the ABC e-Retailer website, but this time on your laptop.

Criteo recognizes you even if you have not logged in to the ABC e-Retailer website, because you have already logged in to other Criteo partner sites on both your smartphone and your laptop.

And Criteo allows e-Retailer ABC to recognize you as well.
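The cross-device recognition summarized above can be thought of as a graph problem: each identifier is a node, each time two identifiers are observed together is an edge, and a connected group of identifiers is one person. The following minimal Python sketch uses a union-find structure to show the idea. The class, the identifier names and the observations are hypothetical, and nothing here is Criteo's actual implementation.

```python
class IdentityGraph:
    """Groups identifiers that were observed together (union-find)."""

    def __init__(self) -> None:
        self._parent = {}

    def _find(self, identifier: str) -> str:
        # Register unknown identifiers as their own group.
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[identifier] != root:
            self._parent[identifier], identifier = root, self._parent[identifier]
        return root

    def link(self, first: str, second: str) -> None:
        """Record that two identifiers were seen in the same session."""
        self._parent[self._find(first)] = self._find(second)

    def same_person(self, first: str, second: str) -> bool:
        return self._find(first) == self._find(second)


graph = IdentityGraph()
# Smartphone: logged in on one partner site, so cookie and e-mail hash are seen together.
graph.link("cookie:smartphone", "email_hash:abc123")
# Laptop: logged in on another partner site with the same e-mail address.
graph.link("cookie:laptop", "email_hash:abc123")

# e-Retailer ABC never saw a login, yet both browsers resolve to one person.
recognized = graph.same_person("cookie:smartphone", "cookie:laptop")
unrelated = graph.same_person("cookie:smartphone", "cookie:stranger")
```

A single login on any participating site is enough to merge a browser into the group, which is why one persistent identifier such as an e-mail hash matters so much.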

Criteo once again bypasses your browser's protections... and introduces a security vulnerability

Except for Chrome, browsers take steps to protect you from trackers, and many Internet users also install adblockers. We have already seen that Criteo has long experience of circumventing the measures taken by browsers, and it keeps working in this direction so that it can go on tracking you against your will.

One of its latest initiatives? Encouraging its advertiser and publisher clients to delegate a subdomain to Criteo by setting up a CNAME (on this subject, read the detailed article by Romain Cointepas, co-founder of NextDNS, an app that effectively fights this kind of tracking). A CNAME record declares that a subdomain is an alias for another domain.

This is an old technique used by certain analytics tools. It is currently enjoying renewed interest, particularly from French companies such as Eulerian, AT Internet and, therefore, Criteo. Here is the Criteo documentation that lets advertisers and publishers set up a CNAME. And here are examples of customers who have implemented this delegation:

  • The domain name xgctpf.allocine.fr is an alias to dnsdelegation.io, which itself points towards gum.criteo.com
  • The domain name ddhhbh.alfaromeo.fr is also an alias to dnsdelegation.io which also points to gum.criteo.com

Note the subdomains with random character strings: Criteo can thus track you in a hidden manner, and it is difficult for adblockers to update the domains to be blocked (the publisher can easily decide to change the CNAME from one week to the next). Firefox allows extensions to do CNAME resolution, which allows uBlock Origin on Firefox to properly block these calls, but other browsers do not allow this.
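The idea behind CNAME resolution in a blocker can be sketched as follows. This minimal Python example follows the alias chain of a hostname and flags the request if any hop belongs to a known tracker domain. The DNS records are hard-coded from the two examples above rather than queried live, every hop is modeled as a plain alias for simplicity, and the blocklist and function names are hypothetical. It is not how uBlock Origin is actually implemented.

```python
from typing import List

# Records taken from the examples above (hard-coded, not queried live).
# Assumption: each hop is modeled as a simple alias.
CNAME_RECORDS = {
    "xgctpf.allocine.fr": "dnsdelegation.io",
    "ddhhbh.alfaromeo.fr": "dnsdelegation.io",
    "dnsdelegation.io": "gum.criteo.com",
}

# Hypothetical blocklist of tracker domains.
TRACKER_DOMAINS = {"criteo.com"}


def resolve_chain(hostname: str, max_hops: int = 10) -> List[str]:
    """Return the hostname followed by every alias it points to."""
    chain = [hostname]
    # max_hops guards against a misconfigured, cyclic set of records.
    while chain[-1] in CNAME_RECORDS and len(chain) <= max_hops:
        chain.append(CNAME_RECORDS[chain[-1]])
    return chain


def is_tracker(hostname: str) -> bool:
    """True if the hostname or any of its aliases is on the blocklist."""
    for name in resolve_chain(hostname):
        if any(name == d or name.endswith("." + d) for d in TRACKER_DOMAINS):
            return True
    return False


chain = resolve_chain("xgctpf.allocine.fr")
blocked = is_tracker("ddhhbh.alfaromeo.fr")
first_party = is_tracker("www.allocine.fr")
```

A filter list that only matches the visible hostname would miss both subdomains, and would miss them again each time the random prefix changes. Matching on the end of the chain does not have that weakness.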

Criteo thus regains access to most users with an adblocker (the CNAME topic is hotly debated in the communities working on adblockers) or with a browser configured to block third-party cookies, which allows it to:

  • Measure all visits and conversions on the advertiser's site.
  • Retarget you on one of your other devices (if you do not have an adblocker on all your devices and the advertiser sends Criteo a permanent identifier such as your e-mail address).
  • Theoretically, identify you across different sites via fingerprinting techniques, using your IP address or even additional information collected from your browser. I say theoretically because I have no proof that Criteo uses fingerprinting techniques.

One more word on fingerprinting: Criteo is clearly interested in the subject, as shown by this article from the Criteo R&D blog, which details a research paper on IP-address-based tracking:

For the past few years, web browsers have increasingly limited the persistence of identifiers (cookies), making user tracking more difficult. A revealing example is Safari’s Intelligent Tracking Prevention. This paper presents a clever way to overcome the lack of persistent identifiers without infringing on user privacy, that is without using browser fingerprinting. It consists of using community detection in the Device Graph to detect stable cohorts (person or household level grouping). It is then possible to find the IP addresses that are associated with the cohort over time and thus defining a persistent ID based on these IP addresses. This technique is called Graph backfilling. This technique reaches its limits when many people use the same IP or in the case of dynamic IPs. This is why it works like a charm in the US, but is more difficult to apply in China.
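As a rough illustration of the idea described in this quote, here is a deliberately simplified Python sketch. It assumes the cohorts (households) have already been produced by some community-detection step, then maps each IP address to the cohort seen behind it and uses that mapping to attribute an unknown device to a cohort. The data, the names and the rule of discarding IPs shared by several cohorts are all assumptions made for illustration. The paper's actual method is not reproduced here.

```python
from collections import defaultdict

# Assumed output of community detection: device identifier -> cohort.
DEVICE_COHORT = {
    "cookie:A1": "household-1",
    "idfa:A2": "household-1",
    "cookie:B1": "household-2",
}

# Hypothetical sightings of known devices: (device, IP address).
SIGHTINGS = [
    ("cookie:A1", "203.0.113.10"),
    ("idfa:A2", "203.0.113.10"),
    ("cookie:B1", "198.51.100.7"),
    ("cookie:A1", "192.0.2.99"),  # shared IP, e.g. an office network
    ("cookie:B1", "192.0.2.99"),
]


def build_ip_index(sightings, device_cohort):
    """Map each IP to a cohort, keeping only IPs seen with a single cohort."""
    cohorts_per_ip = defaultdict(set)
    for device, ip in sightings:
        cohorts_per_ip[ip].add(device_cohort[device])
    return {
        ip: next(iter(cohorts))
        for ip, cohorts in cohorts_per_ip.items()
        if len(cohorts) == 1
    }


def backfill(ip, ip_index):
    """Guess the cohort of an unknown device from its IP (None if ambiguous)."""
    return ip_index.get(ip)


IP_INDEX = build_ip_index(SIGHTINGS, DEVICE_COHORT)
home_guess = backfill("203.0.113.10", IP_INDEX)
shared_guess = backfill("192.0.2.99", IP_INDEX)
```

The shared address yields no guess, which mirrors the limitation stated in the quote: the approach breaks down when many people use the same IP.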

IP monitoring

CNAME is one of the techniques (along with logins and e-mails) that lets Criteo boast to investors (slide 15) of having “1st party” access to the websites visited by Internet users (no third-party cookies are needed to collect your personal data):

Criteo 1st party

Here is the email sent by Criteo to its customers and partners (via f_to_k):

Email Criteo CNAME

This email seems innocuous, but the CNAME technique is much less so. It introduces a security vulnerability if the partner site has not taken precautions: Criteo can then read the cookies set on the partner site's domain. Let's study an example by browsing allocine.fr with Safari and the Charles Proxy tool:

Criteo Allocine Cookies

As we can see, xgctpf.allocine.fr (aka Criteo) receives the cookies set on allocine.fr (which are not intended for it):

  • Identifier cookies placed by Google Analytics: _ga, _gat, _gid, _gads
  • A Facebook ID cookie: _fbp
  • Cookies storing your geolocation: geocode, geolevel1, geolevel2, geolevel3
  • Your authentication cookies at Allocine (I was logged in): ACAUTH, ACCT, ACID, GraphToken

Via these various "stolen" identifiers, Criteo can enrich its Shopper Graph. But that is not the most serious problem: via your authentication cookies, Criteo can log in to your own Allociné account! Here are the steps to verify this:

  • Retrieve authentication cookies from Allocine via Safari and Charles Proxy.
  • Go to allocine.fr from Chrome, make sure you are logged out.
  • Use the Chrome extension EditThisCookie to create your authentication cookies.
  • Refresh the page: you are logged in!

Criteo connects

Bonus: Allociné does not offer a secure (HTTPS) version of its website, so Criteo is not the only party that can intercept your authentication cookies. Your ISP and the machines located between your device and Allociné's servers can also log in as you.

This serious security flaw is probably present on many partner sites, because Criteo has already managed to convince more than 10,000 partners to install a CNAME (Allociné is a benign example; most partners are e-commerce sites, which can hold much more sensitive information such as your credit card).

On this subject, read this conversation on the Reddit channel r/adops: adding a CNAME is far from as trivial as the Criteo email suggests. How can a site avoid this information leak while using the Criteo CNAME? By not allowing subdomains to read the domain's cookies (see for example the Mozilla documentation, “Cookie Scope”).
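The difference between the two cookie scopes can be sketched in a few lines of Python. The matching rule below is a simplified version of the domain-matching rule that browsers apply (RFC 6265): a cookie set with a Domain attribute is sent to that domain and all its subdomains, whereas a cookie set without one (a "host-only" cookie) is sent only to the exact host that set it. The cookie name comes from the capture above; the host assumed to have set it, the class and the function names are assumptions for illustration, and Path, Secure and other attributes are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cookie:
    name: str
    set_by_host: str  # host that sent the Set-Cookie header
    domain_attribute: Optional[str] = None  # value of Domain=..., if any


def is_sent_to(cookie: Cookie, request_host: str) -> bool:
    """Simplified RFC 6265 matching: would the browser attach this cookie?"""
    if cookie.domain_attribute is None:
        # Host-only cookie: an exact host match is required.
        return request_host == cookie.set_by_host
    domain = cookie.domain_attribute.lstrip(".").lower()
    host = request_host.lower()
    return host == domain or host.endswith("." + domain)


# Leaky: authentication cookie scoped to the whole domain.
leaky = Cookie("ACAUTH", "www.allocine.fr", domain_attribute="allocine.fr")
# Safer: the same cookie without a Domain attribute (host-only).
host_only = Cookie("ACAUTH", "www.allocine.fr")

criteo_alias = "xgctpf.allocine.fr"
leaks_to_alias = is_sent_to(leaky, criteo_alias)
host_only_leaks = is_sent_to(host_only, criteo_alias)
```

The trade-off is that a host-only cookie is no longer shared with the site's other legitimate subdomains either, so a site that relies on domain-wide cookies may have to adapt how it handles logins.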

Nevertheless, Criteo still operates with complete impunity: the CNAME technique was the subject of an article in Journal du Net in which Criteo, ID5 and Prisma Media justify the practice. The programmatic manager of Prisma Media even accuses browsers of overstepping their role by wanting to protect the privacy of Internet users:

It is frankly problematic that a browser decides what is fair or not by claiming to protect users' privacy, when that is the role of consent management. If the person does not give their consent, of course we do not place a cookie.

For Criteo too, the browser should not "make decisions" in place of Internet users, who must be able to consent (or not) to being surveilled for advertising. But how does Criteo itself measure up to the requirement to obtain user consent before tracking?

For collecting consent, Criteo relies on partner advertisers and publishers

The Criteo page explaining its use of personal data gives a good idea of the extent of the personal data collected and the uses to which it is put. The legal basis put forward by Criteo to justify this appropriation of your personal data under the GDPR is consent. But for Criteo, obtaining that consent is the sole responsibility of its partners, advertisers and publishers:

Criteo's processing operations comply with current regulations in countries requiring user consent for the use of cookies or any other similar technology. This consent is collected on the websites and mobile applications of Advertisers and Publishers.

Criteo emphasizes this point in its privacy policy:

Note that the use of Criteo technologies is governed by the privacy policies published on the websites and mobile applications of our partners. They are required to provide complete and appropriate information and, to the extent required by law, to obtain your consent before disclosing any personal data about you.

Criteo even indicates that it contractually requires its partner advertisers and publishers to obtain user consent before implementing trackers:

Criteo contractually requires that Advertisers and Publishers respect its general Editorial Charter and its Charter dedicated to partners, as well as the various regulations in force on the protection of personal data, in particular the GDPR. By using Criteo services, they undertake, to the extent that regulations require it: [...] to obtain the consent of users before implementing cookies or other similar technologies, for the purpose of serving personalized ads.

However, Criteo does nothing to enforce this contract, as can be seen with the example of Fnac. Obtaining real consent from users before identifying them would amount to putting Criteo out of business, hence its doublespeak.

Criteo twists the definition of consent

Criteo is no stranger to doublespeak. On the one hand, it explains to users in its privacy policy that it respects the notion of consent and contractually requires its partners to apply it. On the other hand, it tells its partners that they do not need to obtain explicit consent from users.

In its "Criteo Privacy Guidelines for Customers and Publisher Partners", Criteo advises its clients on the information clauses to include in their privacy policies. There, Criteo once again tells customers that they are obliged to obtain consent from users within the EU. But what does consent mean for Criteo? Its definition is closer to a simple obligation to inform:

In particular, according to EU laws, the request for consent is considered valid when: Users are informed about the use of cookies and technologies other than cookies by Criteo for the purpose of offering targeted advertising when giving their consent.

Criteo even goes so far as to suggest exploiting the loophole left open by the CNIL, which still allows almost the entire French web to consider that continuing to browse a site (simply scrolling or clicking through to a new page) constitutes consent:

Suggested cookie notice for countries where consent is required: By continuing to browse our site, you accept the use of cookies and non-cookie technologies to provide you with personalized content and advertising across the sites.

As stated by the European Data Protection Board, an independent European body whose objectives are to ensure the consistent application of the GDPR and to promote cooperation between EU data protection authorities, consent must be clear, affirmative and unambiguous. Its latest guidelines were published on May 4; they state, for example, that scrolling does not constitute consent:

scroll GDPR consent

In an article titled "GDPR: Criteo is ready to take on the challenge", Criteo details its vision of consent, considering that it does not need to obtain explicit consent from the user:

The GDPR establishes a clear distinction between unambiguous consent and explicit consent. Explicit consent implies an express choice on the part of the user. This applies, for example, to the collection of sensitive data such as race, religion, sexual orientation, political affiliation and health. In contrast, online tracking devices (e.g. cookies) are categorized as simple personal data. Consequently, according to the new regulation, an express opt-in is not required for classic retargeting cookies, which do not collect sensitive data.

Except that this behavior violates the GDPR, which requires clear, affirmative and unambiguous consent from the user. The CNIL must also bring its doctrine into line with the GDPR: it started the process in July 2019, but it is now using the coronavirus crisis as an excuse to pause this necessary adaptation.

Criteo even wrote a GDPR white paper detailing its fallacious arguments. One point remains: Criteo could not survive if it relied on genuine user consent. It therefore feels obliged to twist the definition of consent.

Criteo's lies in its privacy policy

In its privacy policy, Criteo indicates that it does not receive personal data:

No personal data (surname, first name, postal address, unencrypted email address or other) is communicated to us.

This is false: Criteo customers can send your e-mail address to Criteo in plain text, as indicated on this support page on how to “create an audience”:

A CRM email address file containing full addresses, email addresses encrypted by MD5 or SHA256 hash of MD5 (full addresses>MD5>SHA256).

If the client sends e-mail addresses in the clear, Criteo says it encrypts them before storing them, so you have to take its word for it.

Still in its privacy policy, Criteo admits that the data it presents as non-personal (pseudonymous data) is considered personal data in the European Union and California:

However, this information is considered personal data under the EU General Data Protection Regulation (GDPR) as well as the California Consumer Privacy Act (CCPA).

Another lie: in the “Criteo Commitments” section, we can read:

Criteo ads do not in any way involve collecting the following data: [...] persistent identifiers, such as identifiers of the devices you use (UDID, MAC address, etc.)

Except that the hash of your email address is indeed a persistent identifier. Also, this "commitment" is in contradiction with the “Online identification at Criteo” deck for investors. On slide 16, Criteo indicates that 96% of the “identities” (= users) in its “Identity Graph” contain at least one persistent identifier:

Identity Graph Criteo - persistent identifiers

On slide 22, Criteo also indicates that it uses third parties to obtain the advantages of persistent identifiers, and thus no longer be dependent on cookies:

persistent identifier partners

Criteo's duplicity is blatant: on the one hand, it commits to Internet users not to collect persistent identifiers "in any way"; on the other hand, it helps partner advertisers and publishers send it these persistent identifiers (recall the support page showing how to send Criteo a list of e-mails) and promotes these same persistent identifiers to investors.

Disabling Criteo services on mobile apps does not work

As seen with the example of Fnac on iOS, Criteo continues to collect a hash of your email address, even when you have disabled ad tracking:

Limit advertising tracking

Yet its page "Disable Criteo services on mobile apps" indicates:

Criteo withdrawal consent

Here we find a double lie from Criteo: my e-mail (or even a hash of my email) is a persistent identifier, yet Criteo collects it (first lie), even if I deactivate ad tracking (second lie).

Repeated and documented abuses, but still no sanction

Criteo's unethical practices have long been detailed and denounced. Here are a few elements:

  • Privacy International's complaint against Criteo, Quantcast and Tapad for violation of the GDPR dates from November 2018; it took the CNIL 16 months to react and open an investigation.
  • The CNAME technique is just one of several developed by Criteo to circumvent the protections put in place by browsers; the EFF has documented Criteo's previous attempts to circumvent Safari ITP.
  • These techniques were detailed by Gotham City Research LLC in a dedicated report.
  • A further report from Gotham City Research LLC denounced widespread fraud on the inventory managed by Criteo. Note that these reports were not disinterested, because Gotham was openly betting on a fall in Criteo's share price; the fact remains that the practices denounced were proven.
  • Criteo is part of the Acceptable Ads committee, an initiative created by Adblock Plus that allows Criteo to display advertisements even if you have installed Adblock Plus or other adblockers supporting the initiative (these advertisements are "adapted": they take up a little less space than traditional Criteo advertisements). This is shocking because Criteo ads are particularly intrusive, but “Acceptable Ads” is “only” concerned with visual pollution.

Yet nothing happens. Given the CNIL's track record, it is reasonable to doubt that a French digital champion with significant economic and political weight will face real sanctions, as demonstrated by Bruno Le Maire's visit for the 1st anniversary of the Criteo artificial intelligence laboratory last October. For Bruno Le Maire, Criteo is "one of the great French successes of the last 15 years":

Tweet Bruno Le Maire - Criteo

What to do then? Unfortunately, while waiting for real political ambition, the solutions remain individual and technical: use an adblocker such as uBlock Origin combined with Firefox on the web (or other privacy-friendly browsers such as Brave and Safari), go through apps such as DNSCloak, Adguard or NextDNS on iOS, or even install Pi-hole on a Raspberry Pi if you have a taste for technology.