
Big tech has your data. Come & get it!

By Thor Galle on Feb 27, 2020



Our personal data is extremely valuable, because it contains thousands of little facts about our lives: a song streamed on the ride home last month, the complete itinerary of a city trip to Berlin in 2017, or even the brand of chips we bought most in 2012. We probably can't recall any of this ourselves, but it is exactly the type of information that big (tech) companies record and analyze today.

Over the last 15 years, we've gradually come to accept that we're making a deal. Big tech gives us access to the social and professional networks we can hardly do without today, to file syncing services, photo sharing apps and applications of all kinds. Our part of the deal is to click "Agree" and allow them to follow us around, inside and outside of the internet, tracking our every click and tap and reading our messages. They then use this data to learn about us, so they can, for example, show us targeted advertisements (Facebook, Google), sell us what we want (Amazon) or make playlists we'll like (Spotify).

"Fair enough" is the general attitude. Why should we care? I'm here to tell you that this deal is problematic. As the volume of recorded data grows, so do the capabilities of the AI systems that turn this data into actionable insights. We now get ads so relevant that we wonder whether our devices are listening to us. Apparently they aren't, but the question remains: how well does big tech know us? It's not just a matter of losing privacy. If ads and recommendation engines (e.g., YouTube's sidebar) can make us do things, they have a certain power over us. And because these systems are built with our personal data, we need control over that data in order to understand the systems' effects on our lives.

However, until recently, big tech companies had almost full control over the personal data they collected on us. It felt strange that they could know things about us that we weren't allowed to know in return. This started to change when Europe's General Data Protection Regulation (GDPR) came into effect. Among other things, its Articles 15 and 20 require that users can access and download all their data from a controller that processes it. These rules are intended to increase the power we have over our data and related insights. For example, with them, we could move our favourited music from Spotify to Google Play, cancelling our subscription to the former without needing to start anew in the latter.

While the GDPR's intention is great, even this seemingly simple example is not yet a reality. I tried accessing my data from various services, and challenges abound. Take the following example: do you know how Spotify trains its recommendation engine? They store a full playback history of every single millisecond played on the platform. However, their data export tool only provides 90 days of data. It took me months, 5 tries and some angry emails to receive my full music playback history of about 50,000 lines going back to 2011. Now I could figure out for myself: which artist did I listen to most in 2013?

{
  "ts": "2020-02-16 19:44:16 UTC",
  "username": "1234567",
  "platform": "Android OS 8.0.0 API 26 (samsung, SM-G930F)",
  "ms_played": "300400",
  "conn_country": "SE",
  "ip_addr_decrypted": "83.233.3.XX",
  "user_agent_decrypted": "unknown",
  "master_metadata_track_name": "Baba O'Riley",
  "master_metadata_album_artist_name": "The Who",
  "master_metadata_album_album_name": "Who's Next",
  "reason_start": "trackdone",
  "reason_end": "trackdone",
  "shuffle": false,
  "offline": false,
  "offline_timestamp": "1581881954631",
  "incognito_mode": false
}

A sample JSON line from my Spotify playback history, one of 53,531 lines. Requested by email from their Data Protection Officer.
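To give an idea of how little code a question like "which artist did I listen to most in 2013?" needs once you have the data, here is a minimal Python sketch. It assumes the history is available as one JSON object per line and uses the field names from the sample above. The function name `top_artists` is made up for this illustration, and skipping records without an artist name is my own assumption about how incomplete records should be handled.

```python
import json
from collections import Counter


def top_artists(lines, year, n=3):
    """Return the n artists with the most playback time (in ms) in a given year."""
    ms_per_artist = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore empty lines
        record = json.loads(line)
        # "ts" looks like "2020-02-16 19:44:16 UTC": the first 4 characters are the year.
        if record.get("ts", "")[:4] != str(year):
            continue
        artist = record.get("master_metadata_album_artist_name")
        if not artist:
            continue  # assumption: skip records without an artist name
        # "ms_played" is a string in the sample, so convert it before summing.
        ms_per_artist[artist] += int(record.get("ms_played", 0))
    return ms_per_artist.most_common(n)


if __name__ == "__main__":
    # One record with fields copied from the sample above.
    demo = [
        '{"ts": "2020-02-16 19:44:16 UTC", "ms_played": "300400", '
        '"master_metadata_album_artist_name": "The Who"}',
    ]
    print(top_artists(demo, 2020))
```

With a real export, you could pass an open file instead of the list, since iterating over a file yields one line at a time. Ranking by total `ms_played` rather than by number of plays is a design choice: it avoids counting tracks that were skipped after a few seconds as full listens.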

The ease of requesting data is one challenge, but there are many more. In my talk, I'll discuss these challenges and explain what data I could receive from various services by sending out GDPR requests. Applications are never far off. Would you like to find a running route you've never tried before with your Google Maps data? Or rediscover your online social situation in 2012?

It's time we take back control over our data. See you soon!

Photo by Christopher Burns on Unsplash

Logo open Belgium 2020 quit