'Our dream of a utopia': new course explores threats to privacy in age of big data collection

Photo: Diana Tyszko

Published on A & S News

In 2017, Sidewalk Labs and Waterfront Toronto announced their vision for the future of cities: a “smart neighbourhood” on Toronto’s eastern waterfront called Sidewalk Toronto. The city within a city would feature, along with autonomous vehicles and robots, sensors embedded in its physical infrastructure to collect vast amounts of data about traffic, energy use, mail delivery, even garbage disposal.

But in October 2018, Ann Cavoukian, Ontario's former privacy commissioner, resigned as an advisor for Sidewalk Toronto due to concerns about digital privacy practices — in particular, whether or not stored and shared data would be stripped of all information that could identify an individual.

“Sidewalk Labs is a great example of our dream of a utopia in which we harness technology to solve problems and enhance decision making,” says David Liu, an assistant professor, teaching stream, in the Faculty of Arts & Science’s Department of Computer Science.

“But I’m fundamentally skeptical about how Sidewalk Toronto’s governance will work, given the huge amount of data that will be collected. I worry whether requests for data will receive nothing more than rubber stamp approvals from the agency responsible and who such an agency would be accountable to.”

Liu is exploring these issues in a new course he developed called What, Who, How: Privacy in the Age of Big Data Collection. The course explores the countless ways our data are being collected by retailers, corporations and governments, as well as by law enforcement and national security agencies, and the potential dangers of that harvest.

Describing his motivation for developing the course, Liu says, “I wanted something that was outward-looking, that examined how technology is affecting the world at large — and not necessarily in a positive way.”

“Almost every company we interact with now is collecting data on us,” explains Liu. “And it's not just the companies selling us stuff — it’s also the companies who provide free services, like Google, Facebook, Instagram and YouTube. They're collecting data on how we interact both with their services and other people using those services.

“What’s more, there’s a growing use of sensors — like those planned for Sidewalk Toronto — that are infiltrating our physical world. So it’s not just when you use an electronic device. It happens as you move through and interact with the physical world. For example, home assistants record your speech, facial recognition software watches you, and a device like an Amazon Ring front doorbell video-records you as you walk along a sidewalk.

“Modern technology has enabled mass surveillance in ways that were literally unimaginable 50 years ago,” he warns. “Surveillance that is orders of magnitude more efficient. So it’s really important for us to be cognizant of privacy issues — especially when it comes to public data collection — because it affects everyone.”

You are what you stream

Liu and his students explore the digital rewards and risks through case studies that — like Sidewalk Labs — are straight from the headlines.

Students studied the Cambridge Analytica scandal, in which the consulting firm harvested the personal information of millions of Facebook users without their consent and used it to target political advertising during the 2016 U.S. presidential campaign.

They learned how Spotify is sharing more than just traditional user data with marketers; the music-streaming giant is also sharing listeners’ emotional states and what they’re doing as they listen. With this intel, retailers target ads based on whether someone is listening to the Happy Beats, Breakup Songs, Girls Night Out or Barbeque playlist. As Spotify says, “You are what you stream,” and marketers can use this insight to their advantage.

Students also learned that it’s more than just song choices, geographic location or purchase histories that are being collected. We share our most personal data — our DNA — with genealogy services like GEDMatch and 23andMe.

In 2018, police uploaded to GEDMatch DNA found at the scene of a murder they suspected had been committed by the Golden State Killer. The company’s genetic database of a million users revealed distant relatives of a suspect who, after additional investigation and police work, was eventually arrested and charged.

But while we can catch killers with genetic databases, Liu says we also run unique privacy risks. “We share DNA with our family members,” he says. “So, if my second-cousin uploads their DNA without my knowledge, this reveals information about me without my consent. And that’s troubling.

“Also, to protect our privacy we can change passwords, user names, our hair colour, even our faces — but our DNA is immutable,” he warns. “You can’t disguise your DNA.”

For University College member Ella Li, Liu’s course was eye-opening. “I never really thought much about privacy and how it relates to big data before,” she says. “But this course gave me new insight into privacy issues and security breaches around the world.”

For Innis College member Isabella Buklarewicz, it was frightening. “The scariest thing I learned was the way in which companies use algorithms in the hiring process,” she explains. “They may make hiring easier but the biases of the algorithm’s creator can be embedded in the code, and that can be advantageous for one group of people and discriminatory toward another.”

As Liu affirms, the risks of big data collection are not shared equitably, as “marginalized people are more vulnerable to surveillance and more likely to be targeted because of their race or because of their socioeconomic status or because they’re immigrants.”

This lesson is not lost on Buklarewicz. “The collection of data isn’t just a privacy issue,” she says. “It’s a human rights issue.”