Dean’s Blog: The New Great Divide – Data

When I was growing up in the UK in the early 1960s, I took what became known as the eleven-plus exam. The controversial exam pretty much defined your future at age 11—white collar or blue. If you passed you went to a grammar school, where you took Latin, English literature, languages, and further academic pursuits; if you failed you went to a secondary modern school and specialized in a trade, like woodwork or metalwork. I failed. But I got another shot when I emigrated to Australia, which allowed you to determine your own future—we studied literature and woodwork.

In the UK back then, as in many other countries, there was a sense of elitism tied to the kind of work you did, a divide between professionals and “working class” people, or upstairs and downstairs, if you’ve seen Downton Abbey.

As soon as you’re born they make you feel small

By giving you no time instead of it all

Till the pain is so big you feel nothing at all

A working class hero is something to be

A working class hero is something to be

John Lennon, Plastic Ono Band

Thankfully, a lot has changed since then—just try servicing your own car and tell me that does not require extensive, specialized knowledge.

I was reminded of this divide recently while moderating a panel on data ethics in New York City with some esteemed University of Virginia alumni. One of the panelists recited words he had heard:

“In the future there will be two kinds of jobs: those where people control machines and those where machines control people.”

Draconian perhaps, but certainly an indicator of what could be the next great divide in a world that is already far too divisive. It is a step beyond the digital divide to the data divide: a new elite set apart by its access to, and understanding of, the vast amounts of data that are now being collected and employed to various ends.

While “machine” in this context makes one think of robotics, in broad terms a machine could be anything from a thermostat to an airplane. Let’s keep it simple and consider the thermostat. The people controlling the machine are the people we are training at the School of Data Science: data scientists who analyze data generated by many thermostats and use predictive analytics to determine what temperature to set at what time, based on an input dataset. The people being controlled are those living in the house with such a thermostat. The great divide comes when the heating bill arrives. It is more than they can afford. The occupants are immediately suspicious that the data scientists work for the power company. Sure, there are overrides, but who programs all those features? For old-timers, think of the flashing 12:00 on the videocassette recorder (VCR).

A silly example perhaps, or is it? It points to the need for an absolute commitment on the part of data science educators and the students we train to close the divide that exists now, and that will only worsen if we are not responsive. As always, defining the problem is easier than finding a solution, and both are easier than implementing one.

What steps should we take as a newly emerging School of Data Science to address this new Data Divide? Let’s use the thermostat example as a guide.

  1. Communicate well and transparently: the consumer needs a simple and concise explanation of what data were collected and how they were used to run the thermostat. Consumers also need a clear understanding of the terms of agreement and how those terms affect them.
  2. Know the consequences of your actions: this involves both the reality of how the data are being used and the consumer’s perception of how the data are being used.
  3. Understand the laws and policies governing data: who owns the data coming from the thermostat? What does the law allow the owners to do with the data, and what does it prohibit?

Above all, train students who are active, not reactive. Train leaders, not followers: leaders who act with integrity and honesty, who understand the real-world consequences of their actions for communities and individuals, who feed that understanding back into their work, and who are not swayed by profiteering.

The societal impact of the great Data Divide as it relates to jobs and to life more generally is likely to be severe. Previous revolutions (agriculture, industrialization, computerization) have taught us that. They have also taught us that, after a period of disruption, society was generally better off than it was before. The key will be training a generation that works to minimize that disruption and begins building a better world as soon as possible.

About Phil Bourne

Stephenson Founding Dean of the School of Data Science and Professor of Data Science & Biomedical Engineering, University of Virginia