3 Policies for Big Data Privacy Problems
- by Michael Housman
In the wake of the news about the Ashley Madison scandal, in which hackers released profile information for 37 million users, concern has grown about data security and privacy. Beyond the data itself, consumers are equally concerned about how it’s being used by the companies that capture it. A variety of stories have emerged that point out some of the scarier implications of companies using big data. For example, Target infamously mailed coupons for prenatal vitamins to a Minnesota teen whose irate father visited one store to complain, and then later discovered that his daughter was, in fact, pregnant. More recently, Uber earned some negative attention when a story emerged that claimed one of its employees had tracked a reporter’s movement using “God view.” The incident led two economists to write a New York Times op-ed titled “We Can’t Trust Uber.” Following an Uber blog post on one-night stands, they speculated about what other sorts of data the company could track: “Rides to Planned Parenthood? Regular rides to a cancer hospital? Interviews at a rival company? Uber knows about them too.”
There is no question that technology has completely changed the rules when it comes to privacy and security. The rapid growth of digital technologies, advances in GPS capabilities, the ubiquitous use of cellphones and increasing use of wearable technologies, coupled with the ability of organizations to more easily generate, store and analyze massive quantities of user data, means that companies now have the ability to assemble a fairly detailed picture of someone’s social interaction and day-to-day activities. When Google acquired Nest, many bemoaned the fact that the Internet giant would soon know where someone was at all times. For many, this is a scary proposition.
Companies attempting to leverage the value of their data must recognize that there is a target on their backs. Their use of data will be increasingly scrutinized by the public, and there is simply no room for error. But the good news is that there are policies that can help them stay out of trouble. The company where I worked for the past three-and-a-half years, Evolv (acquired by Cornerstone OnDemand), found itself in a position where articles with titles like “They’re Watching You at Work” were appearing on a regular basis, and it decided to take a very proactive stance on navigating the potential PR risks.
Here are some of the things we did:
1. Be transparent about data usage and develop a “Big Data Code of Ethics.”
When these stories began emerging, our marketing team proactively gathered everyone’s input and drafted “Evolv’s Workforce Big Data Code of Ethics.” It wasn’t meant to be a legal document but was instead intended to outline our views and philosophy about what data was being collected, how it was being handled, and what data was on- and off-limits when it came to making recommendations around employment decisions. It posed a great opportunity to do some organizational soul-searching and to figure out what we stood for. More importantly, we had a document that we could point to and forward along when anyone came calling and asking about our data policy. Rather than hiding behind technical jargon and legalese, we went the other direction and decided to err on the side of making it accessible and completely transparent.
2. Be cognizant of the white, black and gray areas.
With the foundation offered up by the “Big Data Code of Ethics,” employees felt comfortable speaking up when they believed that we were wading into murky waters. For example, we were searching for any characteristics of job applicants that would predict their success on the job when we somewhat famously discovered that job applicants who logged into our platform with Chrome and Firefox browsers stayed on the job longer and performed better than job applicants who logged on using Internet Explorer or Safari. We could have used this information to further calibrate our algorithms and make the assessment more predictive, but we immediately recognized that it posed some ethical issues. Not only did we believe it was likely that browser usage was strongly correlated with someone’s age, but we thought that scraping information about someone’s browser fell into the category of “creepy data capture” and wondered how people would react if they knew. In the end, we made the deliberate decision not to use it in our scoring methodology because we felt it was one of those gray areas that are best avoided.
3. Investigate any biases in the data.
Since we were using data to make hiring recommendations, we recognized that this is a sensitive area where people have a collective sense of fairness. Specifically, there are laws in place to ensure that protected classes (for example, those defined by age, sex and gender) are not discriminated against. To that end, the process by which we deployed our assessments included a phase during which we deliberately examined whether any of those protected classes received lower scores on the assessment than the rest of the applicant pool. The goal was to ensure that we followed the letter of the law and that everyone was given a fair shot. In fact, I believe that we were able to reduce bias in the hiring process. For example, our research on the long-term unemployed found that their performance was no worse than that of applicants with more traditional work histories. That research was cited in the White House report “Back to Work” as helping several large companies fulfill their pledge to President Barack Obama to give the long-term unemployed a fair shot.
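For readers who want a concrete picture of that kind of check, here is a minimal sketch in Python. It is not Evolv's actual process. The record layout, the `score_gaps` function, the `age_band` labels and the 0.05 threshold are all hypothetical, chosen only for illustration, and a simple gap in average scores is not a legal test of discrimination.

```python
from statistics import mean


def score_gaps(records, attribute, threshold=0.05):
    """Compare each group's mean score with the mean score of everyone else.

    `records` is a list of dicts holding a "score" key plus the attribute
    being examined. `threshold` is an illustrative cutoff, not a legal one.
    """
    # Bucket the scores by the value of the attribute under review.
    groups = {}
    for rec in records:
        groups.setdefault(rec[attribute], []).append(rec["score"])

    report = {}
    for group, scores in groups.items():
        # Scores of all applicants outside this group.
        rest = [r["score"] for r in records if r[attribute] != group]
        if not rest:
            continue  # nothing to compare against
        gap = mean(scores) - mean(rest)
        # Flag groups whose average falls below the rest by more than the cutoff.
        report[group] = {"gap": gap, "flagged": gap < -threshold}
    return report


# Made-up applicants, for illustration only.
applicants = [
    {"age_band": "under_40", "score": 0.72},
    {"age_band": "under_40", "score": 0.68},
    {"age_band": "40_plus", "score": 0.70},
    {"age_band": "40_plus", "score": 0.50},
]

report = score_gaps(applicants, "age_band")
for group, result in report.items():
    print(group, round(result["gap"], 2), result["flagged"])
```

In practice a flagged group would be a prompt for closer review with statistical and legal expertise, not a conclusion in itself.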
The bottom line is that large companies utilizing large-scale data and analytics to generate insight about their customers need to recognize that people are becoming increasingly nervous about what data is being captured and how it’s being used.
These concerns are not likely to go away; if anything, they will keep growing as we shed more and more data. The White House is paying very close attention to the issue and recently released a report on “Big Data and Privacy” that outlines some of the risks and opportunities posed by big data. At this point, it seems fairly inevitable that there will be some legislative action on this front in the not-too-distant future.
Companies can’t afford to take a “wait and see” stance. They need to be proactive about defining their data policies and transparent about their philosophy toward data usage, or they’ll find themselves in the position of alienating the very consumers that they are trying to understand.