
GDPR quick tip: Know what data (models) you have

Amid all the kerfuffle around the General Data Protection Regulation, GDPR (which applies to any organization handling the personal data of people in the EU, wherever that organization is located), it can be hard to know where to start. I don’t claim to be a GDPR expert – I’ll leave that to the lawyers and indeed, the government organizations responsible. However, I can report from my conversations around getting ready for the May 25th deadline.

In terms of policies and approach, GDPR is not that different to existing data management best practice. One potential difference, from a UK perspective, is that it may mean the end of unsolicited calls, letters and emails: for example, the CEO of a direct mail organization told me it may be the demise of ‘cold lists’, that is, collections of addresses to be targeted without any prior engagement (which drives many ‘legitimate interest’ justifications), contract or consent.

But this isn’t a massive leap from, say, MailChimp’s confirmation checks, themselves based on spam blacklisting and the right to complain. And indeed, in this age of public, sometimes viral discontent, no organization wants to have its reputation hauled over the coals of social media. When they do, it appears, they can get away with it only for so long before they cave in to public pressure to do a better job (recent examples: Uber and a few budget airlines).

All this reinforces the point that organizations doing right by their customers, and therefore their data, are likely already on the right path to GDPR compliance. The Jeff Goldblum-sized fly in the ointment, however, is the conclusion reached in survey after survey about enterprise data management: most corporations today don’t actually know what information they have, including about the people with whom they interact.

This is completely understandable. As technology has thrown innovation after innovation at the enterprise, many have adopted a layer-the-new-on-top-of-the-old approach: to do otherwise would have left them by the wayside long ago. Each massive organization is an attic of data archival, a den of data duplication, a cavern of complexity. To date, the solution has been a combination of coping strategies, even as we add new layers on top.

But now, faced with massive potential fines (up to 4% of global annual turnover or €20 million, whichever is higher), our corporations and institutions can no longer de-prioritize how they manage their data pools. At the same time, there is no magic wand to be waved, no way of really knowing whether the data stored within is appropriate to the organization’s purposes (which indeed, may be very different to when they were established).

Meanwhile, looking at the level of systems is not going to be particularly revealing, so is there an answer? A starting point is to look somewhere in between data and systems, focusing on metadata. Data models, software designs and so on can be revelatory in terms of what data is held and how it is being used, and can enable prioritization of the systems and data stores that might carry a higher risk of non-compliance.

Knowing this information enables a number of decisions, not only about the data but also what to do with it. For example, a system holding information about the children of customers may still be running, without anyone’s real knowledge. Just knowing it is there, and that it hasn’t been accessed for several years, should be reason enough to switch it off and dispose of its contents. And indeed, even if 75% of marketing data will be ‘rendered obsolete’, surely that’s not the good part anyway?
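To make the idea concrete, here is a minimal sketch of what a first pass over data model metadata might look like. Everything in it is an assumption for illustration: the `DataStore` record, the keyword list in `PERSONAL_DATA_HINTS`, the three-year staleness threshold and the sample stores are all hypothetical, and matching on column names is a crude stand-in for a proper review of each data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

# Hypothetical hints that a column may hold personal data.
# A real review would use a vetted list agreed with legal and data owners.
PERSONAL_DATA_HINTS = ("name", "email", "phone", "address", "birth", "child")


@dataclass
class DataStore:
    """Illustrative catalog entry: what a data model tells us about one store."""
    name: str
    columns: List[str]
    last_accessed: date


def personal_columns(store: DataStore) -> List[str]:
    """Return the columns whose names suggest personal data."""
    return [
        column
        for column in store.columns
        if any(hint in column.lower() for hint in PERSONAL_DATA_HINTS)
    ]


def review_priority(
    stores: List[DataStore], today: date, stale_after_days: int = 3 * 365
) -> List[Tuple[str, List[str], bool]]:
    """Rank stores for review: most suspected personal fields first, stale stores breaking ties."""
    report = []
    for store in stores:
        flagged = personal_columns(store)
        stale = (today - store.last_accessed).days > stale_after_days
        report.append((store.name, flagged, stale))
    report.sort(key=lambda row: (len(row[1]), row[2]), reverse=True)
    return report


# Made-up example data, not taken from any real system.
stores = [
    DataStore("web_logs", ["url", "status_code"], date(2018, 1, 15)),
    DataStore(
        "loyalty_2009",
        ["customer_name", "child_birth_date", "points"],
        date(2012, 3, 1),
    ),
]

for store_name, flagged, stale in review_priority(stores, today=date(2018, 1, 30)):
    print(store_name, flagged, "stale" if stale else "in use")
```

A store that is both full of suspected personal fields and long untouched, like the forgotten system holding children’s details, rises to the top of the list, which is where a human decision about switching it off would start.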

Even if you have a thousand such systems, knowing what they are and what types of data they contain puts you in a much better position than not knowing. It’s not a surprise that software vendors (such as Erwin, founded as a data modelling company in the 1990s, absorbed into CA, then divested, with its portfolio since broadened), who have struggled to demonstrate their relevance in the face of “coping strategy” approaches to enterprise data governance, are now setting out their stalls around GDPR.

Again, no magic wands exist, but the bottom line is that it is becoming a legally enforceable requirement for organizations to be able to explain what they are holding and why. As a final thought, this has to be seen as good for business: a focus on what matters, the ability to prioritize, to engage better and to deliver more personalized customer services are all seen as high-value benefits, above and beyond the need to comply with some legislative big stick.



from Gigaom https://gigaom.com/2018/01/30/gdpr-quick-tip-know-what-data-models-you-have/
