Understanding the Data Protection White Paper Part VI: Big data and IoT present huge challenges to traditional privacy principles

This article is Part 6 of a multi-part series explaining the recently issued white paper on data protection in India. The responses to the white paper will help in the formulation of India’s future data protection laws. You can read Part 1, Part 2, Part 3, Part 4 and Part 5.

The era of big data and the internet of things presents a huge challenge to traditional privacy principles, questioning their efficacy and adequacy in protecting people’s interests. Protecting privacy involves applying certain privacy principles to the processing of people’s data. These include:

Purpose specification: Where the purpose of collecting data from an individual must be specified to him prior to collection,
Use limitation: Where the data once collected must be used only for the purposes so specified, and
Storage limitation: Where data being stored must be limited to the minimum data needed for the fulfilment of the purpose and must be erased or anonymised within a reasonable time after it fulfils its purpose.

These principles are designed to ensure data minimisation, that is, the minimum possible collection, use and storage of data. Big data and IoT, on the other hand, operate on quite the opposite principle, a sort of data ‘maximisation’, where the maximum possible data is collected and stored for future uses that are impossible to predict at the time of collection.

Revolutionary uses of big data

Big data involves data from multiple sources: from online activities like transactions, clickstreams, logs, search queries and social network records, from specific individuals, from companies collecting data, and even from sensors in places like homes, offices and stores. These are combined and analysed to produce the most unexpected results. Without question, data analytics, when applied to big data and data from IoT, has revolutionised the way businesses, the economy and even the government work.

Big data in healthcare

A research paper by Tene and Polonetsky cites several examples supporting the use of big data. In one, research conducted on medical data revealed that two medicines, a very popular anti-depressant and a popular cholesterol-reducing drug, which were perfectly fine when taken separately, together had the side effect of raising blood sugar to diabetic levels. This research was backed by an analysis of Bing search query data, which found that users who searched for both drugs were more likely to then search for diabetes-related symptoms than users who searched for only one of them. This research could well have saved many lives.

Big data for traffic management

Other examples include the use of location data for traffic management. Real-time and stored location data have been used to manage traffic congestion and to take decisions on road and mass transit construction. The real-time smart routing of Google Maps is a feature everyone has used and enjoyed.

Big data has also been utilised by researchers to predict when a community in Uganda could be hit by food shortages and to develop data-based predictive models to better identify and serve the needs of a Nairobi slum. Big data has also been used to effectively predict health epidemics, such as the spread of influenza through Google Flu Trends or of cholera through Twitter in Haiti.

Data from the Internet of Things

Similar uses can be found for data from the internet of things, where data can be collected and combined from a number of sources — an individual’s smartphone, tablets, computer, wearable devices, smart home appliances, car, RFID-equipped passports, etc. These can be used, for example, to optimise when devices can be run so as to reduce electricity consumption.

Such data can also be of huge advantage in investigations. In one murder investigation, for example, data records from an electronic water meter revealed that huge volumes of water had been consumed between 1 am and 2 am. This data was used to support the theory that a murder had taken place in the home at that time, and that water had been used to wash away blood from the crime scene.

New privacy principles to encourage big data?

The huge benefits of big data analytics have led to new schools of thought advocating a rethink of existing privacy principles to allow the use of big data. One suggestion, by Tene and Polonetsky, is that instead of limiting the use of data, the law should ensure greater access to the data and transparency about the uses it is put to.

Another suggestion, by Fred Cate, is to replace the current purpose specification and use limitation principles, which limit the use of data, with a harm-based approach, which allows the free use of data except where identified harms would result. This approach would prevent the use of data only when it is, for example, fraudulent, unlawful, inappropriate or can cause unjustified harm to an individual.

Privacy violations

At the same time, the growing collection and analysis of data increasingly puts people’s privacy at stake. Consider retail privacy invasions such as retailer Target’s pregnancy predictive analysis, where a father was furious that coupons for baby clothes and cribs had been sent to his teenage daughter, only to learn that she was, in fact, pregnant.

This was predicted from the young girl’s purchases, using data analysis that enabled predictions of when women were pregnant, which stage their pregnancy was in, and when their baby was due, based on their shopping trends. To inform the girl’s family about her pregnancy before she herself had the chance to decide whether to tell them was a huge violation of her privacy.

Issues with data management

In addition to such violations, big data analytics faces a number of privacy hurdles in managing the data. First, the sheer volume of data being collected far outstrips the security provided for it. The scale of data collection, trafficking and profiling, together with the merging of data from multiple sources, adds to the problem. Even established safeguards like anonymisation are no longer adequate protection, with a number of researchers showing how easily anonymised data can be deanonymised.

Illegal acquisition of big data

The huge value of data has also led to a greed for data, resulting even in illegal activities to acquire it, such as covert collection and surveillance. Data brokers creating profiles from illegally acquired data is one example. Another is the collection of location data by Google even when location services were switched off on the user’s phone. A similar collection of location data without user knowledge by ad network InMobi led to FTC charges against it.

Restrictive psychological impact of data analytics

The results of big data analytics have equally significant psychological side effects. The constant prediction of user behaviour, and the resultant targeted advertising and automated decision making, can result in the ‘pigeonholing’ of people, where people are compartmentalised based on the groups created by data analysts. People will tend to conform to these groups, for example by sticking to the book and movie recommendations made to them.

This can also result in discrimination against people. An example is stores that predict a person’s purchasing capacity and show them prices based on that prediction, rather than a uniform price. A more serious consequence is racial profiling and the discrimination that follows from it.

Of the negative consequences of such huge data collection, increased governmental surveillance is the obvious one. Less obvious is the chilling effect that such surveillance and the results of constant data analysis can have on people. The constant gaze on their activities, be it from the government, retailers or anyone else, can drive them to discipline and regulate their activities in an attempt to conform to socially accepted standards of conduct.

Key questions raised in the White Paper

The advantages of big data, thus, are likely being achieved at a huge cost to people’s privacy. This raises the question of whether the greater access approach or the harm-based approach is sufficient protection for people. While it is definitely worth rethinking privacy principles, the resultant new principles cannot come at the cost of people’s privacy.

The White Paper’s general approach is to retain the privacy principles in their present form, the same form that is currently prescribed in several jurisdictions. It has, however, invited discussion on big data and other new uses of data, to assess how such activities can be benefited from without compromising people’s privacy.

In view of these issues, the White Paper has sought comments on the following key questions with respect to privacy principles like purpose specification and use limitation:

What is the relevance of these principles? Can they be modified to accommodate new technologies?
What is a test to determine if a subsequent use of data is reasonably related to / compatible with the initial purpose?
What should the role of sectoral regulators be in the process of explicating standards for compliance with the law in relation to these principles?
Any other views?

Part I of the series explores the definitions of personal data and sensitive personal data, Part II of the series examines the jurisdiction and territorial scope of data protection laws, Part III of the series explores cross-border data flows and data localisation, Part IV deals with exemptions to data protection law and Part V deals with notice and consent.

The author is a lawyer and author specialising in technology laws. She is also a certified information privacy professional.