Editor's note: Ever since the USA confronted the possibility that foreign agents might have influenced its presidential elections using social media, every nation has had to introspect and scrutinise data consolidators and analysers for the inherent danger that digitally constructed personas can be used to influence the outcome of elections in an unfair manner. Is India immune to such digital commandeering of elections? No. The second part in our series on social media and its influence on real-world decisions deals with how your online behaviour can easily be used to construct an online persona that can foretell your personality with amazing accuracy, and how that persona, when combined with hundreds of millions of others, can be used to subvert elections. The first part dealt with why you need to be wary of how Facebook and other big data behemoths mine your digital life.

Representational image. Reuters

What do YouTube, Facebook and Amazon have in common?

Each of them uses massive data sets to build personas of its users, and uses recommendations to keep you engaged on its site. Clicking that funny guilty dog video on YouTube will bring up a chain of videos that you might like next, suitably interspersed with advertisements. Stories on your Facebook feed are carefully curated by algorithms that know your “likes”, and are accompanied by embedded advertisements related to your interests. Adding an item to the cart makes Amazon recommend further items you might like. All rely on large sets of user inputs to infer how to keep you (and your wallet) engaged.

In the first part of this article, we highlighted that Mark Zuckerberg’s seemingly audacious promise of ensuring “fair elections in India” was actually an admission of the power that large data analytics behemoths have to shape the opinions of vast sections of the population and thus interfere in their elections.

It is instructive to analyse the impact that social media companies might exert in an Indian election.

The portal Statista shows that up to 30 percent of the Indian population was on Facebook and YouTube in 2017, and the numbers are surging. The third largest social media group in India is WhatsApp, with a share of 28 percent. Of course, most people have multiple social media accounts and are counted in each bucket. See the attached figure.

We note that in the 2014 general elections, an estimated 66.38 percent of the eligible voter population turned out to vote. With 30 percent penetration of Facebook, YouTube and WhatsApp, a significant section of eligible voters (about 400 million people) is exposed to the machinations of data-analytics exploiters, and these voters can swing key elections, especially those that are closely fought.
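The figures above can be put through a quick back-of-the-envelope check. The sketch below uses only the numbers quoted in this article; the function names are ours, and it assumes (loosely, as the paragraph above does) that every exposed user is an eligible voter who turns out at the national rate.

```python
# Figures quoted in the article.
EXPOSED_USERS = 400_000_000   # people reachable via Facebook, YouTube or WhatsApp
PENETRATION = 0.30            # share of the population on those platforms
TURNOUT_2014 = 0.6638         # turnout in the 2014 general elections


def implied_population(exposed, penetration):
    """Population base implied by an exposed count and a penetration rate."""
    return round(exposed / penetration)


def exposed_who_vote(exposed, turnout):
    """Exposed users who would vote, ASSUMING they turn out at the national rate."""
    return round(exposed * turnout)


population = implied_population(EXPOSED_USERS, PENETRATION)
likely_voters = exposed_who_vote(EXPOSED_USERS, TURNOUT_2014)

print(f"Implied population base: {population / 1e9:.2f} billion")
print(f"Exposed users who would vote: {likely_voters / 1e6:.1f} million")
```

The implied base of roughly 1.33 billion people shows that the 30 percent and 400 million figures are consistent with each other. The second number is only as good as the assumption behind it.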

One can be incredulous of this claim and question how social media giants and other data consolidators can exert such influence. So let us examine a few data generation and agglomeration practices that cover a large section of the Indian population and how that information can be used.

The portal Statista estimates that 36 percent of Indians have smartphones in 2018. Each time a user makes a call or sends a message, it is serviced by a cell tower that records a digital trail called “call detail record” (CDR) metadata, which includes the identity and location of the cell tower, the details of the caller and the called party, the approximate location of the caller, and the duration of the call. Over a period of time, the CDR file of a given number can become voluminous. When collated with the CDRs of a geographical area and analysed, it can yield valuable insights into population movement, response to real-time events, and much more.
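To make the idea of collating CDRs concrete, here is a minimal sketch. The record layout follows the fields listed above, but the class, the field names and the sample values are all hypothetical; real operator formats differ.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallDetailRecord:
    """One CDR entry; field names are illustrative, not an operator format."""
    tower_id: str        # identity of the serving cell tower
    tower_lat: float     # tower location, a proxy for the caller's position
    tower_lon: float
    caller: str
    called: str
    start: datetime
    duration_s: int      # call duration in seconds


def calls_per_tower_hour(records):
    """Count calls for each (tower, hour-of-day) pair."""
    counts = Counter()
    for record in records:
        counts[(record.tower_id, record.start.hour)] += 1
    return counts


# Made-up sample data, for illustration only.
sample = [
    CallDetailRecord("T1", 12.97, 77.59, "A", "B", datetime(2018, 4, 1, 9, 5), 60),
    CallDetailRecord("T1", 12.97, 77.59, "C", "A", datetime(2018, 4, 1, 9, 40), 30),
    CallDetailRecord("T2", 13.02, 77.57, "A", "D", datetime(2018, 4, 1, 18, 15), 120),
]

activity = calls_per_tower_hour(sample)
print(activity[("T1", 9)], activity[("T2", 18)])
```

Even this toy aggregation hints at movement: caller A appears near tower T1 in the morning and near T2 in the evening.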

In India, cell towers are owned by a consortium called Indus Towers and shared by the major cell-phone service providers. The centralised ownership of cell towers allows the greatest potential for nationwide CDR metadata to be data-mined extensively for commercial and psephological interests.

Consider next WhatsApp, which guarantees end-to-end encryption, meaning that neither it nor any third party can read the contents of the messages. However, WhatsApp collects metadata on each message or call, such as caller and recipient information, duration of the call, operating systems, IP addresses, and browser data. If a popular video is being shared, WhatsApp might archive that video in a database for the storage efficiency of its servers. Over time, the metadata can be used to build inferences about your circles, and can be used by its parent company, Facebook.
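How metadata alone can reveal a person's circles may be sketched as follows. This is an illustration under our own assumptions, not WhatsApp's or Facebook's actual method: the function names, the threshold and the sample events are hypothetical.

```python
from collections import Counter


def contact_strengths(events):
    """Count interactions per unordered pair from (sender, recipient) events.

    No message content is needed, only who contacted whom.
    """
    strengths = Counter()
    for sender, recipient in events:
        if sender == recipient:
            continue  # ignore notes-to-self
        strengths[frozenset((sender, recipient))] += 1
    return strengths


def inner_circle(strengths, user, min_events=2):
    """Contacts who exchanged at least `min_events` messages or calls with `user`."""
    circle = []
    for pair, count in strengths.items():
        if user in pair and count >= min_events:
            (other,) = pair - {user}
            circle.append(other)
    return sorted(circle)


# Made-up metadata: each tuple is (sender, recipient).
events = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "D"), ("D", "A"), ("A", "D")]
strengths = contact_strengths(events)
print(inner_circle(strengths, "A"))
```

With a threshold of two interactions, B and D emerge as A's close contacts while C does not, even though every message stayed encrypted.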

The sign marking the Google offices is lit up in Cambridge, Massachusetts, US, on June 27, 2017. Image: Reuters/Brian Snyder

What about free email services provided by the giants Google, Yahoo and Microsoft, among others? The portal Statista estimates that 400 million internet users and 200 million mobile users access email in India in 2018. Each time a user sends an email through a free service, the provider can read its contents and garner information on both the sender and the recipients. Additionally, mobile users who access email via apps expose their data to those apps too. Normally, the data collected is used to sell services and goods to the user, either by the email providers or by the retailers they sell the data to. In the case of giants with multiple apps, such as Google, widespread agglomeration of messages, photos and other data provides deep insights into the user.

Services such as Facebook, Facebook Messenger, and Google’s many apps, among others, do not provide end-to-end encryption by default. Therefore any message sent on those services can be mined for content. Finally, at the top of the ecosystem, ISPs hold the detailed internet history of up to 30 percent of all Indians, especially from unencrypted sites, which can be of tremendous value in analysing the interests and personas of users.

What all this means is that up to 400 million Indians have digital psychometric personas of some sort, of varying accuracy, in the hands of a few data-analytics firms. One might wonder what kind of accuracy one can get from such data.

In a study reported by The New York Times, researchers from Stanford University and Cambridge University built a psychometric model from a personality quiz and correlated its output with each participant’s actual “likes” on Facebook; from the “likes” alone, they were then able to predict that person’s personality traits. Further, with as few as 70 to 300 “likes”, they were able to make accurate predictions. In other words, a daily user of Facebook who awards 10 “likes” a day on average risks having a highly accurate psychometric persona built within 30 days. The article notes that with 300 “likes”, the model is more accurate than a person’s spouse at predicting outcomes or, in plain words, has gone to the core of that person’s hidden personality.
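The 30-day figure follows from simple division, as the short sketch below shows. The thresholds of 70 and 300 “likes” and the rate of 10 a day come from the paragraph above; the function name is ours.

```python
import math


def days_to_reach(likes_needed, likes_per_day):
    """Whole days of activity needed to accumulate `likes_needed` likes."""
    return math.ceil(likes_needed / likes_per_day)


LIKES_PER_DAY = 10  # the average assumed in the article

for threshold in (70, 300):
    days = days_to_reach(threshold, LIKES_PER_DAY)
    print(f"{threshold} likes reached in {days} days")
# prints:
# 70 likes reached in 7 days
# 300 likes reached in 30 days
```

At this rate the lower threshold is crossed within a single week.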

One should remember that analytics are just what they are: metrics that help make sense of data. They can be harnessed for the good or the bad of society. We have seen how the police harness metadata to capture criminals. As another positive example, research on epidemics or traffic decongestion can be done by analysing internet data. Music lovers enjoy discovering new music from Pandora and other services that have learnt their tastes; LinkedIn discovers talents that its users have not disclosed on their profiles by soliciting reviews from contacts; and many other social media services serve highly focused interests in a seemingly intelligent manner. The downside is that analytics in the wrong hands can be disastrous.

One might recollect the USA’s experience of alleged meddling by a foreign power in its elections, which involved using illegally obtained social media data, setting up fictitious social media accounts that gathered significant influence, and planting stories designed to evoke a desired response in the run-up to the elections.

Can this happen in India? I think the real question to ask is: why can this not happen in India? For the past few years, we have witnessed deep polarisation of the country, in which certain stories are echoed and escalated in the social and mainstream media while others are marginalised, and we have seen the fearsome public response that such stories are able to evoke. A determined coterie of special interests certainly appears capable of hijacking the elections when assisted by analytics companies. And we have noted above how pervasively data can be collected in the Indian context.

India Today reported rebel Congress leader Shehzad Poonawala’s statement that Cambridge Analytica made a pitch to the Congress Party in 2017, detailing a data-analytics path to victory in the 2019 elections. Such a path would have involved taking advantage of “data mining, national situational analysis, data-driven campaign, strategic communication review, media monitoring and planning”. Hindustan Times noted that Cambridge Analytica and its India partner, Ovleno Business Intelligence Pvt Ltd, spoke to both the Congress and the BJP, and reported a success story of Cambridge Analytica in Bihar.

A polling officer applies ink on the finger of a voter at a polling centre during Maharashtra state elections, in Mumbai. Image: Reuters

How might data analytics be used to influence the Indian elections?

One easy answer is by creating hundreds and thousands of fake profiles on social media, and cultivating influential friends who can act as unwitting amplifiers of planted stories that evoke outrage and confer a political advantage on certain actors in the weeks and days leading up to the elections. This was the model used to influence the elections in the USA.

Facebook offers commercial services to people who create special interest pages. For example, my non-profit page on Indian History often gets solicitations from Facebook about promoting my posts on Indian history for as little as $7, spread over a week, and promising a reach of about 10,000 people whose interests are specified as keywords, and who need not be members of my group.

Facebook has data on Indians that can easily discern the familiar fault lines along language, religion, political affiliation, and other dividers traditionally thrust upon Indians. Now imagine a page created by a special interest, filled with carefully crafted stories, and put in front of millions of Indians with carefully selected attributes (predicted by their digital personas), for as little as a few thousand dollars, which is chump change for the wheeler-dealers of Indian politics. Now imagine thousands of such special-interest pages in all regional languages spewing selective-outrage stories just days before elections. You get the idea of how change can come about with frightening efficiency and at low cost.
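The “few thousand dollars” estimate can be checked against the promotion offer quoted earlier ($7 for a reach of about 10,000 people). The sketch below assumes, purely for illustration, that the price scales linearly with reach; actual advertising prices vary with targeting and demand.

```python
QUOTED_COST_USD = 7.0    # promotion price quoted to the author
QUOTED_REACH = 10_000    # people promised for that price


def cost_to_reach(people):
    """Cost in USD to reach `people`, ASSUMING the quoted rate scales linearly."""
    return people * QUOTED_COST_USD / QUOTED_REACH


for audience in (1_000_000, 5_000_000, 10_000_000):
    print(f"{audience:>10,} people: ${cost_to_reach(audience):,.0f}")
# prints:
#  1,000,000 people: $700
#  5,000,000 people: $3,500
# 10,000,000 people: $7,000
```

At the quoted rate, each person reached costs a small fraction of a cent.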

India’s Election Commission should take notice of these trends and evolve strategies now to stymie interference from special interests of all shades, from inside and outside the country. One can argue that in the absence of privacy laws, nothing is illegal about all this, and that it is a level playing field for all political dispensations. One could further argue that this is a battle for minds, and point out that the print and electronic media do post stories that evoke responses in line with the publication’s leanings. However, such a reading fails to note that data analytics that plays upon the passions of people and exploits the worst fault lines of the country while chasing the vote flirts irresponsibly with permanent damage to the country. We have already seen signs of such exploitation in today’s discourse, leading one to suspect that such companies are already orchestrating outrage to benefit their political patrons. Thus the Election Commission needs to evolve laws now to define and enforce privacy and the reasonable use of data metrics.

Enough said. Welcome to the future, where no Faraday cage or tinfoil hat can save you.

The author is an electrical engineer who works with modelling, simulation and optimisation technologies. He holds a master’s degree in Electrical Engineering from the Indian Institute of Science in Bengaluru, and a doctoral degree in Electrical Engineering from Oregon State University.

Updated Date: Apr 18, 2018 13:13 PM