2020's largest leaks reveal the escalating cost of cloud security misconfigurations


The extent to which 2020 shaped security cannot be overstated, with the transition to remote and hybrid work being one of the most prominent events to impact the field. While the transition is still ongoing, there are concerning bellwethers indicating that many companies, even those that have already embraced digital transformation, need to mature their cloud security posture. One of the critical worries is that as companies go all in on cloud adoption, they bring with them staggering amounts of data. As a result, even seemingly minor security incidents, like a temporary misconfiguration of a single database, can have disastrous consequences. 2020 provides ample illustration of this. Last year, just five security incidents were responsible for exposing almost 27 billion records. These incidents were data leaks impacting the following companies:

-Oracle (BlueKai). A division of Oracle, BlueKai is a marketing technology company that tracks up to 1 percent of all internet traffic. BlueKai is stated to have leaked “several billion” records, many of which contained detailed demographic and purchasing-intent data.

-CAM4. CAM4 is an adult entertainment site. It exposed 7TB of data totaling 10.88 billion records from a misconfigured Elasticsearch database. Records included PII such as names, gender, and sexual orientation, as well as user conversations, email correspondence, and service logs detailing logins, payments, and other activity.

-Advanced Info Service (AIS). AIS is a Thai telecommunications giant with almost 40 million users and over 10,000 employees. The company leaked more than 8 billion records, including DNS queries and Netflow data.

-Keepnet Labs. The incident affecting Keepnet Labs, a security firm, shows that even firms that should be aware of cloud risks may still fall victim to them. The company aggregates datasets from historical data breaches so that it can notify customers when their credentials might be compromised. The exposure occurred during a routine migration of an Elasticsearch database, during which the database was indexed by an internet-scanning search engine.

-Whisper. Whisper is an anonymous social network where users share “whispers,” or intimate confessions they wish to get off their chest. The site, however, had been storing records in an Elasticsearch database that was not password protected. The records contained user age, ethnicity, gender, sexual orientation, hometown, nickname, and geolocation data. In all, just over 900 million records were exposed.

Growing storm clouds 

The term mega-breach was once used to denote breaches exposing in excess of 1 million records, a now quaint milestone exceeded as early as 2004, when an AOL insider exfiltrated 92 million screen names. Multi-billion record incidents are not new; there has been roughly one a year since 2017, starting with Yahoo’s 3 billion user data breach (while Yahoo’s breach occurred in 2013, its full extent wasn’t disclosed until 2017). 2020 marks the first time that more than two occurred in a single year, raising concern about what might be a new hallmark: the “giga-breach.”

A key concern with exposed cloud systems is the large volume of data contained within them. Although this has caused data exposure incidents to balloon in size, at least in terms of the number of records exposed, the total number of individuals impacted tends to have a limit. For example, while the CAM4 data leak exposed over 10 billion records, it’s believed to have impacted only several hundred individuals. While the scope of such an incident appears tiny, another concern emerges. When a breach has fewer victims and a disproportionately high number of exposed records, more records are associated with each affected individual. This means that each victim may have dozens or even hundreds of data points connected to their identity. This was illustrated well by the 2018 Apollo data leak, which was believed to have exposed up to nine billion data points, but only 212 million individual names and email addresses.
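To make the disparity concrete, the short sketch below works through the arithmetic for the Apollo figures quoted above. It treats both numbers as exact, which is an assumption; the original figures are estimates ("up to nine billion"), so the result is only a rough average.

```python
# Figures quoted in the paragraph above for the 2018 Apollo leak.
# Both are estimates; treating them as exact is an assumption of this sketch.
APOLLO_DATA_POINTS = 9_000_000_000   # "up to nine billion data points"
APOLLO_INDIVIDUALS = 212_000_000     # "212 million individual names and email addresses"


def records_per_individual(records: int, individuals: int) -> float:
    """Average number of exposed records tied to each affected person."""
    if individuals <= 0:
        raise ValueError("individuals must be positive")
    return records / individuals


apollo_ratio = records_per_individual(APOLLO_DATA_POINTS, APOLLO_INDIVIDUALS)
print(f"Apollo: about {apollo_ratio:.0f} data points per individual")
```

On these figures, each affected person would have roughly 42 data points attached to them on average, which is consistent with the "dozens" described above.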

One of the consequences of this disparity is that the data on each victim will be highly detailed and intricate. Attackers can use such information to create personally tailored phishing campaigns or to blackmail victims. In many instances, the entities found leaking this data are marketing data aggregation firms like Apollo, meaning that the data contained in the breach might include demographic or even psychometric data. The breach impacting Oracle’s BlueKai is a clear illustration of this: some records revealed the gambling habits of a specific individual.

It’s unclear to what extent the trend of yearly multi-billion record breaches is poised to take off. As of the time of this writing, only two such breaches have occurred. However, the real worry here is the underlying cause contributing to the breaches on this list, namely the misconfiguration of cloud systems, which has been happening for over a decade. Regardless of their size, the continuation of data leaks of this nature may further fuel the evolution of ransomware and phishing attacks, both of which have seen substantial growth in the past year and are already proving challenging to defend against.

The demands of cloud security 

Cloud data leaks were one of the top hits of security incidents last decade, as cloud was still maturing and adopters were still getting their bearings. Perhaps like disco, some trends may never quite die.

Why does this keep happening? Data leaks like these provide a crash course in cloud security. The first lesson? Never take your eyes off your data. When in the cloud, companies need to adopt a security-first, cloud-native mindset. While mistakes happen, having intelligent security tools that can alert you when sensitive information has been made public will let you respond quickly to such accidents before they become full-blown incidents.
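As a rough illustration of the kind of check such tooling might perform, the sketch below sends a single request with no credentials to a database's HTTP endpoint and classifies the answer. It is a minimal sketch, not how any particular product works: the function names `classify_probe` and `probe` are hypothetical, and the detection rule (a 200 response whose JSON body carries `cluster_name` and `version` fields, as an Elasticsearch root endpoint returns) is a simplifying assumption. It should only ever be pointed at systems you own or are authorized to test.

```python
import json
import urllib.error
import urllib.request


def classify_probe(status: int, body: str) -> str:
    """Classify the answer to an unauthenticated GET on a database's HTTP root."""
    if status in (401, 403):
        return "protected"  # the server refused access without credentials
    if status == 200:
        try:
            info = json.loads(body)
        except json.JSONDecodeError:
            return "unknown"
        # Assumption: an open Elasticsearch root returns cluster and version metadata.
        if isinstance(info, dict) and "cluster_name" in info and "version" in info:
            return "exposed"
    return "unknown"


def probe(url: str, timeout: float = 5.0) -> str:
    """Send one request with no credentials. Only run against hosts you own."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return classify_probe(resp.status, body)
    except urllib.error.HTTPError as err:
        # 4xx/5xx answers arrive as exceptions; classify by status code alone.
        return classify_probe(err.code, "")
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

A scheduled job could call `probe` on each of an organization's own database endpoints and raise an alert whenever the result is `"exposed"`, so that a temporary misconfiguration is caught before it becomes an incident.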

Cloud providers also have a role to play here. The good news is that many of them have taken this responsibility seriously. For example, AWS has modified its UI over time to provide more feedback and warnings when actions might make private resources public. This has helped reduce the number of AWS systems (like S3 buckets) responsible for data leaks, though incidents involving these systems still happen.
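The same kind of warning can be approximated on the customer side. The sketch below inspects an S3 bucket policy document and flags statements that allow access to any principal without conditions. It is deliberately simplified and rests on stated assumptions: the helper names are hypothetical, the bucket name `example-bucket` is made up, and the check ignores ACLs, account-level public access settings, and the meaning of any `Condition` block, all of which affect whether a bucket is actually public.

```python
import json


def _is_wildcard(principal) -> bool:
    """True when a policy Principal matches anyone ("*" or {"AWS": "*"})."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def public_statements(policy_json: str) -> list:
    """Return Allow statements that grant access to any principal unconditionally."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow"
        and _is_wildcard(s.get("Principal"))
        and "Condition" not in s
    ]


# Hypothetical policy for a made-up bucket, used only to exercise the check.
EXAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject", "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:PutObject", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

flagged = public_statements(EXAMPLE_POLICY)
print(f"{len(flagged)} statement(s) grant public access")
```

Here only the first statement is flagged, because the second names a specific account rather than a wildcard principal.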

One of the most surprising things, though, is that cloud adopters are still not prioritizing security. A recent PwC survey (PwC US Cloud Business Survey) indicates that only 17 percent of organizations consider security and compliance during the planning phase of cloud adoption. The largest share of organizations surveyed (37 percent) take these factors into account when gathering details on business requirements, which is still better than during or after migration, but not optimal.

Ideally, security and compliance requirements should inform business requirements, not the other way around. In the long run, adopting systems that you can’t fully secure is a bigger impediment and burden to the organization. This doesn’t mean that security should become the “department of no”; security teams are just as responsible for business enablement as other parts of the organization. While leadership should provide the vision for digital transformation within the org, security is responsible for the blueprints. This might include building a hybrid strategy rather than full-on cloud adoption, or putting a timeline in place for when and how compliance and security requirements can be met for a dedicated cloud approach. As part of this process, security teams must investigate the types of tools and training that would improve security as the organization shifts to cloud.

Michael Osakwe, Content Marketing Manager, Nightfall AI

Michael Osakwe is a tech writer and Content Marketing Manager at Nightfall AI.
