From Paytm to UID – from the Ministry of Corporate Affairs to leading insurance and telecom companies – India Inc and the government are prone to poor design choices and sloppy programming.
The collection and treatment of user data by India Inc and the government is careless and callous at best. Credit: Reuters
At this very moment, there are a number of corporate websites in India that directly and indirectly expose the personal data of many individuals. Access to such data doesn’t require passwords or any form of sophisticated hacking. It’s there in the open and for the taking.
Not only can criminally motivated individuals harvest this “leakage” of data to defraud individuals on a mass scale, but businesses are harming themselves by disclosing valuable information. To remedy the situation, website owners need to perform audits of their systems and plug any leaks they find. The government urgently needs to enact stronger laws governing the collection, storage, use and dissemination of data and designate an agency to be responsible for enforcing the law. Finally, central and state law enforcement agencies need to vigorously enforce these laws.
While it may perhaps surprise individuals that data about them, such as customer names and contact details, is openly available online, they may also easily shrug it off as nothing serious. However, a website does not have to expose a person’s account number to put them at risk of fraud. In recent years there has been a huge increase in the number of attempts to gain access to an individual’s devices or email, social media accounts or financial accounts through targeted emails or messages using personal information. These attempts, known as phishing, rely on reaching individuals through mass email or messaging campaigns, like catching fish in trawler nets. A more targeted form of attack, called spear phishing, goes after specific individuals using personal information that would cause them to lower their guard.
Victims are often directed to a fake site that looks like the real one, in order to capture their login credentials. Another tactic is to get victims to open an attachment, which then installs some malicious software, “malware,” on their device. The malware may be used to capture passwords that the person enters on various Web sites. These passwords can give criminals access to the individual’s accounts, or, if the right person is compromised, they may give admin access to an entire system.
In the past, phishing emails told you that you had won the UK lottery or that a Nigerian minister wanted to share his looted money. Now orchestrated phishing campaigns say that your ICICI bank account has been hacked, your PayPal account will be frozen or that your email server has run out of space. The perpetrators hope that if they send fake PayPal messages to enough people, some will reach actual customers of the service, and of those a few will take the bait. Only a small percentage of the targeted individuals may fall victim, but done on a massive enough scale, this can still yield significant results.
Figure 1: Examples of PayPal phishing emails
What would help fraudsters is if they could improve the odds. Rather than sending emails to the general population, they would benefit if they had an email list of a bank’s customers or users of a specific website. Now they would be phishing with bait that is more likely to attract their prey.
Spear phishing takes more effort but it can have a high yield if the right individual is compromised. Case in point: Access to John Podesta’s emails (reportedly breached by a Russian phishing attack) helped significantly damage Hillary Clinton’s presidential campaign. Any bit of information could be useful for spear phishing. For example, knowing which school a person’s child attends could be used to send an email purportedly from the school with an attachment that spreads malware. Very few people would hesitate before clicking. Of course, this information could also be collected on an individual by following them, going through their garbage or talking to their employees and colleagues, but online this information can be harvested on a massive scale. Instead of targeting just one individual over a few weeks, it is possible to target anywhere from hundreds to millions.
Existing laws fall short
Online data privacy and protection are addressed by the Information Technology Act, 2000 (“IT Act”) and the Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data or Information) Rules, 2011 (“IT Rules”). The IT Act provides criminal penalties for unauthorised access to systems and data, and the IT Rules set up constraints for handling “sensitive personal data or information (SPDI)”. SPDI refers to information such as passwords, financial account details, sexual orientation, health information and biometrics.
India’s IT Rules are broadly in sync with privacy laws around the world, such as those of Hong Kong, the UK and the EU (although whether the provisions in the IT Rules are actually enforced is a different matter altogether). These laws all articulate certain core principles when it comes to data protection and privacy. While there is some variation, in general the principles are as follows:
1. An entity must disclose the purpose of data it collects and how it will be collected;
2. End users must be explicitly asked to give consent for their data to be collected and must be allowed to opt out;
3. Data must be used only for the purpose stated and may not be used for unlawful purposes;
4. Data that is collected should be tailored to the specific use and unnecessary information should not be collected;
5. Data should not be retained for any longer than needed to serve the purpose;
6. Data should be accurate and kept up-to-date, and there should be means for users to correct mistakes and to seek redress;
7. Data should be secured against unauthorised access and inadvertent disclosure, but also against loss or destruction;
8. There should be limits on sharing and transfer of data so that the other principles are not violated.
The issue with India’s IT Rules is that they only cover “sensitive” data. However, since even non-sensitive data can be useful for a phishing attack, the law needs to go further and cover all information about a person that could potentially be used against them. When data is available openly, it is possible to write scripts that “scrape” or read the data from web pages, harvesting thousands or millions of records. This could either be done in a few hours, before any system administrators can notice, or over weeks or months to avoid raising suspicion. Therefore even seemingly benign data leaks create a clear and present risk of individual financial fraud conducted on a large scale, such that people may lose confidence in online services and stop using them. Given the trend towards increasing digitalisation of commerce, slowing that momentum could pose a significant setback to the whole economy.
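One common defence against the rapid kind of scraping described above is to cap how many requests a single client may make in a given period. The following is a minimal sketch of a sliding-window limiter, not a description of how any site named in this article works; the class name, the `client` identifier and the limits are all assumptions made for illustration. It would not stop the slow, weeks-long harvesting mentioned above, which needs monitoring over longer periods.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within `window` seconds.

    Hypothetical helper for illustration only.
    """

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        # Maps a client identifier to the timestamps of its recent requests.
        self._hits = defaultdict(deque)

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits[client]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # Too many recent requests: refuse this one.
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=3, window=60.0)
results = [limiter.allow("client-a", t) for t in (0.0, 1.0, 2.0, 3.0)]
print(results)  # [True, True, True, False]
```

In practice the client identifier might be an account, a device or an IP address, and a determined scraper can spread requests across many of them, so a limiter of this kind is only one layer of protection.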
Legally, there are few repercussions to website owners if they leak data. In December 2016, Troy Hunt, an independent security researcher from Australia, wrote about a diagnostics lab in Mumbai that had exposed over 43,000 health records on the web. The data included results of HIV tests. This is clearly sensitive data of the type covered by the IT Rules, but so far there doesn’t seem to have been any official government response to this exposure.
Given the additional cost burden, it may seem obvious that private businesses would not want the data protection law to go further and cover all information about a person. However, such a measure is also in private industry’s own best interest in two ways. First, data leaks from other websites may allow criminals to succeed in phishing an employee with unrestricted access to critical data at his or her workplace. This could then be used to defraud customers or to lock up the company’s systems and demand a ransom to return control. So in order to protect their own businesses, leaders would want others to beef up security. It needs collective action.
Secondly, businesses that are leaking data should remember that if fraudsters can reach their customers, so can competitors. What better way to lure away customers than to target them with a more attractive offer? Knowing precisely who those customers are can be worth as much as the marketing budget. Of course, there is also the loss of trust and confidence, the public shaming, and perhaps even governmental sanctions.
Clear and Present Risk
If the concerns raised so far appear far-fetched, know that there are near-daily examples of data being leaked on the Internet. On February 18, programmer Srinivas Kodali posted a series of tweets saying that an undisclosed site was displaying Aadhaar numbers of children, potentially exposing 5-6 lakh numbers for harvesting, with no breach necessary to access the information.
This writer has found nearly a dozen instances of personal data being leaked by Indian websites. In most cases the sites are violating the data protection principles mentioned above. Below are a few examples that put individuals at risk of being phished. Some of the websites are being named and details of the data that is accessible are being shared because the sites were designed this way, so the site owners know or ought to know what data they are displaying. None of these involve breaking into systems using malware, backdoors or cracking passwords. The information is openly available for systematic harvesting.
ICICI Bank and Twitter Pay
In 2015, ICICI Bank launched a pay-by-Twitter service. At that time the service was flagged as lacking adequate security. However, although ICICI did modify one aspect of the Twitter authentication, the security flaw still remains. In addition, ICICI is ignoring the evidence in front of its eyes, or what is in its Twitter feed. Hundreds, if not thousands by now, of Twitter users have misunderstood the ‘direct messaging’ feature of Twitter and have tweeted their mobile numbers at the bank. This information is available to anyone and can easily be harvested via a script. While the service itself is poorly thought through, in terms of data privacy it violates principles 2 (explicit consent), 7 (securing dissemination) and 8 (data transfer).
Figure 2: The unnecessary collection of Aadhaar numbers is spreading everywhere.
Bank links with Uidai.net.in
The Aadhaar unique ID project has generated a great deal of concern about a surveillance state and fraud. While court challenges to halt its use continue, the number of entities that are asking for an Aadhaar number has mushroomed. The Ministry of Consumer Affairs’ Consumer Helpline page, Jet Airways and the government’s site for missing children are just a few examples. Many of these would seem to violate Principle 4 (not collecting unnecessary information). For example, it’s not clear why an airline would need the Aadhaar number when someone is booking a flight. Jet doesn’t explain this and did not respond to queries, which is against Principle 1 of disclosing why data is being collected. (It may also violate the Aadhaar Act.)
Harvesting Aadhaar numbers is a trivial matter, as Kodali’s tweets show. The site that leaked the data violates Principle 7 on data security and access, but incidents such as this are sure to increase as the number of entities collecting the number grows. There is also the potential of wilful abuse. The day after Kodali’s tweet, an article in Dainik Bhaskar stated that Reliance employees had been arrested for harvesting and selling Aadhaar data collected while registering new mobile phone customers.
As an ID number by itself, an Aadhaar that has been leaked may not be more problematic than other personal data. However, the government’s drive to link it to financial services does pose a problem. For example, the UIDAI Web site lets the public check whether an Aadhaar number has been linked to a bank account. Enter any Aadhaar number and the site will tell you whether it is linked to an account, when it was linked and to which bank. If fraudsters can collect Aadhaar numbers, the associated mobile numbers and the bank linked to that number, it would permit crooks to spear phish on a massive scale.
Figure 3: Anyone can get the status of any Aadhaar number.
PayTM disclosed account holders’ names
PayTM has been much in the news lately due to demonetisation and people’s search for a digital payment method. Everyone who registers with PayTM must provide a mobile number. The company claims to have 147 million account holders, and according to TRAI there are about 970 million mobile numbers, so roughly 15% of phone numbers, or about 1 in 7, may belong to a PayTM customer. Up until an update around February 12 (one week after this author contacted PayTM about the data leak), PayTM made it possible to improve those odds and, in fact, to build a definitive list of PayTM customers. In the previous version, when a person sent money to a phone number using the app, it told the sender whether or not the number was a registered PayTM customer, and if it did belong to a customer, it displayed the name of the account holder. That is enough information to mount a phishing attack.
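The odds quoted above come from dividing the two figures given in the text. As a quick check of that arithmetic (the variable names are illustrative, and the calculation assumes one account per mobile number, which the figures themselves do not guarantee):

```python
# Figures as quoted in the article.
account_holders = 147_000_000   # PayTM's claimed account holders
mobile_numbers = 970_000_000    # mobile numbers, according to TRAI

# Chance that a randomly chosen mobile number belongs to a PayTM customer,
# assuming one account per number.
hit_rate = account_holders / mobile_numbers

print(f"{hit_rate:.1%}")            # 15.2%
print(f"1 in {1 / hit_rate:.1f}")   # 1 in 6.6
```

A site that confirms which numbers are registered removes that uncertainty entirely, which is why the confirmation itself counts as a leak.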
If someone had wanted to, they could have employed a thousand people who had a mobile phone, asked each of them to get a PayTM wallet, added Rs 1,000 to each wallet, and then instructed each employee to start with a different mobile number and pay Re 1 to every number sequentially, noting the details any time a phone number was registered with a wallet. They could have easily collected a list of hundreds of thousands of customers who could then be targeted. The old PayTM was violating at the very least principle 7 (securing information).
Figure 4: How PayTM used to disclose who its customers are, information that can be used for phishing.
Insurance company leaks email, nominee details
One of India’s largest insurance companies has a web page for renewing an insurance policy. There, the user can either sign in using their user ID and password, or simply enter the policy number. On submitting a valid policy number, the user is taken through a series of pages that confirm the type of insurance and premium amount, and the nominee name, address and the customer’s email address. This page is insecure because a person could type any policy number and if it matches an existing policy, they would get to the personal information. Therefore a systematic probe of all possible policy numbers can yield a wealth of data including name, address, email address and spouse/close relative name. This goes against principle 7, but at the moment there is no penalty for this.
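One way to close this kind of hole is to require a second piece of information that only the customer should know, to give the same answer whether the policy number is wrong or the second detail is wrong, and to show no personal details until the customer has signed in. The sketch below illustrates that idea only. The `Policy` record, the function name and the use of date of birth as the second factor are assumptions made for this example, not a description of the insurer’s system, and a date of birth is itself a weak secret.

```python
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy record, for illustration only."""
    number: str
    date_of_birth: str  # ISO format, e.g. "1980-01-01"
    holder_email: str
    nominee_name: str


GENERIC_FAILURE = "We could not verify these details."


def start_renewal(policies: dict[str, Policy], number: str,
                  date_of_birth: str) -> tuple[bool, str]:
    """Begin a renewal only if both the policy number and date of birth match."""
    policy = policies.get(number)
    # Compare against a dummy value when the policy does not exist, so that
    # a wrong number and a wrong date of birth follow the same code path.
    expected = policy.date_of_birth if policy else "0000-00-00"
    matches = hmac.compare_digest(expected.encode(), date_of_birth.encode())
    if policy is None or not matches:
        # Same message either way: the caller cannot tell which detail failed.
        return False, GENERIC_FAILURE
    # Confirm with a masked number only; no email, address or nominee details.
    return True, f"Renewal started for policy ending {number[-4:]}"


policies = {
    "P1234567": Policy("P1234567", "1980-01-01", "holder@example.com", "A. Nominee"),
}
print(start_renewal(policies, "P1234567", "1980-01-01"))
print(start_renewal(policies, "P7654321", "1980-01-01"))
```

Because a guessed policy number and a wrong date of birth produce the same reply, a systematic probe of policy numbers no longer reveals which ones exist.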
There are many other instances, including the Ministry of Corporate Affairs (MCA) inadvertently leaking PAN numbers (the site has been changed to remove this after this writer contacted the MCA), a mobile company that confirms subscriber numbers and partially discloses the email address, and a bank that makes it possible to guess user IDs. Sometimes the leaks are a result of poor design or sloppy programming, and sometimes they are violating some aspects of the data protection principles. In each of these instances, the companies were asked about the data leaks, but, aside from the MCA, none of them have responded or changed their sites.
Going forward, every company should perform an audit of its web sites for data leaks. The CBI or some other law enforcement agency should investigate reports of data leaks for violation of the IT Rules and other current laws. Finally, the government should enact new legislation that extends IT Rules or the data protection principles to all personal data, while imposing accountability and assigning responsibility for financial loss or costs incurred due to data leakage.
Sushil Kambampati (@SKisContent) is the publisher of NewsPie.in, India’s first micropayment-based news and information platform, and YouRTI.in, where anyone can suggest an RTI query simply and anonymously. He writes about online security and privacy.