In October 2019, Adobe faced a data breach in which around 7 million Creative Cloud accounts were compromised. The source of the breach? A database that was directly exposed to the Internet. According to some estimates, there is over an exabyte of data currently stored in the cloud. A significant amount of that data, often confidential, is stored in databases, as in Adobe's case. This makes databases an appealing target for attackers.
If you store databases in the cloud, the question is less whether you'll be attacked than when. To minimize your liability, you need to take proactive steps to secure your databases.
Cloud Security Basics
Public cloud services work on a shared responsibility model. According to this model, your responsibilities differ depending on the type of cloud service you use. The table below breaks down security responsibilities for each available public cloud model.
| | Software as a Service | Platform as a Service | Infrastructure as a Service |
| --- | --- | --- | --- |
| Your Responsibility | Data, endpoints, accounts, access management | Data, endpoints, accounts, access management | Data, endpoints, accounts, access management, identity and directory infrastructure, applications, network controls, and OS |
| Shared Responsibility | Identity and directory infrastructure | Identity and directory infrastructure, applications, and network controls | |
| Cloud Provider's Responsibility | Applications, network controls, OS, physical hosts, physical network, and physical data centers | OS, physical hosts, physical network, and physical data centers | Physical hosts, physical network, and physical data centers |
Cloud providers offer a variety of paid and unpaid services to help you meet your security responsibilities, such as monitoring and logging solutions, key management solutions, and access control services. They also typically provide recommendations and best practices in their documentation. The cloud services you choose determine which of these services and practices apply to you.
Your responsibilities and risks differ depending on whether you connect your applications to an external database or host your own. Cloud providers do not take responsibility for self-hosted resources. If you need to host your databases separately, you can choose storage that integrates directly with your cloud provider, such as the integrations available for Azure file storage. Supported integrations often include at least some direct security support from your cloud provider.
4 Best Practices for Securing Databases in the Cloud
#1. Keep Database and Applications Separate
You can use network segmentation to isolate your databases. Network segmentation is the use of firewalls and Access Control Lists (ACLs) to isolate services by controlling traffic flow. It often includes layering levels of authentication with your highest priority data at the lowest level. You can use network segmentation to ensure that Internet connections are only able to access your applications, not your databases directly.
Virtual Private Clouds (VPCs) are one way of accomplishing segmentation. AWS, Azure, and GCP all have services for creating VPCs. In a VPC, applications can still access databases via predefined IP addresses and network gateways, but direct Internet connections to the databases are not allowed. VPCs and network segmentation can help limit access to your databases in case an application or endpoint is breached.
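To make the idea of an allow-list concrete, here is a minimal Python sketch of how a default-deny ingress rule for a database tier might be evaluated. The subnet, port, and names such as `DB_INGRESS_RULES` and `is_allowed` are hypothetical and chosen only for illustration. Real cloud firewalls and security groups are configured through the provider's own tools, not code like this.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    source: str  # CIDR block that is allowed to connect
    port: int    # destination port on the database tier


# Hypothetical allow-list: only the application subnet may reach the database port.
DB_INGRESS_RULES = [Rule(source="10.0.1.0/24", port=5432)]


def is_allowed(src_ip: str, port: int, rules=DB_INGRESS_RULES) -> bool:
    """Default deny: traffic passes only if at least one rule matches it."""
    addr = ipaddress.ip_address(src_ip)
    return any(
        addr in ipaddress.ip_network(rule.source) and port == rule.port
        for rule in rules
    )


print(is_allowed("10.0.1.17", 5432))    # application server inside the app subnet
print(is_allowed("203.0.113.9", 5432))  # arbitrary Internet address
```

The important design choice is the default: anything not explicitly listed is rejected, so a new Internet-facing path to the database cannot appear by accident.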
#2. Always Use Encryption
You should always encrypt data at rest and in transit. The major cloud vendors provide built-in at-rest encryption with their services. Depending on the service, they may also include in-transit encryption. In-transit encryption is typically performed with TLS, the successor to the now-deprecated SSL protocol. At-rest encryption is performed either client-side or server-side and typically uses AES-256.
Client-side encryption occurs on-premises. Your data is first encrypted and then transferred to the cloud. Server-side encryption occurs in the cloud and requires plaintext data to be transferred from on-premises. Client-side encryption is more secure, but it requires you to manage your encryption keys independently. The major cloud vendors offer paid services that enable you to use client-side keys. Alternatively, you can use a third-party service.
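For in-transit encryption, most of the work is usually done by your database driver, but it helps to see what a strict client configuration looks like. The sketch below uses Python's standard `ssl` module to build a context that verifies the server certificate and refuses protocol versions older than TLS 1.2. The function name `make_db_tls_context` is an assumption for illustration, and how you pass the context to a driver depends on the driver you use.

```python
import ssl


def make_db_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses legacy protocol versions."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Reject SSLv3, TLS 1.0, and TLS 1.1 connections.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_db_tls_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```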
You can also consider using tokenization. Tokenization involves mapping data values to a generated token value. The relationship between the value and the assigned token is stored in a database called a token vault. This token value is then sent to the application or user requesting the data to be used as a reference. The data value itself is never transferred to the application or user. Tokenization is particularly useful for one-time access of data and for processing payment information.
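The mapping between values and tokens can be sketched in a few lines of Python. This is an in-memory illustration only: the `TokenVault` class and the `tok_` prefix are hypothetical, and a real token vault is a hardened, access-controlled datastore rather than a pair of dictionaries.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a token vault, for illustration only."""

    def __init__(self) -> None:
        self._value_by_token = {}
        self._token_by_value = {}

    def tokenize(self, value: str) -> str:
        """Return the token for a value, creating a new one if needed."""
        if value in self._token_by_value:
            return self._token_by_value[value]
        # The token is random, so it reveals nothing about the value.
        token = "tok_" + secrets.token_urlsafe(16)
        self._value_by_token[token] = value
        self._token_by_value[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Resolve a token back to its value; restrict this to trusted services."""
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("4111 1111 1111 1111")  # a widely used test card number
print(token)  # the application stores and passes around only this reference
```

Because the token is generated randomly rather than derived from the value, stealing tokens from an application is of little use without access to the vault itself.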
#3. Restrict Access to Data
You should restrict access to your databases according to the principle of least privilege. The principle states that only those users and applications that require access should have it. You should grant users and applications the minimum amount of privileges necessary to perform required tasks.
Set up permissions according to roles whenever possible. Role-based permissions enable you to more easily apply uniform permissions to users. You can also more easily modify permissions when roles change.
Role-based distribution prevents you from having to modify users individually. Consider also separating administrative credentials into role-specific users. Separating high-level privileges limits the amount of damage that can be done by compromised credentials.
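A role-based check with least privilege can be sketched as follows. The role names, user names, and the `can` helper are all hypothetical. In practice you would define roles and grants in your database or your cloud provider's access management service, but the lookup logic is the same: users get roles, roles get permissions, and anything not granted is denied.

```python
# Hypothetical roles, each limited to the actions it actually needs.
ROLE_PERMISSIONS = {
    "analyst": {"SELECT"},
    "app_service": {"SELECT", "INSERT", "UPDATE"},
    "schema_admin": {"CREATE", "ALTER", "DROP"},   # administrative duties are
    "backup_admin": {"BACKUP", "RESTORE"},         # split across separate roles
}

# Hypothetical users and services mapped to roles, never to raw permissions.
USER_ROLES = {
    "dana": {"analyst"},
    "orders-api": {"app_service"},
    "sam": {"schema_admin"},
}


def can(user: str, action: str) -> bool:
    """Allow an action only if one of the user's roles grants it."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


print(can("dana", "SELECT"))      # analysts can read
print(can("dana", "DROP"))        # but cannot change the schema
print(can("sam", "RESTORE"))      # a schema admin cannot restore backups
```

When someone changes jobs, you update one entry in the user-to-role mapping instead of editing individual permissions.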
#4. Store Only the Data You Need
In general, you should only store the data you need. In particular, you should try to limit the amount of Personally Identifiable Information (PII) you store. PII is any information that can be used to identify an individual. This includes name, ID number, date of birth, and biometric records. It also includes information linked directly to an individual, such as financial, medical, employment, or educational information.
If you collect PII but don't need to retain the actual PII values, you can anonymize your data before storage. For example, you might do this if you are analyzing medical trend data and only need general demographic data.
You can anonymize data using a variety of techniques, including data masking, pseudonymization, and generalization:
- Data masking: modifying data with techniques like encryption or character substitution.
- Pseudonymization: replacing private IDs with fake IDs while preserving statistical accuracy.
- Generalization: classifying data into categories or ranges to remove specific values.
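The three techniques above can be sketched with Python's standard library. The field names, the record, and the key below are hypothetical. The keyed hash (HMAC-SHA256) is one possible way to generate consistent fake IDs, not the only one, and the key would need to be managed as a secret.

```python
import hashlib
import hmac


def mask_email(email: str) -> str:
    """Data masking by character substitution: keep the first letter and domain."""
    local, _, domain = email.partition("@")
    return local[:1] + "*" * (len(local) - 1) + "@" + domain


def pseudonymize(patient_id: str, key: bytes) -> str:
    """Replace an ID with a keyed hash; equal inputs map to equal pseudonyms."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def generalize_age(age: int) -> str:
    """Generalization: report a ten-year range instead of the exact age."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


SECRET_KEY = b"replace-with-a-managed-secret"  # placeholder, not a real key
record = {"patient_id": "P-10442", "email": "jane.doe@example.com", "age": 37}

anonymized = {
    "patient_id": pseudonymize(record["patient_id"], SECRET_KEY),
    "email": mask_email(record["email"]),
    "age_range": generalize_age(record["age"]),
}
print(anonymized)
```

Because the same ID always maps to the same pseudonym, you can still count visits per patient or join tables without storing the real identifier.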
Limiting the data you store helps limit your liability in case of a breach. In addition to limiting live data, take care to limit backups and archive data. To minimize the amount of data you store, ensure that your backups and archives don't duplicate one another.
Store backups and archives in highly restricted areas or on-premises. Backups and archives typically only need to be accessed by administrators. There is no reason to store them with the same access permissions as your active databases.
Keeping your databases secure can be challenging, but it's not impossible. If you take the proper precautions, you can significantly limit your risks. Hopefully, this article helped you understand the basics of database security and provided some useful tips for protecting your databases.
If you’ve already implemented these practices and are still looking for a little extra security, consider adding some “canary data” to your databases. Canary data is data that has no legitimate purpose. If you find it in the wild, you know it was stolen from your database.
You can also use this data as a flag, so that if an attacker attempts to transfer it outside of your system, you’ll receive an alert. Canary data can’t keep an attacker out of your system, but it can help you minimize or prevent damage.
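As a rough illustration, the Python sketch below generates a few canary email addresses and scans an outbound payload for them. The domain `canary.example.com` and the function names are hypothetical. A real deployment would hook a check like this into egress monitoring or a data loss prevention tool instead of printing to the console.

```python
import secrets


def make_canary_email(domain: str = "canary.example.com") -> str:
    """Generate a unique address that no real user or system should ever use."""
    return f"user-{secrets.token_hex(8)}@{domain}"


def find_canaries(payload: str, canaries: set) -> set:
    """Return the canary values that appear in an outbound payload or leaked dump."""
    lowered = payload.lower()
    return {canary for canary in canaries if canary.lower() in lowered}


# Plant these values as ordinary-looking rows in the database.
canaries = {make_canary_email() for _ in range(3)}

# Simulated export that contains one real-looking row and one canary row.
outbound = "id,email\n1,alice@example.org\n2," + next(iter(canaries)) + "\n"

hits = find_canaries(outbound, canaries)
if hits:
    print(f"ALERT: {len(hits)} canary record(s) found in outbound data")
```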