MongoDB Security: Best Practices to Keep Your Data Safe

With the increasing demand for globally available systems, data has become the most valuable asset in many organizations. Without it, a company may not be able to adequately provide its services to its customers. Worse yet, if the data falls into the wrong hands it could cause irreparable harm to a business’s customers, its reputation, and its bottom line.

MongoDB, also known as Mongo, is a document database used in many modern web applications. As with any database management system, it’s critical that those responsible for managing a Mongo database adhere to the recommended security best practices, both to prevent data from being lost in the event of a disaster and to keep it out of the hands of malicious actors.

This series of conceptual articles provides a high-level overview of MongoDB’s built-in security features while also highlighting some general database security best practices.

The most fundamental way you can protect the data you store in MongoDB is to limit network access to the server on which the database is running. One way to do this is to provision a virtual private network (VPN). A VPN presents its connection as if it were a local private network, allowing for secure communications between the servers within it. By running MongoDB behind a VPN, you can block access to any machine that isn’t connected to the same VPN.

On its own, though, a VPN may not be enough to prevent unauthorized users from accessing your MongoDB installation. For instance, there may be a large number of people who need access to your VPN but only a few of them need access to your Mongo database. You could have more granular control over who has access to your data by setting up a firewall on your database server.

A firewall provides network security by filtering incoming and outgoing traffic based on a set of user-defined rules. Firewall tools generally allow you to define rules with a high level of precision, giving you the flexibility to grant connections from specific IP addresses access to specific ports on your server. For example, you could write rules that would only allow an application server access to the port on your database server used by a MongoDB installation.
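
As a rough illustration of the rule described above, the following Python sketch models a default-deny rule set in which a single application server may reach the database port. The `AllowRule` class, the `is_allowed` function, and the IP addresses are hypothetical names chosen for this example. Port 27017 is MongoDB's default port, which is an assumption about your installation. A real firewall tool enforces such rules at the network level rather than in application code.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class AllowRule:
    """Permit traffic from one source network to one destination port."""
    source: str  # CIDR notation, e.g. "203.0.113.10/32"
    port: int


def is_allowed(rules, source_ip, port):
    """Default-deny: a connection passes only if some rule matches it."""
    addr = ip_address(source_ip)
    return any(
        addr in ip_network(rule.source) and port == rule.port
        for rule in rules
    )


# Hypothetical rule set: only the application server may reach port 27017.
RULES = [AllowRule(source="203.0.113.10/32", port=27017)]

print(is_allowed(RULES, "203.0.113.10", 27017))  # the application server
print(is_allowed(RULES, "198.51.100.7", 27017))  # any other machine
```

Because nothing is allowed unless a rule matches, adding a second application server means adding a second rule rather than loosening an existing one.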

Another way to limit your database’s network exposure is to configure IP binding. By default, MongoDB is bound only to localhost upon installation. This means that, without further configuration, a fresh Mongo installation will only be able to accept connections that originate from localhost, or the same server on which the MongoDB instance is installed.

This default setting is secure, since it means the database is only accessible to those who already have access to the server on which it’s installed. However, this setting will cause problems if you need to access the database remotely from another machine. In such cases, you can additionally bind your instance to an IP address or hostname where the remote computer can reach the database server.
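
MongoDB's configuration file exposes this setting as `net.bindIp`, which accepts a comma-separated list of addresses. The sketch below is a small, hypothetical helper (`exposed_bindings` is not part of MongoDB) that flags entries in such a list which listen on every network interface. It could be used to review a configuration before deploying it. The addresses in the usage lines are placeholders.

```python
def exposed_bindings(bind_ip):
    """Return the entries of a comma-separated bind list that listen on every interface."""
    wildcards = {"0.0.0.0", "::"}
    entries = [entry.strip() for entry in bind_ip.split(",") if entry.strip()]
    return [entry for entry in entries if entry in wildcards]


# Localhost plus one private address: nothing is flagged.
print(exposed_bindings("127.0.0.1,10.0.0.5"))

# A wildcard address accepts connections on all interfaces and is flagged.
print(exposed_bindings("127.0.0.1,0.0.0.0"))
```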

Authorization and authentication are two concepts that are critical for understanding database security. These two concepts are similar, but it’s important to understand what they are and what makes them different. Authentication is the process of confirming whether a user or client is actually who they claim to be. Authorization, on the other hand, involves setting rules for a given user or group of users to define what actions they can perform and which resources they can access.

Authentication

MongoDB features several mechanisms that allow you to authenticate users, with the default mechanism being its Salted Challenge Response Authentication Mechanism (SCRAM). SCRAM involves MongoDB reading and verifying credentials presented by a user against a combination of their username, password, and authentication database, all of which are known by the given MongoDB instance. If any of the user’s credentials don’t match what the Mongo database expects, the database won’t authenticate the user and they won’t gain access until they present the correct username, password, and authentication database.
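
To give a sense of how a challenge-response mechanism can verify a password without the password itself crossing the network, here is a simplified Python sketch of the key derivation and proof steps defined in RFC 5802, the SCRAM specification. It is not MongoDB's implementation: the function names, the iteration count, and the `auth_message` value are assumptions made for illustration, and a real handshake also exchanges nonces and normalizes the password.

```python
import hashlib
import hmac
import os


def derive_keys(password, salt, iterations):
    """Derive the SCRAM client key and stored key from a password."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    return client_key, stored_key


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def client_proof(password, salt, iterations, auth_message):
    """What the client sends: its key masked by a signature over the exchange."""
    client_key, stored_key = derive_keys(password, salt, iterations)
    signature = hmac.new(stored_key, auth_message, hashlib.sha256).digest()
    return xor_bytes(client_key, signature)


def server_verify(stored_key, auth_message, proof):
    """What the server checks, using only the stored key and never the password."""
    signature = hmac.new(stored_key, auth_message, hashlib.sha256).digest()
    recovered_client_key = xor_bytes(proof, signature)
    candidate = hashlib.sha256(recovered_client_key).digest()
    return hmac.compare_digest(candidate, stored_key)


# Hypothetical user record created when the user was added.
salt = os.urandom(16)
iterations = 15000  # arbitrary value chosen for this example
_, stored = derive_keys("correct horse", salt, iterations)

# Stand-in for the messages exchanged during a real handshake.
auth_message = b"n=sammy,r=clientnonce,r=clientnonce-servernonce"

good = client_proof("correct horse", salt, iterations, auth_message)
bad = client_proof("wrong guess", salt, iterations, auth_message)
print(server_verify(stored, auth_message, good))  # True
print(server_verify(stored, auth_message, bad))   # False
```

The server stores only the salt, the iteration count, and the derived key, so a copy of the user record does not directly reveal the password.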

You can also use a text file to act as a shared password for a group of connected MongoDB instances, such as a replica set or sharded cluster. This method, known as keyfile authentication, is considered to be a bare-minimum form of security and is best suited for testing or development environments, as advised by the MongoDB documentation.

For production environments that implement sharding or replication, the MongoDB documentation recommends using another authentication mechanism: x.509 authentication. This involves distributing valid x.509 certificates (either self-signed or obtained from a third-party certificate authority) to the intended cluster members or clients. These are different from keyfiles, though, in that each machine gets its own dedicated x.509 certificate. This means that one machine’s certificate will only be useful for authenticating that machine. A client that presents a copy of another machine’s certificate without holding the matching private key will not be able to authenticate.

x.509 authentication leverages a concept known as mutual authentication. This means that when a client or cluster member authenticates itself to the server, the server likewise authenticates itself to the client or cluster member. If the client or cluster member attempts to connect to a database server with an invalid x.509 certificate, it will be prevented from doing so since the mutual authentication will fail.
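
The two halves of mutual authentication can be sketched with Python's standard `ssl` module. This is a generic TLS illustration rather than MongoDB configuration: the function names are hypothetical, and the certificate file paths in the comments are placeholders that you would replace with your own files.

```python
import ssl


def server_context():
    """TLS settings for a server that refuses clients without a valid certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # the client must present a certificate
    # ctx.load_cert_chain("server.pem")      # hypothetical path: server certificate and key
    # ctx.load_verify_locations("ca.pem")    # hypothetical path: CA that signed client certificates
    return ctx


def client_context():
    """TLS settings for a client that verifies the server before trusting it."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # ctx.load_cert_chain("client.pem")      # hypothetical path: this client's certificate and key
    # ctx.load_verify_locations("ca.pem")    # hypothetical path: CA that signed the server certificate
    return ctx


print(server_context().verify_mode)
print(client_context().check_hostname)
```

Each side verifies the other's certificate against a certificate authority it trusts, so a failure on either side ends the connection before any data is exchanged.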

Authorization

MongoDB manages authorization through a computer security concept known as role-based access control. Whenever you create a MongoDB user, you have the option to provide them with one or more roles. A role defines what privileges a user has, including what actions they can perform on a given database, collection, set of collections, or cluster. When you assign a role to a user, that user receives all the privileges of that role.

MongoDB comes with a number of built-in roles that provide commonly needed privileges. A few of these are available for every database, but most are only available for the admin database, as they’re intended to provide powerful administrative privileges. For example, you can assign a user the readWrite role on any database, meaning that the user can read and modify the data held in any database on your system as long as you’ve granted them the readWrite role over it. However, the readWriteAnyDatabase role, which allows the user to read and modify data on any database except for local and config, is only available in the admin database, as it provides broader system privileges.

In addition to its built-in roles, Mongo also allows you to define custom roles, giving you even more control over what resources users can access on your system. Like users, roles are added in a specific database. Other than roles created in the admin database, which can include privileges to any database in the system, a user-defined role’s privileges only apply to the database in which the role was created. With that said, a role can include one or more existing roles in its definition, and a role can inherit privileges from other roles in the same database.
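
The way privileges accumulate through role inheritance can be modeled in a few lines of Python. This is a conceptual sketch only: the `Role` class, the role names, and the `orders` collection are hypothetical, and MongoDB resolves privileges internally rather than through code like this.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    # Each privilege pairs a resource (a collection name here) with an action.
    privileges: set = field(default_factory=set)
    inherits: list = field(default_factory=list)  # names of other roles


def effective_privileges(role_name, roles, _seen=None):
    """Collect a role's own privileges plus everything it inherits."""
    seen = _seen if _seen is not None else set()
    if role_name in seen:  # guard against circular definitions
        return set()
    seen.add(role_name)
    role = roles[role_name]
    result = set(role.privileges)
    for parent in role.inherits:
        result |= effective_privileges(parent, roles, seen)
    return result


# Hypothetical custom roles defined in a single database.
ROLES = {
    "orderReader": Role("orderReader", {("orders", "find")}),
    "orderWriter": Role(
        "orderWriter",
        {("orders", "insert"), ("orders", "update")},
        inherits=["orderReader"],
    ),
}


def can(role_names, resource, action):
    """Check whether any of a user's roles grants the action on the resource."""
    return any(
        (resource, action) in effective_privileges(name, ROLES)
        for name in role_names
    )


print(can(["orderWriter"], "orders", "find"))    # inherited from orderReader
print(can(["orderReader"], "orders", "insert"))  # not granted
```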

With such fine-grained control over user privileges, you can set up dedicated users to perform certain functions, like a cluster administrator to manage replica sets and sharded clusters or a user administrator to create and manage users and custom roles. This type of user management strategy can also help harden your system’s security, as it reduces the number of users with broad privileges.

Encryption is the process of converting a piece of information from plaintext, the information’s original form, into ciphertext, an unreadable form that can only be read by a person or computer that has the right cipher to decrypt it. If a malicious actor were to intercept a piece of encrypted data, they wouldn’t be able to read it until they’re able to decrypt it.

You can encrypt communications between your MongoDB instance and whatever clients or applications need access to it by configuring it to require connections that use Transport Layer Security, also known as TLS. Like its predecessor, Secure Sockets Layer (SSL), TLS is a cryptographic protocol that uses certificate-based authentication to encrypt data as it’s transmitted over a network.

Note that TLS only encrypts data as it moves over a network, otherwise known as data in transit. Even if you’ve configured Mongo to require connections to be made with TLS, the static data stored on the database server, called data at rest, will still be unencrypted. It isn’t possible to encrypt data at rest with the free Community Edition of MongoDB, but it is possible with Mongo’s paid subscription-based Enterprise Edition.

Even with both encryption-at-rest and encryption-in-transit enabled, though, your sensitive data could potentially still be accessed by an unapproved user. Consider, for example, a scenario where you’ve deployed a sharded NoSQL document database to store data for an ice cream delivery application you’ve developed. The database management system allows you to encrypt data at rest, which you enable, and you also configure it to require encrypted TLS connections between the shards as well as any clients.

In this example situation, when a customer places an order they’re asked to submit a few pieces of sensitive information, like their home address or their credit card number. The application then writes this information to the database in a document, like this:

{
  "name" : "Sammy Shark",
  "address" : {
    "street" : "602 Surf Ave",
    "city" : "Brooklyn",
    "state" : "New York",
    "zip" : 11224
  },
  "phone" : "555-555-1234",
  "creditcard" : "1234567890123456"
}

This is a potential security vulnerability, since anyone who has privileges to access the database could see and take advantage of your customers’ sensitive information.

To help mitigate this type of risk, since version 4.2 the official MongoDB drivers allow you to perform client-side field level encryption. This means that, when properly configured, an application can encrypt certain fields within a document before the data is sent to the database. Once the data has been written to the database, only applications or clients that can present the correct encryption keys will be able to decrypt and read the data in these fields. Otherwise, the data document would look similar to this, assuming the street, city, zip, phone, and creditcard fields have been encrypted on the client’s side:

{
  "name" : "Sammy Shark",
  "address" : {
    "street" : BinData(6,"eirefi3eid5feiZae9t+oot0noh9oovoch3=iethoh9t"),
    "city" : BinData(6,"xiesoh+aiveez=ngee1yei+u0aijah2eeKu7jeeB=oGh"),
    "state" : "New York",
    "zip" : BinData(6,"CoYeve+ziemaehai=io1Iliehoh6rei2+oo5eic0aeCh")
  },
  "phone" : BinData(6,"quas+eG4chuolau6ahq=i8ahqui0otaek7phe+Miexoo"),
  "creditcard" : BinData(6,"rau0Teez=iju4As9Eeyiu+h4coht=ukae8ahFah4aRo=")
}

MongoDB stores encrypted values as binary data, as indicated by the BinData class labels in the previous example. The 6 in each value represents the binary subtype in which the data is stored, and indicates the kind of binary data that’s been encoded. Values that have been encrypted with Mongo’s client-side field level encryption always use subtype 6.
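
The idea of transforming only selected fields before a document leaves the application can be sketched as follows. This is not how the MongoDB drivers are invoked, and it performs no real encryption: `Binary`, `transform_fields`, and `placeholder_encrypt` are hypothetical names, and the placeholder function only marks where a driver would substitute ciphertext.

```python
import copy
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    """Stand-in for a BSON binary value: a subtype number plus raw bytes."""
    subtype: int
    data: bytes


def transform_fields(document, paths, encrypt):
    """Return a copy of document with encrypt() applied to each dotted path."""
    result = copy.deepcopy(document)
    for path in paths:
        *parents, leaf = path.split(".")
        target = result
        for key in parents:
            target = target[key]
        target[leaf] = Binary(subtype=6, data=encrypt(target[leaf]))
    return result


def placeholder_encrypt(value):
    """NOT encryption: a marker so the example runs without a crypto library."""
    return b"<ciphertext for %d characters>" % len(str(value))


order = {
    "name": "Sammy Shark",
    "address": {"street": "602 Surf Ave", "state": "New York"},
    "creditcard": "1234567890123456",
}
protected = transform_fields(order, ["address.street", "creditcard"], placeholder_encrypt)

print(protected["name"])                # left as plaintext
print(protected["creditcard"].subtype)  # 6
```

Fields that aren't listed, such as `name` and `address.state`, stay readable to anyone with access to the database, which is why choosing the list of protected fields deserves care.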

No matter what precautions you or your cloud provider take to prevent them, computers are always at risk of hardware failure. An important part of managing any computer system, not just a MongoDB installation, is to make regular backups of your important information. By taking and storing backups of your data, you can restore your application to working order if your database server crashes and your original data is lost.

Just as you should regularly back up your MongoDB data, it’s equally important that you store those backups in a separate location from the server hosting your database. If you were to store your backups in the same data center as your database, both the database and your backups would be unavailable if the data center were to experience a failure and you wouldn’t be able to use the backups to get your application back online.
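
One way to script such a backup is with the `mongodump` utility that ships with MongoDB's database tools. The Python sketch below only builds the command. The `backup_command` function name, the connection string, and the `/mnt/offsite` directory are assumptions for illustration, with the directory standing in for storage mounted from a separate location.

```python
from datetime import datetime, timezone
from pathlib import Path


def backup_command(uri, backup_dir, now=None):
    """Build a mongodump invocation that writes one compressed archive per run."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    archive = Path(backup_dir) / f"mongodb-{stamp}.archive.gz"
    return ["mongodump", f"--uri={uri}", f"--archive={archive}", "--gzip"]


# Hypothetical connection string and off-site mount point.
command = backup_command("mongodb://localhost:27017", "/mnt/offsite")
print(" ".join(command))
```

On a machine with `mongodump` installed, the resulting list could be passed to `subprocess.run(command, check=True)` from a scheduled job. Timestamped file names keep earlier backups from being overwritten.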

Replication is a practice that’s similar to making backups: where making a backup involves taking a point-in-time snapshot of all the data held in a database, replication involves constantly synchronizing data across multiple separate databases. It’s often useful to have multiple replicas of your data, as this provides redundancy in case one of the database servers fails and can also improve a database’s availability and scalability, as well as reduce read latencies. In MongoDB, a group of servers that maintain the same data set through replication is referred to as a replica set.

The official documentation recommends that any Mongo database used in a production environment be deployed as a replica set, since MongoDB replica sets employ a feature known as automatic failover. This means that if the primary fails and is unable to communicate with the secondary members for a predetermined amount of time, the secondary members will automatically elect a new primary member, thereby ensuring that your data remains available to your application or the clients that depend on it.
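
An election can only succeed when a majority of the replica set's voting members can reach one another, which determines how many failures a set can tolerate. The arithmetic is sketched below. The function names are hypothetical, and the sketch ignores details such as non-voting members.

```python
def majority(voting_members):
    """Smallest number of members that forms a majority."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members):
    """How many members can be lost while an election can still succeed."""
    return voting_members - majority(voting_members)


for size in (3, 4, 5):
    print(size, majority(size), fault_tolerance(size))
```

A four-member set tolerates no more failures than a three-member set, which is one reason replica sets are commonly deployed with an odd number of members.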

No matter how much effort you put into hardening your MongoDB installation’s security, it’s inevitable that new vulnerabilities will arise over time. As important as it is to run Mongo with secure settings from the outset, it’s just as important to perform frequent checks and diagnostics to determine the status of your system’s security.

For instance, you should regularly check for new updates to MongoDB to ensure that the version you’re using doesn’t have any unpatched vulnerabilities. Mongo version numbers take the form of X.Y.Z, with X referring to the version number, Y referring to the release or development series number, and Z referring to the revision or patch number. MongoDB puts out a new release roughly every year, with the latest at the time of this writing being 4.4, but they also put out new revisions and patches as needed.

While MongoDB generally recommends that you use the latest version available to optimize security, be aware that a new release series (meaning, from version 4.4 to version 4.6) can potentially break backwards compatibility. That said, MongoDB recommends that you always upgrade to the latest stable revision of your release series (meaning, if you have version 4.4.4 installed, you should upgrade to 4.4.5 when it’s available) as these are generally backwards-compatible patches intended to fix bugs.
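
The distinction between a patch and a new release series can be checked mechanically. The following Python sketch assumes version strings in the X.Y.Z form described above. The `Version` class and the `upgrade_kind` function are hypothetical helpers, not part of any MongoDB tool.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    series: int
    patch: int


def parse_version(text):
    """Split an X.Y.Z version string into its three numeric parts."""
    major, series, patch = (int(part) for part in text.split("."))
    return Version(major, series, patch)


def upgrade_kind(installed, candidate):
    """Classify a candidate upgrade relative to the installed version."""
    old, new = parse_version(installed), parse_version(candidate)
    if new <= old:
        return "not an upgrade"
    if (new.major, new.series) == (old.major, old.series):
        return "patch"  # same release series
    return "new release series"  # review compatibility notes before upgrading


print(upgrade_kind("4.4.4", "4.4.5"))  # patch
print(upgrade_kind("4.4.4", "4.6.0"))  # new release series
```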

You should also consider how you intend to interact with your MongoDB database and whether that will change over time. By default, MongoDB provides several commands and methods that allow you to perform server-side execution of JavaScript functions. As an example, you can use the $where operator to evaluate a JavaScript expression in order to query for documents. This provides you with greater flexibility, as it allows you to express queries for which there isn’t an equivalent standard operator. However, by allowing server-side JavaScript execution, you’re also exposing your database to potentially malicious expressions. Hence, MongoDB recommends that you disable server-side scripting if you don’t plan on using it.
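
If your application builds queries from outside input, one complementary precaution is to reject query documents that would run JavaScript on the server. The Python sketch below is a hypothetical helper: `uses_server_side_js` and the `total` and `status` fields are made up for this example. It scans a query for `$where` as well as `$function` and `$accumulator`, two other operators that execute JavaScript.

```python
SERVER_SIDE_JS_OPERATORS = {"$where", "$function", "$accumulator"}


def uses_server_side_js(query):
    """Return True if a query document contains an operator that runs JavaScript."""
    if isinstance(query, dict):
        return any(
            key in SERVER_SIDE_JS_OPERATORS or uses_server_side_js(value)
            for key, value in query.items()
        )
    if isinstance(query, list):
        return any(uses_server_side_js(item) for item in query)
    return False


with_js = {"$where": "this.total > 20"}
without_js = {"total": {"$gt": 20}}  # the same filter using a standard operator
nested = {"$and": [{"status": "open"}, {"$where": "this.total > 20"}]}

print(uses_server_side_js(with_js))
print(uses_server_side_js(without_js))
print(uses_server_side_js(nested))
```

A check like this doesn't replace disabling server-side scripting on the server itself. It only catches such operators earlier, before a query is sent.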

Similarly, MongoDB will, by default, validate all user input to ensure that clients can’t insert malformed BSON into the database. This input validation isn’t necessary for every use case, but MongoDB recommends keeping input validation enabled to prevent your database from storing any invalid BSON documents.

Maintaining a MongoDB database and keeping it secure is no small task, but by following the recommendations highlighted throughout this series you can reduce the number of your database’s vulnerabilities. With that said, the subject of securing a MongoDB database goes far beyond what could be discussed in a series like this one. Attackers are becoming more sophisticated every day, meaning that a database system could still become compromised even if it had been secured with all of the recommendations and features highlighted here.

As MongoDB has grown more popular, a number of cloud companies have launched their own managed MongoDB database service. A managed database, sometimes referred to as database-as-a-service or DBaaS, is a cloud computing service in which the end user pays a cloud service provider for access to a database.

Unlike a self-managed database, users don’t have to set up or maintain a managed database on their own; rather, it’s the provider’s responsibility to oversee the database’s infrastructure. Likewise, the cloud provider takes on many of the responsibilities related to keeping the database secure. Oftentimes the provider will deploy the database behind a firewall they control, and may require that any remote connections be made over TLS.

A common feature among managed database services is that they provide automatic backups as a form of disaster recovery. Many also ensure high availability and failover through automatic replication. However, as with any cloud service, by using a managed database you’re giving up much of the control that comes with the “roll-your-own” approach of overseeing all aspects of the database yourself.

DigitalOcean now offers its own managed MongoDB service that comes with a number of helpful security features. For example, DigitalOcean Managed MongoDB Databases require connections to be made over TLS/SSL, ensuring that your data remains encrypted as it traverses the network. The data held in a Managed MongoDB Database is also encrypted at rest through the Linux Unified Key Setup, so you can rest assured that your data will be protected from unauthorized users.

You can deploy a DigitalOcean Managed MongoDB Database with standby nodes. In the event of a failure, the service will switch data handling over to a standby node, helping to keep your data highly available. And after spinning up a MongoDB database managed by DigitalOcean, you can secure it by restricting inbound connections to specific Droplets, Kubernetes clusters, or tags. You can even spin up a Mongo database within a Virtual Private Cloud, ensuring that your data is only accessible to resources within a trusted network.

Click here to learn more about DigitalOcean’s Managed MongoDB Databases.

If you work for a large company that uses MongoDB, it might be helpful to hire one or more full-time database administrators or an outside consultant database administrator to help you consider which of MongoDB’s security features make the most sense for you to implement. You might even consider MongoDB’s Enterprise Edition, which includes advanced security features like Kerberos authentication and built-in auditing. However, the Enterprise Edition requires a paid subscription and still requires careful administration and oversight.

Conclusion

By reading this series of conceptual articles, you’ve gained a better understanding of MongoDB’s security features as well as general database security best practices. However, be aware that everything that goes into protecting a MongoDB database is far beyond the scope of this series.

For more information about MongoDB, we encourage you to check out DigitalOcean’s entire library of MongoDB content. Additionally, for more information on working with MongoDB and keeping it secure, you may be interested in checking out the official MongoDB documentation.