How To Use Schema Validation in MongoDB

Published on January 24, 2022

The author selected the Open Internet/Free Speech Fund to receive a donation as part of the Write for DOnations program.

Introduction

One important aspect of relational databases, which store data in tables made up of rows and columns, is that they operate on fixed, rigid schemas with fields of known data types. Document-oriented databases like MongoDB are more flexible in this regard, as they allow you to reshape your documents’ structure as needed.

However, there are likely to be situations in which you might need your data documents to follow a particular structure or fulfill certain requirements. Many document databases allow you to define rules that dictate how parts of your documents’ data should be structured while still offering some freedom to change this structure if needed.

MongoDB has a feature called schema validation that allows you to apply constraints on your documents’ structure. Schema validation is built around JSON Schema, an open standard for JSON document structure description and validation. In this tutorial, you’ll write and apply validation rules to control the structure of documents in an example MongoDB collection.

Prerequisites

To follow this tutorial, you will need:

A server with a regular, non-root user with sudo privileges and a firewall configured with UFW. This tutorial was validated using a server running Ubuntu 20.04, and you can prepare your server by following this initial server setup tutorial for Ubuntu 20.04.
MongoDB installed on your server. To set this up, follow our tutorial on How to Install MongoDB on Ubuntu 20.04.
Your server’s MongoDB instance secured by enabling authentication and creating an administrative user. To secure MongoDB like this, follow our tutorial on How To Secure MongoDB on Ubuntu 20.04.
Familiarity with querying MongoDB collections and filtering results. To learn how to use MongoDB queries, follow our guide on How To Create Queries in MongoDB.

Note: The linked tutorials on how to configure your server, install MongoDB, and secure the MongoDB installation refer to Ubuntu 20.04. This tutorial concentrates on MongoDB itself, not the underlying operating system. It will generally work with any MongoDB installation regardless of the operating system as long as authentication has been enabled.

Step 1 — Inserting Documents Without Applying Schema Validation

In order to highlight MongoDB’s schema validation features and why they can be useful, this step outlines how to open the MongoDB shell to connect to your locally-installed MongoDB instance and create a sample collection within it. Then, by inserting a number of example documents into this collection, this step will show how MongoDB doesn’t enforce any schema validation by default. In later steps, you’ll begin creating and enforcing such rules yourself.

To create the sample collection used in this guide, connect to the MongoDB shell as your administrative user. This tutorial follows the conventions of the prerequisite MongoDB security tutorial and assumes the name of this administrative user is AdminSammy and its authentication database is admin. Be sure to change these details in the following command to reflect your own setup, if different:

mongo -u AdminSammy -p --authenticationDatabase admin

Enter the password set during installation to gain access to the shell. After providing the password, you’ll see the > prompt sign.

To illustrate the schema validation features, this guide’s examples use a sample database containing documents that represent the highest mountains in the world. The sample document for Mount Everest will take this form:

The Everest document

{
    "name": "Everest",
    "height": 8848,
    "location": ["Nepal", "China"],
    "ascents": {
        "first": {
            "year": 1953
        },
        "first_winter": {
            "year": 1980
        },
        "total": 5656
    }
}

This document contains the following information:

name: the peak’s name.
height: the peak’s elevation, in meters.
location: the countries in which the mountain is located. This field stores values as an array to allow for mountains located in more than one country.
ascents: this field’s value is another document. When one document is stored within another document like this, it’s known as an embedded or nested document. Each ascents document describes successful ascents of the given mountain. Specifically, each ascents document contains a total field that lists the total number of successful ascents of the given peak. Additionally, each of these nested documents contains two fields whose values are also nested documents:
- first: this field’s value is a nested document that contains one field, year, which describes the year of the first overall successful ascent.
- first_winter: this field’s value is a nested document that also contains a year field, the value of which represents the year of the first successful winter ascent of the given mountain.

Run the following insertOne() method to simultaneously create a collection named peaks in your MongoDB installation and insert the previous example document representing Mount Everest into it:

db.peaks.insertOne(
    {
        "name": "Everest",
        "height": 8848,
        "location": ["Nepal", "China"],
        "ascents": {
            "first": {
                "year": 1953
            },
            "first_winter": {
                "year": 1980
            },
            "total": 5656
        }
    }
)

The output will contain a success message and an object identifier assigned to the newly inserted object:

Output
{
        "acknowledged" : true,
        "insertedId" : ObjectId("618ffa70bfa69c93a8980443")
}

Although you inserted this document by running the provided insertOne() method, you had complete freedom in designing this document’s structure. In some cases, you might want to have some degree of flexibility in how documents within the database are structured. However, you might also want to make sure some aspects of the documents’ structure remain consistent to allow for easier data analysis or processing.

To illustrate why this can be important, consider a few other example documents that might be entered into this database.

The following document is almost identical to the previous one representing Mount Everest, but it doesn’t contain a name field:

The Mountain with no name at all

{
    "height": 8611,
    "location": ["Pakistan", "China"],
    "ascents": {
        "first": {
            "year": 1954
        },
        "first_winter": {
            "year": 2021
        },
        "total": 306
    }
}

For a database containing a list of the highest mountains in the world, adding a document representing a mountain but not including its name would likely be a serious error.

In this next example document, the mountain’s name is present but its height is represented as a string instead of a number. Additionally, the location is not an array but a single value, and the ascents field is missing entirely:

Mountain with a string value for its height

{
    "name": "Manaslu",
    "height": "8163m",
    "location": "Nepal"
}

Interpreting a document with as many omissions as this example could prove difficult. For instance, you would not be able to successfully sort the collection by peak heights if the height attribute values are stored as different data types between documents.
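To make the problem concrete, here is a minimal sketch in plain JavaScript, run outside the database (for example with Node.js). It is only an illustration of how mixed data types get in the way of client-side processing, not a demonstration of MongoDB’s own sorting rules. The docs array is a trimmed-down, hypothetical copy of the three example documents:

```javascript
// Heights as they appear in the three example documents above.
const docs = [
  { name: "Everest", height: 8848 },
  { height: 8611 },
  { name: "Manaslu", height: "8163m" },
];

// Naively summing the heights: once a string is encountered,
// "+" switches from numeric addition to string concatenation.
const total = docs.reduce((sum, doc) => sum + doc.height, 0);

// Only documents whose height is a real number can be processed safely.
const numericHeights = docs
  .map((doc) => doc.height)
  .filter((h) => typeof h === "number");

console.log(typeof total);          // "string": the sum is no longer a number
console.log(numericHeights.length); // 2: only two of three heights are usable
```

A single inconsistent value is enough to silently corrupt a calculation like this, which is why enforcing the data type at insertion time can be worthwhile.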

Now run the following insertMany() method to test whether these documents can be inserted into the database without causing any errors:

db.peaks.insertMany([
    {
        "height": 8611,
        "location": ["Pakistan", "China"],
        "ascents": {
            "first": {
                "year": 1954
            },
            "first_winter": {
                    "year": 2021
            },
            "total": 306
        }
    },
    {
        "name": "Manaslu",
        "height": "8163m",
        "location": "Nepal"
    }
])

As it turns out, MongoDB will not return any errors and both documents will be inserted successfully:

Output
{
        "acknowledged" : true,
        "insertedIds" : [
                ObjectId("618ffd0bbfa69c93a8980444"),
                ObjectId("618ffd0bbfa69c93a8980445")
        ]
}

As this output indicates, both of these documents are valid JSON, which is enough to insert them into the collection. However, this isn’t enough to keep the database logically consistent and meaningful. In the next steps, you’ll build schema validation rules to make sure the data documents in the peaks collection follow a few essential requirements.

Step 2 — Validating String Fields

In MongoDB, schema validation works on individual collections by assigning a JSON Schema document to the collection. JSON Schema is an open standard that allows you to define and validate the structure of JSON documents. You do this by creating a schema definition that lists a set of requirements that documents in the given collection must follow to be considered valid.

Any given collection can only use a single JSON Schema, but you can assign a schema when you create the collection or any time afterwards. If you decide to change your original validation rules later on, you will have to replace the original JSON Schema document with one that aligns with your new requirements.

To assign a JSON Schema validator document to the peaks collection you created in the previous step, you could run the following command:

db.runCommand({
    "collMod": "collection_name",
    "validator": {
        $jsonSchema: {JSON_Schema_document}
    }
})

The runCommand method executes the collMod command, which modifies the specified collection by applying the validator attribute to it. The validator attribute is responsible for schema validation and, in this example syntax, it accepts the $jsonSchema operator. This operator defines a JSON Schema document which will be used as the schema validator for the given collection.
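The shape of this command document can be sketched in plain JavaScript. Note that buildValidatorCommand is a hypothetical helper name introduced here for illustration, and the small schema passed to it is only a placeholder:

```javascript
// Hypothetical helper: wraps a JSON Schema in the command document that
// collMod expects. The command name (collMod) is the first key.
function buildValidatorCommand(collectionName, schema) {
  return {
    collMod: collectionName,
    validator: { $jsonSchema: schema },
  };
}

// Placeholder schema, used only to show where the schema ends up.
const command = buildValidatorCommand("peaks", {
  bsonType: "object",
  required: ["name"],
});

// In the MongoDB shell you would then pass it on: db.runCommand(command)
console.log(JSON.stringify(command, null, 2));
```

Building the command as a regular object like this can make it easier to keep the schema in one place and reuse it when you later replace the validator.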

Warning: In order to execute the collMod command, your MongoDB user must be granted the appropriate privileges. Assuming you followed the prerequisite tutorial on How To Secure MongoDB on Ubuntu 20.04 and are connected to your MongoDB instance as the administrative user you created in that guide, you will need to grant it an additional role to follow along with the examples in this guide.

First, switch to your user’s authentication database. This is admin in the following example, but connect to your own user’s authentication database if different:

use admin

Output
switched to db admin

Then run a grantRolesToUser() method and grant your user the dbAdmin role over the database where you created the peaks collection. The following example assumes the peaks collection is in the test database:

db.grantRolesToUser(
  "AdminSammy",
  [ { role : "dbAdmin", db : "test" } ]
  )

Alternatively, you can grant your user the dbAdminAnyDatabase role. As this role’s name implies, it will grant your user dbAdmin privileges over every database on your MongoDB instance:

db.grantRolesToUser(
  "AdminSammy",
  [ "dbAdminAnyDatabase" ]
  )

After granting your user the appropriate role, navigate back to the database where your peaks collection is stored:

use test

Output
switched to db test

Be aware that you can also assign a JSON Schema validator when you create a collection. To do so, you could use the following syntax:

db.createCollection(
    "collection_name", {
    "validator": {
        $jsonSchema: {JSON_Schema_document}
    }
})

Unlike the previous example, this syntax doesn’t include the collMod command, since the collection doesn’t yet exist and thus can’t be modified. As with the previous example, though, collection_name is the name of the collection to which you want to assign the validator document and the validator option assigns a specified JSON Schema document as the collection’s validator.

Applying a JSON Schema validator from the start like this means every document you add to the collection must satisfy the requirements set by the validator. When you add validation rules to an existing collection, though, the new rules won’t affect existing documents until you try to modify them.
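Because existing documents are not checked retroactively, you may want to look for documents that already violate a schema. One possible approach, assuming your MongoDB version supports $jsonSchema as a query operator, is to combine it with $nor. The helper name nonMatchingFilter and the small schema below are hypothetical and used only for illustration:

```javascript
// Placeholder schema: requires a string "name" field.
const schema = {
  bsonType: "object",
  required: ["name"],
  properties: {
    name: { bsonType: "string" },
  },
};

// Hypothetical helper: builds a query filter that matches every document
// that does NOT satisfy the given schema.
function nonMatchingFilter(jsonSchema) {
  return { $nor: [{ $jsonSchema: jsonSchema }] };
}

const filter = nonMatchingFilter(schema);

// In the MongoDB shell you could then run: db.peaks.find(filter)
console.log(JSON.stringify(filter, null, 2));
```

This is a sketch for inspecting existing data; it does not change how the validator itself behaves.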

The JSON schema document you pass to the validator attribute should outline every validation rule you want to apply to the collection. The following example JSON Schema will make sure that the name field is present in every document in the collection, and that the name field’s value is always a string:

Your first JSON Schema document validating the name field

{
    "bsonType": "object",
    "description": "Document describing a mountain peak",
    "required": ["name"],
    "properties": {
        "name": {
            "bsonType": "string",
            "description": "Name must be a string and is required"
        }
    }
}

This schema document outlines requirements that specific parts of the documents entered into the collection must follow. The root part of the JSON Schema document (the fields before properties, which in this case are bsonType, description, and required) describes the database document itself.

The bsonType property describes the data type that the validation engine will expect to find. For the database document itself, the expected type is object. This means that you can only add objects — in other words, complete, valid JSON documents surrounded by curly braces ({ and }) — to this collection. If you were to try to insert some other kind of data type (like a standalone string, integer, or an array), it would cause an error.

In MongoDB, every document is an object. However, JSON Schema is a standard used to describe and validate all kinds of valid JSON documents, and a plain array or a string is valid JSON, too. When working with MongoDB schema validation, you’ll find that you must always set the root document’s bsonType value as object in the JSON Schema validator.

Next, the description property provides a short description of the documents found in this collection. This field isn’t required, but in addition to being used to validate documents, JSON Schemas can also be used to annotate the document’s structure. This can help other users understand what the purpose of the documents is, so including a description field can be a good practice.

The next property in the validation document is the required field. The required field can only accept an array containing a list of document fields that must be present in every document in the collection. In this example, ["name"] means that the documents only have to contain the name field to be considered valid.

Following that is a properties object that describes the rules used to validate document fields. For each field that you want to define rules for, include an embedded JSON Schema document named after the field. Be aware that you can define schema rules for fields that aren’t listed in the required array. This can be useful in cases where your data has fields that aren’t required, but you’d still like for them to follow certain rules when they are present.

These embedded schema documents follow a syntax similar to that of the main document. In this example, the bsonType property will require every document’s name field to be a string. This embedded document also contains a brief description field.
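To get a feel for how these two rules combine, the following plain JavaScript sketch mimics them outside the database. It is a simplified stand-in written for illustration, not MongoDB’s actual validation engine, and passesNameRules is a hypothetical function name:

```javascript
// The same rules as in the schema above, without the descriptions.
const schema = {
  bsonType: "object",
  required: ["name"],
  properties: {
    name: { bsonType: "string" },
  },
};

// Simplified check: handles only "required" and a "string" bsonType
// on top-level fields. Fields without rules are accepted as-is.
function passesNameRules(doc, schema) {
  for (const field of schema.required) {
    if (!(field in doc)) return false; // a required field is missing
  }
  for (const [field, rule] of Object.entries(schema.properties)) {
    const present = field in doc;
    if (present && rule.bsonType === "string" && typeof doc[field] !== "string") {
      return false; // the field is present but has the wrong type
    }
  }
  return true;
}

console.log(passesNameRules({ height: 8611 }, schema)); // false: no name
console.log(passesNameRules({ name: 123 }, schema));    // false: name is not a string
console.log(passesNameRules({ name: "K2" }, schema));   // true
```

Notice that fields the schema does not mention, such as height in the first call, play no part in the outcome.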

To apply this JSON Schema to the peaks collection you created in the previous step, run the following runCommand() method:

db.runCommand({
    "collMod": "peaks",
    "validator": {
        $jsonSchema: {
            "bsonType": "object",
            "description": "Document describing a mountain peak",
            "required": ["name"],
            "properties": {
                "name": {
                    "bsonType": "string",
                    "description": "Name must be a string and is required"
                }
            },
        }
    }
})

MongoDB will respond with a success message indicating that the collection was successfully modified:

Output
{ "ok" : 1 }

Following that, MongoDB will no longer allow you to insert documents into the peaks collection if they don’t have a name field. To test this, try inserting the document you inserted in the previous step that fully describes a mountain, aside from missing the name field:

db.peaks.insertOne(
    {
        "height": 8611,
        "location": ["Pakistan", "China"],
        "ascents": {
            "first": {
                "year": 1954
            },
            "first_winter": {
                "year": 2021
            },
            "total": 306
        }
    }
)

This time, the operation will trigger an error message indicating a failed document validation:

Output
WriteError({
        "index" : 0,
        "code" : 121,
        "errmsg" : "Document failed validation",
        . . .
})

MongoDB won’t insert any documents that fail to pass the validation rules specified in the JSON Schema.

Note: Starting with MongoDB 5.0, when validation fails the error messages point towards the failed constraint. In MongoDB 4.4 and earlier, the database provides no further details on the failure reason.
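If you are on MongoDB 5.0 or later and want to read those details programmatically, a sketch like the following may help. The layout of the errInfo object used here (a details document holding a schemaRulesNotSatisfied array) is an assumption based on how 5.0-style validation errors are typically reported, so verify it against the errors your own server returns. The failedRules helper and the sampleError object are hypothetical and hand-written for illustration:

```javascript
// Hypothetical helper: lists the names of the schema rules that failed.
// Returns an empty array when no details are available (for example on
// MongoDB 4.4 and earlier) or when the error is not a validation error.
function failedRules(error) {
  if (error.code !== 121) return [];
  const details = error.errInfo && error.errInfo.details;
  const rules = (details && details.schemaRulesNotSatisfied) || [];
  return rules.map((rule) => rule.operatorName);
}

// Hand-written sample shaped like a 5.0-style error (assumed layout).
const sampleError = {
  code: 121,
  errmsg: "Document failed validation",
  errInfo: {
    details: {
      operatorName: "$jsonSchema",
      schemaRulesNotSatisfied: [
        { operatorName: "required", missingProperties: ["name"] },
      ],
    },
  },
};

console.log(failedRules(sampleError)); // [ 'required' ]
```

In practice you would pass the error caught from a failed insert to such a helper instead of a hand-written object.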

You can also test whether MongoDB will enforce the data type requirement you included in the JSON Schema by running the following insertOne() method. This is similar to the last operation, but this time it includes a name field. However, this field’s value is a number instead of a string:

db.peaks.insertOne(
    {
        "name": 123,
        "height": 8611,
        "location": ["Pakistan", "China"],
        "ascents": {
            "first": {
                "year": 1954
            },
            "first_winter": {
                "year": 2021
            },
            "total": 306
        }
    }
)

Once again, the validation will fail. Even though the name field is present, it doesn’t meet the constraint that requires it to be a string:

Output
WriteError({
        "index" : 0,
        "code" : 121,
        "errmsg" : "Document failed validation",
        . . .
})

Try once more, but with the name field present in the document and followed by a string value. This time, name is the only field in the document:

db.peaks.insertOne(
    {
        "name": "K2"
    }
)

The operation will succeed, and the document will receive an object identifier as usual:

Output
{
        "acknowledged" : true,
        "insertedId" : ObjectId("61900965bfa69c93a8980447")
}

The schema validation rules pertain only to the name field. At this point, as long as the name field fulfills the validation requirements, the document will be inserted without error. The rest of the document can take any shape.

With that, you’ve created your first JSON Schema document and applied the first schema validation rule to the name field, requiring it to be present and a string. However, there are different validation options for different data types. Next, you’ll validate number values stored in each document’s height field.

Step 3 — Validating Number Fields

Recall from Step 1 when you inserted the following document into the peaks collection:

Mountain with a string value for its height

{
    "name": "Manaslu",
    "height": "8163m",
    "location": "Nepal"
}

Even though this document’s height value is a string instead of a number, the insertMany() method you used to insert this document was successful. This was possible because you haven’t yet added any validation rules for the height field.

MongoDB will accept any value for this field, even values that don’t make sense for it, such as negative numbers, as long as the inserted document is written in valid JSON syntax. To prevent this, you can extend the schema validation document from the previous step to include additional rules regarding the height field.

Start by ensuring that the height field is always present in newly-inserted documents and that it’s always expressed as a number. Modify the schema validation with the following command:

db.runCommand({
    "collMod": "peaks",
    "validator": {
        $jsonSchema: {
            "bsonType": "object",
            "description": "Document describing a mountain peak",
            "required": ["name", "height"],
            "properties": {
                "name": {
                    "bsonType": "string",
                    "description": "Name must be a string and is required"
                },
                "height": {
                    "bsonType": "number",
                    "description": "Height must be a number and is required"
                }
            },
        }
    }
})

In this command’s schema document, the height field is included in the required array. Likewise, there’s a height document within the properties object that will require any new height values to be a number. Again, the description field is auxiliary, and any description you include should only be to help other users understand the intention behind the JSON Schema.

MongoDB will respond with a short success message to let you know that the collection was successfully modified:

Output
{ "ok" : 1 }

Now you can test the new rule. Try inserting a document with the minimal document structure required to pass the validation document. The following method will insert a document containing the only two mandatory fields, name and height:

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": 8300
    }
)

The insertion will succeed:

Output
{
  acknowledged: true,
  insertedId: ObjectId("61e0c8c376b24e08f998e371")
}

Next, try inserting a document with a missing height field:

db.peaks.insertOne(
    {
        "name": "Test peak"
    }
)

Then try another that includes the height field, but this field contains a string value:

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": "8300m"
    }
)

Both times, the operations will trigger an error message and fail:

Output
WriteError({
        "index" : 0,
        "code" : 121,
        "errmsg" : "Document failed validation",
        . . .
})

However, if you try inserting a mountain peak with a negative height, the document will still be saved without error:

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": -100
    }
)

To prevent this, you could add a few more properties to the schema validation document. Replace the current schema validation settings by running the following operation:

db.runCommand({
    "collMod": "peaks",
    "validator": {
        $jsonSchema: {
            "bsonType": "object",
            "description": "Document describing a mountain peak",
            "required": ["name", "height"],
            "properties": {
                "name": {
                    "bsonType": "string",
                    "description": "Name must be a string and is required"
                },
                "height": {
                    "bsonType": "number",
                    "description": "Height must be a number between 100 and 10000 and is required",
                    "minimum": 100,
                    "maximum": 10000
                }
            },
        }
    }
})

The new minimum and maximum attributes set constraints on values included in height fields, ensuring they can’t be lower than 100 or higher than 10000. This range makes sense in this case, as this collection is used to store information about mountain peak heights, but you could choose any values you like for these attributes.
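Both bounds are inclusive by default, so a height of exactly 100 or exactly 10000 is accepted. The following plain JavaScript sketch mirrors that behavior outside the database. It is only an approximation for illustration, and withinRange is a hypothetical function name:

```javascript
// The numeric constraints from the height rule above.
const heightRule = { minimum: 100, maximum: 10000 };

// Simplified range check: the value must be a number, and both
// bounds are treated as inclusive.
function withinRange(value, rule) {
  if (typeof value !== "number") return false;
  if (rule.minimum !== undefined && value < rule.minimum) return false;
  if (rule.maximum !== undefined && value > rule.maximum) return false;
  return true;
}

console.log(withinRange(8300, heightRule));  // true
console.log(withinRange(-100, heightRule));  // false: below the minimum
console.log(withinRange(100, heightRule));   // true: the minimum itself is allowed
console.log(withinRange(10001, heightRule)); // false: above the maximum
```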

Now, if you try inserting a peak with a negative height value again, the operation will fail:

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": -100
    }
)

Output
WriteError({
        "index" : 0,
        "code" : 121,
        "errmsg" : "Document failed validation",
        . . .
})

As this output shows, your document schema now validates string values held in each document’s name field as well as numeric values held in the height fields. Continue reading to learn how to validate array values stored in each document’s location field.

Step 4 — Validating Array Fields

Now that each peak’s name and height values are being verified by schema validation constraints, we can turn our attention to the location field to guarantee its data consistency.

Specifying the location of a mountain is trickier than one might expect, since a peak can span more than one country, and this is the case for many of the famous eight-thousanders. Because of this, it would make sense to store each peak’s location data as an array containing one or more country names instead of a single string value. As with the height values, making sure each location field’s data type is consistent across every document can help with summarizing data when using aggregation pipelines.

First, consider some examples of location values that users might enter, and weigh which ones would be valid or invalid:

["Nepal", "China"]: this is a two-element array, and would be a valid value for a mountain spanning two countries.
["Nepal"]: this example is a single-element array. It would also be a valid value for a mountain located in a single country.
"Nepal": this example is a plain string. It would be invalid because although it lists a single country, the location field should always contain an array.
[]: this example is an empty array. It would be invalid because every mountain must exist in at least one country.
["Nepal", "Nepal"]: this two-element array would also be invalid, as it contains the same value appearing twice.
["Nepal", 15]: lastly, this two-element array would be invalid, as one of its values is a number instead of a string, and a number is not a valid location name.

To ensure that MongoDB will correctly interpret each of these examples as valid or invalid, run the following operation to create some new validation rules for the peaks collection:

db.runCommand({
    "collMod": "peaks",
    "validator": {
        $jsonSchema: {
            "bsonType": "object",
            "description": "Document describing a mountain peak",
            "required": ["name", "height", "location"],
            "properties": {
                "name": {
                    "bsonType": "string",
                    "description": "Name must be a string and is required"
                },
                "height": {
                    "bsonType": "number",
                    "description": "Height must be a number between 100 and 10000 and is required",
                    "minimum": 100,
                    "maximum": 10000
                },
                "location": {
                    "bsonType": "array",
                    "description": "Location must be an array of strings",
                    "minItems": 1,
                    "uniqueItems": true,
                    "items": {
                        "bsonType": "string"
                    }
                }
            },
        }
    }
})

In this $jsonSchema object, the location field is included within the required array as well as the properties object. There, it’s defined with a bsonType of array to ensure that the location value is always an array rather than a single string or a number.

The minItems property validates that the array must contain at least one element, and the uniqueItems property is set to true to ensure that elements within each location array will be unique. This will prevent values like ["Nepal", "Nepal"] from being accepted. Lastly, the items subdocument defines the validation schema for each individual array item. Here, the only expectation is that every item within a location array must be a string.
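The four array rules can be mirrored in a few lines of plain JavaScript to check the examples listed earlier. This is a simplified sketch for illustration rather than MongoDB’s implementation, and validLocation is a hypothetical function name. The uniqueness check via Set only works for primitive values such as strings:

```javascript
// Simplified stand-in for the location rules in the schema above.
function validLocation(value) {
  if (!Array.isArray(value)) return false;                 // bsonType: "array"
  if (value.length < 1) return false;                      // minItems: 1
  if (new Set(value).size !== value.length) return false;  // uniqueItems: true
  return value.every((item) => typeof item === "string");  // items.bsonType: "string"
}

console.log(validLocation(["Nepal", "China"])); // true
console.log(validLocation(["Nepal"]));          // true
console.log(validLocation("Nepal"));            // false: not an array
console.log(validLocation([]));                 // false: empty array
console.log(validLocation(["Nepal", "Nepal"])); // false: duplicate value
console.log(validLocation(["Nepal", 15]));      // false: non-string item
```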

Note: The available schema document properties are different for each bsonType and, depending on the field type, you will be able to validate different aspects of the field value. For example, with number values you could define minimum and maximum allowable values to create a range of acceptable values. In the previous example, by setting the location field’s bsonType to array, you can validate features particular to arrays.

You can find details on all possible validation choices in the JSON Schema documentation.

After executing the command, MongoDB will respond with a short success message that the collection was successfully modified with the new schema document:

Output
{ "ok" : 1 }

Now try inserting documents matching the examples prepared earlier to test how the new rule behaves. Once again, let’s use the minimal document structure, with only the name, height, and location fields present.

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": 8300,
        "location": ["Nepal", "China"]
    }
)

The document will be inserted successfully as it fulfills all the defined validation expectations. Similarly, the following document will insert without error:

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": 8300,
        "location": ["Nepal"]
    }
)

However, if you were to run any of the following insertOne() methods, they would trigger a validation error and fail:

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": 8300,
        "location": "Nepal"
    }
)

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": 8300,
        "location": []
    }
)

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": 8300,
        "location": ["Nepal", "Nepal"]
    }
)

db.peaks.insertOne(
    {
        "name": "Test peak",
        "height": 8300,
        "location": ["Nepal", 15]
    }
)

As per the validation rules you defined previously, the location values provided in these operations are considered invalid.

After following this step, three primary fields describing a mountain top are already being validated through MongoDB’s schema validation feature. In the next step, you’ll learn how to validate nested documents using the ascents field as an example.

Step 5 — Validating Embedded Documents

At this point, your peaks collection has three fields — name, height and location — that are being kept in check by schema validation. This step focuses on defining validation rules for the ascents field, which describes successful attempts at summiting each peak.

In the example document from Step 1 that represents Mount Everest, the ascents field was structured as follows:

The Everest document

{
    "name": "Everest",
    "height": 8848,
    "location": ["Nepal", "China"],
    "ascents": {
        "first": {
            "year": 1953
        },
        "first_winter": {
            "year": 1980
        },
        "total": 5656
    }
}

The ascents subdocument contains a total field whose value represents the total number of successful ascents of the given mountain. It also contains information on the first winter ascent of the mountain as well as the first ascent overall. These, however, might not be essential to the mountain description. After all, some mountains might not have been ascended in winter yet, or the ascent dates might be disputed or unknown. For now, assume that the only information you will always want to have in each document is the total number of ascents.

You can change the schema validation document so that the ascents field must always be present and its value must always be a subdocument. This subdocument, in turn, must always contain a total attribute holding a number greater than or equal to zero. The first and first_winter fields aren’t required for the purposes of this guide, so the validation schema won’t constrain them and they can take flexible forms.

Once again, replace the schema validation document for the peaks collection by running the following runCommand() method:

db.runCommand({
    "collMod": "peaks",
    "validator": {
        $jsonSchema: {
            "bsonType": "object",
            "description": "Document describing a mountain peak",
            "required": ["name", "height", "location", "ascents"],
            "properties": {
                "name": {
                    "bsonType": "string",
                    "description": "Name must be a string and is required"
                },
                "height": {
                    "bsonType": "number",
                    "description": "Height must be a number between 100 and 10000 and is required",
                    "minimum": 100,
                    "maximum": 10000
                },
                "location": {
                    "bsonType": "array",
                    "description": "Location must be an array of strings",
                    "minItems": 1,
                    "uniqueItems": true,
                    "items": {
                        "bsonType": "string"
                    }
                },
                "ascents": {
                    "bsonType": "object",
                    "description": "Ascent attempts information",
                    "required": ["total"],
                    "properties": {
                        "total": {
                            "bsonType": "number",
                            "description": "Total number of ascents must be 0 or higher",
                            "minimum": 0
                        }
                    }
                }
            },
        }
    }
})

Whenever a document contains subdocuments under any of its fields, the JSON Schema for those fields follows the same syntax as the main document schema. Just as documents can be nested within one another, validation schemas can be nested as well. This makes it straightforward to define complex validation schemas for document structures containing multiple subdocuments in a hierarchical structure.

In this JSON Schema document, the ascents field is included within the required array, making it mandatory. It also appears in the properties object where it’s defined with a bsonType of object, just like the root document itself.

Notice that the validation definition for ascents follows the same principle as the root document. It has a required array, denoting the fields the subdocument must contain. It also defines a properties object, following the same structure. Since the ascents field is a subdocument, its values will be validated just like those of a larger document would be.

Within ascents, there’s a required array whose only value is total, meaning that every ascents subdocument will be required to contain a total field. Following that, the total value is described thoroughly within the properties object, which specifies that this must always be a number with a minimum value of zero.

Again, because neither the first nor the first_winter fields are mandatory for the purposes of this guide, they aren’t included in these validation rules.
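If you later decided to constrain the optional fields as well, the same nesting pattern would extend one level deeper. The following is a minimal sketch and is not part of the schema applied in this tutorial: the ascentsSchema name and the rule for first.year are assumptions made purely for illustration. Because first is not listed in the required array, documents without it would still pass.

```javascript
// Hypothetical extension: a definition for the "ascents" property that also
// describes the optional "first" subdocument. Not applied in this tutorial.
const ascentsSchema = {
    "bsonType": "object",
    "description": "Ascent attempts information",
    "required": ["total"],
    "properties": {
        "total": {
            "bsonType": "number",
            "description": "Total number of ascents must be 0 or higher",
            "minimum": 0
        },
        // "first" is not listed in "required", so it stays optional.
        // When present, it must be a subdocument whose "year" is a number.
        "first": {
            "bsonType": "object",
            "description": "First ascent details (illustrative rule)",
            "properties": {
                "year": {
                    "bsonType": "number",
                    "description": "Year must be a number when provided"
                }
            }
        }
    }
};
```

To experiment with it, you could place this object under the ascents key of the properties object in the collMod command above. The rest of this step continues with the schema as originally defined.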

With this schema validation document applied, try inserting the sample Mount Everest document from the first step to verify it allows you to insert documents you’ve already established as valid:

db.peaks.insertOne(
    {
        "name": "Everest",
        "height": 8848,
        "location": ["Nepal", "China"],
        "ascents": {
            "first": {
                "year": 1953,
            },
            "first_winter": {
                "year": 1980,
            },
            "total": 5656,
        }
    }
)

The document saves successfully, and MongoDB returns the new object identifier:

Output
{
        "acknowledged" : true,
        "insertedId" : ObjectId("619100f51292cb2faee531f8")
}

To make sure the last pieces of validation work properly, try inserting a document that doesn’t include the ascents field:

db.peaks.insertOne(
    {
        "name": "Everest",
        "height": 8848,
        "location": ["Nepal", "China"]
    }
)

This time, the operation will trigger an error message pointing out a failed document validation:

Output
WriteError({
        "index" : 0,
        "code" : 121,
        "errmsg" : "Document failed validation",
        . . .
})
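Error code 121 is what identifies a failed document validation. If you handle such errors in a script, a small helper can pull out the useful parts. The sketch below is only an illustration: summarizeValidationError is a hypothetical helper, sampleError is a hand-written stand-in rather than real server output, and the errInfo.details layout it reads is an assumption based on the more detailed validation errors reported by MongoDB 5.0 and later. Older versions report only the code and message.

```javascript
// Summarize a failed-validation error. The errInfo.details layout is an
// assumption based on the detailed errors of MongoDB 5.0+; when it is
// absent, the list of failed rules is simply empty.
function summarizeValidationError(err) {
    if (!err || err.code !== 121) {
        return null; // not a document validation failure
    }
    const details = err.errInfo && err.errInfo.details;
    const rules = (details && details.schemaRulesNotSatisfied) || [];
    return {
        message: err.errmsg || err.message,
        failedRules: rules.map((rule) => rule.operatorName)
    };
}

// Hand-written stand-in for the error a shell or driver would raise.
const sampleError = {
    code: 121,
    errmsg: "Document failed validation",
    errInfo: {
        details: {
            operatorName: "$jsonSchema",
            schemaRulesNotSatisfied: [
                { operatorName: "required", missingProperties: ["ascents"] }
            ]
        }
    }
};

console.log(summarizeValidationError(sampleError));
```

In a real script, you would pass the error caught from a failed insertOne() call instead of the sample object.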

Now try inserting a document whose ascents subdocument is missing the total field:

db.peaks.insertOne(
    {
        "name": "Everest",
        "height": 8848,
        "location": ["Nepal", "China"],
        "ascents": {
            "first": {
                "year": 1953,
            },
            "first_winter": {
                "year": 1980,
            }
        }
    }
)

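This will again trigger a Document failed validation error, this time because the ascents subdocument lacks the required total field.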

As a final test, try inserting a document whose ascents subdocument includes a total field with a negative value:

db.peaks.insertOne(
    {
        "name": "Everest",
        "height": 8848,
        "location": ["Nepal", "China"],
        "ascents": {
            "first": {
                "year": 1953,
            },
            "first_winter": {
                "year": 1980,
            },
            "total": -100
        }
    }
)

Because of the negative total value, this document will also fail the validation test.
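To recap why each of the four test documents in this step passes or fails, the rules for ascents can be mirrored in a few lines of plain JavaScript. This is only a sketch for reasoning about the rules: checkAscents is a hypothetical helper, it does not reproduce how MongoDB evaluates $jsonSchema, and it treats only JavaScript numbers as numeric values.

```javascript
// Plain-JavaScript mirror of the ascents rules, for illustration only.
// checkAscents is a hypothetical helper, not a MongoDB API.
function checkAscents(doc) {
    const ascents = doc.ascents;
    if (ascents === undefined) {
        return "missing required field: ascents";
    }
    if (ascents === null || typeof ascents !== "object" || Array.isArray(ascents)) {
        return "ascents must be a subdocument";
    }
    if (!("total" in ascents)) {
        return "missing required field: ascents.total";
    }
    if (typeof ascents.total !== "number") {
        return "ascents.total must be a number";
    }
    if (ascents.total < 0) {
        return "ascents.total must be 0 or higher";
    }
    return "ok";
}

const base = { name: "Everest", height: 8848, location: ["Nepal", "China"] };
const first = { year: 1953 };

console.log(checkAscents({ ...base, ascents: { first, total: 5656 } })); // ok
console.log(checkAscents(base)); // missing required field: ascents
console.log(checkAscents({ ...base, ascents: { first } })); // missing required field: ascents.total
console.log(checkAscents({ ...base, ascents: { first, total: -100 } })); // ascents.total must be 0 or higher
```

The four calls correspond, in order, to the four insertOne() attempts made in this step.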

Conclusion

By following this tutorial, you became familiar with JSON Schema documents and how to use them to validate document structures before saving them into a collection. You then used JSON Schema documents to verify field types and apply value constraints to numbers and arrays. You’ve also learned how to validate subdocuments in a nested document structure.

MongoDB’s schema validation feature should not be considered a replacement for data validation at the application level, but it can further safeguard against violating data constraints that are essential to keeping your data meaningful. Using schema validation can be a helpful tool for structuring one’s data while retaining the flexibility of a schemaless approach to data storage. With schema validation, you are in total control of those parts of the document structure you want to validate and those you’d like to leave open-ended.

This tutorial described only a subset of MongoDB’s schema validation features. You can apply more constraints to different MongoDB data types, and it’s even possible to change the strictness of validation behavior and use JSON Schema to filter and validate existing documents. We encourage you to study the official MongoDB documentation to learn more about schema validation and how it can help you work with data stored in the database.
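As a starting point for exploring those features, the sketch below builds two plain objects: a collMod command that relaxes enforcement through the validationLevel and validationAction options, and a query filter that uses the $jsonSchema operator to find documents that do not conform. The peakSchema, relaxedValidation and nonConforming names are assumptions for illustration, and the schema is abbreviated. Check the behavior against the documentation for your MongoDB version before relying on it.

```javascript
// Abbreviated schema for illustration; in practice you would reuse the full
// $jsonSchema document from Step 5.
const peakSchema = {
    "bsonType": "object",
    "required": ["name", "height", "location", "ascents"]
};

// collMod command that relaxes enforcement:
// - validationLevel "moderate" does not validate updates to existing
//   documents that already violate the rules
// - validationAction "warn" logs violations instead of rejecting the write
const relaxedValidation = {
    "collMod": "peaks",
    "validator": { $jsonSchema: peakSchema },
    "validationLevel": "moderate",
    "validationAction": "warn"
};

// Query filter matching documents that do NOT satisfy the schema.
const nonConforming = { $nor: [{ $jsonSchema: peakSchema }] };

// In the MongoDB shell, where `db` is defined, these could be used as:
//   db.runCommand(relaxedValidation)
//   db.peaks.find(nonConforming)
```

Keeping the schema in a variable lets the same rules drive both the collection validator and the query for existing documents that fall outside them.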

Tutorial Series: How To Manage Data with MongoDB

MongoDB is a document-oriented NoSQL database management system (DBMS). Unlike traditional relational DBMSs, which store data in tables consisting of rows and columns, MongoDB stores data in JSON-like structures referred to as documents.

This series provides an overview of MongoDB’s features and how you can use them to manage and interact with your data.

About the author(s)

Mateusz Papiernik

Author

Software Engineer, CTO @Makimo

Creating bespoke software ◦ CTO & co-founder at Makimo. I'm a software engineer & a geek. I like making impossible things possible. And I need tea.

Mark Drake

Editor

Manager, Developer Education

Former Technical Writer at DigitalOcean. Focused on SysAdmin topics including Debian 11, Ubuntu 22.04, Ubuntu 20.04, Databases, SQL and PostgreSQL.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
