

# What is a document database?
<a name="what-is-document-db"></a>

Some developers don't think of their data model in terms of normalized rows and columns. Typically, in the application tier, data is represented as a JSON document because it is more intuitive for developers to think of their data model as a document. 

The popularity of document databases has grown because they let you persist data in a database by using the same document model format that you use in your application code. Document databases provide powerful and intuitive APIs for flexible and agile development.

**Topics**
+ [Use cases](document-database-use-cases.md)
+ [Understanding documents](document-database-documents-understanding.md)
+ [Working with documents](document-database-working-with-documents.md)

# Document database use cases
<a name="document-database-use-cases"></a>

Your use case drives whether you need a document database or some other type of database for managing your data. Document databases are useful for workloads that require a flexible schema for fast, iterative development. The following are some examples of use cases for which document databases can provide significant advantages:

**Topics**
+ [User profiles](#document-databases-use-cases.user-profiles)
+ [Real-time big data](#document-databases-use-cases.big-data)
+ [Content management](#document-databases-use-cases.content-management)

## User profiles
<a name="document-databases-use-cases.user-profiles"></a>

Because document databases have a flexible schema, they can store documents that have different attributes and data values. Document databases are a practical solution to online profiles in which different users provide different types of information. Using a document database, you can store each user's profile efficiently by storing only the attributes that are specific to each user.

Suppose that a user elects to add or remove information from their profile. In this case, their document can be replaced with an updated version that contains any newly added attributes and data and omits any removed attributes and data. Document databases easily manage this level of individuality and fluidity.

## Real-time big data
<a name="document-databases-use-cases.big-data"></a>

Historically, extracting information from operational data was difficult because operational databases and analytical databases were maintained in separate environments: operational and business/reporting, respectively. Being able to extract operational information in real time is critical in a highly competitive business environment. By using document databases, a business can store and manage operational data from any source and concurrently feed the data to the BI engine of choice for analysis. There is no requirement to have two environments.

## Content management
<a name="document-databases-use-cases.content-management"></a>

To effectively manage content, you must be able to collect and aggregate content from a variety of sources, and then deliver it to the customer. Because of their flexible schema, document databases are well suited to collecting and storing many types of data. You can use them to create and incorporate new types of content, including user-generated content, such as images, comments, and videos.

# Understanding documents
<a name="document-database-documents-understanding"></a>

Document databases store semistructured data as documents, rather than normalizing data across multiple tables, each with a unique and fixed structure, as in a relational database. Documents stored in a document database use nested key-value pairs to provide the document's structure or schema. Different types of documents can be stored in the same document database, which meets the requirement for processing similar data that is in different formats. For example, because each document is self-describing, the JSON-encoded documents for an online store that are described in the topic [Example documents in a document database](#document-database-documents) can be stored in the same document database.

**Topics**
+ [SQL vs. non-relational terminology](#document-database-sql-vs-nosql-terms)
+ [Simple documents](#document-database-documents-simple)
+ [Embedded documents](#document-database-documents-embeded)
+ [Example documents in a document database](#document-database-documents)
+ [Understanding normalization in a document database](#document-database-normalization)

## SQL vs. non-relational terminology
<a name="document-database-sql-vs-nosql-terms"></a>

The following table compares terminology used by document databases (MongoDB) with terminology used by SQL databases.


|  SQL  |  MongoDB  | 
| --- | --- | 
|  Table  |  Collection  | 
|  Row  |  Document  | 
|  Column  |  Field  | 
|  Primary key  |  `_id` field (an ObjectId by default)  | 
|  Index  |  Index  | 
|  View  |  View  | 
|  Nested table or object  |  Embedded document  | 
|  Array  |  Array  | 

## Simple documents
<a name="document-database-documents-simple"></a>

All documents in a document database are self-describing. This documentation uses JSON-like formatted documents, although you can use other means of encoding.

A simple document has one or more fields that are all at the same level within the document. In the following example, the fields `SSN`, `LName`, `FName`, `DOB`, `Street`, `City`, `State-Province`, `PostalCode`, and `Country` are all siblings within the document.

```
{
   "SSN": "123-45-6789",
   "LName": "Rivera",
   "FName": "Martha",
   "DOB": "1992-11-16",
   "Street": "125 Main St.",
   "City": "Anytown",
   "State-Province": "WA",
   "PostalCode": "98117",
   "Country": "USA"
}
```

When information is organized in a simple document, each field is managed individually. To retrieve a person's address, you must retrieve `Street`, `City`, `State-Province`, `PostalCode`, and `Country` as individual data items.

## Embedded documents
<a name="document-database-documents-embeded"></a>

A complex document organizes its data by creating embedded documents within the document. Embedded documents help manage data in groupings and as individual data items, whichever is more efficient in a given case. Using the preceding example, you could embed an `Address` document in the main document. Doing this results in the following document structure:

```
{
   "SSN": "123-45-6789",
   "LName": "Rivera",
   "FName": "Martha",
   "DOB": "1992-11-16",
   "Address": 
   {
       "Street": "125 Main St.",
       "City": "Anytown",
       "State-Province": "WA",
       "PostalCode": "98117",
       "Country": "USA" 
   }
}
```

You can now access the data in the document as individual fields ( `"SSN":` ), as an embedded document ( `"Address":` ), or as a member of an embedded document ( `"Address":{"Street":}` ).
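As a rough illustration of these three access levels, the following sketch treats the document as a plain JavaScript object, outside of any database. The variable names are chosen for this example only.

```javascript
// The embedded-document example from above, as a plain JavaScript object.
const person = {
  SSN: "123-45-6789",
  LName: "Rivera",
  FName: "Martha",
  DOB: "1992-11-16",
  Address: {
    Street: "125 Main St.",
    City: "Anytown",
    "State-Province": "WA",
    PostalCode: "98117",
    Country: "USA",
  },
};

// An individual top-level field.
const ssn = person.SSN;

// The whole embedded document, retrieved as one unit.
const address = person.Address;

// A single member of the embedded document.
const street = person.Address.Street;

// Keys that are not valid identifiers need bracket notation.
const region = person.Address["State-Province"];

console.log(ssn, street, region);
console.log(Object.keys(address));
```

Grouping the address fields means that application code can pass `address` around as a single value instead of collecting five separate fields.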

## Example documents in a document database
<a name="document-database-documents"></a>

As stated earlier, because each document in a document database is self-describing, the structure of documents within a document database can be different from one another. The following two documents, one for a book and another for a periodical, are different structurally. Yet both of them can be in the same document database.

The following is a sample book document:

```
{
    "_id" : "9876543210123",
    "Type": "book",
    "ISBN": "987-6-543-21012-3",
    "Author": 
    {
        "LName":"Roe",
        "MI": "T",
        "FName": "Richard" 
    },
    "Title": "Understanding Document Databases"
}
```

The following is a sample periodical document with two articles:

```
{
    "_id" : "0123456789012",
    "Publication": "Programming Today",
    "Issue": 
    {
        "Volume": "14",
        "Number": "09"
    },
    "Articles" : [ 
        {
            "Title": "Is a Document Database Your Best Solution?",
            "Author": 
            {
                "LName": "Major",
                "FName": "Mary" 
            }
        },
        {
            "Title": "Databases for Online Solutions",
            "Author": 
            {
                "LName": "Stiles",
                "FName": "John" 
            }
        }
    ],
    "Type": "periodical"
}
```

Compare the structure of these two documents. With a relational database, you need either separate "periodical" and "books" tables, or a single table with unused fields, such as "Publication," "Issue," "Articles," and "MI," as `null` values. Because document databases are semistructured, with each document defining its own structure, these two documents can coexist in the same document database with no `null` fields. Document databases are good at dealing with sparse data.
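To make the comparison concrete, the following plain JavaScript sketch places both documents in one array that stands in for a collection and lists the fields that each document actually carries. The array and the variable names are assumptions made for this illustration, not part of any database API.

```javascript
// The two sample documents from above, as plain JavaScript objects.
const book = {
  _id: "9876543210123",
  Type: "book",
  ISBN: "987-6-543-21012-3",
  Author: { LName: "Roe", MI: "T", FName: "Richard" },
  Title: "Understanding Document Databases",
};

const periodical = {
  _id: "0123456789012",
  Publication: "Programming Today",
  Issue: { Volume: "14", Number: "09" },
  Articles: [
    {
      Title: "Is a Document Database Your Best Solution?",
      Author: { LName: "Major", FName: "Mary" },
    },
    {
      Title: "Databases for Online Solutions",
      Author: { LName: "Stiles", FName: "John" },
    },
  ],
  Type: "periodical",
};

// A plain array standing in for a collection that holds both documents.
const collection = [book, periodical];

// Record which top-level fields each document defines for itself.
const fieldsByType = {};
for (const doc of collection) {
  fieldsByType[doc.Type] = Object.keys(doc);
}

// Neither document needs null placeholders for fields it does not use.
const hasNullPlaceholder = collection.some((doc) =>
  Object.values(doc).includes(null)
);

console.log(fieldsByType);
console.log(hasNullPlaceholder);
```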

Developing against a document database enables quick, iterative development. This is because you can change the data structure of a document dynamically, without having to change the schema for the entire collection. Document databases are well suited for agile development and dynamically changing environments.

## Understanding normalization in a document database
<a name="document-database-normalization"></a>

Document databases are not normalized; data found in one document can be repeated in another document. Further, some data discrepancies can exist between documents. For example, consider the scenario in which you make a purchase at an online store and all the details of your purchases are stored in a single document. The document might look something like the following JSON document:

```
{
    "DateTime": "2018-08-15T12:13:10Z",
    "LName" : "Santos",
    "FName" : "Paul",
    "Cart" : [ 
        {
            "ItemId" : "9876543210123",
            "Description" : "Understanding Document Databases",
            "Price" : "29.95"
        },
        {
            "ItemId" : "0123456789012",
            "Description" : "Programming Today",
            "Issue": {
                "Volume": "14",
                "Number": "09"
            },
            "Price" : "8.95"
        },
        {
            "ItemId": "234567890-K",
            "Description": "Gel Pen (black)",
            "Price": "2.49" 
        }
    ],
    "PaymentMethod" : 
    {
        "Issuer" : "MasterCard",
        "Number" : "1234-5678-9012-3456" 
    },
    "ShopperId" : "1234567890" 
}
```

All this information is stored as a document in a transaction collection. Later, you realize that you forgot to purchase one item. So you again log on to the same store and make another purchase, which is also stored as another document in the transaction collection.

```
{
    "DateTime": "2018-08-15T14:49:00Z",
    "LName" : "Santos",
    "FName" : "Paul",
    "Cart" : [ 
        {
            "ItemId" : "2109876543210",
            "Description" : "Document Databases for Fun and Profit",
            "Price" : "45.95"
        } 
    ],
    "PaymentMethod" : 
    {
        "Issuer" : "Visa",
        "Number" : "0987-6543-2109-8765" 
    },
    "ShopperId" : "1234567890" 
}
```

Notice the redundancy between these two documents—your name and shopper ID (and, if you used the same credit card, your credit card information). But that's okay because storage is inexpensive, and each document completely records a single transaction that can be retrieved quickly with a simple key-value query that requires no joins.

There is also an apparent discrepancy between the two documents—your credit card information. This is only an apparent discrepancy because it is likely that you used a different credit card for each purchase. Each document is accurate for the transaction that it documents.
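Because each transaction document is complete on its own, application code can work with it without consulting any other document. The following plain JavaScript sketch totals the cart of the first purchase. The function name `cartTotalCents` is hypothetical, and the conversion to cents assumes that every `Price` is a decimal string like the ones shown above.

```javascript
// A trimmed copy of the first transaction document shown above.
const transaction = {
  DateTime: "2018-08-15T12:13:10Z",
  LName: "Santos",
  FName: "Paul",
  Cart: [
    { ItemId: "9876543210123", Description: "Understanding Document Databases", Price: "29.95" },
    { ItemId: "0123456789012", Description: "Programming Today", Price: "8.95" },
    { ItemId: "234567890-K", Description: "Gel Pen (black)", Price: "2.49" },
  ],
  ShopperId: "1234567890",
};

// Hypothetical helper: sum the cart in whole cents to avoid
// floating-point drift when adding decimal prices.
function cartTotalCents(doc) {
  return doc.Cart.reduce(
    (sum, item) => sum + Math.round(Number(item.Price) * 100),
    0
  );
}

const totalCents = cartTotalCents(transaction);
console.log((totalCents / 100).toFixed(2));
```

Everything the calculation needs is inside the one document, which is the trade-off that the redundancy buys.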

# Working with documents
<a name="document-database-working-with-documents"></a>

As a document database, Amazon DocumentDB makes it easy to store, query, and index JSON data. In Amazon DocumentDB, a collection is analogous to a table in a relational database, except there is no single schema enforced upon all documents. Collections let you group similar documents together while keeping them all in the same database, without requiring that they be identical in structure.

Using the example documents from earlier sections, it is likely that you'd have collections for `reading_material` and `office_supplies`. It is the responsibility of your software to enforce which collection a document belongs in.

The following examples use the MongoDB API to show how to add, query, update, and delete documents.

**Topics**
+ [Adding documents](#document-database-adding-documents)
+ [Querying documents](#document-database-queries)
+ [Updating documents](#document-database-updating)
+ [Deleting documents](#document-database-deleting)

## Adding documents
<a name="document-database-adding-documents"></a>

In Amazon DocumentDB, a database is created when you first add a document to a collection. In this example, you are creating a collection named `example` in the `test` database, which is the default database when you connect to a cluster. Because the collection is implicitly created when the first document is inserted, there is no error checking of the collection name. Therefore, a typo in the collection name, such as `eexample` instead of `example`, creates the `eexample` collection and adds the document to it rather than to the intended collection. Error checking must be handled by your application.
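One way an application might handle this check is to validate collection names against a known list before using them. The following is a minimal sketch of that idea in plain JavaScript. The `KNOWN_COLLECTIONS` set and the `checkedCollectionName` function are hypothetical names introduced for this example and are not part of the MongoDB API.

```javascript
// Hypothetical allow-list of collection names that the application expects.
const KNOWN_COLLECTIONS = new Set(["example"]);

// Return the name unchanged if it is known; otherwise fail loudly
// instead of letting a typo silently create a new collection.
function checkedCollectionName(name) {
  if (!KNOWN_COLLECTIONS.has(name)) {
    throw new Error(`Unknown collection name: ${name}`);
  }
  return name;
}

console.log(checkedCollectionName("example"));

try {
  checkedCollectionName("eexample");
} catch (err) {
  console.log(err.message);
}
```

In the mongo shell, the checked name could then be passed to `db.getCollection()`, so that a misspelled name raises an error before any document is inserted.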

The following examples use the MongoDB API to add documents.

**Topics**
+ [Adding a single document](#document-database-adding-documents-single)
+ [Adding multiple documents](#document-database-adding-documents-multiple)

### Adding a single document
<a name="document-database-adding-documents-single"></a>

To add a single document to a collection, use the `insertOne( {} )` operation with the document that you want added to the collection.

```
db.example.insertOne(
    {
        "Item": "Ruler",
        "Colors": ["Red","Green","Blue","Clear","Yellow"],
        "Inventory": {
            "OnHand": 47,
            "MinOnHand": 40
        },
        "UnitPrice": 0.89
    }
)
```

Output from this operation looks something like the following (JSON format).

```
{
    "acknowledged" : true,
    "insertedId" : ObjectId("5bedafbcf65ff161707de24f")
}
```

### Adding multiple documents
<a name="document-database-adding-documents-multiple"></a>

To add multiple documents to a collection, use the `insertMany( [{},...,{}] )` operation with a list of the documents that you want added to the collection. Although the documents in this particular list have different schemas, they can all be added to the same collection.

```
db.example.insertMany(
    [
        {
            "Item": "Pen",
            "Colors": ["Red","Green","Blue","Black"],
            "Inventory": {
                "OnHand": 244,
                "MinOnHand": 72 
            }
        },
        {
            "Item": "Poster Paint",
            "Colors": ["Red","Green","Blue","Black","White"],
            "Inventory": {
                "OnHand": 47,
                "MinOnHand": 50 
            }
        },
        {
            "Item": "Spray Paint",
            "Colors": ["Black","Red","Green","Blue"],
            "Inventory": {
                "OnHand": 47,
                "MinOnHand": 50,
                "OrderQnty": 36
            }
        }    
    ]
)
```

Output from this operation looks something like the following (JSON format).

```
{
    "acknowledged" : true,
    "insertedIds" : [
            ObjectId("5bedb07941ca8d9198f5934c"),
            ObjectId("5bedb07941ca8d9198f5934d"),
            ObjectId("5bedb07941ca8d9198f5934e")
    ]
}
```

## Querying documents
<a name="document-database-queries"></a>

At times, you might need to look up your online store's inventory so that customers can see and purchase what you're selling. Querying a collection is relatively easy, whether you want all documents in the collection or only those documents that satisfy a particular criterion.

To query for documents, use the `find()` operation. The `find()` command has a single document parameter that defines the criteria to use in choosing the documents to return. The output from `find()` is a document formatted as a single line of text with no line breaks. To format the output document for easier reading, use `find().pretty()`. All the examples in this topic use `.pretty()` to format the output.

Use the four documents you inserted into the `example` collection in the preceding two exercises — `insertOne()` and `insertMany()`.

**Topics**
+ [Retrieving all documents in a collection](#document-database-queries-all-documents)
+ [Retrieving documents that match a field value](#document-database-queries-match-criteria)
+ [Retrieving documents that match an embedded document](#document-database-queries-entire-embedded-document)
+ [Retrieving documents that match a field value in an embedded document](#document-database-queries-embeded-document-field)
+ [Retrieving documents that match an array](#document-database-queries-array-match)
+ [Retrieving documents that match a value in an array](#document-database-queries-array-value-match)
+ [Retrieving documents using operators](#document-database-query-operators)

### Retrieving all documents in a collection
<a name="document-database-queries-all-documents"></a>

To retrieve all the documents in your collection, use the `find()` operation with an empty query document.

The following query returns all documents in the `example` collection.

```
db.example.find( {} ).pretty()
```

### Retrieving documents that match a field value
<a name="document-database-queries-match-criteria"></a>

To retrieve all documents that match a field and value, use the `find()` operation with a query document that identifies the fields and values to match.

Using the preceding documents, this query returns all documents where the "Item" field equals "Pen".

```
db.example.find( { "Item": "Pen" } ).pretty()
```

### Retrieving documents that match an embedded document
<a name="document-database-queries-entire-embedded-document"></a>

To find all the documents that match an embedded document, use the `find()` operation with a query document that specifies the embedded document name and all the fields and values for that embedded document.

When matching an embedded document, the embedded document must have the same name as in the query. In addition, the fields and values in the embedded document must match the query exactly, with no additional fields.

The following query returns only the "Poster Paint" document. This is because "Ruler" has a different value for "`MinOnHand`", "Pen" has different values for "`OnHand`" and "`MinOnHand`", and "Spray Paint" has one more field (`OrderQnty`) than the query document.

```
db.example.find({"Inventory": {
    "OnHand": 47,
    "MinOnHand": 50 } } ).pretty()
```

### Retrieving documents that match a field value in an embedded document
<a name="document-database-queries-embeded-document-field"></a>

To find all the documents that match one or more field values in an embedded document, use the `find()` operation with a query document that uses "dot notation" to specify the embedded document name and only the fields and values that you want to match.

Given the preceding documents, the following query uses "dot notation" to specify the embedded document and fields of interest. Any document that matches these fields and values is returned, regardless of what other fields might be present in the embedded document. The query returns "Poster Paint" and "Spray Paint" because they both match the specified fields and values.

```
db.example.find({"Inventory.OnHand": 47, "Inventory.MinOnHand": 50 }).pretty()
```
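The difference between matching a whole embedded document and matching individual fields with dot notation can be sketched in plain JavaScript, without a database. The functions `matchesWholeEmbedded` and `matchesDotted` are hypothetical helpers that only approximate the behavior described above. In particular, this sketch ignores field order, which a real database might also compare.

```javascript
// The Inventory parts of the four documents inserted earlier.
const docs = [
  { Item: "Ruler", Inventory: { OnHand: 47, MinOnHand: 40 } },
  { Item: "Pen", Inventory: { OnHand: 244, MinOnHand: 72 } },
  { Item: "Poster Paint", Inventory: { OnHand: 47, MinOnHand: 50 } },
  { Item: "Spray Paint", Inventory: { OnHand: 47, MinOnHand: 50, OrderQnty: 36 } },
];

// Whole-document match: same fields, same values, nothing extra.
function matchesWholeEmbedded(embedded, wanted) {
  const wantedKeys = Object.keys(wanted);
  return (
    Object.keys(embedded).length === wantedKeys.length &&
    wantedKeys.every((key) => embedded[key] === wanted[key])
  );
}

// Dot-notation match: only the named paths are compared.
function matchesDotted(doc, criteria) {
  return Object.entries(criteria).every(([path, value]) => {
    const found = path
      .split(".")
      .reduce((cur, key) => (cur == null ? undefined : cur[key]), doc);
    return found === value;
  });
}

const whole = docs
  .filter((d) => matchesWholeEmbedded(d.Inventory, { OnHand: 47, MinOnHand: 50 }))
  .map((d) => d.Item);

const dotted = docs
  .filter((d) => matchesDotted(d, { "Inventory.OnHand": 47, "Inventory.MinOnHand": 50 }))
  .map((d) => d.Item);

console.log(whole);  // only the document whose Inventory has exactly these fields
console.log(dotted); // also includes documents with extra Inventory fields
```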

### Retrieving documents that match an array
<a name="document-database-queries-array-match"></a>

To find all documents that match an array, use the `find()` operation with the array name that you are interested in and all the values in that array. The query returns all documents that have an array with that name in which the array values are identical to and in the same order as in the query.

The following query returns only the "Pen" because the "Ruler" has a different set of colors, the "Poster Paint" has an additional color (White), and "Spray Paint" has the colors in a different order.

```
db.example.find( { "Colors": ["Red","Green","Blue","Black"] } ).pretty() 
```

### Retrieving documents that match a value in an array
<a name="document-database-queries-array-value-match"></a>

To find all the documents that have a particular array value, use the `find()` operation with the array name and the value that you're interested in.

```
db.example.find( { "Colors": "Red" } ).pretty() 
```

The preceding operation returns all four documents because each of them has an array named `Colors` and the value "`Red`" somewhere in the array. If you specify the value "`White`," the query would only return "Poster Paint."
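The two kinds of array matching can be approximated in plain JavaScript as follows. This is a sketch of the semantics only, applied to the `Colors` arrays of the four documents inserted earlier. The helper `arrayEquals` is a hypothetical name for this example.

```javascript
// The Colors arrays of the four documents inserted earlier.
const docs = [
  { Item: "Ruler", Colors: ["Red", "Green", "Blue", "Clear", "Yellow"] },
  { Item: "Pen", Colors: ["Red", "Green", "Blue", "Black"] },
  { Item: "Poster Paint", Colors: ["Red", "Green", "Blue", "Black", "White"] },
  { Item: "Spray Paint", Colors: ["Black", "Red", "Green", "Blue"] },
];

// Whole-array match: same length, same values, same order.
function arrayEquals(a, b) {
  return a.length === b.length && a.every((value, i) => value === b[i]);
}

const exact = docs
  .filter((d) => arrayEquals(d.Colors, ["Red", "Green", "Blue", "Black"]))
  .map((d) => d.Item);

// Single-value match: the value appears anywhere in the array.
const withRed = docs.filter((d) => d.Colors.includes("Red")).map((d) => d.Item);
const withWhite = docs.filter((d) => d.Colors.includes("White")).map((d) => d.Item);

console.log(exact);
console.log(withRed);
console.log(withWhite);
```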

### Retrieving documents using operators
<a name="document-database-query-operators"></a>

The following query returns all documents where the "`Inventory.OnHand`" value is less than 50.

```
db.example.find(
        { "Inventory.OnHand": { $lt: 50 } } ).pretty()
```

For a listing of supported query operators, see [Query and projection operators](mongo-apis.md#mongo-apis-query). 
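To show how a comparison operator narrows the result set, the following plain JavaScript sketch evaluates a small subset of operators (`$lt`, `$lte`, `$gt`, `$gte`) against the four documents inserted earlier. The `OPERATORS` table and the `matchesCondition` function are hypothetical and cover far less than a real query engine.

```javascript
// The Inventory.OnHand values of the four documents inserted earlier.
const docs = [
  { Item: "Ruler", Inventory: { OnHand: 47 } },
  { Item: "Pen", Inventory: { OnHand: 244 } },
  { Item: "Poster Paint", Inventory: { OnHand: 47 } },
  { Item: "Spray Paint", Inventory: { OnHand: 47 } },
];

// Hypothetical evaluator for a few comparison operators.
const OPERATORS = {
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
};

// A value matches when every operator in the condition holds for it.
function matchesCondition(value, condition) {
  return Object.entries(condition).every(([op, operand]) => {
    const test = OPERATORS[op];
    if (!test) {
      throw new Error(`Unsupported operator in this sketch: ${op}`);
    }
    return test(value, operand);
  });
}

const lowStock = docs
  .filter((d) => matchesCondition(d.Inventory.OnHand, { $lt: 50 }))
  .map((d) => d.Item);

console.log(lowStock);
```

Only "Pen", with 244 on hand, falls outside the `$lt: 50` condition.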

## Updating documents
<a name="document-database-updating"></a>

Typically, your documents are not static and are updated as part of your application workflows. The following examples show some of the ways that you can update documents.

To update an existing document, use the `update()` operation. The `update()` operation has two document parameters. The first document identifies which document or documents to update. The second document specifies the updates to make.

When you update an existing field — whether that field is a simple field, an array, or an embedded document — you specify the field name and its values. At the end of the operation, it is as though the field in the old document has been replaced by the new field and values.

**Topics**
+ [Updating the values of an existing field](#document-database-updating-existing-fields)
+ [Adding a new field](#document-database-updating-adding-field)
+ [Replacing an embedded document](#document-database-replacing-embedded-document)
+ [Inserting new fields into an embedded document](#document-database-updating-adding-field-embedded)
+ [Removing a field from a document](#document-database-remove-field)
+ [Removing a field from multiple documents](#document-database-remove-field-all)

### Updating the values of an existing field
<a name="document-database-updating-existing-fields"></a>

The following update operations use the four documents that you added earlier.

```
{
    "Item": "Ruler",
    "Colors": ["Red","Green","Blue","Clear","Yellow"],
    "Inventory": {
        "OnHand": 47,
        "MinOnHand": 40
    },
    "UnitPrice": 0.89
},
{
    "Item": "Pen",
    "Colors": ["Red","Green","Blue","Black"],
    "Inventory": {
        "OnHand": 244,
        "MinOnHand": 72 
    }
},
{
    "Item": "Poster Paint",
    "Colors": ["Red","Green","Blue","Black","White"],
    "Inventory": {
        "OnHand": 47,
        "MinOnHand": 50 
    }
},
{
    "Item": "Spray Paint",
    "Colors": ["Black","Red","Green","Blue"],
    "Inventory": {
        "OnHand": 47,
        "MinOnHand": 50,
        "OrderQnty": 36
    }
}
```

**To update a simple field**  
To update a simple field, use `update()` with `$set` to specify the field name and new value. The following example changes the `Item` from "Pen" to "Gel Pen".

```
db.example.update(
    { "Item" : "Pen" },
    { $set: { "Item": "Gel Pen" } }
)
```

After this operation, the document looks something like the following.

```
{
    "Item": "Gel Pen",
    "Colors": ["Red","Green","Blue","Black"],
    "Inventory": {
        "OnHand": 244,
        "MinOnHand": 72 
    }
}
```

**To update an array**  
The following example replaces the existing array of colors with a new array that includes `Orange` and drops `White` from the list of colors. The new list of colors is in the order specified in the `update()` operation.

```
db.example.update(
    { "Item" : "Poster Paint" },
    { $set: { "Colors": ["Red","Green","Blue","Orange","Black"] } }
)
```

After this operation, the document looks something like the following.

```
{
    "Item": "Poster Paint",
    "Colors": ["Red","Green","Blue","Orange","Black"],
    "Inventory": {
        "OnHand": 47,
        "MinOnHand": 50 
    }
}
```

### Adding a new field
<a name="document-database-updating-adding-field"></a>

To modify a document by adding one or more new fields, use the `update()` operation with a query document that identifies the document to insert into and the new fields and values to insert using the `$set` operator.

The following example adds the field `UnitPrice` with the value `3.99` to the Spray Paint document. Note that the value `3.99` is numeric and not a string.

```
db.example.update(
    { "Item": "Spray Paint" },
    { $set: { "UnitPrice": 3.99 } } 
)
```

After this operation, the document looks something like the following (JSON format).

```
{
    "Item": "Spray Paint",
    "Colors": ["Black","Red","Green","Blue"],
    "Inventory": {
        "OnHand": 47,
        "MinOnHand": 50,
        "OrderQnty": 36
    },
    "UnitPrice": 3.99
}
```

### Replacing an embedded document
<a name="document-database-replacing-embedded-document"></a>

To modify a document by replacing an embedded document, use the `update()` operation with documents that identify the embedded document and its new fields and values using the `$set` operator.

First, insert the following document.

```
db.example.insert({
    "DocName": "Document 1",
    "Date": {
        "Year": 1987,
        "Month": 4,
        "Day": 18
    }
})
```

**To replace an embedded document**  
The following example replaces the current Date document with a new one that has only the fields `Month` and `Day`; `Year` has been eliminated.

```
db.example.update(
    { "DocName" : "Document 1" },
    { $set: { "Date": { "Month": 4, "Day": 18 } } }
)
```

After this operation, the document looks something like the following.

```
{
    "DocName": "Document 1",
    "Date": {
        "Month": 4,
        "Day": 18
    }
}
```

### Inserting new fields into an embedded document
<a name="document-database-updating-adding-field-embedded"></a>

**To add fields to an embedded document**  
To add one or more new fields to an embedded document, use the `update()` operation with a query document that identifies the parent document, and the `$set` operator with "dot notation" to specify the embedded document and the new fields and values to insert.

Given the following document, the following code uses "dot notation" to insert the `Year` and `DoW` fields into the embedded `Date` document, and adds `Words` to the parent document.

```
{
    "DocName": "Document 1",
    "Date": {
        "Month": 4,
        "Day": 18
    }
}
```

```
db.example.update(
    { "DocName" : "Document 1" },
    { $set: { "Date.Year": 1987, 
              "Date.DoW": "Saturday",
              "Words": 2482 } }
)
```

After this operation, the document looks something like the following.

```
{
    "DocName": "Document 1",
    "Date": {
        "Month": 4,
        "Day": 18,
        "Year": 1987,
        "DoW": "Saturday"
    },
    "Words": 2482
}
```
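The way `$set` treats plain field names and dot-notation paths can be approximated with a short plain JavaScript sketch. The function `applySet` is a hypothetical, simplified stand-in: it changes the object in place and creates missing intermediate objects, and it does not reproduce the validation or error handling of a real database.

```javascript
// Hypothetical, simplified model of $set on an in-memory object.
function applySet(doc, updates) {
  for (const [path, value] of Object.entries(updates)) {
    const keys = path.split(".");
    const last = keys.pop();
    let target = doc;
    // Walk down to the embedded document that owns the final field.
    for (const key of keys) {
      if (typeof target[key] !== "object" || target[key] === null) {
        target[key] = {};
      }
      target = target[key];
    }
    // Assigning replaces whatever value the field held before.
    target[last] = value;
  }
  return doc;
}

// Dot notation adds fields inside Date and leaves Month and Day alone.
const added = { DocName: "Document 1", Date: { Month: 4, Day: 18 } };
applySet(added, { "Date.Year": 1987, "Date.DoW": "Saturday", Words: 2482 });

// Naming the embedded document itself replaces it as a whole.
const replaced = { DocName: "Document 1", Date: { Year: 1987, Month: 4, Day: 18 } };
applySet(replaced, { Date: { Month: 4, Day: 18 } });

console.log(JSON.stringify(added));
console.log(JSON.stringify(replaced));
```

The contrast between `added` and `replaced` mirrors the two preceding sections: a dotted path touches one member of the embedded document, while the bare field name swaps out the entire embedded document.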

### Removing a field from a document
<a name="document-database-remove-field"></a>

To modify a document by removing a field from the document, use the `update()` operation with a query document that identifies the document to remove the field from, and the `$unset` operator to specify the field to remove.

The following example removes the `Words` field from the preceding document.

```
db.example.update(
    { "DocName" : "Document 1" },
    { $unset: { Words:1 } }
)
```

After this operation, the document looks something like the following.

```
{
    "DocName": "Document 1",
    "Date": {
        "Month": 4,
        "Day": 18,
        "Year": 1987,
        "DoW": "Saturday"
    }
}
```

### Removing a field from multiple documents
<a name="document-database-remove-field-all"></a>

To remove a field from multiple documents, use the `update()` operation with the `$unset` operator and the `multi` option set to `true`.

The following example removes the `Inventory` field from all documents in the example collection. If a document does not have the `Inventory` field, no action is taken on that document. If `multi: true` is omitted, the action is performed only on the first document that meets the criterion.

```
db.example.update(
    {},
    { $unset: { Inventory:1 } },
    { multi: true }
)
```
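The effect of the `multi` option can be sketched in plain JavaScript under the assumption of an empty query document, as in the preceding example. The function `unsetField` and the `makeDocs` fixture are hypothetical names for this illustration. The sketch models only the behavior described above, not the full update engine.

```javascript
// Build a fresh set of sample documents; the middle one has no Inventory.
function makeDocs() {
  return [
    { Item: "Ruler", Inventory: { OnHand: 47 } },
    { Item: "Notebook" },
    { Item: "Pen", Inventory: { OnHand: 244 } },
  ];
}

// Hypothetical model of update({}, { $unset: ... }, { multi }).
// With an empty query every document matches, so without multi
// only the first document is considered.
function unsetField(docs, field, { multi = false } = {}) {
  const targets = multi ? docs : docs.slice(0, 1);
  let modified = 0;
  for (const doc of targets) {
    // Documents that lack the field are left untouched.
    if (field in doc) {
      delete doc[field];
      modified += 1;
    }
  }
  return modified;
}

const single = makeDocs();
const singleModified = unsetField(single, "Inventory");

const many = makeDocs();
const manyModified = unsetField(many, "Inventory", { multi: true });

console.log(singleModified, JSON.stringify(single));
console.log(manyModified, JSON.stringify(many));
```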

## Deleting documents
<a name="document-database-deleting"></a>

To remove a document from a collection, use the `remove()` operation with a query document that identifies the document to remove. The following code removes "Gel Pen" from your `example` collection.

```
db.example.remove( { "Item": "Gel Pen" } )
```

To remove all documents from a collection, use the `remove()` operation with an empty query document.

```
db.example.remove( { } )
```