MongoDB.live, free & fully virtual. June 9th - 10th. Register Now
HomeLearnArticle

Mapping Terms and Concepts from SQL to MongoDB

Published: Mar 18, 2020

  • MongoDB
  • SQL
  • ...

By Lauren Schaefer

Share

Perhaps, like me, you grew up on SQL databases. You can skillfully normalize a database, and, after years of working with tables, you think in rows and columns as well.

But now you've decided to dip your toe into the wonderful world of NoSQL databases, and you're exploring MongoDB. Perhaps you're wondering what you need to do differently. Can you just translate your rows and columns into fields and values and call it a day? Do you really need to change the way you think about storing your data?

We'll answer those questions and more in this three-part article series. Below is a summary of what we'll cover today:

This article is based on a presentation I gave at MongoDB World and MongoDB.local Houston entitled "From SQL to NoSQL: Changing Your Mindset."

If you prefer videos over articles, check out the recording. Slides are available here.

#Meet Ron

I'm a huge fan of the best tv show ever created: Parks and Recreation. Yes, I wrote that previous sentence as if it were a fact, because it actually is.

No.  That's your opinion.  That's the definition of an opinion.

This is Ron. Ron likes strong women, bacon, and staying off the grid.

When it comes to bacon, be prepared

In season 6, Ron discovers Yelp. Ron thinks Yelp is amazing, because he loves the idea of reviewing places he's been.

However, Yelp is way too "on the grid" for Ron. He pulls out his beloved typewriter and starts typing reviews that he intends to send via snail mail.

Ron writes some amazing reviews. Below is one of my favorites.

Dear frozen yogurt,You are the celery of desserts.Be ice cream or be nothing.Zero stars.

Unfortunately, I see three big problems with his plan:

  1. Snail mail is way slower than posting the review to Yelp where it will be instantly available for anyone to read.
  2. The business he is reviewing may never open the letter he sends as they may just assume it's junk mail.
  3. No one else will benefit from his review. (These are exactly the type of reviews I like to find on Amazon!)

#Why am I talking about Ron?

Ok, so why am I talking about Ron in the middle of this article about moving from SQL to MongoDB?

Ron saw the value of Yelp and was inspired by the new technology. However, he brought his old-school ways with him and did not realize the full value of the technology.

This is similar to what we commonly see as people move from a SQL database to a NoSQL database such as MongoDB. They love the idea of MongoDB, and they are inspired by the power of the flexible document data model. However, they frequently bring with them their SQL mindsets and don't realize the full value of MongoDB. In fact, when people don't change the way they think about modeling their data, they struggle and sometimes fail.

Ron throws computer in the trash

Don't be like Ron. (At least in this case, because, in most cases, Ron is amazing.) Don't be stuck in your SQL ways. Change your mindset and realize the full value of MongoDB.

Before we jump into how to change your mindset, let's begin by answering some common questions about non-relational databases and discussing the basics of how to store data in MongoDB.

#Relational Database and Non-Relational Databases

When I talk with developers, they often ask me questions like, "What use cases are good for MongoDB?" Developers often have this feeling that non-relational databases (or NoSQL databases) like MongoDB are for specific, niche use cases.

MongoDB is a general-purpose database that can be used in a variety of use cases across nearly every industry. For more details, see MongoDB Use Cases, MongoDB Industries, and the MongoDB Use Case Guidance Whitepaper that includes a summary of when you should evaluate other database options.

Another common question is, "If my data is relational, why would I use a non-relational database?"

MongoDB is considered a non-relational database. However, that doesn't mean MongoDB doesn't store relationship data well. (I know I just used a double-negative. Stick with me.) MongoDB stores relationship data in a different way. In fact, many consider the way MongoDB stores relationship data to be more intuitive and more reflective of the real-world relationships that are being modeled.

Let's take a look at how MongoDB stores data.

#The Document Model

Instead of tables, MongoDB stores data in documents. No, Clippy, I'm not talking about Microsoft Word Documents.

Clippy raises his eyebrows

I'm talking about BSON documents. BSON is a binary representation of JSON (JavaScript Object Notation) documents. Documents will likely feel comfortable to you if you've used any of the C-family of programming languages such as C, C#, Go, Java, JavaScript, PHP, or Python.

Documents typically store information about one object as well as any information related to that object. Related documents are grouped together in collections. Related collections are grouped together and stored in a database.

Let's discuss some of the basics of a document. Every document begins and ends with curly braces.

1
2
{ }

Inside of those curly braces, you'll find an unordered set of field/value pairs that are separated by commas.

1
2
3
4
5
{ field: value, field: value, field: value }

The fields are strings that describe the pieces of data being stored.

The values can be any of the BSON data types. BSON has a variety of data types including Double, String, Object, Array, Binary Data, ObjectId, Boolean, Date, Null, Regular Expression, JavaScript, JavaScript (with scope), 32-bit Integer, Timestamp, 64-bit Integer, Decimal128, Min Key, and Max Key. With all of these types available for you to use, you have the power to model your data as it exists in the real world.

Every document is required to have a field named _id. The value of _id must be unique for each document in a collection, is immutable, and can be of any type other than an array.

#Example Documents

Ok, that's enough definitions. Let's take a look at a real example, and compare and contrast how we would model the data in SQL vs MongoDB.

#Storing Leslie's Information

Let's say we need to store information about a user named Leslie. We'll store her contact information including her first name, last name, cell phone number, and city. We'll also store some extra information about her including her location, hobbies, and job history.

#Storing Contact Information

Let's begin with Leslie's contact information. When using SQL, we'll create a table named Users. We can create columns for each piece of contact information we need to store: first name, last name, cell phone number, and city. To ensure we have a unique way to identify each row, we'll include an ID column.

Users
IDfirst_namelast_namecellcity
1LeslieYepp8125552344Pawnee

Now let's store that same information in MongoDB. We can create a new document for Leslie where we'll add field/value pairs for each piece of contact information we need to store. We'll use _id to uniquely identify each document. We'll store this document in a collection named Users.

Users

1
2
3
4
5
6
7
{ "_id": 1, "first_name": "Leslie", "last_name": "Yepp", "cell": "8125552344", "city": "Pawnee" }

#Storing Latitude and Longitude

Now that we've stored Leslie's contact information, let's store the coordinates of her current location.

When using SQL, we'll need to split the latitude and longitude between two columns.

Users
IDfirst_namelast_namecellcitylatitudelongitude
1LeslieYepp8125552344Pawnee39.170344-86.536632

MongoDB has an array data type, so we can store the latitude and longitude together in a single field.

Users

1
2
3
4
5
6
7
8
{ "_id": 1, "first_name": "Leslie", "last_name": "Yepp", "cell": "8125552344", "city": "Pawnee", "location": [ -86.536632, 39.170344 ] }

Bonus Tip: MongoDB has a few different built-in ways to visualize location data including the schema analyzer in MongoDB Compass and the Geospatial Charts in MongoDB Charts. I generated the map below with just a few clicks in MongoDB Charts.

Map of Leslie's location

#Storing Lists of Information

We're successfully storing Leslie's contact information and current location. Now let's store her hobbies.

When using SQL, we could choose to add more columns to the Users table. However, since a single user could have many hobbies (meaning we need to represent a one-to-many relationship), we're more likely to create a separate table just for hobbies. Each row in the table will contain information about one hobby for one user. When we need to retrieve Leslie's hobbies, we'll join the Users table and our new Hobbies table.

Hobbies
IDuser_idhobby
101scrapbooking
111eating waffles
121working

Since MongoDB supports arrays, we can simply add a new field named "hobbies" to our existing document. The array can contain as many or as few hobbies as we need (assuming we don't exceed the 16 megabyte document size limit). When we need to retrieve Leslie's hobbies, we don't need to do an expensive join to bring the data together; we can simply retrieve her document in the Users collection.

Users

1
2
3
4
5
6
7
8
9
{ "_id": 1, "first_name": "Leslie", "last_name": "Yepp", "cell": "8125552344", "city": "Pawnee", "location": [ -86.536632, 39.170344 ], "hobbies": ["scrapbooking", "eating waffles", "working"] }

Let's say we also need to store Leslie's job history.

Just as we did with hobbies, we're likely to create a separate table just for job history information. Each row in the table will contain information about one job for one user.

JobHistory
IDuser_idjob_titleyear_started
201"Deputy Director"2004
211"City Councillor"2012
221"Director, National Parks Service, Midwest Branch"2014

So far in this article, we've used arrays in MongoDB to store geolocation data and a list of Strings. Arrays can contain values of any type, including objects. Let's create a document for each job Leslie has held and store those documents in an array.

Users

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
{ "_id": 1, "first_name": "Leslie", "last_name": "Yepp", "cell": "8125552344", "city": "Pawnee", "location": [ -86.536632, 39.170344 ], "hobbies": ["scrapbooking", "eating waffles", "working"], "jobHistory": [ { "title": "Deputy Director", "yearStarted": 2004 }, { "title": "City Councillor", "yearStarted": 2012 }, { "title": "Director, National Parks Service, Midwest Branch", "yearStarted": 2014 } ] }

#Storing Ron's Information

Now that we've decided how we'll store information about our users in both tables and documents, let's store information about Ron. Ron will have almost all of the same information as Leslie. However, Ron does his best to stay off the grid, so he will not be storing his location in the system.

Ron yells, "Erase all pictures of Ron!" at his tablet

#Skipping Location Data in SQL

Let's begin by examining how we would store Ron's information in the same tables that we used for Leslie's. When using SQL, we are required to input a value for every cell in the table. We will represent Ron's lack of location data with NULL. The problem with using NULL is that it's unclear whether the data does not exist or if the data is unknown, so many people discourage the use of NULL.

Users
IDfirst_namelast_namecellcitylatitudelongitude
1LeslieYepp8125552344Pawnee39.170344-86.536632
2RonSwandaughter8125559347PawneeNULLNULL
Hobbies
IDuser_idhobby
101scrapbooking
111eating waffles
121working
132woodworking
142fishing
JobHistory
IDuser_idjob_titleyear_started
201"Deputy Director"2004
211"City Councillor"2012
221"Director, National Parks Service, Midwest Branch"2014
232"Director"2002
242"CEO, Kinda Good Building Company"2014
252"Superintendent, Pawnee National Park"2018

#Skipping Location Data in MongoDB

In MongoDB, we have the option of representing Ron's lack of location data in two ways: we can omit the location field from the document or we can set location to null. Best practices suggest that we omit the location field to save space. You can choose if you want omitted fields and fields set to null to represent different things in your applications.

Users

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
{ "_id": 2, "first_name": "Ron", "last_name": "Swandaughter", "cell": "8125559347", "city": "Pawnee", "hobbies": ["woodworking", "fishing"], "jobHistory": [ { "title": "Director", "yearStarted": 2002 }, { "title": "CEO, Kinda Good Building Company", "yearStarted": 2014 }, { "title": "Superintendent, Pawnee National Park", "yearStarted": 2018 } ] }

#Storing Lauren's Information

Let's say we are feeling pretty good about our data models and decide to launch our apps using them.

Then we discover we need to store information about a new user: Lauren Burhug. She's a fourth grade student who Ron teaches about government. We need to store a lot of the same information about Lauren as we did with Leslie and Ron: her first name, last name, city, and hobbies. However, Lauren doesn't have a cell phone, location data, or job history. We also discover that we need to store a new piece of information: her school.

#Storing New Information in SQL

Let's begin by storing Lauren's information in the SQL tables as they already exist.

Users
IDfirst_namelast_namecellcitylatitudelongitude
1LeslieYepp8125552344Pawnee39.170344-86.536632
2RonSwandaughter8125559347PawneeNULLNULL
3LaurenBurhugNULLPawneeNULLNULL
Hobbies
IDuser_idhobby
101scrapbooking
111eating waffles
121working
132woodworking
142fishing
153soccer

We have two options for storing information about Lauren's school. We can choose to add a column to the existing Users table, or we can create a new table. Let's say we choose to add a column named "school" to the Users table. Depending on our access rights to the database, we may need to talk to the DBA and convince them to add the field. Most likely, the database will need to be taken down, the "school" column will need to be added, NULL values will be stored in every row in the Users table where a user does not have a school, and the database will need to be brought back up.

#Storing New Information in MongoDB

Let's examine how we can store Lauren's information in MongoDB.

Users

1
2
3
4
5
6
7
8
{ "_id": 3, "first_name": "Lauren", "last_name": "Burhug", "city": "Pawnee", "hobbies": ["soccer"], "school": "Pawnee Elementary" }

As you can see above, we've added a new field named "school" to Lauren's document. We do not need to make any modifications to Leslie's document or Ron's document when we add the new "school" field to Lauren's document. MongoDB has a flexible schema, so every document in a collection does not need to have the same fields.

For those of you with years of experience using SQL databases, you might be starting to panic at the idea of a flexible schema. (I know I started to panic a little when I was introduced to the idea.)

April screams in terror

Don't panic! This flexibility can be hugely valuable as your application's requirements evolve and change.

MongoDB provides schema validation so you can lock down your schema as much or as little as you'd like when you're ready.

#Mapping Terms and Concepts from SQL to MongoDB

Now that we've compared how you model data in SQL and MongoDB, let's be a bit more explicit with the terminology. Let's map terms and concepts from SQL to MongoDB.

#Row ⇒ Document

A row maps roughly to a document.

Row and document

Depending on how you've normalized your data, rows across several tables could map to a single document. In our examples above, we saw that rows for Leslie in the Users, Hobbies, and JobHistory tables mapped to a single document.

#Column ⇒ Field

A column maps roughly to a field. For example, when we modeled Leslie's data, we had a first_name column in the Users table and a first_name field in a User document.

Column and field

#Table ⇒ Collection

A table maps roughly to a collection. Recall that a collection is a group of documents. Continuing with our example above, our Users table maps to our Users collection.

Table and collection

#Database ⇒ Database

The term database is used fairly similarly in both SQL and MongoDB. Groups of tables are stored in SQL databases just as groups of collections are stored in MongoDB databases.

SQL Database and MongoDB Database

#Index ⇒ Index

Indexes provide fairly similar functionality in both SQL and MongoDB. Indexes are data structures that optimize queries. You can think of them like an index that you'd find in the back of a book; indexes tell the database where to look for specific pieces of information. Without an index, all information in a table or collection must be searched.

SQL Index and MongoDB Index

New MongoDB users often forget how much indexes can impact performance. If you have a query that is taking a long time to run, be sure you have an index to support it. For example, if we know we will be commonly searching for users by first or last name, we should add a text index on the first and last name fields.

Remember: indexes slow down write performance but speed up read performance. For more information on indexes including the types of indexes that MongoDB supports, see the MongoDB Manual.

#View ⇒ View

Views are fairly similar in both SQL and MongoDB. In MongoDB, a view is defined by an aggregation pipeline. The results of the view are not stored—they are generated every time the view is queried.

SQL View and MongoDB View

To learn more about views, see the MongoDB Manual.

MongoDB added support for On-Demand Materialized Views in version 4.2. To learn more, see the MongoDB Manual.

#Join ⇒ Embedding

When you use SQL databases, joins are fairly common. You normalize your data to prevent data duplication, and the result is that you commonly need to join information from multiple tables in order to perform a single operation in your application

In MongoDB, we encourage you to model your data differently. Our rule of thumb is Data that is accessed together should be stored together. If you'll be frequently creating, reading, updating, or deleting a chunk of data together, you should probably be storing it together in a document rather than breaking it apart across several documents.

You can use embedding to model data that you may have broken out into separate tables when using SQL. When we modeled Leslie's data for MongoDB earlier, we saw that we embedded her job history in her User document instead of creating a separate JobHistory document.

SQL Join and MongoDB Embedding

For more information, see the MongoDB Manual's pages on modeling one-to-one relationships with embedding and modeling one-to-many relationships with embedding.

#Join ⇒ Database References

As we discussed in the previous section, embedding is a common solution for modeling data in MongoDB that you may have split across one or more tables in a SQL database.

SQL Join and MongoDB Database Reference

However, sometimes embedding does not make sense. Let's say we wanted to store information about our Users' employers like their names, addresses, and phone numbers. The number of Users that could be associated with an employer is unbounded. If we were to embed information about an employer in a User document, the employer data could be replicated hundreds or perhaps thousands of times. Instead, we can create a new Employers collection and create a database reference between User documents and Employer documents.

For more information on modeling one-to-many relationships with database references, see the MongoDB Manual.

#Left Outer Join ⇒ $lookup (Aggregation Pipeline)

When you need to pull all of the information from one table and join it with any matching information in a second table, you can use a left outer join in SQL.

MongoDB has a stage similar to a left outer join that you can use with the aggregation framework.

For those not familiar with the aggregation framework, it allows you to analyze your data in real-time. Using the framework, you can create an aggregation pipeline that consists of one or more stages. Each stage transforms the documents and passes the output to the next stage.

$lookup is an aggregation framework stage that allows you to perform a left outer join to an unsharded collection in the same database.

SQL Left Outer Join and MongoDB $lookup

For more information, see the MongoDB Manual's pages on the aggregation framework and $lookup. MongoDB University has a fantastic free course on the aggregation pipeline that will walk you in detail through using $lookup: M121: The MongoDB Aggregation Framework.

#Recursive Common Table Expressions ⇒ $graphLookup (Aggregation Pipeline)

When you need to query hierarchical data like a company's organization chart in SQL, we can use recursive common table expressions.

MongoDB provides an aggregation framework stage that is similar to recursive common table expressions: $graphLookup. $graphLookup performs a recursive search on a collection.

SQL Recursive Common Table Expressions and $graphLookup

For more information, see the MongoDB Manual's page on $graphLookup and MongoDB University's free course on the aggregation framework.

#Multi-Record ACID Transaction ⇒ Multi-Document ACID Transaction

Finally, let's talk about ACID transactions. Transactions group database operations together so they all succeed or none succeed. In SQL, we call these multi-record ACID transactions. In MongoDB, we call these multi-document ACID transactions.

SQL Transaction and MongoDB Transaction

For more information, see the MongoDB Manual.

#Wrap Up

We've just covered a lot of concepts and terminology. The three term mappings I recommend you internalize as you get started using MongoDB are:

  • Rows map to documents.
  • Columns map to fields.
  • Tables map to collections.

I created the following diagram you can use as a reference in the future as you begin your journey using MongoDB.

Mapping of terms from SQL to MongoDB

Be on the lookout for the next post in this series where we'll discuss the top four reasons you should use MongoDB.

More from this series

SQL to MongoDB
  • Mapping Terms and Concepts from SQL to MongoDB
  • The Top 4 Reasons Why You Should Use MongoDB
  • 3 Things to Know When You Switch from SQL to MongoDB
MongoDB Icon
  • Developer Hub
  • Documentation
  • University
  • Community Forums

© MongoDB, Inc.