
Tutorial: Build a Movie Search Application Using Atlas Search

Published: Sep 25, 2020

  • Realm
  • Atlas
  • Compass
  • ...

By Karen Huaulme


Let me guess. You want to give your application users the ability to find EXACTLY what they are looking for FAST! Who doesn't? Search is a requirement for most applications today. With MongoDB Atlas Search, we have made it easier than ever to integrate simple, fine-grained, and lightning-fast search capabilities into all of your MongoDB applications. To demonstrate just how easy it is, let's build a web application to find our favorite movies.

This tutorial is the first in a four-part series. Over the next few months, we will learn to build out the application featured in our Atlas Search Product Demo.

MongoDB Atlas Search - Product Demo

Armed with only a basic knowledge of HTML and JavaScript, we will build out our application in the following four parts.

The Path to Our Movie Search Application
Part 1: Get up and running with a basic movie search engine that lets us look for movies based on a topic in our MongoDB Atlas movie data.
Part 2: Make it even easier for our users by building more advanced search queries with fuzzy matching and wildcard paths to forgive them for fat fingers and misspellings. We'll introduce custom score modifiers to allow us to influence our movie results.
Part 3: Add autocomplete capabilities to our movie application. We'll also discuss index mappings and analyzers and how to use them to optimize the performance of our application.
Part 4: Wrap up our application by creating filters to query across dates and numbers to fine-tune our movie search results even further. We'll even host the application on Realm, our serverless backend platform, so you can deliver your movie search website anywhere in the world.

Now, without further ado, let's get this show on the road!

It's showtime!

This tutorial will guide you through building a very basic movie search engine on a free tier Atlas cluster. We will set it up in a way that will allow us to scale our search in a highly performant manner as we continue building out new features in our application over the coming weeks. By the end of Part 1, you will have something that looks like this:

Full Text Search Example Application

To accomplish this, here are our tasks for today:

To-Do List

#STEP 1. SPIN UP ATLAS CLUSTER AND LOAD MOVIE DATA

To get started, we will need only an Atlas cluster, which you can get for free, loaded with the Atlas sample dataset. If you do not already have one, sign up to create an Atlas cluster on your preferred cloud provider and region.

Once you have your cluster, you can load the sample dataset by clicking the ellipsis button and then Load Sample Dataset.

Load Sample Dataset

For more detailed information on how to spin up a cluster, configure your IP address, create a user, and load sample data, check out Getting Started with MongoDB Atlas from our documentation.

Now, let's have a closer look at our sample data within the Atlas Data Explorer. In your Atlas UI, click on Collections to examine the movies collection in the new sample_mflix database. This collection has over 23k movie documents with information such as title, plot, and cast. The sample_mflix.movies collection provides the dataset for our application.

"Movies dataset"To-Do List 1

#STEP 2. CREATE A SEARCH INDEX

Since our movie search engine is going to look for movies based on a topic, we will use Atlas Search to query for specific words and phrases in the fullplot field of the documents.

The first thing we need is an Atlas Search index. Click on the tab titled Search Indexes under Collections. Click on the green Create Search Index button. Let's accept the default settings and click Create Index. That's all you need to do to start taking advantage of Search in your MongoDB Atlas data!

Create Full Text Search Index

By accepting the default settings when we created the Search index, we dynamically mapped all the fields in the collection as indicated in the default index configuration:

{
  mappings: {
    "dynamic": true
  }
}

Mapping is simply how we define how the fields on our documents are indexed and stored. If a field's value looks like a string, we'll treat it as a full-text field, similarly for numbers and dates. This suits MongoDB's flexible data model perfectly. As you add new data to your collection and your schema evolves, dynamic mapping accommodates those changes in your schema and adds that new data to the Atlas Search index automatically.
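For contrast, dynamic mapping can be switched off and fields listed explicitly. Here is a minimal sketch, assuming we only wanted the fullplot field indexed as a string. This tutorial keeps the default dynamic mapping, so the definition below is illustrative only and the variable name is made up for the example.

```javascript
// Hypothetical static index definition (not used in this tutorial):
// dynamic mapping is turned off and only `fullplot` is indexed as a string.
const staticIndexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      fullplot: { type: "string" }
    }
  }
};

// Print the definition as JSON, the form the index editor accepts.
console.log(JSON.stringify(staticIndexDefinition, null, 2));
```

With a definition like this, new fields added to your documents would not be searchable until you list them, which is exactly the bookkeeping that dynamic mapping saves us here.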

We'll talk more about mapping and indexes in Part 3 of our series. For right now, we can check off another item from our task list.

To-Do List 2

#STEP 3. WRITE A BASIC AGGREGATION WITH $SEARCH OPERATORS

Search queries take the form of an aggregation pipeline stage. The $search stage performs a search query on the specified field(s) covered by the Search index and must be used as the first stage in the aggregation pipeline.

Let's use the aggregation pipeline builder inside of the Atlas UI to make an aggregation pipeline that makes use of our Atlas Search index. Our basic aggregation will consist of only three stages: $search, $project, and $limit.

You do not have to use the pipeline builder tool for this stage, but I really love the easy-to-use user interface. Plus, the ability to preview the results by stage makes troubleshooting a snap!

Snap!

Navigate to the Aggregation tab in the sample_mflix.movies collection:

Aggregations tab

For the first stage, select the $search aggregation operator to search for the text "werewolves and vampires" in the fullplot field path.

"Search Stage"

You can also add the highlight option, which will return the highlights by adding fields to the result payload that display search terms in their original context, along with the adjacent text content. (More on this later.)

"Search Stage with Highlight"

Your final $search aggregation stage should be:

{
  text: {
    query: "werewolves and vampires",
    path: "fullplot",
  },
  highlight: {
    path: "fullplot"
  }
}

Note the returned movie documents in the preview panel on the right. If no documents are in the panel, double-check the formatting in your aggregation code.
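If you would rather generate this stage in code than type it into the builder, a small helper can assemble it. This is only a sketch: buildSearchStage is a hypothetical name of my own, not part of any MongoDB API, and it covers just the text operator and highlight option used above.

```javascript
// Hypothetical helper: builds the body of a $search stage for a text query.
// `query` is the search phrase, `path` is the field to search, and
// `withHighlight` controls whether the highlight option is included.
function buildSearchStage(query, path, withHighlight = true) {
  const stage = { text: { query, path } };
  if (withHighlight) {
    stage.highlight = { path };
  }
  return stage;
}

const stage = buildSearchStage("werewolves and vampires", "fullplot");
console.log(JSON.stringify(stage));
// {"text":{"query":"werewolves and vampires","path":"fullplot"},"highlight":{"path":"fullplot"}}
```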

#Stage 2: $project

"Project dialog"

Add a $project stage to your pipeline to get back only the fields we will use in our movie search application. We also use the $meta operator to surface each document's searchScore and searchHighlights in the result set.

{
  title: 1,
  year: 1,
  fullplot: 1,
  _id: 0,
  score: {
    $meta: 'searchScore'
  },
  highlight: {
    $meta: 'searchHighlights'
  }
}

Let's break down the individual pieces in this stage further:

SCORE: The "$meta": "searchScore" contains the assigned score for the document based on relevance. This signifies how well this movie's fullplot field matches the query terms "werewolves and vampires" above.

Note that by scrolling in the right preview panel, the movie documents are returned with the score in descending order. This means we get the best matched movies first.

HIGHLIGHT: The "$meta": "searchHighlights" contains the highlighted results.

Because searchHighlights and searchScore are not part of the original document, it is necessary to use a $project pipeline stage to add them to the query output.

Now, open a document's highlight array to show the data objects with text values and types.

title:"The Mortal Instruments: City of Bones"
fullplot:"Set in contemporary New York City, a seemingly ordinary teenager, Clar..."
year:2013
score:6.849891185760498
highlight:Array
  0:Object
    path:"fullplot"
    texts:Array
      0:Object
        value:"After the disappearance of her mother, Clary must join forces with a g..."
        type:"text"
      1:Object
        value:"vampires"
        type:"hit"
      2:Object
      3:Object
      4:Object
      5:Object
      6:Object
    score:3.556248188018799

highlight.texts.value - text from the fullplot field returning a match

highlight.texts.type - either a hit or a text
  • hit is a match for the query
  • text is the surrounding text context adjacent to the matching string

We will use these later in our application code.
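To get a feel for this structure, here is a rough sketch that pulls every matched term out of one result document. The sample object copies only the first two texts entries from the document above, and extractHits is a hypothetical helper name, so treat this as an illustration and not as the application code.

```javascript
// Sample shaped like the highlight array shown above (only two text entries kept).
const sampleDoc = {
  title: "The Mortal Instruments: City of Bones",
  highlight: [
    {
      path: "fullplot",
      texts: [
        { value: "After the disappearance of her mother, Clary must join forces with a g...", type: "text" },
        { value: "vampires", type: "hit" }
      ],
      score: 3.556248188018799
    }
  ]
};

// Hypothetical helper: collects the value of every "hit" entry in a document.
// Documents without a highlight array simply yield an empty list.
function extractHits(doc) {
  const hits = [];
  for (const h of doc.highlight || []) {
    for (const t of h.texts || []) {
      if (t.type === "hit") {
        hits.push(t.value);
      }
    }
  }
  return hits;
}

console.log(extractHits(sampleDoc)); // [ 'vampires' ]
```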

#Stage 3: $limit

"limit"

Remember that the results are returned with the scores in descending order, so $limit: 10 returns the 10 movie documents most relevant to your search query. $limit matters in Search because speed matters. Without $limit: 10, we would get back every matching document with its score, potentially thousands of them. We don't need that.

Finally, if you see results in the right preview panel, your aggregation pipeline is working properly! Let's grab that aggregation code with the Export Pipeline to Language feature by clicking the button in the top toolbar.

Export ButtonExport to Pipeline

Your final aggregation code will be this:

[
  {
    $search: {
      text: {
        query: "werewolves and vampires",
        path: "fullplot"
      },
      highlight: {
        path: "fullplot"
      }
    }
  },
  {
    $project: {
      title: 1,
      _id: 0,
      year: 1,
      fullplot: 1,
      score: { $meta: 'searchScore' },
      highlight: { $meta: 'searchHighlights' }
    }
  },
  { $limit: 10 }
]

This small snippet of code powers our movie search engine!
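The exported array is plain JavaScript, so it can be handed to anything that exposes an aggregate(...).toArray() call. Here is a sketch under that assumption. The function names and the stub collection are hypothetical, and the stub exists only so the snippet runs without a cluster. It returns the stage names and not movie documents.

```javascript
// The three-stage pipeline from above, with the search term passed in.
function buildPipeline(searchTerm) {
  return [
    {
      $search: {
        text: { query: searchTerm, path: "fullplot" },
        highlight: { path: "fullplot" }
      }
    },
    {
      $project: {
        title: 1, _id: 0, year: 1, fullplot: 1,
        score: { $meta: "searchScore" },
        highlight: { $meta: "searchHighlights" }
      }
    },
    { $limit: 10 }
  ];
}

// Runs the pipeline against any object exposing aggregate(...).toArray().
async function searchMovies(collection, searchTerm) {
  return collection.aggregate(buildPipeline(searchTerm)).toArray();
}

// Stand-in collection (an assumption for this sketch): echoes the stage names.
const stubCollection = {
  aggregate(pipeline) {
    return { toArray: async () => pipeline.map((s) => Object.keys(s)[0]) };
  }
};

searchMovies(stubCollection, "werewolves and vampires").then((r) => console.log(r));
// logs the stage names in order: [ '$search', '$project', '$limit' ]
```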

To-Do List 3

#STEP 4. CREATE A REST API

Now that we have the heart of our movie search engine in the form of an aggregation pipeline, how will we use it in an application? There are lots of ways to do this, but I found the easiest was to simply create a RESTful API to expose this data - and for that, I leveraged MongoDB Realm's HTTP Service from right inside of Atlas.

Realm is MongoDB's serverless platform where functions written in JavaScript automatically scale to meet current demand. To create a Realm application, return to your Atlas UI and click Realm. Then click the green Start a New Realm App button.

Name your Realm application MovieSearchApp and make sure to link to your cluster. All other default settings are fine.

Create a new Realm Application

Now click the 3rd Party Services menu on the left and then Add a Service. Select the HTTP service and name it movies:

Add a Service

Click the green Add a Service button, and you'll be directed to Add Incoming Webhook.

Once in the Settings tab, name your webhook getMoviesBasic. Enable Respond with Result, and set the HTTP Method to GET. To make things simple, let's just run the webhook as the System and skip validation with No Additional Authorization. Make sure to click the Review and Deploy button at the top along the way.

Webhook

In this service function editor, replace the example code with the following:

exports = function(payload) {
  const movies = context.services.get("mongodb-atlas").db("sample_mflix").collection("movies");
  let arg = payload.query.arg;
  return movies.aggregate(<<PASTE AGGREGATION PIPELINE HERE>>).toArray();
};

Let's break down some of these components. MongoDB Realm interacts with your Atlas movies collection through the global context variable. In the service function, we use that context variable to access the sample_mflix.movies collection in your Atlas cluster. We'll reference this collection through the const variable movies:

const movies = context.services.get("mongodb-atlas").db("sample_mflix").collection("movies");

We capture the query argument from the payload:

let arg = payload.query.arg;

Finally, we run the aggregation on the collection and return the results. Paste the aggregation you copied from the aggregation pipeline builder in place of the placeholder below:

return movies.aggregate(<<PASTE AGGREGATION PIPELINE HERE>>).toArray();

Finally, after pasting the aggregation code, change the terms "werewolves and vampires" to the generic arg to match the function's payload query argument - otherwise our movie search engine capabilities will be extremely limited.

Query code

Your final code in the function editor will be:

exports = function(payload) {
  const movies = context.services.get("mongodb-atlas").db("sample_mflix").collection("movies");
  let arg = payload.query.arg;
  return movies.aggregate([
    {
      $search: {
        text: {
          query: arg,
          path: 'fullplot'
        },
        highlight: {
          path: 'fullplot'
        }
      }
    },
    {
      $project: {
        title: 1,
        _id: 0,
        year: 1,
        fullplot: 1,
        score: { $meta: 'searchScore' },
        highlight: { $meta: 'searchHighlights' }
      }
    },
    { $limit: 10 }
  ]).toArray();
};
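The function above assumes that payload.query.arg is always present. If you want to guard against a missing or blank argument, something along these lines might help. This is a sketch that assumes payload.query is a plain object of string query parameters, as used above. getSearchTerm is a hypothetical helper name.

```javascript
// Hypothetical guard for the webhook: returns the trimmed search term,
// or null when the `arg` query parameter is missing, blank, or not a string.
function getSearchTerm(payload) {
  const raw = payload && payload.query ? payload.query.arg : undefined;
  if (typeof raw !== "string") {
    return null;
  }
  const term = raw.trim();
  return term.length > 0 ? term : null;
}

console.log(getSearchTerm({ query: { arg: " werewolves and vampires " } })); // werewolves and vampires
console.log(getSearchTerm({ query: {} })); // null
```

Inside the webhook, you could call this first and return an empty array when it yields null, so that the aggregation only runs for real searches.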

Now you can test in the Console below the editor by changing the argument from arg1: "hello" to arg: "werewolves and vampires".

Please make sure to change BOTH the field name arg1 to arg, as well as the string value "hello" to "werewolves and vampires" - or it won't work.

Frustrated DeveloperTest console

Click Run to verify the result:

Results of query

If this is working, congrats! We are almost done! Make sure to SAVE and deploy the service by clicking REVIEW & DEPLOY CHANGES at the top of the screen.

#Use the API

The beauty of a REST API is that it can be called from just about anywhere. Let's execute it in our browser. However, if you have tools like Postman installed, feel free to try that as well.

Switch back to the Settings of your getMoviesBasic function, and you'll notice a Webhook URL has been generated.

Webhook URL

Click the COPY button and paste the URL into your browser. Then append the following to the end of your URL: ?arg="werewolves and vampires"

Query output

If you receive an output like what we have above, congratulations! You have successfully created a movie search API! 🙌 💪

To-Do List 4Almost there

#STEP 5. FINALLY! THE FRONT-END

Now that we have this endpoint, it takes a single call from the front-end application using the Fetch API to retrieve this data. Download the following index.html file and open it in your browser. You will see a simple search bar:

Search Bar

Entering data in the search bar will bring you movie search results because the application is currently pointing to an existing API.

Now open the HTML file with your favorite text editor and familiarize yourself with the contents. You'll note this contains a very simple container and two JavaScript functions:

  • Line 81 - userAction() will execute when the user enters a search. If there is valid input in the search box and no errors, we will call the buildMovieList() function.
  • Line 125 - buildMovieList() is a helper function for userAction().

The buildMovieList() function will build out the list of movies along with their scores and highlights from the fullplot field. Notice in line 146 that if highlight.texts.type === "hit", we highlight the highlight.texts.value by wrapping it in a <span> tag with a style attribute.

if (movies[i].highlight[j].texts[k].type === "hit") {
  txt += `<b><span style="background-color: #FFFF00"> ${movies[i].highlight[j].texts[k].value} </span></b>`;
} else {
  txt += movies[i].highlight[j].texts[k].value;
}
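One thing to keep in mind is that the snippet above drops text from the database straight into HTML. Below is a possible variant, offered as a sketch and not as the code in index.html, that escapes each value first. It also swaps the styled span for a mark element, which is my own substitution. escapeHtml and renderTexts are hypothetical names.

```javascript
// Escapes characters that have special meaning in HTML.
function escapeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical variant of the loop body above: wraps each hit in <mark>
// and passes the surrounding text through, escaping both.
function renderTexts(texts) {
  return texts
    .map((t) => (t.type === "hit" ? `<mark>${escapeHtml(t.value)}</mark>` : escapeHtml(t.value)))
    .join("");
}

console.log(renderTexts([
  { value: "join forces with <b>", type: "text" },
  { value: "vampires", type: "hit" }
]));
// join forces with &lt;b&gt;<mark>vampires</mark>
```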

#Modify the Front-End Code to Use Your API

In the userAction() function, notice on line 88 that the webhook_url is already set to a RESTful API I created in my own Movie Search application.

let webhook_url = "https://webhooks.mongodb-realm.com/api/client/v2.0/app/ftsdemo-zcyez/service/movies-basic-FTS/incoming_webhook/movies-basic-FTS";

We capture the input from the search form field in line 82 and set it equal to searchString. In this application, we append that searchString input to the webhook_url

let url = webhook_url + "?arg=" + searchString;

before calling it in the fetch API in line 92.
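Plain concatenation works for simple searches, but characters such as & or # in the search box would change the meaning of the URL. A small sketch of one way around that, using the standard encodeURIComponent function. buildSearchUrl is a hypothetical helper name and the example.com address is a placeholder, not a real webhook.

```javascript
// Hypothetical helper: appends the search string as an encoded `arg` parameter.
function buildSearchUrl(webhookUrl, searchString) {
  return webhookUrl + "?arg=" + encodeURIComponent(searchString);
}

// example.com is a placeholder used only for this illustration.
console.log(buildSearchUrl("https://example.com/incoming_webhook/getMoviesBasic", "werewolves & vampires"));
// https://example.com/incoming_webhook/getMoviesBasic?arg=werewolves%20%26%20vampires
```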

To make this application fully your own, simply replace the existing webhook_url value on line 88 with your own API from the getMoviesBasic Realm HTTP Service webhook you just created. 🤞 Now save these changes, and open the index.html file once more in your browser, et voilà! You have just built your movie search engine using Atlas Search. 😎

Pass the popcorn! 🍿 What kind of movie do you want to watch?!

Screencast demo

#That's a Wrap!

You have just seen how easy it is to build simple, powerful search into an application with MongoDB Atlas Search. In our next tutorial, we continue by building more advanced search queries into our movie application with fuzzy matching and wildcard paths to forgive fat fingers and typos. We'll even introduce custom score modifiers to allow us to shape our search results. Check out our $search documentation for other possibilities.

I will find you

Harnessing the power of Apache Lucene for efficient search algorithms, static and dynamic field mapping for flexible, scalable indexing, all while using the same MongoDB Query Language (MQL) you already know and love, spoken in our very best Liam Neeson impression - MongoDB now has a very particular set of skills. Skills we have acquired over a very long career. Skills that make MongoDB a DREAM for developers like you.

Looking forward to seeing you in Part 2. Until then, if you have any questions or want to connect with other MongoDB developers, check out our community forums. Come to learn. Stay to connect.

Related

GitHub Repo: MovieSearchApp
Video: In [Atlas] Search of Your Even Better FIFA Dream Team
Article: Building an Autocomplete Form Element with Atlas Search and JavaScript

© MongoDB, Inc.