A Free GraphQL API for Johns Hopkins University COVID-19 Dataset

Published: Sep 25, 2020

  • Atlas
  • Realm
  • API
  • ...

By Maxime Beugnet

#TL;DR

You can retrieve an access token using the API like so:

curl -X POST 'https://realm.mongodb.com/api/client/v2.0/app/covid-19-qppza/auth/providers/anon-user/login'

Then you can read the GraphQL API documentation and start running queries like this one using the access token you just retrieved:

curl 'https://realm.mongodb.com/api/client/v2.0/app/covid-19-qppza/graphql' \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --header 'Content-Type: application/json' \
  --data-raw '{"query":"query {countries_summary {_id combined_names confirmed country country_codes country_iso2s country_iso3s date deaths population recovered states uids}}"}'

#Introduction

Recently, we built the MongoDB COVID-19 Open Data project using the dataset from Johns Hopkins University (JHU).

It's a great dataset for educational purposes and for pet projects. The MongoDB Atlas cluster is freely accessible with the user readonly and the password readonly, using this connection string:

mongodb+srv://readonly:readonly@covid-19.hip2i.mongodb.net/covid19

This cluster contains two databases: covid19 and covid19jhu. Only the first one is exposed through this GraphQL API. The second one contains the unprocessed raw data from JHU's CSV files. Learn about the databases and collections in the dedicated blog post.

You can use this cluster to build your application, but I also set up a GraphQL API using MongoDB Realm to expose this data for you.

In this blog post, I will first show you how to access our GraphQL endpoint securely. Then we will look at the documentation together and build a variety of GraphQL queries to access all sorts of information in our dataset, all against a single API endpoint. Learning how to use filters and request only specific fields will help optimize the performance of your applications by bringing you exactly the data you want: nothing more, nothing less.

#Prerequisites

  • Command line and cURL - which I'm using here for the sake of simplicity.
  • Or any other tool that handles HTTP queries. I also ran the following queries with Postman, and they work great.

#COVID-19 GraphQL API

#Get an Access Token

I used MongoDB Realm to create this GraphQL API. Since MongoDB Realm is secure by default, only authenticated queries can be made. Yet since I want to keep this service as open as possible, I simply used the Anonymous Authentication offered by MongoDB Realm.

So now, all you need to retrieve an access token is the MongoDB Realm Application ID — or APP_ID, for short — of my MongoDB Realm application:

covid-19-qppza

And the API endpoint that authenticates HTTP client requests, which is:

curl -X POST 'https://realm.mongodb.com/api/client/v2.0/app/<APP_ID>/auth/providers/anon-user/login'

Inserting the APP_ID into the URL gives you:

curl -X POST 'https://realm.mongodb.com/api/client/v2.0/app/covid-19-qppza/auth/providers/anon-user/login'

If you execute this query in your favorite shell, you will receive a JSON answer that looks like this one:

{
  "access_token":"eyJhbG......6dbnY",
  "refresh_token":"eyJhbG......uKpoM",
  "user_id":"5f692c44033b9e7f1554d475",
  "device_id":"000000000000000000000000"
}
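If you would rather script this step than copy the token by hand, the same call can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names login_url, extract_access_token, and fetch_access_token are made up for illustration, while the URL and the access_token field come from the example above.

```python
import json
import urllib.request

APP_ID = "covid-19-qppza"  # the Realm App ID given above


def login_url(app_id: str) -> str:
    """Build the anonymous-login URL for a Realm app (hypothetical helper)."""
    return (
        "https://realm.mongodb.com/api/client/v2.0/app/"
        f"{app_id}/auth/providers/anon-user/login"
    )


def extract_access_token(body: str) -> str:
    """Pull the access token out of the JSON login response."""
    return json.loads(body)["access_token"]


def fetch_access_token(app_id: str = APP_ID) -> str:
    """POST to the login endpoint and return a fresh access token."""
    request = urllib.request.Request(login_url(app_id), method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_access_token(response.read().decode("utf-8"))
```

Calling fetch_access_token() performs the same POST as the cURL command, so it needs network access to the service.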

Access tokens expire 30 minutes after MongoDB Realm grants them. When an access token expires, you can request a new one using the same API, or you can get a new one using the refresh token. In practice, it is often easier to just request a new one.
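For a long-running script, one way to deal with the 30-minute lifetime is to cache the token and request a new one shortly before it expires. The sketch below is only an illustration of that idea: the TokenCache class, the injected fetch_token callable, and the one-minute safety margin are all assumptions of mine, not part of MongoDB Realm.

```python
import time
from typing import Callable, Optional

TOKEN_LIFETIME_SECONDS = 30 * 60  # access tokens expire after 30 minutes
SAFETY_MARGIN_SECONDS = 60        # assumption: renew one minute early


class TokenCache:
    """Reuse an access token until it is close to expiring."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # fetch_token is any function that returns a fresh token,
        # for example one that POSTs to the anonymous login URL.
        self._fetch_token = fetch_token
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, requesting a new one when it is stale."""
        age = self._clock() - self._fetched_at
        if self._token is None or age >= TOKEN_LIFETIME_SECONDS - SAFETY_MARGIN_SECONDS:
            self._token = self._fetch_token()
            self._fetched_at = self._clock()
        return self._token
```

Injecting the clock keeps the class easy to test without waiting half an hour; in real use you would leave the default and pass only the function that requests a token.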

#Query the GraphQL API

Once you have the token, you can start browsing the GraphQL API documentation that I generated with GraphDoc and build your first query.

Five collections are available in this GraphQL API, and you can learn more about each one of them in our documentation. Each has a "singular" query, which retrieves a single document, and a "plural" query, which retrieves a list of documents. The exception is the metadata collection: it contains only a single document, so it offers no "plural" query.

You can see all the possible queries in the "query" page in the documentation, but I have also summarized them in the following table:

GraphQL API: collections, queries (singular / plural), and fields available

metadata
  • Query: metadatum
  • Fields: _id, countries, states, states_us, counties, iso3s, uids, first_date, last_date

countries_summary
  • Queries: countries_summary / countries_summarys
  • Fields: _id, combined_names, confirmed, country, country_codes, country_iso2s, country_iso3s, date, deaths, population, recovered, states, uids

global
  • Queries: global / globals
  • Fields: _id, combined_name, confirmed, country, country_code, country_iso2, country_iso3, date, deaths, loc { type coordinates }, population, recovered, state, uid

global_and_us
  • Queries: global_and_u / global_and_us
  • Fields: _id, combined_name, confirmed, country, country_code, country_iso2, country_iso3, county, date, deaths, fips, loc { type coordinates }, population, recovered, uid

us_only
  • Queries: us_only / us_onlys
  • Fields: _id, combined_name, confirmed, country, country_code, country_iso2, country_iso3, county, date, deaths, fips, loc { type coordinates }, population, state, uid

Find all the details in the GraphQL documentation. To explore this data and the flexibility of GraphQL, let's build out three example queries.

#Query 1. The Metadata Collection

Let's first query the metadata collection. This collection contains a single document listing all the distinct values (obtained with the MongoDB distinct function) for the major fields.

Here is the GraphQL query:

query {
  metadatum {
    _id
    countries
    states
    states_us
    counties
    iso3s
    uids
    first_date
    last_date
  }
}

Now let's build an HTTP query with it. Don't forget to replace the ACCESS_TOKEN in the query with your own valid token.

curl 'https://realm.mongodb.com/api/client/v2.0/app/covid-19-qppza/graphql' \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --header 'Content-Type: application/json' \
  --data-raw '{"query": "query { metadatum { _id countries states states_us counties iso3s uids first_date last_date } }" }'
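The same HTTP request can also be prepared from a script. Here is a minimal Python sketch using only the standard library; build_graphql_request and run_query are hypothetical helper names, and the endpoint, headers, and query are the ones used in the cURL command above.

```python
import json
import urllib.request

GRAPHQL_URL = "https://realm.mongodb.com/api/client/v2.0/app/covid-19-qppza/graphql"

METADATA_QUERY = """
query {
  metadatum {
    _id countries states states_us counties iso3s uids first_date last_date
  }
}
"""


def build_graphql_request(access_token: str, query: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST request that the cURL command makes."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GRAPHQL_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(access_token: str, query: str) -> dict:
    """Send the query and return the decoded JSON response."""
    request = build_graphql_request(access_token, query)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a valid token, run_query(token, METADATA_QUERY) should return the same JSON document that the cURL command prints. Keeping the request construction separate from the network call makes it easy to inspect the headers and body before sending anything.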

This returns a JSON document that will help you populate your filters for the other queries:

{
  "data": {
    "metadatum": {
      "_id":"metadata",
      "countries": [ "Afghanistan", "Albania", "Algeria","..."],
      "states":["Alabama","Alaska","Alberta","American Samoa","..."],
      "states_us":["Alabama","Alaska","American Samoa","..."],
      "counties":["Abbeville","Acadia","Accomack","..."],
      "iso3s":["ABW","AFG","AGO","..."],
      "uids":[4,8,12,16,...],
      "first_date":"2020-01-22T00:00:00Z",
      "last_date":"2020-09-23T00:00:00Z"
    }
  }
}

#Query 2. The countries_summary Collection

Now let's refine our data query even further. We want to see how France has been trending over the last week, so we will use a query filter:

{country: "France", date_gte: "2020-09-16T00:00:00Z"}

We will also sort the results by date in descending order, with the most recent dates first. Remember that with GraphQL, we can request as many or as few data fields as we want for the client. In this example, we'll only ask for the number of confirmed cases, deaths, and recoveries, along with the date. The final query with this filter and those specific fields is:

query {
  countries_summarys(query: {country: "France", date_gte: "2020-09-16T00:00:00Z"}, sortBy: DATE_DESC) {
    confirmed
    date
    deaths
    recovered
  }
}

Or with cURL:

curl 'https://realm.mongodb.com/api/client/v2.0/app/covid-19-qppza/graphql' \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --header 'Content-Type: application/json' \
  --data-raw '{"query": "query { countries_summarys(query: {country: \"France\", date_gte: \"2020-09-16T00:00:00Z\"}, sortBy: DATE_DESC) { confirmed date deaths recovered } }" }'
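Escaping the inner quotes by hand, as in the cURL command above, gets error-prone once the filter values change. The following Python sketch shows one possible way to compute the start date and let json.dumps handle the quoting. The function names are hypothetical, and the seven-day window is simply my reading of "the last week".

```python
import json
from datetime import date, timedelta


def last_week_filter_date(today: date, days: int = 7) -> str:
    """Format the start of the window the way the API's dates are written."""
    return (today - timedelta(days=days)).strftime("%Y-%m-%dT00:00:00Z")


def countries_summary_query(country: str, since: str) -> str:
    """Build the plural countries_summarys query with a country and date filter."""
    # json.dumps yields a double-quoted, escaped string, which is also a valid
    # GraphQL string literal for ordinary text such as country names.
    return (
        "query { countries_summarys("
        f"query: {{country: {json.dumps(country)}, date_gte: {json.dumps(since)}}}, "
        "sortBy: DATE_DESC) { confirmed date deaths recovered } }"
    )


def to_request_body(query: str) -> str:
    """Wrap the query in the JSON envelope passed to --data-raw."""
    return json.dumps({"query": query})


since = last_week_filter_date(date(2020, 9, 23))
print(to_request_body(countries_summary_query("France", since)))
```

The printed body contains the same query as the --data-raw argument above, with the quotes around France and the date already escaped.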

Which gives me:

{
  "data": {
    "countries_summarys": [
      {
        "confirmed": 507150,
        "date": "2020-09-22T00:00:00Z",
        "deaths": 31426,
        "recovered": 94961
      },
      {
        "confirmed": 496851,
        "date": "2020-09-21T00:00:00Z",
        "deaths": 31346,
        "recovered": 94289
      },
      {
        "confirmed": 467614,
        "date": "2020-09-20T00:00:00Z",
        "deaths": 31257,
        "recovered": 93586
      },
      {
        "confirmed": 467614,
        "date": "2020-09-19T00:00:00Z",
        "deaths": 31257,
        "recovered": 93586
      },
      {
        "confirmed": 467421,
        "date": "2020-09-18T00:00:00Z",
        "deaths": 31257,
        "recovered": 92700
      },
      {
        "confirmed": 454266,
        "date": "2020-09-17T00:00:00Z",
        "deaths": 31103,
        "recovered": 91765
      },
      {
        "confirmed": 443869,
        "date": "2020-09-16T00:00:00Z",
        "deaths": 31056,
        "recovered": 91293
      }
    ]
  }
}
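To read a trend out of a response like this one, you can subtract each day's confirmed count from the following day's. The Python sketch below assumes the confirmed field is a cumulative total, as in the JHU time series, and that the rows arrive newest first because of sortBy: DATE_DESC. The function name daily_new_cases is made up for this example.

```python
from typing import Dict, List


def daily_new_cases(rows: List[dict]) -> Dict[str, int]:
    """Map each date to the rise in confirmed cases since the previous day.

    Expects rows sorted newest first (sortBy: DATE_DESC), as returned above.
    """
    increases = {}
    for newer, older in zip(rows, rows[1:]):
        # Keep only the YYYY-MM-DD part of the timestamp as the key.
        increases[newer["date"][:10]] = newer["confirmed"] - older["confirmed"]
    return increases


# The three most recent rows from the response above.
sample = [
    {"confirmed": 507150, "date": "2020-09-22T00:00:00Z"},
    {"confirmed": 496851, "date": "2020-09-21T00:00:00Z"},
    {"confirmed": 467614, "date": "2020-09-20T00:00:00Z"},
]
print(daily_new_cases(sample))
# {'2020-09-22': 10299, '2020-09-21': 29237}
```

The oldest row has no previous day to compare against, so it does not appear in the result.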

#Query 3. The global_and_us Collection

Finally, let's find the three counties in the USA with the greatest number of confirmed cases:

query {
  global_and_us(query: {country: "US", date: "2020-09-22T00:00:00Z"}, sortBy: CONFIRMED_DESC, limit: 3) {
    confirmed
    deaths
    state
    county
  }
}

Or with cURL:

curl 'https://realm.mongodb.com/api/client/v2.0/app/covid-19-qppza/graphql' \
  --header 'Authorization: Bearer ACCESS_TOKEN' \
  --header 'Content-Type: application/json' \
  --data-raw '{"query": "query { global_and_us(query: {country: \"US\", date: \"2020-09-22T00:00:00Z\"}, sortBy: CONFIRMED_DESC, limit: 3) { confirmed deaths state county } }" }'

Results:

{
  "data": {
    "global_and_us": [
      {
        "confirmed": 262133,
        "county": "Los Angeles",
        "deaths": 6401,
        "state": "California"
      },
      {
        "confirmed": 167515,
        "county": "Miami-Dade",
        "deaths": 3085,
        "state": "Florida"
      },
      {
        "confirmed": 140314,
        "county": "Maricopa",
        "deaths": 3275,
        "state": "Arizona"
      }
    ]
  }
}

#But How Did I Build This GraphQL API?

It was simple: I used the MongoDB Realm GraphQL API, which took me just a few clicks.

If you want to build this very same service using this dataset, check out our blog which explains the dataset's content and how you can grab it. Then, have a look at Nic's blog post which explains how to set up a GraphQL API using MongoDB Realm.

Want to learn even more about MongoDB and GraphQL? Then read the blog post GraphQL: The Easy Way to Do the Hard Stuff from Karen and Brian.

#Wrap-Up

MongoDB made setting up a GraphQL API really easy, and this GraphQL API made querying our COVID-19 dataset to gain insight even easier. We update this data every hour, and we hope you will enjoy using it to explore and learn.

Are you trying to help solve this pandemic in any way? Remember that if you are trying to build an application that helps to detect, understand, or stop the spread of the COVID-19 virus, we have a FREE MongoDB Atlas credit program that can help you scale and hopefully solve this global pandemic.

I truly hope you will be able to build something amazing with this GraphQL API. Even if it won't save the world from the COVID-19 pandemic, I hope it will be a great source of motivation and training for your next pet project.

Send me a tweet or ping me in our Community Forum with your project using this API. I will definitely check it out!
