Is there a faster way to access the data than the cursor?

I am accessing data from MongoDB using the C++ driver, and have written the following code to extract data from the database.

From testing, it takes around 20 seconds to extract 100 documents with 5 values each.
I achieve this by running 4 threads at once. This, however, is too slow for my purposes. Is there a quicker way to access the data than the cursor? The cursor seems to be the bottleneck at the moment. Please note that the function records.extractRecord is simply a wrapper around extracting the data from the BSON types.

// Walk every element of the current document and dispatch on its BSON type.
for (bsoncxx::document::element ele : doc) {
    bsoncxx::stdx::string_view field_key{ ele.key() };
    savelocation = -1;
    switch (ele.type()) {
    case bsoncxx::type::k_double:
        records.extractRecord(ele.get_double(), field_key.to_string(), threadNumber, recordCount);
        //extractedvalue = std::to_string(ele.get_double());
        break;
    /*case bsoncxx::type::k_utf8:
        savelocation = records.extractRecord(boost::string_view(ele.get_utf8()).to_string(), field_key.to_string(), threadNumber, recordCount);
        extractedvalue = boost::string_view(ele.get_utf8()).to_string();
        break;*/
    case bsoncxx::type::k_date:
        records.extractRecord((int64_t)ele.get_date(), field_key.to_string(), threadNumber, recordCount);
        //extractedvalue = std::to_string((int64_t)ele.get_date());
        break;
    case bsoncxx::type::k_int32:
        records.extractRecord((int32_t)ele.get_int32(), field_key.to_string(), threadNumber, recordCount);
        //extractedvalue = std::to_string((int32_t)ele.get_int32());
        break;
    case bsoncxx::type::k_int64:
        records.extractRecord((int64_t)ele.get_int64(), field_key.to_string(), threadNumber, recordCount);
        extractedvalue = std::to_string((int64_t)ele.get_int64());
        break;
    default:
        // All other BSON types are skipped.
        break;
    }
}
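
A side note on the loop above: every call to field_key.to_string() builds a temporary std::string for each value, which for longer keys can mean a heap allocation. Whether that matters here has not been measured. As a rough sketch only, the key could instead be matched in place against a fixed list of expected field names and passed on as a column index. ColumnIndex is a hypothetical helper, not part of the driver, and an extractRecord overload taking an index is assumed.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: maps known field names to fixed column indexes.
class ColumnIndex {
public:
    explicit ColumnIndex(std::vector<std::string> names) : names_(std::move(names)) {}

    // Returns the column for the key bytes [key, key + len), or -1 if the
    // key is not one of the known names. The comparison is done in place,
    // so no temporary std::string is created per lookup.
    int find(const char* key, std::size_t len) const {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i].size() == len &&
                (len == 0 || std::memcmp(names_[i].data(), key, len) == 0)) {
                return static_cast<int>(i);
            }
        }
        return -1;
    }

private:
    std::vector<std::string> names_;
};
```

Inside the loop this would be called as columns.find(field_key.data(), field_key.size()), where columns is a ColumnIndex built once per thread before iterating. A linear scan is used because the documents have only about 5 fields each; a larger schema would probably want a hash lookup.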

In my opinion you are far too quick to claim that the bottleneck is the cursor. There are a thousand things that can influence performance. You supplied only a few numbers: 20 seconds, 100 documents, 5 values and 4 threads. We know nothing about the capacity of the servers, the size of the data sets, the type of documents, whether there are indexes, …

Sorry, I made a mistake in my original post.
I meant that I am accessing 1,000,000 documents, and the fastest I have been able to access them is around 14 seconds.

I am not using an index, but the results are not sorted.
I understand that there are a lot of variables that can affect performance, but my question is: is there a faster way to access the data than the cursor, or is the cursor the fastest way to access it?

I thought the conversion between BSON types and C++ types might be slowing the code down, so I ran the code without calling my extractRecord function. All the code did was iterate through the cursor and check each element's type. This took 5 seconds.
So it looks like it is taking around 10 seconds to convert 5,000,000 values from BSON types to standard C++ types,
and 5 seconds to run through the cursor.
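
To narrow this down further, each phase can be timed on its own and divided by the number of items it handled. With the figures above, 10 seconds over 5,000,000 values is about 2 microseconds per value, and 5 seconds over 1,000,000 documents is about 5 microseconds per document. Below is a minimal sketch using only the standard library; time_ms and ns_per_item are hypothetical helper names, not part of the MongoDB driver.

```cpp
#include <chrono>
#include <cstdint>

// Runs `work` once and returns the elapsed wall-clock time in milliseconds.
// steady_clock is used because it never jumps backwards.
template <typename Work>
double time_ms(Work work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Converts a total time in milliseconds into an average cost per item
// in nanoseconds. Returns 0 when there are no items.
double ns_per_item(double total_ms, std::uint64_t items) {
    if (items == 0) {
        return 0.0;
    }
    return total_ms * 1e6 / static_cast<double>(items);
}
```

The cursor pass and the full pass would each be wrapped in a lambda and handed to time_ms, for example time_ms([&] { /* iterate the cursor, type checks only */ }). Calling ns_per_item(10000.0, 5000000) gives 2000, the 2 microseconds per value mentioned above. That is far more than a plain numeric conversion should need, which suggests looking inside extractRecord (allocation, locking between the 4 threads) rather than at the BSON accessors themselves; this is a guess until measured.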