Pymongo lib import is slow

Hi
We are using pymongo directly in our AWS Lambda functions. We saw that our cold starts are quite slow (1.2-2 s). During profiling we saw that the import of pymongo ("import pymongo" or "from pymongo import MongoClient") takes 300 ms on our LOCAL(!) system. That's too slow for just the import. When we remove the pymongo import, the cold starts decrease to 400-800 ms.
We implemented a MongoDB proxy now and it's working. But it adds 40 ms (network) to every DB request, and that's not optimal.
Are you aware of this issue? Can we improve our imports somehow? I could not find anything online so far.
We install it with pip install pymongo==4.6.2. We used a clean venv.
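
For anyone who wants to reproduce the measurement, here is a minimal sketch of timing a first import from inside Python. The helper name time_import is made up for illustration. The result depends on the machine and on the filesystem cache, so treat it as a rough check rather than a benchmark.

```python
import importlib
import sys
import time


def time_import(module_name: str) -> float:
    """Return the wall-clock seconds taken by the first import of module_name."""
    if module_name in sys.modules:
        # A cached module would return almost instantly and hide the real cost.
        raise RuntimeError(f"{module_name} is already imported; use a fresh interpreter")
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the module to measure on the command line, e.g. "pymongo".
    name = sys.argv[1]
    print(f"import {name}: {time_import(name) * 1000:.1f} ms")
```

Saved under a name of your choice and run with "pymongo" as its argument in a fresh interpreter, it prints a single figure that can be compared between a local venv and the Lambda runtime.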

2 questions:

  1. Have you tried experimenting with increasing the Lambda function memory?
  • The amount of memory allocated to a Lambda function defaults to 128 MB. You can configure the amount of memory allocated to a Lambda function, between 128 MB and 10,240 MB. Ensure you allocate enough memory. Increase the memory to increase the amount of virtual CPU available and improve MongoDB driver performance. To learn more, see Memory and computing power.
  2. Can you post the Python packages installed in your Lambda? It's possible we're automatically importing some code that we don't need, based on the presence of an optional pymongo dependency: Installing / Upgrading - PyMongo 4.6.2 documentation

As I already mentioned, I tested it on my local system with a clean venv (Python 3.11) with only pymongo==4.6.2. No other libs!
All of my Lambda functions run with 1 GB of RAM. No other lib has this problem.
From my first profiling, I would guess it is bson (inside pymongo).
Does a minimal installation exist?

Thanks for reporting. A clean install on Python 3.11 and PyMongo 4.6.2 on macOS arm64 takes 515ms:

$ pip install pymongo  
Collecting pymongo
  Using cached pymongo-4.6.2-cp311-cp311-macosx_10_9_universal2.whl (534 kB)
Requirement already satisfied: dnspython<3.0.0,>=1.16.0 in ./git/venv-import/lib/python3.11/site-packages (from pymongo) (2.6.1)
Installing collected packages: pymongo
Successfully installed pymongo-4.6.2
$ python3 -X importtime -c "import pymongo" 2> import-pymongo-clean.log
$ tuna import-pymongo-clean.log                                        

Looks like 80% of that time is spent on importing the C extensions (bson._cbson and pymongo._cmessage).
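
If tuna is not available, the same log can be summarized with a few lines of Python. This is only a sketch: it assumes the log contains the usual "import time: self [us] | cumulative | imported package" lines that -X importtime writes to stderr, and the function names parse_importtime and slowest are made up for illustration.

```python
import sys


def parse_importtime(lines):
    """Yield (module, self_us, cumulative_us) from `python -X importtime` output."""
    prefix = "import time:"
    for line in lines:
        if not line.startswith(prefix):
            continue  # unrelated stderr output
        parts = line[len(prefix):].split("|")
        if len(parts) != 3:
            continue
        try:
            self_us, cumulative_us = int(parts[0]), int(parts[1])
        except ValueError:
            continue  # the column header line has no numbers
        yield parts[2].strip(), self_us, cumulative_us


def slowest(lines, top=10):
    """Return the `top` modules with the largest self time, slowest first."""
    return sorted(parse_importtime(lines), key=lambda row: row[1], reverse=True)[:top]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the log file written by `python3 -X importtime ... 2> file.log`.
    with open(sys.argv[1]) as fh:
        for module, self_us, cumulative_us in slowest(fh):
            print(f"{self_us / 1000:8.1f} ms self {cumulative_us / 1000:8.1f} ms total  {module}")
```

Sorting by self time rather than cumulative time keeps parent packages from hiding the module that actually does the work.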

I opened https://jira.mongodb.org/browse/PYTHON-4260 to see if we can optimize this.

One workaround to alleviate the cost of a cold start could be to configure Lambda with provisioned concurrency.

You could also install pymongo from source without the C extensions, which reduces the import time to 74ms on my machine. Without the C extensions pymongo will be slower in general, so you may want to check whether that affects your application:

$ NO_EXT=1 pip install --no-binary=":all:" pymongo
Collecting pymongo
  Using cached pymongo-4.6.2-py3-none-any.whl
Collecting dnspython<3.0.0,>=1.16.0 (from pymongo)
  Downloading dnspython-2.6.1.tar.gz (332 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 332.7/332.7 kB 13.2 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: dnspython
  Building wheel for dnspython (pyproject.toml) ... done
  Created wheel for dnspython: filename=dnspython-2.6.1-py3-none-any.whl size=307696 sha256=5ef3b9680161f6fa89daf8ad451b5f1a33b18ae8a1c6778cdf4b43f08c0a6e50
  Stored in directory: /Users/shane/Library/Caches/pip/wheels/64/7a/78/c66f17167e54a96b77bebd0a18414c9183788ae2f8c4a12e2a
Successfully built dnspython
Installing collected packages: dnspython, pymongo
Successfully installed dnspython-2.6.1 pymongo-4.6.2
$ python3 -X importtime -c "import pymongo" 2> import-pymongo-no-C.log
$ tuna import-pymongo-no-C.log                                        

Thanks for the information. I will try it without the C extension and check how my queries will perform.

We already tried the provisioned concurrency setting. It's not an option for us: it would be too expensive, and it's not ideal for scaling.

Looks like this is not working on Windows. I will move my whole deployment to WSL and take a new measurement.
Normally I deploy my project with these pip settings:
--platform=manylinux2014_x86_64
--only-binary=":all:"

PyMongo supports Windows. What error were you getting there?

For more context:

$ NO_EXT=1 pip install --no-binary=":all:" pymongo

--no-binary=":all:" tells pip to install from the source dist (e.g. pymongo-4.6.2.tar.gz) and NO_EXT=1 is a pymongo-specific env var which disables building the C extensions during the install step.
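
To confirm which build actually ended up in a deployment package, the driver can be asked directly. The sketch below relies on the has_c() functions that bson and pymongo expose. The wrapper name c_extension_status is made up for illustration, and it returns None when PyMongo is not installed at all.

```python
def c_extension_status():
    """Report whether the bson and pymongo C extensions are in use.

    Returns None if PyMongo is not installed in this environment.
    """
    try:
        import bson
        import pymongo
    except ImportError:
        return None
    # has_c() is True when the compiled extension module could be imported.
    return {"bson": bson.has_c(), "pymongo": pymongo.has_c()}


if __name__ == "__main__":
    print(c_extension_status())
```

After an install with NO_EXT=1, both values should be False. If either is still True, a wheel with compiled extensions was most likely picked up instead of the source build.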

This command does not work on Windows. I switched to WSL to avoid the problems with Windows.

I deployed pymongo as you described. I still get import times around 350 ms. I will try to find the cause.

Create an AWS Lambda layer containing the pymongo package. A Lambda layer allows you to reuse code across multiple functions, reducing cold start time by preloading the dependencies.

In all of my tests, and in everything I have found online so far about layers, using a layer does not improve the cold start. The problem I have is the very slow import time. The import only happens during the function init. Once the function is loaded, everything is fine.
Can you provide an example of how I should implement the layer so that it reduces the cold start time?
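
Not a layer example, but since the cost sits in the import itself, one option sometimes used is to defer the import until the first invocation that really needs the database. The sketch below is an assumption-laden illustration, not a fix for the import time: the MONGODB_URI environment variable, the "ping" action and the helper name get_client are all hypothetical. It only helps when some invocations never touch the database, because the first one that does still pays the full import cost.

```python
import importlib
import os

_client = None  # reused by warm invocations in the same execution environment


def get_client(module_name="pymongo", uri=None):
    """Import the driver and create the client on first use only."""
    global _client
    if _client is None:
        # The slow import happens here instead of at module load time.
        driver = importlib.import_module(module_name)
        # MONGODB_URI is a hypothetical environment variable name.
        _client = driver.MongoClient(uri or os.environ["MONGODB_URI"])
    return _client


def handler(event, context):
    # Hypothetical code path that needs no database access and so
    # never triggers the driver import.
    if event.get("action") == "ping":
        return {"ok": True}
    client = get_client()
    return {"ok": True, "databases": client.list_database_names()}
```

Keeping the client in a module-level variable means warm invocations reuse both the imported module and the open connection.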