PyMongoArrow 0.1.1 Released

We are proud to announce the initial release of PyMongoArrow - a companion library to PyMongo that makes it easy to move data between MongoDB and Python’s scientific & numerical computing libraries like Pandas, PyArrow and NumPy.

PyMongoArrow extends PyMongo’s APIs and makes it possible to materialize query result sets as pandas DataFrames:

>>> data_frame = client.db.test.find_pandas_all({'qty': {'$gt': 5}}, schema=schema)
>>> data_frame
   _id   qty
0    1  25.4
1    2  16.9

Similar APIs facilitate loading result sets as PyArrow Tables:

>>> arrow_table = client.db.test.find_arrow_all({'qty': {'$gt': 5}}, schema=schema)
>>> arrow_table
pyarrow.Table
_id: int64
qty: double

As well as NumPy ndarrays:

>>> ndarrays = client.db.test.find_numpy_all({'qty': {'$gt': 5}}, schema=schema)
>>> ndarrays
{'_id': array([1, 2]), 'qty': array([25.4, 16.9])}
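
The three snippets above assume that `schema` and `client` already exist and that PyMongo's collection class has been extended with the new methods. A minimal setup sketch follows, assuming a MongoDB server at the default local address and a `db.test` collection holding `_id`/`qty` documents. The function name `load_qty_frame` and the URI are illustrative only and are not part of the release:

```python
def load_qty_frame(uri="mongodb://localhost:27017"):
    """Return documents with qty > 5 from db.test as a pandas DataFrame."""
    # Imports live inside the function so that defining it does not
    # require pymongo or pymongoarrow to be installed.
    from pymongo import MongoClient
    from pymongoarrow.api import Schema
    from pymongoarrow.monkey import patch_all

    # patch_all() adds find_pandas_all(), find_arrow_all() and
    # find_numpy_all() to PyMongo's Collection class.
    patch_all()

    # The schema names the fields to load and the type of each one.
    schema = Schema({"_id": int, "qty": float})

    client = MongoClient(uri)  # assumed local deployment
    return client.db.test.find_pandas_all({"qty": {"$gt": 5}}, schema=schema)
```

Calling `load_qty_frame()` needs a running server; swapping `find_pandas_all` for `find_arrow_all` or `find_numpy_all` should give the other two result types with the same schema.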

Installation

Wheels are available on PyPI for macOS and Linux platforms on x86_64 architectures.

$ python -m pip install pymongoarrow

Are there plans to support Windows?

Yes, please follow PYTHON-2691 for updates.

@Prashant_Mital Interesting lib.
It’s working so far, but I couldn’t figure out how to put fields from nested objects into my schema.
How can I do that?

Hi @Chris_Haus,

You can use an aggregation pipeline to export nested fields out of MongoDB into any of the supported data formats.

For example, let’s say we want to export MongoDB data into a pandas DataFrame. We can use PyMongoArrow’s aggregate_pandas_all() function to achieve this.

Let’s say this is our sample document containing nested fields:

{'_id': ObjectId('62cd854a73939396fff10edd'), 'a': {'b': 1, 'c': 2}}

Using $project, we can rename the nested field and use the new names to define the Schema. For example:

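from pymongoarrow.api import Schema
# Assumes pymongoarrow.monkey.patch_all() has been called and coll is a PyMongo collection.
schema = Schema({'ab': int, 'ac': int})
df = coll.aggregate_pandas_all([{'$project': {'ab': '$a.b', 'ac': '$a.c'}}], schema=schema)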
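
To make the effect of the $project stage concrete, here is a plain-Python illustration of what it does to each document before the schema is applied. This is only a sketch: `project` is a hypothetical helper, not a PyMongoArrow or MongoDB API, and it handles only simple `'$path.to.field'` references.

```python
def project(doc, spec):
    """Mimic a simple $project stage that renames dotted-path fields."""
    out = {}
    # $project keeps _id by default unless it is explicitly excluded.
    if "_id" in doc:
        out["_id"] = doc["_id"]
    for new_name, ref in spec.items():
        value = doc
        # Drop the leading '$' and walk the dotted path one key at a time.
        for key in ref[1:].split("."):
            if not isinstance(value, dict) or key not in value:
                break  # missing path: leave the field out of the result
            value = value[key]
        else:
            out[new_name] = value
    return out


sample = {"_id": "62cd854a73939396fff10edd", "a": {"b": 1, "c": 2}}
flat = project(sample, {"ab": "$a.b", "ac": "$a.c"})
print(flat)  # {'_id': '62cd854a73939396fff10edd', 'ab': 1, 'ac': 2}
```

The flattened document has top-level `ab` and `ac` keys, which is why the schema can refer to them by those names.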

We also have a ticket open (ARROW-9) for adding direct support for this.

If you have any other questions or feedback about PyMongoArrow, please feel free to get back to us. We would be happy to chat more with you 🙂

~ Shubham