Data modeling for Tinder app

My question is as follows. I want to fetch 10 available profiles from the user list to show to a user. However, the candidates have to pass some filters:

They have not already matched with the user, the user has not already swiped right or left on them, and neither side has blocked the other.


       matches: [user1, user2...]
       my_blockeds: [user1, user2...]
       blockeds: [user1, user2...]
       likes: [user1, user2...]
       dislikes: [user1, user2...]


const logged_id = "...";
const user_ids = [user1, user2...];

const min_age = 18;
const max_age = 30;
const gender_preference = 'male';

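// On an array field, $ne matches documents whose array does not contain the value.
const users = await User.find({
    _id: { $in: user_ids },

    // Exclude candidates whose match/block arrays contain the logged-in user.
    matches: { $ne: logged_id },
    my_blockeds: { $ne: logged_id },
    blockeds: { $ne: logged_id },

    // Exclude candidates whose like/dislike arrays contain the logged-in user.
    likes: { $ne: logged_id },
    dislikes: { $ne: logged_id },

    age: { $gte: min_age, $lte: max_age },
    gender: { $eq: gender_preference },
})
    .select('display_name avatars age');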

I came up with the model and query above. It works fine for now, but I am afraid that for some users the lists will grow significantly. I considered the Outlier Pattern, but I don't think it suits this model. How do you think I should approach this? How should I design the model? Thanks in advance.

Hello Enes,

I have thought about such a model as well, and my tests showed that you need to be careful with arrays.
Once an array grows past 10,000 elements, things become extremely difficult to maintain,
and in your data model the likes and dislikes (for example) will grow to an unmanageable size after a few days of using the app.

The blocks list not so much (I imagine).

One idea from my considerations (assuming you use geofencing and most-recently-active filtering) would be to use a TTL-indexed collection of “active profiles”. Get those, then query a collection for your “likes”, “dislikes” and so on (the ones to be excluded; these would be a single document each), and then in the application subtract the latter from the former.

Otherwise, you will always need to deal with arrays (which will be enormous) or define a bucketing parameter that may not be flexible or fit your needs.
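To make the "subtract in the application" step concrete, here is a minimal sketch in plain JavaScript. The function name `excludeSeen` and the document shapes (`{ _id }` for active profiles, `{ from, to, type }` for one document per swipe) are assumptions for illustration, not part of the original model. The ids are plain strings here; with real ObjectIds you would probably compare their string form.

```javascript
// Hypothetical shapes: activeProfiles come from the "active profiles" collection,
// voteDocs from a likes/dislikes collection holding one document per swipe.
function excludeSeen(activeProfiles, voteDocs, blockedIds, limit = 10) {
  // Collect every profile id the user must not be shown.
  const excluded = new Set(blockedIds);
  for (const vote of voteDocs) {
    excluded.add(vote.to);
  }

  // Keep only profiles that are not excluded, stopping once we have enough.
  const result = [];
  for (const profile of activeProfiles) {
    if (!excluded.has(profile._id)) {
      result.push(profile);
      if (result.length === limit) break;
    }
  }
  return result;
}

const active = [{ _id: 'a' }, { _id: 'b' }, { _id: 'c' }, { _id: 'd' }];
const votes = [
  { from: 'me', to: 'a', type: 'like' },
  { from: 'me', to: 'c', type: 'dislike' },
];
const candidates = excludeSeen(active, votes, ['d']);
console.log(candidates.map((p) => p._id)); // [ 'b' ]
```

The set lookup keeps the subtraction linear in the number of active profiles plus votes, so the cost is driven by how many documents the two indexed queries return.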

This is using MongoDB / a document DB.

Using an RDBMS (hope I am not being blasphemous), you would have a “recently_online” table, get the ids, pictures and so on, and just subtract the relevant ids found in the “already_voted_on” table.

These are some ideas. Hope it helped in some way.


Yes. Interesting read:

I think an RDBMS is not suitable for something like this, because it will be Big Data and SQL cannot handle something like this. Applications like Tinder can manage with a NoSQL database such as MongoDB. I did some research and thought I could do it using a Bloom filter. A sample topic is: Scaling a data model using bloom filters

Does this make sense, or can you suggest something else? If it makes sense, how can I use it?
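For reference, the Bloom filter idea can be sketched roughly as below. This is only an illustration under my own assumptions: the class name, the bit count, the number of hashes and the seeded FNV-1a style hash are all hypothetical choices, and nothing here comes from the linked topic. A Bloom filter never gives false negatives, so an already-swiped profile is never shown again, but it can give false positives, so a few unseen profiles may be skipped.

```javascript
// Minimal Bloom filter sketch (hypothetical parameters, not tuned for production).
class BloomFilter {
  constructor(bitCount = 1024, hashCount = 3) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // FNV-1a style 32-bit hash with a seed mixed into the initial state.
  static hash(value, seed) {
    let h = (2166136261 ^ seed) >>> 0;
    for (let i = 0; i < value.length; i++) {
      h ^= value.charCodeAt(i);
      h = Math.imul(h, 16777619) >>> 0;
    }
    return h;
  }

  // Bit positions for a value, one per seeded hash.
  positions(value) {
    const out = [];
    for (let k = 0; k < this.hashCount; k++) {
      out.push(BloomFilter.hash(value, k) % this.bitCount);
    }
    return out;
  }

  add(value) {
    for (const p of this.positions(value)) {
      this.bits[p >> 3] |= 1 << (p & 7);
    }
  }

  // true means "possibly seen"; false means "definitely not seen".
  mightContain(value) {
    return this.positions(value).every(
      (p) => (this.bits[p >> 3] & (1 << (p & 7))) !== 0
    );
  }
}

const seen = new BloomFilter();
seen.add('user42');
console.log(seen.mightContain('user42')); // true
```

One filter per user would stand in for the likes and dislikes arrays when filtering candidates. Note that a standard Bloom filter does not support removing entries.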

@steevej I am already using this. I have “blockeds, matches, likes, dislikes” collections, but in addition I have to pull them from a single collection and write a query accordingly.

Hello @Enes_Karadeniz

I actually don’t agree that a Tinder-like app is a big data problem: it is a lot-of-data problem (number), but not a big data (type, diversity, variety, velocity, source) problem. It is a very structured data model, fixed and stable.
You can run two extremely quick indexed queries and simply subtract the profile ids a given user has liked/disliked from the currently online ids (for a given geo area). The online table (if geofenced and limited to recent activity) is relatively small compared to the number of likes/dislikes.
I ran some tests on Postgres in a simple VM, and it worked well (though, true, I didn’t create 100M records).

That being said…
Certainly there is a solution in MongoDB, but using arrays will eventually hit a brick wall, because they have performance limits once they go past tens of thousands of elements.
A more likely approach is to insert one document per like/dislike and per recent activity, fetch all of those quickly by index, and let the application code sort out what is needed and what is not.

Also, you just need to get some results to the user quickly, not all of them. While the user is selecting and clicking (or not) on some profiles, you can fetch some more behind the scenes :wink:
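As a rough sketch of that "some results now, more later" idea, a generator can hand out small batches of unseen ids on demand. The name `candidateBatches` and the batch size are assumptions for illustration; in a real app the id lists would come from the indexed queries described above.

```javascript
// Lazily yields batches of candidate ids the user has not voted on yet.
function* candidateBatches(activeIds, seenIds, batchSize = 10) {
  const seen = new Set(seenIds);
  let batch = [];
  for (const id of activeIds) {
    if (seen.has(id)) continue; // already liked/disliked, skip it
    batch.push(id);
    if (batch.length === batchSize) {
      yield batch;
      batch = [];
    }
  }
  // Emit whatever is left over as a final, smaller batch.
  if (batch.length > 0) yield batch;
}

const batches = candidateBatches(['a', 'b', 'c', 'd', 'e'], ['b'], 2);
console.log(batches.next().value); // [ 'a', 'c' ]
console.log(batches.next().value); // [ 'd', 'e' ]
```

The first batch can be shown immediately, and the next one is only computed when the client asks for more.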