Hi!
I have built an Internet indexing service (similar to Shodan, Censys, etc.) that is backed by a sharded cluster of 9 shards, 3 routers and 3 config servers. Across this cluster I have 5 sharded collections keyed on either ip or ip+port. Let's say the documents look something like this:
collection_a
{
  ip: "1.2.3.4",
  ports: [80, 443]
}
collection_b
{
  ip: "1.2.3.4",
  port: 443,
  common_name: "blah.com"
}
collection_c
{
  ip: "1.2.3.4",
  banner: "some raw text",
  port: 443
}
Is there an efficient way to do something like: return all IPs that match some criteria in collection_a, collection_b and collection_c?
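To make the first question concrete, here is a minimal sketch of what it would look like at the application layer, assuming "match something in all three" means an intersection on ip. The function name `ips_in_all` and the sample lists are hypothetical; in practice each list would come from a separate filtered query against one collection, and the result sets may be far too large to intersect in memory like this.

```python
from typing import Iterable, Set


def ips_in_all(*ip_streams: Iterable[str]) -> Set[str]:
    """Return the IPs that appear in every stream (one stream per collection)."""
    if not ip_streams:
        return set()
    # Start from the first collection's matches, then narrow the set
    # with each further collection.
    common = set(ip_streams[0])
    for stream in ip_streams[1:]:
        common &= set(stream)
        if not common:
            break  # nothing left to intersect
    return common


# Hypothetical per-collection matches (illustrative data only).
matches_a = ["1.2.3.4", "5.6.7.8"]  # e.g. collection_a documents whose ports include 443
matches_b = ["1.2.3.4", "9.9.9.9"]  # e.g. collection_b documents with a given common_name
matches_c = ["1.2.3.4", "5.6.7.8"]  # e.g. collection_c documents whose banner matches

print(ips_in_all(matches_a, matches_b, matches_c))  # prints {'1.2.3.4'}
```

This is the behaviour I would like to get from the database itself, ideally without pulling every matching IP back to the client first.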
Another, hopefully simpler, question: is there a way to run the same query over multiple collections at the database layer and aggregate the results into a pseudo-document? For example, given the collections above, is there a way to search for {ip: "1.2.3.4"} across all of them? Obviously I can implement this at the application layer if needed, but I thought I'd ask!
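One candidate for the second question is the `$unionWith` aggregation stage (MongoDB 4.4 and later), which appends the output of a pipeline run on another collection. The sketch below only builds the pipeline as plain Python dicts; the helper name `build_ip_lookup_pipeline` and the `source` and `records` field names are hypothetical, and whether this performs acceptably on a cluster like the one described is an assumption that would need testing.

```python
from typing import Any, Dict, List


def build_ip_lookup_pipeline(ip: str, first_collection: str,
                             other_collections: List[str]) -> List[Dict[str, Any]]:
    """Build a pipeline that runs the same ip filter over several collections."""

    def tagged_match(name: str) -> List[Dict[str, Any]]:
        # Filter on ip, then record which collection each document came from.
        # $literal keeps the name from being read as a field path.
        return [
            {"$match": {"ip": ip}},
            {"$set": {"source": {"$literal": name}}},
        ]

    # Stages for the collection the aggregation is started on.
    pipeline = tagged_match(first_collection)
    # One $unionWith per additional collection, each with the same filter.
    for name in other_collections:
        pipeline.append({"$unionWith": {"coll": name, "pipeline": tagged_match(name)}})
    # Fold everything for the ip into a single pseudo-document.
    pipeline.append({"$group": {"_id": "$ip", "records": {"$push": "$$ROOT"}}})
    return pipeline


pipeline = build_ip_lookup_pipeline(
    "1.2.3.4", "collection_a", ["collection_b", "collection_c"]
)
```

With PyMongo this would be run as `db.collection_a.aggregate(pipeline)`, where `db` is an assumed database handle. Since the filter is on ip, I would hope each sub-pipeline can be targeted by the shard key, but I have not verified that.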
Cheers!