How to run distinct on a big dataset

On MySQL, Oracle, and Sybase, I never had a problem with a simple DISTINCT.

I need to check that there are no duplicate values in the uuid field before creating a unique index on a very big dataset of 70 million records.

But if I use distinct("uuid"), I get a message that the result exceeds the 16MB limit.

I also tried an aggregation with a count on uuid and allowDiskUse: true, and I got a "BSON data too large" error.

Is it impossible to do a distinct like in other SQL databases?


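Try one of these pipelines to count the distinct values:

  // one output document per distinct uuid
  { $group: { _id: "$uuid" } }

  // same grouping, then a single document holding the count
  { $group: { _id: "$uuid" } },
  { $count: "c" }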
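
To turn the count into a duplicate check, one option is to compare it with the total number of documents. Here is a minimal sketch in mongosh-style JavaScript. The helper name `distinctCountPipeline` is made up for illustration, and `mycollection` is a placeholder collection name:

```javascript
// Hypothetical helper (not a MongoDB API): builds the two-stage
// group-then-count pipeline for an arbitrary field name.
function distinctCountPipeline(field) {
  return [
    { $group: { _id: "$" + field } }, // one document per distinct value
    { $count: "c" },                  // single document: { c: <number of groups> }
  ];
}

// Usage in mongosh (needs a live connection, so it is left commented out):
// const res = db["mycollection"]
//   .aggregate(distinctCountPipeline("uuid"), { allowDiskUse: true })
//   .toArray();
// const distinct = res.length ? res[0].c : 0;
// const total = db["mycollection"].countDocuments({});
// print(distinct === total ? "no duplicates" : "duplicates present");
```

Note that documents with no uuid field are all grouped under `_id: null`, so they count as one distinct value.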

It’s working, thanks.

But I had to add allowDiskUse: true:
db["mycollection"].aggregate([ { $group: { _id: "$uuid" } } ], { allowDiskUse: true }).itcount()
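
If the distinct count turns out lower than the number of documents, the duplicated values themselves can be listed by counting documents per uuid and keeping only the groups with more than one. A rough sketch, where `duplicatesPipeline` is a hypothetical helper name and not part of MongoDB:

```javascript
// Hypothetical helper: builds a pipeline that returns only the values
// of `field` that occur in more than one document.
function duplicatesPipeline(field) {
  return [
    // count how many documents share each value
    { $group: { _id: "$" + field, n: { $sum: 1 } } },
    // keep only the values seen more than once
    { $match: { n: { $gt: 1 } } },
  ];
}

// Usage in mongosh (needs a live connection, so it is left commented out):
// db["mycollection"]
//   .aggregate(duplicatesPipeline("uuid"), { allowDiskUse: true })
//   .forEach(doc => printjson(doc));
```

An empty result would mean every uuid is unique, so the unique index can be created.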