Store MongoDB data and log files on external drive/usb

If you can answer this, you’ll have a fan! :slight_smile:

I wrote the configuration file below to store both data and logs in a shared folder on an external hard drive attached to the host computer (macOS), and I use that configuration with MongoDB installed in a Vagrant VM. Although I can read and edit those files from the guest VM, mongod doesn’t work with them because the shared folder lacks fsync support, as explained at this link (highlighted):

https://docs.mongodb.com/manual/administration/production-notes/#prod-notes-platform-considerations

Then, I thought, let’s try NFS; that should support fsync. So, I configured my host system (macOS) fine using:

https://www.bresink.com/osx/NFSManager.html

and its NFS tests run OK on the host. Then I checked multiple things:

  • /etc/exports on the host looks fine
  • “showmount -e” gives the expected output
  • I allowed the program “nfsd” in my mac’s firewall
  • set up fstab on the client (Vagrant/Ubuntu Xenial)

and it doesn’t mount. I’m running:

(a) Client
Virtual Box 6.0.14
Ubuntu 16.04.6 LTS (GNU/Linux 4.4.0-171-generic x86_64)

(b) Host
macOS High Sierra 10.13.6
External hard drive: Mac OS Extended (Case sensitive, Journaled)

If you have a link to a tutorial or could advise, I would be really thankful. Maybe this will be discussed further down the road in M310, but I still have a lot of ground to cover. Basically, my issue is: how do I store data and logs persistently on an external hard drive from a VM? Time to try VMware? :slight_smile:

Cheers

mongod.conf

    # for documentation of all options, see:
    #   http://docs.mongodb.org/manual/reference/configuration-options/

    # Where and how to store data.
    storage:
      dbPath: /media/sf_share/db/data

    # where to write logging data.
    systemLog:
      destination: file
      logAppend: true
      path: /media/sf_share/db/log/mongod.log

    # network interfaces
    net:
      bindIp: "localhost"
      port: 30000

    #security:
    security:
      authorization: enabled

    processManagement:
      fork: true

OBJECTIVE:
To run a MongoDB instance in a Linux guest VM and have its data and log files saved on an external drive (USB) that’s plugged into a host machine running any OS. The approach explained here mounts an appropriately formatted external drive in the guest VM (which subsequently gets disconnected from the host machine) each time you bring up and provision the VM. For auto-mounting, create an /etc/fstab entry in the guest VM (not covered here).
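Since the /etc/fstab entry isn't covered, here is a minimal sketch of what one could look like. The helper name fstab_line, the device /dev/sdb1 and the ext4 default are assumptions for illustration; the nofail option is used so the guest still boots when the USB isn't attached.

```shell
#!/usr/bin/env bash
# Print an /etc/fstab line for a partition identified by its filesystem UUID.
# fstab_line is a hypothetical helper; mount point and fstype are placeholders.
fstab_line() {
  local uuid="$1" mount_point="${2:-/media/usb}" fstype="${3:-ext4}"
  # nofail: do not fail the boot if the device is missing
  printf 'UUID=%s %s %s defaults,nofail 0 2\n' "$uuid" "$mount_point" "$fstype"
}

# Example usage inside the guest (the device name is an assumption):
#   uuid=$(sudo blkid -s UUID -o value /dev/sdb1)
#   fstab_line "$uuid" /media/usb ext4 | sudo tee -a /etc/fstab
```

Using the UUID rather than /dev/sdb1 avoids surprises if the device name changes between boots.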

This isn’t discussed in the course, and I’d be cautious about using it for anything other than experimenting.

So, I grabbed an old USB stick and got this working on a Win 10 Pro host machine:

  1. The filesystem of the external drive must match that of the guest VM for MongoDB to run normally
  2. USB formatted as Ext2, Ext3, Ext4, XFS mounts fine and works with all MongoDB storage types (i.e. MMAPv1, In-Memory and WiredTiger).
  3. USB formatted as NTFS mounts fine but requires root (i.e. sudo) to startup the mongod on all storage types. Not recommended!
  4. Don’t bother with FAT32 USBs.
  5. As an added bonus… VBox and SMB shares from a host machine with an NTFS filesystem work with MMAPv1 and In-Memory but not with WiredTiger. NFS shares aren’t compatible with Windows hosts on Vagrant.

… I would expect similar behaviours with the Mac OS filesystems (APFS and Extended).

These results seem to contradict the doc: if a VBox share doesn’t support fsync(), then In-Memory and MMAPv1 storage shouldn’t have worked on a share backed by an NTFS host filesystem either.
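To see which kind of filesystem actually sits behind a dbPath, a quick check along these lines may help. The classification simply mirrors the results reported in this thread (it is not an official compatibility list), and the function names and default path are made up for illustration.

```shell
#!/usr/bin/env bash
# Classify a filesystem type according to the findings reported in this thread.
# "local"  = Ext2/3/4 and XFS, which worked with every storage engine here.
# "shared" = share-style filesystems (VBox, SMB, NFS), where WiredTiger had trouble.
classify_fstype() {
  case "$1" in
    ext2|ext3|ext4|xfs)   echo "local" ;;
    vboxsf|cifs|nfs|nfs4) echo "shared" ;;
    *)                    echo "unknown" ;;
  esac
}

# Report the filesystem type behind a path (the default path is a placeholder).
check_path() {
  local path="${1:-/media/sf_share/db/data}" fstype
  fstype=$(findmnt -n -o FSTYPE --target "$path") || return 1
  echo "$path is on $fstype ($(classify_fstype "$fstype"))"
}

# Example usage inside the guest:
#   check_path /media/usb/db
```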

For the implementation, I used a Vagrantfile from a different course (M103) hence the mongod-m103 machine name because it was what was handy at the time, but you can easily adapt it.

AUTOMATED SETUP:

  1. Format the USB as Ext2 filesystem. You can try Ext3, Ext4, XFS or perhaps Mac OS Extended but as a first try I’d suggest Ext2.

  2. List all USB devices to find your specific USB:
    vboxmanage list usbhost
    Sample output: (screenshot)
    PS: Remember to add the vboxmanage directory to your PATH variable if you haven’t already

  3. In the Vagrantfile, the usbfilter lines below make the USB device visible to the guest machine and subsequently disconnect it from the host machine. Fill in the blanks with the values from #2.
    Relevant code below:

       vb.customize ["modifyvm", :id, "--cpus", "2", "--paravirtprovider", "kvm"]
       vb.customize ["modifyvm", :id, "--usbxhci", "on"]
       vb.customize ["usbfilter", "add", "0",
          "--target", :id,
          "--name", "usb_mongodb_data",
          "--action", "",
          "--active", "yes",
          "--vendorid", "",
          "--productid", "",
          "--revision", "",
          "--manufacturer", "",
          "--product", "",
          "--serialnumber", "",]

       #vb.gui = true
  4. In the provision file, the code below mounts the USB at the /media/usb mount point. Remember to add a line to call the function too!
function mount_usb(){
  # Mounts the USB that's specified in the Vagrantfile

  # The search term to get the block device name is removable media (RM) == 1 and type == "part"
  declare SEARCH_TERM='RM=\"1\".*TYPE=\"part\"'
  declare MOUNT_POINT="/media/usb"

  # Return the 4-character name of the USB block device (e.g. sdb1)
  export BLOCK_NAME=$(lsblk -P -o NAME,RM,TYPE,SIZE,RO | grep -oP "^NAME=\"\K([[:alnum:]]{4})(?=\".+$SEARCH_TERM)")

  # Mount USB device only if it's not already mounted
  if [[ -n $BLOCK_NAME ]]
  then
    export BLOCK_PATH="/dev/$BLOCK_NAME"

    if [[ -z $(findmnt -S "$BLOCK_PATH" -fnr) ]]
    then
      sudo mkdir -p "$MOUNT_POINT"
      sudo mount "$BLOCK_PATH" "$MOUNT_POINT"
      # Optionally create the sub folders and hand them over to the vagrant user:
      # sudo mkdir -p $MOUNT_POINT/{db,log}
      # sudo chmod -R 777 $MOUNT_POINT/{db,log}
      # sudo chown -R vagrant:vagrant $MOUNT_POINT/{db,log}
    fi
    echo "USB mounted!"

  else
    echo "USB mount unsuccessful!!!"
  fi
}
  5. Destroy the machine
    vagrant destroy <machine name> -f

  6. As an initial check:
    i) Bring the machine up but don’t provision it. We just want to make sure the device block/partition is visible in the guest.
    vagrant up <machine name> --no-provision
    vagrant ssh <machine name>
    ii) Check if the USB is visible
    lsblk -o NAME,MODEL,RM,TYPE,SIZE,RO,STATE,MODE,FSTYPE

  7. Provision the machine to mount the USB, but only if #6 was successful
    vagrant provision <machine name>

  8. Check if it mounted to /media/usb
    vagrant ssh <machine name>
    df -h -T
    findmnt -m

  9. Create your sub folders and change ownership to vagrant
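If the initial check shows the device but provisioning still reports "USB mount unsuccessful!!!", it can help to test the device lookup on its own. The sketch below is an alternative way to pick removable partitions out of lsblk output; the function name removable_partitions is made up, and unlike the grep in the provision function it does not assume a 4-character device name.

```shell
#!/usr/bin/env bash
# Read `lsblk -P -o NAME,RM,TYPE` output on stdin and print the name of every
# removable (RM="1") partition (TYPE="part"), one per line.
removable_partitions() {
  awk '/RM="1"/ && /TYPE="part"/ {
    # Extract the value between the quotes of NAME="..."
    if (match($0, /NAME="[^"]*"/)) {
      print substr($0, RSTART + 6, RLENGTH - 7)
    }
  }'
}

# Example usage inside the guest:
#   lsblk -P -o NAME,RM,TYPE | removable_partitions
```

An empty result means the guest does not see a removable partition at all, which points to the USB filter rather than the mount step.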

If you’re not having success automating it, then I’d suggest doing it manually.

MANUAL SETUP:

  1. Destroy the machine

  2. Bring the machine up without provisioning it (same as #6i Automated Setup)

  3. Halt the machine
    vagrant halt <machine name>

  4. Open the VM Settings > select USB > add a filter that points to your USB by using the button that adds a filter from a device attached to the host.

    NB: The filter is important. Its purpose is to automatically make the USB visible to the VM (same as #3 Automated Setup)

  5. Bring the machine back up without provisioning again

  6. From VirtualBox, bring up the GUI window by clicking the Show button (big green right arrow)

  7. From the GUI, hover over the USB icon in the status bar of the VM window to see if the USB is visible/active. If it isn’t, right-click the icon and select the USB from the list.

  8. SSH into the machine

  9. Create /media/usb as the mount point, then mount it using:
    sudo mount /dev/sdb1 /media/usb
    Use the lsblk command from #6ii (Automated Setup) to find the appropriate substitute value for sdb1. Note that it’s the entry that ends with a digit; check the MODEL column to make sure it’s the right device.

  10. Once mounted, create your sub folders and change ownership to vagrant. In my case, I used two sub dirs, /media/usb/db and /media/usb/log, that are owned by vagrant.

  11. Run the tests from #8 (Automated Setup). Also test if you can create a file.

  12. Log out and provision the machine. Note, don’t destroy the machine, just provision it.
    vagrant provision <machine name>

  13. Now fire up mongod with a config file whose data and log paths point to the sub folders on the USB, and run your tests.
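As a rough illustration only (this is not the author's actual file), a config pointing at the USB mount could be generated as follows. It reuses the settings from the mongod.conf at the top of the thread and assumes the db and log sub folders from step 10; render_mongod_conf and the output file name are hypothetical.

```shell
#!/usr/bin/env bash
# Print a mongod.conf whose data and log paths live under a given mount point.
# All settings other than the paths are copied from the config posted earlier.
render_mongod_conf() {
  local mount_point="${1:-/media/usb}"
  cat <<EOF
storage:
  dbPath: ${mount_point}/db
systemLog:
  destination: file
  logAppend: true
  path: ${mount_point}/log/mongod.log
net:
  bindIp: "localhost"
  port: 30000
security:
  authorization: enabled
processManagement:
  fork: true
EOF
}

# Example usage (the file name is a placeholder):
#   render_mongod_conf /media/usb > mongod-usb.conf
#   mongod --config mongod-usb.conf
```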

That should be it!

Hey, thanks for your help. I appreciate the detailed response.

I did try this solution, but it didn’t work for me. In the process of trying it, though, I learned how to set up a Samba share, which I prefer over the VirtualBox share and which is good to know.

However, I get the same issue: WiredTiger errors 17, 22 and -31804. For error 22, it says directory-sync: fdatasync: Invalid Argument.
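A quick way to see whether a mount rejects sync calls of this kind, independently of mongod, is to probe it from the shell. This is only a sketch: it relies on GNU coreutils sync (8.24 or later) accepting path arguments, the function name probe_sync is made up, and a passing probe does not guarantee that WiredTiger will be happy.

```shell
#!/usr/bin/env bash
# Probe whether a directory accepts fsync on a file and fdatasync on the
# directory itself. Returns 0 if both succeed, 1 if a sync call fails,
# 2 if the directory is not writable at all.
probe_sync() {
  local dir="$1" probe="$1/.sync_probe.$$" status=0
  echo probe > "$probe" || return 2   # cannot even write here
  sync "$probe"  || status=1          # fsync() on a regular file
  sync -d "$dir" || status=1          # fdatasync() on the directory
  rm -f "$probe"
  return "$status"
}

# Example usage inside the guest (paths are placeholders):
#   probe_sync /media/sf_share/db/data && echo "sync OK" || echo "sync failed"
```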

I found a solution that works for me, though:

  • I installed the VM directly on the external hard drive
  • The configuration file and relevant bit of the vagrant file are shown below
  • I added a shared folder where the mongod.conf is stored; it would not be deleted with the VM
  • All I have to do is create a shared folder to store backups of the database

I tested; it runs fine!

While this is not the ideal solution, I can still keep my MongoDB data on the external hard drive, since that’s where the VM is installed. Plus, I should be able to access it from the host computer via Robo 3T or MongoDB Compass.

No problem @Thiago_18528! Yes, it’s pretty lengthy! What exactly happened when you tried it, where did it fail?

The Samba share was the SMB share I was referring to in #5 of my findings. Did you try it on all three storage options? Or do you only want WiredTiger?

I did actually think about installing the VM directly on the external drive but I didn’t think that was what you wanted. But it sounds like you’re happy with this. :+1:

I also tried out another alternative which works but I didn’t post the steps:

  1. Create a folder on your external drive
  2. Create a VirtualBox VMDK storage on your external drive. It can be set to a fixed or dynamic sized storage. You can do this via VirtualBox machine settings.
  3. Login to the guest > create a partition > format it as XFS, Ext2/3/4 > mount it
    Basically, you’re mounting that folder on your external drive as a virtual disk and all of this can be automated.
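Since the steps for this alternative weren't posted, the following is only a guess at how they might be scripted. The VM name, VMDK path, size, controller name ("SCSI"), port number and guest device (/dev/sdc) are all assumptions to adapt, and the VM is assumed to be halted while the disk is attached. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Dry-run helper: print each command unless DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Host side: create a VMDK on the external drive and attach it to the VM.
# Every argument and default below is a placeholder.
create_and_attach_vmdk() {
  local vm="$1" vmdk="$2" size_mb="${3:-10240}" controller="${4:-SCSI}"
  run VBoxManage createmedium disk --filename "$vmdk" --size "$size_mb" --format VMDK
  run VBoxManage storageattach "$vm" --storagectl "$controller" \
      --port 2 --device 0 --type hdd --medium "$vmdk"
}

# Guest side: partition the new disk, format it as XFS and mount it.
# Confirm the device with lsblk first; /dev/sdc is only a guess.
format_and_mount() {
  local disk="${1:-/dev/sdc}" mount_point="${2:-/media/vmdk}"
  run sudo parted -s "$disk" mklabel gpt mkpart primary 0% 100%
  run sudo mkfs.xfs "${disk}1"
  run sudo mkdir -p "$mount_point"
  run sudo mount "${disk}1" "$mount_point"
}

# Example usage (names and paths are placeholders):
#   create_and_attach_vmdk mongod-m103 /Volumes/External/mongo/data.vmdk 10240
#   format_and_mount /dev/sdc /media/vmdk
```

Printing the commands first makes it easy to review the device names before anything is formatted.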

With this setup, all storage options work with XFS and Ext2/3/4. But the interesting part is that WiredTiger now works with an NTFS file format if I start up mongod as root. I think I’ll get the same result with the USB; I’ll test it out at some point and update my post.

PS: I’ve just renamed your post to make it more relevant for the benefit of others

As promised, I’ve just updated the original post to reflect my findings on this :arrow_up:. The code and steps remain unchanged.

I’ve automated this process too, so let me know if you’d like the steps and code. All your data will be stored inside a VMDK file on your external drive, and this virtual disk can be set to a fixed or dynamic size. Since it’s VMDK, it’s also portable to VMware and a host of others.