MongoDB start fails due to a "file system limitation"

Hi everybody,

our customer has 5 identical systems were MongoDB 3.4 is running on a Windows 7 PC.
Sometimes, MongoDB will not start after rebooting the system with the following startup log:

2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] MongoDB starting : pid=952 port=27017 dbpath=D:\DB\MongoDB 64-bit host=PC-03487-03
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] db version v3.4.9-25-g2bf0b19
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] git version: 2bf0b19b41bbaacf4b9da75ad109a203adf6aa71
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1u-fips  22 Sep 2016
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] allocator: tcmalloc
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] modules: none
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] build environment:
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten]     distmod: 2008plus-ssl
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten]     distarch: x86_64
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten]     target_arch: x86_64
2021-06-14T14:24:16.575-0700 I CONTROL  [initandlisten] options: { storage: { dbPath: "D:\DB\MongoDB" }, systemLog: { destination: "file", path: "D:\DB\MongoDB\logs\log", verbosity: 1 } }
2021-06-14T14:24:16.575-0700 W -        [initandlisten] Detected unclean shutdown - D:\DB\MongoDB\mongod.lock is not empty.
2021-06-14T14:24:16.576-0700 I -        [initandlisten] Detected data files in D:\DB\MongoDB created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2021-06-14T14:24:16.576-0700 W STORAGE  [initandlisten] Recovering data from the last clean checkpoint.
2021-06-14T14:24:16.576-0700 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=7591M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2021-06-14T14:24:17.456-0700 E STORAGE  [initandlisten] WiredTiger error (0) [1623705857:456265][952:1992306464], file:collection-13-3714778000169942699.wt, WT_SESSION.checkpoint: D:\DB\MongoDB\collection-13-3714778000169942699.wt: handle-write: WriteFile: failed to write 8192 bytes at offset 79908962304: The requested operation could not be completed due to a file system limitation

2021-06-14T14:24:17.456-0700 E STORAGE  [initandlisten] WiredTiger error (-31802) [1623705857:456265][952:1992306464], file:sizeStorer.wt, txn-recover: Recovery failed: WT_ERROR: non-specific WiredTiger error
2021-06-14T14:24:17.456-0700 E STORAGE  [initandlisten] WiredTiger error (22) [1623705857:456265][952:1992306464], file:sizeStorer.wt, connection: sizeStorer.wt: unexpected checkpoint ordering: Invalid argument
2021-06-14T14:24:17.456-0700 E STORAGE  [initandlisten] WiredTiger error (-31804) [1623705857:456265][952:1992306464], file:sizeStorer.wt, connection: the process must exit and restart: WT_PANIC: WiredTiger library panic
2021-06-14T14:24:17.456-0700 I -        [initandlisten] Fatal Assertion 28558 at src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp 361
2021-06-14T14:24:17.456-0700 I -        [initandlisten] 

***aborting after fassert() failure


2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\stacktrace_windows.cpp(239)                     mongo::printStackTrace+0x43
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\signal_handlers_synchronous.cpp(180)            mongo::`anonymous namespace'::printSignalAndBacktrace+0x74
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\signal_handlers_synchronous.cpp(236)            mongo::`anonymous namespace'::abruptQuit+0x85
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] ucrtbase.DLL                                                                     raise+0x1e7
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] ucrtbase.DLL                                                                     abort+0x31
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\util\assert_util.cpp(172)                            mongo::fassertFailedWithLocation+0x181
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp(361)       mongo::`anonymous namespace'::mdb_handle_error+0x205
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\support\err.c(275)              __wt_eventv+0x376
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\support\err.c(317)              __wt_err+0x32
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\support\err.c(509)              __wt_panic+0x22
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\block\block_mgr.c(634)          __wt_block_panic+0x5a
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\block\block_ckpt.c(401)         __ckpt_process+0x5f
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\block\block_ckpt.c(255)         __wt_block_checkpoint+0xab
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\btree\bt_io.c(365)              __wt_bt_write+0x3a0
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\reconcile\rec_write.c(5836)     __rec_write_wrapup+0x2fa
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\reconcile\rec_write.c(443)      __wt_reconcile+0x212
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\evict\evict_file.c(72)          __wt_evict_file+0xb3
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\txn\txn_ckpt.c(1466)            __checkpoint_tree+0xef
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\txn\txn_ckpt.c(1698)            __wt_checkpoint_close+0x133
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_dhandle.c(203)        __wt_conn_btree_sync_and_close+0xd6
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_dhandle.c(608)        __wt_conn_dhandle_discard_single+0x30
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_dhandle.c(675)        __wt_conn_dhandle_discard+0x72
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_open.c(134)           __wt_connection_close+0x1fa
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\third_party\wiredtiger\src\conn\conn_api.c(2479)           wiredtiger_open+0xcac
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\wiredtiger\wiredtiger_kv_engine.cpp(272)  mongo::WiredTigerKVEngine::WiredTigerKVEngine+0x942
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\storage\wiredtiger\wiredtiger_init.cpp(91)        mongo::`anonymous namespace'::WiredTigerFactory::create+0x12f
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\service_context_d.cpp(202)                        mongo::ServiceContextMongoD::initializeGlobalStorageEngine+0x59c
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(599)                                       mongo::`anonymous namespace'::_initAndListen+0x77b
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(841)                                       mongo::`anonymous namespace'::initAndListen+0x27
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(1191)                                      mongoDbMain+0x227
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    ...\src\mongo\db\db.cpp(879)                                       wmain+0x35
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] mongod.exe    f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl(253)        __scrt_common_main_seh+0x11d
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] kernel32.dll                                                                     BaseThreadInitThunk+0xd
2021-06-14T14:24:17.530-0700 F -        [initandlisten] Got signal: 22 (SIGABRT).
2021-06-14T14:24:17.530-0700 I CONTROL  [initandlisten] *** unhandled exception 0x0000000E at 0x000007FEFCC9B87D, terminating
2021-06-14T14:24:17.531-0700 I CONTROL  [initandlisten] *** stack trace for unhandled exception:
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] KERNELBASE.dll                                                                     RaiseException+0x3d
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\util\signal_handlers_synchronous.cpp(237)            mongo::`anonymous namespace'::abruptQuit+0x9d
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] ucrtbase.DLL                                                                       raise+0x1e7
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] ucrtbase.DLL                                                                       abort+0x31
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\util\assert_util.cpp(172)                            mongo::fassertFailedWithLocation+0x181
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\storage\wiredtiger\wiredtiger_util.cpp(361)       mongo::`anonymous namespace'::mdb_handle_error+0x205
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\support\err.c(275)              __wt_eventv+0x376
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\support\err.c(317)              __wt_err+0x32
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\support\err.c(509)              __wt_panic+0x22
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\block\block_mgr.c(634)          __wt_block_panic+0x5a
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\block\block_ckpt.c(401)         __ckpt_process+0x5f
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\block\block_ckpt.c(255)         __wt_block_checkpoint+0xab
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\btree\bt_io.c(365)              __wt_bt_write+0x3a0
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\reconcile\rec_write.c(5836)     __rec_write_wrapup+0x2fa
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\reconcile\rec_write.c(443)      __wt_reconcile+0x212
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\evict\evict_file.c(72)          __wt_evict_file+0xb3
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\txn\txn_ckpt.c(1466)            __checkpoint_tree+0xef
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\txn\txn_ckpt.c(1698)            __wt_checkpoint_close+0x133
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\conn\conn_dhandle.c(203)        __wt_conn_btree_sync_and_close+0xd6
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\conn\conn_dhandle.c(608)        __wt_conn_dhandle_discard_single+0x30
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\conn\conn_dhandle.c(675)        __wt_conn_dhandle_discard+0x72
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\conn\conn_open.c(134)           __wt_connection_close+0x1fa
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\third_party\wiredtiger\src\conn\conn_api.c(2479)           wiredtiger_open+0xcac
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\storage\wiredtiger\wiredtiger_kv_engine.cpp(272)  mongo::WiredTigerKVEngine::WiredTigerKVEngine+0x942
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\storage\wiredtiger\wiredtiger_init.cpp(91)        mongo::`anonymous namespace'::WiredTigerFactory::create+0x12f
2021-06-14T14:24:17.540-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\service_context_d.cpp(202)                        mongo::ServiceContextMongoD::initializeGlobalStorageEngine+0x59c
2021-06-14T14:24:17.541-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\db.cpp(599)                                       mongo::`anonymous namespace'::_initAndListen+0x77b
2021-06-14T14:24:17.541-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\db.cpp(841)                                       mongo::`anonymous namespace'::initAndListen+0x27
2021-06-14T14:24:17.541-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\db.cpp(1191)                                      mongoDbMain+0x227
2021-06-14T14:24:17.541-0700 I CONTROL  [initandlisten] mongod.exe      ...\src\mongo\db\db.cpp(879)                                       wmain+0x35
2021-06-14T14:24:17.541-0700 I CONTROL  [initandlisten] mongod.exe      f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl(253)        __scrt_common_main_seh+0x11d
2021-06-14T14:24:17.541-0700 I CONTROL  [initandlisten] kernel32.dll                                                                       BaseThreadInitThunk+0xd
2021-06-14T14:24:17.541-0700 I -        [initandlisten] 
2021-06-14T14:24:17.541-0700 I CONTROL  [initandlisten] writing minidump diagnostic file C:\Program Files\MongoDB\Server\3.2021-06-14T21-24-17.mdmp
2021-06-14T14:24:17.566-0700 I CONTROL  [initandlisten] *** immediate exit due to unhandled exception

This has happened multiple times on at least 3 of the 5 machines. According to my researches, this can be fixed in most cases by defragmenting the disk, but this is not the problem here. The disks are defragmented weekly, and were also defragmented manually after the error but with no success. Disk space is 1.6TB, and around 1.34TB were free in the lastest case.

The only way to recover the database we found is:

  • Copy database files to another machine
  • Start MongoDB once with these database files, stop afterwards
  • Copy files back to production machine

Does anybody know how to find the root cause of the file system limitation?

Thanks!
Andreas

The error

Detected unclean shutdown - D:\DB\MongoDB\mongod.lock is not empty.

seems to indicate that the server were not shutdown correctly.

Did you missed electricity and UPS was out of juice?

Anybody pulled a plug?

Do you have the showdown log?

As an editorial comment, it might be time to upgrade from both Windows 7 and Mongo 3.4.

Hi Steeve,

Yes, I know that the system should be updated, but this is a production machine running 24/7 at customer site and this is the only issue they currently have. Upgrading MongoDB is an option, but I fear Windows will stay on 7.

Back to topic:
The system is secured by a UPS and should shut down properly on power failure. It could be a power failure, but could the mongod.lock also be a remnant of the failed start?
I will try to get a remote connection and check the event viewer for any power failure events.

But I still wonder why the files can be recovered/used on another machine and then, after copying them back, everything is working again.

Most likely not. I never seen it. But I have not seen so many failures. The failures I saw with a mongod.lock still in place were all runtime failure. Out of memory, out of power, …

Most likely because the storage engine was not actively writing to any of the file. Next time, rather than copying the data files, just try to remove mongod.lock (if you are absolutely sure that the related mongod is not running anymore) and restart. It works all the time on my test system that is not connected to a UPS and is often out of power.

Thanks for your tips. I already tried to remove the mongod.lock last time this issue happened without success.

Meanwhile I checked the event logs, and there was no unclean shutdown before this error happened. I can only find an entry from “Service Control Manager”:
The "MongoDB" service terminated unexpectedly. It has done this 1 time(s).
No other error around in event log.
I need to check if there are MongoDB logs for this period, but I have to wait to get remote access again.

We are currently recreating the partitions to isolate the database partition from all other files.
Hopefully this will fix the issue.