MongoDB 3.0: Hitting 1.3 Million Inserts/Sec with mongorestore
This will likely be my last post about MongoDB 3.0 performance, as I am leaving MongoDB (after 3 fantastic years) to join Riot Games – more about that in later posts. Before I go, some testing I was doing to verify WiredTiger performance with the journal disabled turned up some eye-catching numbers. While using the 3.0 mongorestore to restore 4 collections into a 3.0 mongod, I noticed I was pushing close to 300,000 inserts/sec.
NOTE: This testing would eventually have hit a bottleneck flushing data out to disk (I had relatively small, low-IOPS EBS volumes attached), and I would expect the sustained insert rate to drop significantly once that kicked in. Also, running without a journal is generally not a good idea for a production system. This testing was more about showing off the capabilities of the new tools, and how quickly you can now use mongorestore to get data into an instance, than a commentary on mongod performance – far more rigorous testing than mine will be done in the future to show the sustainable, production capabilities there.
For anyone used to the old mongorestore (single threaded, not terribly fast), this is an amazing improvement. The new version, which has been completely rewritten in Go along with the rest of the tools for 3.0, allows you to restore multiple collections in parallel. It defaults to four collections in parallel (if available), so I was curious as to what I could do to push it higher.
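To give a sense of what that looks like, here is a hypothetical invocation – the `--numParallelCollections` flag (short form `-j`) is the knob in the rewritten 3.0 tools as I understand it, and the host and dump path below are placeholders rather than anything from my actual test:

```
# Restore a dump directory with 8 collections in parallel instead of the
# default of 4 (host and dump path are placeholders)
mongorestore --host target-host:27017 --numParallelCollections=8 /path/to/dump
```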
I have no doubt that there will be plenty of impressive testing and performance numbers to accompany the MongoDB 3.0 launch, but I doubt the revamped tools will get much attention in the initial announcements. Given the improvement I noticed, I decided to give them some love, and once I had completed the journal testing, I went back to see how far I could push things with the now apparently not-so-humble mongorestore.
It turned out to be not very far, but the limiting factor was the CPU on my (3-year-old) MacBook Pro, which I was using as the client in the testing – the server running mongod was not even breathing hard (<50% CPU, <50% IO). It also looked like I might be getting close to the limits of what my network setup would allow (not quite full gigabit Ethernet), but it was not yet the limiting factor.
Therefore, I went in search of more CPU and more network capacity, and settled on c4.8xlarge instances in EC2. For anyone not familiar, these are compute-optimized hosts with the following specs:
- 36 vCPUs
- 60GB RAM
- 10Gb Networking
In other words, plenty of horsepower and bandwidth for my test (my local test server has 32GB RAM, an Intel SSD, and 1Gb networking, so I knew the server-side mongod would have no issues). This also has the benefit of giving a more standard and reproducible test environment if anyone wants to try recreating the numbers (versus my laptop and desktop boxes). I used two c4.8xlarge instances in the same availability zone running Amazon Linux for my testing (a sketch of roughly equivalent launch commands follows the list):
- Client using mongorestore 3.0.0-rc8, with 20 collections to be restored in parallel (totaling 38GB of data, just over 200,000,000 documents)
- Server running mongod 3.0.0-rc8, with WiredTiger engine, journal disabled, no other load
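For anyone wanting to reproduce the setup, a rough sketch of the two sides – the flags are standard mongod/mongorestore 3.0 options to the best of my knowledge, and the paths and addresses are placeholders rather than my actual configuration:

```
# Server side: mongod 3.0 with the WiredTiger engine and journaling disabled
# (dbpath and logpath are placeholders)
mongod --storageEngine wiredTiger --nojournal \
       --dbpath /data/db --logpath /var/log/mongod.log --fork

# Client side: restore the dump with all 20 collections in parallel
mongorestore --host <server-private-ip> --numParallelCollections=20 /path/to/dump
```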
The result was pretty impressive: over 2 runs I saw spikes of inserts touching the 1.5 million inserts/sec mark. While not quite that high, the overall numbers for the full restores were still very respectable:
Run 1: 154.32 seconds, 204,800,000 docs = 1,327,112 inserts/sec
Run 2: 160.30 seconds, 204,800,000 docs = 1,277,604 inserts/sec
Average: 1,302,358 inserts/sec
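The per-run rate is just documents divided by wall-clock seconds; for anyone who wants to double-check the arithmetic, a quick awk one-liner reproduces the figures above:

```
# Sanity-check the insert rates (pure arithmetic, no MongoDB involved)
awk 'BEGIN {
  r1 = 204800000 / 154.32;   # run 1
  r2 = 204800000 / 160.30;   # run 2
  printf "run1: %.0f  run2: %.0f  avg: %.0f inserts/sec\n", r1, r2, (r1 + r2) / 2
}'
```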
As well as the eye-catching insert rate, the fact that I was able to load 38GB of data into the server in less than 3 minutes is pretty decent too. I suspect I could push this even higher with a bit of tweaking for burst imports, but I will have to leave that to others – I have reports and such to finish before I move on to my new adventures (plus, those big instances are expensive).
For anyone wondering about the documents used, they were generated by a variant of this gist, which I have used before in other testing, and the only index present on the collections was the default _id index.
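I won't reproduce the gist here, but purely as a hypothetical illustration of the general shape of such a generator (the collection name and document fields below are made up for this example, not the actual test data):

```
# Hypothetical illustration only -- NOT the actual gist used in the test.
# Bulk-inserts simple generated documents via the mongo shell; the resulting
# collection carries only the default _id index.
mongo --host <server-ip> test --eval '
  var bulk = db.restore_test.initializeUnorderedBulkOp();
  for (var i = 0; i < 100000; i++) {
    bulk.insert({ _id: i, n: Math.random(), s: "payload-" + i });
  }
  bulk.execute();
'
```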