Archiving and Compression in MongoDB Tools

Author: ~ Published: July 1, 2015, 2:19 p.m.
Introduction

My talk at MongoDB World 2015, "Putting the Go in MongoDB," focused on the rewrite of the MongoDB tools from C++ to Go and some of the usability and performance improvements that were realized, but I only briefly touched on two new features planned for the 3.2 release: archiving and compression. In this post, I’ll provide a more detailed explanation of the archiving and compression in mongodump and mongorestore, as well as explore some of the ...

Announcing PyMongo 3.0.3

Author: A. Jesse Jiryu Davis ~ Published: July 1, 2015, 2:06 p.m.

Bernie Hackett, Luke Lovett, Anna Herlihy, and I are pleased to announce PyMongo 3.0.3. This release fixes bugs reported since PyMongo 3.0.2, most importantly a bug that broke Kerberos authentication. We also fixed a TypeError raised when you try to turn off SSL hostname validation using an option in the MongoDB connection string, and an infinite loop when reading certain kinds of corrupt GridFS files.

For the full list of bugs fixed in PyMongo 3.0.3, please see the release in Jira.

If you use PyMongo 3.0.x, upgrade.

If you are on PyMongo 2.8.x, you should probably wait to upgrade: we are about to make it easier for you. PyMongo 2.9, which will be released shortly, provides a smooth bridge for you to upgrade from the old API to the new one.

Let us know if you have any problems by opening a ticket in Jira, in the PYTHON project.

Image: Ian C. on Flickr.

MongoDB 3.1.5 is released

Author: ~ Published: June 30, 2015, 5:06 p.m.

MongoDB 3.1.5 has been released. As a reminder, 3.1.5 is a development release and is not intended for production use. The 3.1 series will evolve into 3.2, which will be for production.

New/fixed in this release:

  • SERVER-19143 race in setting OpDebug ns can cause invalid BSON to be returned from currentOp command
  • SERVER-19135 Tune default cache size settings for WiredTiger
  • SERVER-19034 log command failures at level 1
  • SERVER-18977 Initscript does not stop a running mongod daemon
  • SERVER-18974 dropCollection fails because setShardVersion request is incorrect
  • SERVER-18902 Retrieval of large documents slower on WiredTiger than MMAPv1
  • SERVER-17254 WT: drop collection while concurrent oplog tailing may greatly reduce throughput
  • TOOLS-16 Mongodump should not use SlaveOk flag by default

Downloads | All Issues

As always, please let us know of any issues.

– The MongoDB Team

Examining performance for MongoDB and the insert benchmark

Author: Mark Callaghan ~ Published: June 30, 2015, 1:41 a.m.
My previous post has results for the insert benchmark when the database fits in RAM. In this post I look at MongoDB performance as the database gets larger than RAM. I ran these tests while preparing for a talk and going on vacation, so I am vague on some of the configuration details. The summary is that RocksDB does much better than mmapv1 and the WiredTiger B-Tree when the database is larger than RAM because it is more IO efficient. RocksDB doesn't read index pages during non-unique secondary index maintenance. It also does fewer but larger writes rather than the many smaller/random writes required by a B-Tree. This is more of a benefit for servers that use disk.

Average performance

I present performance results using a variety of metrics. The first is average throughput during the test. RocksDB is much faster than WiredTiger and WiredTiger is much faster than mmapv1. But you should be careful about benchmark reports that only include the average. Read on to learn more.

Cumulative average

This displays the cumulative average. That is the average from test start to the current point in time. At test end the value is the same as the average performance. This metric is a bit more useful than the average performance because it can show some variance. In the result below RocksDB quickly reaches a steady rate while WiredTiger and mmapv1 degrade over time as the database gets larger than RAM. However this can still hide intermittent variance.


Variance

This displays throughput per 10-second interval for a subset of the test. mmapv1 has the least variance while WiredTiger and RocksDB have much more. The variance is a problem and was not visible in previous graphs.
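To make the difference between the three views concrete, here is a minimal Python sketch. The throughput samples are invented for illustration and are not taken from the benchmark; the function names are also just illustrative. It shows how a stall that is obvious in the per-interval numbers is flattened by the cumulative average and disappears entirely in the overall average.

```python
# Hypothetical per-interval throughput (inserts/second), one sample per
# 10-second interval. These numbers are made up for illustration only.
samples = [1000, 1000, 1000, 0, 0, 1000, 1000, 1000, 2000, 2000]


def overall_average(xs):
    """Single number for the whole test: hides all variance."""
    return sum(xs) / len(xs)


def cumulative_average(xs):
    """Average from test start up to each interval."""
    out = []
    total = 0
    for i, x in enumerate(xs, start=1):
        total += x
        out.append(total / i)
    return out


avg = overall_average(samples)
cum = cumulative_average(samples)

print("overall average:     ", avg)
print("cumulative average:  ", [round(c) for c in cum])
print("per-interval samples:", samples)
print("worst interval:", min(samples), "vs. lowest cumulative value:", round(min(cum)))
```

In this made-up series the two stalled intervals report zero throughput, yet the cumulative average never drops below 600 and the overall average looks like a steady 1000.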

Variance, part 2

The final two graphs show the per-interval throughput for all engines and then only for WiredTiger and RocksDB. The second graph was added to avoid compressing the results for RocksDB to the left hand side of the graph. The lines for RocksDB and WiredTiger are very wide because of the large variance in throughput.

How and why we evolved legacy Java application to Scala

Author: Cesar Trigo ~ Published: June 29, 2015, 11:28 p.m.
Hello everyone! Finally here is the presentation about the evolution from a legacy Java Application to Scala presented by Katia Aresti during our last MongoDB User Group Madrid that took place last week at Telefónica Flagship...

Announcing libbson and libmongoc 1.1.9

Author: A. Jesse Jiryu Davis ~ Published: June 29, 2015, 4:33 a.m.

I'm releasing libmongoc with an urgent bugfix for a common crash in 1.1.8, which itself was introduced while I was fixing a rare crash in 1.1.7. For further details:

In the process of validating my latest fix I expanded test coverage, and noticed that ./configure --enable-coverage didn't work. That is now fixed in libbson and libmongoc.

libbson 1.1.9 can be downloaded here:

libmongoc 1.1.9 can be downloaded here:

Changes to MMS: MMS is now MongoDB Cloud Manager

Author: MMS Release Notes ~ Published: June 25, 2015, 2:46 a.m.

Introducing the new MongoDB Cloud Manager! Now when you create a new group in MongoDB Cloud Manager (formerly MMS), you immediately enter a 30-day free trial. All the great features of Cloud Manager are enabled during this period. At the conclusion of the 30-day free trial, you will have the option to choose between the Standard Plan or the Free Tier Plan. A note about the differences in the plans:

Standard Plan

  • $39 per server per month
  • Full History of Your Monitoring Data
  • Automation

Free Tier Plan

  • No per-server charge
  • Monitoring data is limited to 5-minute granularity, for the previous 24 hours only.
  • Automation functionality is not enabled.

If you decide to pick the Standard Plan before the free 30-day trial is over, you will still get the remaining days of your trial for free.


Are there any changes to backup plans?

  • Backup remains an optional service. Pricing for backup remains exactly the same as before with no changes. You can sign up for our Cloud Backup service whether you choose the Standard or Free Tier Plan.

I’m already running MMS Basic or MMS Classic, what happens to my group?

  • Your group will remain in the existing plan that you have today and continue receiving the same functionality. If you wish, you may switch to the new Standard Plan or Free Tier Plan.

What are data bearing servers?

  • A “server” means a single virtual machine or physical server running MongoDB that stores data. Config servers, pure arbiters, and servers only running mongos routers do not count. A server is only counted once even if it has multiple data-bearing mongod processes running on it. You can get this count by going to your “Deployment” tab, changing to the server list view (≡), unchecking “mongos” and “config”, and counting any servers which have at least one non-arbiter process on them.
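
The counting rule above can be sketched in a few lines of Python. This is only an illustration: the inventory format, the hostnames, and the process-type labels are hypothetical and do not come from Cloud Manager or any of its APIs.

```python
# Hypothetical inventory of (hostname, process type) pairs. The labels
# "mongod", "arbiter", "mongos" and "config" are illustrative only.
processes = [
    ("host1", "mongod"),
    ("host1", "mongod"),   # two data-bearing processes, still one server
    ("host2", "mongod"),
    ("host2", "arbiter"),  # host2 counts because of its mongod
    ("host3", "arbiter"),  # pure arbiter: not counted
    ("host4", "mongos"),   # router only: not counted
    ("host5", "config"),   # config server: not counted
]

NON_DATA_BEARING = {"arbiter", "mongos", "config"}


def count_data_bearing_servers(procs):
    """Count distinct hosts with at least one data-bearing process."""
    hosts = {host for host, kind in procs if kind not in NON_DATA_BEARING}
    return len(hosts)


print(count_data_bearing_servers(processes))
```

Using a set of hostnames is what ensures a server is counted once no matter how many data-bearing processes it runs.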

If I decide to stop using Automation, how do I unmanage my group from MMS?

  • Unmanaging will transfer the control of your deployment from MMS back to your servers. Any further changes must be made manually. From your Deployment tab, click the “…” menu at the Replica Set or Sharded Cluster level. Select ‘Remove from MMS’.

Can I still choose the MMS Basic plan to get 8 free servers?

  • The MMS Basic plan is no longer available for new groups.

MongoDB Aggregation Framework (part 2)

Author: Cesar Trigo ~ Published: June 24, 2015, 11:12 p.m.
Part II of the post MongoDB Aggregation Framework. MongoDB vs SQL: As I know that most of you come from the SQL world, I am going to show the equivalence between the SQL functions and...

MMS is now MongoDB Cloud Manager

Author: ~ Published: June 24, 2015, 6:49 p.m.
Today we’re announcing changes to MongoDB Management Service (MMS) that our user community should be aware of:

1. Name change

MMS is now MongoDB Cloud Manager. This name is consistent with our on-premises management product, MongoDB Ops Manager.

2. Simplified pricing for new groups

Any new groups that you create will be part of a free trial of Cloud Manager*. We invite you to use this 30-day period to explore the rich functionality Cloud Manager offers, including new automated provisioning and configuration management features. Existing groups will not be affected by these changes.

New to Cloud Manager?

Cloud Manager provides a single, unified interface to perform the tasks required to maintain a healthy MongoDB deployment. Here are a few of the tasks that you can do with Cloud Manager:

  • Monitor important stats like opcounters and connections at 1-minute granularity
  • Receive alerts when key metrics are out of range
  • Automate backups
  • Upgrade MongoDB with zero downtime
  • Easily create sharded clusters and replica sets

Cloud Manager is the easiest way to run MongoDB. Get started now:
Start Free Trial
*If you currently have access to automation, your new groups are not eligible for a free trial period.
About the Author - Leng Lee
Leng Lee is the Director of Product (Cloud) at MongoDB. Previously he was Director of Product at Codecademy where he was the first employee.

Announcing libbson and libmongoc 1.1.8

Author: A. Jesse Jiryu Davis ~ Published: June 22, 2015, 3:01 p.m.

I released libbson and libmongoc 1.1.8 today. The significant change is the defeat of a stubborn crash reported weeks ago. Very rarely, when a mongoc_client_t is connected to a replica set while a member is added, and authentication fails, it leaves the client's data structures in an inconsistent state that makes it segfault later in mongoc_client_destroy().

I had already gone one round with this bug and given up: I released 1.1.7 with extra checking and logging along this code path, but without a theory about the cause of the crash, much less a fix. The customer who reported the crash could reproduce it a couple times in each of their days-long durability tests, so they sent me core dumps. My colleague Spencer Jackson devoted heroic effort to understanding the core dumps (including one with no debug symbols!), and we finally discovered the sequence that leads to the crash.

The bug was in _mongoc_cluster_reconnect_replica_set(), which has two loops. The first loop tries nodes until it finds a replica set primary. In the second loop, it iterates over the primary's peer list connecting and authenticating with each peer, including the primary itself.

The crash comes when we:

  1. Connect to a 2-node replica set.
  2. The function enters its first loop, connects to the primary and finds two peers.
  3. nodes_len is set to 2 and the nodes list is reallocated, but the second node's struct is uninitialized.
  4. The function enters its second loop.
  5. Auth fails on the first node (the primary) so the driver breaks from the loop with goto CLEANUP.
  6. Now nodes_len is 2 but the second node is still uninitialized!
  7. Later, mongoc_client_destroy iterates the nodes list, destroying them.
  8. Since nodes_len is 2, the client tries to destroy the second, uninitialized node.
  9. If the stream field in the second node happens to be non-NULL, the client calls stream->close on it and segfaults.

This was particularly hard for the customer's test to reproduce, because the driver has to connect while the test framework is reconfiguring auth in the replica set, and the buffer reallocation has to return a non-zero chunk of memory.

The fix is to properly manage nodes_len: don't increment it to N unless N nodes have actually been initialized. Additionally, zero-out all nodes right after reallocating the nodes list to ensure all data structures are NULL.
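
The sequence and the fix can be modeled with a toy Python sketch. This is not the libmongoc code: the classes, the `reconnect` and `destroy` functions, and the `GARBAGE` sentinel (standing in for uninitialized memory that happens to be non-NULL) are all invented for illustration, and the Python `AttributeError` stands in for the segfault.

```python
class Stream:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class Node:
    def __init__(self, stream=None):
        self.stream = stream


GARBAGE = object()  # models uninitialized, non-NULL memory


class Client:
    def __init__(self):
        self.nodes = [Node(Stream())]  # primary found by the first loop
        self.nodes_len = 1


def grow(nodes, n, zero):
    """Model realloc: new slots hold garbage unless explicitly zeroed."""
    extra = n - len(nodes)
    return nodes + [Node(None if zero else GARBAGE) for _ in range(extra)]


def reconnect(client, peers, auth_ok, fixed):
    """Model of the second loop: authenticate with each peer."""
    old_len = client.nodes_len
    client.nodes = grow(client.nodes, len(peers), zero=fixed)
    if not fixed:
        client.nodes_len = len(peers)  # bug: counts uninitialized slots
    for i, peer in enumerate(peers):
        if not auth_ok(peer):
            return False  # models `goto CLEANUP`
        if i >= old_len:
            client.nodes[i] = Node(Stream())
            if fixed:
                client.nodes_len = i + 1  # count only initialized nodes
    return True


def destroy(client):
    """Model of client teardown: close every counted node's stream."""
    for node in client.nodes[:client.nodes_len]:
        if node.stream is not None:
            node.stream.close()  # AttributeError on GARBAGE models the crash


def crashes(fixed):
    client = Client()
    reconnect(client, ["primary", "secondary"], lambda peer: False, fixed)
    try:
        destroy(client)
    except AttributeError:
        return True
    return False


print("buggy version crashes:", crashes(fixed=False))
print("fixed version crashes:", crashes(fixed=True))
```

The fixed path applies both parts of the remedy: new slots are zeroed when the list grows, and the count only advances once a node has really been initialized.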

Details about the bug and the fix are in Jira.

It's satisfying to nail this bug after a long chase, but also painful: that code path is long gone in the 1.2.0 branch, replaced by Samantha Ritter's implementation of the Server Discovery And Monitoring spec. If I could've released 1.2.0 by now we'd have saved all the trouble of debugging the old code. It only redoubles my drive to release a beta of the new driver this quarter and get out of this bind.

Image: The deep sea fish eurypharynx pelecanoides, Popular Science Monthly, 1883.