3 days ago
Last year, Craigslist moved their archive from MySQL to MongoDB. After the initial setup, we spoke with Jeremy Zawodny, a software engineer at Craigslist and the author of High Performance MySQL (O’Reilly), and asked him some questions about their cluster. In advance of their talk at MongoSF tomorrow, we caught up with Jeremy to get the scoop on what’s happening at Craigslist one year later.
Last time we spoke you were building a MongoDB store for 5 Billion Documents. What do your numbers look like now?
We’re currently approaching the 3 billion mark. The 5 billion number was our target capacity when building the system. Back then we had about 2.5 billion documents that we migrated into MongoDB, and we’ve continued to add documents ever since then.
Can you share an anecdote on the benefits of replica sets/sharding and something you’d like to change/improve in that feature set?
Sharding has made it easy to handle growth. We know that when the day comes, we can add an additional replica set to our cluster, and it will help ease any space crunch. Replica sets have been great for handling machine failures. We’ve had several machines lock up on us and require unplanned reboots or service. Throughout that time, the worst thing we’ve seen is some read-only time for the cluster metadata (when a config server dropped), but we’ve been able to serve requests without stopping.
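(For readers unfamiliar with these operations, growing a sharded cluster and inspecting a replica set look roughly like this from the mongo shell; the shard, replica set, and host names below are made up for illustration.)
// Add another replica set as a new shard to ease a space crunch
sh.addShard("archiveRS3/mongo7.example.com:27017")
sh.status()   // confirm the new shard and watch chunks rebalance onto it
// After an unplanned reboot, check member health and add a replacement node
rs.status()
rs.add("mongo8.example.com:27017")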
Can you share some anecdotes about how your team adjusted to working with MongoDB?
Our systems administration team made some adjustments to the original deployment and configuration so that it meshed better with our home-grown management and deployment tools. But other than that, MongoDB has been pretty hands-off for most of the team. As long as it behaves well (which it does), we don’t need to touch it that often.
Any exciting plans for your MongoDB clusters?
We’ve been testing MongoDB in a few new roles at Craigslist and plan to present some of those challenges at MongoSF on May 4th.
Thanks to Jeremy for giving us some insight into how MongoDB powers Craigslist!
4 days ago
Stripe offers a simple platform for developers to accept online payments. They are a long-time user of MongoDB and have built a powerful and flexible system for enabling transactions on the web. In advance of their talk at MongoSF on MongoDB for high availability, Stripe engineer Greg Brockman spoke with us about what’s going on with MongoDB at Stripe.
Stripe has a heavy write load with large query volumes. Can you give us some insight into your tips and tricks for wrangling MongoDB’s replica sets on your system?
Getting replica sets up and running is actually incredibly easy. I used to run MySQL clusters where configuring and maintaining replication was a pain, and it was a joy to just be able to run “rs.add(node)” and watch it join the cluster.
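(For readers who haven’t tried it, the setup Greg describes looks roughly like this in the mongo shell; the set name and host names are placeholders.)
// Initiate a replica set with two members, then add a third with one call
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.com:27017" },
    { _id: 1, host: "db2.example.com:27017" }
  ]
})
rs.add("db3.example.com:27017")   // the new node syncs and joins automatically
rs.status()                       // watch it move from RECOVERING to SECONDARY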
To avoid losing any operations even if we lose our database primary, we structure our application so that all writes are idempotent. We then wrap our calls to the MongoDB driver in a retry block. If a call fails because our MongoDB cluster is currently reconfiguring, we try the operation again (with the usual backoff and timeout you’d expect from a scheme like this). We’re also very careful to avoid operations that could evict hot data from the cache. Running unindexed queries is an obvious example, but we’ve found that running a large multi-update can have production impact as well.
So when we need to change our schema for an entire collection of documents, we’ll usually run a slower (but non-impactful) document-by-document migration at the application level.
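Stripe’s migrations live in their application code, but the pattern can be sketched directly in the mongo shell; the collection and field names below are invented for illustration.
// Hypothetical document-by-document migration: split a legacy "name" field
// into "first" and "last" without issuing one large multi-update.
db.customers.find({ name: { $exists: true } }).forEach(function (doc) {
  var parts = doc.name.split(" ");
  // Each update touches a single document, so it is cheap, idempotent,
  // and safe to re-run if the migration is interrupted part-way through.
  db.customers.update(
    { _id: doc._id, name: { $exists: true } },
    { $set: { first: parts[0], last: parts.slice(1).join(" ") },
      $unset: { name: 1 } }
  );
});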
Let’s take a step back to your past talk at MongoSV ‘11 — what are you doing with Monster (Stripe’s native events processing system for payments)?
Monster is our framework for event production and event consumption, which uses MongoDB as a highly-available, persistent queue. With Monster, our engineers can start logging a new type of event with only a few lines of code, and at any time in the future can add a consumer that will automatically be passed relevant events (possibly even historical ones). We use it for a variety of purposes: structured logging, incremental updating of state (such as people’s graphs of payment volume), and background jobs.
Lots of people are innovating in the financial space — in particular building APIs for mobile payments. For those just starting up, why should they use MongoDB?
As a payments processor, our uptime is incredibly important. We were initially drawn to MongoDB because replica sets make it incredibly easy to run your database in a highly-available fashion. I came from a world where my database master could never be rebooted, since there was no zero-downtime failover strategy even for routine maintenance — MongoDB gives you this almost out of the box.
MongoDB also makes it easy to do zero-downtime migrations, with features such as background index builds and allowing multiple schemas in a single collection. Anyone caring about their availability should look very hard at MongoDB.
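For example, an index can be built in the background so reads and writes keep flowing while it is created; the collection and field names here are illustrative, not Stripe’s.
// Build an index without blocking other operations on the database
// (mongo shell syntax from the era of this interview)
db.charges.ensureIndex({ customer_id: 1, created: -1 }, { background: true })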
How are you guys using the Ruby driver in your system? Anything interesting?
We’ve banged on the Ruby driver in a variety of configurations, ensuring that it behaves properly when exposed to all the possible failures we can imagine (or have noticed) our database servers experiencing. These days, we’re very happy with how robust the Ruby driver is against the wide variety of failure modes of the distributed MongoDB nodes.
What’s your wish list for the Ruby Driver?
I wish there were a configuration option for forcing reads from a secondary. (Right now, you can request that reads be on a secondary if one is available, but they’ll start reading from the primary if no secondary is available.)
What’s on Stripe’s engineering roadmap?
While making Stripe available outside the US is our top priority, our biggest engineering challenge at the moment is scaling our systems to keep up with the phenomenal growth we’ve been experiencing.
Many thanks to Greg for taking the time to tell us a bit about the magic at Stripe.
5 days ago
We’re revamping MongoDB’s documentation. The newly designed MongoDB Manual has an improved reference section and an index for simplified search. It will also eventually support multiple MongoDB versions at the same time.
This project is a work in progress, and things are changing quickly. Our goal is to consolidate, sharpen, organize, and continue to improve the documentation in support of MongoDB. For now, the new docs will live alongside the original MongoDB Wiki. But over the next few months, we’ll be transitioning everything to the new manual.
In the spirit of open source, the docs are housed on GitHub. Feedback is welcome! Feel free to fork the repository and issue pull requests. You can also open tickets in JIRA, and we’ll promptly address any suggestions.
1 week ago
Variety is a lightweight tool that gives a feel for an application’s schema, as well as any schema outliers. It is particularly useful for
• quickly learning how data is structured, if inheriting a codebase with a production data dump
• finding all rare keys in a given collection
An Easy Example
We’ll make a collection within the MongoDB shell:
db.users.insert({name: "Tom", bio: "A nice guy.", pets: ["monkey", "fish"], someWeirdLegacyKey: "I like Ike!"});
db.users.insert({name: "Dick", bio: "I swordfight."});
db.users.insert({name: "Harry", pets: "egret"});
db.users.insert({name: "Geneviève", bio: "Ça va?"});
Read More
1 week ago
With their strong roots in JavaScript, Node.js and MongoDB have always been a natural fit, and the Node.js community has embraced MongoDB with a number of open source projects. To support the community’s efforts, 10gen is happy to announce that the MongoDB Node.js driver will join the existing set of 12 officially supported drivers for MongoDB.
The Node.js driver was born out of necessity. Christian Kvalheim started using Node.js in early 2010. He had heard good things about MongoDB but was disappointed to discover that no native driver had yet been developed. So he got to work. Over the past two years, Christian has done amazing work on his driver, and it has matured through the contributions of a large community and the rigors of production. For some time now, the driver has been on par with 10gen’s officially supported MongoDB drivers, so we were naturally thrilled to welcome Christian full-time at 10gen to continue his work on the Node.js driver.
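As a quick taste, here is a minimal sketch of connecting and inserting a document with the driver. It uses the MongoClient connection helper, which may differ from the exact API available when this was written, and assumes a local mongod and `npm install mongodb`:
// Connect to a local MongoDB and insert one document
var MongoClient = require('mongodb').MongoClient;
MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
  if (err) throw err;
  db.collection('posts').insert(
    { title: 'Hello from Node.js', tags: ['node', 'mongodb'] },
    function (err, result) {
      if (err) throw err;
      console.log('inserted', result);
      db.close();
    });
});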
Read More
2 months ago
Groovy and Grails’ speed and simplicity are a perfect match for the flexibility and power of MongoDB. Dozens of plugins and libraries connect the two, making it a breeze to get Grooving with MongoDB.
Using Grails with MongoDB
For the purpose of this post, let’s pretend we’re writing a hospital application that uses the following domain class.
class Doctor {
    String first
    String last
    String degree
    String specialty
}
There are a few Grails plugins that help communicate with MongoDB, but one of the easiest to use is the one created by Graeme Rocher himself (the Grails project lead). The MongoDB GORM plugin allows you to persist all your domain classes in MongoDB. To use it, first remove any unneeded persistence-related plugins after you’ve executed the ‘grails create-app’ command, and install the MongoDB GORM plugin.
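Once the plugin is installed, a saved Doctor ends up as a plain document in MongoDB. Querying the database directly from the mongo shell might show something like the following (the collection name and numeric _id reflect assumptions about the plugin’s defaults):
// What a persisted Doctor might look like when inspected from the shell
db.doctor.findOne()
{ "_id" : NumberLong(1), "first" : "Gregory", "last" : "House", "degree" : "MD", "specialty" : "Diagnostics" }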
Read More
3 months ago
Available in the 2.1 development release; it will be stable for production in the 2.2 release.
Built by Chris Westin (@cwestin63)
MongoDB has built-in MapReduce functionality that can be used for complex analytics tasks. However, we’ve found that most of the time, users need the kind of group-by functionality that SQL implementations have. This can be implemented with map/reduce, but doing so is more work than it is in SQL. In version 2.1, MongoDB is introducing a new aggregation framework that will make it much easier to obtain the kind of results SQL group-by is used for, without having to write custom JavaScript.
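For example, a SQL-style GROUP BY that sums an amount per customer collapses to a single $group stage (the collection and field names here are hypothetical):
// Aggregation framework equivalent of:
//   SELECT cust_id, SUM(amount) AS total FROM orders GROUP BY cust_id
db.orders.aggregate(
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
)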
Read More
4 months ago
Last week over 1,100 developers came together for MongoSV, the largest MongoDB conference to date. 10gen kicked off MongoSV with our inaugural MongoDB Masters program, which brought together MongoDB evangelists from around the world.
At the opening keynote, 10gen CTO Eliot Horowitz demoed a Twitter app for #mongoSV tweets, featuring the new aggregation framework expected in the MongoDB 2.2 release. The app gathers all the tweets sent out with the hashtag #mongoSV and organizes them by recency and retweet count. Get the source code for the demo app here.
Read More
5 months ago
A new preview release of the MongoDB controller for Azure is available. This release includes support for replica sets, and over the coming months we’ll be adding support for MongoDB’s sharding facilities. We’ll also be working to integrate MongoDB more tightly with the features of the Azure platform. The controller defines an Azure worker role that represents a MongoDB cluster. Each member of a replica set is hosted by an instance of an Azure worker role, so the size of the replica set is determined by the number of instances configured for the replica set worker roles. Each replica set worker role creates a child process to run the mongod server process.
Read More
6 months ago
MongoDB Monitoring Service (MMS) documentation now available: http://mms.10gen.com/help/
6 months ago
Last week, 250 developers converged at the Microsoft New England Research and Development Center for Mongo Boston. Highlights from the event include presentations on MongoDB 2.0, how MTV leverages MongoDB for its CMS, rapid prototyping, and more.

More photos from the event are available on the MongoDB Flickr page.
If you missed the event, join the Boston MongoDB User Group, which also meets at NERD. The next meeting is on November 15.
7 months ago
On Thursday MongoDB core committer Eliot Horowitz presented to the New York MongoDB User Group on the latest features in v2.0. The event was hosted by Sailthru, a MongoDB-powered startup doing intelligent email marketing. The meetup was announced Monday night and within a day was oversubscribed. After the presentation, we all went out for drinks to celebrate the release.
The NY MUG has over 1,000 members and meets monthly. There are also MongoDB user groups in San Francisco, Washington DC, London, Boston, Japan, and more. And if there isn’t a MongoDB meetup in your city, we’re happy to support you if you would like to start one.
Read More
7 months ago
An important aspect to keep in mind with databases is the cost of cache reheating after a server restart. Consider the following diagram which shows several cache servers (e.g., memcached) in front of a database server.

This sort of setup is common and can work quite well when appropriate; it removes read load from the database and allows more RAM to be utilized for scaling (when the database doesn’t scale horizontally). But what happens if all the cache servers restart at the same time, say, due to a power glitch in a data center?
Read More
7 months ago
The MongoDB development team is pleased to announce the release of version 2.0.0. Version 2.0 is the latest stable release, following the March 2011 release of version 1.8. This release includes many new features, improvements to existing features, and performance enhancements.
Please note that while version 2.0 is a significant new release, it is numbered 2.0 simply because 1.8 + 0.2 = 2.0; the upgrade from 1.6 to 1.8, for example, was similar in scope.
Highlights of the 2.0 release:
Read More
8 months ago
There are a lot of good things about JSON: it’s a standards-based, language-independent representation of object-like data. It’s also easy to read, for users and programmers alike. Each document is only about data, not complex object graphs and links, so it’s easy to inspect without knowing all the code of an application.
Further, JSON is “schemaless”: we do not have to predefine our (protocol) schema. This can be quite helpful. Imagine RPC’ing data from client A to server B with a fixed schema for the messages: on a schema change, both need to be updated with the new schema, and if there are many components in the system it’s even more complicated, of course. There is some analogy here to XML, which can (optionally) be schemaless.
It would be nice to have a binary representation of JSON. That is what BSON is all about.
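A quick way to see this at work is the mongo shell, where Object.bsonsize() reports how many bytes a document occupies once serialized (the collection name below is arbitrary):
// The shell stores documents as BSON under the hood
db.demo.insert({ name: "BSON", created: new Date(), score: 3.14 })
Object.bsonsize(db.demo.findOne())   // size in bytes of the binary-encoded document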
Read More
Source: http://blog.mongodb.org