3 days ago
Last year, Craigslist moved their archive from MySQL to MongoDB. After the initial setup, we spoke with Jeremy Zawodny, a software engineer at Craigslist and the author of High Performance MySQL (O’Reilly), and asked him some questions about their cluster. In advance of their talk at MongoSF tomorrow, we caught up with Jeremy to get the scoop on what’s happening at Craigslist one year later.
Last time we spoke you were building a MongoDB store for 5 Billion Documents. What do your numbers look like now?
We’re currently approaching the 3 billion mark. The 5 billion number was our target capacity when building the system. Back then we had about 2.5 billion documents that we migrated into MongoDB, and we’ve continued to add documents ever since then.
Can you share an anecdote on the benefits of replica sets/sharding and something you’d like to change/improve in that feature set?
Sharding has made it easy to handle growth. We know that when the day comes, we can add an additional replica set to our cluster, and it will help ease any space crunch. Replica sets have been great for handling machine failures. We’ve had several machines lock up on us and require unplanned reboots or service. Throughout that time, the worst thing we’ve seen is some read-only time for the cluster metadata (when a config server dropped), but we’ve been able to serve requests without stopping.
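(For readers unfamiliar with these operations, growing a sharded cluster and inspecting a replica set look roughly like this from the mongo shell; the shard, replica set, and host names below are made up for illustration.)
// Add another replica set as a new shard to ease a space crunch
sh.addShard("archiveRS3/mongo7.example.com:27017")
sh.status()   // confirm the new shard and watch chunks rebalance onto it
// After an unplanned reboot, check member health and add a replacement node
rs.status()
rs.add("mongo8.example.com:27017")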
Can you share some anecdotes about how your team adjusted to working with MongoDB?
Our systems administration team made some adjustments to the original deployment and configuration so that it meshed better with our home-grown management and deployment tools. But other than that, MongoDB has been pretty hands-off for most of the team. As long as it behaves well (which it does), we don’t need to touch it that often.
Any exciting plans for your MongoDB clusters?
We’ve been testing MongoDB in a few new roles at Craigslist and plan to present some of those challenges at MongoSF on May 4th.
Thanks to Jeremy for giving us some insight into how MongoDB powers Craigslist!
4 days ago
Stripe offers a simple platform for developers to accept online payments. They are a long-time user of MongoDB and have built a powerful and flexible system for enabling transactions on the web. In advance of their talk at MongoSF on MongoDB for high availability, Stripe engineer Greg Brockman spoke with us about what’s going on with MongoDB at Stripe.
Stripe has a heavy write load with large query volumes. Can you give us some insight into your tips and tricks for wrangling MongoDB’s replica sets on your system?
Getting replica sets up and running is actually incredibly easy. I used to run MySQL clusters where configuring and maintaining replication was a pain, and it was a joy to just be able to run “rs.add(node)” and watch it join the cluster.
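(For readers who haven’t tried it, the setup Greg describes looks roughly like this in the mongo shell; the set name and host names are placeholders.)
// Initiate a replica set with two members, then add a third with one call
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.com:27017" },
    { _id: 1, host: "db2.example.com:27017" }
  ]
})
rs.add("db3.example.com:27017")   // the new node syncs and joins automatically
rs.status()                       // watch it move from RECOVERING to SECONDARY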
To avoid losing any operations even if we lose our database primary, we structure our application so that all writes are idempotent. We then wrap our calls to the MongoDB driver in a retry block. If a call fails because our MongoDB cluster is currently reconfiguring, we try the operation again (with the usual backoff and timeout you’d expect from a scheme like this). We’re also very careful to avoid operations that could evict hot data from the cache. Running unindexed queries is an obvious example, but we’ve found that running a large multi-update can have production impact as well.
So when we need to change our schema for an entire collection of documents, we’ll usually run a slower (but non-impactful) document-by-document migration at the application level.
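Stripe’s migrations live in their application code, but the pattern can be sketched directly in the mongo shell; the collection and field names below are invented for illustration.
// Hypothetical document-by-document migration: split a legacy "name" field
// into "first" and "last" without issuing one large multi-update.
db.customers.find({ name: { $exists: true } }).forEach(function (doc) {
  var parts = doc.name.split(" ");
  // Each update touches a single document, so it is cheap, idempotent,
  // and safe to re-run if the migration is interrupted part-way through.
  db.customers.update(
    { _id: doc._id, name: { $exists: true } },
    { $set: { first: parts[0], last: parts.slice(1).join(" ") },
      $unset: { name: 1 } }
  );
});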
Let’s take a step back to your past talk at MongoSV ‘11 — what are you doing with Monster (Stripe’s native events processing system for payments)?
Monster is our framework for event production and event consumption, which uses MongoDB as a highly-available, persistent queue. With Monster, our engineers can start logging a new type of event with only a few lines of code, and at any time in the future can add a consumer that will automatically be passed relevant events (possibly even historical ones). We use it for a variety of purposes: structured logging, incremental updating of state (such as people’s graphs of payment volume), and background jobs.
Lots of people are innovating in the financial space — in particular building APIs for mobile payments. For those just starting up, why should they use MongoDB?
As a payments processor, our uptime is incredibly important. We were initially drawn to MongoDB because replica sets make it incredibly easy to run your database in a highly-available fashion. I came from a world where my database master could never be rebooted, since there was no zero-downtime failover strategy even for routine maintenance — MongoDB gives you this almost out of the box.
MongoDB also makes it easy to do zero-downtime migrations, with features such as background index builds and allowing multiple schemas in a single collection. Anyone caring about their availability should look very hard at MongoDB.
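For example, an index can be built in the background so reads and writes keep flowing while it is created; the collection and field names here are illustrative, not Stripe’s.
// Build an index without blocking other operations on the database
// (mongo shell syntax from the era of this interview)
db.charges.ensureIndex({ customer_id: 1, created: -1 }, { background: true })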
How are you guys using the Ruby driver in your system? Anything interesting?
We’ve banged on the Ruby driver in a variety of configurations, ensuring that it behaves properly when exposed to all the possible failures we can imagine (or have noticed) our database servers experiencing. These days, we’re very happy with how robust the Ruby driver is against the wide variety of failure modes of the distributed MongoDB nodes.
What’s your wish list for the Ruby Driver?
I wish there were a configuration option for forcing reads from a secondary. (Right now, you can request that reads be on a secondary if one is available, but they’ll start reading from the primary if no secondary is available.)
What’s on Stripe’s engineering roadmap?
While making Stripe available outside the US is our top priority, our biggest engineering challenge at the moment is scaling our systems to keep up with the phenomenal growth we’ve been experiencing.
Many thanks to Greg for taking the time to tell us a bit about the magic at Stripe.
5 days ago
We’re revamping MongoDB’s documentation. The newly designed MongoDB Manual has an improved reference section and an index for simplified search. It will also eventually support multiple MongoDB versions at the same time.
This project is a work in progress, and things are changing quickly. Our goal is to consolidate, sharpen, organize, and continue to improve the documentation in support of MongoDB. For now, the new docs will live alongside the original MongoDB Wiki. But over the next few months, we’ll be transitioning everything to the new manual.
In the spirit of open source, the docs are housed on GitHub. Feedback is welcome! Feel free to fork the repository and issue pull requests. You can also open tickets in JIRA, and we’ll promptly address any suggestions.
1 week ago
Variety is a lightweight tool that gives a feel for an application’s schema, as well as any schema outliers. It is particularly useful for
• quickly learning how data is structured, if inheriting a codebase with a production data dump
• finding all rare keys in a given collection
An Easy Example
We’ll make a collection within the MongoDB shell:
db.users.insert({name: "Tom", bio: "A nice guy.", pets: ["monkey", "fish"], someWeirdLegacyKey: "I like Ike!"});
db.users.insert({name: "Dick", bio: "I swordfight."});
db.users.insert({name: "Harry", pets: "egret"});
db.users.insert({name: "Geneviève", bio: "Ça va?"});
Read More
1 week ago
With their strong roots in JavaScript, Node.js and MongoDB have always been a natural fit, and the Node.js community has embraced MongoDB with a number of open source projects. To support the community’s efforts, 10gen is happy to announce that the MongoDB Node.js driver will join the existing set of 12 officially supported drivers for MongoDB.
The Node.js driver was born out of necessity. Christian Kvalheim started using Node.js in early 2010. He had heard good things about MongoDB but was disappointed to discover that no native driver had yet been developed. So he got to work. Over the past two years, Christian has done amazing work on his driver, and it has matured through the contributions of a large community and the rigors of production. For some time now, the driver has been on par with 10gen’s officially supported MongoDB drivers, so we were naturally thrilled to welcome Christian full-time at 10gen to continue his work on the Node.js driver.
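As a quick taste, here is a minimal sketch of connecting and inserting a document with the driver. It uses the MongoClient connection helper, which may differ from the exact API available when this was written, and assumes a local mongod and `npm install mongodb`:
// Connect to a local MongoDB and insert one document
var MongoClient = require('mongodb').MongoClient;
MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
  if (err) throw err;
  db.collection('posts').insert(
    { title: 'Hello from Node.js', tags: ['node', 'mongodb'] },
    function (err, result) {
      if (err) throw err;
      console.log('inserted', result);
      db.close();
    });
});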
Read More
2 months ago
Groovy and Grails’ speed and simplicity are a perfect match for the flexibility and power of MongoDB. Dozens of plugins and libraries connect the two, making it a breeze to get Grooving with MongoDB.
Using Grails with MongoDB
For the purpose of this post, let’s pretend we’re writing a hospital application that uses the following domain class.
class Doctor {
    String first
    String last
    String degree
    String specialty
}
There are a few Grails plugins that help communicate with MongoDB, but one of the easiest to use is the one created by Graeme Rocher himself (the Grails project lead). The MongoDB GORM plugin allows you to persist all your domain classes in MongoDB. To use it, first remove any unneeded persistence-related plugins after you’ve executed the ‘grails create-app’ command, and install the MongoDB GORM plugin.
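Once the plugin is installed, a saved Doctor ends up as a plain document in MongoDB. Querying the database directly from the mongo shell might show something like the following (the collection name and numeric _id reflect assumptions about the plugin’s defaults):
// What a persisted Doctor might look like when inspected from the shell
db.doctor.findOne()
{ "_id" : NumberLong(1), "first" : "Gregory", "last" : "House", "degree" : "MD", "specialty" : "Diagnostics" }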
Read More
3 months ago
Available in the 2.1 development release; it will be stable for production in the 2.2 release.
Built by Chris Westin (@cwestin63)
MongoDB has built-in MapReduce functionality that can be used for complex analytics tasks. However, we’ve found that most of the time, users need the kind of group-by functionality that SQL implementations have. This can be implemented with map/reduce, but doing so is more work than it is in SQL. In version 2.1, MongoDB is introducing a new aggregation framework that will make it much easier to obtain the kind of results SQL group-by is used for, without having to write custom JavaScript.
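For example, a SQL-style GROUP BY that sums an amount per customer collapses to a single $group stage (the collection and field names here are hypothetical):
// Aggregation framework equivalent of:
//   SELECT cust_id, SUM(amount) AS total FROM orders GROUP BY cust_id
db.orders.aggregate(
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
)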
Read More
4 months ago
Last week over 1,100 developers came together for MongoSV, the largest MongoDB conference to date. 10gen kicked off MongoSV with our inaugural MongoDB Masters program, which brought together MongoDB evangelists from around the world.
At the opening keynote, 10gen CTO Eliot Horowitz demoed a Twitter app for #mongoSV tweets, featuring the new aggregation framework expected in the MongoDB 2.2 release. The app gathers all the tweets sent out with the hashtag #mongoSV and organizes them by recency and retweet count. Get the source code for the demo app here.
Read More
5 months ago
A new preview release of the MongoDB controller for Azure is available. This release includes support for replica sets, and over the coming months we’ll be adding support for MongoDB’s sharding facilities. We’ll also be working to integrate MongoDB more tightly with the features of the Azure platform. The controller defines an Azure worker role that represents a MongoDB cluster. Each member of a replica set is hosted by an instance of an Azure worker role, so the size of the replica set is determined by the number of instances configured for the replica set worker roles. Each replica set worker role creates a child process to run the mongod server process.
Read More
6 months ago
MongoDB Monitoring Service (MMS) documentation now available: http://mms.10gen.com/help/
6 months ago
Last week, 250 developers converged at the Microsoft New England Research and Development Center for Mongo Boston. Highlights from the event include presentations on MongoDB 2.0, how MTV leverages MongoDB for its CMS, rapid prototyping, and more.

More photos from the event are available on the MongoDB Flickr page.
If you missed the event, join the Boston MongoDB User Group, which also meets at NERD. The next meeting is on November 15.
7 months ago
On Thursday MongoDB core committer Eliot Horowitz presented to the New York MongoDB User Group on the latest features in v2.0. The event was hosted by Sailthru, a MongoDB-powered startup doing intelligent email marketing. The meetup was announced Monday night and within a day was oversubscribed. After the presentation, we all went out for drinks to celebrate the release.
The NY MUG has over 1,000 members and meets monthly. There are also MongoDB user groups in San Francisco, Washington DC, London, Boston, Japan, and more. And if there isn’t a MongoDB meetup in your city, we’re happy to support you if you would like to start one.
Read More
7 months ago
An important aspect to keep in mind with databases is the cost of cache reheating after a server restart. Consider the following diagram which shows several cache servers (e.g., memcached) in front of a database server.

This sort of setup is common and can work quite well when appropriate; it removes read load from the database and allows more RAM to be utilized for scaling (when the database doesn’t scale horizontally). But what happens if all the cache servers restart at the same time, say, due to a power glitch in a data center?
Read More
7 months ago
The MongoDB development team is pleased to announce the release of version 2.0.0. Version 2.0 is the latest stable release, following the March 2011 release of version 1.8. This release includes many new features, improvements to existing features, and performance enhancements.
Please note that while version 2.0 is a significant new release, it is numbered 2.0 simply because 1.8 + 0.2 = 2.0; the upgrade from 1.6 to 1.8, for example, was similar in scope.
Highlights of the 2.0 release:
Read More
8 months ago
There are a lot of good things about JSON: it’s a standards-based, language-independent representation of object-like data. It’s also easy to read, for users and programmers alike. Each document is only about data, not complex object graphs and links, so it’s easy to inspect without knowing all the code of an application.
Further, JSON is “schemaless”: we do not have to predefine our (protocol) schema. This can be quite helpful. Imagine RPC’ing data from client A to server B with a fixed schema for the messages: on a schema change, both need to be updated with the new schema, and if there are many components in the system it’s even more complicated, of course. There is some analogy here to XML, which can (optionally) be schemaless.
It would be nice to have a binary representation of JSON. That is what BSON is all about.
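A quick way to see this at work is the mongo shell, where Object.bsonsize() reports how many bytes a document occupies once serialized (the collection name below is arbitrary):
// The shell stores documents as BSON under the hood
db.demo.insert({ name: "BSON", created: new Date(), score: 3.14 })
Object.bsonsize(db.demo.findOne())   // size in bytes of the binary-encoded document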
Read More
Source: http://blog.mongodb.org