Friday, April 27, 2012

Meet Variety, a Schema Analyzer for MongoDB



Variety is a lightweight tool which gives a feel for an application’s schema, as well as any schema outliers. It is particularly useful for


• quickly learning how data is structured when inheriting a codebase with a production data dump


• finding all rare keys in a given collection


An Easy Example


We’ll make a collection, within the MongoDB shell:


db.users.insert({name: "Tom", bio: "A nice guy.", pets: ["monkey", "fish"], someWeirdLegacyKey: "I like Ike!"});

db.users.insert({name: "Dick", bio: "I swordfight."});

db.users.insert({name: "Harry", pets: "egret"});

db.users.insert({name: "Geneviève", bio: "Ça va?"});

Let’s use Variety on this collection, and see what it can tell us:


$ mongo test --eval "var collection = 'users'" variety.js

The above is executed from the terminal. “test” is the database containing the collection we are analyzing.


Variety’s output:


{ "_id" : { "key" : "_id" }, "value" : { "types" : [ "object" ] }, "totalOccurrences" : 4, "percentContaining" : 100 }

{ "_id" : { "key" : "name" }, "value" : { "types" : [ "string" ] }, "totalOccurrences" : 4, "percentContaining" : 100 }

{ "_id" : { "key" : "bio" }, "value" : { "types" : [ "string" ] }, "totalOccurrences" : 3, "percentContaining" : 75 }

{ "_id" : { "key" : "pets" }, "value" : { "types" : [ "string", "array" ] }, "totalOccurrences" : 2, "percentContaining" : 50 }

{ "_id" : { "key" : "someWeirdLegacyKey" }, "value" : { "type" : "string" }, "totalOccurrences" : 1, "percentContaining" : 25 }
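To make the fields in each result line concrete, here is a minimal sketch in plain JavaScript of how such a per-key summary could be computed over an in-memory array. It is not Variety's actual source: the helper names typeOf and summarizeKeys are hypothetical, the sample documents omit the _id that MongoDB adds, and only top-level keys are considered.

```javascript
// Illustrative only: a simplified, in-memory version of a per-key summary.
// Not Variety's implementation; typeOf and summarizeKeys are hypothetical names.

// Sample documents mirroring the inserts above (without the _id MongoDB adds).
const users = [
  { name: "Tom", bio: "A nice guy.", pets: ["monkey", "fish"], someWeirdLegacyKey: "I like Ike!" },
  { name: "Dick", bio: "I swordfight." },
  { name: "Harry", pets: "egret" },
  { name: "Geneviève", bio: "Ça va?" },
];

// Report "array" for arrays, otherwise whatever typeof returns.
function typeOf(value) {
  return Array.isArray(value) ? "array" : typeof value;
}

// For each top-level key, collect the distinct types seen, the number of
// documents containing the key, and that count as a percentage of all documents.
function summarizeKeys(docs) {
  const byKey = new Map();
  for (const doc of docs) {
    for (const [key, value] of Object.entries(doc)) {
      if (!byKey.has(key)) {
        byKey.set(key, { key, types: new Set(), totalOccurrences: 0 });
      }
      const entry = byKey.get(key);
      entry.types.add(typeOf(value));
      entry.totalOccurrences += 1;
    }
  }
  // Most common keys first, so rare keys end up at the bottom.
  return [...byKey.values()]
    .map((e) => ({
      key: e.key,
      types: [...e.types].sort(),
      totalOccurrences: e.totalOccurrences,
      percentContaining: (e.totalOccurrences / docs.length) * 100,
    }))
    .sort((a, b) => b.totalOccurrences - a.totalOccurrences);
}

const summary = summarizeKeys(users);
console.log(summary);
```

Sorting by totalOccurrences puts the most common keys first, which is the same ordering the output above happens to show; rare keys sink to the bottom where they are easy to spot.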

Every document in the “users” collection has a “name” and an “_id”. Most, but not all, have a “bio”. Interestingly, “pets” can be either an array or a string, although the application code only expects arrays of pets. Have we discovered a bug, or a remnant of a previous schema? The first document created also has a weird legacy key I’ve never seen before: the people who built the prototype didn’t clean up after themselves. Rare keys like this, whose contents are never used, have a strong potential to confuse developers, and could be removed once we verify our findings. For future use, results are also stored in a varietyResults database.
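Verifying findings like these usually means looking at the offending documents themselves. The sketch below, in plain JavaScript over an in-memory copy of the sample data, shows one way to list them. The helper names docsWithUnexpectedType and docsContainingKey are hypothetical and are not part of Variety or MongoDB.

```javascript
// Illustrative only: locate the documents behind the two oddities in the
// example. Helper names are hypothetical; the data is an in-memory copy of
// the sample inserts.
const users = [
  { name: "Tom", bio: "A nice guy.", pets: ["monkey", "fish"], someWeirdLegacyKey: "I like Ike!" },
  { name: "Dick", bio: "I swordfight." },
  { name: "Harry", pets: "egret" },
  { name: "Geneviève", bio: "Ça va?" },
];

// Report "array" for arrays, otherwise whatever typeof returns.
function typeOf(value) {
  return Array.isArray(value) ? "array" : typeof value;
}

// Documents that have the key, but with a type other than the expected one.
function docsWithUnexpectedType(docs, key, expectedType) {
  return docs.filter(
    (doc) => Object.prototype.hasOwnProperty.call(doc, key) && typeOf(doc[key]) !== expectedType
  );
}

// Documents that carry the given key at all.
function docsContainingKey(docs, key) {
  return docs.filter((doc) => Object.prototype.hasOwnProperty.call(doc, key));
}

const nonArrayPets = docsWithUnexpectedType(users, "pets", "array");
const legacyDocs = docsContainingKey(users, "someWeirdLegacyKey");

console.log(nonArrayPets.map((d) => d.name)); // names of users whose pets is not an array
console.log(legacyDocs.map((d) => d.name));   // names of users carrying the legacy key
```

Against a live collection, a shell query such as db.users.find({someWeirdLegacyKey: {$exists: true}}) should serve the same purpose as docsContainingKey here.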


Learn More!


Learn more about Variety now, including


• How to download Variety


• How to set a limit on the number of documents analyzed from a collection


• How to contribute, and report issues


Variety is free, open source, and written in 100% JavaScript. Check it out on GitHub.


-by James Cropcho


Source : http://blog.mongodb.org/post/21923016898/meet-variety-a-schema-analyzer-for-mongodb
