Saturday, June 1, 2013

MongoDB Map-Reduce Example with Code

Today I was learning MongoDB's map-reduce feature. To understand the concept, I built the following example.

Assume we have a collection called “things” that contains the following records:



[
        {
                "_id" : ObjectId("51a9e753d7a333351958d6be"),
                "id_" : 1234,
                "title" : "Microsoft .Net",
                "tags" : [
                        "dot net",
                        ".net",
                        ". net"
                ]
        },
        {
                "_id" : ObjectId("51a9e7a7d7a333351958d6bf"),
                "id_" : 5678,
                "title" : "J2EE",
                "tags" : [
                        "Java",
                        "Servlet",
                        "JSP"
                ]
        },
        {
                "_id" : ObjectId("51a9ebefd7a333351958d6c0"),
                "id_" : 2,
                "title" : "J2EE",
                "tags" : [
                        "JSF",
                        "Portlet",
                        "REST"
                ]
        },
        {
                "_id" : ObjectId("51a9ec3fd7a333351958d6c1"),
                "id_" : 3,
                "title" : "Microsoft .Net",
                "tags" : [
                        "C#",
                        "VB.net",
                        "ASP.net"
                ]
        }
]


As you can see in the above collection, there are 2 records with the title “J2EE”, and each record has 3 tags. My objective is to find the total number of tags for each course title (for example, J2EE). For this I am going to use MongoDB's map-reduce.

The map function goes like this:


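var mapFun = function () {
    // "this" is the document currently being processed
    var k = this.title;        // key: the course title
    var v = this.tags.length;  // value: the number of tags on this record
    emit(k, v);
};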


In the above map function, for each record I count the number of “tags” and emit a key-value pair of the form (title, tag count). For example, each J2EE record emits (“J2EE”, 3).

MongoDB's map-reduce collects the key-value pairs emitted by the emit function. It then groups the pairs that share the same key and puts their values in an array. Hence the resulting record is of the form (“J2EE”, [3,3]).
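To make the emit-and-group step concrete, here is a minimal sketch in plain JavaScript that runs without a MongoDB server. The emit function and the grouped map below are my own stand-ins for what MongoDB does internally, not MongoDB's actual implementation, and the sample documents carry only the two fields the map function reads.

```javascript
// Plain-JavaScript simulation of the map phase; no MongoDB server is involved.
// Only the fields the map function reads are reproduced here.
const things = [
  { title: "Microsoft .Net", tags: ["dot net", ".net", ". net"] },
  { title: "J2EE", tags: ["Java", "Servlet", "JSP"] },
  { title: "J2EE", tags: ["JSF", "Portlet", "REST"] },
  { title: "Microsoft .Net", tags: ["C#", "VB.net", "ASP.net"] }
];

// Hypothetical stand-in for MongoDB's emit(): collect each value under its key.
const grouped = new Map();
function emit(key, value) {
  if (!grouped.has(key)) grouped.set(key, []);
  grouped.get(key).push(value);
}

// Same map function as in the post.
const mapFun = function () {
  var k = this.title;
  var v = this.tags.length;
  emit(k, v);
};

// MongoDB binds "this" to the current document; call() imitates that.
things.forEach(function (doc) { mapFun.call(doc); });

console.log(Object.fromEntries(grouped));
```

After the loop, both titles hold the array [3, 3], which is exactly the shape the reduce function receives.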

The reduce function goes like this:


var redFun = function (key, values) {
    // add up all the tag counts emitted for this key
    return Array.sum(values);
};


The above reduce function totals the values in the values array. In our example it would receive input like redFun(“J2EE”, [3,3]), so the return value would be 6, which is the total number of tags for the J2EE title.
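The reduce step can also be tried outside the shell. The sketch below is plain JavaScript: Array.sum is a helper that MongoDB provides rather than standard JavaScript, so I swap in a small sum() function of my own. It also checks one property worth knowing: MongoDB may call reduce more than once for the same key and feed an earlier result back in as one of the values, so the function has to give the same answer either way.

```javascript
// Array.sum is a MongoDB helper, not standard JavaScript, so this sketch
// defines an equivalent (hypothetical) sum() for use outside the shell.
function sum(values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
}

// Same logic as redFun in the post, with Array.sum swapped for sum().
const redFun = function (key, values) {
  return sum(values);
};

// One call with all the values for the key.
const direct = redFun("J2EE", [3, 3]);

// Re-reduce: an earlier partial result is passed back in as a value.
const partial = redFun("J2EE", [3]);
const reReduced = redFun("J2EE", [partial, 3]);

console.log(direct, reReduced); // 6 6
```

A plain sum passes this check because addition gives the same total no matter how the values are split up.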

Now our map-reduce call is written as follows:


db.things.mapReduce(
    mapFun,
    redFun,
    { out: "tagcount" }
)


The above mapReduce call writes its output to the “tagcount” collection. When we look at the “tagcount” collection, it looks like below:


[
        {
                "_id" : "J2EE",
                "value" : 6
        },
        {
                "_id" : "Microsoft .Net",
                "value" : 6
        }
]


So now we have the collection we needed.
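For anyone who wants to trace the whole flow without a database, here is a rough in-memory sketch in plain JavaScript. runMapReduce and emit are hypothetical helpers I wrote for illustration; they are not part of MongoDB, and the real mapReduce does far more (writing to a collection, handling large inputs, and so on). One detail is mirrored on purpose: MongoDB does not call reduce for a key that has only a single value.

```javascript
let groups; // filled by emit() during a run

// Hypothetical stand-in for MongoDB's emit().
function emit(key, value) {
  if (!groups.has(key)) groups.set(key, []);
  groups.get(key).push(value);
}

// Hypothetical in-memory imitation of mapReduce, for illustration only.
function runMapReduce(docs, mapFn, reduceFn) {
  groups = new Map();
  // Map phase: run mapFn with "this" bound to each document.
  docs.forEach(function (doc) { mapFn.call(doc); });
  const out = [];
  groups.forEach(function (values, key) {
    // Reduce is skipped for a key with a single value, as in MongoDB.
    const value = values.length > 1 ? reduceFn(key, values) : values[0];
    out.push({ _id: key, value: value });
  });
  // Sort by _id so the order matches the tagcount listing in the post.
  out.sort(function (a, b) { return a._id < b._id ? -1 : a._id > b._id ? 1 : 0; });
  return out;
}

const things = [
  { title: "Microsoft .Net", tags: ["dot net", ".net", ". net"] },
  { title: "J2EE", tags: ["Java", "Servlet", "JSP"] },
  { title: "J2EE", tags: ["JSF", "Portlet", "REST"] },
  { title: "Microsoft .Net", tags: ["C#", "VB.net", "ASP.net"] }
];

const mapFun = function () { emit(this.title, this.tags.length); };
const redFun = function (key, values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
};

const tagcount = runMapReduce(things, mapFun, redFun);
console.log(JSON.stringify(tagcount, null, 2));
```

The result has one document per title, each with a value of 6, matching the “tagcount” collection.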

