As far as I can see, the idea of Hadoop is to split the work, map the pieces to different nodes, and then reduce the results. That is just a very naive understanding, but I guess it covers what Hadoop is doing for the most part. The nice thing is that Hadoop provides a scalable mechanism to implement this map-reduce idea by assigning responsibilities to different machines (which act as nodes like NameNode, DataNode, JobTracker, TaskTracker and so on). With a carefully configured cluster, these nodes can work together and finish some heavy-loaded tasks.
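The split/map/reduce flow above can be sketched in a few lines of plain Python. This is just a toy single-process illustration of the data flow, not Hadoop's actual API, and all the function names here are my own:

```python
from collections import defaultdict

def map_phase(chunk):
    # Mapper: emit (word, 1) pairs for each word in one chunk of input.
    return [(word, 1) for word in chunk.split()]

def shuffle(pairs):
    # Shuffle: group all values by key, as the framework would do
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: combine the grouped values for each key into a final result.
    return {key: sum(values) for key, values in groups.items()}

def map_reduce(chunks):
    # The "split" is already done here: in a real cluster each chunk
    # would be sent to a different node to be mapped in parallel.
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(mapped))

counts = map_reduce(["to be or", "not to be"])
print(counts)  # {'to': 2, 'be': 2, 'or': 1, 'not': 1}
```

In a real cluster the interesting part is exactly what this sketch hides: the chunks live on different machines, the mappers run in parallel, and the shuffle happens over the network.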
Then I talked about it with my friend, who is also doing some computing research right now. He suggested that I read some of Google's publications about this kind of map-reduce. Since Hadoop grew out of work at Yahoo, Google may have a different way of doing the same thing. Well, the problem is how I can get to know the secrets of Google's infrastructure. Everyone knows there is a Bigtable behind Google's search engine, but how does Google split the work and implement it? It is obviously more difficult to get this information from Google than from Hadoop :( Perhaps I have to read other people's research in this field instead of finding it out by myself.