At Last.fm, the number of "write once, run never again" Hadoop programs has been growing steadily, especially in the research team. Since Java is a very verbose and compiled programming language, it is not very suitable for writing such programs. A better way to quickly write MapReduce programs is provided by Hadoop Streaming, but it still is less convenient than it could be. Dumbo is a simple enhancement to Hadoop Streaming that addresses this issue. More specifically, it is a Python module that makes Hadoop Streaming elegant and easy.
YOU MAY ALSO LIKE:
Dumbo: Hadoop streaming made elegant and easy
Klaas Bosteels is a Hadoop expert and works at the Department of Applied Mathematics and Computer Science, Ghent University, where he is working towards a Ph.D. degree as a member of the Fuzziness and Uncertainty Modelling Research Group, in close