The world of Copious Data (permit me to avoid the overexposed term Big Data) is currently dominated by Apache Hadoop, a clean-room version of the MapReduce computing model and a distributed, (mostly) reliable file system invented at Google.

But the MapReduce computing model is hard to use. It’s very coarse-grained and relatively inflexible. Translating many otherwise intuitive algorithms to MapReduce requires specialized expertise. The industry is already starting to look elsewhere…

However, the very name MapReduce tells us its roots: the core concepts of mapping and reducing, familiar from Functional Programming (FP). We’ll discuss how to return MapReduce, and Copious Data in general, to its ideal place, rooted in FP. We’ll discuss the core operations (“combinators”) of FP that meet our requirements, finding the right granularity for modularity, myths about mutability and performance, and trends that are already moving us in the right direction. We’ll see why the dominance of Java in Hadoop is harming progress. You might think that concurrency is the “killer app” for FP, and maybe you’re right. I’ll argue that Copious Data is just as important for driving FP into the mainstream. Actually, FP has a long tradition in data systems, but we’ve been calling it SQL…
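To make the “mapping and reducing” roots concrete, here is a minimal word-count sketch using the map and reduce combinators from plain Python (the two-line corpus is invented for illustration; this shows the FP idea behind MapReduce, not Hadoop’s actual API):

```python
from functools import reduce
from collections import Counter
from operator import add

# A tiny illustrative "corpus"; in a real system each element
# would be a record read from a distributed file system.
lines = [
    "functional programming for copious data",
    "map and reduce come from functional programming",
]

# Map phase: each line independently becomes a bag of (word -> 1) counts.
# This step is embarrassingly parallel, which is what MapReduce exploits.
mapped = map(lambda line: Counter(line.split()), lines)

# Reduce phase: merge the per-line counts with an associative operation.
# Associativity is what lets a cluster combine partial results in any order.
word_counts = reduce(add, mapped, Counter())

print(word_counts["functional"])  # total occurrences across all lines
```

The whole algorithm is two combinators and an associative merge; the same shape scales from one process to a cluster, which is the point the talk makes about FP’s fit for data systems.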
Copious Data, the “Killer App” for Functional Programming
Dean Wampler
Product Engineering Director for Accelerated Discovery
IBM Research