Meet up

Hadoop User Group Meeting #2

Tuesday, 14th April at Sun Customer Briefing Centre, London

This meetup is run by HUGUK: Hadoop User Group UK. Starts at 10:00 AM.

The next meetup is shaping up nicely: we now have a venue booked, thanks to Sun Microsystems! You can find the preliminary schedule of the meetup below. All presentation times include questions and a short break to set up the next presentation.

Practical MapReduce

MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function...
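As a rough illustration of the model described above, here is a minimal single-process word count in Python. It is a sketch, not Hadoop code: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, and it assumes the usual form of the model, in which the reduce function merges all intermediate values that share the same intermediate key.

```python
from collections import defaultdict


def map_words(key, value):
    """Map: take a (document name, text) pair and emit (word, 1) pairs."""
    for word in value.split():
        yield word.lower(), 1


def reduce_counts(key, values):
    """Reduce: merge all intermediate values that share one key."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    """Toy in-memory driver (hypothetical, not a Hadoop API)."""
    # Shuffle step: group intermediate values by their intermediate key.
    groups = defaultdict(list)
    for key, value in records:
        for inter_key, inter_value in mapper(key, value):
            groups[inter_key].append(inter_value)
    # Reduce step: one reducer call per distinct intermediate key.
    return dict(reducer(k, vs) for k, vs in sorted(groups.items()))


docs = [
    ("doc1", "Hadoop runs MapReduce"),
    ("doc2", "MapReduce runs jobs"),
]
counts = run_mapreduce(docs, map_words, reduce_counts)
print(counts)
```

In a real cluster the grouping step is distributed across machines; the toy driver only shows how the two user-supplied functions fit together.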

Tom White

Tom White is one of the foremost experts on Hadoop. He has been an Apache Hadoop committer since February 2007, and is a Member of the Apache Software Foundation. Tom is a software engineer at Cloudera, where he has worked, since its foundation, on t

Introducing Apache Mahout

Mahout's goal is to build scalable, Apache licensed machine learning libraries. Initially, the interest is on building out the ten machine learning libraries...

Isabel Drost

Isabel Drost is a committer to Mahout, the project to build scalable, Apache licensed machine learning libraries.

Scalable Reasoning on RDF Documents with Hadoop and HBase

Michele Catasta gave a talk on Scalable Reasoning on RDF Documents with Hadoop and HBase.

The Terrier Project

Terrier is a robust and modular Information Retrieval engine. From version 2.2, Terrier supports indexing large collections using Hadoop MapReduce. This uses the single-pass indexer to index sections of each collection (as batches of files) as map tasks...

Iadh Ounis

Iadh Ounis works as a Reader at the Department of Computing Science at the University of Glasgow and is the principal investigator of the Terrier project. The Terrier Project is doing a lot of work on Web, Blog and Enterprise search, Desktop, Intrane

Having Fun with PageRank and MapReduce

This talk is about implementing PageRank with MapReduce. PageRank is a link analysis algorithm used by the Google Internet search engine that assigns...

Paolo Castagna

Paolo Castagna is a researcher in the Enterprise Informatics Lab at HP, focused on developing the technology foundations and tools for the next generation of enterprise information management systems.

Apache HBase

HBase is the Hadoop database. It's an open-source, distributed, column-oriented store modeled after the Google paper, Bigtable...

Michael Stack

Michael Stack is on the Hadoop Project Management Committee and a full-time committer to HBase, part of the Hadoop project.

Hypercubes in HBase

Zohmg is a data store for multi-dimensional time series. It's built on top of HBase and provides tools for data aggregation using Hadoop. Data retrieval is designed to be quick; most of the aggregation happens during data import...

HADOOP-1722 and Typed Bytes

This talk is about the Hadoop Core Jira issue HADOOP-1722, whose summary reads: Make streaming handle non-utf8 byte array...

Klaas Bosteels

Klaas Bosteels is a Hadoop expert and works at the Department of Applied Mathematics and Computer Science, Ghent University, where he is working towards a Ph.D. degree as a member of the Fuzziness and Uncertainty Modelling Research Group, in close

