Any company that must analyse terabytes or even petabytes of data on a daily
or weekly basis needs to take a long, hard look at what it is paying to
analyse that data. The intersection of this ever-increasing data tsunami and
economics has produced new ways to structure and store very large volumes of
data with Apache Hadoop.
For most companies, however, Hadoop is not a complete, single solution for
analytics. Instead, it is one part of a hybrid data pyramid with three tiers:
raw data stored inexpensively in Hadoop; key data aggregated out of Hadoop
and placed in traditional data marts, which are more expensive but offer
better performance; and data required for speed-of-thought response times
residing in memory. As part of this data pyramid, Hadoop can dramatically
lower costs without compromising business performance.
While Hadoop is a very powerful technology, its command-line interface and
lack of front-end and back-end applications make it cumbersome to use. To
take full advantage of the power of Hadoop, a variety of tools and
applications are needed to assist with data loading, transformation and
analytics.
What attendees will learn:
- A checklist to help users determine their big data analysis needs
- The key questions to answer to get the right solutions implemented
- The critical applications needed to take full advantage of the power of
  Hadoop