Constructing real-time production systems is easier than ever, with new use cases enabled by new big data frameworks, and complexity and cost cut by the cloud. But big data processing comes with some problems. This two-day intensive course will teach you to build systems which can contend with the scale of data required for real-time processing, and add real business value to your insights.
Join expert Jesse Anderson and gain the skills you need to choose the right cloud provider for your company, and create systems which can meet the demands of real-time big data, on this intensive two-day course.
Explore the latest real-time frameworks, then learn how to build real-time data pipelines in the cloud, ingest big data, process it with Apache Spark Streaming, and analyze and visualize the results using Kafka REST.
Upon completion of the course, you will have gained both the understanding and the skills you need to create large-scale real-time systems using Apache Kafka and Apache Spark Streaming.
Learn how to:
- Create large scale real-time data pipelines using both Apache Kafka and Apache Spark Streaming
- Ingest and process data and create products from sources in real-time and at scale
- Understand how real-time distributed systems are different from batch systems
- Create Kafka producers and consumers
- Process data in Kafka with Spark Streaming and place the results back into Kafka
- Visualize data and show data in real-time on a web page
What the community says
"This class provided a lot insight into Real-time/Near Real-time Data Engineering and integration into the cloud. It introduced technologies and technical opinions of those technologies that I was unaware of or had little awareness of. The instructor Jesse Anderson was very knowledgeable and presented the material very clearly and was able to address all questions. Best class I've taken in years!!" Attendee on 1st Jan 2018
"Very knowledgeable, the extensive experience of the instructor is palpable." Attendee on 1st Jan 2018
Real-time Data Pipelines
- Real-time Technologies
- Real-time Pipelines
- Pros and Cons of Real-time
Using the Cloud
- Cloud Providers
- Real-time Technologies
- Choosing a Provider
- Real-time Ingestion
- Real-time ETL
- About Kafka
- Kafka Internals
- Kafka API
- Real-time Data Processing
- Real-time Processing Technologies
- Spark Streaming
- Streaming API
- Advanced Streaming
- Analysis of Data
This course is intended for software engineers, QA, and analysts who want to learn more about big data systems.
To participate in this course, you will need intermediate-level Java knowledge.
Bring your own hardware
You are required to bring your own laptop for this course, so that you can develop with an environment you are familiar with.