2 DAY COURSE

Real-time Data Engineering in the Cloud

Topics covered at CLOUD-DATA-01-02

Learn the benefits and challenges of using real-time Big Data systems in this two-day, hands-on guide to Big Data engineering in the cloud.

This course covers both open source Big Data services and those managed by cloud providers. You will create producers and consumers in Kafka, then use Apache Spark Streaming to process the data and send it back to Kafka. Finally, you will visualize the data in real time on a web page using Kafka REST.

Learn how to:

  • Create large-scale real-time systems using both Apache Kafka and Apache Spark Streaming
  • Understand how real-time distributed systems differ from batch systems
  • Create Kafka producers and consumers
  • Process data in Kafka with Spark Streaming and place the results back into Kafka
  • Visualize data in real time on a web page

What the community says

"This class provided a lot insight into Real-time/Near Real-time Data Engineering and integration into the cloud. It introduced technologies and technical opinions of those technologies that I was unaware of or had little awareness of. The instructor Jesse Anderson was very knowledgeable and presented the material very clearly and was able to address all questions. Best class I've taken in years!!"

Attendee on 1st Jan 2018

"Very knowledgeable, the extensive experience of the instructor is palpable."

Attendee on 1st Jan 2018

About the Author

Jesse Anderson

Jesse Anderson is a data engineer, creative engineer, and managing director of the Big Data Institute. Jesse trains employees on big data—including cutting-edge technology like Apache Kafka, Apache Hadoop, and Apache Spark. He has taught thousands of students at companies ranging from startups to Fortune 100 companies the skills to become data engineers. He is widely regarded as an expert in the field and recognized for his novel teaching practices. Jesse is published by O’Reilly and Pragmatic Programmers and has been covered in such prestigious media outlets as the Wall Street Journal, CNN, BBC, NPR, Engadget, and Wired. You can learn more about Jesse at Jesse-Anderson.com.

Program

Real-time Data Pipelines

  • Real-time Technologies
  • Real-time Pipelines
  • Pros and Cons of Real-time

Using the Cloud

  • Cloud Providers
  • Real-time Technologies
  • Choosing a Provider

Ingesting Data

  • Real-time Ingestion
  • Real-time ETL

Kafka

  • About Kafka
  • Kafka Internals
  • Kafka API

Processing Data

  • Real-time Data Processing
  • Real-time Processing Technologies

Spark Streaming

  • Spark Streaming
  • Streaming API
  • Advanced Streaming

Data Products

  • Analysis of Data
  • Dashboarding

Technologies Covered

In-depth Coverage

  • Apache Spark Streaming
  • Apache Kafka

Covered

  • Amazon Web Services
  • Microsoft Azure
  • Google Cloud
  • IBM SoftLayer
  • Amazon Kinesis
  • Microsoft Event Hubs
  • Google Pub/Sub
  • Apache NiFi
  • Apache Flink
  • Apache Apex
  • Apache Storm
  • Heron
  • Azure Stream Analytics
  • Google Cloud Dataflow
  • Apache Beam

Audience

This course is intended for software engineers, QA engineers, and analysts who want to learn more about Big Data systems.

Prerequisites

To participate in this course, you will need intermediate-level Java knowledge.

Bring your own hardware

You are required to bring your own laptop for this course so that you can develop in an environment you are familiar with.
