2 DAY COURSE

Real-time Systems with Spark Streaming and Kafka

Topics covered at CLOUD-DATA-01-02

This two-day intensive Real-time Systems course at Skills Matter gives you an in-depth look at building systems that can contend with the scale of data required for real-time processing. You will come away with a developed understanding of using Apache Spark Streaming and Kafka in the creation of large-scale, real-time data pipelines.

In a collaborative learning environment, Jesse Anderson, Managing Director of the Big Data Institute, will help you choose the right cloud provider for your company, build large-scale, real-time data pipelines, visualise and analyse data using Kafka REST, and integrate your systems into the cloud using Apache Kafka and Apache Spark Streaming. You will also meet other passionate engineers, exchange ideas on what you learn, and turn them into concrete approaches to your own projects.

- Create, analyse and integrate sophisticated real-time data systems -


Who you will be learning with

Data, software and creative engineers as well as QAs and analysts who want to develop their knowledge of Big Data systems.

How to apply these skills

Apply an advanced understanding of real-time data systems to position your organisation to be efficiently data-driven.

What next?

Book early to receive a discount on the course price. In doing so, you will not only grow your own skill set but also help us grow our community of over 140,000 passionate techies.

Learn how to:

  • Create large-scale, real-time data pipelines using both Apache Kafka and Apache Spark Streaming
  • Ingest and process data from sources in real time and at scale, and create data products from it
  • Understand how real-time distributed systems differ from batch systems
  • Create Kafka producers and consumers
  • Process data in Kafka with Spark Streaming and place the results back into Kafka
  • Visualise data and show it in real time on a web page

What the community says

"This class provided a lot of insight into Real-time/Near Real-time Data Engineering and integration into the cloud. It introduced technologies and technical opinions of those technologies that I was unaware of or had little awareness of. The instructor Jesse Anderson was very knowledgeable and presented the material very clearly and was able to address all questions. Best class I've taken in years!!"

Attendee

"Very knowledgeable, the extensive experience of the instructor is palpable."

Attendee

About the Author

Jesse Anderson

Jesse Anderson is a data engineer, creative engineer, and managing director of the Big Data Institute. Jesse trains employees on big data—including cutting-edge technology like Apache Kafka, Apache Hadoop, and Apache Spark. He has taught thousands of students at companies ranging from startups to Fortune 100 companies the skills to become data engineers. He is widely regarded as an expert in the field and recognized for his novel teaching practices. Jesse is published by O’Reilly and Pragmatic Programmers and has been covered in such prestigious media outlets as the Wall Street Journal, CNN, BBC, NPR, Engadget, and Wired. You can learn more about Jesse at Jesse-Anderson.com.

Program

Real-time Data Pipelines

  • Real-time Technologies
  • Real-time Pipelines
  • Pros and Cons of Real-time

Using the Cloud

  • Cloud Providers
  • Real-time Technologies
  • Choosing a Provider

Ingesting Data

  • Real-time Ingestion
  • Real-time ETL

Kafka

  • About Kafka
  • Kafka Internals
  • Kafka API

Processing Data

  • Real-time Data Processing
  • Real-time Processing Technologies

Spark Streaming

  • Spark Streaming
  • Streaming API
  • Advanced Streaming

Data Products

  • Analysis of Data
  • Dashboarding

Audience

This course is intended for software engineers, QAs and analysts who want to learn more about Big Data systems.

Prerequisites

To participate in this course, you will need intermediate-level Java knowledge.

Diversity Scholarship

Skills Matter is proud to offer a Diversity Scholarship, which supports people from groups traditionally underrepresented in the technology and/or open source communities who may not otherwise have the opportunity to attend our courses.

Scholarships are awarded based on a combination of need and impact. Scholarship recipients will receive a complimentary ticket to the course.

Please note, travel expenses are not covered under this scholarship and are the responsibility of the scholarship recipient.

Eligibility

Applicants should be from a traditionally underrepresented and/or marginalized group in the technology and/or open source community and be unable to attend without some assistance.

To apply please fill in this form.

Bring your own hardware

You are required to bring your own laptop for this course, so that you can develop with an environment you are familiar with.
