Advanced Streaming Big Data with Spark - eLearning
450,00 EUR
- 25 hours
Step into real-time data processing with the Streaming Big Data with Spark Training, designed to help you build high-performance, scalable data pipelines that process information as it happens. This course introduces you to Apache Spark’s streaming capabilities, enabling you to work with continuous data flows for modern analytics and decision-making systems.
Key Features
Language
Course and material in English
Level
Intermediate - Advanced level
Access
1 year of access to the learning platform
9 Hours of On-Demand Videos
with 25+ hours recommended study time
38 Guided Hands-on Exercises
13 Auto-Graded Assessments
33 Recall Quizzes
3 Real-world projects
Certificate
Program completion certification included

Learning Outcomes
By the end of this course, you will be able to:
Runtime
Gain a complete understanding of Spark runtime architecture
DataFrame
Perform essential DataFrame operations and functions in Spark
Stream
Learn the fundamentals of stream processing with Spark
Kafka
Explore direct integration of Spark Streaming with Apache Kafka
Amazon
Work with Spark Streaming using Amazon Kinesis
Windowing
Understand and apply sliding window operations in stream processing

Course timeline
The Spark Runtime
Lesson 01
- Understanding the Spark RDD
- Understanding the Spark DataFrame
- Spark Runtime Architecture Overview
ETL with Spark
Lesson 02
- Map Transformations
- The Transformations
- Basic Actions
- Key-Value Pair Transformations
- Join Operations
- Numeric RDD Operations and Sampling Functions
- Partitioning in Spark
- Controlling Partitions in Spark
- Using External Programs with Spark
SparkSQL and DataFrames
Lesson 03
- Spark SQL Architecture
- DataFrame API Overview
- Creating DataFrames
- DataFrame Data Model and Schemas
- Basic DataFrame Operations
- DataFrame Functions
- Set Operations and Aggregations in DataFrames
- DataFrame Storage and Output
- DEMO Spark SQL and DataFrames
Introduction to Stream Processing with Spark
Lesson 04
- Introduction to Spark Streaming
- Introduction to DStreams
- The DStream Operations
Stateful processing with Spark Streaming
Lesson 05
- The State Operations
- Introduction to Event Sourcing
- DEMO Stateful Streaming with Spark
Sliding Window Operations with Spark Streaming
Lesson 06
- Windowing Operations
- Windowing Functions
- DEMO Sliding Window Operations with Spark Streaming
Introduction to Structured Streaming
Lesson 07
- Structured Streaming Overview
- Output Modes and Triggering with Structured Streaming
- DEMO Introduction to Structured Streaming
Introduction to Apache Kafka
Lesson 08
- Apache Kafka Overview and Architecture
- Messaging with Kafka
- DEMO Local Installation of Apache Kafka
Kafka Integration with Spark Streaming
Lesson 09
- Using Spark Streaming with Apache Kafka
Using the Receiver Approach
Lesson 10
- DEMO Local Installation of Apache Kafka
- Using the Direct Approach
- DEMO Spark Streaming with Apache Kafka using the Direct Approach
Kafka Integration with Structured Streaming
Lesson 11
- Structured Streaming and Kafka
- Reading and Writing Data to Kafka using Structured Streaming
- DEMO Kafka and Structured Streaming
Using Spark Streaming with Kinesis
Lesson 12
- Using the Amazon Kinesis Producer and Client Libraries
- DEMO Intro to Amazon Kinesis
Using Spark Streaming with Kinesis
Lesson 13
- Using Spark Streaming with Amazon Kinesis
- DEMO Using Spark Streaming with Amazon Kinesis
- Using Structured Streaming with Amazon Kinesis
- DEMO Using Structured Streaming with Amazon Kinesis
Additional Spark Streaming Integrations
Lesson 14
- Spark Streaming using MQTT
- Spark Streaming and Apache Flume
- Spark Streaming and Twitter
- Spark Streaming and Snowflake
- DEMO Structured Streaming with Snowflake

Who Should Enroll in This Program?
Data engineers working with real-time data systems
Big data professionals and Spark developers
Software engineers transitioning into data engineering roles
Data scientists interested in streaming analytics
Backend developers building data-intensive applications
IT professionals working with large-scale distributed systems
Prerequisites
- Basic understanding of programming (Java, Scala, or Python preferred)
- Familiarity with big data concepts and distributed systems
- Basic knowledge of data processing or analytics workflows
- Understanding of databases and SQL (helpful but not mandatory)
- No prior Spark Streaming experience is required
Statements
Licensing and accreditation
This course is offered under the Partner Program Agreement and complies with the License Agreement requirements.
Equity Policy
Candidates are encouraged to reach out to AVC for guidance and support throughout the accommodation process.

