
structured-streaming

Here are 32 public repositories matching this topic...

Structured Streaming applied to robot data from a ROS-Gazebo simulation environment using Apache Spark. Data is collected in Kafka, analyzed by Apache Spark, and stored in Cassandra.

  • Updated Feb 6, 2022
  • Python

Databricks Real-Time Fintech Monitoring Pipeline: Hands-on lab to build a streaming fraud detection system using Auto Loader, watermarked deduplication, stream-static joins, and windowed rules engines in Databricks. Covers dual-SLA architecture for real-time alerts and batch compliance reporting.

  • Updated Apr 18, 2026
  • Python

Databricks PySpark Certification Prep Lab: Build an e-commerce analytics pipeline covering Spark DataFrame API, Structured Streaming, data skew handling with salting, broadcast joins, and Pandas UDFs. Designed for the Databricks Certified Associate Developer for Apache Spark exam.

  • Updated Apr 18, 2026
  • Python

End-to-end big data pipeline for Amazon product reviews: Spark batch ETL + feature engineering, MLflow-tracked sentiment (TF-IDF + LogReg) and fraud detection (GBT + behavioral features) models, Spark Structured Streaming scorer, and a FastAPI dashboard.

  • Updated May 7, 2026
  • Python
