Real-Time Financial Market Data Processing and Prediction application
Astronomy Broker based on Apache Spark
Structured Streaming applied to robot data from a ROS-Gazebo simulation environment using Apache Spark. Data is collected in Kafka, analyzed by Apache Spark, and stored in Cassandra.
Ingests real-time tweets from the Twitter API using Tweepy into Kafka, processes them with Apache Spark Structured Streaming and TextBlob sentiment analysis, then loads the results into the time-series database InfluxDB for monitoring in a Grafana dashboard.
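A minimal sketch of this kind of Kafka-to-Cassandra pipeline is shown below. The topic, keyspace, table, and telemetry schema are hypothetical, and writing to Cassandra assumes the Spark Cassandra connector is on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("robot-telemetry").getOrCreate()

# Hypothetical schema for robot telemetry messages published to Kafka.
schema = StructType([
    StructField("robot_id", StringType()),
    StructField("x", DoubleType()),
    StructField("y", DoubleType()),
    StructField("ts", TimestampType()),
])

# Read the raw Kafka stream and parse the JSON payload.
telemetry = (spark.readStream
             .format("kafka")
             .option("kafka.bootstrap.servers", "localhost:9092")
             .option("subscribe", "robot_telemetry")
             .load()
             .select(from_json(col("value").cast("string"), schema).alias("m"))
             .select("m.*"))

# Write each micro-batch to Cassandra via the Spark Cassandra connector.
def write_to_cassandra(batch_df, batch_id):
    (batch_df.write
     .format("org.apache.spark.sql.cassandra")
     .mode("append")
     .options(table="telemetry", keyspace="robotics")
     .save())

query = (telemetry.writeStream
         .foreachBatch(write_to_cassandra)
         .option("checkpointLocation", "/tmp/checkpoints/telemetry")
         .start())
query.awaitTermination()
```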
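As a rough sketch of the scoring step, the snippet below applies a TextBlob polarity UDF to tweet text read from Kafka; the broker address and topic name are placeholders, and the InfluxDB/Grafana sink is omitted (it would typically live in a foreachBatch writer instead of the console sink used here).

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import DoubleType
from textblob import TextBlob

spark = SparkSession.builder.appName("tweet-sentiment").getOrCreate()

# UDF computing TextBlob polarity in [-1, 1] for each tweet's text.
@udf(returnType=DoubleType())
def polarity(text):
    return TextBlob(text).sentiment.polarity if text else None

# Tweets are assumed to arrive on a Kafka topic as plain-text values.
tweets = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "tweets")
          .load()
          .select(col("value").cast("string").alias("text")))

scored = tweets.withColumn("sentiment", polarity(col("text")))

# For a quick check, print scores to the console; an InfluxDB writer would
# typically replace this sink via foreachBatch.
query = (scored.writeStream
         .format("console")
         .option("truncate", "false")
         .start())
query.awaitTermination()
```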
Databricks Real-Time Fintech Monitoring Pipeline: Hands-on lab to build a streaming fraud detection system using Auto Loader, watermarked deduplication, stream-static joins, and windowed rules engines in Databricks. Covers dual-SLA architecture for real-time alerts and batch compliance reporting.
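A compact sketch of the core streaming techniques named above (watermarked deduplication, a stream-static join, and a windowed rule) follows. Paths, table names, schema, and thresholds are illustrative; on Databricks the source would normally be Auto Loader (the cloudFiles format) rather than the plain file source used here.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, window, sum as _sum

spark = SparkSession.builder.appName("fraud-rules").getOrCreate()

# Hypothetical stream of card transactions landing as JSON files.
txns = (spark.readStream
        .format("json")
        .schema("txn_id STRING, card_id STRING, amount DOUBLE, event_time TIMESTAMP")
        .load("/data/txns"))

# Watermarked deduplication: drop replays of the same txn_id arriving
# within the 10-minute lateness bound.
deduped = (txns
           .withWatermark("event_time", "10 minutes")
           .dropDuplicates(["txn_id", "event_time"]))

# Stream-static join against a static lookup table of risky cards (assumed).
risky = spark.read.table("risky_cards")
flagged = deduped.join(risky, "card_id", "left_semi")

# Windowed rule: alert when a risky card spends more than 10,000 in 5 minutes.
alerts = (flagged
          .groupBy(window(col("event_time"), "5 minutes"), col("card_id"))
          .agg(_sum("amount").alias("spend"))
          .where(col("spend") > 10000))

query = (alerts.writeStream
         .outputMode("update")
         .format("console")
         .option("checkpointLocation", "/tmp/checkpoints/fraud")
         .start())
query.awaitTermination()
```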
Real-time ETL pipeline for financial data (Kafka, PySpark).
Databricks PySpark Certification Prep Lab: Build an e-commerce analytics pipeline covering Spark DataFrame API, Structured Streaming, data skew handling with salting, broadcast joins, and Pandas UDFs. Designed for the Databricks Certified Associate Developer for Apache Spark exam.
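The skew-handling and join techniques mentioned in that lab can be sketched as below: salting spreads a hot join key across several partitions, while a broadcast join avoids the shuffle entirely when the dimension table is small. Table paths, column names, and the bucket count are assumptions for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, rand, floor, broadcast, explode, array, lit

spark = SparkSession.builder.appName("skew-salting").getOrCreate()

# Hypothetical tables: a large fact table skewed on customer_id and a
# smaller customer dimension.
orders = spark.read.parquet("/data/orders")
customers = spark.read.parquet("/data/customers")

SALT_BUCKETS = 8

# Salt the skewed side: append a random bucket 0..7 to the join key.
salted_orders = orders.withColumn("salt", floor(rand() * SALT_BUCKETS).cast("int"))

# Explode the small side so every (customer_id, salt) pair exists.
salted_customers = customers.withColumn(
    "salt", explode(array([lit(i) for i in range(SALT_BUCKETS)])))

# Join on the composite key; hot customer_ids are now spread over 8 partitions.
joined = salted_orders.join(salted_customers, ["customer_id", "salt"]).drop("salt")

# When the dimension fits in memory, a broadcast join sidesteps the problem.
joined_bcast = orders.join(broadcast(customers), "customer_id")
```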
Development scaffold for test-driven PySpark Structured Streaming with fast local testing.
Efficiently tackle large datasets and perform big data analysis with Spark and Python
Structured Spark Streaming with Apache Kafka and Twitter
A framework for incremental streaming joins and incremental streaming aggregations over change data feeds from Databricks Delta
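The starting point for such a framework is an incremental read of a Delta table's change data feed. The sketch below assumes the source table was created with delta.enableChangeDataFeed = true; the table names and checkpoint path are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("cdf-incremental").getOrCreate()

# Stream the change data feed of a Delta table; each row carries
# _change_type / _commit_version metadata columns.
changes = (spark.readStream
           .format("delta")
           .option("readChangeFeed", "true")
           .option("startingVersion", 0)
           .table("sales.orders"))

# Keep only inserts and post-update images for incremental processing downstream.
upserts = changes.where(col("_change_type").isin("insert", "update_postimage"))

query = (upserts.writeStream
         .format("delta")
         .option("checkpointLocation", "/tmp/checkpoints/orders_cdf")
         .toTable("sales.orders_incremental"))
query.awaitTermination()
```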
Demo Spark Structured Streaming + Apache Kafka + Apache Cassandra
An end-to-end real-time analytics and anomaly detection pipeline with PySpark Structured Streaming on user activity logs from Kafka.
End-to-end big data pipeline for Amazon product reviews: Spark batch ETL + feature engineering, MLflow-tracked sentiment (TF-IDF + LogReg) and fraud detection (GBT + behavioral features) models, Spark Structured Streaming scorer, and a FastAPI dashboard.
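A possible shape for the streaming scorer in such a pipeline is to load an MLflow-tracked model as a Spark UDF and apply it to the incoming stream, as sketched below. The model URI, schema, and input path are placeholders, not the repository's actual configuration.

```python
import mlflow.pyfunc
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, struct

spark = SparkSession.builder.appName("review-scorer").getOrCreate()

# Load an MLflow-tracked sentiment model as a Spark UDF (URI is a placeholder).
sentiment_udf = mlflow.pyfunc.spark_udf(
    spark, model_uri="models:/review_sentiment/Production", result_type="double")

# Incoming reviews, e.g. landed as JSON files or parsed from a Kafka topic.
reviews = (spark.readStream
           .format("json")
           .schema("review_id STRING, text STRING")
           .load("/data/reviews"))

# Score each micro-batch with the registered model.
scored = reviews.withColumn("sentiment", sentiment_udf(struct(col("text"))))

query = scored.writeStream.format("console").start()
query.awaitTermination()
```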
This project links a MongoDB cluster and a Kafka cluster with a standalone PySpark cluster, all running locally.
Streaming and analyzing data from smart systems using Apache Kafka and Apache Spark.
Structured Streaming app that reads files from a local folder as they are added, treats them as streaming data, applies transformations to the new data, and writes the results to an output directory.
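The file-source pattern that app describes can be sketched as follows; the watched folder, output path, and schema are illustrative.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("folder-stream").getOrCreate()

# Watch a local folder for newly added CSV files and treat them as a stream.
incoming = (spark.readStream
            .format("csv")
            .schema("id INT, value DOUBLE, ts TIMESTAMP")
            .option("header", "true")
            .load("file:///tmp/incoming"))

# Any DataFrame transformation applies to each newly discovered file.
transformed = incoming.where("value > 0")

# Append the results as Parquet files in the output directory.
query = (transformed.writeStream
         .format("parquet")
         .option("path", "file:///tmp/output")
         .option("checkpointLocation", "file:///tmp/checkpoints/folder")
         .outputMode("append")
         .start())
query.awaitTermination()
```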