SF Crime Statistics with Spark Streaming Project

Introduction

The aim of the project is to create a streaming application with Spark that connects to a Kafka cluster, then reads and processes the data.

Requirements

Java 1.8.x
Scala 2.11.x
Spark 2.4.x
Kafka
Python 3.6 or above

How to use the application

To run the application, start the following components in this order:

Zookeeper:

/usr/bin/zookeeper-server-start config/zookeeper.properties

Kafka server:

/usr/bin/kafka-server-start config/server.properties

Insert data into topic:

python kafka_server.py
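
The repository's kafka_server.py is not reproduced here. As a rough sketch of what inserting data into a topic can look like, the snippet below serializes records to JSON and sends them one by one. It assumes the kafka-python package; the input file name (input.json), the topic name and the one-second delay are hypothetical placeholders, and the broker address is taken from the consumer command in this README.

```python
import json
import time


def encode_record(record):
    """Serialize one record (a dict) to UTF-8 encoded JSON bytes."""
    return json.dumps(record).encode("utf-8")


def publish_records(producer, topic, records, delay_seconds=0.0):
    """Send each record to `topic` through any object exposing send(topic, value).

    Returns the number of records handed to the producer.
    """
    sent = 0
    for record in records:
        producer.send(topic, encode_record(record))
        sent += 1
        if delay_seconds:
            time.sleep(delay_seconds)  # throttle to imitate a live feed
    return sent


if __name__ == "__main__":
    # Assumption: kafka-python is installed. It is imported here so the
    # helpers above can be used and tested without it.
    from kafka import KafkaProducer

    # Hypothetical input file: a JSON array of objects.
    with open("input.json") as f:
        records = json.load(f)

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    publish_records(producer, "topic-name", records, delay_seconds=1.0)
    producer.flush()  # block until buffered messages are delivered
```

Passing the producer in as an argument keeps the sending loop independent of the Kafka client, so it can be exercised with a stand-in object before a broker is running.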

Kafka consumer:

kafka-console-consumer --topic "topic-name" --from-beginning --bootstrap-server localhost:9092

Run Spark job:

spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.4 --master local[*] data_stream.py
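
For orientation, a minimal Structured Streaming job that subscribes to a Kafka topic and prints each message to the console could look like the sketch below. This is not the project's data_stream.py: the function names, the application name and the topic name are hypothetical, and the broker address is assumed to be the localhost:9092 used by the consumer command above. The master is left to the --master flag of spark-submit.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="earliest"):
    """Build the option map for Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def start_console_stream(spark, options):
    """Read the topic as a stream and print each message value to the console."""
    raw = spark.readStream.format("kafka").options(**options).load()
    # Kafka delivers the payload as binary; cast it to a readable string.
    messages = raw.selectExpr("CAST(value AS STRING) AS value")
    return (
        messages.writeStream
        .outputMode("append")
        .format("console")
        .option("truncate", "false")
        .start()
    )


if __name__ == "__main__":
    # Assumption: launched through spark-submit with the spark-sql-kafka
    # package on the classpath, as in the command above.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("KafkaStreamSketch").getOrCreate()
    options = kafka_source_options("localhost:9092", "topic-name")
    query = start_console_stream(spark, options)
    query.awaitTermination()
```

Starting from the earliest offsets mirrors the --from-beginning flag of the console consumer, so both tools show the same messages.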
