The objective of this project is to build the infrastructure that captures the events generated by users of a hypothetical mobile game.
Parts of the pipeline:
- Users interact with the mobile app.
- An app server handles user requests and logs events to Kafka.
- Spark pulls events from Kafka, transforms them and writes them to HDFS.
- Once the data is available in HDFS, Presto is used to query it.
- Tools: Python (Flask), Apache Kafka, Apache Spark, SQL, Presto, Jupyter Notebook.
- Related documents and code: HERE
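
The app-server and Spark stages listed above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the event names (`purchase_sword`, `join_guild`), the topic name `events`, the helper functions, and the choice of JSON as the message format are all assumptions made for illustration. An in-memory list stands in for the Kafka topic, and a plain Python function stands in for the Spark transformation.

```python
import json
from typing import Callable, Dict, List, Tuple


def build_event(event_type: str, request_headers: Dict[str, str], **attributes) -> dict:
    """App-server step (hypothetical): describe one user action as a flat record."""
    event = {"event_type": event_type}
    event.update(attributes)
    # Keep one piece of request metadata alongside the event itself.
    event["host"] = request_headers.get("Host", "unknown")
    return event


def log_event(send: Callable[[str, bytes], None], topic: str, event: dict) -> None:
    """Serialize the event as JSON bytes and pass it to a producer-like callable."""
    send(topic, json.dumps(event, sort_keys=True).encode("utf-8"))


def transform(raw_messages: List[bytes], wanted_type: str) -> List[dict]:
    """Consumer step (stand-in for Spark): decode messages and keep one event type."""
    events = [json.loads(message.decode("utf-8")) for message in raw_messages]
    return [event for event in events if event["event_type"] == wanted_type]


# In-memory stand-in for a Kafka topic; a real client's send method would replace it.
topic_log: List[Tuple[str, bytes]] = []


def fake_send(topic: str, value: bytes) -> None:
    topic_log.append((topic, value))


headers = {"Host": "localhost:5000"}
log_event(fake_send, "events", build_event("purchase_sword", headers, price=30))
log_event(fake_send, "events", build_event("join_guild", headers, guild="knights"))

# Downstream, only purchase events are kept, ready to be written out and queried.
purchases = transform([value for _, value in topic_log], "purchase_sword")
print(purchases)
```

Passing the send function in as a parameter keeps the event-building logic independent of the messaging layer, so the same functions could be exercised without a running Kafka broker.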