Apache Spark uses a computing cluster to scale complex data analyses and machine learning applications that would overwhelm a single computer. The software has established itself as the standard tool for analyzing large amounts of data. Python programs can access the Spark engine through the PySpark API.
The two-day online course "Big data analysis with PySpark" gives you a thorough introduction to the Spark framework through many hands-on exercises. You will learn how to develop production-ready, scalable Python applications based on Spark. The course covers Spark SQL for working with tabular data, the Spark Streaming API, GraphX for graph computations, and Spark ML for machine learning.
The online course will take place on August 4 and 5, 2021, on the online training platform BigBlueButton, so you can participate comfortably from your own desk. The course is limited to a maximum of 15 people, which ensures close support and a lively exchange between the instructor and the participants. If you book by July 7, you receive a 10% early-bird discount.
The instructor, computer scientist Dr. Christian Staudt, is an experienced data scientist with a focus on data mining, big data, machine learning, and AI. Participants should have solid basic knowledge of Python and some experience with big data applications.
Further information and registration: