With Spark, data analysis and machine learning applications can be flexibly scaled across computing clusters. The framework, maintained by the Apache Software Foundation, has become a standard tool for analyzing and evaluating large amounts of data. The PySpark API forms the interface between the Spark engine and your own Python programs.
The two-day online training "Big Data Analysis with PySpark" provides a thorough introduction to the Spark framework through many practical exercises. You will learn how to develop productive, scalable Python applications on top of Spark, and you will gain insight into Spark SQL for working with tabular data, the Spark Streaming API, GraphX for graph computations, and Spark ML for machine learning.
The workshop will take place from November 4 to 5, 2021, and is limited to a maximum of 15 participants to guarantee an intensive exchange between the speaker and the attendees.
The speaker, Dr. Christian Staudt, is a computer scientist and experienced data scientist whose focus is on data mining, big data, machine learning, and artificial intelligence. To participate successfully, you should have a solid basic knowledge of Python and some experience with big data applications.
Further information and registration: