How to use PySpark in PyCharm IDE

I have recently been exploring the world of big data and have started using Spark, a platform for cluster computing: it spreads data and computations across a cluster of multiple nodes (think of each node as a separate computer).
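To make the idea of spreading data and computations over nodes a bit more concrete, here is a minimal pure-Python sketch. It is not Spark and does not use the Spark API; the "nodes" are just list slices in a single process, and the function names (split_into_partitions, node_sum_of_squares) are made up for this illustration.

```python
from functools import reduce


def split_into_partitions(data, num_nodes):
    """Deal the items out round-robin, one slice per hypothetical node."""
    return [data[i::num_nodes] for i in range(num_nodes)]


def node_sum_of_squares(partition):
    """The work a single node would do on its own slice of the data."""
    return sum(x * x for x in partition)


numbers = list(range(1, 11))

# Step 1: spread the data over three pretend nodes.
partitions = split_into_partitions(numbers, num_nodes=3)

# Step 2: each node computes a partial result on its own slice.
partial_results = [node_sum_of_squares(p) for p in partitions]

# Step 3: combine the partial results into the final answer.
total = reduce(lambda a, b: a + b, partial_results, 0)
print(total)
```

On a real cluster the slices would live on different machines and the partial results would be computed in parallel, but the split, compute and combine pattern is the same.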

Spark can be used from three main languages: Scala, Python and Java. If you are curious as to which language to use, check out this great article by Datacamp.

We will be downloading PySpark, the Python API for Spark.