Spark Python Application – Example

In this Spark tutorial, we shall learn to write a Spark application (word count) in Python and submit it to run in Spark with local input and minimal (no) options.

Prepare Input

For the word-count example, we provide a text file as input. The input file contains multiple lines, and each line has multiple words separated by white space (" ").

The input file is located at: /home/input.txt

Spark Application – Python Program

The application is a Python program, wordcount.py, that performs the word count in Apache Spark.
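A minimal sketch of what such a wordcount.py might look like is given here. It is a reconstruction under assumptions rather than a definitive listing: the output path /home/output, the application name, the helper split_words and the check that the input file exists are all illustrative choices. Only the input path /home/input.txt comes from this tutorial.

```python
import os
import sys
from operator import add

# Input path as stated in this tutorial; the output path is an assumption.
INPUT_PATH = "/home/input.txt"
OUTPUT_PATH = "/home/output"


def split_words(line):
    """Split one line of text into words on any run of whitespace."""
    return line.split()


def main(input_path, output_path):
    # Imported here so split_words can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext(appName="WordCount")  # application name is illustrative
    try:
        counts = (
            sc.textFile(input_path)        # one record per input line
            .flatMap(split_words)          # one record per word
            .map(lambda word: (word, 1))   # pair each word with a count of 1
            .reduceByKey(add)              # sum the counts for each word
        )
        # Writes a directory of part files; Spark raises an error
        # if the output directory already exists.
        counts.saveAsTextFile(output_path)
    finally:
        sc.stop()


if __name__ == "__main__":
    if os.path.exists(INPUT_PATH):
        main(INPUT_PATH, OUTPUT_PATH)
    else:
        print(f"Input file not found: {INPUT_PATH}", file=sys.stderr)
```

Because saveAsTextFile refuses to overwrite an existing directory, the output folder from a previous run has to be deleted or renamed before the application is submitted again.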

To submit the application to Spark, open a terminal or command prompt in the directory containing wordcount.py and run the following command (assuming spark-submit is on your PATH):

    spark-submit wordcount.py

Output

Check the output folder to which the word counts are written (the path is set in wordcount.py).

In this run, the output was written to two part files. Each file contains tuples of a word and its number of occurrences in the input file.
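To inspect the result outside Spark, the part files can be read back with plain Python. The sketch below assumes each line of a part file is the text form of a Python tuple such as ('word', 3), which is what saveAsTextFile produces for an RDD of (str, int) pairs. The function name read_counts and the directory argument are hypothetical.

```python
import ast
import glob
import os


def read_counts(output_dir):
    """Collect word counts from every part file in output_dir into a dict."""
    counts = {}
    # "part-*" skips marker files such as _SUCCESS and hidden checksum files.
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                # Each line is the text form of a (word, count) tuple.
                word, count = ast.literal_eval(line)
                counts[word] = counts.get(word, 0) + count
    return counts
```

For example, read_counts("/home/output") would return a dictionary mapping each word to its count, provided the application wrote its output to that (assumed) folder.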

 

Conclusion

In this tutorial, we have learnt to run a simple Spark application written in Python.