How to configure Apache Spark Application

Configure Apache Spark Application – An Apache Spark application can be configured using properties that are set directly on a SparkConf object, which is then passed during SparkContext initialization.

Configure Apache Spark Application using Spark Properties

Following are the properties (and their descriptions) that can be used to tune a Spark application in the Apache Spark ecosystem. We shall discuss each of the following properties in detail, with examples :

Application Name

Property Name : spark.app.name

Default value : (none)

This is the name you give to your Spark application. The application name appears in the Web UI and in logs, which makes debugging and monitoring easier when multiple Spark applications are running on the same machine or cluster.

Following is an example to set the Spark application name :
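A minimal Scala sketch; the application name "Spark Application Name Example" and the master URL "local[2]" are illustrative values, not required ones :

import org.apache.spark.{SparkConf, SparkContext}

// Set the application name on SparkConf before creating the SparkContext.
val conf = new SparkConf()
  .setAppName("Spark Application Name Example")   // appears in the Web UI and logs
  .setMaster("local[2]")                          // illustrative master URL; adjust for your cluster
val sc = new SparkContext(conf)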

Number of Spark Driver Cores

Property Name : spark.driver.cores

Default value : 1

Exception : This property is considered only in cluster mode.

It represents the number of cores to use for the driver process.

Following is an example to set the number of Spark driver cores :
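A minimal sketch, assuming the application is submitted in cluster mode (the property is ignored otherwise); the value 2 is illustrative :

import org.apache.spark.SparkConf

// Allocate 2 cores to the driver process (takes effect in cluster mode only).
val conf = new SparkConf()
  .setAppName("Driver Cores Example")
  .set("spark.driver.cores", "2")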

Driver’s Maximum Result Size

Property Name : spark.driver.maxResultSize

Default value : 1g (meaning 1 GB)

Exception : The minimum permitted value is 1 MB.

This is the upper limit on the total size of serialized results of all partitions for each Spark action (for example, collect). Submitted jobs abort if the total size exceeds this limit. Setting it to ‘0’ means there is no upper limit; however, an unlimited or very high value may cause out-of-memory errors in the driver when large results are collected.
Following is an example to set the maximum result size for the Spark driver :
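A minimal sketch; the value "2g" is illustrative :

import org.apache.spark.SparkConf

// Abort any job whose serialized results across all partitions exceed 2 GB.
val conf = new SparkConf()
  .setAppName("Driver Max Result Size Example")
  .set("spark.driver.maxResultSize", "2g")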

Driver’s Memory Usage

Property Name : spark.driver.memory

Default value : 1g (meaning 1 GB)

Exception : If the Spark application is submitted in client mode, the driver JVM has already started by the time SparkConf is read, so this property must be set via the command-line option --driver-memory (or in the default properties file) rather than through SparkConf in the application.

This is the amount of memory allocated to the Spark driver process (the process in which SparkContext is initialized). If the driver's memory usage exceeds this limit, out-of-memory errors may occur in the driver.
Following is an example to set the maximum limit on the Spark driver's memory usage :
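A minimal sketch covering both deploy modes; the value "4g", the class name com.example.MyApp, and the jar name my-app.jar are hypothetical placeholders :

import org.apache.spark.SparkConf

// Cluster mode: the property can be set on SparkConf.
val conf = new SparkConf()
  .setAppName("Driver Memory Example")
  .set("spark.driver.memory", "4g")

// Client mode: set it on the command line instead, e.g.:
//   spark-submit --driver-memory 4g --class com.example.MyApp my-app.jar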