Set Index of DataFrame using Existing Column

To set the index of a DataFrame from an existing column in Pandas, call the set_index() method on the DataFrame and pass the column name as the argument.

In this tutorial, we will learn the syntax of the DataFrame.set_index() method and how to use it to set one of the existing columns as the index of a DataFrame.

Syntax

The syntax of the pandas DataFrame.set_index() method is

DataFrame.set_index(keys, *, drop=True, append=False, inplace=False, verify_integrity=False)

where

| Parameter | Value | Description |
|---|---|---|
| keys | label or array-like or list of labels/arrays | This parameter can be either a single column key, a single array of the same length as the calling DataFrame, or a list containing an arbitrary combination of column keys and arrays. Here, “array” encompasses Series, Index, np.ndarray, and instances of Iterator. |
| drop | bool, default value is True | Delete columns to be used as the new index. |
| append | bool, default value is False | Whether to append columns to existing index. |
| inplace | bool, default value is False | If True, modifies the DataFrame in place (do not create a new object). |
| verify_integrity | bool, default value is False | Check the new index for duplicates. Otherwise defer the check until necessary. Setting to False will improve the performance of this method. |
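
The effect of drop and verify_integrity is easier to see in a short run. The following is a minimal sketch, assuming the same kind of small DataFrame used in the examples of this tutorial; the variable names kept, dup, and raised are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [10, 20, 30], 'col_2': [40, 50, 60]})

# drop=False keeps col_1 as a regular column in addition to using it as the index
kept = df.set_index('col_1', drop=False)
print(list(kept.columns))  # ['col_1', 'col_2']
print(kept.index.name)     # col_1

# A DataFrame whose col_1 contains a duplicate value (10 appears twice)
dup = pd.DataFrame({'col_1': [10, 10, 30], 'col_2': [40, 50, 60]})

# verify_integrity=True raises ValueError when the new index has duplicates
try:
    dup.set_index('col_1', verify_integrity=True)
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

With the default verify_integrity=False, the second call would succeed and produce an index with duplicate labels.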

Return Value

  • DataFrame or
  • None if inplace=True.
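
The two return values can be checked directly. This is a small sketch under the same assumptions as the examples in this tutorial; new_df and ret are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [10, 20, 30], 'col_2': [40, 50, 60]})

# Without inplace, a new DataFrame is returned
new_df = df.set_index('col_1')
print(type(new_df).__name__)  # DataFrame

# With inplace=True, df is modified and the call returns None
ret = df.set_index('col_1', inplace=True)
print(ret)  # None
```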

Examples

Existing Column as Index for DataFrame

In the following program, we take a DataFrame with two columns and a default index. We then set the index of this DataFrame with the column ‘col_1’.

By default, set_index() does not modify the original DataFrame; it returns a new DataFrame with the new index.

Example.py

import pandas as pd

data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60]}
df = pd.DataFrame(data)

# Use col_1 as the index; set_index() returns a new DataFrame and leaves df unchanged
result = df.set_index('col_1')
print('Original DataFrame')
print(df)
print('\nResulting DataFrame')
print(result)

Output

Original DataFrame
   col_1  col_2
0     10     40
1     20     50
2     30     60

Resulting DataFrame
       col_2
col_1       
10        40
20        50
30        60
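
Once a column is the index, its values act as row labels. As a brief illustration (a sketch, not part of the original program), a row can then be selected by label with .loc; the name value is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [10, 20, 30], 'col_2': [40, 50, 60]})
result = df.set_index('col_1')

# Select the row whose index label (a former col_1 value) is 20
value = result.loc[20, 'col_2']
print(value)  # 50

# The original DataFrame still has its default integer index
print(df.index.tolist())  # [0, 1, 2]
```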

Set Column as Index In-place

We can modify the original DataFrame in place by passing inplace=True. In this case, set_index() returns None.

Example.py

import pandas as pd

data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60]}
df = pd.DataFrame(data)

# Set col_1 as the index of df itself; with inplace=True the call returns None
df.set_index('col_1', inplace=True)
print(df)

Output

       col_2
col_1       
10        40
20        50
30        60
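
A common alternative to inplace=True is to assign the returned DataFrame back to the same variable. The sketch below assumes the same data as above and should leave df in the same state as the in-place call.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [10, 20, 30], 'col_2': [40, 50, 60]})

# Rebind the name df to the DataFrame returned by set_index()
df = df.set_index('col_1')

print(df.index.tolist())  # [10, 20, 30]
print(list(df.columns))   # ['col_2']
```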

Set Multiple Columns as Index

We can also set multiple columns of a DataFrame as its index by passing a list of column names. The result is a DataFrame with a MultiIndex.

In the following program, we take a DataFrame with three columns: col_1, col_2, and col_3. We use col_1 and col_2 as the index of this DataFrame.

Example.py

import pandas as pd

data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60], 'col_3': [70, 80, 90]}
df = pd.DataFrame(data)

# Use col_1 and col_2 together as the index, which creates a two-level MultiIndex
df.set_index(['col_1', 'col_2'], inplace=True)
print(df)

Output

             col_3
col_1 col_2       
10    40        70
20    50        80
30    60        90

The list passed as keys can also combine an Index object with a column name to create a MultiIndex.
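
A minimal sketch of a MultiIndex built from an Index object and a column follows. The Index values ('a', 'b', 'c') and its name 'key' are made up for illustration; the only requirement is that the Index has the same length as the DataFrame.

```python
import pandas as pd

data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60], 'col_3': [70, 80, 90]}
df = pd.DataFrame(data)

# An illustrative Index with one label per row of df
extra = pd.Index(['a', 'b', 'c'], name='key')

# Mix the Index object and a column name in the keys list
result = df.set_index([extra, 'col_1'])

print(list(result.index.names))  # ['key', 'col_1']
print(result.index.nlevels)      # 2
print(list(result.columns))      # ['col_2', 'col_3']
```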

Append Column to Index of DataFrame

We can also append a column to the existing index of a DataFrame by passing append=True to the set_index() method. The existing index is kept as the first level, and the column becomes an additional level.

Example.py

import pandas as pd

data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60], 'col_3': [70, 80, 90]}
df = pd.DataFrame(data)

# Keep the existing default index and add col_1 as a second index level
df.set_index('col_1', inplace=True, append=True)
print(df)

Output

         col_2  col_3
  col_1              
0 10        40     70
1 20        50     80
2 30        60     90
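
The blank header above the first index level in this output reflects that the default index has no name. The following sketch, using the same data, inspects the level names of the resulting index; appended is an illustrative name.

```python
import pandas as pd

data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60], 'col_3': [70, 80, 90]}
df = pd.DataFrame(data)

appended = df.set_index('col_1', append=True)

# The first level is the unnamed default index, the second level is col_1
print(list(appended.index.names))  # [None, 'col_1']
print(appended.index.nlevels)      # 2
```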

Conclusion

In this Pandas Tutorial, we learned the syntax of the DataFrame.set_index() method and how to use it to set existing columns of a DataFrame as its index.