Set Index of DataFrame using Existing Column
To set the index of a pandas DataFrame using an existing column, call the set_index() method on the DataFrame and pass the column name as the argument.
In this tutorial, we will learn the syntax of the DataFrame.set_index() method and how to use it to set one or more existing columns as the index of a DataFrame.
Syntax
The syntax of pandas DataFrame.set_index() method is
DataFrame.set_index(keys, drop=True, append=False, inplace=False, verify_integrity=False)
where
Parameter | Value | Description |
---|---|---|
keys | label or array-like or list of labels/arrays | This parameter can be either a single column key, a single array of the same length as the calling DataFrame, or a list containing an arbitrary combination of column keys and arrays. Here, “array” encompasses Series, Index, np.ndarray, and instances of Iterator. |
drop | bool. Default value is True. | Whether to delete the columns that are used as the new index from the DataFrame's columns. |
append | bool. Default value is False. | Whether to append the columns to the existing index instead of replacing it. |
inplace | bool. Default value is False. | If True, modifies the DataFrame in place (does not create a new object). |
verify_integrity | bool. Default value is False. | If True, checks the new index for duplicates. Otherwise, the check is deferred until necessary. Leaving this as False improves the performance of this method. |
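The effect of drop and verify_integrity is easier to see in a small run. The following is a minimal sketch; the DataFrames and the names kept, dup_df and duplicate_detected are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [10, 20, 30], 'col_2': [40, 50, 60]})

# drop=False keeps 'col_1' as a regular column in addition to using it as the index.
kept = df.set_index('col_1', drop=False)
print(list(kept.columns))

# verify_integrity=True raises ValueError when the new index contains duplicates.
dup_df = pd.DataFrame({'col_1': [10, 10, 30], 'col_2': [40, 50, 60]})
try:
    dup_df.set_index('col_1', verify_integrity=True)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
print(duplicate_detected)
```

With the default verify_integrity=False, the same call on dup_df would succeed and produce an index with repeated labels.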
Return Value
- DataFrame or
- None if inplace=True.
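A short sketch of the two return behaviours described above, assuming a small sample DataFrame (the names returned and in_place_result are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'col_1': [10, 20, 30], 'col_2': [40, 50, 60]})

# Default call: a new DataFrame is returned and df keeps its original index.
returned = df.set_index('col_1')
print(type(returned).__name__)
print(list(df.index))

# In-place call: df itself is modified and the call returns None.
in_place_result = df.set_index('col_1', inplace=True)
print(in_place_result)
print(df.index.name)
```

Because the in-place call returns None, assigning its result back to df would lose the DataFrame.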
Examples
Existing Column as Index for DataFrame
In the following program, we take a DataFrame with two columns and a default index. We then set the index of this DataFrame to the column ‘col_1’.
By default, set_index() does not modify the original DataFrame; it returns a new DataFrame with the new index.
Example.py
import pandas as pd
data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60]}
df = pd.DataFrame(data)
result = df.set_index('col_1')
print('Original DataFrame')
print(df)
print('\nResulting DataFrame')
print(result)
Output
Original DataFrame
   col_1  col_2
0     10     40
1     20     50
2     30     60

Resulting DataFrame
       col_2
col_1
10        40
20        50
30        60
Set Column as Index In-place
We can modify the original DataFrame by passing True for the parameter inplace.
Example.py
import pandas as pd
data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60]}
df = pd.DataFrame(data)
df.set_index('col_1', inplace=True)
print(df)
Output
       col_2
col_1
10        40
20        50
30        60
Set Multiple Columns as Index
We can also set the index from multiple columns of the DataFrame by passing a list of column names.
In the following program, we take a DataFrame with columns: col_1, col_2 and col_3. We use col_1 and col_2 for the index of this DataFrame.
Example.py
import pandas as pd
data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60], 'col_3': [70, 80, 90]}
df = pd.DataFrame(data)
df.set_index(['col_1', 'col_2'], inplace=True)
print(df)
Output
             col_3
col_1 col_2
10    40        70
20    50        80
30    60        90
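As noted in the parameter table, keys can also mix column names with arrays. The following is a minimal sketch of that case; the labels Index and its name 'label' are hypothetical and are not part of the examples above.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [10, 20, 30], 'col_2': [40, 50, 60]})

# A hypothetical extra level that is not a column of df.
# It must have the same length as the DataFrame.
labels = pd.Index(['a', 'b', 'c'], name='label')

# Mix an array-like (labels) with a column key ('col_1') in one list.
result = df.set_index([labels, 'col_1'])
print(result)
print(list(result.index.names))
```

Only 'col_1' is removed from the columns here, since labels was never a column of the DataFrame.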
Append Column to Index of DataFrame
We can also append a column to the existing index of the DataFrame by passing True for the append parameter of the set_index() method. The result is a MultiIndex made up of the existing index and the column.
Example.py
import pandas as pd
data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60], 'col_3': [70, 80, 90]}
df = pd.DataFrame(data)
df.set_index('col_1', inplace=True, append=True)
print(df)
Output
         col_2  col_3
  col_1
0 10        40     70
1 20        50     80
2 30        60     90
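To confirm what the appended index looks like, we can inspect it directly. This is a small sketch that uses the non-in-place form; the variable name appended is only illustrative.

```python
import pandas as pd

data = {'col_1': [10, 20, 30], 'col_2': [40, 50, 60], 'col_3': [70, 80, 90]}
df = pd.DataFrame(data)
appended = df.set_index('col_1', append=True)

# The original default index becomes the first (unnamed) level; 'col_1' is the second.
print(appended.index.nlevels)
print(list(appended.index.names))

# Rows are now addressed by a (row_number, col_1) tuple.
print(appended.loc[(0, 10), 'col_2'])
```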
Conclusion
In this Pandas Tutorial, we learned the syntax of DataFrame.set_index() method and how to use this method to set existing columns of this DataFrame as index.