Median of DataFrame
To find the median of the values along the rows or columns of a pandas DataFrame, call the median() method on the DataFrame. median() returns a Series containing the median computed over the specified axis.
In this tutorial, we will learn how to find the median of values along the index or columns of a DataFrame using the DataFrame.median() method.
Syntax
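The syntax of the pandas DataFrame.median() method in pandas versions before 2.0 is
DataFrame.median(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
In pandas 2.0 and later, the level parameter has been removed and the signature is
DataFrame.median(axis=0, skipna=True, numeric_only=False, **kwargs)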
where
Parameter | Value | Description |
---|---|---|
axis | {index (0), columns (1)}. default value is 0. | Axis for the function to be applied on. |
skipna | bool. default value is True. | Exclude NA/null values when computing the result. |
level | int or level name. default value is None. | If the axis is a MultiIndex (hierarchical), compute the median along a particular level. Removed in pandas 2.0. |
numeric_only | bool. default value is None (False in pandas 2.0 and later). | Include only float, int, boolean columns. If None, will attempt to use everything, then use only numeric data. Not implemented for Series. |
**kwargs | | Additional keyword arguments to be passed to the function. |
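As a minimal sketch of the numeric_only parameter described in the table, the following program adds a string column to a DataFrame and restricts the median to numeric columns. The column name 'name' and its values are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical data: two numeric columns and one string column.
df = pd.DataFrame({
    'a': [1, 4, 7],
    'b': [3, 4, 2],
    'name': ['x', 'y', 'z'],
})

# Restrict the computation to numeric columns so that 'name' is left out.
result = df.median(numeric_only=True)
print(result)
```

The resulting Series has entries only for the columns a and b. Passing numeric_only=True explicitly keeps the behavior the same whichever default your pandas version uses.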
Return Value
- Series or
- DataFrame (if level is specified; pandas versions before 2.0 only)
Examples
Median of DataFrame for Columns
By default, the median is calculated for each column in a DataFrame.
In the following program, we take a DataFrame with two columns containing numerical data, and find the median of each column in this DataFrame.
Example.py
import pandas as pd
df = pd.DataFrame({'a': [1, 4, 7], 'b': [3, 4, 2]})
result = df.median()
print(result)
Output
a 4.0
b 3.0
dtype: float64
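The axis can also be named explicitly. The sketch below, which reuses the same DataFrame, checks that axis=0 and axis='index' give the same result as the default call.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 4, 7], 'b': [3, 4, 2]})

default_result = df.median()              # no axis given
by_number = df.median(axis=0)             # axis given as an integer
by_name = df.median(axis='index')         # axis given by name

# All three calls compute one median per column.
all_equal = default_result.equals(by_number) and default_result.equals(by_name)
print(all_equal)
```

Spelling out the axis can make the intent clearer to readers of the code, at the cost of a slightly longer call.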
Median of DataFrame for Rows
To compute the median of a DataFrame along rows, pass axis=1 in the call to the median() method.
Example.py
import pandas as pd
df = pd.DataFrame({'a': [1, 4, 7], 'b': [3, 4, 2]})
result = df.median(axis=1)
print(result)
Output
0 2.0
1 4.0
2 4.5
dtype: float64
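Each row in this example holds exactly two values, so its median is the average of those two values (for instance, (7 + 2) / 2 = 4.5 for the last row). The following sketch cross-checks this by comparing the row medians with the row means of the same DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 4, 7], 'b': [3, 4, 2]})

row_median = df.median(axis=1)
# With exactly two values per row, the median is the mean of the two.
row_mean = df.mean(axis=1)

same = row_median.equals(row_mean)
print(same)
```

This equality holds only because each row has two values. With three or more values per row, the median and the mean generally differ.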
Do not skip NA while finding Median
By default, NA values such as None and np.nan are ignored. To take those values into account instead, pass skipna=False to the median() method.
Example.py
import pandas as pd
df = pd.DataFrame({'a': [1, None, 7], 'b': [3, 4, 2]})
result = df.median(skipna=False)
print(result)
Output
a NaN
b 3.0
dtype: float64
With skipna=False, if any value in a column is NA, the median of that column is NaN.
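To make the difference concrete, here is a small sketch that computes the median of the same DataFrame with both settings of skipna and places the results side by side. The variable names skipped, kept, and comparison are chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, None, 7], 'b': [3, 4, 2]})

skipped = df.median()               # skipna=True: the NA in 'a' is ignored
kept = df.median(skipna=False)      # the NA in 'a' propagates to the result

# Put both results in one table for comparison.
comparison = pd.DataFrame({'skipna=True': skipped, 'skipna=False': kept})
print(comparison)
```

With the default setting, the median of column a is computed from the remaining values 1 and 7, giving 4.0. Column b has no NA values, so its median is 3.0 in both cases.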
Conclusion
In this Pandas Tutorial, we learned how to find the median of a DataFrame along rows or columns using the pandas DataFrame.median() method.