Drop Rows/Columns if values are NA in DataFrame

To remove rows or columns of a DataFrame based on the NA values in them, call the dropna() method on the DataFrame. Its parameters control the axis along which to drop, whether any or all values must be NA for a drop, the minimum number of non-NA values required to keep a row or column, and so on.

Using the how parameter of dropna(), we can specify whether a row or column is dropped when any of its values are NA, or only when all of its values are NA.

In this tutorial, we will learn the syntax of the DataFrame.dropna() method and how to use it to drop rows or columns containing NA values.

Syntax

The syntax of pandas DataFrame.dropna() method is

DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

where

Parameter | Value | Description
axis | {0 or ‘index’, 1 or ‘columns’}, default 0 | Determines whether rows or columns that contain missing values are removed. 0, or ‘index’ : Drop rows which contain missing values. 1, or ‘columns’ : Drop columns which contain missing values. Since version 1.0.0, only a single axis is allowed; passing a tuple or list to drop on multiple axes is no longer supported.
how | {‘any’, ‘all’}, default ‘any’ | Determines whether a row or column is removed when it has at least one NA or only when all of its values are NA. ‘any’ : If any NA values are present, drop that row or column. ‘all’ : If all values are NA, drop that row or column.
thresh | int, optional | Minimum number of non-NA values a row or column must have to be kept.
subset | array-like, optional | Labels along the other axis to consider, e.g. if you are dropping rows, this would be a list of columns to check for NA.
inplace | bool, default False | If True, do the operation in place and return None.
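
The subset parameter is not covered by the examples in this tutorial, so here is a minimal sketch of how it restricts the NA check to chosen columns. The DataFrame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data: each column has one NA value, in different rows.
df = pd.DataFrame({
    'col_0': [10, np.nan, 30],
    'col_1': [np.nan, 50, 60],
})

# Only NA values in col_0 are considered when dropping rows.
# The NA in col_1 (row 0) is ignored, so row 0 is kept.
result = df.dropna(subset=['col_0'])
print(result)
```

Only the row with index 1 is dropped here, because it is the only row with NA in col_0.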

Return Value

  • DataFrame.
  • None if inplace=True.
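
The difference between the two return values can be seen in a short sketch like the following. The variable names cleaned and returned are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col_0': [10, np.nan, 30], 'col_1': [40, 50, 60]})

# Default (inplace=False): df is left untouched and a new DataFrame is returned.
cleaned = df.dropna()
print(len(df), len(cleaned))

# inplace=True: df itself is modified and the call returns None.
returned = df.dropna(inplace=True)
print(returned, len(df))
```

Because the in-place call returns None, assigning its result back to df would lose the DataFrame.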

Examples

Delete Rows of DataFrame if Values are NA

In the following program, we take a DataFrame and drop each row in which any of the values is NA.

By default, the axis parameter is 0. Therefore, when called without arguments, dropna() deletes the rows of the DataFrame that contain NA.

Example.py

import pandas as pd
import numpy as np

data = {'col_0': [10, 20, np.nan, 40, np.nan], 'col_1': [60, 70, 80, 90, 99]}
df = pd.DataFrame(data)

result = df.dropna()
print(result)

Output

   col_0  col_1
0   10.0     60
1   20.0     70
3   40.0     90

The third and fifth rows have an NA (numpy.nan) value. Therefore, those rows have been dropped from the resulting DataFrame.
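
To see which rows would be dropped before calling dropna(), we can build a boolean mask with isna(). This is a sketch that goes slightly beyond the example above. The names has_na and manual are only illustrative.

```python
import numpy as np
import pandas as pd

data = {'col_0': [10, 20, np.nan, 40, np.nan], 'col_1': [60, 70, 80, 90, 99]}
df = pd.DataFrame(data)

# True for every row that holds at least one NA value.
has_na = df.isna().any(axis=1)
print(df[has_na])

# Keeping the remaining rows gives the same result as df.dropna().
manual = df[~has_na]
print(manual.equals(df.dropna()))
```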

Delete Columns of DataFrame if Values are NA

In the following program, we take a DataFrame and drop each column in which any of the values is NA.

Pass axis=1 to drop columns containing NA values.

Example.py

import pandas as pd
import numpy as np

data = {'col_0': [10, 20, np.nan, 40, np.nan], 'col_1': [60, 70, 80, 90, 99]}
df = pd.DataFrame(data)

result = df.dropna(axis=1)
print(result)

Output

   col_1
0     60
1     70
2     80
3     90
4     99

Since the first column, col_0, has two NA values, it is dropped from the resulting DataFrame.

Delete Rows of DataFrame if All Values are NA

In the following program, we take a DataFrame, and drop rows from this DataFrame only if all of the values in that row are NA.

Pass how='all' to drop rows containing all NA values.

Example.py

import pandas as pd
import numpy as np

data = {'col_0': [10, 20, np.nan], 'col_1': [np.nan, 50, np.nan]}
df = pd.DataFrame(data)

result = df.dropna(how='all')
print(result)

Output

   col_0  col_1
0   10.0    NaN
1   20.0   50.0

The third row has all NA values, and therefore this row is dropped from the resulting DataFrame. The first row is kept because only one of its values is NA.
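
The how parameter can be combined with axis=1 to apply the same rule to columns. The following is a minimal sketch with made-up data, where one column is entirely NA.

```python
import numpy as np
import pandas as pd

# Illustrative data: col_1 contains only NA values, col_0 has one NA.
df = pd.DataFrame({
    'col_0': [10, 20, np.nan],
    'col_1': [np.nan, np.nan, np.nan],
})

# Drop a column only if every value in it is NA.
result = df.dropna(axis=1, how='all')
print(result)
```

Here col_1 is dropped, while col_0 is kept with all three rows even though it contains one NA.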

Delete Columns of DataFrame if Fewer than a Threshold Number of Values are non-NA

We can also specify the minimum number of non-NA values that a row or column must have in order to be kept.

In the following program, we take a DataFrame and drop each column that has fewer than 2 non-NA values.

Pass thresh=2 to drop columns not containing at least 2 non-NA values.

Example.py

import pandas as pd
import numpy as np

data = {'col_0': [10, 20, np.nan, 40], 'col_1': [np.nan, 50, np.nan, 70], 'col_2': [np.nan, 90, np.nan, np.nan]}
df = pd.DataFrame(data)

result = df.dropna(axis=1, thresh=2)
print(result)

Output

   col_0  col_1
0   10.0    NaN
1   20.0   50.0
2    NaN    NaN
3   40.0   70.0

col_0 and col_1 have at least two non-NA values each. col_2 has only one non-NA value. So, col_2 has been dropped from the resulting DataFrame.
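
The same threshold can be applied to rows by leaving axis at its default of 0. The sketch below reuses the data from the example above. The non_na_per_row name is only illustrative. Note that recent pandas versions document that thresh cannot be combined with how, so only thresh is passed here.

```python
import numpy as np
import pandas as pd

data = {'col_0': [10, 20, np.nan, 40], 'col_1': [np.nan, 50, np.nan, 70], 'col_2': [np.nan, 90, np.nan, np.nan]}
df = pd.DataFrame(data)

# Number of non-NA values in each row, to compare against the threshold.
non_na_per_row = df.notna().sum(axis=1)
print(non_na_per_row)

# Keep only the rows that have at least 2 non-NA values.
result = df.dropna(thresh=2)
print(result)
```

The rows with index 0 and 2 have fewer than two non-NA values, so only the rows with index 1 and 3 remain.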

Conclusion

In this Pandas Tutorial, we learned the syntax of the DataFrame.dropna() method and how to use it to drop rows or columns based on the NA values in them.