Complete Guide to Detecting and Handling NaN in Python

1. How to Detect NaN in Python

What is NaN?

NaN (Not a Number) is a special floating-point value, defined by the IEEE 754 standard, that represents an invalid or undefined numeric result. It arises from operations such as subtracting infinity from infinity or, in NumPy, dividing zero by zero, and pandas also uses it to mark missing data. Note that in plain Python, dividing a float by zero raises ZeroDivisionError rather than returning NaN. If NaN is not handled correctly, results can be silently wrong and programs may not behave as expected.
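As a minimal sketch of where NaN can come from in plain Python, the snippet below produces it from operations on infinity and shows that float division by zero raises an exception instead. The variable names are illustrative only.

```python
import math

inf = float('inf')

# Undefined operations on infinities produce NaN.
a = inf - inf
b = inf * 0.0

# Plain Python float division by zero raises an exception instead of returning NaN.
try:
    1.0 / 0.0
    raised = False
except ZeroDivisionError:
    raised = True

print(math.isnan(a), math.isnan(b), raised)  # True True True
```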

Importance of Detecting NaN

NaN values in a dataset propagate through arithmetic and can silently skew aggregates, which undermines the reliability of any result built on them. It is therefore essential to detect NaN first and then handle it appropriately (e.g., removal, replacement).

2. How to Generate NaN

Python can generate NaN with float('nan'); math.nan and numpy.nan provide the same value as constants. Generating NaN explicitly is useful for marking an invalid or missing result in numeric calculations.
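A short sketch of the standard-library ways to create NaN follows. The names a, b, and c are placeholders, and the math.nan constant assumes Python 3.5 or later.

```python
import math

a = float('nan')
b = float('NaN')   # the string argument is case-insensitive
c = math.nan       # constant available since Python 3.5

print(a, b, c)                                # nan nan nan
print(all(math.isnan(x) for x in (a, b, c)))  # True
```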

Difference from None

NaN is a float that marks a numerically invalid result, whereas None is a separate object that represents the absence of a value. None is normally tested with is None (== also works), but NaN is never equal to itself, so == cannot be used to test for it.
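The difference can be illustrated with a few comparisons. This is only a sketch; the variable names are arbitrary.

```python
import math

nan = float('nan')
value = None

print(value is None)    # True: identity check is the idiomatic test for None
print(nan == nan)       # False: NaN never compares equal, even to itself
print(math.isnan(nan))  # True: use a dedicated function for NaN
print(type(nan).__name__, type(value).__name__)  # float NoneType
```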

3. How to Determine NaN

3.1. Determination with the Standard Library (math.isnan())

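To check for NaN using Python’s standard library, use math.isnan(). It returns True if the given value is NaN and False otherwise. The argument must be a number; non-numeric values such as None or a string raise TypeError.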
import math

num = float('nan')
print(math.isnan(num))  # Result: True
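Because math.isnan() accepts only numbers, data that mixes numbers with None or strings needs a guard. One possible approach is a small wrapper; is_nan below is a hypothetical helper, not part of the standard library.

```python
import math

def is_nan(value):
    """Hypothetical helper: True only for a float NaN, False for anything else."""
    return isinstance(value, float) and math.isnan(value)

samples = [1, 2.5, float('nan'), None, 'text']
print([is_nan(v) for v in samples])  # [False, False, True, False, False]
```

The isinstance check runs first, so math.isnan() is never called on a value it would reject.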

3.2. Determination Using NumPy (numpy.isnan())

NumPy is a library specialized for array and matrix computations. Its numpy.isnan() function tests each element and returns a boolean array, which makes it efficient for detecting NaNs within arrays. It is widely used in numerical analysis and scientific data processing.
import numpy as np

num_list = [1, 2, np.nan, 4]
print(np.isnan(num_list))  # Result: [False False  True False]
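The boolean array returned by np.isnan() can be used as a mask. The sketch below, with illustrative variable names, checks whether any NaN exists, counts them, locates them, and filters them out.

```python
import numpy as np

arr = np.array([1.0, 2.0, np.nan, 4.0])
mask = np.isnan(arr)

print(mask.any())         # True: at least one NaN is present
print(int(mask.sum()))    # 1: number of NaN values
print(np.where(mask)[0])  # [2]: positions of the NaN values
print(arr[~mask])         # [1. 2. 4.]: array without NaN
```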

3.3. Determination with pandas (pandas.isna())

When working with a Series or DataFrame, use pandas isna() (isnull() is an alias) to detect missing values. Besides NaN, these functions also flag None and NaT, which makes them helpful for data cleaning.
import pandas as pd
import numpy as np

data = pd.Series([1, 2, np.nan, 4])
print(pd.isna(data))  # Result: 0    False
                      #         1    False
                      #         2     True
                      #         3    False
                      #         dtype: bool
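For a DataFrame, the same check is often combined with sum() or any() to summarize where values are missing. The following is a sketch with made-up data, not a prescribed workflow.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [4, np.nan, np.nan]})

# Number of missing values in each column.
nan_per_column = df.isna().sum()
# Rows that contain at least one missing value.
rows_with_nan = df[df.isna().any(axis=1)]

print({col: int(n) for col, n in nan_per_column.items()})  # {'A': 1, 'B': 2}
print(len(rows_with_nan))                                  # 2
```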

4. How to Remove or Replace NaN

4.1. Remove NaN from a List

To remove NaN in a list, you can combine math.isnan() with a list comprehension.
import math

num_list = [1, 2, float('nan'), 4]
clean_list = [num for num in num_list if not math.isnan(num)]
print(clean_list)  # Result: [1, 2, 4]
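The same comprehension pattern can replace NaN instead of dropping it. In this sketch the replacement value 0 is an arbitrary choice for illustration.

```python
import math

num_list = [1, 2, float('nan'), 4]
# Substitute 0 wherever the element is NaN; keep every other element unchanged.
replaced = [0 if math.isnan(num) else num for num in num_list]
print(replaced)  # [1, 2, 0, 4]
```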

4.2. Remove NaN with pandas (dropna())

To remove NaN from a DataFrame, use the dropna() method. By default it drops every row that contains at least one NaN; pass axis=1 to drop columns instead.
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [4, np.nan, 6]})
clean_df = df.dropna()
print(clean_df)
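dropna() also accepts parameters that control what is removed. The sketch below adds a hypothetical column 'C' to the example data and compares dropping rows, dropping columns, and restricting the check to one column with subset.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [4, np.nan, 6], 'C': [7, 8, 9]})

rows_dropped = df.dropna()                # keep only rows with no NaN
cols_dropped = df.dropna(axis=1)          # keep only columns with no NaN
subset_dropped = df.dropna(subset=['A'])  # drop rows only where column 'A' is NaN

print(rows_dropped.index.tolist())    # [0]
print(cols_dropped.columns.tolist())  # ['C']
print(subset_dropped.index.tolist())  # [0, 1]
```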

4.3. Replace NaN with pandas (fillna())

If you prefer to replace NaN with a specific value rather than delete it, use the fillna() method.
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [4, np.nan, 6]})
df = df.fillna(0)  # fillna() returns a new DataFrame with NaN replaced by 0
print(df)
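A single constant is not always the right replacement. As a sketch, fillna() can also take a dict of per-column values or a Series such as the column means; the fill values chosen here are arbitrary.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [4, np.nan, 6]})

by_column = df.fillna({'A': 0, 'B': -1})  # a different constant per column
by_mean = df.fillna(df.mean())            # each column's mean (NaN skipped)

print(by_column['A'].tolist())  # [1.0, 2.0, 0.0]
print(by_column['B'].tolist())  # [4.0, -1.0, 6.0]
print(by_mean['A'].tolist())    # [1.0, 2.0, 1.5]
print(by_mean['B'].tolist())    # [4.0, 5.0, 6.0]
```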

5. Calculations Involving NaN

When an arithmetic operation includes NaN, the result is also NaN. To obtain meaningful results, remove or replace NaN beforehand, or use functions that explicitly skip it.
import numpy as np

result = 10 + np.nan
print(result)  # Result: nan
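Propagation also affects aggregates and comparisons. The following sketch, using an illustrative array, shows that a single NaN turns a sum or mean into NaN and that ordering comparisons with NaN are always False.

```python
import numpy as np

arr = np.array([1.0, 2.0, np.nan, 4.0])

print(np.sum(arr))             # nan: one NaN contaminates the whole sum
print(np.mean(arr))            # nan
print(np.nan * 0)              # nan: even multiplying by zero does not clear it
print(np.nan > 1, np.nan < 1)  # False False
```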

Example of Statistical Calculations with NaN

When performing statistical calculations on a dataset that contains NaN and you want to ignore NaN, use NumPy’s nanmean() function. It computes the mean while excluding NaN values.
import numpy as np

data = [1, 2, np.nan, 4]
mean = np.nanmean(data)  # Calculate the mean while ignoring NaN
print(mean)  # Result: 2.3333...
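NumPy offers similar NaN-aware variants for other aggregates, and pandas skips NaN by default. A brief sketch, with illustrative data:

```python
import numpy as np
import pandas as pd

data = np.array([1.0, 2.0, np.nan, 4.0])

print(np.nansum(data))  # 7.0
print(np.nanmax(data))  # 4.0
print(np.nanmin(data))  # 1.0

s = pd.Series(data)
print(s.sum())              # 7.0: pandas uses skipna=True by default
print(s.sum(skipna=False))  # nan: NaN propagates when skipping is disabled
```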

6. Important Considerations for NaN Detection

6.1. Behavior of Comparison Operators

NaN compares unequal to every value, including itself: nan == nan is False and nan != nan is True. An equality test against NaN therefore never matches, so use dedicated functions such as math.isnan(), numpy.isnan(), or pandas.isna() instead.
num = float('nan')
print(num == num)  # Result: False
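A related pitfall is the in operator, which checks identity before equality, so its result depends on whether the very same NaN object is in the container. The sketch below illustrates this and shows a more reliable membership test; the variable names are arbitrary.

```python
import math

nan = float('nan')

print(nan == nan)  # False
print(nan != nan)  # True: NaN is the only float that differs from itself

# Containment checks identity before equality.
print(nan in [nan])                    # True (same object)
print(float('nan') in [float('nan')])  # False (two distinct NaN objects)

values = [1.0, float('nan'), 3.0]
print(any(math.isnan(v) for v in values))  # True: reliable membership test
```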

6.2. Key Points for Data Cleaning

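In data analysis, NaNs left in a dataset can distort results: NumPy aggregates such as sum and mean return NaN, while pandas skips NaN by default, which can hide how much data is missing. Inspect the missing values first, then remove or replace them according to what they mean in your data. Proper cleaning beforehand improves the reliability of the analysis.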

7. Summary

In Python, math, numpy, and pandas let you detect and handle NaN values efficiently. Knowing how NaN behaves, and when to remove or replace it, helps keep data analysis and numerical computation reliable.