Handling Missing Values in Pandas

Let’s take a few examples to understand how the parameter values affect the output.
Before we dive deeper into the DataFrame.fillna() method, let us first create a DataFrame to work with.
A. Create a DataFrame:
Python Implementation:
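The original code for this step is not reproduced here. As a minimal sketch, assuming a small DataFrame with hypothetical columns A, B, and C (the names and values are illustrative, not taken from the original article):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; np.nan marks the missing elements.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Count the missing elements in each column.
missing_per_column = df.isna().sum()
print(df)
print(missing_per_column)
```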
B. Example — 1:
We can use the value parameter to specify the value used to fill the missing elements. In the following example, we specify value=0, so all the missing elements are filled with 0.
Parameters Used:
value = 0
Python Implementation:
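A minimal sketch of this call, assuming a hypothetical DataFrame with columns A, B, and C (illustrative data, not from the original article):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Replace every missing element with 0; df itself is left unchanged.
filled = df.fillna(value=0)
print(filled)
```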
C. Example — 2:
The value parameter also accepts a dictionary that maps column names to fill values, so we can fill the missing elements of different columns with different values. Columns that are not listed in the dictionary are left unchanged.
Parameters Used:
value = dictionary
Python Implementation:
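A minimal sketch of the dictionary form, assuming a hypothetical DataFrame with columns A, B, and C. The fill values 0 and -1 are arbitrary choices for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Fill column A with 0 and column B with -1; column C is not listed,
# so its missing elements stay missing.
filled = df.fillna(value={"A": 0, "B": -1})
print(filled)
```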
D. Example — 3:
To fill the missing elements, we can use the method parameter. If we specify method="ffill", the last valid observation is propagated forward to fill the gap. If we do not specify the axis value, the operation is performed row-wise (axis=0), that is, down the rows within each column. Please note that by default there is no limit on how far the last valid observation is propagated. If there are multiple consecutive missing elements, all of them are filled with the last valid observation.
Important Note:
If we specify method="ffill" and axis=0, and the elements in the first row are missing, they remain missing because there is no earlier valid observation to propagate.
Parameters Used:
method = "ffill"
Python Implementation:
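A minimal sketch of forward fill, assuming a hypothetical DataFrame with columns A, B, and C. Because the method argument of fillna() is deprecated in recent pandas releases, the sketch calls DataFrame.ffill(); on older versions, df.fillna(method="ffill") gives the same result.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column B starts with a missing element.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Propagate the last valid observation down each column (axis=0 is the default).
filled = df.ffill()
print(filled)
```

The first element of column B stays missing because nothing precedes it.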
E. Example — 4:
If we specify method="pad", it works the same way as method="ffill"; "pad" is simply another name for forward fill.
Parameters Used:
method = "pad"
Python Implementation:
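Since "pad" is another name for forward fill, a sketch of this step produces the same output as a plain forward fill. The example below assumes a hypothetical DataFrame and uses DataFrame.ffill() rather than fillna(method="pad"), because the method argument is deprecated in recent pandas releases:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
})

# Equivalent to method="pad" (an alias of "ffill") on older pandas versions.
padded = df.ffill()
print(padded)
```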
F. Example — 5:
By default, the missing elements are filled row-wise, or with axis=0. This example sets axis=0 explicitly, so the result is the same as when the axis is omitted.
Important Note:
If we specify method="ffill" and axis=0, then if the elements in the first row are missing, they remain missing.
Parameters Used:
method = "ffill"
axis = 0
Python Implementation:
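A minimal sketch with the axis written out, assuming a hypothetical DataFrame. It uses DataFrame.ffill(axis=0), the counterpart of fillna(method="ffill", axis=0) in recent pandas releases, and checks that it matches the default behaviour:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Forward fill down the rows of each column, with the axis stated explicitly.
filled = df.ffill(axis=0)

# The explicit axis=0 result is identical to the default.
same_as_default = filled.equals(df.ffill())
print(filled)
print(same_as_default)
```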
G. Example — 6:
In some cases, we may want to fill the missing elements column-wise, that is, across the columns within each row. To do so, we can specify the axis parameter and set axis=1.
Important Note:
If we specify method="ffill" and axis=1, then if the elements in the first column are missing, they remain missing.
Parameters Used:
method = "ffill"
axis = 1
Python Implementation:
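A minimal sketch of forward fill along axis=1, assuming a hypothetical DataFrame with columns A, B, and C. It calls DataFrame.ffill(axis=1), which corresponds to fillna(method="ffill", axis=1) on older pandas versions:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Propagate the last valid observation from left to right within each row.
filled = df.ffill(axis=1)
print(filled)
```

Missing elements in column A stay missing because there is no column to their left.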
H. Example — 7:
To fill the missing elements, we can use the method parameter. If we specify method="bfill", the next valid observation is propagated backward to fill the gap. If we do not specify the axis value, the operation is performed row-wise (axis=0), that is, up the rows within each column. Please note that by default there is no limit on how far the next valid observation is propagated. If there are multiple consecutive missing elements, all of them are filled with the next valid observation.
Important Note:
If we specify method="bfill" and axis=0, then if the elements in the last row are missing, they remain missing because there is no later valid observation to propagate.
Parameters Used:
method = "bfill"
axis = 0
Python Implementation:
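A minimal sketch of backward fill, assuming a hypothetical DataFrame with columns A, B, and C. It uses DataFrame.bfill() because the method argument of fillna() is deprecated in recent pandas releases; on older versions, df.fillna(method="bfill") gives the same result.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; columns B and C end with missing elements.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Propagate the next valid observation up each column (axis=0).
filled = df.bfill(axis=0)
print(filled)
```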
I. Example — 8:
If we specify method="backfill", it works the same way as method="bfill"; "backfill" is simply another name for backward fill.
Parameters Used:
method = "backfill"
Python Implementation:
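Since "backfill" is another name for backward fill, a sketch of this step produces the same output as a plain backward fill. The example assumes a hypothetical DataFrame and uses DataFrame.bfill() rather than fillna(method="backfill"), because the method argument is deprecated in recent pandas releases:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Equivalent to method="backfill" (an alias of "bfill") on older pandas versions.
backfilled = df.bfill()
print(backfilled)
```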
J. Example — 9:
By default, the missing elements are filled row-wise, or with axis=0. This example sets axis=0 explicitly, so the result is the same as when the axis is omitted.
Important Note:
If we specify method="bfill" and axis=0, then if the elements in the last row are missing, they remain missing.
Parameters Used:
method = "bfill"
axis = 0
Python Implementation:
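A minimal sketch with the axis written out, assuming a hypothetical DataFrame. It uses DataFrame.bfill(axis=0), the counterpart of fillna(method="bfill", axis=0) in recent pandas releases, and checks that it matches the default behaviour:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Backward fill up the rows of each column, with the axis stated explicitly.
filled = df.bfill(axis=0)

# The explicit axis=0 result is identical to the default.
same_as_default = filled.equals(df.bfill())
print(filled)
print(same_as_default)
```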
K. Example — 10:
In some cases, we may want to fill the missing elements column-wise, that is, across the columns within each row. To do so, we can specify the axis parameter and set axis=1.
Important Note:
If we specify method="bfill" and axis=1, then if the elements in the last column are missing, they remain missing.
Parameters Used:
method = "bfill"
axis = 1
Python Implementation:
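A minimal sketch of backward fill along axis=1, assuming a hypothetical DataFrame with columns A, B, and C. It calls DataFrame.bfill(axis=1), which corresponds to fillna(method="bfill", axis=1) on older pandas versions:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Propagate the next valid observation from right to left within each row.
filled = df.bfill(axis=1)
print(filled)
```

Missing elements in column C stay missing because there is no column to their right.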
L. Example — 11:
If we specify the limit parameter, it restricts the maximum number of consecutive missing values that the forward or backward fill methods will fill. If a gap of consecutive missing elements is longer than the number specified by the limit parameter, the gap is only partially filled. Here we use the forward fill method with axis=0 and a limit of 1 element.
Parameters Used:
method = "ffill"
axis = 0
limit = 1
Python Implementation:
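A minimal sketch of a limited forward fill, assuming a hypothetical DataFrame in which column A has a gap of two consecutive missing elements. It uses DataFrame.ffill(axis=0, limit=1) in place of fillna(method="ffill", axis=0, limit=1), since the method argument is deprecated in recent pandas releases:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column A has two consecutive missing elements.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Fill at most one consecutive missing element per gap, moving down each column.
filled = df.ffill(axis=0, limit=1)
print(filled)
```

Only the first element of the two-element gap in column A is filled; the second stays missing.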
M. Example — 12:
In this example, we will use the forward fill method with axis=1 and a limit of 1 element.
Parameters Used:
method = "ffill"
axis = 1
limit = 1
Python Implementation:
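A minimal sketch of a limited forward fill across columns, assuming a hypothetical DataFrame whose last row has two consecutive missing elements. It uses DataFrame.ffill(axis=1, limit=1) as the counterpart of fillna(method="ffill", axis=1, limit=1):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the last row is [4.0, NaN, NaN].
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Fill at most one consecutive missing element per gap, moving left to right.
filled = df.ffill(axis=1, limit=1)
print(filled)
```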
N. Example — 13:
In this example, we will use the backward fill method with axis=0 and a limit of 1 element.
Parameters Used:
method = "bfill"
axis = 0
limit = 1
Python Implementation:
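A minimal sketch of a limited backward fill, assuming a hypothetical DataFrame in which column A has a gap of two consecutive missing elements. It uses DataFrame.bfill(axis=0, limit=1) as the counterpart of fillna(method="bfill", axis=0, limit=1):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Fill at most one consecutive missing element per gap, moving up each column.
filled = df.bfill(axis=0, limit=1)
print(filled)
```

In column A, only the element directly above the value 4.0 is filled; the one above it stays missing.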
O. Example — 14:
In this example, we will use the backward fill method with axis=1 and a limit of 1 element.
Parameters Used:
method = "bfill"
axis = 1
limit = 1
Python Implementation:
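A minimal sketch of a limited backward fill across columns, assuming a hypothetical DataFrame whose third row has two consecutive missing elements before a valid one. It uses DataFrame.bfill(axis=1, limit=1) as the counterpart of fillna(method="bfill", axis=1, limit=1):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the third row is [NaN, NaN, 30.0].
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [np.nan, 2.0, np.nan, np.nan],
    "C": [10.0, np.nan, 30.0, np.nan],
})

# Fill at most one consecutive missing element per gap, moving right to left.
filled = df.bfill(axis=1, limit=1)
print(filled)
```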
P. Creating a DataFrame:
Python Implementation:
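The original code for this second DataFrame is not reproduced here. As a minimal sketch, assuming hypothetical columns A and B that hold only whole numbers and missing elements (which makes the columns float64):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: whole numbers stored as floats because of NaN.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, 5.0, 6.0],
})

# Record the column dtypes as strings for inspection.
dtype_names = [str(t) for t in df.dtypes]
print(df)
print(dtype_names)
```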
Q. Example — 15:
We can use the downcast parameter to downcast the data type where possible. The string value "infer" tries to downcast to an appropriate equivalent type, for example float64 to int64 when every value is a whole number.
Parameters Used:
downcast = "infer"
Python Implementation:
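The downcast argument of fillna() is deprecated in recent pandas releases, so the sketch below swaps in a different technique that reaches the same outcome: it fills the gaps and then converts explicitly with astype("int64"). On older versions, df.fillna(0, downcast="infer") is the call the text describes. The DataFrame is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: whole numbers stored as float64 because of NaN.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, 5.0, 6.0],
})

# Fill the gaps, then convert to int64 explicitly.
# This stands in for downcast="infer"; it is safe here only because
# every value is a whole number after filling.
filled = df.fillna(0).astype("int64")
dtype_names = [str(t) for t in filled.dtypes]
print(filled)
print(dtype_names)
```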
R. Example — 16:
If we want the changes to take place in our original DataFrame, we have to specify inplace=True as a parameter. Note that the call then does not return a new DataFrame. After execution, the original DataFrame holds the result of the fillna() call.
Parameters Used:
inplace = True
Python Implementation:
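A minimal sketch of an in-place fill, assuming a hypothetical DataFrame with columns A and B. The return value is deliberately not used, since the point is that df itself changes:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, 5.0, 6.0],
})

missing_before = int(df.isna().sum().sum())

# Modify df directly instead of creating a filled copy.
df.fillna(0, inplace=True)

missing_after = int(df.isna().sum().sum())
print(df)
print(missing_before, missing_after)
```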