Be careful when you use the “isin()” method in pandas

 3 years ago
source link: http://www.donghao.org/2021/04/09/be-careful-when-you-use-isin-method-in-pandas/

import pandas as pd

df_excl = pd.DataFrame({"id": ["12345"]})
df = pd.DataFrame({"id": ["12345", "67890"]})

result = df[~df.id.isin(df_excl[["id"]])]
print(result)

Can you guess the result of the snippet above? Just a DataFrame containing “67890”? No, the result is:

      id
0  12345
1  67890

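Why hasn’t “12345” been excluded? The reason is subtle: df_excl[["id"]] is a DataFrame, but isin() on a Series expects a list-like of values, such as a Series. Iterating over a DataFrame yields its column labels, so the ids are effectively compared against the label “id” rather than the value “12345”, and nothing matches. So we shouldn’t use [[]] here, but [].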
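To make the difference visible, here is a minimal sketch that compares the two boolean masks directly. It reuses the data from the snippet above; the variable names labels, mask_from_frame and mask_from_series are only illustrative.

```python
import pandas as pd

df_excl = pd.DataFrame({"id": ["12345"]})
df = pd.DataFrame({"id": ["12345", "67890"]})

# Iterating over a DataFrame yields its column labels, not its cell values.
labels = list(df_excl[["id"]])
print(labels)

# Double brackets pass a DataFrame to isin(): no id matches.
mask_from_frame = df["id"].isin(df_excl[["id"]])
print(mask_from_frame.tolist())

# Single brackets pass a Series to isin(): "12345" matches.
mask_from_series = df["id"].isin(df_excl["id"])
print(mask_from_series.tolist())
```

The first mask is all False, which is why negating it with ~ keeps every row.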

The correct code should use df_excl["id"], as below:

...
result = df[~df.id.isin(df_excl["id"])] 
print(result)
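
If the exclusion list may arrive as either a Series or a DataFrame, one option is a small guard before calling isin(). The helper below is a hypothetical sketch, not part of pandas: the name exclude_ids and its column parameter are assumptions made for illustration.

```python
import pandas as pd


def exclude_ids(df, excluded, column="id"):
    """Return the rows of df whose `column` value is not in `excluded`.

    Hypothetical helper: `excluded` may be a list-like or a DataFrame
    that contains `column`.
    """
    if isinstance(excluded, pd.DataFrame):
        # isin() would see only the column labels of a DataFrame,
        # so select the column to get a Series of values instead.
        excluded = excluded[column]
    return df[~df[column].isin(excluded)]


df_excl = pd.DataFrame({"id": ["12345"]})
df = pd.DataFrame({"id": ["12345", "67890"]})

# Works even when the caller passes the double-bracket selection.
result = exclude_ids(df, df_excl[["id"]])
print(result)
```

With this guard, only the “67890” row remains regardless of which bracket form was used.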

4:17 am ROBIN DONG develope
pandas