The np.nan == np.nan problem
source link: https://blog.51cto.com/CANGYE0504/5388260
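While working through the Hands-On Data Analysis (动手学数据分析) course today, an observant teammate noticed a problem.
For numeric data, pandas uses the floating-point value NaN (Not a Number) to represent missing values; NaN is described as a sentinel value that is easy to detect.
However, running the following code gives a surprising result: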
np.nan == np.nan
What it returns is:
False
This was puzzling at first. Searching the issues in the official code repository then turned up:
BUG: Incorrect handling of not-equal comparison to nan · Issue #21685 · numpy/numpy (github.com)
Another related link:
simd - How to choose AVX compare predicate variants - Stack Overflow
Comparing np.nan with any value using != returns True:
np.nan != np.nan
np.nan != 0
np.nan != None
np.nan != 0.0
True
True
True
True
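The same behavior carries over to arrays. The following is a minimal sketch (the array `values` is made up for illustration) showing that an element-wise `!=` against np.nan is True in every position, an element-wise `==` is False in every position, and only the self-comparison `values != values` singles out the NaN entry.

```python
import numpy as np

# Hypothetical sample data with one missing value.
values = np.array([1.0, np.nan, 3.0])

# != against np.nan is True everywhere, including the NaN slot itself.
not_equal = values != np.nan
print(not_equal)

# == against np.nan is False everywhere, so it never finds the NaN.
equal = values == np.nan
print(equal)

# A value is not equal to itself only when it is NaN.
self_not_equal = values != values
print(self_not_equal)
```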
The reason is that the underlying code uses either an ordered or an unordered comparison.
With an ordered comparison:
For _CMP_NEQ_OQ (ordered comparison returns false for NaN operands):
nan != nan --> false
nan != 0 --> false
With an unordered comparison:
For _CMP_NEQ_UQ (unordered comparison returns true for NaN operands):
nan != nan --> true
nan != 0 --> true
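The difference between the two predicates can be imitated in plain Python. This is only an illustrative sketch: the helper names `neq_oq` and `neq_uq` are hypothetical, chosen to mirror the AVX predicate names, and they are not part of NumPy or of any real API.

```python
import math

def neq_oq(a: float, b: float) -> bool:
    """Hypothetical ordered not-equal: False whenever either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return False
    return a != b

def neq_uq(a: float, b: float) -> bool:
    """Hypothetical unordered not-equal: True whenever either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return True
    return a != b

nan = float("nan")
print(neq_oq(nan, nan), neq_oq(nan, 0.0))  # False False
print(neq_uq(nan, nan), neq_uq(nan, 0.0))  # True True
print(nan != nan, nan != 0.0)              # True True
```

Python's own `!=` on floats gives the same results as the unordered variant, which matches the True results shown earlier.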
In short, do not use np.nan in any comparison: greater than, less than, and equal to all fail to give a useful result.
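A short sketch of why these comparisons are unreliable (the array `data` is invented for illustration): every ordering or equality test involving NaN evaluates to False, so a NaN entry silently drops out of both sides of a threshold filter.

```python
import numpy as np

x = np.nan

# Every ordering or equality comparison with NaN is False.
results = {
    "x < 1.0": x < 1.0,
    "x > 1.0": x > 1.0,
    "x <= 1.0": x <= 1.0,
    "x >= 1.0": x >= 1.0,
    "x == x": x == x,
}
for expr, value in results.items():
    print(expr, "->", value)

# Hypothetical data: the NaN entry lands in neither half of the split.
data = np.array([0.5, np.nan, 2.0])
above = data[data > 1.0]
below = data[data <= 1.0]
print(above, below)
```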
To check whether a value is NaN, use:
np.isnan(np.nan)
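On arrays, np.isnan works element-wise and returns a boolean mask. The sketch below, using a made-up array `data`, counts and filters out the NaN entries; `math.isnan` from the standard library is shown as an alternative for a single Python float.

```python
import math
import numpy as np

# Hypothetical data with two missing values.
data = np.array([1.0, np.nan, 3.0, np.nan])

mask = np.isnan(data)        # True where the element is NaN
n_missing = int(mask.sum())  # number of NaN entries
clean = data[~mask]          # keep only the non-NaN entries

print(mask)
print(n_missing)
print(clean)

# For a single Python float, math.isnan does the same job.
is_scalar_nan = math.isnan(float("nan"))
print(is_scalar_nan)
```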
np.isnan API documentation: