The np.nan == np.nan problem

 2 years ago
source link: https://blog.51cto.com/CANGYE0504/5388260
Today, while working through the Hands-on Data Analysis (动手学数据分析) course, a sharp-eyed teammate spotted a problem.

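For numeric data, pandas uses the floating-point value NaN (Not a Number) to represent missing values. NaN is called a sentinel value because it can be detected easily.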
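A minimal sketch of what this looks like in practice, assuming pandas and NumPy are installed. The Series values are made up for illustration. A missing entry in a numeric Series is stored as the float NaN, and `isna()` flags it.

```python
import numpy as np
import pandas as pd

# Illustrative data: None in a numeric Series is stored as NaN.
s = pd.Series([1.0, None, 3.0])

print(s.dtype)            # float64
print(s.isna().tolist())  # [False, True, False]

# The stored missing value is an ordinary floating-point NaN.
missing = s.iloc[1]
print(np.isnan(missing))
```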

However, running the following code gives a surprising result:

np.nan == np.nan

It returns:

False

This was puzzling at first. Searching the issues in the official code repository led to:

 BUG: Incorrect handling of not-equal comparison to nan · Issue #21685 · numpy/numpy (github.com)

Related links:

 floating point - What is the rationale for all comparisons returning false for IEEE754 NaN values? - Stack Overflow

 simd - How to choose AVX compare predicate variants - Stack Overflow

A not-equal comparison between np.nan and any value, including np.nan itself, returns True:

np.nan != np.nan
np.nan != 0
np.nan != None
np.nan != 0.0
True
True
True
True

The result depends on whether the underlying code uses an ordered or an unordered comparison:

With an ordered comparison:

For _CMP_NEQ_OQ (ordered comparisons return false for NaN operands):

  • nan != nan --> false
  • nan != 0 --> false

With an unordered comparison:

For _CMP_NEQ_UQ (Unordered comparison returns true for NaN operands):

  • nan != nan --> true
  • nan != 0 --> true
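The two predicates can be modeled in plain Python to make the difference concrete. This is only a sketch of the semantics listed above: `neq_oq` and `neq_uq` are hypothetical helper names chosen for illustration, not NumPy functions or the actual intrinsics.

```python
import math


def neq_oq(a: float, b: float) -> bool:
    """Model of an ordered not-equal: False whenever an operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return False
    return a != b


def neq_uq(a: float, b: float) -> bool:
    """Model of an unordered not-equal: True whenever an operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return True
    return a != b


nan = float("nan")
for a, b in [(nan, nan), (nan, 0.0), (1.0, 0.0), (1.0, 1.0)]:
    # Columns: operands, ordered model, unordered model, Python's own !=
    print(a, b, neq_oq(a, b), neq_uq(a, b), a != b)
```

For every pair in this sketch, Python's built-in `!=` agrees with the unordered model, which matches the True results shown earlier for np.nan.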

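In short, do not use np.nan in any comparison. Greater-than, less-than, and equality tests all return False, so none of them can detect NaN.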
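A short check of that advice, assuming NumPy is installed. The array values are illustrative. Of the six comparison operators, only `!=` evaluates to True when NaN is involved, and an equality mask against np.nan never matches anything.

```python
import numpy as np

x = np.nan
results = {
    "x == x": x == x,
    "x != x": x != x,
    "x > 0": x > 0,
    "x < 0": x < 0,
    "x >= x": x >= x,
    "x <= x": x <= x,
}
for expr, value in results.items():
    print(f"{expr}: {value}")

# Comparing an array against np.nan is elementwise and never matches,
# so this mask cannot be used to locate missing values.
arr = np.array([1.0, np.nan, 3.0])
mask = arr == np.nan
print(mask.any())
```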

To check whether a value is NaN, use:

np.isnan(np.nan)
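The same function works elementwise on arrays, which is the usual way to locate or drop missing values. A minimal sketch with made-up data, assuming NumPy is installed:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0, np.nan])

# Boolean mask that is True exactly where the value is NaN.
mask = np.isnan(arr)
print(mask.tolist())        # [False, True, False, True]

# Keep only the non-NaN entries.
print(arr[~mask].tolist())  # [1.0, 3.0]

# Count the missing entries.
print(int(mask.sum()))      # 2
```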

np.isnan API documentation:

 numpy.isnan — NumPy v1.22 Manual