Comparing NumPy arrays
Let's say we're given some numbers which we want to recompute using NumPy. Checking that our results match the original ones sounds trivial but, depending on the data type, there are a couple of pitfalls to avoid.
NumPy arrays
Let's start with the simple case: arrays of integers.
They can be compared using np.array_equal:
import numpy as np
A = np.array([1, 2, 3])
B = np.array([1, 2, 3])
assert np.array_equal(A, B)
We have to be more careful with floating-point numbers. For example:
A = np.array([0.1 + 0.2])
B = np.array([0.3])
assert np.array_equal(A, B) # AssertionError
fails because of the rounding error inherent in floating-point arithmetic: 0.1 + 0.2 evaluates to 0.30000000000000004, not 0.3.
Instead, np.allclose should be used for comparing floating-point arrays.
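To make the tolerance parameters concrete, here is a small sketch of the element-wise test that np.allclose applies, |a - b| <= atol + rtol * |b|. The array values are made up for illustration. Note that the test is not symmetric: the relative tolerance is scaled by the second argument.

```python
import numpy as np

a = np.array([0.1 + 0.2, 1000.0])
b = np.array([0.3, 1000.001])

rtol, atol = 1e-05, 1e-08  # the np.allclose defaults

# Element-wise criterion: |a - b| <= atol + rtol * |b|
manual = np.abs(a - b) <= atol + rtol * np.abs(b)

# The manual test agrees with np.isclose, and np.allclose reduces it with all()
assert np.array_equal(manual, np.isclose(a, b, rtol=rtol, atol=atol))
assert np.allclose(a, b)

# With a much tighter relative tolerance the second pair no longer passes
assert not np.allclose(a, b, rtol=1e-09)
```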
One thing to note is that np.allclose raises an exception if the shapes of the two arrays cannot be broadcast together, and silently broadcasts them if they can. We can write a wrapper that treats any shape mismatch as inequality:
def myallclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    # Differently shaped arrays are never considered close
    if a.shape != b.shape:
        return False
    return np.allclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan)
A = np.array([0.1 + 0.2])
B = np.array([0.3])
assert myallclose(A, B) # OK
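The two shape cases are worth seeing side by side. The sketch below uses a hypothetical helper, shape_checked_allclose (the name and the np.asarray conversion are my own assumptions, added so that plain lists work too), to show that an explicit shape check rejects both arrays that np.allclose would refuse and arrays that it would quietly broadcast.

```python
import numpy as np

def shape_checked_allclose(a, b, **kwargs):
    # Hypothetical helper: convert first so lists and scalars have a shape
    a, b = np.asarray(a), np.asarray(b)
    if a.shape != b.shape:
        return False
    return bool(np.allclose(a, b, **kwargs))

x = np.zeros(3)

# Shapes (3,) and (4,) cannot be broadcast: np.allclose raises ValueError
try:
    np.allclose(x, np.zeros(4))
    raised = False
except ValueError:
    raised = True
assert raised

# Shapes (3,) and (1,) can be broadcast, so np.allclose reports a match
assert np.allclose(x, np.zeros(1))

# The shape check reports a mismatch in both cases
assert not shape_checked_allclose(x, np.zeros(4))
assert not shape_checked_allclose(x, np.zeros(1))
assert shape_checked_allclose([0.1 + 0.2], [0.3])
```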
Additionally, the equal_nan argument is False by default, but most likely you want to set it to True:
A = np.array([np.nan])
B = np.array([np.nan])
assert myallclose(A, B) # AssertionError
assert myallclose(A, B, equal_nan=True) # OK
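The reason NaN needs special handling is that it never compares equal to anything, including itself. A short sketch of how this surfaces in the plain NumPy functions (the equal_nan argument of np.array_equal assumes NumPy 1.19 or later):

```python
import numpy as np

nan_array = np.array([np.nan])

# NaN is not equal to itself, so element-wise comparisons fail
assert not (np.nan == np.nan)
assert not np.array_equal(nan_array, nan_array)
assert not np.allclose(nan_array, nan_array)

# Opting in with equal_nan=True treats NaNs in matching positions as equal
assert np.allclose(nan_array, nan_array, equal_nan=True)
assert np.array_equal(nan_array, nan_array, equal_nan=True)  # NumPy >= 1.19
```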
Masked arrays
Similar to np.array_equal, there's ma.allequal for masked arrays, but its behaviour can be surprising.
Logically, two masked values should compare equal, but a masked value and a non-masked value should not.
Both requirements cannot be met at the same time with the provided fill_value argument:
import numpy.ma as ma
A = ma.array([1, 2, 3], mask=[0, 1, 0]) # [1, --, 3]
B = ma.array([1, 4, 5], mask=[0, 0, 1]) # [1, 4, --]
assert ma.allequal(A, A, fill_value=True) == True # OK
assert ma.allequal(A, B, fill_value=True) == False # AssertionError
assert ma.allequal(A, A, fill_value=False) == True # AssertionError
assert ma.allequal(A, B, fill_value=False) == False # OK
We can get the desired result by writing our own function:
def myallequal(a, b):
    # Arrays with different masks are never considered equal
    if not np.array_equal(ma.getmaskarray(a), ma.getmaskarray(b)):
        return False
    return ma.allequal(a, b)
A = ma.array([1, 2, 3], mask=[0, 1, 0]) # [1, --, 3]
B = ma.array([1, 4, 5], mask=[0, 0, 1]) # [1, 4, --]
assert myallequal(A, A) == True # OK
assert myallequal(A, B) == False # OK
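The mask comparison above relies on ma.getmaskarray rather than ma.getmask. A brief sketch of the difference, using made-up arrays: ma.getmask returns the special constant ma.nomask when an array has no mask, whereas ma.getmaskarray always returns a full boolean array, so two masks can be compared element by element.

```python
import numpy as np
import numpy.ma as ma

unmasked = ma.array([1, 2, 3])                  # no mask given
explicit = ma.array([1, 2, 3], mask=[0, 0, 0])  # explicit all-False mask

# getmask returns the constant nomask when no mask is set
assert ma.getmask(unmasked) is ma.nomask

# getmaskarray always returns a boolean array with the shape of the data
assert ma.getmaskarray(unmasked).shape == (3,)

# so "no mask" and "nothing masked" compare as the same mask
assert np.array_equal(ma.getmaskarray(unmasked), ma.getmaskarray(explicit))
```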
Similarly to np.allclose, there's also ma.allclose, but it suffers from the same issue as ma.allequal and doesn't provide an equal_nan argument.
Here's a function that works for our purposes:
def myallclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False):
    # Arrays with different masks are never considered close
    if not np.array_equal(ma.getmaskarray(a), ma.getmaskarray(b)):
        return False
    res = np.all(np.isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
    # If every element is masked, the reduction yields ma.masked
    if res is ma.masked:
        return True
    return res
A = ma.array([0.3, 0.4, 0.5], mask=[0, 1, 0]) # [0.3, --, 0.5]
B = ma.array([0.1 + 0.2, 0.4, 0.5], mask=[0, 1, 0]) # [0.3, --, 0.5]
C = ma.array([0.1 + 0.2, 0.3, 0.4], mask=[0, 0, 1]) # [0.3, 0.4, --]
assert myallclose(A, B) == True # OK
assert myallclose(A, C) == False # OK
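The check for ma.masked in the function above covers the case where every element is masked. A minimal sketch of why it is needed, using hand-made boolean masked arrays in place of the comparison result: reducing a fully masked array with all() yields the ma.masked constant, which is falsy, so returning it unchanged would report two identically masked arrays as different.

```python
import numpy.ma as ma

all_masked = ma.array([True, False], mask=[1, 1])
partly_masked = ma.array([True, False], mask=[0, 1])

# Every element masked: there is nothing to test, so the result is ma.masked
assert all_masked.all() is ma.masked

# ma.masked evaluates to False in a boolean context
assert not ma.masked

# Masked elements are skipped, so only the unmasked True is tested
assert partly_masked.all()
```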