Saturday, January 30, 2010

Broadcasting in Numpy

Array shapes matter when combining arrays with arithmetic operations. Here is a simple array of 4 rows and 3 columns:

>>> from numpy import ones
>>> A = ones((4,3)) # 4 rows x 3 cols
>>> A.shape
(4, 3)
>>> A
array([[ 1., 1., 1.],
       [ 1., 1., 1.],
       [ 1., 1., 1.],
       [ 1., 1., 1.]])


Here is a second array of 4 rows x 1 col:

>>> ones((4,1))      # 4 rows x 1 col
array([[ 1.],
       [ 1.],
       [ 1.],
       [ 1.]])


We can add them. In the process the second array is "broadcast" to the shape of the first one, so its single column is reused for each of the 3 columns of A:

>>> A + ones((4,1))
array([[ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.]])
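
To see what "broadcast" means here, the small array can be stretched by hand. This is only a sketch: it uses numpy.broadcast_to, which does not appear elsewhere in this post and assumes a reasonably recent NumPy, and the names col and stretched are made up for illustration.

```python
import numpy as np

A = np.ones((4, 3))
col = np.ones((4, 1))    # 4 rows x 1 col

# Stretch col to A's shape; the result is a read-only view, no data is copied
stretched = np.broadcast_to(col, A.shape)
print(stretched.shape)   # (4, 3)

# Adding the stretched array gives the same result as letting NumPy broadcast
same = np.array_equal(A + col, A + stretched)
print(same)              # True
```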


An array of a single row and 3 cols works the same way:

>>> ones((1,3))      # 1 row x 3 cols
array([[ 1., 1., 1.]])
>>> A + ones((1,3))
array([[ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.]])
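
With arrays full of ones, the column case and the row case give identical output, which hides the direction of the broadcast. A small sketch with distinct values (the arrays col and row below are invented for this illustration) makes the difference visible:

```python
import numpy as np

A = np.ones((4, 3))
col = np.array([[10.], [20.], [30.], [40.]])   # 4 rows x 1 col
row = np.array([[1., 2., 3.]])                 # 1 row x 3 cols

by_col = A + col   # each row of A gets its own value from col
by_row = A + row   # each column of A gets its own value from row

print(by_col[:, 0])   # first column: 11, 21, 31, 41
print(by_row[0, :])   # first row: 2, 3, 4
```

In the first sum every entry within a row is the same, and in the second every entry within a column is the same.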


Now, it is possible to make a 1D array, e.g.:

>>> B = ones((3,))   # a 1D array
>>> B
array([ 1., 1., 1.])
>>> B.shape
(3,)


And again, we can add A + B, because the 1D array of length 3 is treated as a single row of 3 columns:

>>> A + B
array([[ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.]])
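
One way to check how the 1D array is being interpreted is to reshape it explicitly and compare. This is a minimal sketch; B_row is a name introduced only for this example.

```python
import numpy as np

A = np.ones((4, 3))
B = np.ones((3,))          # 1D array, shape (3,)

B_row = B.reshape(1, 3)    # the same data as an explicit 1 row x 3 cols array
print(B_row.shape)         # (1, 3)

# Both forms broadcast against A in the same way
same = np.array_equal(A + B, A + B_row)
print(same)                # True
```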


But this doesn't work if the length of the 1D array doesn't match the number of columns in A:


>>> C = ones((4,))   # a 1D array
>>> C.shape
(4,)
>>> C
array([ 1., 1., 1., 1.])
>>> A + C
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
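
The reason is that shapes are compared from the last dimension backwards, and two sizes are compatible only if they are equal or one of them is 1. The sketch below spells that rule out by hand; broadcast_shape is a hypothetical helper written for this post, not a NumPy function. Note also that newer NumPy releases word the error differently from the traceback above, so the code prints whatever message it gets rather than assuming one.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: combined shape, or None if the shapes are incompatible."""
    # Pad the shorter shape with leading 1s so both have the same length
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for m, n in zip(a, b):
        if m == n or n == 1:
            result.append(m)
        elif m == 1:
            result.append(n)
        else:
            return None   # sizes differ and neither is 1
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))   # (4, 3)
print(broadcast_shape((4, 3), (4,)))   # None

# Cross-check against NumPy itself
A = np.ones((4, 3))
C = np.ones((4,))
try:
    A + C
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)   # exact wording depends on the NumPy version
```

Here (4,) is padded to (1, 4), and the last dimension 4 cannot be matched against 3.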


To add C down the rows instead, we need to give it a second axis by indexing with newaxis:

>>> from numpy import newaxis
>>> D = C[:,newaxis]
>>> D.shape
(4, 1)
>>> A + D
array([[ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.],
       [ 2., 2., 2.]])
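
Indexing with newaxis is not the only way to get a (4, 1) array. As a rough sketch, here are a few spellings that should be equivalent; the names D1 to D4 are made up for this comparison.

```python
import numpy as np

A = np.ones((4, 3))
C = np.ones((4,))

D1 = C[:, np.newaxis]            # as in the session above
D2 = C[:, None]                  # np.newaxis is an alias for None
D3 = C.reshape(-1, 1)            # -1 lets NumPy work out the number of rows
D4 = np.expand_dims(C, axis=1)   # insert a new axis at position 1

for D in (D1, D2, D3, D4):
    print(D.shape, (A + D).shape)   # (4, 1) (4, 3) each time
```

Which one to use is mostly a matter of taste; the indexing form is the shortest, while reshape and expand_dims say more explicitly what is happening.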