
In Python, we can use the **numpy.where()** function to select elements from a numpy array, based on a condition.

Not only that, but we can perform some operations on those elements if the condition is satisfied.

Let’s look at how we can use this function, using some illustrative examples!

This function accepts a numpy-like array (e.g. a NumPy array of integers or booleans).

It returns a new NumPy array after filtering based on a **condition**, which is a numpy-like array of boolean values.

For example, `condition` can take the value `array([[True, True, True]])`, which is a numpy-like boolean array. (An array of another type, such as integers, can be cast to `bool` with `astype(bool)`; zero becomes `False` and any non-zero value becomes `True`.)

For example, if `condition` is `array([[True, True, False]])` and our array is `a = np.array([[1, 2, 3]])`, then applying the condition to the array (`a[condition]`) gives the one-dimensional array `array([1, 2])`.

```
import numpy as np
a = np.arange(10)
print(a[a <= 2]) # Will only capture elements <= 2 and ignore others
```

**Output**

```
[0 1 2]
```

**NOTE**: The condition is written here as the expression **a <= 2**, which NumPy evaluates to a boolean array. This is the recommended way to write the condition, as it is very tedious to spell it out as a boolean array by hand.
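To make the note concrete, here is a minimal sketch comparing a hand-written boolean mask with the one produced by the comparison expression. The variable names `mask_explicit` and `mask_from_expr` are only illustrative.

```python
import numpy as np

a = np.arange(10)

# The boolean mask written out by hand (tedious and error-prone)
mask_explicit = np.array([True, True, True, False, False,
                          False, False, False, False, False])

# The same mask, produced by evaluating the comparison expression
mask_from_expr = a <= 2

# Both masks are identical, so they select the same elements
print(np.array_equal(mask_explicit, mask_from_expr))  # True
print(a[mask_explicit].tolist())                      # [0, 1, 2]
print(a[mask_from_expr].tolist())                     # [0, 1, 2]
```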

But what if we want to preserve the dimension of the result, and not lose out on elements from our original array? We can use **numpy.where()** for this.

```
numpy.where(condition [, x, y])
```

We have two more parameters, `x` and `y`. What are those?

Basically, what this says is that if `condition` holds true for some element in our array, the new array will choose elements from `x`.

Otherwise, if it’s false, elements from `y` will be taken.

With that, our final output array will be an array with elements from `x` wherever `condition = True`, and elements from `y` wherever `condition = False`.

Note that although `x` and `y` are optional, if you specify `x`, you **MUST** also specify `y`. This is because, **in this case**, every position in the output array must be filled from either `x` or `y`; NumPy raises a `ValueError` if only one of them is given.

**NOTE**: The same logic applies to both single- and multi-dimensional arrays. In both cases, we filter based on the condition. Also remember that the shapes of `x`, `y` and `condition` are broadcast together.
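As a quick illustration of the three-argument form, the sketch below picks from `x` where the condition is true and from `y` elsewhere, and then shows that passing `x` without `y` is rejected. The arrays and the `error_raised` flag are made up for this example.

```python
import numpy as np

condition = np.array([True, False, True, False])
x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

# Take from x where condition is True, otherwise from y
result = np.where(condition, x, y)
print(result.tolist())  # [1, 20, 3, 40]

# Supplying x without y is an error
error_raised = False
try:
    np.where(condition, x)
except ValueError as err:
    error_raised = True
    print("ValueError:", err)
```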

Now, let us look at some examples, to understand this function properly.

Suppose we want to keep only the positive elements of a NumPy array and set all negative elements to 0. Let’s write the code using `numpy.where()`.

We’ll use a 2-dimensional random array here, and only output the positive elements.

```
import numpy as np
# Random initialization of a (2D array)
a = np.random.randn(2, 3)
print(a)
# b will be all elements of a whenever the condition holds true (i.e only positive elements)
# Otherwise, set it as 0
b = np.where(a > 0, a, 0)
print(b)
```

**Possible Output**

```
[[-1.06455975 0.94589166 -1.94987123]
[-1.72083344 -0.69813711 1.05448464]]
[[0. 0.94589166 0. ]
[0. 0. 1.05448464]]
```

As you can see, only the positive elements are now retained!

There may be some confusion regarding the above code, as some of you may think that the more intuitive way would be to simply write the condition like this:

```
import numpy as np
a = np.random.randn(2, 3)
b = np.where(a > 0)
print(b)
```

If you now try running the above code, with this change, you’ll get an output like this:

```
(array([0, 1]), array([2, 1]))
```

If you observe closely, `b` is now a **tuple** of NumPy arrays, one per dimension of `a`. The first array holds the row indices and the second holds the column indices of the positive elements, so in this output the positive elements are at positions `(0, 2)` and `(1, 1)`. What does this mean?

Whenever we provide just a condition, this function is actually equivalent to `np.asarray(condition).nonzero()`.

In our example, `np.asarray(a > 0)` will return a boolean array after applying the condition, and `np.nonzero(arr_like)` will return the indices of the non-zero elements of `arr_like`.
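The following sketch uses a small fixed array (chosen here for reproducibility, instead of a random one) to show that the condition-only form matches `np.nonzero()`, and that the returned index arrays can be used to pull the matching elements back out of the array.

```python
import numpy as np

# A fixed 2D array, so the result is the same on every run
a = np.array([[-1.0, 2.0, -3.0],
              [-4.0, -5.0, 6.0]])

# With only a condition, np.where returns one index array per dimension
rows, cols = np.where(a > 0)
print(rows.tolist(), cols.tolist())  # [0, 1] [1, 2]

# The result matches np.nonzero applied to the boolean array
same = all(np.array_equal(p, q)
           for p, q in zip(np.where(a > 0), np.nonzero(a > 0)))
print(same)  # True

# The index arrays can be used to fetch the matching elements
print(a[rows, cols].tolist())  # [2.0, 6.0]
```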

So, we’ll now look at a simpler example, that shows us how flexible we can be with numpy!

```
import numpy as np
a = np.arange(10)
b = np.where(a < 5, a, a * 10)
print(a)
print(b)
```

**Output**

```
[0 1 2 3 4 5 6 7 8 9]
[ 0 1 2 3 4 50 60 70 80 90]
```

Here, the condition is `a < 5`, which will be the numpy-like array `[True True True True True False False False False False]`, `x` is the array `a`, and `y` is the array `a * 10`. So, we choose from `a` only if `a < 5`, and from `a * 10` if `a >= 5`.

So, this transforms all elements >= 5 by multiplying them by 10. This is what we get indeed!
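One way to picture this selection is as an element-by-element choice. The sketch below is only a mental model written as a plain Python loop (not how NumPy implements it internally) and checks that it agrees with `np.where()`. The names `manual` and `vectorized` are illustrative.

```python
import numpy as np

a = np.arange(10)
condition = a < 5

# Element-by-element version: pick from a when the condition is True,
# otherwise pick from a * 10
manual = np.array([x_val if cond else y_val
                   for cond, x_val, y_val in zip(condition, a, a * 10)])

# The vectorized version does the same selection in a single call
vectorized = np.where(condition, a, a * 10)

print(manual.tolist())                        # [0, 1, 2, 3, 4, 50, 60, 70, 80, 90]
print(np.array_equal(manual, vectorized))     # True
```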

If we provide all of the `condition`, `x`, and `y` arrays, NumPy will broadcast them together.

```
import numpy as np
a = np.arange(12).reshape(3, 4)
b = np.arange(4).reshape(1, 4)
print(a)
print(b)
# Broadcasts (a < 5, a, and b * 10)
# of shape (3, 4), (3, 4) and (1, 4)
c = np.where(a < 5, a, b * 10)
print(c)
```

**Output**

```
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[0 1 2 3]]
[[ 0 1 2 3]
[ 4 10 20 30]
[ 0 10 20 30]]
```

Again, the output is selected element by element based on the condition, but here `b` is broadcast to the shape of `a`. (Its first dimension has only one element, so there will be no errors during broadcasting.)

So, `b` will now become `[[0 1 2 3] [0 1 2 3] [0 1 2 3]]`, and now we can select elements even from this broadcast array.

So the shape of the output is the same as the shape of `a`.
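To see the broadcast array explicitly, one option is `np.broadcast_to()`, which is used here purely for illustration; `np.where()` performs the broadcasting on its own. The sketch checks that expanding `b` by hand gives the same result as letting `np.where()` do it.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b = np.arange(4).reshape(1, 4)

# Make the broadcasting visible: b repeated along the first axis
b_expanded = np.broadcast_to(b, a.shape)
print(b_expanded.tolist())  # [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]

# Letting np.where broadcast b itself gives the same result
c_implicit = np.where(a < 5, a, b * 10)
c_explicit = np.where(a < 5, a, b_expanded * 10)
print(np.array_equal(c_implicit, c_explicit))  # True
print(c_implicit.shape)                        # (3, 4)
```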

In this article, we learned how to use the Python **numpy.where()** function to select elements from arrays based on a condition array.

