Beginner’s Guide to NumPy
NumPy is a Python library used for scientific computing. It’s also popularly used in data science, machine learning, and general-purpose applications along with other libraries. It provides high-performance multidimensional arrays and some useful tools to deal with them. The name “NumPy” stands for Numerical Python.
A NumPy array is a grid of values of the same type that is indexed by a tuple of non-negative integers. NumPy is popular because it’s easy to use and understand, and it also gives programmers fine-grained control over arrays.
This article explores the ins and outs of NumPy, the amazing, memory-efficient Python library.
- Getting started with NumPy
- Creating arrays in NumPy
- NumPy data types
- Working with NumPy arrays
- Learn more about NumPy
Prerequisites
- Python 3.x (3.10 is highly recommended)
- NumPy latest version (pip install numpy)
Getting started with NumPy
First of all, let’s create a new Python file (e.g., index.py) and then import NumPy into our script by using the following command.
import numpy
For easier usage, we can import it with an alias like below:
import numpy as np
Now, we can just use the term np everywhere. In the following code, we’re creating a simple NumPy array containing integer values.
import numpy as np
myArray = np.array([1, 2, 3, 4, 5, 6])
print(myArray)
If you run the command python3 index.py, it will print the array as the output.
Creating arrays in NumPy
As we learned before, NumPy is built for working with arrays. In NumPy, an array is created with the numpy.array() function, and the resulting object is called an ndarray (note: this refers to n-dimensional arrays; we’ll learn more about dimensions later on in this tutorial).
To verify, we can run this code:
import numpy as np
myArray = np.array([1,9,1,7])
print(type(myArray))
The output would be:
<class 'numpy.ndarray'>
Unlike the built-in arrays, we can pass any sequence-like data structure, such as a list, a tuple, or another array, to create an ndarray, as it automatically gets converted.
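As a small sketch of that conversion, the snippet below builds the same array from a list, a tuple, and an existing ndarray. The variable names are made up for illustration.

```python
import numpy as np

from_list = np.array([1, 9, 1, 7])    # built from a list
from_tuple = np.array((1, 9, 1, 7))   # built from a tuple
from_array = np.array(from_list)      # built from an existing ndarray (a copy by default)

print(type(from_tuple))
print(np.array_equal(from_list, from_tuple))  # True: same values either way
```

Whatever the input container, the result is always an ndarray.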
Dimensions
In NumPy arrays there are dimensions. Wait, what does “dimension” mean?
Array dimensions work like building a cube out of squares. You start with one square (0-D array). By putting squares side by side, you make a row of squares (1-D array). By stacking rows, you make a grid (2-D array), and by stacking grids, you make a cube (3-D array).
In arrays, dimension refers to the level of the array’s depth (often called nested arrays). I highly recommend paying close attention to this part and understanding the concept, as it will be very useful later.
0-D arrays
According to our cube example, 0-D is the small square, just like the building unit.
Also known as scalars, 0-D arrays are the elements of arrays. Each non-array value in an array can be considered a 0-D array. Here’s what a 0-D array looks like:
import numpy as np
myArray = np.array('Mattermost')
print(myArray, myArray.ndim)  # ndim gives the number of dimensions
Output:
Mattermost 0
1-D array
A 1-D array is a row of squares that we build using 0-D squares. Arrays containing 0-D arrays as elements are called 1-D arrays. The following is an example of a 1-D array:
import numpy as np
myArray = np.array(["Mattermost", "is", "Awesome!"])
print(myArray, myArray.ndim)
Output:
['Mattermost' 'is' 'Awesome!'] 1
2-D array
Once we have two or more rows of squares, we can attach them together like a matrix.
That’s how we create 2-D arrays, or arrays built with 1-D arrays as elements. We can use the following syntax to make a 2-D array:
import numpy as np
myArray = np.array([['Mattermost', 'Playbooks'], ['Mattermost', 'Channels']])
print(myArray, myArray.ndim)
Output:
[['Mattermost' 'Playbooks']
['Mattermost' 'Channels']] 2
Often, 2-D arrays are used in representing matrices or second-order tensors.
3-D array
In the previous part, we built something like a matrix. Now, we have to attach them and make a cube!
And this can be considered a 3-D array, which uses 2-D arrays as the elements. Often, 3-D arrays are used to represent third-order tensors.
We can create 3-D arrays with the following syntax:
import numpy as np
myArray = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(myArray, myArray.ndim)
This one might look somewhat complex. But the following illustration should make it a little clearer.
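One way to see the nesting in code is to inspect the array piece by piece. The sketch below reuses the array from the example and relies on the shape attribute, which is covered later in this guide.

```python
import numpy as np

myArray = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

print(myArray.ndim)    # 3 dimensions
print(myArray.shape)   # (2, 2, 3): 2 blocks, each with 2 rows of 3 values
print(myArray[0])      # the first 2-D block
print(myArray[0][1])   # the second row of that block
```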
Python Array Dimensions: FAQ
How many dimensions can a NumPy array have?
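- As many as you are likely to need. NumPy does cap the number of dimensions (32 in NumPy 1.x, 64 since NumPy 2.0), but real-world arrays rarely come close.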
How can I find the number of dimensions in a NumPy array?
- Use the ndim property (e.g., print(myArray.ndim)).
How can I find the memory size of a NumPy array?
- Use the nbytes property to find the total memory consumed by the elements of an array (e.g., print(myArray.nbytes)).
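A short sketch of both properties follows. It also uses size and itemsize, two ndarray attributes not mentioned above, to show where the nbytes figure comes from. The dtype is fixed to float64 here so the byte counts are the same on every platform.

```python
import numpy as np

myArray = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype=np.float64)

print(myArray.ndim)      # 2
print(myArray.size)      # 6 elements in total
print(myArray.itemsize)  # 8 bytes per float64 element
print(myArray.nbytes)    # 48, which is size * itemsize
```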
NumPy data types
In addition to the basic data types in Python, NumPy arrays support some data types of their own, each identified by a one-character code. Here are the most common ones:
- i – integer
- ? – boolean
- u – unsigned integer
- f – float
- c – complex float
- m – timedelta
- M – datetime
- O – Python object
- S – string
To check the data type of an array, ndarray has a built-in attribute: dtype.
import numpy as np
myArray = np.array([True, False, True])
print(myArray.dtype)
Output:
bool
Imagine that we want to create an array from integers but need them stored as string values. We can simply pass the dtype argument to np.array() as below:
import numpy as np
myArray = np.array([1, 2, 3, 4], dtype="S")
print(myArray.dtype)
Output:
|S1
Remember, some values can’t be converted (e.g., a non-numeric string such as "Mattermost" can’t be converted to an integer). In that case, NumPy raises a ValueError.
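The sketch below illustrates both cases using astype(), an ndarray method that returns a converted copy of an array. The method is not used elsewhere in this guide, and the variable names are illustrative.

```python
import numpy as np

# Strings that look like numbers convert cleanly.
numeric_strings = np.array(["1", "2", "3"])
as_ints = numeric_strings.astype(int)
print(as_ints)  # [1 2 3]

# Strings that are not numbers cannot be converted.
words = np.array(["Mattermost", "Channels"])
try:
    words.astype(int)
except ValueError as error:
    print("Conversion failed:", error)
```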
Working with NumPy arrays
Now we have a clear understanding of creating arrays. Next, we need to learn how to do different operations with NumPy arrays — such as indexing, slicing, iterating, joining, and splitting.
Indexing NumPy arrays
Like regular Python lists, NumPy arrays are indexed starting from 0. If we want to print out the first element of an array, we can simply use this command: print(myArray[0]).
The following code is a basic example of where and how indexing can be used:
import numpy as np
myArray = np.array(["Mattermost", "Slack"])
print(f"{myArray[0]} is a great alternative for {myArray[1]}")
Output:
Mattermost is a great alternative for Slack
But a single index like this only picks out individual values in 1-D arrays. In 2-D arrays, we need one index per dimension.
Let’s think of a 2-D array as a table with rows and columns. The first index selects the row and the second index selects the column. By envisioning the array as a grid, every element is easy to locate.
The following example shows indexing in 2-D arrays:
import numpy as np
myArray = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('3rd column element on 1st row: ', myArray[0, 2])
Output:
3rd column element on 1st row:  3
Here’s another example of 2-D array indexing:
import numpy as np
myArray = np.array([['Mattermost', 'Playbooks'], ['Mattermost', 'Channels']])
print(f"{myArray[0, 1]} and {myArray[1, 1]} are two products of Mattermost")
Output:
Playbooks and Channels are two products of Mattermost
When it comes to 3-D arrays, we can use a similar syntax and include the third dimension for the index. The code might look like this:
import numpy as np
myArray = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print("First 2-D array, Second 1-D array, Second element is", myArray[0, 1, 1])
Output:
First 2-D array, Second 1-D array, Second element is 5
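As a brief sketch, the comma-separated form is equivalent to chaining one index per dimension. The example also shows negative indexes, which count from the end of each dimension just as they do for Python lists.

```python
import numpy as np

myArray = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

print(myArray[0, 1, 1])     # 5
print(myArray[0][1][1])     # 5, the same element reached one dimension at a time
print(myArray[-1, -1, -1])  # 12, the last value of the last row of the last block
```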
Slicing NumPy arrays
Slicing refers to the process of retrieving elements from one given index to another given index. For example, imagine that we want to print the words “Hello” and “World” from an array that is ["By the Way,", "Hello", "World!"]. We can use slicing like this:
import numpy as np
myArray = np.array(["By the Way,", "Hello", "World!"])
print(myArray[1:3])
Output:
['Hello' 'World!']
When slicing, the result includes the start index but excludes the end index. If we used myArray[1:2], the result would be just: ['Hello'].
When slicing 2-D arrays, we also have to say which row (1-D array) the slice applies to. Therefore, the code looks like this:
import numpy as np
myArray = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(f"print second 1-D array until before the 4th element:{myArray[1, 1:4]}")
Output:
print second 1-D array until before the 4th element:[7 8 9]
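A slice can also be used in the row position. The following sketch, using the same array, slices both dimensions at once and then pulls out a whole column.

```python
import numpy as np

myArray = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# Rows 0 and 1, columns 1 and 2.
block = myArray[0:2, 1:3]
print(block)

# A bare colon means "everything along this dimension".
first_column = myArray[:, 0]
print(first_column)
```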
Step option
We can use the step value to indicate the step of slicing. Imagine that we want to print every other number from 0 to 7. In that case, we can use the step value as 2 to get the desired values:
import numpy as np
arr = np.array([0, 1, 2, 3, 4, 5, 6, 7])
print(arr[0:7:2])
Output:
[0 2 4 6]
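The start and end values can be omitted, in which case the slice runs over the whole array. A few variations, as a quick sketch:

```python
import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7])

evens = arr[::2]         # every other element, starting at index 0
odds = arr[1::2]         # every other element, starting at index 1
reversed_arr = arr[::-1] # a negative step walks the array backwards

print(evens)
print(odds)
print(reversed_arr)
```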
Iterating arrays
Iterating is going through each element of the array. When iterating, you can perform actions on each element of the array. We can do this using a basic for-loop operation in Python, as simple as below:
import numpy as np
myArray = np.array(["M", "A", "T", "T", "E", "R", "M", "O", "S", "T",])
for x in myArray:
    print(x)
The result would be:
M
A
T
T
E
R
M
O
S
T
If we try this on a 2-D array, the loop iterates over the rows, returning each 1-D array in turn:
import numpy as np
myArray = np.array([[1, 2, 3], [4, 5, 6]])
for x in myArray:
    print(x)
Output:
[1 2 3]
[4 5 6]
But sometimes we need the individual values. In that case, we have to iterate through each dimension with nested loops:
import numpy as np
myArray = np.array([[1, 2, 3], [4, 5, 6]])
for x in myArray:
    for y in x:
        print(y)
Output:
1
2
3
4
5
6
In 3-D arrays, if we use a normal for-loop, it will return the 2-D arrays as the output:
import numpy as np
myArray = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in myArray:
    print(x)
Output:
[[1 2 3]
[4 5 6]]
[[ 7 8 9]
[10 11 12]]
To return exact values, we have to use more nested loops just like in 2-D array iterating, but one step deeper:
import numpy as np
myArray = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in myArray:
    for y in x:
        for z in y:
            print(z)
Output:
1
2
3
4
5
6
7
8
9
10
11
12
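Nested loops get unwieldy as dimensions grow. As an alternative sketch, NumPy offers np.nditer() and the flat attribute, both of which visit every scalar regardless of the number of dimensions. Neither is covered elsewhere in this guide, and the list names below are illustrative.

```python
import numpy as np

myArray = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

# np.nditer visits every scalar, however many dimensions the array has.
values_nditer = [int(x) for x in np.nditer(myArray)]

# The flat attribute is a 1-D iterator over the same elements.
values_flat = [int(x) for x in myArray.flat]

print(values_nditer)
print(values_flat)
```

Both lists contain the numbers 1 through 12, in the same order the triple loop prints them.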
Joining and splitting arrays
Joining means combining the elements of two or more arrays into a single array. In NumPy, arrays are joined along an axis.
An axis is one dimension of an array: a 1-D array has only axis 0, while a 2-D array has axis 0 (running down the rows) and axis 1 (running across the columns).
We pass the arrays to join to the concatenate() function, along with the axis. If the axis value is not given, it defaults to 0.
import numpy as np
myArray = np.array(["Mattermost", "is", "awesome"])
myArray2 = np.array(["and", "open-source!"])
joined_array = np.concatenate((myArray, myArray2))
print(joined_array)
The output is:
['Mattermost' 'is' 'awesome' 'and' 'open-source!']
Since the arrays above are 1-D arrays, including the axis value is not needed. But when it comes to 2-D and 3-D arrays, it’s a good practice to indicate the axis value.
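To illustrate what the axis value changes, here is a small sketch with two 2-D arrays (the names a and b are placeholders):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# axis=0 stacks the rows of b underneath the rows of a.
joined_rows = np.concatenate((a, b), axis=0)
print(joined_rows.shape)  # (4, 2)

# axis=1 appends the columns of b to the right of a.
joined_cols = np.concatenate((a, b), axis=1)
print(joined_cols.shape)  # (2, 4)
```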
Splitting is the opposite of joining. When joining NumPy arrays, multiple arrays are merged into one. When splitting, however, a single array is broken into multiple arrays. We use the array_split() function for that, along with the desired number of resulting arrays:
import numpy as np
myArray = np.array([1, 2, 3, 4, 5, 6])
newArray = np.array_split(myArray, 3)
print(newArray)
Output:
[array([1, 2]), array([3, 4]), array([5, 6])]
The result is a Python list of arrays. Each item in it is like any other array, and you can access it by indexing the list:
import numpy as np
myArray = np.array([1, 2, 3, 4, 5, 6])
newArray = np.array_split(myArray, 3)
print(newArray[0])
Output:
[1 2]
We can use the same syntax for 2-D arrays, and the resulting pieces are 2-D arrays as well:
import numpy as np
myArray = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
newArray = np.array_split(myArray, 3)
print(newArray)
Output:
[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]
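The examples above divide evenly, but array_split() also accepts sizes that don't. In this sketch, 7 elements are split into 3 parts, and the earlier parts receive the extra elements:

```python
import numpy as np

myArray = np.array([1, 2, 3, 4, 5, 6, 7])
parts = np.array_split(myArray, 3)

for part in parts:
    print(part)

# Sizes of the three pieces.
sizes = [len(part) for part in parts]
print(sizes)  # [3, 2, 2]
```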
Searching and sorting NumPy arrays
There are some instances where you need to find certain elements in an array. The where() function in NumPy comes in handy in these situations.
One common use of where() is finding the indexes where a certain value appears:
import numpy as np
myArray = np.array(["Mattermost", "Playbooks", "Mattermost", "Channels"])
index = np.where(myArray == "Mattermost")
print(index)
Output:
(array([0, 2]),)
The above output means the string ‘Mattermost’ was found at the 0th and 2nd indexes.
You can also use the function to find the indexes where the values are even numbers:
import numpy as np
myArray = np.array([1, 2, 3, 4, 5, 6, 7, 8])
index = np.where(myArray % 2 == 0)  # check the remainder of each value divided by 2
# To find odd numbers, change 0 to 1
print(index)
Output:
(array([1, 3, 5, 7]),)
The above output means the values at those indexes in the array are even. Note that the indexes themselves are odd in the output.
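If the values themselves are needed rather than their positions, the result of where() can be used to index the array. A minimal sketch follows, together with the equivalent shortcut of indexing with the condition directly:

```python
import numpy as np

myArray = np.array([1, 2, 3, 4, 5, 6, 7, 8])

index = np.where(myArray % 2 == 0)
even_values = myArray[index]        # look up the values at those indexes
print(even_values)                  # [2 4 6 8]

# Indexing with the boolean condition gives the same result.
even_values_mask = myArray[myArray % 2 == 0]
print(even_values_mask)             # [2 4 6 8]
```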
Sorting means arranging elements in an ordered sequence (e.g., putting string values in alphabetical order). NumPy provides the sort() function for that purpose. Note that np.sort() returns a sorted copy and leaves the original array unchanged.
The following are some examples of sorting arrays in NumPy.
Sorting integer values:
import numpy as np
myArray = np.array([4, 5, 6, 7, 8, 2, 0, 1])
print(np.sort(myArray))
Output:
[0 1 2 4 5 6 7 8]
Sorting string values:
import numpy as np
myArray = np.array(['Playbooks', 'Channels', 'Boards'])
print(np.sort(myArray))
Output:
['Boards' 'Channels' 'Playbooks']
Sorting Boolean values:
import numpy as np
myArray = np.array([True, False, True])
print(np.sort(myArray))
Output:
[False True True]
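Two details are worth a quick sketch: np.sort() returns a new array rather than modifying its input, and on a 2-D array it sorts each row separately by default. The name grid is just an example.

```python
import numpy as np

myArray = np.array([4, 5, 6, 7, 8, 2, 0, 1])
sortedArray = np.sort(myArray)
print(sortedArray)  # [0 1 2 4 5 6 7 8]
print(myArray)      # [4 5 6 7 8 2 0 1], the original is untouched

grid = np.array([[3, 1, 2], [9, 7, 8]])
sortedGrid = np.sort(grid)  # sorts along the last axis, i.e. within each row
print(sortedGrid)
```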
Shape and reshape in arrays
The shape of an array is the number of elements in each dimension. To get the shape of an array, we use the shape attribute:
import numpy as np
myArray = np.array([[1, 2, 3], [4, 5, 6]])
print(myArray)
print(myArray.shape)
Output:
[[1 2 3]
[4 5 6]]
(2, 3)
The resulting (2, 3) means there are 2 rows and 3 columns in the provided array.
In NumPy, reshaping means changing the shape of an array without changing its data. For example, if we want to convert a 1-D array to a 2-D array, we can use reshape():
import numpy as np
myArray = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newArray = myArray.reshape(4, 3)
print(newArray)
Output:
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
Accordingly, the output should contain 4 arrays (rows) with 3 elements (columns) each, with the content of the original array distributed amongst the new shape.
When converting from 1-D to 3-D, we pass three values to reshape() instead of two.
import numpy as np
myArray = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newArray = myArray.reshape(2, 3, 2)
print(newArray)
Output:
[[[ 1 2]
[ 3 4]
[ 5 6]]
[[ 7 8]
[ 9 10]
[11 12]]]
The output above contains two 2-D arrays that contain three 1-D arrays, each with 2 elements (0-D).
And when we want to convert any multidimensional array to a 1-D array, it’s a lot simpler. We just need to use reshape(-1).
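Here is a minimal sketch of flattening with reshape(-1). It also shows that -1 can stand in for a single dimension inside a larger shape, in which case NumPy works out that dimension from the number of elements.

```python
import numpy as np

myArray = np.array([[1, 2, 3], [4, 5, 6]])

flatArray = myArray.reshape(-1)
print(flatArray)        # [1 2 3 4 5 6]

# With 6 elements and 3 rows requested, NumPy infers 2 columns.
inferred = flatArray.reshape(3, -1)
print(inferred.shape)   # (3, 2)
```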
Now, you might have a question: Can we reshape any array into another shape? The answer is yes, as long as both shapes contain the same number of elements. For example, if we have a 1-D array that contains 8 elements and we want to convert it into a 2-D array with 2 rows containing 4 elements each, it is possible because 2 x 4 = 8.
But if we want to convert the same array into a 2-D array with 3 rows containing 3 elements each, it is not possible because 3 x 3 = 9. It will raise a ValueError:
ValueError: cannot reshape array of size 8 into shape (3,3)
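Both cases can be tried out with the sketch below, which catches the error so the script keeps running. The array is built with np.arange(), a NumPy function not used elsewhere in this guide, purely for brevity.

```python
import numpy as np

myArray = np.arange(1, 9)  # 8 elements: 1 through 8

# 2 x 4 = 8, so this works.
ok = myArray.reshape(2, 4)
print(ok.shape)  # (2, 4)

# 3 x 3 = 9, so this fails.
try:
    myArray.reshape(3, 3)
except ValueError as error:
    print("ValueError:", error)
```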
Learn more about NumPy
In this guide, we learned many things about NumPy arrays, a critical component of working with NumPy. If you want to learn more about NumPy or get stuck, check out its documentation site.