
Beginner's Guide to NumPy

NumPy (Numerical Python) is a powerful library in Python for numerical computations. It provides support for arrays, matrices, and a variety of high-level mathematical functions, making it an essential tool for scientific and engineering applications.

1. Introduction to NumPy

NumPy is an open-source Python library designed specifically for numerical computations. At its core, NumPy provides the ndarray, a powerful n-dimensional array object that allows for fast and efficient storage and manipulation of numerical data.

It also includes a wide range of mathematical functions that operate on these arrays, so you can handle large datasets and perform complex computations with fewer lines of code.

NumPy is widely used in various fields, including data science, machine learning, and scientific research.

2. Installing NumPy

To start using NumPy, it needs to be installed in your Python environment. You can easily install NumPy using the Python package manager pip by running the following command in your terminal or command prompt:

pip install numpy

After installation, you can import NumPy in your Python scripts using the alias np (commonly used in the community):

import numpy as np
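A quick way to confirm that the installation worked is to import the library and print its version. This is only a small check; the version number you see depends on your environment.

```python
import numpy as np

# Prints the installed NumPy version as a string; the value varies by environment.
print(np.__version__)
```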

 

3. NumPy Arrays

The core of NumPy is the ndarray object. These arrays are homogeneous (all elements have the same data type) and multidimensional.
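To make "homogeneous" concrete, here is a small sketch of what happens when a list mixes integers and a float. The variable names are only for illustration.

```python
import numpy as np

# Mixing integers and a float: NumPy picks one common type for all elements.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# All integers stay integers.
ints = np.array([1, 2, 3])
print(ints.dtype)   # int32 or int64, depending on the system

# A nested list becomes a two-dimensional array.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.ndim)    # 2
```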

Creating Arrays

NumPy arrays can be created in several ways, depending on the use case and the data you are working with:

  1. Creating Arrays from Lists or Tuples: You can create a NumPy array directly from Python lists or tuples using the np.array() function. For example:

import numpy as np

arr = np.array([1, 2, 3])

print(arr) 

# Output: [1 2 3]

This creates a one-dimensional NumPy array from a list of integers.


  2. Using NumPy Functions to Create Arrays: NumPy provides functions to create arrays with predefined values, such as zeros, ones, or evenly spaced values. Some of these functions include:

  • np.zeros((rows, cols)): Creates an array filled with zeros.
  • np.ones((rows, cols)): Creates an array filled with ones.
  • np.arange(start, stop, step): Generates values from start up to, but not including, stop, in increments of step.
  • np.linspace(start, stop, num): Creates num evenly spaced values from start to stop, with stop included by default.

 

Besides creating an array from a sequence of elements, you can easily create an array filled with 0’s:

>>> np.zeros(2)

array([0., 0.])

Or an array filled with 1’s:

>>> np.ones(2)

array([1., 1.])

Or even an empty array! The function empty creates an array whose initial content is random and depends on the state of the memory. The reason to use empty over zeros (or something similar) is speed - just make sure to fill every element afterwards!

>>> # Create an empty array with 2 elements

>>> np.empty(2)

array([ 3.14, 42.  ])  # may vary
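As a sketch of the "fill every element afterwards" advice, the example below allocates an array with np.empty() and then overwrites each slot before reading it. The name squares and the size 5 are just for illustration.

```python
import numpy as np

# Allocate space for 5 values without initializing them.
squares = np.empty(5)

# Fill every element before reading the array.
for i in range(5):
    squares[i] = i ** 2

print(squares)  # [ 0.  1.  4.  9. 16.]
```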

You can create an array with a range of elements:

>>> np.arange(4)

array([0, 1, 2, 3])

And even an array that contains a range of evenly spaced values. To do this, you specify the first number, the last number, and the step size. The last number itself is not included in the result.

>>> np.arange(2, 9, 2)

array([2, 4, 6, 8])

You can also use np.linspace() to create an array with values that are spaced linearly in a specified interval:

>>> np.linspace(0, 10, num=5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])
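The two functions are easy to mix up, so here is a side-by-side sketch over the same interval. With np.arange() you pick the step, and with np.linspace() you pick how many points you want.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded.
by_step = np.arange(0, 10, 2.5)
print(by_step)    # [0.  2.5 5.  7.5]

# linspace: you choose the number of points; the stop value is included.
by_count = np.linspace(0, 10, num=5)
print(by_count)   # [ 0.   2.5  5.   7.5 10. ]
```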

Specifying your data type

Functions such as np.ones() and np.zeros() use floating point (np.float64) by default, but you can explicitly specify which data type you want using the dtype keyword.

>>> x = np.ones(2, dtype=np.int64)

>>> x

array([1, 1])
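The sketch below compares the default data type with an explicit one. It also uses the array method astype(), which is not covered above, to convert an array that already exists.

```python
import numpy as np

# Default: floating point.
a = np.zeros(3)
print(a.dtype)  # float64

# Request integers explicitly with the dtype keyword.
b = np.zeros(3, dtype=np.int64)
print(b.dtype)  # int64

# Convert an existing array; astype returns a new array.
c = np.array([1.7, 2.2, 3.9]).astype(np.int64)
print(c)        # [1 2 3]  (the fractional part is dropped)
```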


Array Attributes

Every NumPy array has several attributes that provide information about its structure and properties. These include:

  • shape: This attribute returns the dimensions of the array as a tuple (e.g., rows and columns for a 2D array).
  • dtype: This specifies the data type of the array elements (e.g., int32, float64).
  • ndim: This indicates the number of dimensions of the array.
  • size: This provides the total number of elements in the array.

Example:

arr = np.array([[1, 2], [3, 4]])

print(arr.shape)  # Output: (2, 2), indicating 2 rows and 2 columns.

print(arr.dtype)  # Output: int32 or int64, depending on the system.

print(arr.ndim)   # Output: 2, indicating a 2D array.

print(arr.size)   # Output: 4, the total number of elements in the array.
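The same attributes work for arrays with more than two dimensions. Here is a short sketch with a three-dimensional array; the name cube and its shape are chosen only for illustration.

```python
import numpy as np

# 2 blocks, each with 3 rows and 4 columns, filled with zeros.
cube = np.zeros((2, 3, 4))

print(cube.shape)  # (2, 3, 4)
print(cube.ndim)   # 3
print(cube.size)   # 24  (2 * 3 * 4)
print(cube.dtype)  # float64
```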

Array Indexing and Slicing

Indexing

You can index and slice NumPy arrays in the same ways you can slice Python lists.

>>> data = np.array([1, 2, 3])

>>> data[1]

2

>>> data[0:2]

array([1, 2])

>>> data[1:]

array([2, 3])

>>> data[-2:]

array([2, 3])


For multidimensional arrays, you can specify row and column indices:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr[0, 2]) 

# Output: 3 (element in the first row and third column)

 

For multidimensional arrays, slicing can be applied along each dimension:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr[:, 1]) 

# Output: [2 5] (second column of the array)
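Here are a few more combinations of row and column indexing on the same array, as a sketch of how the two positions inside the brackets work together.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr[1])        # [4 5 6]  (second row)
print(arr[:, 0:2])   # [[1 2]
                     #  [4 5]]  (first two columns of every row)
print(arr[0, 1:])    # [2 3]    (first row, from the second column on)
print(arr[-1, -1])   # 6        (last row, last column)
```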

  

Basic array operations

NumPy supports element-wise arithmetic operations on arrays. You can perform operations like addition, subtraction, multiplication, and division directly on the arrays:


Once you’ve created your arrays, you can start to work with them. Let’s say, for example, that you’ve created two arrays, one called “data” and one called “ones”.

You can add the arrays together with the plus sign.

>>> data = np.array([1, 2])

>>> ones = np.ones(2, dtype=int)

>>> data + ones

array([2, 3])

You can, of course, do more than just addition!

>>> data - ones

array([0, 1])

>>> data * data

array([1, 4])

>>> data / data

array([1., 1.])
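To see why element-wise arithmetic is handy, consider this small sketch with made-up prices and quantities. It also shows how the + operator differs between Python lists and NumPy arrays.

```python
import numpy as np

# Illustrative data: the values are invented for this example.
prices = np.array([2.0, 4.0, 10.0])
quantities = np.array([3, 1, 2])

totals = prices * quantities   # multiplies matching positions
print(totals)                  # [ 6.  4. 20.]

# Python lists behave differently: + concatenates instead of adding.
print([1, 2] + [1, 1])                       # [1, 2, 1, 1]
print(np.array([1, 2]) + np.array([1, 1]))   # [2 3]
```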


Basic operations are simple with NumPy. If you want to find the sum of the elements in an array, you’d use sum(). This works for 1D arrays, 2D arrays, and arrays in higher dimensions.

>>> a = np.array([1, 2, 3, 4])

>>> a.sum()

10

To add the rows or the columns in a 2D array, you would specify the axis.

If you start with this array:

>>> b = np.array([[1, 1], [2, 2]])

You can sum down the rows (axis 0), which gives one total per column, with:

>>> b.sum(axis=0)

array([3, 3])

You can sum across the columns (axis 1), which gives one total per row, with:

>>> b.sum(axis=1)

array([2, 4])
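The axis argument can be confusing at first. One way to think about it, sketched below with a non-square array, is that the axis you name is the one that disappears from the result.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])   # shape (2, 3)

col_totals = table.sum(axis=0)  # collapses the 2 rows, leaving 3 values
row_totals = table.sum(axis=1)  # collapses the 3 columns, leaving 2 values

print(col_totals)   # [5 7 9]
print(row_totals)   # [ 6 15]
print(table.sum())  # 21
```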


Creating Matrices

You can pass a Python list of lists to np.array() to create a 2-D array (or “matrix”) in NumPy.

>>> data = np.array([[1, 2], [3, 4], [5, 6]])

>>> data

array([[1, 2],

       [3, 4],

       [5, 6]])

Indexing and slicing operations are useful when you’re manipulating matrices:

>>> data[0, 1]

2

>>> data[1:3]

array([[3, 4],

       [5, 6]])

>>> data[0:2, 0]

array([1, 3])

You can aggregate matrices the same way you aggregated vectors:

>>> data.max()

6

>>> data.min()

1

>>> data.sum()

21


 

You can aggregate all the values in a matrix and you can aggregate them across columns or rows using the axis parameter:

>>> data.max(axis=0)

array([5, 6])

>>> data.max(axis=1)

array([2, 4, 6])

Once you’ve created your matrices, you can add and multiply them using arithmetic operators if you have two matrices that are the same size.

>>> data = np.array([[1, 2], [3, 4]])

>>> ones = np.array([[1, 1], [1, 1]])

>>> data + ones

array([[2, 3],

       [4, 5]])

You can do these arithmetic operations on matrices of different sizes, but only if one matrix has only one column or one row. In this case, NumPy will use its broadcast rules for the operation.

>>> data = np.array([[1, 2], [3, 4], [5, 6]])

>>> ones_row = np.array([[1, 1]])

>>> data + ones_row

array([[2, 3],

       [4, 5],

       [6, 7]])
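Broadcasting also works with a single column. The sketch below adds a column of made-up values to each column of data, and then shows what happens when the shapes cannot be matched.

```python
import numpy as np

data = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
col = np.array([[10], [20], [30]])          # shape (3, 1)

print(data + col)
# [[11 12]
#  [23 24]
#  [35 36]]

# Shapes that cannot be stretched to match raise an error.
try:
    data + np.array([1, 2, 3])              # shape (3,) against (3, 2)
except ValueError as err:
    print("shapes do not match:", err)
```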


There are often instances where we want NumPy to initialize the values of an array. NumPy offers functions like ones() and zeros(), as well as the random module for generating random numbers.

>>> np.ones(3)

array([1., 1., 1.])

>>> np.zeros(3)

array([0., 0., 0.])

You can also use ones(), zeros(), and random() to create a 2D array if you give them a tuple describing the dimensions of the matrix:

>>> np.ones((3, 2))

array([[1., 1.],

       [1., 1.],

       [1., 1.]])

>>> np.zeros((3, 2))

array([[0., 0.],

       [0., 0.],

       [0., 0.]])
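The examples above cover ones() and zeros(). For random values, one option is NumPy's random number generator, sketched below. The seed is an arbitrary choice for repeatability, and the printed values are left out because they depend on the generator.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for repeatability
values = rng.random((3, 2))          # floats in the interval [0.0, 1.0)

print(values.shape)  # (3, 2)
print(values.min() >= 0.0 and values.max() < 1.0)  # True
```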

 

Matrix Multiplication

To perform matrix multiplication, you can use the np.dot() function or the @ operator:

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

result = np.dot(arr1, arr2)  # Equivalent to arr1 @ arr2

print(result) 

# Output: [[19 22]

#          [43 50]]
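To see where these numbers come from, and how matrix multiplication differs from the element-wise * operator, here is a short sketch using the same two arrays.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Top-left entry: first row of arr1 times first column of arr2.
print(1 * 5 + 2 * 7)   # 19

print(arr1 @ arr2)     # [[19 22]
                       #  [43 50]]

# The * operator is element-wise, not a matrix product.
print(arr1 * arr2)     # [[ 5 12]
                       #  [21 32]]
```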

 
