NumPy stands for ‘Numerical Python’. It is a package for data analysis and scientific computing with Python. NumPy provides a multidimensional array object, along with functions and tools for working with these arrays. The powerful n-dimensional array in NumPy speeds up data processing.
NumPy is often used along with packages like SciPy (Scientific Python) and Matplotlib (plotting library). This combination is widely used as a replacement for MATLAB, a popular platform for technical computing.
It contains various features including these important ones:
A lightweight alternative is to install NumPy using the popular Python package installer: pip install numpy
The best way to enable NumPy is to use an installable binary package specific to your operating system. These binaries contain the full SciPy stack (including NumPy, SciPy, matplotlib, IPython, SymPy, and nose, along with core Python).
An array is a data type used to store multiple values using a single identifier (variable name). An array contains an ordered collection of data elements where each element is of the same type and can be referenced by its index (position).
The important characteristics of an array are:
• Each element of the array is of the same data type, though the values stored in them may be different.
• The entire array is stored contiguously in memory. This makes operations on the array fast.
• Each element of the array is identified or referred to using the name of the array along with the index of that element, which is unique for each element.
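The characteristics listed above can be checked on a small NumPy array. This is only a minimal sketch; the variable names (`marks`, `first`, `last`) are illustrative and do not come from the original text.

```python
import numpy as np

# Every element shares one data type, even though the values differ.
marks = np.array([72, 85, 91, 64])
print(marks.dtype)                    # one integer dtype for the whole array

# A freshly created array occupies one contiguous block of memory.
print(marks.flags['C_CONTIGUOUS'])    # True

# Each element is referred to by the array name plus its zero-based index.
first = marks[0]
last = marks[3]
print(first, last)                    # 72 64
```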
NumPy arrays are used to store lists of numerical data, vectors, and matrices. The NumPy library has a large set of routines (built-in functions) for creating, manipulating, and transforming NumPy arrays. Python itself also has an array data structure in its standard library (import array).
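To make the distinction between the standard-library array and a NumPy array concrete, the sketch below creates one of each and multiplies both by 2. The names `py_arr` and `np_arr` are illustrative assumptions.

```python
from array import array   # standard-library typed array
import numpy as np

py_arr = array('i', [1, 2, 3])   # 'i' is the type code for a signed int
np_arr = np.array([1, 2, 3])

# The standard-library array repeats its contents when multiplied ...
repeated = py_arr * 2
print(list(repeated))            # [1, 2, 3, 1, 2, 3]

# ... while a NumPy array multiplies element by element.
doubled = np_arr * 2
print(doubled)                   # [2 4 6]
```

Both types store elements of a single type, but only the NumPy array treats arithmetic operators as element-wise mathematics.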
Difference Between List and Array
NUMPY − NDARRAY OBJECT
The most important object defined in NumPy is an N-dimensional array type called ndarray. It describes a collection of items of the same type. Items in the collection can be accessed using a zero-based index. Every item in an ndarray takes the same size of block in memory. Each element in an ndarray is described by a data-type object (called dtype).
The numpy.array function creates an ndarray from any object exposing the array interface, or from any method that returns an array.
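As a rough illustration of these properties, the following sketch builds a small two-dimensional ndarray and inspects it. The values and the choice of int32 are assumptions made for the example.

```python
import numpy as np

a = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.int32)

print(type(a).__name__)   # ndarray
print(a.ndim, a.shape)    # 2 (2, 3)
print(a[0, 0], a[1, 2])   # 10 60 (indices start at zero)
print(a.dtype)            # int32
print(a.itemsize)         # 4: every element occupies 4 bytes
print(a.nbytes)           # 24: 6 elements * 4 bytes each
```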
numpy.array(object, dtype=None, copy=True, order=None, subok=False, ndmin=0)
The above constructor takes the following parameters:
object | Any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence |
dtype | Desired data type of the array, optional |
copy | Optional. By default (True), the object is copied |
order | C (row major) or F (column major) or A (any) (default) |
subok | By default, the returned array is forced to be a base class array. If True, sub-classes are passed through |
ndmin | Specifies the minimum number of dimensions of the resultant array |
# minimum dimensions
import numpy as np
a = np.array([1, 2, 3, 4, 5], ndmin=2)
print(a) # Output: [[1 2 3 4 5]]
# dtype parameter
import numpy as np
a = np.array([1, 2, 3], dtype=complex)
print(a) # Output: [1.+0.j 2.+0.j 3.+0.j]
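The object and copy parameters can be illustrated in the same way. This is a small sketch under the assumption that the defaults are used; the names `nested`, `grid`, `dup`, and `cube` are hypothetical.

```python
import numpy as np

# object: a nested sequence becomes a two-dimensional array.
nested = [[1, 2, 3], [4, 5, 6]]
grid = np.array(nested)
print(grid.shape)        # (2, 3)

# copy: with the default copy=True the result owns its own data.
dup = np.array(grid)
dup[0, 0] = 99
print(grid[0, 0])        # 1, the source array is unchanged

# ndmin: leading axes of length 1 are added until the minimum is reached.
cube = np.array(nested, ndmin=3)
print(cube.shape)        # (1, 2, 3)
```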
The ndarray object consists of a contiguous one-dimensional segment of computer memory, combined with an indexing scheme that maps each item to a location in the memory block. The memory block holds the elements in row-major order (C style) or column-major order (FORTRAN or MATLAB style).
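The two memory orders can be made visible with a short sketch. It assumes 8-byte integers (int64) so that the stride values are predictable; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# Reading the elements out in each order shows how they would sit in memory.
row_major = a.ravel(order='C')   # rows one after another
col_major = a.ravel(order='F')   # columns one after another
print(row_major)                 # [1 2 3 4 5 6]
print(col_major)                 # [1 4 2 5 3 6]

# strides hold the byte step per axis, which is the indexing scheme at work.
c_arr = np.array(a, dtype=np.int64, order='C')
f_arr = np.array(a, dtype=np.int64, order='F')
print(c_arr.strides)             # (24, 8)
print(f_arr.strides)             # (8, 16)
```

In the C-ordered array, moving one row down skips 24 bytes (three 8-byte items), while in the Fortran-ordered array moving one row down skips only 8 bytes because column neighbours are adjacent.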
NUMPY − DATA TYPES
NumPy supports a much greater variety of numerical types than Python does. NumPy numerical types are instances of dtype (data-type) objects, each having unique characteristics. The dtypes are available as np.bool_, np.float32, etc.
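A few of these types can be tried directly. The sketch below is illustrative only; it picks int8, float32, and bool_ as examples and uses np.iinfo to show the range of a fixed-size integer.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
single = np.array([1, 2, 3], dtype=np.float32)
flags = np.array([0, 1, 2], dtype=np.bool_)

print(small.dtype, single.dtype, flags.dtype)   # int8 float32 bool
print(flags)                                    # [False  True  True]

# Fixed-size integer types have a limited range.
limits = np.iinfo(np.int8)
print(limits.min, limits.max)                   # -128 127
```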
Data Type Objects (dtype)
A data type object describes how the fixed block of memory corresponding to an array is interpreted, depending on the following aspects:
- Type of data (integer, float or Python object)
- Size of data
- Byte order (little-endian or big-endian)
- In case of structured type, the names of fields, data type of each field, and part of the memory block taken by each field
- If data type is a subarray, its shape and data type
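The aspects above can be read back from dtype objects. The following is a hedged sketch: the field names `name` and `age` and the chosen sizes are assumptions made purely for illustration.

```python
import numpy as np

# Type, size, and byte order: '>' = big-endian, 'i4' = 4-byte integer.
big = np.dtype('>i4')
print(big.kind, big.itemsize)      # i 4

# Structured type: field names, per-field types, and byte offsets.
person = np.dtype([('name', 'S10'), ('age', 'i1')])
print(person.names)                # ('name', 'age')
print(person.itemsize)             # 11: 10 bytes + 1 byte
print(person.fields['age'][1])     # 10: 'age' starts at byte offset 10

# Subarray type: a shape plus a base data type.
sub = np.dtype((np.int32, (2, 3)))
print(sub.shape, sub.base)         # (2, 3) int32
```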
numpy.dtype(object, align, copy)
The parameters are:
- object: The object to be converted to a data type object
- align: If True, adds padding to the fields to make the layout similar to a C struct
- copy: Makes a new copy of the dtype object. If False, the result is a reference to a built-in data type object
import numpy as np
dt = np.dtype([('age',np.int8)])
a= np.array([(10,),(20,),(30,)], dtype=dt)
print(a['age']) # Output: [10 20 30]
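The effect of the align parameter described above can be seen by comparing item sizes. This sketch assumes a typical platform where a 4-byte integer is aligned to a 4-byte boundary; the field names `flag` and `value` are hypothetical.

```python
import numpy as np

fields = [('flag', 'i1'), ('value', 'i4')]

packed = np.dtype(fields)               # align defaults to False
padded = np.dtype(fields, align=True)   # C-struct style padding

print(packed.itemsize)                  # 5: 1 byte + 4 bytes, no padding
print(padded.itemsize)                  # 8: 3 padding bytes follow 'flag'
print(padded.fields['value'][1])        # 4: 'value' starts on a 4-byte boundary
```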
NUMPY − ARRAY ATTRIBUTES
# assigning a tuple to shape resizes the array in place
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
a.shape = (3, 2)
print(a)
# Output:
# [[1 2]
#  [3 4]
#  [5 6]]

# reshape returns an array with the new shape and leaves a unchanged
import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6]])
b = a.reshape(3, 2)
print(b)
# Output:
# [[1 2]
#  [3 4]
#  [5 6]]

# an array of evenly spaced numbers
import numpy as np
a = np.arange(24)
print(a)

# dtype of array is int8 (1 byte)
# itemsize returns the length of each
# element of array in bytes.
import numpy as np
x = np.array([1, 2, 3, 4, 5], dtype=np.int8)
print(x.itemsize)

# flags reports the memory layout information of the array
import numpy as np
x = np.array([1, 2, 3, 4, 5])
print(x.flags)

# dtype of array is now float32 (4 bytes)
import numpy as np
x = np.array([1, 2, 3, 4, 5], dtype=np.float32)
print(x.itemsize)
A new ndarray object can be constructed by any of the following array creation routines or using a low-level ndarray constructor.
numpy.empty(shape, dtype=float, order='C')
shape: Shape of the empty array, as an int or a tuple of ints
dtype: Desired output data type. Optional
order: 'C' for a C-style row-major array, 'F' for a FORTRAN-style column-major array
import numpy as np
x = np.empty([3, 2], dtype=int)
print(x)
# Output (the values are arbitrary because the array is not initialized), for example:
# [[22649312 1701344351]
#  [1818321759 1885959276]
#  [16779776 156368896]]
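Because numpy.empty leaves its contents uninitialized, a common pattern is to allocate first and assign afterwards. The following is a minimal sketch of that pattern; the name `buf` and the fill value 0.5 are assumptions for illustration.

```python
import numpy as np

# Allocate without initializing; the contents are arbitrary at this point.
buf = np.empty((2, 3), dtype=float)
print(buf.shape, buf.dtype)        # (2, 3) float64

# Assign every element before reading any of them.
buf[:] = 0.5
total = buf.sum()
print(total)                       # 3.0

# order='F' requests a column-major layout for the new array.
col = np.empty((2, 3), order='F')
print(col.flags['F_CONTIGUOUS'])   # True
```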
# array of five zeros. Default dtype is float
import numpy as np
x = np.zeros(5)
print(x)
# Output: [0. 0. 0. 0. 0.]

# integer zeros (the np.int alias was removed from NumPy; use the built-in int)
import numpy as np
x = np.zeros((5,), dtype=int)
print(x)
# Output: [0 0 0 0 0]

# zeros with a structured (custom) data type
import numpy as np
x = np.zeros((2, 2), dtype=[('x', 'i4'), ('y', 'i4')])
print(x)
# Output:
# [[(0, 0) (0, 0)]
#  [(0, 0) (0, 0)]]
# array of five ones. Default dtype is float
import numpy as np
x = np.ones(5)
print(x)
# Output: [1. 1. 1. 1. 1.]

import numpy as np
x = np.ones([2, 2], dtype=int)
print(x)
# Output:
# [[1 1]
#  [1 1]]
# convert list to ndarray
import numpy as np
x = [1, 2, 3]
a = np.asarray(x)
print(a)
# Output: [1 2 3]

# dtype is set
import numpy as np
x = [1, 2, 3]
a = np.asarray(x, dtype=float)
print(a)
# Output: [1. 2. 3.]
import numpy as np
# frombuffer needs a bytes-like object, so a bytes literal is used instead of a str
s = b'Hello World'
a = np.frombuffer(s, dtype='S1')
print(a)
# Output: [b'H' b'e' b'l' b'l' b'o' b' ' b'W' b'o' b'r' b'l' b'd']
import numpy as np
values = range(7)   # renamed from "list" to avoid shadowing the built-in
it = iter(values)
# use iterator to create ndarray
x = np.fromiter(it, dtype=float)
print(x)
# Output: [0. 1. 2. 3. 4. 5. 6.]
# start, stop and step parameters set
import numpy as np
x = np.arange(10, 20, 2)
print(x)
# Output: [10 12 14 16 18]
import numpy as np
a = np.array(range(6), float).reshape((2, 3))
# transpose() returns the transposed array; it does not modify a in place
print(a.transpose())
# Output:
# [[0. 3.]
#  [1. 4.]
#  [2. 5.]]
import numpy as np
arr = np.array([[11, 17, 16],
                [24, 27, 12],
                [13, 17, 19]])
print("Largest element is:", arr.max())
print("Row-wise maximum elements:", arr.max(axis=1))
print("Column-wise minimum elements:", arr.min(axis=0))
# cumulative sum along each row
print("Cumulative sum along each row:\n", arr.cumsum(axis=1))
import numpy as np

a = np.array([2, 4, 3], float)
a.sum()     # Output: 9.0
a.prod()    # Output: 24.0

a = np.array([2, 1, 9], float)
a.mean()    # Output: 4.0
a.var()     # Output: 12.666666666666666
a.std()     # Output: 3.5590260840104371

a = np.array([2, 1, 9], float)
a.min()     # Output: 1.0
a.max()     # Output: 9.0

a = np.array([2, 1, 9], float)
a.argmin()  # Output: 1
a.argmax()  # Output: 2

# Polynomial mathematics
np.poly([-1, 1, 1, 10])   # Coefficients: 1, -11, 9, 11, -10

a = np.array([1, 4, 3, 8, 9, 2, 3], float)
np.median(a)   # Output: 3.0

# Random numbers
np.random.seed(293423)

np.random.rand(5)        # five samples from [0, 1)
np.random.rand(2, 3)     # a 2 x 3 array of samples from [0, 1)

np.random.random()       # a single sample from [0, 1)
np.random.randint(5, 10) # a random integer from 5 up to, but not including, 10

# the discrete Poisson distribution with λ = 6.0
np.random.poisson(6.0)

# (Gaussian) distribution with mean μ = 1.5 and
# standard deviation σ = 4.0
np.random.normal(1.5, 4.0)

np.random.normal()       # μ = 0, σ = 1
np.random.normal(size=5)
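The purpose of seeding can be shown with a brief sketch: resetting the same seed reproduces the same sequence. The seed value is reused from the example above; the sample sizes are arbitrary choices for illustration.

```python
import numpy as np

# Seeding twice with the same value yields the same random numbers.
np.random.seed(293423)
first = np.random.rand(3)
np.random.seed(293423)
second = np.random.rand(3)
print(np.array_equal(first, second))   # True

# randint(5, 10) draws integers from 5 up to, but not including, 10.
draws = np.random.randint(5, 10, size=1000)
print(draws.min() >= 5, draws.max() <= 9)   # True True
```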
NumPy arrays are an alternative to Python lists: they use less memory, are faster to process, and are more convenient for numerical work. The key difference is that all elements of a NumPy array must be of the same type (homogeneous).
This homogeneity is what allows mathematical functions to be applied efficiently. NumPy arrays are also more compact than lists, and specifying the data type allows further optimization of the code.
NumPy can be combined with other packages such as SciPy and Matplotlib, which provide scientific computation and plotting respectively.
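The contrast with lists can be sketched in a few lines. The names `prices`, `list_result`, and `arr_result` are illustrative, and the multiplier 1.5 is an arbitrary choice.

```python
import numpy as np

prices = [10.0, 20.0, 30.0]

# With a list, arithmetic needs an explicit loop or comprehension.
list_result = [p * 1.5 for p in prices]
print(list_result)        # [15.0, 30.0, 45.0]

# With an ndarray, the same operation applies to every element at once.
arr = np.array(prices)
arr_result = arr * 1.5
print(arr_result)         # [15. 30. 45.]

# A list may mix types; an ndarray converts everything to one common dtype.
mixed_arr = np.array([1, 2.5, 3])
print(mixed_arr.dtype)    # float64

# Three float64 values occupy exactly 3 * 8 bytes of array data.
print(arr.nbytes)         # 24
```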