Numpy Cheat Sheet

NumPy (Numerical Python)

Setting data with assignment :
ndarray1[ndarray1 < 0] = 0 *

* If ndarray1 is two-dimensional, ndarray1 < 0 creates a two-dimensional boolean array.

5. Boolean arrays methods

Count # of ‘Trues’ in boolean array : (ndarray1 > 0).sum()
If at least one value is ‘True’ : ndarray1.any()
If all values are ‘True’ : ndarray1.all()
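
The boolean array methods and the masked assignment above can be tried on a small array. This is a minimal sketch: the sample values are made up for illustration, and `ndarray1` is just the placeholder name used throughout this sheet.

```python
import numpy as np

# Hypothetical sample data (not from the sheet).
ndarray1 = np.array([[1.5, -2.0, 0.0],
                     [-0.5, 3.0, 4.0]])

mask = ndarray1 < 0                  # 2-D boolean array, same shape as ndarray1
num_positive = (ndarray1 > 0).sum()  # True counts as 1, False as 0
any_negative = mask.any()            # True if at least one element is True
all_negative = mask.all()            # True only if every element is True

ndarray1[mask] = 0                   # in-place assignment through the boolean mask

print(mask.shape)                    # (2, 3)
print(num_positive)                  # 3
print(any_negative, all_negative)    # True False
print(ndarray1)
```

After the assignment, every formerly negative entry of `ndarray1` is 0 and the other entries are unchanged.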
Note: any() and all() also work with non-boolean arrays, where non-zero elements evaluate to True.

What is NumPy?
Foundational package for scientific computing in Python.

Why NumPy?
• NumPy ‘ndarray’ is a much more efficient way of storing and manipulating “numerical data” than the built-in Python data structures.
• Libraries written in lower-level languages, such as C, can operate on data stored in a NumPy ‘ndarray’ without copying any data.

N-DIMENSIONAL ARRAY (NDARRAY)

What is NdArray?
Fast and space-efficient multidimensional array (container for homogeneous data) providing vectorized arithmetic operations.

Create NdArray : np.array(seq1)
# seq1 is any sequence-like object, e.g. [1, 2, 3]

** Data type: the default is ‘np.float64’. This is equivalent to Python’s float type, which is 8 bytes (64 bits); thus the name ‘float64’.
*** Explicit casting: if casting fails for some reason, an exception is raised (typically ‘ValueError’ for values that cannot be converted, such as a non-numeric string; ‘TypeError’ for unsupported conversions).

SLICING (INDEXING/SUBSETTING)

• Slicing (e.g. ndarray1[2:6]) is a ‘view’ on the original array. Data is NOT copied. Any modifications (e.g. ndarray1[2:6] = 8) to the ‘view’ will be reflected in the original array.
• Instead of a ‘view’, make an explicit copy of a slice via : ndarray1[2:6].copy()
• Multidimensional array indexing notation : ndarray1[0][2] or ndarray1[0, 2]

COMMON OPERATIONS

1. Transposing
• A special form of reshaping which returns a ‘view’ on the underlying data without copying anything.
ndarray1.transpose() or ndarray1.T or ndarray1.swapaxes(0, 1)

2. Vectorized wrappers (for functions that take scalar values)
• math.sqrt() works on only a scalar
• np.sqrt(seq1) # accepts any sequence (list, ndarray, etc.) and returns an ndarray

3. Vectorized expressions
• np.where(cond, x, y) is a vectorized version of the expression ‘x if condition else y’
np.where([True, False], [1, 2], [2, 3]) => ndarray [1, 3]

6. Sorting
Inplace sorting : ndarray1.sort()
Return a sorted copy instead of inplace : sorted1 = np.sort(ndarray1)

7. Set methods
Return sorted unique values : np.unique(ndarray1)
Test membership of ndarray1 values in [2, 3, 6] : resultBooleanArray = np.in1d(ndarray1, [2, 3, 6])
(newer NumPy versions prefer np.isin over np.in1d)
• Other set methods : intersect1d(), union1d(), setdiff1d(), setxor1d()
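
The view-versus-copy behaviour of slices, the vectorized helpers, and the sorting and set methods listed above can be checked with a short script. This is a sketch under stated assumptions: the sample arrays and the names `view`, `detached`, `unsorted`, and `member` are made up for illustration, and `np.isin` is used in place of `np.in1d` because the latter is deprecated in recent NumPy releases.

```python
import math
import numpy as np

ndarray1 = np.arange(10)              # hypothetical data: 0..9

# A slice is a view: writing through it changes ndarray1.
view = ndarray1[2:6]
view[:] = 8

# .copy() detaches the slice, so ndarray1 is left alone.
detached = ndarray1[2:6].copy()
detached[:] = -1
print(ndarray1)                       # [0 1 8 8 8 8 6 7 8 9]

# math.sqrt takes one scalar; np.sqrt takes a whole sequence.
scalar_root = math.sqrt(9)
roots = np.sqrt([1, 4, 9])

# np.where as a vectorized "x if cond else y".
picked = np.where([True, False], [1, 2], [2, 3])
print(picked)                         # [1 3]

# np.sort returns a sorted copy; ndarray.sort() sorts in place.
unsorted = np.array([3, 1, 2])
sorted1 = np.sort(unsorted)           # unsorted is still [3, 1, 2] here
unsorted.sort()                       # now unsorted itself is [1, 2, 3]

# Set-style helpers.
uniques = np.unique(np.array([3, 1, 3, 2, 1]))
member = np.isin(np.array([1, 2, 5, 6]), [2, 3, 6])
print(uniques, member)
```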
• Boolean indexing :
ndarray1[(names == 'Bob') | (names == 'Will'), 2:]
# ‘2:’ means select from 3rd column on
* Selecting data by boolean indexing ALWAYS creates a copy of the data.
* The ‘and’ and ‘or’ keywords do NOT work with boolean arrays. Use & and |.

• Fancy indexing means indexing using integer arrays.

Create Special NdArray
1. np.zeros(10)
# one-dimensional ndarray with 10 elements of value 0
2. np.ones((2, 3))
# two-dimensional ndarray with 6 elements of value 1
3. np.empty((3, 4, 5)) *
# three-dimensional ndarray of uninitialized values
4. np.eye(N) or np.identity(N)
# creates N by N identity matrix

NdArray version of Python’s range : np.arange(1, 10)

• Common usages of np.where() :
np.where(matrixArray > 0, 1, -1)
=> a new array (same shape) of 1 or -1 values
np.where(cond, 1, 0).argmax() *
=> Find the index of the first True element
* argmax() can be used to find the index of the maximum element. Example usage: find the first element that satisfies “price > number” in an array of price data.

8. Random number generation (np.random)
• Supplements the built-in Python random * with functions for efficiently generating whole arrays of sample values from many kinds of probability distributions.
samples = np.random.normal(size=(3, 3))
* Python built-in random ONLY samples one value at a time.
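
The constructors, boolean indexing, and `np.where(...).argmax()` idiom above can be exercised together. This is a minimal sketch: `names`, `data`, `prices`, and `threshold` are hypothetical values chosen for illustration. Note that the shape is passed to `np.ones` and `np.empty` as a single tuple.

```python
import numpy as np

zeros = np.zeros(10)
ones = np.ones((2, 3))               # shape goes in as one tuple
empty = np.empty((3, 4, 5))          # contents are arbitrary until assigned
eye = np.eye(3)
counting = np.arange(1, 10)          # 1..9, like range(1, 10)

# Boolean indexing: combine masks with | (not the `or` keyword).
names = np.array(['Bob', 'Joe', 'Will', 'Bob'])   # hypothetical row labels
data = np.arange(16).reshape(4, 4)
selected = data[(names == 'Bob') | (names == 'Will'), 2:]
selected[:] = -1                     # selected is a copy, so data is untouched
print(data[0, 2])                    # 2

# First index where a condition holds (argmax returns 0 if nothing matches).
prices = np.array([9.5, 10.0, 12.5, 11.0, 13.0])  # hypothetical price data
threshold = 11.0
first_idx = np.where(prices > threshold, 1, 0).argmax()
print(first_idx)                     # 2

# Same-shape array of 1 / -1 values.
signs = np.where(np.array([[1, -2], [0, 3]]) > 0, 1, -1)

# A whole array of normally distributed samples in one call.
samples = np.random.normal(size=(3, 3))
```

Because `argmax()` also returns 0 when no element satisfies the condition, a caller that needs to tell the two cases apart may want to check `(prices > threshold).any()` first.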
• Fancy indexing, select a subset of rows in a particular order :
ndarray1[[3, 8, 4]]
ndarray1[[-1, 6]]
# negative indices select rows from the end
* Fancy indexing ALWAYS creates a copy of the data.

Get # of Dimension : ndarray1.ndim
Get Dimension Size : dim1size, dim2size, .. = ndarray1.shape
Get Data Type ** : ndarray1.dtype
Explicit Casting : ndarray2 = ndarray1.astype(np.int32) ***

* Cannot assume empty() will return all zeros. It could be garbage values.

4. Aggregations/Reductions Methods (i.e. mean, sum, std)

Compute mean : ndarray1.mean() or np.mean(ndarray1)
Compute statistics over axis * : ndarray1.mean(axis = 1), ndarray1.sum(axis = 0)
* axis = 0 aggregates down the rows (one result per column); axis = 1 aggregates across the columns (one result per row).

Created by Arianne Colton and Sean Chen
Based on content from ‘Python for Data Analysis’ by Wes McKinney
Updated: August 18, 2016
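
As a worked illustration of the attribute, casting, fancy indexing, and axis aggregation entries in this sheet, the sketch below uses a small 3 x 4 array. The data and the names `picked`, `col_sums`, and `row_means` are assumptions made for the example.

```python
import numpy as np

# Hypothetical 3 x 4 array: rows are [0..3], [4..7], [8..11].
ndarray1 = np.arange(12, dtype=np.float64).reshape(3, 4)

ndim = ndarray1.ndim                  # number of dimensions
rows, cols = ndarray1.shape           # size of each dimension
ndarray2 = ndarray1.astype(np.int32)  # explicit cast, returns a new array

# Fancy indexing: rows 2 and 0, in that order; the result is a copy.
picked = ndarray1[[2, 0]]
picked[0, 0] = 99.0                   # does not touch ndarray1
last_row = ndarray1[[-1]]             # negative index counts from the end

col_sums = ndarray1.sum(axis=0)       # one value per column
row_means = ndarray1.mean(axis=1)     # one value per row
overall_mean = ndarray1.mean()        # over all elements

print(ndim, rows, cols)               # 2 3 4
print(col_sums)                       # [12. 15. 18. 21.]
print(row_means)                      # [1.5 5.5 9.5]
print(overall_mean)                   # 5.5
```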
