LabExercise Numpy Exercises
===========================

In machine learning we are dealing with massive amounts of data. Data
is most often organised in tables. When all data elements in a table
are of the same datatype (like an integer or a floating point number)
the table can be represented with a homogeneous array.

Languages that are well suited for programming with data are therefore
equipped with array data types that are an integral part of the
language. In Python this role is played by the NumPy library. Although
NumPy arrays look a lot like Python lists, they behave differently, as
shown in the following code.


.. ipython:: python

   import numpy as np
   a = np.array([1,2,3])
   print(type(a))
   print(a)
   b = [1,2,3]
   print(b)

   print(a+a)
   print(b+b)
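
As a small illustration of the "same datatype" remark above (a sketch
with made-up values; the names ``c`` and ``d`` are introduced here
only for this example): an array stores one datatype for all of its
elements, whereas a list can mix types freely.

.. ipython:: python

   import numpy as np
   # A list keeps each element's own type.
   d = [1, 2.5, 3]
   print([type(x).__name__ for x in d])
   # An array converts all elements to one common dtype (here a float type).
   c = np.array([1, 2.5, 3])
   print(c.dtype)
   # Multiplying the array is elementwise; multiplying the list repeats it.
   print(c * 2)
   print(d * 2)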


The nice thing about NumPy arrays is that they allow you to manipulate
the data in arrays without writing explicit loops. For instance, look
at the addition of all elements in an array:

.. ipython:: python

   import numpy as np
   a = np.random.rand(65536)

   def loopsum(a):
       # 'total' is used as the name so the built-in sum is not shadowed
       total = 0
       for i in range(len(a)):
           total += a[i]
       return total

   %timeit loopsum(a)
   %timeit np.sum(a)


So the explicit loop in Python takes roughly 10 ms versus roughly
30 µs for the NumPy version. The explicit loop version is thus about
350 times slower; the exact figures depend on the machine.
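
The ``%timeit`` magic only exists inside IPython. In a plain Python
script a comparable measurement can be sketched with the
standard-library ``timeit`` module. The repeat count of 10 and the
names ``t_loop`` and ``t_numpy`` are arbitrary choices made for this
illustration, and the resulting ratio will differ from machine to
machine.

.. ipython:: python

   import timeit
   import numpy as np
   a = np.random.rand(65536)

   def loopsum(a):
       total = 0.0
       for i in range(len(a)):
           total += a[i]
       return total

   # Total time in seconds for 10 calls of each version.
   t_loop = timeit.timeit(lambda: loopsum(a), number=10)
   t_numpy = timeit.timeit(lambda: np.sum(a), number=10)
   # Ratio of the two timings: how much slower the loop is.
   print(t_loop / t_numpy)
   # Both versions compute the same value up to rounding.
   print(np.isclose(loopsum(a), np.sum(a)))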

So in this course, be sure to use the built-in NumPy tools to
manipulate and calculate with arrays.

There are many Python/NumPy tutorials available, like this one:
http://cs231n.github.io/python-numpy-tutorial/.

Array Calculations and Array Indexing
-------------------------------------

**In all exercises below you are not allowed to use a loop in python.**

#. Given two arrays A and B each of the same size calculate their sum
   (elementwise) and their product (elementwise)

#. Calculate the mean of all elements in an array A without using the
   np.mean or np.average functions.

#. Calculate the standard deviation of all elements in an array A
   without using np.var or np.std.
   
#. Given an array A with shape (128,) calculate the sum of the first,
   third, fifth, etc elements (A[0]+A[2]+...).

#. Given an array A with shape (1024,) make an array B containing only
   the first 512 elements of A. I.e. the shape of B should be (512,).

#. Given an array A with shape (1024,) make an array B containing only
   the elements A[22],A[23],...,A[42].

#. Given an array A with shape (1024,) and an integer array I make an
   array B whose elements are A[I[0]], A[I[1]], ..., A[I[-1]]

#. Given an array A with shape (N,) make an array with all elements of
   A in reverse order.
   
#. Given an array A = np.random.rand(128) make an array B containing
   all elements in A that are less than or equal to 0.5.


Array indexing is probably one of the most difficult subjects when
programming efficiently with NumPy. Let A be a NumPy ndarray
(n-dimensional array); then A[obj] is an indexing operation on
array A. Which type of indexing is used depends on the value and type
of obj. There are really three types of indexing...
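
Without attempting to list those types, the sketch below shows how
the result of A[obj] changes with the kind of object that is used as
index. The array and the index objects are made-up values chosen only
for illustration.

.. ipython:: python

   import numpy as np
   A = np.arange(10, 20)     # the values 10, 11, ..., 19
   print(A[3])               # obj is an integer: a single element
   print(A[2:8:3])           # obj is a slice start:stop:step
   print(A[[0, 0, 9]])       # obj is a list of integer positions
   print(A[A % 2 == 0])      # obj is a boolean array with the shape of A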


Views on Arrays
---------------

Consider the following code snippet:

.. ipython:: python

   import numpy as np

   A = np.random.randint(0, 5, size=(3,5))
   print(A)
   B = A[::2,::2]
   print(B)
   B[:,:] = 99
   print(B)
   print(A)


Yes indeed, B shares the same data (the element values) as A. B is not
an entirely new data item; it is a *view* on (a subset of) the
array A. This is something that a programmer should be aware of, else
it will lead to errors that are hard to spot, since the sharing is
hidden in the semantics of the array indexing operators.
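
One way to check whether two arrays share data, and to avoid the
sharing when it is not wanted, is sketched below. It uses
``np.shares_memory`` and the ``copy`` method; the names ``C`` and
``D`` are introduced here only for illustration.

.. ipython:: python

   import numpy as np
   A = np.zeros((3, 5), dtype=int)
   C = A[::2, ::2]           # slicing gives a view on A
   D = A[::2, ::2].copy()    # an explicit copy owns its own data
   print(np.shares_memory(A, C))
   print(np.shares_memory(A, D))
   D[:, :] = 99              # changing the copy leaves A untouched
   print(A)
   C[:, :] = 7               # changing the view changes A as well
   print(A)

When in doubt whether an indexing expression returns a view, taking an
explicit copy is the safe (but more memory-hungry) choice.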


Two dimensional data arrays
---------------------------

In machine learning a classical task is classification. Given $n$
features measured for an object, say the mass, the circumference and
the surface area of either bananas or apples (so $n=3$), we would like
to classify a piece of fruit as either banana or apple based only on
these numerical values.

In machine learning such a problem is tackled by collecting a lot of
examples of these types of fruit. Say we have $m$ examples. For each
example with index $i$ we know whether it is an apple or a banana;
this is encoded in the *target vector* $Y$ such that $Y[i]=0$ for
bananas and $Y[i]=1$ for apples. The mass, circumference and area of
example $i$ form the $i$ -th row in the *data array* $X$. So $X[i,0]$
is the mass, $X[i,1]$ is the circumference and $X[i,2]$ is the
area. The $i$ -th row is called the *feature vector* for the $i$ -th
example.

The task then for a machine learning algorithm is to come up with a
rule that takes a feature vector as input and returns the
corresponding class. This rule should be learned from the data matrix
$X$ and the target vector $Y$.
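
To have something to experiment with, a data matrix and target vector
of this form can be generated synthetically. This is only a sketch:
the number of examples, the random seed and the value ranges are
invented for illustration and say nothing about real fruit.

.. ipython:: python

   import numpy as np
   rng = np.random.default_rng(0)
   m = 8      # number of examples (made up)
   # One row per example; columns: mass, circumference, area (made-up ranges).
   X = rng.uniform(low=[100, 15, 150], high=[250, 30, 400], size=(m, 3))
   # Target vector: 0 for a banana, 1 for an apple.
   Y = rng.integers(0, 2, size=m)
   print(X.shape, Y.shape)
   print(X[:3])
   print(Y)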


#. Select the $i$ -th feature vector from the data matrix
   $X$. I.e. select the $i$ -th row from $X$.

#. Select the $j$ -th column from the data matrix $X$.

#. Given a data matrix $X$ with shape $(m,n)$ calculate the vector $M$
   of shape $(n,)$ where $M[i]$ is the mean of the $i$ -th column of
   $X$, i.e. the mean of the $i$ -th feature. For instance in our
   example of the apples and bananas $M[1]$ is the mean value of the
   circumferences of all pieces of fruit in the data matrix.

#. Now subtract the mean vector you just calculated from all feature
   vectors (the row vectors) in the data set leading to the data
   matrix $X_0$. Yes this can be done without a loop! Hint: look at
   array broadcasting.
   
#. Before calculating the mean of the features we would like to select
   the apples from the data set. Note that apples and bananas are
   randomly distributed over the rows of $X$. So calculate $M_a$ as
   the vector such that $M_a[i]$ is the mean of the $i$ -th feature of
   only the apples in the data set.

#. Select the feature vector $F$ for the piece of fruit with the
   largest mass of all in the data set. Hint: look at the function
   np.argmax for this.

#. In a lot of algorithms we start with a data matrix $X$ and then we
   would like to make a matrix $X'$ that is matrix $X$ but with a
   column prepended containing only the values 1. You can do that in a
   one-liner!

   
Tricks with Arrays
------------------

#. Given an array (vector) A of shape (N,) make it into an array B of
   shape (N,1).

#. Given an array (vector) A of shape (N,) make it into an array B of
   shape (1,N).

#. Given an array A of shape (M,N) what is A[i]? What are the valid
   values for i?

#. Let A35 be an array of shape (3,5) and let v5 be an array of shape
   (5,). Subtract v5 from each row in A35.

#. Let A35 be an array of shape (3,5) and let v3 be an array of shape
   (3,). Subtract v3 from each column in A35.


Linear Algebra
--------------

This is easier in Python 3 than in Python 2. Python 3 (from version
3.5 on) has the @ operator for matrix multiplication. Let A be an
array of shape (m,n) and let B be an array of shape (n,k); then in
Python 3 we can write A @ B for the matrix multiplication of A and B,
which gives an array of shape (m,k). In Python 2 we have to write
A.dot(B) for matrix multiplication.
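
A minimal sketch of matrix multiplication with small made-up arrays.
Note that the number of columns of the left operand must equal the
number of rows of the right operand. The name ``C`` is introduced
here only for illustration.

.. ipython:: python

   import numpy as np
   A = np.arange(6).reshape(2, 3)        # shape (2, 3)
   B = np.arange(12).reshape(3, 4)       # shape (3, 4)
   C = A @ B                             # matrix product, shape (2, 4)
   print(C.shape)
   print(np.array_equal(C, A.dot(B)))    # same result as the method call
   print((A * A).shape)                  # * is elementwise, not a matrix product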


Note that there is a conceptual difference between a 1-dimensional
array V of shape (N,) and a vector V as we know it from linear
algebra. In linear algebra a (column) vector with N elements has
dimensions $N\times 1$. A 'vector' V as a NumPy array has shape (N,).
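
The difference shows up in the shape and the number of dimensions, as
in the following sketch. The size 4 and the name ``W`` are arbitrary
choices for this illustration.

.. ipython:: python

   import numpy as np
   V = np.ones(4)          # 1-dimensional array
   W = np.ones((4, 1))     # 2-dimensional array with a single column
   print(V.shape, V.ndim)
   print(W.shape, W.ndim)
   # The two are not interchangeable: adding them broadcasts to a 4 x 4 array.
   print((V + W).shape)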

#. Calculate the inner product of two vectors v and w, both of shape
   (N,).

#. Calculate the product of a matrix A of shape (M,N) with a vector v
   of shape (N,).

#. Let v be an array of shape (N,). What is the shape of v.T (or
   np.transpose(v))?

#. Let A be an array of shape (3,5) and let v3 be an array of shape
   (3,). We define v31 = v3.reshape(3,1). What is v3 @ A, v31 @ A and
   v31.T @ A?


If you really want to eat your heart out with Numpy and matrix
(tensor) multiplications look at the documentation of numpy.tensordot.


Concluding Remarks
------------------

If you want to program a loop over the elements in an array and you
think that is quite a logical thing to do, then chances are high that
it is indeed so logical that someone else has done it before and that
it is part of the NumPy library.
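
As an illustration with a made-up task (the running total of the
elements of an array), the loop below has a direct counterpart in the
library, ``np.cumsum``. The function name ``running_total_loop`` is
hypothetical and only serves this example.

.. ipython:: python

   import numpy as np
   a = np.array([3, 1, 4, 1, 5])

   def running_total_loop(a):
       out = np.empty_like(a)
       total = 0
       for i in range(len(a)):
           total += a[i]
           out[i] = total
       return out

   print(running_total_loop(a))
   print(np.cumsum(a))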

In general you should really read the manuals and tutorials on the web
to make sure you learn how to use modern array manipulation languages
(libraries) to unleash their power.

Indeed, in some cases ease of programming and speed come at the cost
of (a lot) more memory. But for proof-of-concept implementations
nothing beats these languages/libraries.

General principle: **if you can write it down in a few lines of math,
you should be able to program it in a few lines of Python/Numpy
code...**
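
As an example chosen here for illustration (it is not one of the
exercises), the Euclidean distance
$d(v,w) = \sqrt{\sum_i (v_i - w_i)^2}$ between two vectors translates
into a single line of code. The vectors are made-up values.

.. ipython:: python

   import numpy as np
   v = np.array([1.0, 2.0, 3.0])
   w = np.array([4.0, 6.0, 3.0])
   # Square root of the sum of squared differences, as in the formula.
   d = np.sqrt(np.sum((v - w)**2))
   print(d)
   # The library routine for the vector norm gives the same value.
   print(np.isclose(d, np.linalg.norm(v - w)))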