k-Nearest Neighbor Classifier
=============================

For the k-nearest neighbor classifier we don't consider only the closest element in the learning set. Instead we consider the $k$ closest elements in the learning set. For a two class problem we then assign an unknown feature vector to the class that has the majority among the $k$ closest elements from the learning set.

The algorithm is simple (a NumPy sketch of all three steps is given at the end of this section):

1. First calculate the distances from the feature vector $\v x$ to all vectors $\v x\ls i$ in the learning set:

   .. math:: d_i = \|\v x - \v x\ls i\|

2. Sort all distances and set the *sorted index map* $I_s$ such that

   .. math:: \forall i\leq j: d_{I_s(i)} \leq d_{I_s(j)}

   Note: $I_s$ is what the ``argsort`` function in NumPy returns.

3. For $i=1,\ldots,k$ count which class $y\ls{I_s(i)}$ is in the majority. That class is the result of the classifier.

The k-NN classifier can be interpreted as a voting algorithm: the closest $k$ vectors $\v x\ls i$ in the learning set vote (with equal voting strength) for their class $y\ls i$. In the next section we consider a generic voting scheme where the voting strength is a function of the distance from the vector $\v x$ to each of the vectors in the learning set.
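The three steps above map directly onto NumPy. Below is a minimal sketch, assuming the learning set is stored as an $(n, d)$ array (one feature vector per row) with the class labels in a separate array; the function name ``knn_classify`` and the variable names ``X_learn`` and ``y_learn`` are illustrative choices, not part of these notes.

.. code-block:: python

   import numpy as np

   def knn_classify(x, X_learn, y_learn, k):
       """Classify feature vector x with the k-NN rule (illustrative sketch).

       x       : (d,) feature vector to classify.
       X_learn : (n, d) array, one learning set vector per row.
       y_learn : (n,) array of class labels.
       """
       # Step 1: distances d_i = ||x - x^(i)|| to all learning set vectors.
       d = np.linalg.norm(X_learn - x, axis=1)
       # Step 2: the sorted index map I_s is exactly what argsort returns.
       I_s = np.argsort(d)
       # Step 3: majority vote among the labels of the k nearest vectors.
       classes, votes = np.unique(y_learn[I_s[:k]], return_counts=True)
       return classes[np.argmax(votes)]

   # Example: two classes in the plane, classify a point near class 0.
   X_learn = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
   y_learn = np.array([0, 0, 1, 1])
   print(knn_classify(np.array([0.2, 0.1]), X_learn, y_learn, k=3))  # -> 0

For a two class problem, choosing an odd $k$ avoids ties in the vote; ``np.argmax`` in the sketch breaks any remaining tie by picking the first class encountered.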