6.2.3. Weighted Nearest Neighbor Classification

In the k nearest neighbor classifier, each of the k nearest feature vectors in the learning set casts a vote for its class, and all votes have the same strength.

We may soften (regularize) the hard voting by assigning soft votes. Most often the vote strength is taken to depend on the distance \(d_i\) of the feature vector \(\v x\) to the i-th learning example \(\v x\ls i\). Let the function \(w\) be the voting strength function; then weighted nearest neighbor classification works as follows.

  1. First calculate the distance from the feature vector \(\v x\) to all vectors in the learning set.

    \[d_i = \|\v x - \v x\ls i\|\]
  2. Then perform the voting procedure, resulting in a function \(V\) that maps a class label to the total voting strength for that class.

    \[V(y) = \sum_{i=1}^{m} w(d_i) \left[ y\ls i = y \right]\]

    where \(\left[ \text{cond} \right]\) is the Iverson bracket, which equals one if \(\text{cond}\) is true and zero otherwise.

    Note that the expression for \(V\) is not the right way to implement the voting procedure: it suggests looping over all classes and, for each class, over all learning examples. Instead, accumulate the votes in a single pass over the \(m\) learning examples, adding each example's vote to its own class; that algorithm is of order \(m\) and does not depend on the number of classes (see the sketch after this list).

  3. The unknown \(\v x\) is then classified as:

    \[\text{classify}(\v x) = \arg\max_y V(y)\]
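
The following is a minimal NumPy sketch of this three-step procedure, assuming Euclidean distances and a user-supplied weight function; the names `weighted_nn_classify` and `weight` are illustrative and not part of the text above.

```python
import numpy as np

def weighted_nn_classify(x, X_train, y_train, weight):
    """Classify x by weighted voting over the learning set.

    x        : feature vector, shape (n_features,)
    X_train  : learning examples, shape (m, n_features)
    y_train  : class labels, shape (m,)
    weight   : voting strength function w(d)
    """
    # Step 1: distances from x to all m learning examples.
    d = np.linalg.norm(X_train - x, axis=1)

    # Step 2: accumulate the votes in a single pass over the m examples;
    # the cost is of order m and does not depend on the number of classes.
    votes = {}
    for d_i, y_i in zip(d, y_train):
        votes[y_i] = votes.get(y_i, 0.0) + weight(d_i)

    # Step 3: return the class with the largest total voting strength.
    return max(votes, key=votes.get)
```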

Logical choices for the kernel function \(w\) are such that:

  • \(w(d)>0\) for all \(d\),

  • \(w(d)\) attains its maximum for \(d=0\), and

  • \(w(d)\) is non-increasing in \(d\).
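
As an illustration (these particular kernels are not prescribed above, they are merely common choices that satisfy all three conditions), one may use a Gaussian kernel or a regularized inverse-distance weight:

```python
import numpy as np

def gaussian_weight(d, h=1.0):
    # Positive for all d, maximal at d = 0, non-increasing in d.
    # The bandwidth h is a free parameter.
    return np.exp(-d**2 / (2 * h**2))

def inverse_distance_weight(d, eps=1e-12):
    # The small eps keeps the vote strength finite at d = 0.
    return 1.0 / (d + eps)
```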

We refer to the documentation of the k nearest neighbor classifier in sklearn for possible choices of the weight function and for possible ways to scale the data.
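
As a sketch of how this might look in practice (assuming scikit-learn's built-in 'distance' weighting and standard scaling of the features; the data arrays are placeholders):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# Distance-weighted k nearest neighbor classifier with scaled features.
# weights='distance' gives each of the k neighbors a vote of strength 1/d;
# a custom weight function can be passed as a callable instead.
clf = make_pipeline(
    StandardScaler(),
    KNeighborsClassifier(n_neighbors=5, weights='distance'),
)

# X_train, y_train and X_new are assumed to be given as NumPy arrays.
# clf.fit(X_train, y_train)
# y_pred = clf.predict(X_new)
```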