KNN is a typical example of a lazy learner. It is called lazy not because of its apparent simplicity, but because it doesn't learn a discriminative function from the training data; instead, it memorizes the training dataset.
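
The following is a minimal sketch (assuming scikit-learn and a tiny hand-made dataset, not code from the book) of that lazy behavior: calling `fit` essentially just stores the training data, and the neighbor search is deferred until `predict` is called.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)            # no model parameters are learned here
print(knn.predict([[2.5], [10.5]]))  # neighbors are looked up at prediction time
```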

The downside is that the computational complexity of classifying new samples grows linearly with the number of samples in the training dataset in the worst-case scenario, unless the dataset has very few dimensions (features) and the algorithm is implemented using efficient data structures such as KD-trees.
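
As an illustration, here is a small sketch (scikit-learn and a synthetic dataset assumed) of switching the neighbor search from a brute-force scan to a KD-tree, which can speed up queries when the number of features is small.

```python
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=1000, n_features=4, random_state=1)

knn_brute = KNeighborsClassifier(n_neighbors=5, algorithm='brute').fit(X, y)
knn_kdtree = KNeighborsClassifier(n_neighbors=5, algorithm='kd_tree').fit(X, y)

# The predictions are the same; only the cost of the neighbor lookup differs
print((knn_brute.predict(X) == knn_kdtree.predict(X)).all())
```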

If we are using a Euclidean distance measure, it is also important to standardize the data so that each feature contributes equally to the distance. The 'minkowski' distance that we used in the previous code is simply a generalization of the Euclidean and Manhattan distances: it becomes the Euclidean distance if we set the parameter p=2, and the Manhattan distance if we set p=1.
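
In general, the Minkowski distance between two points x and y is d(x, y) = (Σ_j |x_j − y_j|^p)^(1/p). Below is a minimal sketch (assuming scikit-learn and the Iris dataset, not the exact code from the previous section) of standardizing the features before a Euclidean-distance KNN:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Standardize so that every feature contributes equally to the distance
sc = StandardScaler().fit(X_train)
X_train_std, X_test_std = sc.transform(X_train), sc.transform(X_test)

# p=2 gives the Euclidean distance; p=1 would give the Manhattan distance
knn = KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)
knn.fit(X_train_std, y_train)
print('Test accuracy: %.3f' % knn.score(X_test_std, y_test))
```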

We discussed the concept of regularization in the section about logistic regression as one way to avoid overfitting. However, in models where regularization is not applicable, such as decision trees and KNN, we can use feature selection and dimensionality reduction techniques to help us avoid the curse of dimensionality.
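
One way to do this, sketched below under the assumption that scikit-learn and the Iris dataset are available (the dataset and component count are illustrative choices, not from the original text), is to place a dimensionality reduction step such as PCA in front of the KNN classifier in a pipeline:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Reduce to 2 principal components before the neighbor search
pipe = make_pipeline(StandardScaler(), PCA(n_components=2),
                     KNeighborsClassifier(n_neighbors=5))
pipe.fit(X_train, y_train)
print('Test accuracy: %.3f' % pipe.score(X_test, y_test))
```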

results matching ""

    No results matching ""