KNN Classification
A non-parametric classification algorithm predicting classes via local neighbors.
What is KNN Classification?
K-Nearest Neighbors (KNN) is a simple, supervised machine learning algorithm used for both classification and regression. To classify an unlabeled query point, KNN calculates its distance (typically Euclidean) to all points in the training set, identifies the K closest points (neighbors), and outputs the majority class.
Key Characteristics:
- Supervised: relies on labeled training points
- Lazy learner: no training phase; lazy computation on query
- Distance metric: Euclidean distance standard
- K parameter dictates neighbor voting boundaries
Complexity Analysis
Avg Time Complexity:
O(1) Training, O(N * D) Classification (naive search)
Space Complexity:
O(N * D) training data storage
Best Case:
O(N) (single test point evaluation)
Worst Case:
O(N * D) (expensive sorting/lookups)
How it Works Step-by-Step
- Load Data: Store all labeled training points and coordinates.
- Calculate Distance: Compute distance between test point and all training points.
- Identify Neighbors: Select the K training points with the smallest distances.
- Vote Class: Perform majority vote among neighbors; output target label.
Code Implementation
Worked Trace Example
Query point Q (0,0) classified with K=3:
1. Neighbors: N1 (1,1) [Red], N2 (1,-1) [Red], N3 (2,2) [Blue]
2. Euclidean distances calculated and sorted.
3. Majority vote: 2 Red vs 1 Blue -> Classify Q as Red.