K-Nearest Neighbors (KNN)
A simple, instance-based learning algorithm that classifies new data points by the k closest training examples. It is a non-parametric, lazy learning method used for both classification and regression.
Visualization
K-Nearest Neighbors at a glance:
- Classification by proximity
- Find the k closest training points
- Prediction time: O(nd) for n training points in d dimensions
Implementation
class KNN {
  private X: number[][] = [];
  private y: number[] = [];

  constructor(private k: number = 3) {}

  fit(X: number[][], y: number[]): void {
    this.X = X;
    this.y = y;
  }

  private euclideanDistance(a: number[], b: number[]): number {
    return Math.sqrt(a.reduce((sum, val, i) => sum + (val - b[i]) ** 2, 0));
  }

  predict(X: number[][]): number[] {
    return X.map(x => this.predictOne(x));
  }

  private predictOne(x: number[]): number {
    // Calculate distances to all training points
    const distances = this.X.map((xi, i) => ({
      distance: this.euclideanDistance(x, xi),
      label: this.y[i]
    }));

    // Sort by distance and take k nearest
    distances.sort((a, b) => a.distance - b.distance);
    const kNearest = distances.slice(0, this.k);

    // Majority vote
    const votes = new Map<number, number>();
    for (const { label } of kNearest) {
      votes.set(label, (votes.get(label) || 0) + 1);
    }

    let maxVotes = 0, prediction = 0;
    for (const [label, count] of votes) {
      if (count > maxVotes) {
        maxVotes = count;
        prediction = label;
      }
    }

    return prediction;
  }

  score(X: number[][], y: number[]): number {
    const predictions = this.predict(X);
    const correct = predictions.filter((pred, i) => pred === y[i]).length;
    return correct / y.length;
  }
}

Deep Dive
Theoretical Foundation
KNN stores the entire training set and classifies a new point by majority vote among its k nearest neighbors. A distance metric (Euclidean, Manhattan, or the more general Minkowski) defines 'nearness'. There is no training phase; all computation happens at prediction time, which is why KNN is called a lazy learner. Choosing k is crucial: a small k is sensitive to noise, while a large k blurs decision boundaries. For binary classification, an odd k avoids voting ties.
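The three metrics mentioned above can be sketched as follows; these are illustrative standalone helpers, not methods of the KNN class in the Implementation section, which hard-codes Euclidean distance.

```typescript
// Illustrative distance metrics; function names are hypothetical helpers.
function manhattan(a: number[], b: number[]): number {
  // L1 norm: sum of absolute coordinate differences
  return a.reduce((sum, val, i) => sum + Math.abs(val - b[i]), 0);
}

function euclidean(a: number[], b: number[]): number {
  // L2 norm: square root of summed squared differences
  return Math.sqrt(a.reduce((sum, val, i) => sum + (val - b[i]) ** 2, 0));
}

function minkowski(a: number[], b: number[], p: number): number {
  // Generalizes both: p = 1 gives Manhattan, p = 2 gives Euclidean
  return a.reduce((sum, val, i) => sum + Math.abs(val - b[i]) ** p, 0) ** (1 / p);
}
```

Swapping the metric changes which points count as "nearest", so the choice can change predictions even with the same k.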
Complexity
Time: O(nd) per prediction (best, average, and worst case), where n is the number of training points and d the number of dimensions
Space: O(nd) to store the n training points of dimension d
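To make the O(nd) prediction cost concrete, here is a condensed, standalone version of the prediction step applied to a toy dataset; the function name and data are illustrative, mirroring the KNN class from the Implementation section.

```typescript
// Condensed standalone KNN prediction, for illustration only.
function knnPredict(X: number[][], y: number[], x: number[], k = 3): number {
  const neighbors = X
    .map((xi, i) => ({
      // O(d) distance per training point => O(nd) over all n points
      distance: Math.sqrt(xi.reduce((s, v, j) => s + (v - x[j]) ** 2, 0)),
      label: y[i],
    }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);

  // Majority vote among the k nearest neighbors
  const votes = new Map<number, number>();
  for (const { label } of neighbors) {
    votes.set(label, (votes.get(label) ?? 0) + 1);
  }
  let best = 0, bestCount = -1;
  for (const [label, count] of votes) {
    if (count > bestCount) { bestCount = count; best = label; }
  }
  return best;
}

// Toy dataset: class 0 clustered near the origin, class 1 near (5, 5)
const X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]];
const y = [0, 0, 0, 1, 1, 1];
console.log(knnPredict(X, y, [0.5, 0.5])); // 0
console.log(knnPredict(X, y, [5.5, 5.5])); // 1
```

Every prediction touches all n training points, which is the practical cost of having no training phase.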
Applications
Industry Use
Recommendation systems
Pattern recognition
Credit rating
Medical diagnosis
Image classification
Related Algorithms
Linear Regression
Fundamental supervised learning algorithm modeling the relationship between a dependent variable and independent variables using a linear equation. Foundational for predictive modeling and statistics.
Logistic Regression
Binary classification algorithm using the sigmoid function to model the probability of class membership. Despite the name, it performs classification, not regression. A foundation for neural networks.
Decision Tree
Tree-based model making decisions through a sequence of if-else questions. Splits data on feature values to build a hierarchical structure. Interpretable and handles non-linear relationships.
K-Means Clustering
Unsupervised learning algorithm partitioning n observations into k clusters, each observation assigned to the cluster with the nearest mean. Widely used for data segmentation and pattern discovery.