K-Nearest Neighbors (KNN)
A simple, instance-based learning algorithm that classifies new data points by the k closest training examples. It is a non-parametric, lazy learning method used for both classification and regression.
Visualization
K-Nearest Neighbors at a glance:
- Classification by proximity
- Find the k closest training points
- Prediction time: O(nd) for n training points in d dimensions
Implementation
class KNN {
  private X: number[][] = [];
  private y: number[] = [];

  constructor(private k: number = 3) {}

  fit(X: number[][], y: number[]): void {
    this.X = X;
    this.y = y;
  }

  private euclideanDistance(a: number[], b: number[]): number {
    return Math.sqrt(a.reduce((sum, val, i) => sum + (val - b[i]) ** 2, 0));
  }

  predict(X: number[][]): number[] {
    return X.map(x => this.predictOne(x));
  }

  private predictOne(x: number[]): number {
    // Calculate distances to all training points
    const distances = this.X.map((xi, i) => ({
      distance: this.euclideanDistance(x, xi),
      label: this.y[i]
    }));

    // Sort by distance and take k nearest
    distances.sort((a, b) => a.distance - b.distance);
    const kNearest = distances.slice(0, this.k);

    // Majority vote
    const votes = new Map<number, number>();
    for (const { label } of kNearest) {
      votes.set(label, (votes.get(label) || 0) + 1);
    }

    let maxVotes = 0, prediction = 0;
    for (const [label, count] of votes) {
      if (count > maxVotes) {
        maxVotes = count;
        prediction = label;
      }
    }

    return prediction;
  }

  score(X: number[][], y: number[]): number {
    const predictions = this.predict(X);
    const correct = predictions.filter((pred, i) => pred === y[i]).length;
    return correct / y.length;
  }
}

Deep Dive
Theoretical Foundation
KNN stores the entire training set and classifies a new point by majority vote among its k nearest neighbors. A distance metric (Euclidean, Manhattan, or the more general Minkowski) defines 'nearness'. There is no training phase; all computation happens at prediction time, which is why KNN is called a lazy learner. Choosing k is crucial: a small k is sensitive to noise, while a large k blurs decision boundaries. For binary classification, an odd k avoids voting ties.
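The three metrics mentioned above can be sketched as follows; these are illustrative standalone helpers, not methods of the KNN class in the Implementation section, which hard-codes Euclidean distance.

```typescript
// Illustrative distance metrics; function names are hypothetical helpers.
function manhattan(a: number[], b: number[]): number {
  // L1 norm: sum of absolute coordinate differences
  return a.reduce((sum, val, i) => sum + Math.abs(val - b[i]), 0);
}

function euclidean(a: number[], b: number[]): number {
  // L2 norm: square root of summed squared differences
  return Math.sqrt(a.reduce((sum, val, i) => sum + (val - b[i]) ** 2, 0));
}

function minkowski(a: number[], b: number[], p: number): number {
  // Generalizes both: p = 1 gives Manhattan, p = 2 gives Euclidean
  return a.reduce((sum, val, i) => sum + Math.abs(val - b[i]) ** p, 0) ** (1 / p);
}
```

Swapping the metric changes which points count as "nearest", so the choice can change predictions even with the same k.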
Complexity
Time: O(nd) per prediction (best, average, and worst case), where n is the number of training points and d the number of dimensions
Space: O(nd) to store the n training points of dimension d
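To make the O(nd) prediction cost concrete, here is a condensed, standalone version of the prediction step applied to a toy dataset; the function name and data are illustrative, mirroring the KNN class from the Implementation section.

```typescript
// Condensed standalone KNN prediction, for illustration only.
function knnPredict(X: number[][], y: number[], x: number[], k = 3): number {
  const neighbors = X
    .map((xi, i) => ({
      // O(d) distance per training point => O(nd) over all n points
      distance: Math.sqrt(xi.reduce((s, v, j) => s + (v - x[j]) ** 2, 0)),
      label: y[i],
    }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k);

  // Majority vote among the k nearest neighbors
  const votes = new Map<number, number>();
  for (const { label } of neighbors) {
    votes.set(label, (votes.get(label) ?? 0) + 1);
  }
  let best = 0, bestCount = -1;
  for (const [label, count] of votes) {
    if (count > bestCount) { bestCount = count; best = label; }
  }
  return best;
}

// Toy dataset: class 0 clustered near the origin, class 1 near (5, 5)
const X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]];
const y = [0, 0, 0, 1, 1, 1];
console.log(knnPredict(X, y, [0.5, 0.5])); // 0
console.log(knnPredict(X, y, [5.5, 5.5])); // 1
```

Every prediction touches all n training points, which is the practical cost of having no training phase.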
Applications
Industry Use
Recommendation systems
Pattern recognition
Credit rating
Medical diagnosis
Image classification
Related Algorithms
Linear Regression
Fundamental supervised learning algorithm modeling the relationship between a dependent variable and independent variables using a linear equation. Foundational for predictive modeling and statistics.
Logistic Regression
Binary classification algorithm using the sigmoid function to model the probability of class membership. Despite the name, it performs classification, not regression. A foundation for neural networks.
Decision Tree
Tree-based model making decisions through a sequence of if-else questions. Splits data on feature values to build a hierarchical structure. Interpretable and handles non-linear relationships.
K-Means Clustering
Unsupervised learning algorithm partitioning n observations into k clusters, each observation assigned to the cluster with the nearest mean. Widely used for data segmentation and pattern discovery.