SVM - Support Vector Machines
Optimum Separation Hyperplane
The optimum separation hyperplane (OSH) is the linear classifier
with the maximum margin for a given finite set of learning patterns.
This section presents the computation of the OSH
with a linear support vector machine.
Figure 1. The optimum separation hyperplane (OSH).
Consider the classification of two classes of patterns that are linearly separable,
i.e., a linear classifier can separate them perfectly (Figure 1).
The optimum linear classifier is the hyperplane H (w•x + b = 0)
with the maximum margin (the distance between hyperplanes H1 and H2).
Consider a linear classifier characterized by
the set of pairs (w, b) that satisfy the following inequalities
for any pattern xi with label yi in the training set:

w•xi + b ≥ +1 if yi = +1
w•xi + b ≤ -1 if yi = -1

These inequalities can be expressed in compact form as

yi(w•xi + b) ≥ 1, for all i.
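As a quick numerical illustration, the compact constraint can be checked directly. The patterns and the hand-chosen hyperplane (w, b) below are hypothetical toy values, not taken from the text:

```python
# Toy illustration (hypothetical data): check the SVM constraints
# y_i * (w . x_i + b) >= 1 for a candidate hyperplane (w, b).

def satisfies_constraints(w, b, patterns):
    """patterns: list of (x, y) with x a tuple and y in {-1, +1}."""
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))
    return all(y * (dot(w, x) + b) >= 1 for x, y in patterns)

# Linearly separable toy set: class -1 near the origin, class +1 farther out.
training = [((0.0, 0.0), -1), ((0.0, 1.0), -1),
            ((3.0, 3.0), +1), ((4.0, 2.0), +1)]

print(satisfies_constraints((1.0, 0.0), -1.5, training))  # → True
```

Any (w, b) that passes this check separates the training set with a functional margin of at least 1 on every pattern.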
Because we have considered the case of linearly separable
classes, each such hyperplane (w, b) is a classifier that
correctly separates all patterns from the training set:

class(xi) = sign(w•xi + b) = yi.
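The decision rule above can be sketched as follows; the weight vector, bias, and test points are hypothetical values chosen for illustration:

```python
# Hypothetical example: the linear classifier assigns
# class(x) = sign(w . x + b).

def classify(w, b, x):
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s >= 0 else -1

w, b = (1.0, 0.0), -1.5  # hand-chosen separating hyperplane
print(classify(w, b, (0.0, 0.0)), classify(w, b, (3.0, 3.0)))  # → -1 1
```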
For the hyperplane H (w•x + b = 0), the
distance between the origin and H is |b|/||w||. The
patterns from class -1 that satisfy the equality w•x + b = -1
determine the hyperplane H1; the distance
between the origin and H1 is |-1-b|/||w||.
Similarly, the patterns from class +1 that satisfy the equality w•x
+ b = +1 determine the hyperplane H2; the distance
between the origin and H2 is |+1-b|/||w||.
Of course, hyperplanes H, H1, and H2 are parallel, and no
training patterns are located between H1 and H2.
The signed distances of H1 and H2 from the origin along w differ by
((+1-b) - (-1-b))/||w||, so the distance (margin) between H1
and H2 is 2/||w||.
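This claim can be verified numerically. The hyperplane below (w, b) is an arbitrary, hypothetical choice; the margin computed from the signed distances of H1 and H2 matches 2/||w||:

```python
import math

# Numerical check (arbitrarily chosen hyperplane): the margin between
# H1 (w.x + b = -1) and H2 (w.x + b = +1) equals 2/||w||.

w = (3.0, 4.0)   # hypothetical weight vector, ||w|| = 5
b = 2.0          # hypothetical bias

norm_w = math.hypot(*w)

# Signed distances from the origin, measured along w:
# H1: w.x + b = -1  ->  (-1 - b) / ||w||
# H2: w.x + b = +1  ->  (+1 - b) / ||w||
d1 = (-1 - b) / norm_w
d2 = (+1 - b) / norm_w

margin = d2 - d1
print(margin, 2 / norm_w)  # → 0.4 0.4
```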
From these considerations it
follows that the optimum separation hyperplane is identified by
maximizing the margin 2/||w||, which is equivalent to minimizing ||w||²/2.
The problem of finding the optimum separation hyperplane is thus the
identification of the pair (w, b) that satisfies

yi(w•xi + b) ≥ 1, for all i,

for which ||w|| is minimum.
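For the simplest case of one pattern per class, this minimization has a closed-form solution that serves as a sanity check (a sketch with hypothetical points, not a general solver): the optimal w is parallel to x_pos - x_neg, scaled so both constraints hold with equality, and the resulting margin 2/||w|| equals the distance between the two points:

```python
# Closed-form OSH for two patterns, one per class (hypothetical points).
# With d = x_pos - x_neg, the optimum is w = 2*d/||d||², and b follows
# from the active constraint w . x_pos + b = +1.

def osh_two_points(x_neg, x_pos):
    d = [p - n for p, n in zip(x_pos, x_neg)]
    dd = sum(di * di for di in d)                      # ||d||²
    w = [2 * di / dd for di in d]
    b = 1 - sum(wi * pi for wi, pi in zip(w, x_pos))
    return w, b

x_neg, x_pos = (0.0, 0.0), (3.0, 4.0)
w, b = osh_two_points(x_neg, x_pos)

norm_w = (w[0] ** 2 + w[1] ** 2) ** 0.5
print(2 / norm_w)  # → 5.0, the distance between the two points
```

By construction w•x_pos + b = +1 and w•x_neg + b = -1, so both patterns lie exactly on H2 and H1 and are the support vectors of this trivial problem.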