SVM: Support Vector Machines

Optimum Separation Hyperplane

The optimum separation hyperplane (OSH) is the linear classifier
with the maximum margin for a given finite set of learning patterns.
The OSH computation with a linear
support vector machine is presented in this section.

Figure 1. The optimum separation hyperplane (OSH).

Consider the classification of two classes of patterns that are linearly separable,
i.e., a linear classifier can perfectly separate them (Figure 1).
The optimum separation hyperplane is the hyperplane H (w•x + b = 0)
with the maximum margin, i.e., the maximum distance between the hyperplanes H_{1} and H_{2}.
Consider a linear classifier characterized by
the set of pairs (w, b) that satisfies the following inequalities
for any pattern x_{i} with class label y_{i} in the training set:

w•x_{i} + b >= +1 if y_{i} = +1
w•x_{i} + b <= -1 if y_{i} = -1

These inequalities can be expressed in compact form as

y_{i}(w•x_{i} + b) >= +1

or

y_{i}(w•x_{i} + b) - 1 >= 0.
Because we have considered the case of linearly separable
classes, each such hyperplane (w, b) is a classifier that
correctly separates all patterns from the training set:

class(x_{i}) = sign(w•x_{i} + b) = y_{i}.
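The separability constraints can be checked numerically. Below is a minimal sketch in NumPy; the training set and the candidate pair (w, b) are illustrative assumptions, not values from the text:

```python
import numpy as np

# Illustrative training set: two linearly separable classes in the plane.
X = np.array([[2.0, 2.0], [3.0, 3.0],   # class +1
              [0.0, 0.0], [1.0, 0.0]])  # class -1
y = np.array([1, 1, -1, -1])

# A candidate hyperplane (w, b); these values are assumed for illustration.
w = np.array([1.0, 1.0])
b = -3.0

# The compact constraint y_i (w . x_i + b) >= 1 holds for every training
# pattern exactly when (w, b) separates the two classes correctly and every
# pattern lies on or outside the hyperplanes H_1 and H_2.
margins = y * (X @ w + b)
print(margins)
print(np.all(margins >= 1))
```

Any (w, b) that passes this check is a valid separating hyperplane, but not necessarily the one with the maximum margin.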
Every point of the hyperplane H satisfies the equation w•x + b = 0, and the
distance between the origin and the hyperplane H is |b|/||w||. We
consider the patterns from the class -1 that satisfy the equality w•x
+ b = -1 and determine the hyperplane H_{1}; the distance
between the origin and the hyperplane H_{1} is equal to |-1 - b|/||w||.
Similarly, the patterns from the class +1 satisfy the equality w•x
+ b = +1 and determine the hyperplane H_{2}; the distance
between the origin and the hyperplane H_{2} is equal to |+1 - b|/||w||.
Of course, the hyperplanes H, H_{1}, and H_{2} are parallel, and no
training patterns are located between the hyperplanes H_{1} and H_{2}.
Based on the above considerations, the margin (the distance between the hyperplanes H_{1}
and H_{2}) is 2/||w||.
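These distances can be verified numerically. The sketch below uses an assumed pair (w, b), chosen so that H_{1} and H_{2} lie on the same side of the origin, and checks that the separation between them equals 2/||w||:

```python
import numpy as np

# Assumed hyperplane parameters, chosen only for illustration.
w = np.array([3.0, 4.0])   # so ||w|| = 5
b = -2.0

norm_w = np.linalg.norm(w)

# Distances from the origin to H_1 (w . x + b = -1) and H_2 (w . x + b = +1):
d1 = abs(-1.0 - b) / norm_w   # |-1 - b| / ||w||
d2 = abs(+1.0 - b) / norm_w   # |+1 - b| / ||w||

# With this choice of b, H_1 and H_2 are on the same side of the origin,
# so the distance between them is the difference d2 - d1; in general the
# margin is always 2 / ||w||.
margin = 2.0 / norm_w
print(d2 - d1, margin)
```

Scaling w and b by a common factor moves H_{1} and H_{2} closer to or farther from H, which is why the margin depends only on ||w||.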
From these considerations it
follows that the identification of the optimum separation hyperplane is performed by
maximizing the margin 2/||w||, which is equivalent to minimizing ||w||^{2}/2.
The problem of finding the optimum separation hyperplane is therefore the
identification of the pair (w, b) that satisfies

y_{i}(w•x_{i} + b) >= +1 for all patterns x_{i} in the training set

for which ||w|| is minimum.
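This constrained minimization can be handed to a general-purpose solver. The sketch below uses SciPy's SLSQP method on an assumed toy training set (the data, the starting point, and all names are illustrative, not part of the text):

```python
import numpy as np
from scipy.optimize import minimize

# Assumed toy training set: two linearly separable classes in the plane.
X = np.array([[2.0, 2.0], [3.0, 3.0],   # class +1
              [0.0, 0.0], [1.0, 0.0]])  # class -1
y = np.array([1.0, 1.0, -1.0, -1.0])

# Decision variables v = (w1, w2, b); the objective is ||w||^2 / 2.
def objective(v):
    w = v[:2]
    return 0.5 * np.dot(w, w)

# One inequality constraint y_i (w . x_i + b) - 1 >= 0 per training pattern.
constraints = [{"type": "ineq",
                "fun": lambda v, i=i: y[i] * (X[i] @ v[:2] + v[2]) - 1.0}
               for i in range(len(y))]

# Start from a feasible (but suboptimal) separating hyperplane.
res = minimize(objective, x0=np.array([1.0, 1.0, -3.0]),
               constraints=constraints)
w, b = res.x[:2], res.x[2]
print(w, b)
print(2.0 / np.linalg.norm(w))   # the maximum margin for this data
</n```

In practice the optimization is usually solved through its dual formulation, which is what dedicated SVM implementations do; the direct primal solve above only illustrates the problem statement.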
