Mathematics Behind Support Vector Machine

Support Vector Machine is a supervised machine learning algorithm. In this blog, we discuss how a support vector machine works mathematically. We will also discuss the types of SVM and how to implement it in Python. So, let's get started.

We have previously written posts on a range of machine learning topics, including classification and prediction algorithms, with the aim of helping you understand how these methods work. You are welcome to explore the other articles on our website, and we appreciate any feedback that helps us improve our writing.

This post has been cross-posted from my GitHub page.

Support Vector Machine (SVM) is one of the most widely used supervised learning techniques. It can be applied to both classification and regression problems, although it is mostly used for classification.

Given input data points, a support vector machine generates the hyperplane (in two dimensions, simply a line) that best divides the data points into two groups.

This line or hyperplane is the decision boundary: data points that fall on one side of it are assigned to one class, and those that fall on the other side are assigned to the other class.

Support Vector Machines (SVMs) offer two key advantages over newer algorithms such as neural networks: they are typically faster to train, and they tend to perform better with a limited number of samples (in the thousands). The data points nearest to the hyperplane, known as support vectors, determine the hyperplane's position and orientation. Many different hyperplanes could separate the two classes of data points.

The SVM algorithm chooses the hyperplane with the largest margin as the optimal hyperplane. Such a hyperplane is called the maximum marginal hyperplane (MMH). Let H1 and H2 be planes parallel to the decision boundary that pass through the support vectors.

The distance between H1 and the decision boundary must equal the distance between H2 and the decision boundary. The margin is the distance between H1 and H2.
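In symbols, this can be sketched as follows. The notation is an assumption added here, not taken from the text above: $$w$$ is the weight vector, $$b$$ is the bias, and the scale of $$w$$ and $$b$$ is fixed by the common convention that the support vectors satisfy $$w \cdot x + b = \pm 1$$.

$$\text{Decision boundary: } w \cdot x + b = 0$$

$$H_1: \; w \cdot x + b = +1, \qquad H_2: \; w \cdot x + b = -1$$

$$\text{margin} = \frac{2}{\lVert w \rVert}$$

Under this convention, each of H1 and H2 lies at distance $$1/\lVert w \rVert$$ from the decision boundary, so maximizing the margin is the same as minimizing $$\lVert w \rVert$$.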

Types of Support Vector Machine

There are two varieties of Support Vector Machine (SVM): Linear SVM and Non-linear SVM.

A linear SVM is used to classify data points that are linearly separable, whereas a non-linear SVM is used to classify data points that are not linearly separable.

Non-linear SVM operates in the following two steps:

  • Using the kernel trick, it maps the low-dimensional data points into a higher-dimensional space in which they become linearly separable.

  • It then uses a linear hyperplane in that space to classify the data points.

The kernels used by SVM algorithms are mathematical functions. They are used to map non-linearly separable low-dimensional data points into a higher-dimensional space where they become linearly separable. Popular kernels include the linear, polynomial, Gaussian radial basis function (RBF), and sigmoid kernels.
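To make the kernels a little more concrete, here is a minimal pure-Python sketch of the four kernel functions in their commonly used textbook forms. The function names and the default values of `degree`, `coef0` and `gamma` are illustrative choices, not settings taken from this post. The final lines illustrate the idea behind the kernel trick: a degree-2 polynomial kernel with `coef0=0` gives the same value as a dot product in an explicit 3-D feature space, without ever building that space.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, degree=2, coef0=1.0):
    # degree and coef0 are illustrative defaults (assumption).
    return (dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=0.5):
    # Gaussian RBF: exp(-gamma * squared Euclidean distance).
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=0.5, coef0=0.0):
    return math.tanh(gamma * dot(u, v) + coef0)


def phi(p):
    """Explicit feature map matching the degree-2 polynomial kernel with coef0=0."""
    x1, x2 = p
    return (x1 ** 2, math.sqrt(2) * x1 * x2, x2 ** 2)


u, v = (1.0, 2.0), (3.0, 4.0)
implicit = polynomial_kernel(u, v, degree=2, coef0=0.0)  # computed in 2-D
explicit = dot(phi(u), phi(v))                           # computed in 3-D
print(implicit, explicit)
```

Both printed values agree up to floating-point rounding, which is why an SVM can work with the kernel value alone.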

Consider a set of linearly inseparable 2-D data points. By adding a third dimension, $$z=x^2+y^2$$, we can convert them into linearly separable data points.
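The lifting step can be sketched in Python on hypothetical data. The two rings of points below (radius 1 and radius 3) and the threshold of 5 are assumptions made for illustration; they are not the data set from the original figure. In 2-D no straight line separates an inner ring from an outer ring, but after adding $$z=x^2+y^2$$ a flat plane does.

```python
import math


def lift(point):
    """Map a 2-D point (x, y) to 3-D by adding z = x^2 + y^2."""
    x, y = point
    return (x, y, x ** 2 + y ** 2)


def ring(radius, n=8):
    """n evenly spaced points on a circle of the given radius (hypothetical data)."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


inner = ring(1.0)  # one class, close to the origin
outer = ring(3.0)  # other class, surrounding the first

inner_z = [lift(p)[2] for p in inner]  # all approximately 1
outer_z = [lift(p)[2] for p in outer]  # all approximately 9

# Any plane z = c with 1 < c < 9 separates the lifted points; 5 is an arbitrary pick.
threshold = 5.0
separable = max(inner_z) < threshold < min(outer_z)
print(separable)
```

The plane `z = 5` in the lifted space corresponds to a circle of radius `sqrt(5)` in the original 2-D space, which is the non-linear boundary the SVM effectively learns.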

Let's demonstrate how a linear SVM finds its hyperplane with a numerical example.

Consider the following data points:

  • Positively labelled data points: (3,1), (3,-1), (6,1), (6,-1)
  • Negatively labelled data points: (1,0), (0,1), (0,-1), (-1,0)

Determine the equation of the hyperplane that divides the above data points into two classes.
Then predict the class of the data point (5,2).

Solution: First, plot the given points.

From the plot, the support vectors (the points of each class that lie closest to the other class) are

s1=(1,0), s2=(3,1), s3=(3,-1)

Augment each support vector with a bias entry of 1:

s1=(1,0,1), s2=(3,1,1), s3=(3,-1,1)

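Since there are three support vectors, we need to solve for three unknowns, $$\alpha_1$$, $$\alpha_2$$ and $$\alpha_3$$. With $$s_1$$, $$s_2$$ and $$s_3$$ denoting the augmented support vectors, the three linear equations can be written as

$$\alpha_1\, s_1 \cdot s_1 + \alpha_2\, s_2 \cdot s_1 + \alpha_3\, s_3 \cdot s_1 = -1$$

$$\alpha_1\, s_1 \cdot s_2 + \alpha_2\, s_2 \cdot s_2 + \alpha_3\, s_3 \cdot s_2 = 1$$

$$\alpha_1\, s_1 \cdot s_3 + \alpha_2\, s_2 \cdot s_3 + \alpha_3\, s_3 \cdot s_3 = 1$$

The right-hand sides are -1, 1 and 1 because s1 belongs to the negative class, while s2 and s3 belong to the positive class.

After evaluating the dot products and simplifying, we get

$$2\alpha_1 + 4\alpha_2 + 4\alpha_3 = -1$$

$$4\alpha_1 + 11\alpha_2 + 9\alpha_3 = 1$$

$$4\alpha_1 + 9\alpha_2 + 11\alpha_3 = 1$$

Solving these equations, we get

$$\alpha_1 = -3.5, \quad \alpha_2 = 0.75, \quad \alpha_3 = 0.75$$

Now we can compute the augmented weight vector of the hyperplane as

$$\tilde{w} = \sum_i \alpha_i s_i = -3.5\,(1,0,1) + 0.75\,(3,1,1) + 0.75\,(3,-1,1) = (1, 0, -2)$$

The first two entries give the weight vector $$w = (1, 0)$$, and the last entry gives the bias $$b = -2$$.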

Hence, the equation of the hyperplane that separates the two classes is

$$x - 2 = 0$$
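As a numerical check of this worked example, the following sketch builds the three linear equations from the augmented support vectors, solves them exactly, and recovers the weight vector and bias. The names `alphas`, `gram` and `solve_linear_system` are illustrative, and the small Gauss-Jordan solver is just one convenient way to solve a 3x3 system without third-party libraries.

```python
from fractions import Fraction

# Augmented support vectors (bias entry 1 appended) and their class labels.
support_vectors = [(1, 0, 1), (3, 1, 1), (3, -1, 1)]
labels = [-1, 1, 1]


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination using exact fractions."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs]; Fractions avoid rounding error.
    rows = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot in this column and swap it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the pivot becomes 1.
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Equation i reads: sum_j alpha_j * (s_j . s_i) = y_i
gram = [[dot(s_j, s_i) for s_j in support_vectors] for s_i in support_vectors]
alphas = solve_linear_system(gram, labels)

# Augmented weight vector: sum_i alpha_i * s_i; its last entry is the bias b.
w_aug = [sum(a * s[k] for a, s in zip(alphas, support_vectors)) for k in range(3)]

print("coefficients:", gram)
print("alphas:", [float(a) for a in alphas])
print("w =", [float(v) for v in w_aug[:2]], "b =", float(w_aug[2]))
```

The script reports `alphas` of -3.5, 0.75 and 0.75, with `w = [1.0, 0.0]` and `b = -2.0`, which corresponds to the hyperplane x - 2 = 0.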

The data point to be classified is (5,2).
Substituting this data point into the left-hand side of the above equation, we get

$$5 - 2 = 3 > 0$$

Thus, the data point (5,2) belongs to the positive (+1) class.
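The classification rule can be sketched in a few lines of Python. The weight vector `w = (1, 0)` and bias `b = -2` encode the hyperplane x - 2 = 0 from the example. The helper names are illustrative, and assigning points that lie exactly on the boundary to the positive class is an arbitrary choice made here.

```python
# Hyperplane from the worked example: x - 2 = 0, i.e. w = (1, 0) and b = -2.
w = (1, 0)
b = -2


def decision_value(point):
    """Signed score w . x + b; its sign tells us the side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b


def predict(point):
    """Return +1 or -1; ties on the boundary go to +1 (an arbitrary assumption)."""
    return 1 if decision_value(point) >= 0 else -1


positive = [(3, 1), (3, -1), (6, 1), (6, -1)]
negative = [(1, 0), (0, 1), (0, -1), (-1, 0)]

# The new point from the example.
print(decision_value((5, 2)), predict((5, 2)))  # prints: 3 1

# Sanity check: every training point falls on the expected side.
print(all(predict(p) == 1 for p in positive))
print(all(predict(p) == -1 for p in negative))
```

Only the x coordinate matters here because the second weight is zero, so the y value of 2 in (5,2) has no effect on the prediction.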
