Introduction to Support Vector Machines (SVM) - GeeksforGeeks (2024)

Last Updated : 02 Feb, 2023

Summarize

Comments

Improve

INTRODUCTION:

Support Vector Machines (SVMs) are a type of supervised learning algorithm that can be used for classification or regression tasks. The main idea behind SVMs is to find a hyperplane that maximally separates the different classes in the training data. This is done by finding the hyperplane that has the largest margin, which is defined as the distance between the hyperplane and the closest data points from each class. Once the hyperplane is determined, new data can be classified by determining on which side of the hyperplane it falls. SVMs are particularly useful when the data has many features, and/or when there is a clear margin of separation in the data.

What are Support Vector Machines? Support Vector Machine (SVM) is a relatively simple Supervised Machine Learning Algorithm used for classification and/or regression. It is more preferred for classification but is sometimes very useful for regression as well. Basically, SVM finds a hyper-plane that creates a boundary between the types of data. In 2-dimensional space, this hyper-plane is nothing but a line. In SVM, we plot each data item in the dataset in an N-dimensional space, where N is the number of features/attributes in the data. Next, find the optimal hyperplane to separate the data. So by this, you must have understood that inherently, SVM can only perform binary classification (i.e., choose between two classes). However, there are various techniques to use for multi-class problems. Support Vector Machine for Multi-CLass Problems To perform SVM on multi-class problems, we can create a binary classifier for each class of the data. The two results of each classifier will be :

  • The data point belongs to that class OR
  • The data point does not belong to that class.

For example, in a class of fruits, to perform multi-class classification, we can create a binary classifier for each fruit. For say, the ‘mango’ class, there will be a binary classifier to predict if it IS a mango OR it is NOT a mango. The classifier with the highest score is chosen as the output of the SVM. SVM for complex (Non Linearly Separable) SVM works very well without any modifications for linearly separable data. Linearly Separable Data is any data that can be plotted in a graph and can be separated into classes using a straight line.

Introduction to Support Vector Machines (SVM) - GeeksforGeeks (1)

A: Linearly Separable Data B: Non-Linearly Separable Data

We use Kernelized SVM for non-linearly separable data. Say, we have some non-linearly separable data in one dimension. We can transform this data into two dimensions and the data will become linearly separable in two dimensions. This is done by mapping each 1-D data point to a corresponding 2-D ordered pair. So for any non-linearly separable data in any dimension, we can just map the data to a higher dimension and then make it linearly separable. This is a very powerful and general transformation. A kernel is nothing but a measure of similarity between data points. The kernel function in a kernelized SVM tells you, that given two data points in the original feature space, what the similarity is between the points in the newly transformed feature space. There are various kernel functions available, but two are very popular :

  • Radial Basis Function Kernel (RBF): The similarity between two points in the transformed feature space is an exponentially decaying function of the distance between the vectors and the original input space as shown below. RBF is the default kernel used in SVM.
Introduction to Support Vector Machines (SVM) - GeeksforGeeks (2)
  • Polynomial Kernel: The Polynomial kernel takes an additional parameter, ‘degree’ that controls the model’s complexity and computational cost of the transformation

A very interesting fact is that SVM does not actually have to perform this actual transformation on the data points to the new high dimensional feature space. This is called the kernel trick. The Kernel Trick: Internally, the kernelized SVM can compute these complex transformations just in terms of similarity calculations between pairs of points in the higher dimensional feature space where the transformed feature representation is implicit. This similarity function, which is mathematically a kind of complex dot product is actually the kernel of a kernelized SVM. This makes it practical to apply SVM when the underlying feature space is complex or even infinite-dimensional. The kernel trick itself is quite complex and is beyond the scope of this article. Important Parameters in Kernelized SVC ( Support Vector Classifier)

  1. The Kernel: The kernel, is selected based on the type of data and also the type of transformation. By default, the kernel is Radial Basis Function Kernel (RBF).
  2. Gamma : This parameter decides how far the influence of a single training example reaches during transformation, which in turn affects how tightly the decision boundaries end up surrounding points in the input space. If there is a small value of gamma, points farther apart are considered similar. So more points are grouped together and have smoother decision boundaries (maybe less accurate). Larger values of gamma cause points to be closer together (may cause overfitting).
  3. The ‘C’ parameter: This parameter controls the amount of regularization applied to the data. Large values of C mean low regularization which in turn causes the training data to fit very well (may cause overfitting). Lower values of C mean higher regularization which causes the model to be more tolerant of errors (may lead to lower accuracy).

Pros of Kernelized SVM:

  1. They perform very well on a range of datasets.
  2. They are versatile: different kernel functions can be specified, or custom kernels can also be defined for specific datatypes.
  3. They work well for both high and low dimensional data.

Cons of Kernelized SVM:

  1. Efficiency (running time and memory usage) decreases as the size of the training set increases.
  2. Needs careful normalization of input data and parameter tuning.
  3. Does not provide a direct probability estimator.
  4. Difficult to interpret why a prediction was made.

Example

Python3

import numpy as np

from sklearn.datasets import make_classification

from sklearn import svm

from sklearn.model_selection import train_test_split

classes = 4

X,t= make_classification(100, 5, n_classes = classes, random_state= 40, n_informative = 2, n_clusters_per_class = 1)

#%%

X_train, X_test, y_train, y_test= train_test_split(X, t , test_size=0.50)

#%%

model = svm.SVC(kernel = 'linear', random_state = 0, C=1.0)

#%%

model.fit(X_train, y_train)

#%%

y=model.predict(X_test)

y2=model.predict(X_train)

#%%

from sklearn.metrics import accuracy_score

score =accuracy_score(y, y_test)

print(score)

score2 =accuracy_score(y2, y_train)

print(score2)

#%%

import matplotlib.pyplot as plt

color = ['black' if c == 0 else 'lightgrey' for c in y]

plt.scatter(X_train[:,0], X_train[:,1], c=color)

# Create the hyperplane

w = model.coef_[0]

a = -w[0] / w[1]

xx = np.linspace(-2.5, 2.5)

yy = a * xx - (model.intercept_[0]) / w[1]

# Plot the hyperplane

plt.plot(xx, yy)

plt.axis("off"), plt.show();

 
 

Conclusion: Now that you know the basics of how an SVM works, you can go to the following link to learn how to implement SVM to classify items using Python: https://www.geeksforgeeks.org/classifying-data-using-support-vector-machinessvms-in-python/



alokesh985

Introduction to Support Vector Machines (SVM) - GeeksforGeeks (4)

Improve

Next Article

Support Vector Machine (SVM) for Anomaly Detection

Please Login to comment...

Introduction to Support Vector Machines (SVM) - GeeksforGeeks (2024)
Top Articles
Request a partial or full refund from your Host
Does a Full System Restore Remove Viruses?
Katie Pavlich Bikini Photos
Gamevault Agent
Hocus Pocus Showtimes Near Harkins Theatres Yuma Palms 14
Free Atm For Emerald Card Near Me
Craigslist Mexico Cancun
Hendersonville (Tennessee) – Travel guide at Wikivoyage
Doby's Funeral Home Obituaries
Vardis Olive Garden (Georgioupolis, Kreta) ✈️ inkl. Flug buchen
Select Truck Greensboro
How To Cut Eelgrass Grounded
Craigslist In Flagstaff
Shasta County Most Wanted 2022
Energy Healing Conference Utah
Testberichte zu E-Bikes & Fahrrädern von PROPHETE.
Aaa Saugus Ma Appointment
Geometry Review Quiz 5 Answer Key
Walgreens Alma School And Dynamite
Bible Gateway passage: Revelation 3 - New Living Translation
Home
Shadbase Get Out Of Jail
Gina Wilson Angle Addition Postulate
Celina Powell Lil Meech Video: A Controversial Encounter Shakes Social Media - Video Reddit Trend
Walmart Pharmacy Near Me Open
Dmv In Anoka
A Christmas Horse - Alison Senxation
Ou Football Brainiacs
Access a Shared Resource | Computing for Arts + Sciences
Pixel Combat Unblocked
Umn Biology
Obituaries, 2001 | El Paso County, TXGenWeb
Cvs Sport Physicals
Mercedes W204 Belt Diagram
Rogold Extension
'Conan Exiles' 3.0 Guide: How To Unlock Spells And Sorcery
Colin Donnell Lpsg
Teenbeautyfitness
Weekly Math Review Q4 3
Facebook Marketplace Marrero La
Nobodyhome.tv Reddit
Topos De Bolos Engraçados
Gregory (Five Nights at Freddy's)
Grand Valley State University Library Hours
Holzer Athena Portal
Hampton In And Suites Near Me
Stoughton Commuter Rail Schedule
Bedbathandbeyond Flemington Nj
Free Carnival-themed Google Slides & PowerPoint templates
Otter Bustr
San Pedro Sula To Miami Google Flights
Selly Medaline
Latest Posts
Article information

Author: Msgr. Benton Quitzon

Last Updated:

Views: 6127

Rating: 4.2 / 5 (63 voted)

Reviews: 86% of readers found this page helpful

Author information

Name: Msgr. Benton Quitzon

Birthday: 2001-08-13

Address: 96487 Kris Cliff, Teresiafurt, WI 95201

Phone: +9418513585781

Job: Senior Designer

Hobby: Calligraphy, Rowing, Vacation, Geocaching, Web surfing, Electronics, Electronics

Introduction: My name is Msgr. Benton Quitzon, I am a comfortable, charming, thankful, happy, adventurous, handsome, precious person who loves writing and wants to share my knowledge and understanding with you.