
# SVM and K-SVM - Learn to Code

This is a quick coding guide for SVM and kernel SVM (K-SVM) using a simple dataset. I will compare the two models using a confusion matrix and the accuracy score.

Let's understand the dataset first.

The dataset below represents whether an individual purchased a car, based on age and salary. I will split it into a training set and a test set: the training set is used to train the model, and the test set is used to evaluate it.

I will follow the steps below to arrive at predictions for this data.

SVM & K-SVM :

1. Importing the libraries

2. Importing the datasets

3. Splitting the datasets into the Training set and Test set

4. Feature Scaling

5. Training the SVM model on the Training set

6. Predicting a new result

7. Predicting the Test set results

8. Making the Confusion Matrix

Details :-

1. Importing the libraries

```import numpy as np
import matplotlib.pyplot as plt
import pandas as pd```

2. Importing the datasets

```dataset = pd.read_csv('/content/sample_data/Social_Network_Ads.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values```

3. Splitting the datasets into the Training set and Test set

```from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 1)```

4. Feature Scaling

Here I standardize the feature variables so that age and salary are on a comparable scale.

```from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)```
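StandardScaler computes z = (x − mean) / std from the training data and then applies those same statistics to the test data. A minimal sketch with a made-up column of ages (the values are illustrative, not from the actual dataset):

```python
import numpy as np

# Illustrative training ages (not the real dataset values).
ages_train = np.array([19.0, 35.0, 26.0, 27.0, 32.0])

# Standardize the same way StandardScaler does:
# z = (x - mean) / std, with the population std (ddof=0).
mean, std = ages_train.mean(), ages_train.std()
ages_scaled = (ages_train - mean) / std

print(ages_scaled.mean())  # ~0 after standardization
print(ages_scaled.std())   # ~1 after standardization
```

Note that the test set must be transformed with the training mean and std (`sc.transform`, not `fit_transform`) so no information from the test set leaks into the model.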

5. Training the SVM model on the Training set

```from sklearn.svm import SVC
classifier = SVC(kernel = 'XX', random_state = 0)
classifier.fit(X_train, y_train)```

Note: Please replace XX with 'linear' for SVM or 'rbf' for K-SVM.
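To see both kernels side by side without the CSV file, here is a sketch of the same pipeline using a synthetic two-feature dataset from `make_classification` as a stand-in for Social_Network_Ads.csv; the resulting accuracies will of course differ from the ones reported below.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the age/salary data: 2 features, binary target.
X, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

# Scale with training statistics only.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Train one SVC per kernel; only the kernel argument changes.
results = {}
for kernel in ('linear', 'rbf'):
    clf = SVC(kernel=kernel, random_state=0)
    clf.fit(X_train, y_train)
    results[kernel] = accuracy_score(y_test, clf.predict(X_test))
    print(kernel, results[kernel])
```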

6. Predicting a new result

`print(classifier.predict(sc.transform([[30, 87000]])))`

Output:

The output below matches the actual label in the test dataset.

```Kernel(linear) :- [0]
Kernel(rbf) :- [0]```

7. Predicting the Test set results

```y_pred = classifier.predict(X_test)
print(np.concatenate((y_pred.reshape(len(y_pred), 1), y_test.reshape(len(y_test), 1)), 1))```

Output:

The output below shows, for each test sample, the predicted result (y_pred, left column) against the actual result (y_test, right column).

```Kernel(linear):
[[0 0]  [0 0]  [1 1]  [1 1]  [0 0]  [0 0]  [0 0]  [1 1]  [0 0]  [1 0]  [0 0]  [0 0]  [0 0]  [1 1]  [1 1]  [1 1]  [1 1]  [0 0]  [0 0]  [1 1]  [0 0]  [1 1]  [1 1]  [0 0]  [0 1]  [0 0]  [1 1]  [1 0]  [1 1]  [1 0]  [0 0]  [0 0]  [0 0]  [1 1]  [0 0]  [0 0]  [0 0]  [0 0]  [0 1]  [0 0]  [1 1]  [1 1]  [0 0]  [0 0]  [1 1]  [0 1]  [0 1]  [1 1]  [0 0]  [1 1]  [0 0]  [0 0]  [1 1]  [0 1]  [0 1]  [0 0]  [1 1]  [0 0]  [1 1]  [1 1]  [0 0]  [0 0]  [1 0]  [0 0]  [0 1]  [1 1]  [0 0]  [0 0]  [1 0]  [0 0]  [1 0]  [0 0]  [1 1]  [0 0]  [0 0]  [1 1]  [0 0]  [0 0]  [0 0]  [0 0]]
Kernel(rbf) :
[[0 0]  [0 0]  [1 1]  [1 1]  [1 0]  [0 0]  [0 0]  [1 1]  [0 0]  [1 0]  [0 0]  [0 0]  [0 0]  [1 1]  [1 1]  [1 1]  [1 1]  [0 0]  [0 0]  [1 1]  [0 0]  [1 1]  [1 1]  [1 0]  [1 1]  [0 0]  [1 1]  [1 0]  [1 1]  [1 0]  [0 0]  [0 0]  [0 0]  [1 1]  [0 0]  [0 0]  [0 0]  [0 0]  [1 1]  [0 0]  [1 1]  [1 1]  [1 0]  [0 0]  [1 1]  [1 1]  [1 1]  [1 1]  [0 0]  [1 1]  [0 0]  [0 0]  [0 1]  [1 1]  [0 1]  [0 0]  [1 1]  [0 0]  [1 1]  [1 1]  [0 0]  [0 0]  [1 0]  [0 0]  [1 1]  [1 1]  [0 0]  [0 0]  [1 0]  [0 0]  [1 0]  [0 0]  [1 1]  [0 0]  [0 0]  [1 1]  [0 0]  [0 0]  [0 0]  [0 0]]```

8. Making the Confusion Matrix

A confusion matrix is a table (see below) that is often used to describe the performance of a classification model on a set of test data for which the true values are known.

Legend (scikit-learn lays the matrix out as [[TN, FP], [FN, TP]], with rows as actual values and columns as predicted values):

• TN - True Negative

• FP - False Positive

• FN - False Negative

• TP - True Positive

```from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)```

Output:

```Kernel(linear):
[[42  6]  [ 7 25]]
accuracy_score : 0.8375
Kernel(rbf):
[[39  9]  [ 2 30]]
accuracy_score : 0.8625```
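The two accuracy scores can be cross-checked from the confusion-matrix counts alone: accuracy is the main diagonal (correct predictions) divided by the total number of test samples. A quick sketch:

```python
# Reported confusion matrices, in scikit-learn's [[TN, FP], [FN, TP]] layout.
linear_cm = [[42, 6], [7, 25]]
rbf_cm = [[39, 9], [2, 30]]

def accuracy_from_cm(cm):
    correct = cm[0][0] + cm[1][1]        # main diagonal = correct predictions
    total = sum(sum(row) for row in cm)  # all 80 test samples
    return correct / total

print(accuracy_from_cm(linear_cm))  # 0.8375
print(accuracy_from_cm(rbf_cm))     # 0.8625
```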

Observation:

In the linear model, 42 predictions are true negatives and 25 are true positives, while 6 are false positives and 7 are false negatives. So only 13 of the 80 test samples were classified incorrectly, giving an accuracy of 67/80 = 0.8375.

In the rbf model, 39 predictions are true negatives and 30 are true positives, while 9 are false positives and 2 are false negatives. So only 11 samples were classified incorrectly, giving an accuracy of 69/80 = 0.8625.

Based on the confusion matrices and accuracy scores, the rbf kernel is the better model on this dataset.

Please note there is only one difference between the rbf and linear models: I change the kernel parameter in the SVC class, and the rest of the code stays the same.

I hope you will be able to create both models using this blog. Happy reading....