The objective is to find the widest street that separates the two classes.
Imagine a vector perpendicular to the median line to the street.
SVM optimization doesn’t get stuck in local maximum, it has a convex base unlike Neural nets which could often get stuck in local maxima.
For linearly separable points, SVM works fine, but for nonlinearly separable points, you need to do a transformation to project these points to higher dimensional space, so that they can get linearly separable.
- A general method that is convex and guaranteed to produce a global solution.
- Small sigma in gaussian kernel can cause overfitting because then classification is shrunk right around the sample points.
- For handwritten character recognition, linear kernel with n=2(nonlinear) works good.