Different approaches for applying SVM in Keras

Issue

I want to build a multi-class classification model using Keras.
My data contains 7 features and 4 labels (classes).
With Keras, I have seen two ways to apply the Support Vector Machine (SVM) algorithm.

First:
A Quasi-SVM in Keras.
Using the RandomFourierFeatures layer presented here, I have built the following model:

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import RandomFourierFeatures

def create_keras_model():
    initializer = tf.keras.initializers.GlorotNormal()
    return tf.keras.models.Sequential([
        layers.Input(shape=(7,)),
        # random Fourier features approximate a kernel map of the 7 inputs
        RandomFourierFeatures(output_dim=4822, kernel_initializer=initializer),
        layers.Dense(units=4, activation='softmax'),
    ])

Second:
Using the last layer of the network, as described here:

import tensorflow as tf
from tensorflow.keras.regularizers import l2

def create_keras_model():
    return tf.keras.models.Sequential([
        tf.keras.layers.Input(shape=(7,)),
        tf.keras.layers.Dense(64),
        # L2-regularized linear layer acting as the SVM-style output
        tf.keras.layers.Dense(4, kernel_regularizer=l2(0.01)),
        tf.keras.layers.Softmax()
    ])

Note: CategoricalHinge() was used as the loss function for both models.
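
For context, a minimal compile step consistent with that note might look like the sketch below (the Adam optimizer and accuracy metric are illustrative assumptions, not something stated above):

import tensorflow as tf

model = create_keras_model()  # either of the two definitions above
model.compile(
    optimizer='adam',                         # assumption: the optimizer is not specified in the question
    loss=tf.keras.losses.CategoricalHinge(),  # the hinge-style loss mentioned in the note
    metrics=['accuracy'],
)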
My question is: are these approaches appropriate, and can they be described as applying an SVM model, or are they just approximations of the model architecture? In short, can I say this is applying an SVM model?

Solution

You can compare the two models on your own data as shown below.

I checked both approaches on the MNIST dataset and got the following results:

  1. Less overfitting with the second approach
  2. Faster training with the first approach
  3. Fewer trainable parameters with the first approach
  4. Roughly the same test accuracy for both approaches

Benchmark code:

from keras.utils.layer_utils import count_params  
import matplotlib.pyplot as plt
import tensorflow as tf
import seaborn as sns
import pandas as pd
import time


def create_model(approach):

    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(784,)))
    if approach == 'Quasi_SVM':
        # fixed (non-trainable) random Fourier feature map, followed by a linear classifier
        model.add(tf.keras.layers.experimental.RandomFourierFeatures(
            output_dim=4096, scale=10.0,
            kernel_initializer="gaussian"))
        model.add(tf.keras.layers.Dense(10))


    if approach == 'kernel_regularizer':
        # plain MLP with an L2-regularized (SVM-style) output layer
        model.add(tf.keras.layers.Dense(128, activation='relu'))
        model.add(tf.keras.layers.Dense(64, activation='relu'))
        model.add(tf.keras.layers.Dense(32, activation='relu'))
        model.add(tf.keras.layers.Dense(16, activation='relu'))
        model.add(tf.keras.layers.Dense(10,
                                        kernel_regularizer=tf.keras.regularizers.l2(0.01),
                                        activation='softmax'))
    

    model.compile(
        optimizer='adam',
        loss='hinge',  # hinge loss gives both variants their SVM-like objective
        metrics=['accuracy'],
    )

    return model


(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

x_train = x_train.reshape(-1, 784).astype("float32") / 255
x_test = x_test.reshape(-1, 784).astype("float32") / 255

y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)

for approach in ['Quasi_SVM', 'kernel_regularizer']:

    model = create_model(approach)
    start = time.time()
    history = model.fit(x_train, y_train, epochs=30, batch_size=128, validation_split=0.2)
    print(f'Training time {approach} : {time.time() - start} sec')
    print(f'Trainable params {approach} : {count_params(model.trainable_weights)}')
    print(f'Accuracy on x_test {approach} : {model.evaluate(x_test, y_test, verbose=0)[1]}')

    
    df = pd.DataFrame(history.history).rename_axis('epoch').reset_index().melt(id_vars=['epoch'])
    fig, axes = plt.subplots(1,2, figsize=(18,6))
    for ax, mtr in zip(axes.flat, ['loss', 'accuracy']):
        ax.set_title(f'{approach} {mtr.title()} Plot')
        dfTmp = df[df['variable'].str.contains(mtr)]
        sns.lineplot(data=dfTmp, x='epoch', y='value', hue='variable', ax=ax)

    fig.tight_layout()
    plt.show()

Output (benchmark run on Colab):

Training time Quasi_SVM : 43.78484082221985 sec
Trainable params Quasi_SVM : 40970
Accuracy on x_test Quasi_SVM : 0.9729999899864197
Training time kernel_regularizer : 45.47012114524841 sec
Trainable params kernel_regularizer : 111514
Accuracy on x_test kernel_regularizer : 0.972100019454956
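
The trainable-parameter counts can be sanity-checked by hand: in the Quasi_SVM model the RandomFourierFeatures layer is non-trainable, so only the final Dense(10) layer counts, while the kernel_regularizer model trains every Dense layer. A quick arithmetic sketch (my addition, not part of the original answer):

# Quasi_SVM: only the Dense(10) on top of the 4096 random features is trained
quasi_svm_params = 4096 * 10 + 10            # weights + biases = 40,970

# kernel_regularizer: every Dense layer is trained
layer_sizes = [784, 128, 64, 32, 16, 10]
regularizer_params = sum(
    n_in * n_out + n_out                     # weights + biases per layer
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
)                                            # = 111,514

print(quasi_svm_params, regularizer_params)  # 40970 111514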

[Loss and accuracy plots for the Quasi_SVM approach]

[Loss and accuracy plots for the kernel_regularizer approach]

Answered By – I'mahdi

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
