Satwik Kansal
Sklearn: how to create a pipeline for a Keras classifier

Fitting a Keras classifier isn't as straightforward as it is for sklearn classifiers (like MLPClassifier). After some struggle through the docs and GitHub issues, I figured out a reusable solution and thought I'd share it here.

Here's what my Keras classification model looks like. I'll wrap it in a function and add a few comments:

# Making an MLP with Keras for classification - it predicts the
# probability of each of the 4 classes.
import tensorflow as tf
from tensorflow.keras.optimizers import SGD


def get_clf_model(input_size, output_size):
    # Use stochastic gradient descent optimizer
    sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
    # Building the model layer by layer
    model = tf.keras.Sequential([
        # First layer
        tf.keras.layers.Dense(128, activation='relu', input_shape=[input_size],
                              kernel_regularizer='l2'),
        # Add dropout layer to reduce overfitting
        tf.keras.layers.Dropout(0.5),
        # Second layer
        tf.keras.layers.Dense(16, activation='sigmoid', kernel_regularizer='l2'),
        # Another dropout layer
        tf.keras.layers.Dropout(0.5),
        # Final layer; softmax outputs a probability for every class
        # (the classes are mutually exclusive, so softmax, not sigmoid,
        # pairs correctly with categorical_crossentropy)
        tf.keras.layers.Dense(output_size, kernel_initializer='he_uniform',
                              activation='softmax'),
    ])

    model.compile(loss='categorical_crossentropy',
                  optimizer=sgd,
                  metrics=['accuracy'])
    model.summary()  # summary() prints itself, no need to wrap it in print()
    return model

The loss function used is categorical_crossentropy, which expects the Y-labels in one-hot vector form. The problem is that the feature selectors, normalizers, and other transformers expect the Y-labels as a simple array.
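To make the mismatch concrete, to_categorical converts integer-encoded labels into one-hot rows. A minimal sketch, assuming 4 classes as in my model:

import numpy as np
from tensorflow.keras.utils import to_categorical

y = np.array([0, 2, 1, 3])    # simple array form, what sklearn transformers work with
y_onehot = to_categorical(y)  # one-hot form, what categorical_crossentropy expects
print(y_onehot)
# [[1. 0. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 1. 0. 0.]
#  [0. 0. 0. 1.]]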

Apparently, I can't just write another transformer extending the BaseEstimator or TransformerMixin classes, since the API only supports transformation of features and not labels.
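To see why, here's a sketch of the transformer API (the OneHotLabels class is hypothetical): the pipeline only passes Y to each step's fit, and transform receives and returns X alone, so a transformed Y can never reach the next step.

from sklearn.base import BaseEstimator, TransformerMixin

class OneHotLabels(BaseEstimator, TransformerMixin):  # hypothetical, doesn't help
    def fit(self, X, y=None):
        # y is visible here, but only for fitting
        return self

    def transform(self, X):
        # Only X flows through the pipeline; there is no way to
        # return a one-hot encoded y from this method.
        return X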

The solution that worked for me was breaking the pipeline in two to fit my classifier, and then re-merging it for later use. Here's how it looks in code:

# Preprocessing & training for the Keras classifier
import numpy as np
from sklearn import preprocessing
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import Pipeline
from sklearn.utils import class_weight
from tensorflow.keras.utils import to_categorical

n_features = X_train.shape[1] // 2

# We need to break the pipeline into two steps because the Keras model
# expects the Y-values in one-hot vector form.
steps = [
    ('scaling', preprocessing.PowerTransformer()),  # Scaling
    ('feature_selection', SelectKBest(score_func=f_regression, k=n_features)),  # Feature selection
]

prep_pipeline = Pipeline(steps, verbose=True)
X_train_p = prep_pipeline.fit_transform(X_train, Y_train_lab)

# Add class weights for the imbalanced dataset
class_weights = class_weight.compute_class_weight(class_weight='balanced',
                                                  classes=np.unique(Y_train_lab),
                                                  y=Y_train_lab)
# `le` is the LabelEncoder fitted on the labels (Y_train_lab = le.transform(...))
clf = get_clf_model(n_features, len(le.classes_))
clf.fit(X_train_p, to_categorical(Y_train_lab), epochs=250,
        class_weight=dict(enumerate(class_weights)))

# Re-merge the preprocessing steps and the fitted classifier into one pipeline
pipeline = Pipeline([
    *prep_pipeline.steps,
    ('clf', clf),
])

# Now we can make predictions with the whole pipeline
pipeline.predict(X_val)

Of course, the above hack works because prediction has nothing to do with the Y-labels.
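If you need label names instead of probabilities, you can take the argmax over the predicted probabilities and invert the encoding. A small sketch, assuming le is the LabelEncoder from above:

import numpy as np

probs = pipeline.predict(X_val)  # shape: (n_samples, n_classes)
pred_labels = le.inverse_transform(np.argmax(probs, axis=1))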
