Day 99 of 100DaysOfCode: Centering and Scaling in a Pipeline

Durga Pokharel
A mathematics student learning to code.

This is day 99 of my #100daysofcode and #python learning journey. I am almost at the finish line, and I feel like a champion. As for today's progress, I kept learning from DataCamp and did some exercises there. I also wrote some code on a random topic.

Centering and Scaling in a Pipeline

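# Import the necessary modules
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split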

# Setup the pipeline steps: steps
steps = [('scaler', StandardScaler()),
        ('knn', KNeighborsClassifier())]

# Create the pipeline: pipeline
pipeline = Pipeline(steps)

# Create train and test sets (X and y are assumed to be loaded already)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit the pipeline to the training set: knn_scaled
knn_scaled = pipeline.fit(X_train, y_train)

# Instantiate and fit a k-NN classifier to the unscaled data
knn_unscaled = KNeighborsClassifier().fit(X_train, y_train)

# Compute and print metrics
print('Accuracy with Scaling: {}'.format(knn_scaled.score(X_test, y_test)))
print('Accuracy without Scaling: {}'.format(knn_unscaled.score(X_test, y_test)))

The output of the above code is:

Accuracy with Scaling: 0.7700680272108843
Accuracy without Scaling: 0.6979591836734694
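
To get a feel for why scaling helps k-NN, here is a small sketch in plain Python. It is not part of the DataCamp exercise: the toy data and the helper names `standardize` and `share_of_first_feature` are made up for illustration. It centers each feature to mean 0 and scales it to unit standard deviation (population standard deviation, which is what StandardScaler uses), then checks how much each feature contributes to the squared Euclidean distance between two samples.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Center a column to mean 0 and scale it to unit (population) std."""
    mu = fmean(column)
    sigma = pstdev(column)
    return [(value - mu) / sigma for value in column]


def share_of_first_feature(p, q):
    """Fraction of the squared Euclidean distance that comes from feature 0."""
    d0 = (p[0] - q[0]) ** 2
    d1 = (p[1] - q[1]) ** 2
    return d0 / (d0 + d1)


# Hypothetical toy data: two features on very different scales
feature_a = [0.2, 0.4, 0.6, 0.8]          # small range
feature_b = [100.0, 300.0, 200.0, 400.0]  # large range

scaled_a = standardize(feature_a)
scaled_b = standardize(feature_b)

raw_points = list(zip(feature_a, feature_b))
scaled_points = list(zip(scaled_a, scaled_b))

# Compare the first two samples before and after scaling
raw_share = share_of_first_feature(raw_points[0], raw_points[1])
scaled_share = share_of_first_feature(scaled_points[0], scaled_points[1])

print('Share of feature_a in distance (raw): {}'.format(raw_share))
print('Share of feature_a in distance (scaled): {}'.format(scaled_share))
```

On the raw data, feature_a barely affects the distance because feature_b has much larger values. After standardizing, both features are on a comparable scale, so the nearest neighbours are no longer decided by one feature alone. This is probably the reason the scaled pipeline scored higher above.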
