
Ankita Sahoo

Day-19 of Machine Learning:

I. Basic template of TensorFlow implementation:

1. Construct the network

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential(
    [
        tf.keras.Input(shape=(400,)),    # specify input size
        Dense(25, activation='sigmoid'),
        Dense(15, activation='sigmoid'),
        Dense(1,  activation='sigmoid')
    ], name="my_model"
)


A Keras Sequential model with Dense layers and sigmoid activations.

2. Specify the loss function

model.compile(
    loss=tf.keras.losses.BinaryCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(0.001),
)


Here, for binary classification, BinaryCrossentropy() is used. For linear regression we would use MeanSquaredError() instead.
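
For example, a minimal sketch of compiling the same kind of model for a regression task (assuming its last layer is Dense(1, activation='linear')):

# Sketch: compiling for regression instead of classification
# (assumes the output layer is Dense(1, activation='linear')).
model.compile(
    loss=tf.keras.losses.MeanSquaredError(),
    optimizer=tf.keras.optimizers.Adam(0.001),
)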

3. Run gradient descent to fit the weights of the model to the training data

model.fit(
    X,y,
    epochs=20
)

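After training, the model can be used to make predictions. A minimal sketch, where X_new is a hypothetical batch of new examples with 400 features each:

# Sketch: predicting with the trained model (X_new is a hypothetical
# array of new examples, each with 400 features).
probabilities = model.predict(X_new)               # sigmoid outputs in (0, 1)
predictions = (probabilities >= 0.5).astype(int)   # threshold at 0.5 for class labels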

II. Got to know about different Activation functions

- Linear Activation:

Activation a = g(Z) = Z
where Z = W.X + b
Output y can be any real number (positive or negative)

- Sigmoid Activation:

Activation a = g(Z) = 1 / (1 + e^(-Z))
Output y lies between 0 and 1 and can be interpreted as a probability, i.e. binary classification

- ReLU Activation (Rectified Linear Unit):

Activation a = g(Z) = max(0, Z)
Output y is any non-negative real number
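
A minimal NumPy sketch of these three activations (illustrative only; in Keras they are selected through a layer's activation argument):

import numpy as np

def linear(z):
    # Linear activation: g(z) = z
    return z

def sigmoid(z):
    # Sigmoid activation: g(z) = 1 / (1 + e^(-z)), output in (0, 1)
    return 1 / (1 + np.exp(-z))

def relu(z):
    # ReLU activation: g(z) = max(0, z), output >= 0
    return np.maximum(0, z)

z = np.array([-2.0, 0.0, 3.0])
print(linear(z), sigmoid(z), relu(z))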


III. How to choose Activation?
We can choose a different activation for each layer of a neural network, depending on the requirements and the goal of the network. However, some recommendations are:

  • A neural network with many layers but no activation function is not effective; a network that uses only linear activations is equivalent to one with no activation function at all, since stacking linear layers still gives a linear function.
  • ReLU is used more often than sigmoid. Firstly, ReLU is a bit faster, since it does less computation (just max(0, Z)) than sigmoid, which requires an exponential and a division. Secondly, gradient descent slows down where the activation is flat, and ReLU is flat in only one region (Z < 0) whereas sigmoid is flat in two places (both tails).
  • Use ReLU instead of linear activation in hidden layers (see the sketch after this list).
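
Following these recommendations, the model from section I could be rewritten with ReLU in the hidden layers and a sigmoid output (a minimal sketch, keeping the same layer sizes and assuming the imports from section I):

# Sketch: same architecture as in section I, but with ReLU hidden layers
# and a sigmoid output for binary classification.
model = Sequential(
    [
        tf.keras.Input(shape=(400,)),
        Dense(25, activation='relu'),
        Dense(15, activation='relu'),
        Dense(1,  activation='sigmoid'),
    ], name="my_model_relu"
)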
