댑덥딥 Week 3 Summary

This study group works through the '모두를 위한 딥러닝 시즌 2' (Deep Learning for Everyone, Season 2) lectures.

Online, 19 April 2023

07-1 Tips

maximum likelihood estimation

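likelihood (가능도 in Korean)

MLE: the process of finding the θ that maximizes the likelihood f(θ), i.e., the θ that best explains the observations

ex) for n Bernoulli trials with k successes, f(θ) = C(n, k) · θ^k · (1 − θ)^(n − k)

(n and k are obtained from the observations)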

optimization via gradient descent

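Used to find the maximum of f(θ): move θ step by step along the gradient (gradient ascent), which is equivalent to gradient descent on −f(θ)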
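As a rough illustration of the idea, the sketch below maximizes the log of a Bernoulli likelihood by gradient ascent. The function names, the learning rate, the step count, and the observation (n = 100 trials, k = 27 successes) are all made up for this example and do not come from the lecture.

```python
def log_likelihood_grad(theta: float, n: int, k: int) -> float:
    """Derivative of log f(theta) = const + k*log(theta) + (n - k)*log(1 - theta)."""
    return k / theta - (n - k) / (1.0 - theta)


def fit_bernoulli_mle(n: int, k: int, lr: float = 0.001, steps: int = 500) -> float:
    """Estimate theta by gradient ascent on the log-likelihood."""
    theta = 0.5  # initial guess
    for _ in range(steps):
        theta += lr * log_likelihood_grad(theta, n, k)  # ascent: move uphill
        theta = min(max(theta, 1e-6), 1.0 - 1e-6)  # keep theta inside (0, 1)
    return theta


# Hypothetical observation: 27 successes in 100 trials.
theta_hat = fit_bernoulli_mle(n=100, k=27)
print(round(theta_hat, 4))
```

For this distribution the maximizer is also known in closed form (k / n), so the iterative result can be checked against it.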


overfitting and regularization

MLE is inherently prone to overfitting

overfitting: the model is fitted too closely to the given data

-desired fit: the blue line in the accompanying figure

  • ways to reduce overfitting

1)more data

2)fewer features

3)regularization

  • types of regularization

1)early stopping: stop training once the validation loss no longer decreases

2)reducing network size

3)weight decay

4)dropout

5)batch normalization⭐

-2)~5) are used in deep learning
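Early stopping from the list above can be sketched in a few lines. This is only an illustration: the `patience` parameter, the function name, and the validation losses are assumptions made up for the example, not values from the lecture.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: all epochs were run


# Made-up validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.60]
stop_epoch = early_stopping_epoch(val_losses, patience=2)
print(stop_epoch)  # → 5
```

In practice one would also keep the model weights from the best epoch (epoch 3 here) rather than the weights at the stopping point.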

training and test dataset

: one way to detect and limit overfitting

Use the dev set (validation set) to check whether the model has overfitted the training set (optional) → confirm the final performance on the test set
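The train / dev / test split can be sketched as follows. The 80/10/10 ratios, the fixed seed, and the function name are assumptions chosen for illustration; the lecture does not prescribe them.

```python
import random


def split_dataset(samples, dev_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle a copy of `samples` and cut it into train/dev/test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    n_dev = round(len(shuffled) * dev_ratio)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test


# Toy "dataset" of 1000 sample ids.
train, dev, test = split_dataset(range(1000))
print(len(train), len(dev), len(test))
```

The three parts never overlap, so a score on the test set is not influenced by samples the model was trained or tuned on.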

Basic Approach to Train DNN

①make a neural network architecture

②train and check whether the model is over-fitted

  1. if it is not, increase the model size (deeper and wider)
  2. if it is, add regularization, such as drop-out or batch-normalization

③repeat from step ②

learning rate

If the learning rate is too large, the cost keeps growing (diverges)

If the learning rate is too small, the cost barely decreases
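Both failure modes can be reproduced on a toy problem. The sketch below runs plain gradient descent on cost(w) = w², with three learning rates picked only for illustration.

```python
def final_cost(lr, steps=20, w=1.0):
    """Run gradient descent on cost(w) = w**2 and return the last cost."""
    for _ in range(steps):
        grad = 2.0 * w     # d(cost)/dw
        w = w - lr * grad  # gradient descent update
    return w ** 2


for lr in (1.5, 0.1, 0.0001):
    print(f"lr={lr}: cost after 20 steps = {final_cost(lr):.6g}")
```

With lr=1.5 every update overshoots and doubles |w|, so the cost explodes. With lr=0.0001 the cost stays close to its starting value of 1. Only lr=0.1 brings it near zero within 20 steps.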

data preprocessing

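1)standardization: rescale each feature (column) to zero mean and unit standard deviation

mu = x_train.mean(dim=0)
sigma = x_train.std(dim=0)
norm_x_train = (x_train - mu) / sigma
Without preprocessing? If the columns of y_train differ greatly in scale, the column with the smaller values is almost ignored during training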
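To make the effect of standardization concrete, here is a library-free sketch of the same per-column computation on a tiny made-up table. It uses the sample standard deviation (dividing by n − 1), which is also the default of `torch.std`; the function name and the data are assumptions for this example.

```python
from statistics import mean, stdev


def standardize_columns(rows):
    """Subtract each column's mean and divide by its sample standard deviation."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [stdev(col) for col in columns]  # sample std (n - 1)
    return [
        [(x - mu) / sigma for x, mu, sigma in zip(row, mus, sigmas)]
        for row in rows
    ]


# Made-up data: the second column is about 1000 times larger than the first.
x_train = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0]]
norm_x_train = standardize_columns(x_train)
print(norm_x_train)  # → [[-1.0, -1.0], [0.0, 1.0], [1.0, 0.0]]
```

After the rescaling both columns cover the same range, so neither dominates purely because of its units.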

07-2 MNIST

MNIST: handwritten digits dataset (training set of 60,000 images + test set of 10,000 images)

-size: 28x28

-1-channel grayscale image

-digits 0~9

① torchvision

: a package consisting of popular datasets, model architectures, and transforms

import torchvision.datasets as dsets
import torchvision.transforms as transforms

mnist_train = dsets.MNIST(root="MNIST_data/", train=True, transform=transforms.ToTensor(), download=True)
mnist_test = dsets.MNIST(root="MNIST_data/", train=False, transform=transforms.ToTensor(), download=True)
-PyTorch image: (channel, height, width) order vs. a typical image: (height, width, channel) order → transforms.ToTensor() does the conversion and scales pixel values from 0~255 to 0.0~1.0
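The axis reordering can be shown on a tiny image without any library. This is a hand-written stand-in for what the conversion does conceptually, not torchvision's implementation; the function name and the 2x2 image are made up.

```python
def hwc_to_chw(image):
    """Reorder a nested-list image from [H][W][C] to [C][H][W], scaled to 0.0~1.0."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    return [
        [[image[h][w][c] / 255.0 for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# A made-up 2x2 RGB image: each pixel is [R, G, B] with values in 0~255.
hwc = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]
chw = hwc_to_chw(hwc)
print(chw[0])  # red channel as a 2x2 grid → [[1.0, 0.0], [0.0, 1.0]]
```

For MNIST there is a single channel, so each converted image has shape 1 x 28 x 28.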

② load the data with torch.utils.data.DataLoader (see the 댑덥딥 Week 2 summary)
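What a mini-batch loader does can be sketched without PyTorch. The generator below is a simplified stand-in, not the DataLoader implementation; its `batch_size`, `shuffle`, and `drop_last` arguments mirror DataLoader options of the same names, while `seed` and the toy dataset are assumptions added for the example.

```python
import random


def iterate_minibatches(dataset, batch_size, shuffle=True, drop_last=True, seed=None):
    """Yield lists of samples, optionally shuffled, `batch_size` at a time."""
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = [dataset[i] for i in indices[start:start + batch_size]]
        if drop_last and len(batch) < batch_size:
            break  # skip the final, incomplete batch
        yield batch


toy_dataset = list(range(250))
batches = list(iterate_minibatches(toy_dataset, batch_size=100, seed=0))
print(len(batches), [len(b) for b in batches])  # → 2 [100, 100]
```

With 60,000 MNIST training images and a batch size of 100, one epoch consists of 600 such batches.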

③ size: 28x28 → flatten each image to 784 values with view()

for epoch in range(training_epochs):
    for X, Y in data_loader:
        X = X.view(-1, 28 * 28).to(device)

  • full code
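import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# parameters
training_epochs = 15
batch_size = 100

# mini-batch loader over mnist_train (defined in step ①)
data_loader = torch.utils.data.DataLoader(dataset=mnist_train, batch_size=batch_size, shuffle=True, drop_last=True)

# MNIST data image of shape 28 * 28 = 784 -> 10 class scores (softmax classifier)
linear = torch.nn.Linear(784, 10, bias=True).to(device)

# define cost/loss & optimizer
criterion = torch.nn.CrossEntropyLoss().to(device)  # Softmax is internally computed.
optimizer = torch.optim.SGD(linear.parameters(), lr=0.1)

for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = len(data_loader)
    for X, Y in data_loader:
        # reshape input image into [batch_size by 784]
        # label is not one-hot encoded
        X = X.view(-1, 28 * 28).to(device)
        Y = Y.to(device)
        optimizer.zero_grad()  # clear gradients from the previous step
        hypothesis = linear(X)
        cost = criterion(hypothesis, Y)
        cost.backward()  # compute gradients
        optimizer.step()  # update the weights
        avg_cost += cost.item() / total_batch
    print("Epoch: ", "%04d" % (epoch + 1), "cost =", "{:.9f}".format(avg_cost))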

# Test the model using test sets
with torch.no_grad():
    # raw uint8 pixels (0~255) flattened to 784 floats per image
    X_test = mnist_test.data.view(-1, 28 * 28).float().to(device)
    Y_test = mnist_test.targets.to(device)
    prediction = linear(X_test)
    correct_prediction = torch.argmax(prediction, 1) == Y_test
    accuracy = correct_prediction.float().mean()
    print("Accuracy: ", accuracy.item())

# Show one random test image with its label and the model's prediction
import matplotlib.pyplot as plt
import random

r = random.randint(0, len(mnist_test) - 1)
X_single_data = mnist_test.data[r:r + 1].view(-1, 28 * 28).float().to(device)
Y_single_data = mnist_test.targets[r:r + 1].to(device)
print("Label: ", Y_single_data.item())
single_prediction = linear(X_single_data)
print("Prediction: ", torch.argmax(single_prediction, 1).item())
plt.imshow(mnist_test.data[r:r + 1].view(28, 28), cmap="Greys", interpolation="nearest")
plt.show()
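The code above feeds raw class scores and integer labels to the loss, then measures accuracy with argmax. A library-free sketch of those two computations may help show what happens to the numbers. The scores and labels are made up, and this is a conceptual stand-in for softmax cross-entropy with mean reduction, not PyTorch's implementation.

```python
import math


def cross_entropy(logits, label):
    """Softmax followed by negative log-likelihood for one sample."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up raw scores for 3 samples and 3 classes.
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.5, 0.0]]
labels = [0, 2, 0]  # integer class indices, not one-hot vectors

mean_loss = sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels)) / len(labels)
accuracy = sum(argmax(z) == y for z, y in zip(batch_logits, labels)) / len(labels)
print(mean_loss, accuracy)
```

The third sample is misclassified (its highest score is for class 1), so the accuracy is 2/3.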
In-person, 22 April 2023

tips and torchvision

