SeSAC AI Data Engineer Fintechers, Week 11 (Fri) - PyTorch Tutorial

2023. 11. 17. 16:24


2023-11-17 52nd Class

Random Dataset

#️⃣ scikit-learn random dataset

The two main kinds of random dataset you can generate with scikit-learn are:

A linearly separable dataset (make_blobs)
A dataset that is only non-linearly separable (make_moons)

from sklearn.datasets import make_blobs, make_moons
import matplotlib.pyplot as plt


def get_binary_linear_dataset():
    n_samples = 100  
    X, y = make_blobs(n_samples=n_samples, centers=2,  
                      n_features=2, cluster_std=0.5)  
  
    fig, ax = plt.subplots(figsize=(5, 5))  
  
    X_pos, X_neg = X[y == 1], X[y == 0]  
    ax.scatter(X_pos[:, 0], X_pos[:, 1], color='blue')  
    ax.scatter(X_neg[:, 0], X_neg[:, 1], color='red')  
    ax.tick_params(labelsize=15)  
    # ax.scatter(X[:, 0], X[:, 1], c=y)  
    fig.tight_layout()  
    plt.show()  
  
    return X, y  
  
  
def get_binary_moon_dataset():  
    n_samples = 300  
    X, y = make_moons(n_samples=n_samples, noise=0.2)  
  
    X_pos, X_neg = X[y == 1], X[y == 0]  
  
    fig, ax = plt.subplots(figsize=(5, 5))  
    ax.scatter(X_pos[:, 0], X_pos[:, 1], color='blue')  
    ax.scatter(X_neg[:, 0], X_neg[:, 1], color='red')  
    ax.tick_params(labelsize=15)  
    fig.tight_layout()  
    plt.show()  
  
    return X, y

get_binary_linear_dataset()  
get_binary_moon_dataset()

[Figures: scatter plot of the linear dataset | scatter plot of the moon dataset]

PyTorch Tutorial

#️⃣ DataLoader & nn.Linear & Sigmoid

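DataLoader: yields the data one batch (batch_size samples) at a time
Running the whole dataset as a single batch makes the computation heavy (load on the CPU and GPU)
e.g. with 100 samples and a batch size of 32, training proceeds in mini-batches of 32, 32, 32, and 4
(Image) The last batch trains only on the samples left over after the full batches of 32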
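The 32, 32, 32, 4 split above can be checked with a few lines of plain Python. This is only a sketch of the arithmetic: `batch_sizes` is a hypothetical helper, not part of PyTorch, and its `drop_last` argument mirrors the DataLoader option of the same name (which defaults to False, so the smaller last batch is kept).

```python
def batch_sizes(n_samples, batch_size, drop_last=False):
    """Return the size of every mini-batch for one pass over the data."""
    full, remainder = divmod(n_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # the last, smaller batch
    return sizes


print(batch_sizes(100, 32))                  # [32, 32, 32, 4]
print(batch_sizes(100, 32, drop_last=True))  # [32, 32, 32]
```

With `drop_last=True` the 4 leftover samples would simply be skipped in that epoch.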

DataLoader code

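import torch
from torch.utils.data import TensorDataset, DataLoader
from sklearn.datasets import make_blobs, make_moons


def get_linear_dataloader():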
    n_samples = 100  
    X, y = make_blobs(n_samples=n_samples, centers=2,  
                          n_features=2, cluster_std=0.7)  
  
    dataset = TensorDataset(torch.FloatTensor(X), torch.FloatTensor(y))  
  
    BATCH_SIZE = 8  
    dataloader = DataLoader(dataset, batch_size=BATCH_SIZE)  
  
    for X_, y_ in dataloader:  
        print(type(X_), X_.shape, X_.dtype)  
        print(type(y_), y_.shape, y_.dtype)  
        break  
  
  
def get_moon_dataloader():  
    n_samples = 300  
    BATCH_SIZE = 16  
    X, y = make_moons(n_samples=n_samples, noise=0.2)  
  
    dataset = TensorDataset(torch.FloatTensor(X), torch.FloatTensor(y))  
    dataloader = DataLoader(dataset, batch_size=BATCH_SIZE)  
  
    for X_, y_ in dataloader:  
        print(type(X_), X_.shape, X_.dtype)  
        print(type(y_), y_.shape, y_.dtype)  
        break

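nn.Linear: fully connected layer
- A fully connected layer is one in which every input neuron is connected to every output neuron
- in_features: dimension of the input data
- out_features: dimension of the output data
- ★in_features (the input dimension) equals the number of weights held by each neuron in the layer★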

nn.Linear code

import torch
import torch.nn as nn


def linear_shape():
    fc = nn.Linear(in_features=8, out_features=4)  
    print(fc.weight.shape) # torch.Size([4, 8])  
    print(fc.bias.shape) # torch.Size([4])  
  
  
def linear_shape2():  
    test_input = torch.randn(size=(16, 8))  
    fc = nn.Linear(in_features=8, out_features=4)  
    test_output = fc(test_input)  
  
    print(f"test input: {test_input.shape}") # test input: torch.Size([16, 8])  
    print(f"test output: {test_output.shape}") # test output: torch.Size([16, 4])

★How to count the parameters of nn.Linear★

import torch.nn as nn
fc = nn.Linear(in_features=8, out_features=4)

With the layer defined as above, a batch of 16 samples goes in as 16x8 and comes out as 16x4

Neurons	weight	bias	total
1	8	1	9
All (4)	32	4	36

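Single neuron

Weights per neuron: 8 (because 8 nodes feed in from the previous layer)
Biases per neuron: 1

Total number of neurons: 4

Weights across all neurons: 8 (weights per neuron) x 4 (neurons) = 32
Biases across all neurons: 1 (bias per neuron) x 4 (neurons) = 4

Total parameters in this layer (all neurons): 32 + 4 = 36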
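The counting rule above can be written as a one-line formula, in_features x out_features + out_features. A minimal sketch in plain Python (`count_linear_params` is a hypothetical helper name, not a PyTorch function):

```python
def count_linear_params(in_features, out_features, bias=True):
    """Number of learnable parameters in one fully connected layer."""
    weights = in_features * out_features  # one weight per (input, neuron) pair
    biases = out_features if bias else 0  # one bias per neuron
    return weights + biases


print(count_linear_params(8, 1))  # 9  (a single neuron)
print(count_linear_params(8, 4))  # 36 (the whole layer)
```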

code

def linear_shape():  
    fc = nn.Linear(in_features=8, out_features=4)  
    print(fc.weight.shape) # torch.Size([4, 8])  
    print(fc.bias.shape) # torch.Size([4])  
  
  
def linear_shape2():  
    test_input = torch.randn(size=(16, 8))  
    fc = nn.Linear(in_features=8, out_features=4)  
    test_output = fc(test_input)  
  
    print(f"test input: {test_input.shape}") # test input: torch.Size([16, 8])  
    print(f"test output: {test_output.shape}") # test output: torch.Size([16, 4])

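First function (the layer as seen by a single sample)

The number of inputs (i.e. nodes in the previous layer) is 8, and the number of output nodes is 4
weight.shape: each output node holds a row of 8 weights, and there are 4 such rows -> 4x8 (a matrix with 4 rows and 8 columns)
bias.shape: each output node holds 1 bias, and there are 4 output nodes -> (4,) (a vector)

Second function (16 samples)

The number of inputs (nodes in the previous layer) is still 8 and the number of output nodes is still 4; 16 is the number of samples in the batch
test_input.shape: (16x8) 16 samples with 8 values each
test_output.shape: (16x4) 16 samples with 4 values each
In other words, the 8 nodes were reduced to 4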
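To see where the (16, 8) -> (16, 4) shapes come from, here is a rough sketch of what a fully connected layer computes, written with nested lists instead of tensors so it runs without PyTorch. `linear_forward` is a hypothetical helper; the weight layout (out_features rows of in_features values) follows the 4x8 shape described above, and the random values are purely illustrative.

```python
import random


def linear_forward(x, weight, bias):
    """Compute x @ weight.T + bias with nested lists.

    x: (n_samples, in_features)
    weight: (out_features, in_features)
    bias: (out_features,)
    """
    return [
        [sum(w_i * x_i for w_i, x_i in zip(w_row, sample)) + b
         for w_row, b in zip(weight, bias)]
        for sample in x
    ]


random.seed(0)
n_samples, in_features, out_features = 16, 8, 4
x = [[random.gauss(0, 1) for _ in range(in_features)] for _ in range(n_samples)]
weight = [[random.gauss(0, 1) for _ in range(in_features)] for _ in range(out_features)]
bias = [0.0] * out_features

out = linear_forward(x, weight, bias)
print(len(x), len(x[0]))      # 16 8
print(len(out), len(out[0]))  # 16 4
```

Each output value is one neuron's weighted sum over the 8 inputs plus its bias, which is why the sample count (16) is untouched and only the feature dimension changes.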

#️⃣ Activation Function & Loss Function

Sigmoid function

Binary cross-entropy function
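For reference, the standard definitions of these two functions can be written as follows. The symbols are assumptions chosen for this note: z is the layer output, \hat{y} the predicted probability, and y the 0/1 label. Both forms agree with the "manual computation" lines in the code that follows.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}

\mathrm{BCE}(\hat{y}, y) = -\bigl[\, y \log \hat{y} + (1 - y) \log(1 - \hat{y}) \,\bigr]
```

For example, with \hat{y} = 0.8 and y = 1 the second term vanishes and the loss is -\log 0.8 \approx 0.2231.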

nn.Sigmoid(), nn.BCELoss() code

import torch
import torch.nn as nn

def after_sigmoid():  
    test_input = torch.randn(size=(2, 3))  
    sigmoid = nn.Sigmoid()  
    test_output = sigmoid(test_input)  
  
    print("======= Test Input ========")  
    print(test_input)  
  
    print("======= nn.Sigmoid Output ========")  
    print(test_output)  
  
    print("======= manual computation ========")  
    print(1 / (1 + torch.exp(-test_input)))  
  
    '''  
    ======= Test Input ========    
    tensor([[-0.1151, -0.4276,  1.0766],            
    [ 0.3289,  0.3325,  0.4018]])    
    ======= nn.Sigmoid Output ========    
    tensor([[0.4713, 0.3947, 0.7459],            
    [0.5815, 0.5824, 0.5991]])    
    ======= manual computation ========    
    tensor([[0.4713, 0.3947, 0.7459],            
    [0.5815, 0.5824, 0.5991]])    '''  
  
def after_bceloss():  
    test_pred = torch.tensor([0.8])  
    test_y = torch.tensor([1.])  
  
    loss_function = nn.BCELoss()  
    test_output = loss_function(test_pred, test_y)  
  
    print("======= Test Input ========")  
    print(f"{test_pred=}")  
    print(f"{test_y=}")  
    print("======= nn.BCELoss Output ========")
    print(f"{test_output=}")
    print("======= manual computation ========")
    print(-(test_y * torch.log(test_pred) + (1 - test_y) * torch.log(1 - test_pred)))

    '''
    ======= Test Input ========
    test_pred=tensor([0.8000])
    test_y=tensor([1.])
    ======= nn.BCELoss Output ========
    test_output=tensor(0.2231)    
    ======= manual computation ========    
    tensor([0.2231])   
    '''

#️⃣ Model

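Build a model that reduces a dataset with 10 features down to a single output feature

[Image: model architecture diagram (Pasted image 20231117115033.png)]

Implemented in PyTorch, the model architecture above looks like this: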

code

import torch
import torch.nn as nn
from sklearn.datasets import make_blobs


class Model(nn.Module):
    def __init__(self):  
        super(Model, self).__init__()  
        self.fc1 = nn.Linear(in_features=10, out_features=5)  
        self.fc1_activation = nn.Sigmoid()  
        self.fc2 = nn.Linear(in_features=5, out_features=2)  
        self.fc2_activation = nn.Sigmoid()  
        self.fc3 = nn.Linear(in_features=2, out_features=1)  
        self.fc3_activation = nn.Sigmoid()  
      
    def forward(self, x):  
        # Named z1, y1, ... here to make each step explicit while learning
        # Later code reuses a single name x to save memory
        z1 = self.fc1(x); print(f"{z1.shape=}")  
        y1 = self.fc1_activation(z1)  
  
        z2 = self.fc2(y1); print(f"{z2.shape=}")  
        y2 = self.fc2_activation(z2)  
  
        z3 = self.fc3(y2); print(f"{z3.shape=}")  
        logits = self.fc3_activation(z3); print(f"{logits.shape=}")  
        return logits  
  
  
def model_trial():  
    n_samples = 100  
    X, y = make_blobs(n_samples=n_samples, centers=2, n_features=10, cluster_std=0.7)  
    X = torch.FloatTensor(X)  
    print(f"{X.shape=}")  
  
    model = Model()  
    logits = model(X)  # call the module itself rather than .forward() directly
  
    '''  
    X.shape=torch.Size([100, 10])    
    z1.shape=torch.Size([100, 5])    
    z2.shape=torch.Size([100, 2])    
    z3.shape=torch.Size([100, 1])    
    logits.shape=torch.Size([100, 1])    
    '''

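nn.Linear(number of incoming nodes, number of outgoing nodes)
nn.Sigmoid(): does not change the shape
Starting from 10 features, the width shrinks to 5, then 2, then 1,
and the data that comes out of the final sigmoid has shape (number of samples, out_features)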
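The shape changes described above can be traced without running the model. The sketch below is plain Python, not PyTorch: `trace_shapes` and `count_params` are hypothetical helpers, and the `layers` list simply restates the (in_features, out_features) pairs of fc1, fc2, and fc3 from the model.

```python
layers = [(10, 5), (5, 2), (2, 1)]  # (in_features, out_features) of fc1, fc2, fc3


def trace_shapes(n_samples, layers):
    """Follow a (n_samples, features) shape through a stack of linear layers."""
    shape = (n_samples, layers[0][0])
    shapes = [shape]
    for in_features, out_features in layers:
        assert shape[1] == in_features, "layer input must match previous output"
        shape = (shape[0], out_features)  # sigmoid afterwards keeps this shape
        shapes.append(shape)
    return shapes


def count_params(layers):
    """Weights plus biases, summed over every layer."""
    return sum(i * o + o for i, o in layers)


print(trace_shapes(100, layers))  # [(100, 10), (100, 5), (100, 2), (100, 1)]
print(count_params(layers))       # 70
```

The parameter total is 55 + 12 + 3 = 70, using the same per-layer rule as the 8 -> 4 example earlier.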

#️⃣ Model Training

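Train a model that reduces a 100-sample dataset with 2 features down to a single output feature

Condition 1: set the batch size to 32 (using DataLoader)
Condition 2: use CUDA when a GPU is available
Condition 3: compute the model's loss and accuracy every epoch and plot them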

code

from sklearn.datasets import make_blobs  
from sklearn.datasets import make_moons  
import matplotlib.pyplot as plt  
  
import torch  
from torch.utils.data import TensorDataset  
from torch.utils.data import DataLoader  
import torch.nn as nn  
from torch.optim import SGD

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  
print(f"curr device = {DEVICE}")

# Generate data with make_blobs and train on it
class SimpleModel(nn.Module):  
    def __init__(self, n_features):  
        super(SimpleModel, self).__init__()  
        self.fc = nn.Linear(in_features=n_features, out_features=1)  
        self.activation = nn.Sigmoid()  
  
    def forward(self, x):  
        x = self.fc(x)  
        x = self.activation(x)  
        x = x.view(-1)  # flatten (B, 1) into a vector of shape (B,)
        return x  
  
def simple_model_training():  
    # Data  
    N_SAMPLES = 100  
    BATCH_SIZE = 32  
    n_features = 2  
    X, y = make_blobs(n_samples=N_SAMPLES, centers=2, n_features=n_features, cluster_std=0.5)  
    dataset = TensorDataset(torch.FloatTensor(X), torch.FloatTensor(y))  
    dataloader = DataLoader(dataset, batch_size=BATCH_SIZE)  
  
    # Training  
    LR = 0.1  
    model = SimpleModel(n_features=n_features)  
    model.to(DEVICE)  
    optimizer = SGD(model.parameters(), lr=LR)  
    loss_function = nn.BCELoss()  
    EPOCHS = 10  
    losses, accs = list(), list()  
  
    for epoch in range(EPOCHS):  
        epoch_loss, n_corrects = 0., 0  
        for X, y in dataloader:  
            X, y = X.to(DEVICE), y.to(DEVICE)  
  
            pred = model(X)  
            loss = loss_function(pred, y)  
  
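            optimizer.zero_grad()  # clear gradients left over from the previous step
            loss.backward()  # backpropagation: compute gradients of the loss w.r.t. the parameters
            optimizer.step()  # update the parameters using those gradients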
  
            # Convert the batch-mean loss into a batch sum and accumulate it
            epoch_loss += loss.item() * len(X)  
            pred = (pred > 0.5).type(torch.float)  
            n_corrects += (pred == y).sum().item()  
  
        epoch_loss /= N_SAMPLES  
        epoch_accr = n_corrects / N_SAMPLES  
        print(f"Epoch: {epoch + 1}", end="\t")  
        print(f"Loss: {epoch_loss:.4f}", end="\t")  
        print(f"Accuracy: {epoch_accr:.4f}")  
        losses.append(epoch_loss)  
        accs.append(epoch_accr)  
  
    fig, axes = plt.subplots(2, 1, figsize=(7, 3))  
    axes[0].plot(losses)  
    axes[1].plot(accs)  
  
    axes[1].set_xlabel("Epoch", fontsize=15)  
    axes[0].set_ylabel("BCELoss", fontsize=15)  
    axes[1].set_ylabel("Accuracy", fontsize=15)  
    axes[0].tick_params(labelsize=10)  
    axes[1].tick_params(labelsize=10)  
    fig.suptitle("1-Layer Model Eval Metrics by Epoch", fontsize=16)  
    fig.tight_layout()  
    plt.show()  
  
  
if __name__ == '__main__':  
    simple_model_training()
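The `epoch_loss += loss.item() * len(X)` line matters because nn.BCELoss returns the mean over a batch by default, and the last batch here has only 4 samples. A small sketch in plain Python of why the weighted form is used; `epoch_mean_loss` is a hypothetical helper and the per-batch loss values are made up for illustration.

```python
def epoch_mean_loss(batch_mean_losses, batch_sizes):
    """Per-sample mean loss over an epoch, from per-batch mean losses."""
    total = sum(loss * n for loss, n in zip(batch_mean_losses, batch_sizes))
    return total / sum(batch_sizes)


batch_sizes = [32, 32, 32, 4]
batch_mean_losses = [0.9, 0.8, 0.7, 0.2]  # made-up values for illustration

weighted = epoch_mean_loss(batch_mean_losses, batch_sizes)
naive = sum(batch_mean_losses) / len(batch_mean_losses)
print(f"{weighted:.4f}")  # 0.7760
print(f"{naive:.4f}")     # 0.6500
```

Averaging the four batch means directly would give the 4-sample batch the same weight as a 32-sample batch, which skews the epoch loss.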

[Image: BCELoss and accuracy by epoch (231117 pytorch loss.png)]

curr device = cuda
Epoch: 1	Loss: 0.8601	Accuracy: 0.2700
Epoch: 2	Loss: 0.5956	Accuracy: 0.5600
Epoch: 3	Loss: 0.4481	Accuracy: 0.9200
Epoch: 4	Loss: 0.3580	Accuracy: 0.9500
Epoch: 5	Loss: 0.2982	Accuracy: 0.9700
Epoch: 6	Loss: 0.2558	Accuracy: 0.9800
Epoch: 7	Loss: 0.2243	Accuracy: 0.9900
Epoch: 8	Loss: 0.2001	Accuracy: 0.9900
Epoch: 9	Loss: 0.1808	Accuracy: 1.0000
Epoch: 10	Loss: 0.1652	Accuracy: 1.0000

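pred = model(X) -> forward pass
loss = loss_function(pred, y) -> compute the loss
optimizer.zero_grad() -> reset the gradients accumulated in the previous iteration (done every iteration)
loss.backward() -> backpropagation: computes the gradient of the loss for every parameter
optimizer.step() -> updates the parameters using those gradients (this is where the model actually learns)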
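The five steps above can also be written out by hand for a one-feature logistic regression, which shows what autograd and the optimizer do in each call. This is a sketch in plain Python under stated assumptions: the four-point dataset is made up, full-batch gradient descent is used instead of mini-batches, and the gradients rely on the standard result that for a sigmoid output with BCE loss the derivative with respect to the pre-activation is (pred - y).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Tiny made-up dataset: one feature, label 1 when x is positive
data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
w, b, lr = 0.0, 0.0, 0.1
losses = []

for epoch in range(20):
    # forward pass: pred = model(X)
    preds = [sigmoid(w * x + b) for x, _ in data]
    # loss: mean binary cross-entropy over the batch
    loss = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, (_, y) in zip(preds, data)) / len(data)
    losses.append(loss)
    # zero_grad + backward: gradients are recomputed from scratch each iteration
    grad_w = sum((p - y) * x for p, (x, y) in zip(preds, data)) / len(data)
    grad_b = sum((p - y) for p, (_, y) in zip(preds, data)) / len(data)
    # step: plain SGD update, parameter -= lr * gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The first loss equals log 2 (about 0.6931) because every prediction starts at 0.5, and it shrinks as w grows in the direction that separates the two classes.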


Shijuan's AI Diary