[AI Study] Section 10: Autoencoder

egahyun 2024. 12. 27. 02:55

Dimensionality reduction

: extracts the most relevant features

PCA: linear transformation → computed with linear algebra
Autoencoder: non-linear transformation → because each neuron applies a non-linear activation function
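
To make the "linear transformation" side of the comparison concrete, here is a minimal sketch of PCA done with NumPy's SVD. The toy data, the `mixing` matrix, and all variable names are assumptions made up for illustration; they are not from the study material. Encoding and decoding are each a single matrix multiplication, which is exactly what an autoencoder relaxes by adding non-linear activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): 200 points near a 2-D plane in 3-D space.
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0, 0.5],
                   [0.0, 1.0, -0.3]])
X = latent @ mixing + 0.01 * rng.normal(size=(200, 3))

# PCA: center the data, then keep the top-2 right singular vectors.
mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:2]                 # (2, 3) projection matrix

Z = Xc @ components.T               # "encode": 3-D -> 2-D (purely linear)
X_hat = Z @ components + mean       # "decode": 2-D -> 3-D (purely linear)

pca_mse = np.mean((X - X_hat) ** 2)
print(Z.shape, X_hat.shape, pca_mse)
```

Because the toy data is almost exactly planar, the linear reconstruction error is tiny here; data lying on a curved surface is where a non-linear encoder could be expected to help.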

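Applications

⇒ Works for both structured and unstructured data

Information compression
Noise removal: the network is trained so that unnecessary information is lost
Similar-image search
Generating new images by transforming existing ones
Pre-training: start from a model in which denoising has already stripped out the unnecessary information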

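Example) Image autoencoder

Takes an input image and reduces its dimensionality
⇒ Implemented as a neural network whose layers get progressively narrower
Reconstructs (restores) the original image as a reconstructed image
⇒ The output is not an exact copy of the original (it looks slightly blurred)
⇒ The information essential to the image is largely preserved (reason: the intermediate parameters are updated so that the output matches the original image)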

Latent representation

: hidden patterns in the data that people cannot readily see

⇒ Machine learning, and deep neural networks in particular, captures these well.

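Basic structure

Three input features are compressed to 2 dimensions and then restored to 3 dimensions. The network is a deep neural network built from stacked dense layers.

Compression
- Method: high-dimensional data such as images is encoded into a low-dimensional hidden space
  ⇒ This forms the bottleneck (= latent space, features, code)
- Hidden pattern: the underlying pattern captured by the learned latent representation
- Encoder: the part of the neural network that learns the hidden pattern
Reconstruction
- Method: restores the original image from the information in the latent variable
- Decoder: the part of the neural network that restores the compressed data to its original form
How it works
: Turns an unsupervised learning problem into a supervised one ⇒ there is no y, so the input itself is used as the label
Encoder: $ G_\theta $ / Decoder: $ F_\phi $ ⇒ theta and phi are used to distinguish the two sets of parameters
1. Training the autoencoder: backpropagation + gradient descent
  ⇒ Target label: the input itself
  ⇒ Gradient descent: updates the parameters in the direction that reduces the error between the model's prediction and the known target
2. Loss function
  $$L(X, \hat{X}) = \|X - \hat{X}\|^2$$
  : the squared error between the original and the reconstruction (averaged, this is the mean squared error, MSE)
  : MSE (when the input is assumed to follow a normal distribution) & cross-entropy (when the input is assumed to follow a Bernoulli distribution)
3. $ z = G_\theta(X) $: the latent vector
4. $ \hat{X} = F_\phi(z) = F_\phi(G_\theta(X)) $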
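
The four steps above can be sketched in plain NumPy. This is only an illustrative sketch under stated assumptions: the synthetic data, the tanh encoder, the linear decoder, the learning rate, and every variable name are choices made here, not part of the study material. It shows the encoder $ G_\theta $, the decoder $ F_\phi $, the MSE loss with the input as its own label, and hand-written backpropagation with gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 3-feature data (assumed): the third feature is a mix of the first two.
X = rng.normal(size=(256, 3))
X[:, 2] = 0.5 * X[:, 0] - 0.3 * X[:, 1]

# theta = (W_enc, b_enc) for the encoder, phi = (W_dec, b_dec) for the decoder.
W_enc = 0.1 * rng.normal(size=(3, 2)); b_enc = np.zeros(2)
W_dec = 0.1 * rng.normal(size=(2, 3)); b_dec = np.zeros(3)

def forward(X):
    z = np.tanh(X @ W_enc + b_enc)   # z = G_theta(X), the latent vector
    X_hat = z @ W_dec + b_dec        # X_hat = F_phi(z)
    return z, X_hat

lr = 0.1
losses = []
for step in range(1000):
    z, X_hat = forward(X)
    diff = X_hat - X                 # the label is the input itself
    losses.append(np.mean(diff ** 2))

    # Backpropagation of the MSE loss through decoder, then encoder.
    grad_X_hat = 2.0 * diff / diff.size
    grad_W_dec = z.T @ grad_X_hat
    grad_b_dec = grad_X_hat.sum(axis=0)
    grad_z = grad_X_hat @ W_dec.T
    grad_pre = grad_z * (1.0 - z ** 2)   # derivative of tanh
    grad_W_enc = X.T @ grad_pre
    grad_b_enc = grad_pre.sum(axis=0)

    # Gradient descent update (in place).
    W_enc -= lr * grad_W_enc; b_enc -= lr * grad_b_enc
    W_dec -= lr * grad_W_dec; b_dec -= lr * grad_b_dec

print(losses[0], losses[-1])
```

In practice a framework such as Keras computes these gradients automatically; writing them out once only makes it visible that no separate label array is involved anywhere.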

Practice: Autoencoder visualization

Generating 3D data

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)
tf.random.set_seed(42)

m = 100
angles = np.random.rand(m) * 3 * np.pi / 2 - 0.5

data = np.empty((m, 3))    # (100, 3)
data[:,0] = np.cos(angles) + np.sin(angles)/2 + 0.1 * np.random.randn(m)/2
data[:,1] = np.sin(angles) * 0.7 + 0.1 * np.random.randn(m) / 2
data[:,2] = data[:, 0] * 0.1 + data[:, 1] * 0.3 + 0.1 * np.random.randn(m)

# Visualize the 3D data
X_train = data  # - data.mean(axis=0, keepdims=0)  (mean-centering left disabled)
ax = plt.axes(projection='3d')  # use a 3D projection
ax.scatter3D(X_train[:, 0], X_train[:, 1],
             X_train[:, 2], c=X_train[:, 0], cmap='Reds');

Building the autoencoder model

→ The encoder is the important part!

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# 3 dimensions => 2 dimensions
# Method 1
model = Sequential([
    Dense(2, input_shape=(3,)),  # input_shape is a tuple because inputs can have several dimensions
    Dense(3)
])

# Method 2: split into encoder and decoder so the encoder can be used on its own later
# Each part is its own Sequential, which gives two models.
encoder = Sequential([Dense(2, input_shape=(3,))])
decoder = Sequential([Dense(3, input_shape=(2,))])

autoencoder = Sequential([encoder, decoder])
autoencoder.summary()

# Compile, then train with the input as its own target
autoencoder.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=0.1))
history = autoencoder.fit(X_train, X_train, epochs=200)

Encoder output

# Reduce the dimensionality of the data with the trained encoder
encodings = encoder.predict(X_train)
encodings.shape

# Visualize the encoder output
fig = plt.figure(figsize=(4,3))
plt.plot(encodings[:,0], encodings[:, 1], "b.")
plt.xlabel("$z_1$", fontsize=18)
plt.ylabel("$z_2$", fontsize=18, rotation=0)
plt.grid(True)
plt.show()

Reconstructing the data with the decoder

# Reconstruct the data with the trained decoder
decodings = decoder.predict(encodings)
decodings.shape  # (100, 3)

# Visualize the reconstructed data
ax = plt.axes(projection='3d')
ax.scatter3D(decodings[:, 0], decodings[:, 1],
             decodings[:, 2], c=decodings[:, 0], cmap='Reds');

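Practice: Simple stacked autoencoder - Fashion-MNIST

Deep Auto-Encoders

Build a deep autoencoder using the fashion_mnist dataset
Encode and then decode the images to reconstruct them (the code loads fashion_mnist, so these are clothing images rather than MNIST handwritten digits)

graphviz and pydot, needed to use plot_model

# Run these in the Anaconda prompt
# Restart the kernel once everything is installed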
conda install pydot
conda install graphviz

Loading the data and visualizing samples

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras import regularizers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.utils import plot_model
import matplotlib.pyplot as plt

(X_train, _), (X_test, _) = fashion_mnist.load_data()

# Visualize sample images
fig, ax = plt.subplots(1, 10, figsize=(20, 4))
for i in range(10):
    ax[i].imshow(X_test[i], cmap='gray')
    ax[i].set_xticks([])
    ax[i].set_yticks([])
    
# data normalization: scale pixel values to the range 0~1
X_train = X_train / 255.
X_test = X_test / 255.

# Flatten each 2D image into a 1D vector
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)

X_train.shape, X_test.shape # ((60000, 784), (10000, 784))
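
The preprocessing above can be checked without downloading the dataset. The sketch below uses a random stand-in batch (an assumption for illustration, not real Fashion-MNIST data) to show that dividing by 255 keeps values in the range 0~1 and that flattening to 784 is reversible. Casting to float32 is a choice made here; the original code keeps NumPy's default float64.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for 5 grayscale 28x28 images with uint8 pixels (0..255).
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

scaled = images.astype("float32") / 255.0   # normalize to [0, 1]
flat = scaled.reshape(-1, 28 * 28)          # (5, 28, 28) -> (5, 784)
restored = flat.reshape(-1, 28, 28)         # undo the flattening

print(flat.shape, flat.min(), flat.max())
```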

Building the stacked autoencoder

inputs = Input(shape=(784,))  # named `inputs` to avoid shadowing the built-in input()

# stacked autoencoder: built with the functional API
x = Dense(units=128, activation='relu')(inputs)  # units: number of neurons (the keyword itself can be omitted)
x = Dense(units=64, activation='relu')(x)
encoder = Dense(units=32, activation='relu')(x)  # bottleneck: 32-dimensional latent code

x = Dense(units=64, activation='relu')(encoder)  # the encoder output is fed in as the decoder input
x = Dense(units=128, activation='relu')(x)
decoder = Dense(units=784, activation='sigmoid')(x)  # restores 784 pixels; sigmoid because the pixels were normalized to 0~1

# autoencoder model
encoder_model = Model(inputs=inputs, outputs=encoder)
autoencoder = Model(inputs=inputs, outputs=decoder)
autoencoder.compile(loss='binary_crossentropy', optimizer='adam')

autoencoder.summary()  # prints the layer-by-layer summary
plot_model(autoencoder, show_shapes=True)  # draws the model graph
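
As a sanity check on what summary() reports, the parameter count of this 784-128-64-32-64-128-784 stack can be worked out by hand: each Dense layer has (inputs × outputs) weights plus one bias per output. The helper name `dense_params` is hypothetical and used only for this sketch.

```python
# Layer widths of the stacked autoencoder, from input to output.
layer_sizes = [784, 128, 64, 32, 64, 128, 784]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one Dense layer."""
    return n_in * n_out + n_out

per_layer = [dense_params(a, b)
             for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [100480, 8256, 2080, 2112, 8320, 101136]
print(total)      # 222384
```

Most of the parameters sit in the first and last layers, since those are the ones that connect to the 784-pixel image.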

Training the autoencoder + visualizing the results

history = autoencoder.fit(X_train, X_train, epochs=50, shuffle=True,  # shuffle reorders the data at every epoch
              batch_size=256, validation_data=(X_test, X_test))

# Plot the loss
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation_loss')
plt.legend()

# Visualize the result images
fig, ax = plt.subplots(3, 10, figsize=(20, 8))
for i in range(10):
    ax[0, i].imshow(X_test[i].reshape(28, 28), cmap='gray')  # reshape the 784 pixels back to the original 28x28
    
    img = np.expand_dims(X_test[i], axis=0)
    
    ax[1, i].imshow(encoder_model.predict(img, verbose=0).reshape(8, 4), cmap='gray')
    ax[2, i].imshow(autoencoder.predict(img, verbose=0).reshape(28, 28), cmap='gray')
    
    ax[0, i].axis('off')
    ax[1, i].axis('off')
    ax[2, i].axis('off')
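
Beyond looking at the images, reconstruction quality can also be put into numbers. The sketch below computes per-image MSE and binary cross-entropy, the two losses mentioned earlier, with NumPy. The function name `reconstruction_errors`, the `eps` clipping value, and the noisy stand-in for a reconstruction are assumptions for illustration; no trained model is used.

```python
import numpy as np

def reconstruction_errors(x, x_hat, eps=1e-7):
    """Per-image MSE and binary cross-entropy for flattened images in [0, 1]."""
    x_hat = np.clip(x_hat, eps, 1.0 - eps)   # avoid log(0)
    mse = np.mean((x - x_hat) ** 2, axis=1)
    bce = -np.mean(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat), axis=1)
    return mse, bce

rng = np.random.default_rng(0)
x = rng.random((4, 784))                     # stand-in for 4 normalized images
x_noisy = np.clip(x + 0.1 * rng.normal(size=x.shape), 0.0, 1.0)  # stand-in reconstruction

mse, bce = reconstruction_errors(x, x_noisy)
print(mse.shape, bce.shape)
```

With the trained model, the same function could be applied to something like X_test[:10] and autoencoder.predict(X_test[:10], verbose=0) to rank images by how well they were reconstructed.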