๋™์•„๋ฆฌ,ํ•™ํšŒ/GDGoC

[AI ์Šคํ„ฐ๋””] Section 10 : Autoencoder

egahyun 2024. 12. 27. 02:55

์ฐจ์›์˜ ๊ฐ์†Œ

: most relevant feature ์ถ”์ถœ

  1. PCA : linear transformation → uses linear algebra
  2. Autoencoder : non-linear transformation → because each neuron applies a non-linear activation function (see the sketch below)
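To make the contrast concrete, here is a minimal sketch (assumptions: scikit-learn and Keras are installed, and X is any (n, 3) array): PCA applies a single linear projection, while the autoencoder's hidden activation is what makes its mapping non-linear.

import numpy as np
from sklearn.decomposition import PCA
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

X = np.random.randn(100, 3)                         # stand-in data

z_pca = PCA(n_components=2).fit_transform(X)        # linear projection to 2-D

ae = Sequential([
    Dense(2, activation='tanh', input_shape=(3,)),  # non-linear encoder
    Dense(3)                                        # decoder back to 3-D
])
ae.compile(loss='mse', optimizer='adam')
ae.fit(X, X, epochs=10, verbose=0)                  # the input is its own label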

์ ์šฉ ๋ถ„์•ผ

⇒ structed data, unstructed data ๋ญ๋“  ์ƒ๊ด€์—†์Œ

  1. ์ •๋ณด์˜ ์••์ถ•
  2. ๋…ธ์ด์ฆˆ ์ œ๊ฑฐ : ๋ถˆํ•„์š”ํ•œ ์ •๋ณด๋ฅผ ์†Œ์‹ค์‹œํ‚ค๋„๋ก
  3. ์œ ์‚ฌํ•œ ์ด๋ฏธ์ง€ ๊ฒ€์ƒ‰
  4. ์ด๋ฏธ์ง€ ๋ณ€ํ˜•์— ์˜ํ•œ ์ƒˆ๋กœ์šด ์ด๋ฏธ์ง€ ์ƒ์„ฑ
  5. pre-training : ๋…ธ์ด์ฆˆ ์ œ๊ฑฐ๋กœ ๋ถˆํ•„์š”ํ•œ๊ฑฐ ์ง€์šฐ๊ณ  ์‹œ์ž‘ํ•œ๋‹ค.
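For the noise-removal case (item 2 above), the usual trick is to corrupt the input and keep the clean data as the target. A minimal sketch, assuming an already-compiled Keras autoencoder and an X_train array like the ones built in the hands-on sections below:

# corrupt the input, keep the clean data as the label
X_noisy = X_train + 0.1 * np.random.randn(*X_train.shape)
autoencoder.fit(X_noisy, X_train, epochs=10)  # learns to reconstruct clean from noisy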

์˜ˆ์‹œ ) ์ด๋ฏธ์ง€ autoencoder

  1. input ์ด๋ฏธ์ง€๋ฅผ ๋ฐ›์•„์„œ ์ฐจ์›์ถ•์†Œ๋ฅผ ์ง„ํ–‰
    ⇒ ์ค„์–ด๋“œ๋Š” ํ˜•ํƒœ๋กœ ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ ๊ตฌํ˜„
  2. reconstructed ์ด๋ฏธ์ง€(์›๋ž˜์˜ ์ด๋ฏธ์ง€)๋กœ ์žฌ๊ตฌ์„ฑ (=๋ณต์›)
      ์ •ํ™•ํ•œ ์›๋ž˜ ์ด๋ฏธ์ง€๋กœ ๋‚˜์˜ค์ง„ ์•Š์Œ (์•ฝ๊ฐ„ ๋ธ”๋Ÿฌ๋œ ํ˜•ํƒœ)
    ⇒ ์ด๋ฏธ์ง€์— ๊ผญ ํ•„์š”ํ•œ ์ •๋ณด๋“ค์€ ๋‹ค ๋‚จ์Œ (์ด์œ  : ์›๋ณธ ์ด๋ฏธ์ง€์™€ ๊ฐ™์€ ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์˜ค๋„๋ก ์ค‘๊ฐ„์˜ ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์—…๋ฐ์ดํŠธ ํ•˜๊ธฐ ๋•Œ๋ฌธ )

์ž ์žฌํ‘œํ˜„ (latent representation)

: ์‚ฌ๋žŒ์ด ์•Œ ์ˆ˜ ์—†๋Š” ์ˆจ๊ฒจ์ง„ ๋ฐ์ดํ„ฐ ํŒจํ„ด

⇒ ๋จธ์‹ ๋Ÿฌ๋‹์ด ์ด๋ฅผ ์ž˜ ์žก์•„๋‚ธ๋‹ค. (⇒ ๋”ฅ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜)

๊ธฐ๋ณธ ๊ตฌ์กฐ

3๊ฐœ์˜ ์ž…๋ ฅ์„ ํ”ผ์ฒ˜๋กœํ•˜์—ฌ, 2 ๊ฐœ๋กœ ์••์ถ•์‹œํ‚ค๊ณ , ๋‹ค์‹œ 3์ฐจ์›์œผ๋กœ ๋ณต์› dense layer๋กœ ์—ฌ๋Ÿฌ์ธต์ด ๊ตฌ์„ฑ๋˜์–ด์žˆ๋Š” ๋”ฅ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ์˜ ๋ ˆ์ด์–ด๋“ค

  1. ์••์ถ•
    • ๋ฐฉ๋ฒ• : ์ด๋ฏธ์ง€์™€ ๊ฐ™์€ ๊ณ ์ฐจ์› ๋ฐ์ดํ„ฐ๋ฅผ ์ธ์ฝ”๋”ฉ์„ ํ†ตํ•ด ์ €์ฐจ์›์˜ hidden ๊ณต๊ฐ„์œผ๋กœ ํ‘œํ˜„
      ⇒ Bottleneck (= latent space, ์ž ์žฌ๊ณต๊ฐ„, ํ”ผ์ฒ˜, ์ฝ”๋“œ) ์ด ์ด๋ฃจ์–ด์ง
    • ํžˆ๋“  ํŒจํ„ด : ์ž ์žฌ ํ‘œํ˜„์„ ํ•™์Šตํ•˜๋Š” ์ˆจ๊ฒจ์ง„ ํŒจํ„ด
    • ์ธ์ฝ”๋” : ํžˆ๋“  ํŒจํ„ด์„ ํ•™์Šตํ•˜๋Š” ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ ๋ถ€๋ถ„
  2. ๋ณต์›
    • ๋ฐฉ๋ฒ• : Latent Variable ์˜ ์ •๋ณด๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ์›๋ž˜์˜ ์ด๋ฏธ์ง€๋ฅผ ๋ณต์›ํ•˜๋Š” ๊ณผ์ • ์ˆ˜ํ–‰
    • ๋””์ฝ”๋” : ์••์ถ•๋œ ๋ฐ์ดํ„ฐ๋ฅผ ์›๋ž˜๋กœ ๋ณต์›์‹œํ‚ค๋Š” ๋‰ด๋Ÿด ๋„คํŠธ์›Œํฌ ๋ถ€๋ถ„
  3. ์ž‘๋™ ๋ฐฉ์‹
    : ๋น„์ง€๋„ํ•™์Šต ๋ฌธ์ œ๋ฅผ ์ง€๋„ํ•™์Šต ๋ฌธ์ œ๋กœ ๋ฐ”๊พธ์–ด ํ•ด๊ฒฐ ⇒ y๊ฐ€ ์—†์œผ๋ฏ€๋กœ ์ž์‹ ์„ ๋ ˆ์ด๋ธ”๋กœ ์‚ฌ์šฉ
    ์ธ์ฝ”๋” : \( G_\theta \) / ๋””์ฝ”๋” : \( F_\phi \) ⇒ ์„ธํƒ€์™€ ํŒŒ์ด๋กœ ํ•œ ์ด์œ  : ํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ๊ตฌ๋ถ„ํ•˜๊ธฐ ์œ„ํ•ด

    1. ์˜คํ† ์ธ์ฝ”๋” ํ•™์Šต ๋ฐฉ๋ฒ• : backpropagation(์˜ค์ฐจ์—ญ์ „ํŒŒ) + gradient descent(๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•)
      ⇒ ์ •๋‹ต ๋ ˆ์ด๋ธ” : ์ž๊ธฐ ์ž์‹ 
      ⇒ ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ• ํ•™์Šต : ๋ชจ๋ธ์ด ์˜ˆ์ธกํ•œ ๊ฐ’, ์•Œ๊ณ  ์žˆ๋Š” ์ •๋‹ต ๊ฐ’๊ณผ ๋น„๊ตํ•œ ์˜ค์ฐจ๋ฅผ ์ค„์ด๋Š” ๋ฐฉํ–ฅ์œผ๋กœ ์ง„ํ–‰

    2. ์†์‹คํ•จ์ˆ˜
      $$L(x, y) = \|X - \hat{X}\|^2$$
      : ์›๋ณธ๊ฐ’ - ์˜ˆ์ธกํ•œ๊ฐ’์˜ ํ‰๊ท  ์ œ๊ณฑ ์˜ค์ฐจ (MSE)
      : MSE (์ž…๋ ฅ์ด ์ •๊ทœ๋ถ„ํฌ ์ผ๋•Œ) & cross-entropy (์ž…๋ ฅ์ด ๋ฒ ๋ฅด๋ˆ„์ด ๋ถ„ํฌ ์ผ๋•Œ))
    3. \( z = G(X) \) : ๋ ˆ์ดํ„ดํŠธ ๋ฒกํ„ฐ
    4. \( \hat{X} = F(z) = F(G(X)) \)
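Putting the pieces together, one gradient-descent training step looks like the following minimal sketch (assumptions: encoder \( G_\theta \) and decoder \( F_\phi \) are Keras models and x is a batch of inputs; this is the generic recipe, not the exact code used in the hands-on below):

import tensorflow as tf

optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def train_step(x):
    with tf.GradientTape() as tape:
        z = encoder(x)                               # z = G(X), the latent vector
        x_hat = decoder(z)                           # X_hat = F(G(X))
        loss = tf.reduce_mean(tf.square(x - x_hat))  # L = ||X - X_hat||^2, x is its own label
    params = encoder.trainable_variables + decoder.trainable_variables  # theta and phi
    grads = tape.gradient(loss, params)              # backpropagation
    optimizer.apply_gradients(zip(grads, params))    # gradient descent update
    return loss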

์‹ค์Šต : Autoencoder ์‹œ๊ฐํ™”

3์ฐจ์› ๋ฐ์ดํ„ฐ ์ƒ์„ฑ

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)
tf.random.set_seed(42)

m = 100
angles = np.random.rand(m) * 3 * np.pi / 2 - 0.5

data = np.empty((m, 3))    # (100, 3)
data[:,0] = np.cos(angles) + np.sin(angles)/2 + 0.1 * np.random.randn(m)/2
data[:,1] = np.sin(angles) * 0.7 + 0.1 * np.random.randn(m) / 2
data[:,2] = data[:, 0] * 0.1 + data[:, 1] * 0.3 + 0.1 * np.random.randn(m)

# 3์ฐจ์› data ์‹œ๊ฐํ™”
X_train = data #- data.mean(axis=0, keepdims=0)
ax = plt.axes(projection='3d') # 3์ฐจ์›์œผ๋กœ ํˆฌ์˜ํ•˜๊ฒ 
ax.scatter3D(X_train[:, 0], X_train[:, 1],
             X_train[:, 2], c=X_train[:, 0], cmap='Reds');

Autoencoder model ์ž‘์„ฑ

→ ์ธ์ฝ”๋”๊ฐ€ ์ค‘์š” !!

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# 3 ์ฐจ์› => 2 ์ฐจ์›
# ๋ฐฉ๋ฒ• 1
model = Sequential([
	Dense(2, input_shape=(3,)),  # ์—ฌ๋Ÿฌ ์ฐจ์›์ด ๋“ค์–ด์˜ฌ ์ˆ˜ ์žˆ๊ธฐ ๋–„๋ฌธ์— ํŠœํ”Œ
	Dense(3)
])

# ๋ฐฉ๋ฒ• 2 : ๋‚˜์ค‘์— ์ธ์ฝ”๋”๋งŒ ๋ฝ‘์•„์„œ ์“ธ ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์— ์ธ์ฝ”๋”์™€ ๋””์ฝ”๋”๋กœ ๋‚˜๋ˆ ์„œ ์ง„ํ–‰ํ•œ ๊ฒƒ
# ๊ฐ๊ฐ sequential๋กœ ํ•˜์—ฌ ๋‘๊ฐœ์˜ ๋ชจ๋ธ์„ ์ƒ์„ฑํ•œ๋‹ค.
encoder = **Sequential**([Dense(2, input_shape=(3,))]) 
decoder = **Sequential**([Dense(3, input_shape=(2,))])

autoencoder = Sequential([encoder, decoder])
autoencoder.summary()

# ์ปดํŒŒ์ผ
autoencoder.compile(loss="mse", optimizer=keras.optimizers.SGD(learning_rate=0.1))
history = autoencoder.fit(X_train, X_train, epochs=200)
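Not in the original notebook, but a quick way to confirm that the MSE actually converged:

# (optional) inspect the training loss recorded by fit()
plt.plot(history.history['loss'])
plt.xlabel('epoch')
plt.ylabel('mse loss')
plt.show()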

encoder output

# ํ•™์Šต์‹œํ‚จ encoder ๋ฅผ ์ด์šฉํ•˜์—ฌ data๋ฅผ ์ฐจ์› ์ถ•์†Œ
encodings = encoder.predict(X_train)
encodings.shape

# encoder output์„ ์‹œ๊ฐํ™”
fig = plt.figure(figsize=(4,3))
plt.plot(encodings[:,0], encodings[:, 1], "b.")
plt.xlabel("$z_1$", fontsize=18)
plt.ylabel("$z_2$", fontsize=18, rotation=0)
plt.grid(True)
plt.show()

Decoder ๋ฅผ ์ด์šฉํ•œ data ๋ณต์›

# ํ•™์Šต๋œ decoder๋ฅผ ์ด์šฉํ•˜์—ฌ data ๋ณต์›
decodings = decoder.predict(encodings)
decodings.shape  # (100, 3)

# ๋ณต์›๋œ data ์‹œ๊ฐํ™”
ax = plt.axes(projection='3d')
ax.scatter3D(decodings[:, 0], decodings[:, 1],
             decodings[:, 2], c=decodings[:, 0], cmap='Reds');
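As a quick numerical sanity check (an addition, not in the original notebook), the reconstruction error can be computed directly:

# mean squared reconstruction error over the 100 points
print(np.mean((X_train - decodings) ** 2))  # should be small once training has converged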


์‹ค์Šต : Simple stacked autoencoder -MNIST

Deep Auto-Encoders

์ธ์ฝ”๋” ๋ ˆ์ด์–ด, ๋””์ฝ”๋” ๋ ˆ์ด์–ด๊ฐ€ ์—ฌ๋Ÿฌ๊ฐœ ์žˆ๋Š” ํ˜•ํƒœ์˜ ์˜คํ† ์ธ์ฝ”๋”

  • fashion_mnist dataset ์„ ์ด์šฉํ•œ deep autoencoder ์ƒ์„ฑ
  • Mnist dataset ์˜ ์†๊ธ€์”จ์ฒด๋ฅผ encoding ํ›„ decoding ํ•˜์—ฌ ๋ณต์›

plot model ์‚ฌ์šฉ์„ ์œ„ํ•œ graphviz ์™€ pydot

# anaconda prompt๋ฅผ ์‹คํ–‰ํ•ด ์ž…๋ ฅ
# ๋‹ค ์„ค์น˜ํ›„, ์ปค๋„ restart ํ•„์š”
conda install pydot
conda install graphviz
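If you are not using Anaconda, a pip-based setup should also work (assumption: the pydot package from PyPI plus a system-level Graphviz install):

pip install pydot
# Graphviz is a system binary, not just a Python package:
# install it from graphviz.org or your OS package manager, then restart the kernel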

๋ฐ์ดํ„ฐ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ ๋ฐ ์ƒ˜ํ”Œ ์‹œ๊ฐํ™”

import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras import regularizers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.utils import plot_model
import matplotlib.pyplot as plt

(X_train, _), (X_test, _) = fashion_mnist.load_data()

# visualize sample images
fig, ax = plt.subplots(1, 10, figsize=(20, 4))
for i in range(10):
    ax[i].imshow(X_test[i], cmap='gray')
    ax[i].set_xticks([])
    ax[i].set_yticks([])
    
# data normalization : scale pixel values into the 0 ~ 1 range
X_train = X_train / 255.           
X_test = X_test / 255.

# 2์ฐจ์› ์ด๋ฏธ์ง€๋ฅผ 1์ฐจ์›์œผ๋กœ ๋ณ€๊ฒฝ
X_train = X_train.reshape(-1, 784)
X_test = X_test.reshape(-1, 784)

X_train.shape, X_test.shape # ((60000, 784), (10000, 784))

stacked autoencoder ์ž‘์„ฑ

input = Input(shape=(784,))

# stacked autoencoder : built with the functional API
x = Dense(units=128, activation='relu')(input) # units : number of neurons => the 'units=' keyword can be omitted
x = Dense(units=64, activation='relu')(x)
encoder = Dense(units=32, activation='relu')(x) # 32-dimensional bottleneck = the latent representation

x = Dense(units=64, activation='relu')(encoder) # feed the encoder's output into the decoder
x = Dense(units=128, activation='relu')(x)
decoder = Dense(units=784, activation='sigmoid')(x) # restore all 784 pixels (sigmoid because pixels were normalized to 0~1)

# autoencoder model
encoder_model = Model(inputs=input, outputs=encoder)
autoencoder = Model(inputs=input, outputs=decoder)
autoencoder.compile(loss='binary_crossentropy', optimizer='adam')

autoencoder.summary() # print the layer summary
plot_model(autoencoder, show_shapes=True) # draw the model graph

latent representation : the input squeezed down to 32 dimensions

 

์˜คํ† ์ธ์ฝ”๋” ํ›ˆ๋ จ + ๊ฒฐ๊ณผ ์‹œ๊ฐํ™”

history = autoencoder.fit(X_train, X_train, epochs=50, shuffle=True, # shuffle re-shuffles the data at the start of every epoch
              batch_size=256, validation_data=(X_test, X_test))

# ์†์‹ค ์‹œ๊ฐํ™” 
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation_loss')
plt.legend()

# ๊ฒฐ๊ณผ ์ด๋ฏธ์ง€ ์‹œ๊ฐํ™”
fig, ax = plt.subplots(3, 10, figsize=(20, 8))
for i in range(10):
    ax[0, i].imshow(X_test[i].reshape(28, 28), cmap='gray') # reshape the 784 pixels back to the original 28x28
    
    img = np.expand_dims(X_test[i], axis=0)
    
    ax[1, i].imshow(encoder_model.predict(img, verbose=0).reshape(8, 4), cmap='gray')
    ax[2, i].imshow(autoencoder.predict(img, verbose=0).reshape(28, 28), cmap='gray')
    
    ax[0, i].axis('off')
    ax[1, i].axis('off')
    ax[2, i].axis('off')
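To tie this back to the similar-image-search application from the start of the section: once training is done, images can be compared by their 32-dimensional codes. A minimal sketch (assumptions: the encoder_model trained above, plain Euclidean distance as the similarity measure):

# encode the whole test set into 32-D latent vectors
codes = encoder_model.predict(X_test, verbose=0)   # shape (10000, 32)

# rank test images by latent distance to image 0
dists = np.linalg.norm(codes - codes[0], axis=1)
nearest = np.argsort(dists)[1:6]                   # 5 closest, excluding itself
print(nearest)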