[AI Study] Section 8: Transfer Learning

egahyun 2024. 12. 27. 02:11

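What is transfer learning?

How the model works

Existing trained model: a large model that can classify 1,000 categories
Weight transfer
1. Use case: a large pretrained model already exists and you want a model that classifies just 2 categories
2. Effect: good performance is achievable even when training on a small dataset
3. Trained convolutional layers: they already contain trained filters
4. ⇒ This is what makes classification possible with a small dataset
: Transfer learning means taking the convolutional layers of an existing model and reusing them with their weights transferred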
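To make the idea of weight transfer concrete, here is a toy sketch in plain Python. Dictionaries stand in for layers, and every name and number (`pretrained`, `transfer_weights`, `dense_new`) is made up for illustration; a real model holds trained filter tensors rather than short lists.

```python
import random

# Toy stand-in for a pretrained 1000-class model: layer name -> list of weights.
# The numbers are invented; they only mark which weights get reused.
pretrained = {
    "conv1": [0.12, -0.40, 0.33],
    "conv2": [0.05, 0.91, -0.27],
    "dense_1000": [0.0] * 1000,   # old classification head (1000 classes)
}

def transfer_weights(source, num_new_classes, seed=0):
    """Copy every convolutional layer and attach a freshly initialized head."""
    rng = random.Random(seed)
    new_model = {
        name: list(weights)               # copy, so the source stays untouched
        for name, weights in source.items()
        if name.startswith("conv")
    }
    # The old 1000-class head is dropped; the new head starts from random values.
    new_model["dense_new"] = [rng.uniform(-0.1, 0.1) for _ in range(num_new_classes)]
    return new_model

two_class_model = transfer_weights(pretrained, num_new_classes=2)
print(sorted(two_class_model))   # ['conv1', 'conv2', 'dense_new']
```

Only the new head starts from scratch, which is why a small dataset can be enough: the convolutional filters arrive already trained.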

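Training strategies

CNN layers = kept frozen / added fully connected layer (Dense layer) = trained from scratch
1. Use case: when the dataset is very small
All layers = retrained with a very small learning rate
1. Use case: when there is enough data to train the whole network
2. A small learning rate is required ⇒ the goal is to keep the weights largely intact and tune them only slightly (a large learning rate scrambles the pretrained weights)
3. (because the already-trained convolutional layers are being retrained)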
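The effect of the learning rate on pretrained weights can be checked with one plain gradient-descent step. This is a minimal sketch with hypothetical weights and gradients, not values from a real network; it only shows how far a single update moves the weights away from their pretrained values.

```python
import math

def sgd_step(weights, grads, learning_rate):
    """One plain gradient-descent update: w <- w - lr * g."""
    return [w - learning_rate * g for w, g in zip(weights, grads)]

def drift(before, after):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(before, after)))

# Hypothetical pretrained weights and a gradient coming from the new task.
pretrained = [0.50, -0.20, 0.80]
grads = [3.0, -4.0, 0.0]          # gradient norm is exactly 5

small = sgd_step(pretrained, grads, learning_rate=1e-5)
large = sgd_step(pretrained, grads, learning_rate=1e-1)

print(drift(pretrained, small))   # about 5e-05: the weights barely move
print(drift(pretrained, large))   # about 0.5: the weights move a long way
```

With the same gradient, the drift scales linearly with the learning rate, so a rate that is 10,000 times larger moves the weights 10,000 times further in a single step.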

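Considerations

Choose a pretrained dataset that fits the goal
1. Classes present in the pretraining data: classification is likely to work well (ex: Cat & Dog → included in ImageNet)
2. Classes absent from it: the pretrained model cannot classify them as-is (ex: cancer cells → not in ImageNet ⇒ they get labeled as something like a baseball)
: Check whether your classes are covered by ImageNet
Consider the volume of data you have
1. Reuse only the architecture and train all weights from scratch (large data)
2. Train only part of the weights
3. Fine-tune only the last layer (small data): the parts that are already well trained are left untouched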
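The three options above differ only in which layers are allowed to update. The sketch below expresses that as a per-layer flag; the function `trainable_plan`, the strategy names, and the layer names are hypothetical and chosen for illustration.

```python
def trainable_plan(layer_names, strategy, num_unfrozen=2):
    """Return {layer_name: trainable?} for one of three strategies.

    'from_scratch' : reuse only the architecture, train every layer (large data)
    'partial'      : train the last `num_unfrozen` layers, freeze the rest
    'head_only'    : train only the final layer (small data)
    """
    n = len(layer_names)
    if strategy == "from_scratch":
        first_trainable = 0
    elif strategy == "partial":
        first_trainable = max(n - num_unfrozen, 0)
    elif strategy == "head_only":
        first_trainable = n - 1
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {name: i >= first_trainable for i, name in enumerate(layer_names)}

layers = ["conv1", "conv2", "conv3", "dense_head"]   # hypothetical layer names
for strategy in ("from_scratch", "partial", "head_only"):
    print(strategy, trainable_plan(layers, strategy))
```

In Keras the same decision is expressed by setting the `trainable` attribute of a layer to `True` or `False`.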

TensorFlow Hub

https://www.tensorflow.org/hub?hl=ko : a repository of pretrained models; knowing Python is enough to build a model with it

ImageDataGenerator

: a Keras class that yields batches of images together with their labels

methods : .flow_from_directory
- loads large datasets directly from a directory
- infers labels automatically from the directory structure
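How labels follow from the directory structure can be sketched with the standard library alone. `flow_from_directory` treats each subdirectory as one class and assigns label indices in alphanumeric order of the folder names; the helpers below (`infer_class_indices`, `list_labeled_files`) are hypothetical and only mimic that rule on a throwaway folder tree.

```python
import os
import tempfile

def infer_class_indices(root):
    """Map each subdirectory name to an integer label, in sorted order."""
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    return {name: index for index, name in enumerate(class_names)}

def list_labeled_files(root):
    """Return (file_path, label_index) pairs for every file in each class folder."""
    samples = []
    for name, index in infer_class_indices(root).items():
        folder = os.path.join(root, name)
        for filename in sorted(os.listdir(folder)):
            samples.append((os.path.join(folder, filename), index))
    return samples

# Build a temporary directory tree with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    for class_name, count in (("dogs", 2), ("cats", 3)):
        os.makedirs(os.path.join(root, class_name))
        for i in range(count):
            open(os.path.join(root, class_name, f"{i}.jpg"), "w").close()

    class_indices = infer_class_indices(root)
    samples = list_labeled_files(root)

print(class_indices)   # {'cats': 0, 'dogs': 1}
print(len(samples))    # 5
```

The folder that sorts first becomes class 0, regardless of the order in which the folders were created.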

2. How to use flow_from_directory

from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Create an instance (rescale maps pixel values from [0, 255] to [0, 1])
train_data_gen = ImageDataGenerator(rescale=1/255.)

# Call the flow_from_directory method:
train_generator = train_data_gen.flow_from_directory(
	train_dir,                # folder that holds one subfolder per class
	target_size=(150, 150),   # resize every image to the same size
	batch_size=20,
	class_mode='binary')      # or 'categorical'
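The `class_mode` argument decides how the labels in each batch are laid out: `'binary'` gives one 0/1 number per image, `'categorical'` gives a one-hot row per image. The helper `encode_labels` below is a hypothetical stand-in that reproduces those two layouts in plain Python.

```python
def encode_labels(class_ids, num_classes, class_mode):
    """Encode integer class ids the way each class_mode lays out its labels."""
    if class_mode == "binary":
        if num_classes != 2:
            raise ValueError("'binary' only makes sense for two classes")
        return [float(c) for c in class_ids]            # one number per image
    if class_mode == "categorical":
        return [
            [1.0 if k == c else 0.0 for k in range(num_classes)]
            for c in class_ids                           # one-hot row per image
        ]
    raise ValueError(f"unsupported class_mode: {class_mode}")

print(encode_labels([0, 1, 1], num_classes=2, class_mode="binary"))
# [0.0, 1.0, 1.0]
print(encode_labels([0, 2], num_classes=3, class_mode="categorical"))
# [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
```

The choice has to match the model head: binary labels usually go with a single sigmoid unit and `binary_crossentropy`, one-hot labels with a softmax layer and `categorical_crossentropy`.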

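3. Data augmentation

: makes an image look like a different image, to make up for a shortage of data

→ ex) distorting the image / flipping it horizontally

→ a side feature of ImageDataGenerator: it is configured through the constructor arguments and applied to the batches that flow_from_directory produces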
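What an augmentation does to the pixels can be shown on a tiny image stored as nested lists. This is an illustrative sketch, not the Keras implementation; in Keras the corresponding switches are `ImageDataGenerator` arguments such as `horizontal_flip`, `width_shift_range`, and `shear_range`.

```python
def horizontal_flip(image):
    """Mirror an image left-to-right (image is a list of pixel rows)."""
    return [row[::-1] for row in image]

def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels`, padding the left edge with `fill`."""
    width = len(image[0])
    pixels = min(max(pixels, 0), width)     # clamp to a valid shift
    return [[fill] * pixels + list(row[:width - pixels]) for row in image]

image = [
    [1, 2, 3],
    [4, 5, 6],
]

print(horizontal_flip(image))   # [[3, 2, 1], [6, 5, 4]]
print(shift_right(image, 1))    # [[0, 1, 2], [0, 4, 5]]
```

Each variant keeps the original label, so one labeled photo yields several different training inputs.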

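Practice - Transfer learning with a TensorFlow Hub model

Problem definition

: use a pre-trained model (MobileNet_V2) as a feature extractor to build an image classification model specialized for flower images

Model: MobileNet

Trained on millions of ImageNet images
Classifies images into 1,000 classes
decode_predictions returns the 5 highest-scoring classes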
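Picking the 5 best classes is a sort over the model's scores. The sketch below uses a hypothetical 7-class toy problem with invented scores; `softmax` and `top_k` are illustrative helpers, not part of Keras.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical labels and scores for a 7-class toy problem.
labels = ["stopwatch", "analog_clock", "compass", "wall_clock",
          "digital_watch", "gong", "fence"]
scores = [9.7, 8.0, 6.8, 6.6, 4.9, 1.2, 0.3]

probs = softmax(scores)
for label, p in top_k(probs, labels):
    print(f"{label:15s} {p:.3f}")
```

Softmax does not change the ranking, so the top 5 can be taken from the raw scores as well; it only rescales them into values that sum to 1.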

model = tf.keras.Sequential([
	MobileNet_feature_extractor_layer,  # placeholder for the pretrained feature extractor
	tf.keras.layers.Dense(flowers_data.num_classes, activation='softmax')
])

Model: without fine-tuning

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras.applications.mobilenet import decode_predictions # maps output indices to ImageNet class names

# Download the full model (classification head included) to use it without fine-tuning
Trained_MobileNet_url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/2"

Trained_Mobilenet = tf.keras.Sequential([
                    hub.KerasLayer(Trained_MobileNet_url , input_shape=(224, 224,3))]) # the input size must match the one used in pretraining

Trained_Mobilenet.input, Trained_Mobilenet.output
# -> input of shape (None, 224, 224, 3), output of shape (None, 1001): one score per class

# If the code above raises an error: workaround 1
!pip install tf_keras

import tf_keras as tfk

Trained_MobileNet_url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/2"

# Same variable name as above, so the later cells keep working
Trained_Mobilenet = tfk.Sequential([
    hub.KerasLayer(Trained_MobileNet_url, input_shape=(224, 224, 3))
])

Trained_Mobilenet.input, Trained_Mobilenet.output

# Workaround 2: downgrade as below, then run the original code
!pip install tensorflow==2.13.0
!pip install tensorflow_hub==0.14.0

Classification with the non-fine-tuned model

from PIL import Image       # image handling
from urllib import request  # download from the internet
from io import BytesIO      # the download arrives as raw bytes; BytesIO wraps them so PIL can open them like a file

# Fetch the image from its URL
url = "https://github.com/ironmanciti/MachineLearningBasic/blob/master/datasets/TransferLearningData/watch.jpg?raw=true"
res = request.urlopen(url).read()
Sample_Image = Image.open(BytesIO(res)).resize((224, 224)) # resize to the model input size (passed as a tuple)

# Convert the sample image to a numpy array and preprocess it
# Rather than scaling by hand, use preprocess_input (maps pixels from [0, 255] to [-1, 1])
x = tf.keras.applications.mobilenet.preprocess_input(np.array(Sample_Image))
x.shape # (224, 224, 3)

# Class prediction: 1001 scores (1,000 classes + 1 background)
predicted_class = Trained_Mobilenet.predict(np.expand_dims(x, axis = 0)) # a single image, passed as a batch of one

# Index of the predicted class -> ex) 827 : classified as class number 827
predicted_class.argmax(axis=-1)

# Top 5 classes for this image with their scores
decode_predictions(predicted_class[:, 1:])  # drop the first label, which is background

[[('n04328186', 'stopwatch', 9.666367),
  ('n02708093', 'analog_clock', 8.007808),
  ('n03706229', 'magnetic_compass', 6.8384614),
  ('n04548280', 'wall_clock', 6.563991),
  ('n03197337', 'digital_watch', 4.9182053)]]
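The `[:, 1:]` slice matters because `decode_predictions` expects exactly 1,000 columns, while this model emits 1,001 with background in slot 0. Dropping that slot shifts every class index down by one, which the following sketch shows on a hypothetical score vector.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

NUM_IMAGENET_CLASSES = 1000

# Hypothetical model output: slot 0 is "background", slots 1..1000 are ImageNet classes.
scores = [0.0] * (NUM_IMAGENET_CLASSES + 1)
scores[827] = 9.7                      # pretend slot 827 wins

index_with_background = argmax(scores)            # index into the 1001-way output
index_without_background = argmax(scores[1:])     # index into the 1000-way label list

print(index_with_background)      # 827
print(index_without_background)   # 826
```

So an index read from the raw 1,001-way output and an index into a 1,000-entry label list differ by one, and mixing the two silently picks the neighboring class.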

# Load the label names (the file starts with 'background', matching the 1001-way output)
labels_path = tf.keras.utils.get_file('ImageNetLabels.txt',
                'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
imagenet_labels = np.array(open(labels_path).read().splitlines())

print(imagenet_labels[:10])

# Show the input image with the predicted class as its title
# (the labels are loaded first; a separate name keeps predicted_class intact)
plt.imshow(Sample_Image)
predicted_label = imagenet_labels[np.argmax(predicted_class)]
plt.title("Predicted Class is: " + predicted_label.title())

Evaluating MobileNet on a batch of flower images - non-fine-tuned model

The flower data consists of 5 classes

# Specify path of the flowers dataset: download the data
flowers_data_path = tf.keras.utils.get_file(
  'flower_photos','https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz', 
    untar=True)

# Automatic image labeling: Found 3670 images belonging to 5 classes
image_generator = tf.keras.preprocessing.image.ImageDataGenerator(
            preprocessing_function=tf.keras.applications.mobilenet.preprocess_input) # preprocessing step

flowers_data = image_generator.flow_from_directory(flowers_data_path, 
                    target_size=(224, 224), batch_size = 64, shuffle = True)

# input_batch : the images themselves
# label_batch : labels derived from the name of the folder each image sits in
input_batch, label_batch = next(flowers_data) # it is a generator, so next() fetches one batch

print("Image batch shape: ", input_batch.shape)    # (64, 224, 224, 3)
print("Label batch shape: ", label_batch.shape)    # (64, 5) : one-hot encoded
print("Number of label classes: ", flowers_data.num_classes) # 5
print("Class Index : ", flowers_data.class_indices) # {'daisy': 0, 'dandelion': 1, 'roses': 2, 'sunflowers': 3, 'tulips': 4}

# Key : label name, Value : index => Key : index, Value : label name
# so that each index can be looked up by its label name
class_names = {v:k for k,v in flowers_data.class_indices.items()}

# Pick one flower image and check the prediction: the result is poor
prediction = Trained_Mobilenet.predict(input_batch[2:3])
decode_predictions(prediction[:, 1:])  

[[('n03930313', 'picket_fence', 5.1765366),
  ('n03944341', 'pinwheel', 4.367714),
  ('n03598930', 'jigsaw_puzzle', 3.9853535),
  ('n03447721', 'gong', 3.9513822),
  ('n09256479', 'coral_reef', 3.7957406)]]

Visualizing the labeled images

# Show 10 images with their true labels
plt.figure(figsize=(16, 8))
for i in range(10):
    plt.subplot(1, 10, i+1)
    img = ((input_batch[i]+1)*127.5).astype(np.uint8)  # undo the [-1, 1] scaling for display
    idx  = np.argmax(label_batch[i])
    plt.imshow(img)
    plt.title(class_names[idx])
    plt.axis('off')
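The expression `(input_batch[i]+1)*127.5` is the inverse of the scaling that `mobilenet.preprocess_input` applies, namely `x / 127.5 - 1`. A round trip on single pixel values, with hypothetical helper names, makes that visible:

```python
def to_model_range(pixel):
    """Scale a pixel from [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0

def to_display_range(value):
    """Undo the scaling: map [-1, 1] back to an integer in [0, 255]."""
    return int(round((value + 1.0) * 127.5))

for pixel in (0, 64, 128, 255):
    scaled = to_model_range(pixel)
    print(pixel, round(scaled, 4), to_display_range(scaled))
```

This sketch rounds before converting to an integer. `astype(np.uint8)` truncates instead, so a value that lands just below a whole number can come out one level darker, which is harmless for display.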

Retraining the pretrained model for flower classification

⇒ For fine-tuning, download the model with its head removed

# URL of the model without the top layer
extractor_url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/2"

# Feature vectors extracted from the batch
extractor_layer = hub.KerasLayer(extractor_url, input_shape=(224, 224, 3))
feature_batch = extractor_layer(input_batch)

# The pre-trained MobileNet weights are not updated
# A Dense layer is added on top
# Strategy: CNN layers = kept frozen / added fully connected layer (Dense layer) = trained from scratch
extractor_layer.trainable = False

# Build a model with two pieces:
#    (1)  MobileNet Feature Extractor 
#    (2)  Dense Network (classifier) added at the end 
model = tf.keras.Sequential([
  extractor_layer,
  tf.keras.layers.Dense(flowers_data.num_classes, activation='softmax')
])

# Check that the output shape is correct before training
# -> input of shape (None, 224, 224, 3), output of shape (None, 5)
model.input, model.output
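With the extractor frozen, the only weights left to train are those of the Dense head. A quick count, assuming the feature vector of this MobileNet V2 module is 1280 wide (an assumption about the module, not something shown in the output above):

```python
def dense_param_count(input_dim, units):
    """Weights plus biases of a fully connected layer."""
    return input_dim * units + units

FEATURE_DIM = 1280   # assumption: feature-vector width of this MobileNet V2 module
NUM_CLASSES = 5      # flower classes reported by flow_from_directory

print(dense_param_count(FEATURE_DIM, NUM_CLASSES))   # 6405
```

A few thousand trainable parameters is a small problem, which fits the point that a few thousand flower photos are enough for this strategy. `model.summary()` shows the actual trainable and non-trainable counts.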

# Compile and train the model
# Multi-class classification, so the loss is categorical_crossentropy
# The generator feeds the flower data in batches of 64
model.compile(optimizer=tf.keras.optimizers.Adam(), 
              loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(flowers_data, epochs=30)
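What `categorical_crossentropy` computes for one sample is the negative log of the probability assigned to the true class. Here is a worked sketch with hypothetical softmax outputs over the 5 flower classes:

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 0.0, 1.0, 0.0, 0.0]           # one-hot label, e.g. 'roses' (index 2)
confident = [0.01, 0.01, 0.96, 0.01, 0.01]   # hypothetical softmax outputs
unsure = [0.20, 0.20, 0.20, 0.20, 0.20]

print(round(categorical_crossentropy(y_true, confident), 4))   # 0.0408
print(round(categorical_crossentropy(y_true, unsure), 4))      # 1.6094
```

The uniform guess costs ln(5) ≈ 1.6094, so a training loss near that value means the head has not yet learned anything about the 5 classes.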

Evaluating the model fine-tuned for flower classification

# The output is a probability distribution per image
y_pred = model.predict(input_batch)
# Convert it to the index of the predicted class
y_pred = np.argmax(y_pred, axis=-1)
# Index of the true class
y_true = np.argmax(label_batch, axis=-1)

# Accuracy: 100 % (on input_batch, which comes from the same generator used for training)
f"{sum(y_pred == y_true) / len(y_true) * 100:.2f} %"

# Visualize the predictions (green = correct, red = wrong)
plt.figure(figsize=(10,9))
plt.subplots_adjust(hspace=0.5)

for i in range(64):
  plt.subplot(8, 8, i+1)
  img = ((input_batch[i]+1)*127.5).astype(np.uint8)
  plt.imshow(img)
  color = "green" if y_pred[i] == y_true[i] else "red"
  plt.title(class_names[y_pred[i]], color=color)
  plt.axis('off')
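Because `input_batch` comes from `flowers_data`, the same generator that was passed to `model.fit`, the accuracy is measured on images the model has already seen. A held-out split gives a more honest number. Below is a minimal plain-Python sketch of the idea; `holdout_split` and `accuracy` are hypothetical helpers, and in Keras a comparable split is available through the `validation_split` argument of `ImageDataGenerator` together with `subset='training'` / `subset='validation'` in `flow_from_directory`.

```python
import random

def holdout_split(items, validation_fraction=0.2, seed=0):
    """Shuffle once, then carve off the first part as a validation set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]

def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

image_ids = list(range(3670))                 # the flower set has 3670 images
train_ids, val_ids = holdout_split(image_ids)
print(len(train_ids), len(val_ids))           # 2936 734

print(accuracy([0, 1, 2, 3], [0, 1, 2, 4]))   # 0.75
```

The model is trained on `train_ids` only and scored on `val_ids`, so no image contributes to both.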