Image Augmentation #1

Loft_mind 2025. 3. 11. 20:07

2025.03.11

Chapter 7. Autoencoders, Image Augmentation

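51. Image Augmentation

TensorFlow Image Augmentation

Image data is not always sufficient. Once a reasonable amount has been collected, we organize it and augment it to fill in what is missing.

We then analyze the results and augment again wherever the data is still lacking.

In practice, almost any analysis or prediction task runs into insufficient data and possible imbalance across classes/labels, which is why augmentation is used so often.

OpenCV is said to be the most widely used approach, but this post covers other augmentation methods.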

Imports and a first look at the data

## Import the basic libraries

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import tensorflow_datasets as tfds

from tensorflow.keras import layers
from tensorflow.keras.datasets import mnist

Load the required data from tfds.

## Load the data

(train_ds, val_ds, test_ds), metadata = tfds.load(
    'tf_flowers',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],

    # Also return the dataset metadata.
    # metadata holds information such as the number of classes, the features, and the sample counts.
    with_info=True,

    # Return the data as (image, label) pairs.
    # By default tfds.load() returns dictionaries;
    # with as_supervised=True each element is loaded as (x, y).
    # x: image (Tensor)
    # y: integer label (class ID)
    as_supervised=True
)

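As mentioned earlier, the downloaded data is cached locally. Note that tfds.load() writes to ~/tensorflow_datasets by default (the data_dir argument changes this); ~/.keras/datasets is where tf.keras.datasets keeps its files.

The loaded data can be inspected as follows.

# Number of classes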

print(metadata.features)
print('='*50)
print(metadata.features['label'].num_classes)

# Use take() to fetch images and their labels

## Inspect the data
plt.figure(figsize=(6, 4))
n = 1
for image, label in train_ds.take(4):
    plt.subplot(2, 2, n)
    plt.imshow(image)
    plt.axis('off')
    plt.title(f'label : {label}')
    n += 1

plt.tight_layout()
plt.show()

Check the total number of examples

# Total number of examples
metadata.splits['train']

------------------------------
> <SplitInfo num_examples=3670, num_shards=2>
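With 3,670 examples in the train split, the three slices requested in tfds.load() work out as below. This is a plain-Python sketch of the percentage arithmetic only; split_sizes is a hypothetical helper, not part of tfds, and tfds applies its own rounding when a boundary does not fall on a whole example (here both boundaries are exact).

```python
def split_sizes(total, boundaries_pct):
    """Return the size of each slice given percentage boundaries, e.g. [80, 90]."""
    edges = [0] + [total * p // 100 for p in boundaries_pct] + [total]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


# 'train[:80%]', 'train[80%:90%]', 'train[90%:]' on 3670 examples
sizes = split_sizes(3670, [80, 90])
print(sizes)  # [2936, 367, 367]
```

So roughly 2,900 images are available for training and about 370 each for validation and test.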

# Check the class names
print(metadata.features['label'].names)

# or

print('\n')

get_name_label = metadata.features['label'].int2str
name_list = [get_name_label(n) for n in range(5)]
print(name_list)

You can also pull out one example at a time, as below.

## Another way to inspect the data

get_label_name = metadata.features['label'].int2str

image, label = next(iter(train_ds))
_ = plt.imshow(image)
_ = plt.title(get_label_name(label))

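Additional notes: next() and iter()

iter(train_ds)
- train_ds is a tf.data.Dataset object; iter() turns it into an iterator.
- An iterator is an object that hands out items one at a time.

next(iter(train_ds))
- next() fetches the next item from an iterator; on a freshly created iterator that is the first sample.
- In other words, it returns the first (image, label) pair of train_ds.
- image → image tensor
- label → integer class label (0-4)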
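The same protocol can be tried on an ordinary Python list. The sketch below uses made-up (name, label) tuples as stand-ins for real (image, label) tensors; an unshuffled tf.data.Dataset should behave the same way, while a shuffled one may return a different first element for each new iterator.

```python
# Stand-ins for (image, label) pairs; real elements would be tensors
samples = [("image_0", 2), ("image_1", 3), ("image_2", 0)]

first = next(iter(samples))        # new iterator -> first element
first_again = next(iter(samples))  # another new iterator -> first element again

it = iter(samples)                 # reuse one iterator to advance through the data
a = next(it)
b = next(it)

print(first, first_again, a, b)
```

Calling next(iter(...)) repeatedly therefore keeps returning the first sample; to walk through the data, keep a single iterator or use a for loop.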

Data preprocessing

First, adjust the size and value range of the images.

IMAGE_SIZE = 180

# layers.Resizing and layers.Rescaling are the non-experimental names (TF 2.6+)
resized_and_rescale = tf.keras.Sequential([
    layers.Resizing(IMAGE_SIZE, IMAGE_SIZE),
    layers.Rescaling(1./255)
])

# Resize each image to 180 x 180 and
# scale the RGB value of every pixel to between 0 and 1 (normalization)

## Check the pixel value range
result = resized_and_rescale(image)
print('Min and Max pixel values : ', result.numpy().min(), result.numpy().max())

>> Min and Max pixel values :  0.0 1.0
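What Rescaling(1./255) computes is simply inputs * scale + offset. The NumPy sketch below reproduces that arithmetic on three hand-picked pixel values; rescale is a hypothetical helper written for this post, not a Keras function.

```python
import numpy as np


def rescale(pixels, scale=1.0 / 255, offset=0.0):
    """Mimic the arithmetic of a rescaling layer: pixels * scale + offset."""
    return pixels.astype(np.float32) * scale + offset


raw = np.array([0, 51, 255], dtype=np.uint8)

unit_range = rescale(raw)                    # maps 0..255 to 0..1
signed_range = rescale(raw, 1.0 / 127.5, -1)  # maps 0..255 to -1..1

print(unit_range)
print(signed_range)
```

The second call shows the other common choice, scale=1/127.5 with offset=-1, for models that expect inputs in [-1, 1].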

Data augmentation (images)

Image augmentation typically uses methods such as mirroring, random rotation, and random cropping.

data_augmentation = tf.keras.Sequential([
    # Randomly flip left-right and up-down
    layers.RandomFlip('horizontal_and_vertical'),
    layers.RandomRotation(0.2)
])
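RandomRotation takes its factor as a fraction of a full turn (2π), so 0.2 is neither degrees nor radians. A small sketch of the arithmetic, where rotation_range_degrees is a hypothetical helper written for this post:

```python
def rotation_range_degrees(factor):
    """Convert a rotation factor (fraction of a full turn) to a degree range."""
    return (-factor * 360.0, factor * 360.0)


low, high = rotation_range_degrees(0.2)
print(round(low), round(high))  # -72 72
```

A factor of 0.2 therefore lets each image rotate by a random angle of up to about 72 degrees in either direction, which is a fairly aggressive setting.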

print(image.shape)
print(result.shape)

>> (333, 500, 3)
>> (180, 180, 3)


# Add the image to a batch

result = tf.expand_dims(image, 0)
print(result.shape)
>> (1, 333, 500, 3)

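(1) When a batch dimension is needed
Inputs to tf.keras models are expected to carry a batch dimension, and tf.data.Dataset pipelines usually deliver batched data
Preprocessing layers such as Resizing() are designed around the (batch, height, width, channels) format, although Resizing() also accepted the single unbatched image above

(2) Matching the CNN model input
A CNN model (tf.keras.Model) takes input of shape (batch_size, height, width, channels)
expand_dims() turns a single image into a batch of one

Resize, Rescale, Rotation and Flip (data augmentation)

Generate 9 different images from a single photo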

# The loop leaves the second example of the dataset in `image`
for image, label in train_ds.take(2):
    pass

# Resize and normalize (0 to 1)
resize_image = resized_and_rescale(image)

plt.figure(figsize=(10, 10))
for i in range(9):
    # flip and rotation
    aug_image = data_augmentation(resize_image)
    ax = plt.subplot(3, 3, i+1)
    plt.imshow(aug_image)
    plt.axis('off')
plt.show()

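Building augmentation into the model

Data augmentation runs on-device, synchronously with the rest of the layers, and so benefits from GPU acceleration.

Augmentation is applied only during model.fit calls (not during model.evaluate or model.predict).

model = tf.keras.Sequential([
    resized_and_rescale,    # resize and normalize
    data_augmentation,  # rotation and flip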
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D()
])
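The point that augmentation is active only in training mode can be checked on a small synthetic batch. The sketch below assumes a TensorFlow version that exposes tf.keras.layers.RandomFlip (2.6 or later) and passes the training flag by hand, which is what model.fit and model.predict do internally.

```python
import numpy as np
import tensorflow as tf

flip = tf.keras.layers.RandomFlip("horizontal_and_vertical")

# Synthetic batch of two 4x4 RGB "images" with distinct pixel values
batch = np.arange(2 * 4 * 4 * 3, dtype=np.float32).reshape(2, 4, 4, 3)

# Inference mode: the layer passes the input through untouched
inference_out = np.asarray(flip(batch, training=False))
unchanged = np.array_equal(inference_out, batch)

# Training mode: pixels may be rearranged, but no value is created or lost
training_out = np.asarray(flip(batch, training=True))
same_pixels = np.array_equal(np.sort(training_out.ravel()), np.sort(batch.ravel()))

print(unchanged, same_pixels)
```

Because the flip is random, the training-mode output may or may not differ from the input on any given call, so the sketch only checks that the set of pixel values is preserved.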

Or, apply the layers to the dataset instead of putting them in the model:

## Configuration for data augmentation

batch_size = 32
AUTOTUNE = tf.data.AUTOTUNE

data_augmentation = tf.keras.Sequential([
    # Randomly flip left-right and up-down
    layers.RandomFlip('horizontal_and_vertical'),
    layers.RandomRotation(0.2)
])

IMAGE_SIZE = 180

resized_and_rescale = tf.keras.Sequential([
    layers.Resizing(IMAGE_SIZE, IMAGE_SIZE),
    layers.Rescaling(1./255)
])


def prepare(ds, shuffle=False, augment=False):

    # Resize and rescale all datasets
    ds = ds.map(lambda x, y: (resized_and_rescale(x), y),
                num_parallel_calls=AUTOTUNE)

    if shuffle:
        ds = ds.shuffle(1000)

    # Batch all datasets
    ds = ds.batch(batch_size)

    # Use data augmentation only on the training set
    if augment:
        ds = ds.map(lambda x, y: (data_augmentation(x, training=True), y),
                    num_parallel_calls=AUTOTUNE)

    # Use buffered prefetching on all datasets
    return ds.prefetch(buffer_size=AUTOTUNE)

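How map works

map() applies the given function (map_func) to each element of the dataset, either sequentially or in parallel.
It transforms every item and returns a new dataset made up of the transformed items.

AUTOTUNE means automatic tuning: tf.data picks values such as the prefetch buffer size and the number of parallel calls at runtime.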
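The map, batch, and prefetch chain is easier to follow on a dataset small enough to print. The sketch below uses tf.data.Dataset.range as a stand-in for the flower images and a doubling function as a stand-in for the preprocessing layers.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE

ds = tf.data.Dataset.range(6)                               # elements 0..5
ds = ds.map(lambda x: x * 2, num_parallel_calls=AUTOTUNE)   # transform each element
ds = ds.batch(4)                                            # group into batches of up to 4
ds = ds.prefetch(AUTOTUNE)                                  # prepare batches ahead of use

batches = [b.numpy().tolist() for b in ds]
print(batches)  # [[0, 2, 4, 6], [8, 10]]
```

Even with parallel calls, map keeps the element order by default, and the last batch is smaller because 6 is not a multiple of 4.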

Preparing for training

train_ds = prepare(train_ds, shuffle=True, augment=True)
val_ds = prepare(val_ds)
test_ds = prepare(test_ds)

## Build the model

model = tf.keras.Sequential([
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(metadata.features['label'].num_classes)
])

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)

## Train

epochs = 5
history = model.fit(
    train_ds,
    validation_data = val_ds,
    epochs=epochs
)

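Performance is not great, perhaps because the architecture is so simple.

Data augmentation with a Lambda layer

Here is another way to augment data.

## Data augmentation with a Lambda layer

def random_convert_img(x, p=0.5):
    # Invert the pixel values when a uniform random draw falls below p
    if tf.random.uniform([]) < p:
        x = 255 - x
    return x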

def random_invert(factor=0.5):
    return layers.Lambda(lambda x : random_convert_img(x, factor))

random_invert = random_invert()

plt.figure(figsize=(10, 10))

for i in range(9):
    augmentation_image = random_invert(image)
    ax = plt.subplot(3, 3, i+1)
    plt.imshow(augmentation_image.numpy().astype("uint8"))
    plt.axis('off')
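The inversion 255 - x assumes pixel values in the 0 to 255 range, as in the raw images; for images already rescaled to 0 to 1 it would presumably be 1 - x instead. The NumPy sketch below mirrors the logic of random_convert_img so the effect of p can be checked deterministically; random_invert_np is a hypothetical stand-in, not the Lambda layer itself.

```python
import numpy as np


def random_invert_np(image, p=0.5, rng=None):
    """Invert a 0..255 image with probability p."""
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < p:      # the draw lies in [0, 1)
        return 255 - image
    return image


pixels = np.array([[0, 100, 255]], dtype=np.uint8)

always = random_invert_np(pixels, p=1.0)  # every draw is below 1.0, so always inverted
never = random_invert_np(pixels, p=0.0)   # no draw is below 0.0, so never inverted

print(always.tolist(), never.tolist())  # [[255, 155, 0]] [[0, 100, 255]]
```

With the default p=0.5, about half of the nine images in the grid above should come out inverted.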

Note

train_ds was batched by prepare() above, so just in case, reload the raw data and define a comparison helper.

## Load the data

(train_ds, val_ds, test_ds), metadata = tfds.load(
    'tf_flowers',
    split = ['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    
    with_info=True,
    as_supervised=True

)


def visualize(original, augmented):
    fig = plt.figure()
    plt.subplot(1, 2, 1)
    plt.title('Original Image')
    plt.imshow(original)

    plt.subplot(1, 2, 2)
    plt.title('Augmented Image')
    plt.imshow(augmented)

Fetch a sample again and compare...

# The loop leaves the third example in `image` and `label`
for image, label in train_ds.take(3):
    pass
    

flipped = tf.image.flip_left_right(image)
visualize(image, flipped)

There are other operations besides flip_left_right.

grayscaled = tf.image.rgb_to_grayscale(image)
visualize(image, grayscaled)

Changing saturation

saturated = tf.image.adjust_saturation(image, 3)
visualize(image, saturated)

Other options include tf.image.adjust_brightness() for changing brightness and tf.image.central_crop() for cropping the center of an image.
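The tf.image operations mentioned in this section can be tried together on a small synthetic image, which makes their effect on shape and values easy to verify. The sketch below uses a random 8x8 float image as a stand-in for a flower photo; the delta and central_fraction values are arbitrary choices for illustration.

```python
import numpy as np
import tensorflow as tf

# Synthetic 8x8 RGB image with float values in [0, 1)
rng = np.random.default_rng(0)
image = tf.constant(rng.random((8, 8, 3)).astype(np.float32))

flipped = tf.image.flip_left_right(image)                      # mirror along the width axis
gray = tf.image.rgb_to_grayscale(image)                        # one channel: (8, 8, 1)
gray_2d = tf.squeeze(gray, axis=-1)                            # (8, 8), convenient for plt.imshow
brighter = tf.image.adjust_brightness(image, delta=0.2)        # adds 0.2 to every value
cropped = tf.image.central_crop(image, central_fraction=0.5)   # keeps the central half of each side

print(flipped.shape, gray.shape, gray_2d.shape, brighter.shape, cropped.shape)
```

Note that rgb_to_grayscale keeps a trailing channel dimension of size 1, so squeezing it is a safe step before plotting.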