[TensorFlow] 텐서

4 minute read

기초

텐서 생성하기

TensorFlow 임포트

import tensorflow as tf
import numpy as np

기본 텐서 생성하기

스칼라: 단일, 순위0, 축 없음
벡터: 순위1, 축 1개
행렬: 순위2, 축 2개
n개의 축 가능

rank_0_tensor = tf.constant(4)    # 스칼라
rank_1_tensor = tf.constant([2.0, 3.0, 4.0])    # 벡터
rank_2_tensor = tf.constant([[1, 2], [3, 4], [5, 6]], dtype = tf.float16)   # 행렬
rank_3_tensor = tf.constant([
  [[0, 1, 2, 3, 4],
   [5, 6, 7, 8, 9]],
  [[10, 11, 12, 13, 14],
   [15, 16, 17, 18, 19]],
  [[20, 21, 22, 23, 24],
   [25, 26, 27, 28, 29]],])

복소수, 문자열 가능
기본 tf.Tensor 클래스에서는 텐서가 직사각형 이어야 함
다양한 형상을 처리하는 특수 유형의 텐서: 비정형 텐서(RaggedTensor), 희소 텐서(SparseTensor)

텐서를 numpy배열로 변환

np.array(rank_2_tensor)     # 첫 번째 방법
rank_2_tensor.numpy()       # 두 번째 방법

텐서에 대한 기본 산술

a = tf.constant([[1, 2], [3, 4]])
b = tf.constant([[5, 6], [7, 8]])

# 표현법1
print(tf.add(a, b), "\n")       # 동일한 위치에 있는 원소의 합
print(tf.multiply(a, b), "\n")  # 동일한 위치에 있는 원소의 곱
print(tf.matmul(a, b), "\n")    # 행렬의 곱

print(a + b, "\n")    # 동일한 위치에 있는 원소의 합
print(a * b, "\n")    # 동일한 위치에 있는 원소의 곱
print(a @ b, "\n")    # 행렬의 곱

c = tf.constant([[4.0, 5.0], [10.0, 1.0]])

print(tf.reduce_max(c))     # 변수에서 가장 큰 값을 탐색
print(tf.math.argmax(c))    # 가장 큰 값의 인덱스 값을 찾음
print(tf.nn.softmax(c))     # 

softmax

형상 정보

텐서 관련 용어

형상: 텐서의 각 차원의 길이(요소의 수)
순위: 텐서 축의 수
축/차원: 텐서의 특정 차원
크기: 텐서의 총 항목 수, 형상 요소의 곱

텐서 관련 출력

rank_4_tensor = tf.zeros([3, 2, 4, 5])    # 각 축을 전부 0으로 설정

print("Type of every element: ", rank_4_tensor.dtype)   # 데이터 타입 확인

# ndim, shape(속성)는 텐서 객체를 반환하지 않음
print("Number of axes:", rank_4_tensor.ndim)      # 배열의 차원 확인            
print("Shape of tensor:", rank_4_tensor.shape)

print("Elements along axis 0 of tensor:", rank_4_tensor.shape[0])
print("Elements along the last axis of tensor:", rank_4_tensor.shape[-1])
print("Total number of elements (3*2*4*5): ", tf.size(rank_4_tensor).numpy())

텐서 객체가 필요한 경우

tf.rank(rank_4_tensor)
tf.shape(rank_4_tensor)

인덱싱

텐서에서 인덱싱 하는 방법

표준 Python 인덱싱 규칙을 따름

단일 축 인덱싱

rank_1_tensor = tf.constant([0, 1, 1, 2, 3, 5, 8, 13, 21, 34])

print(rank_1_tensor)                        # 텐서 자체를 출력
print(rank_1_tensor.numpy())                # 텐서의 값을 출력
print("First: ", rank_1_tensor[0].numpy())  # 스칼라 사용 인덱싱 -> 축 제거
print("Slice from 2, before 7: ", rank_1_tensor[2:7].numpy()) # ':' 이용 슬라이싱 -> 축 유지
print("Jump item: ", rank_1_tensor[::2].numpy())    # 인덱스 2를 간격으로 뜀
print("Reversed: ", rank_1_tensor[::-1].numpy())    # 반대로 뒤집기

다중 축 인덱싱

print(rank_2_tensor.numpy())

print(rank_2_tensor[1, 1].numpy())    # 각 축에 정수 전잘 시 결과는 스칼라
print("Second row:", rank_2_tensor[1, :].numpy())     # 1번째 행 전부 출력
print("Second column:", rank_2_tensor[:, 1].numpy())  # 1번째 열 전부 출력
print("Skip the first row:")
print(rank_2_tensor[1:, :].numpy(), "\n")             # 0번째 행 제외 전부 출력(1번째 행부터 끝까지 출력)

여러 인덱스 사용
단일 축의 경우에서와 같은 규칙이 각 축에 독립적으로 작용

형상 조작하기

새로운 형상으로 변환하기

tf.reshape(tensor, shape, name=None)

새로운 형상으로 변환하기(1)

x = tf.constant([[1], [2], [3]])
print(x.shape.as_list())    # 파이썬 배열로 반환 가능
reshaped = tf.reshape(x, [1, 3])
print(x.shape)
print(reshaped.shape)

새로운 형상으로 변환하기(2)

print(rank_3_tensor)
print(tf.reshape(rank_3_tensor, [-1]))        # -1 일렬로 텐서를 나열
print(tf.reshape(rank_3_tensor, [2, 3, 5]))     # 원하는 shape 로 모양 변형

rank_3_tensor은 현재 325 = 30임
reshape를 할때 곱해서 도합 30이 나오게끔 shape 설정([1, 30], [2, 15], [2, 3, 5] 등)

Dtypes에 대한 추가 정보
tf.Tensor의 데이터 유형을 검사하려면, Tensor.dtype 속성을 사용 Python 객체에서 tf.Tensor를 만들 때 선택적으로 데이터 유형을 지정할 수 있음

Python 정수 -> tf.int32

Python 부동 소수점 -> tf.float32

브로드캐스팅
특정 조건에서 작은 텐서는 결합된 연산을 실행할 때 더 큰 텐서에 맞게 자동으로 ‘확장’을 함

브로드캐스팅 한 값이 무엇인지 미리 확인 가능: .broadcast_to

비정형 텐서

비정형 텐서란?

각 축이 다양한 수요를 가진 텐서

ragged_list = [
    [0, 1, 2, 3],
    [4, 5],
    [6, 7, 8],
    [9]]

# rag = tf.constant(ragged_list)        # 정규텐서로 변환 불가
rag = tf.ragged.constant(ragged_list)   # ragged 사용할 것

print(rag)

비정형 텐서는 바로 정형 텐서로 변형 불가, ragged사용

문자열 텐서

문자영 텐서에 대해

tf.string은 dtype

문자열은 원자성이므로 Python 문자열과 같은 방식으로 인덱싱할 수 없음
문자열을 조작하는 함수는 tf.strings 참고

문자열 텐서 다루기

# 출력 결과에 나오는 b는 tf.string dtype이 유니코드 문자열이 아닌 바이트 문자열임을 나타냄
scalar_string_tensor = tf.constant("Gray wolf")
print(scalar_string_tensor)
print(scalar_string_tensor.numpy())

# tf.strings를 이용한 문자열 조작
print(tf.strings.split(scalar_string_tensor, sep=" "))  # 공백을 기준으로 분리

# 문자열로 되어있는 숫자를 숫자로 변환
# tf.string.to_number
text = tf.constant("1 10 100")
print(tf.strings.to_number(tf.strings.split(text, " ")))

접두사 b는 문자가 유니코드가 아닌 바이트 코드라는 것을 의미
sep를 통해 문자열 분리 가능

문자열을 바이트로 변환

byte_strings = tf.strings.bytes_split(tf.constant("Duck"))  # 각 바이트로 쪼갬
byte_ints = tf.io.decode_raw(tf.constant("Duck"), tf.uint8) # 바이트로 쪼갠 후 유니코드로 변환
print("Byte strings:", byte_strings)
print("Bytes:", byte_ints)

tf.string dtype: TensorFlow의 모든 원시 바이트 데이터에 사용
tf.io 모듈: 이미지 디코딩, csv 구문 분석 등 데이터를 바이트로 변환하거나 바이트에서 변환하는 함수가 포함

희소 텐서

희소 텐서란?

매우 넓은 공간에 데이터가 희소한 경우

tf.sparse.SparseTensor 및 관련 연산을 지원하여 희소 데이터를 효율적으로 저장

희소 텐서 예제

# Sparse tensors store values by index in a memory-efficient manner
sparse_tensor = tf.sparse.SparseTensor(indices=[[0, 0], [1, 2]],
                                       values=[24, 3.5],
                                       dense_shape=[5, 6])
print(sparse_tensor, "\n")

# You can convert sparse tensors to dense
print(tf.sparse.to_dense(sparse_tensor))

indices: 어떤 인덱스가 0이 아닌 원소인지 나타냄
values: indices의 인덱스 순서대로 값을 나타냄(indices의 인덱스 개수와 ㅍvalues의 원소 개수와 동일)
dense_shape: 공간의 크기를 나타냄

Link

TensorFlow 텐서 소개

Share on

Twitter Facebook LinkedIn

Min Young Choi

[TensorFlow] 텐서

기초

텐서 생성하기

TensorFlow 임포트

기본 텐서 생성하기

텐서를 numpy배열로 변환

텐서에 대한 기본 산술

형상 정보

텐서 관련 용어

텐서 관련 출력

텐서 객체가 필요한 경우

인덱싱

텐서에서 인덱싱 하는 방법

단일 축 인덱싱

다중 축 인덱싱

형상 조작하기

새로운 형상으로 변환하기

새로운 형상으로 변환하기(1)

새로운 형상으로 변환하기(2)

비정형 텐서

비정형 텐서란?

문자열 텐서

문자영 텐서에 대해

문자열 텐서 다루기

문자열을 바이트로 변환

희소 텐서

희소 텐서란?

희소 텐서 예제

Link

Share on

You may also enjoy

Attention Is All You Need

Testing NLP cate

[Knowledge Graphs] 지식 그래프의 핵심 개념

[Knowledge Graphs] 지식 그래프란 무엇인가?