Best way to map text and images while loading the data
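I have a CSV file that looks something like the one in the picture.
I am building a model that takes an image and its corresponding text (df['Content']) as input.
I would like to know the best way to load this data so that:
- the images in df['Image_location'] are loaded into tensors,
- the order of each image relative to its corresponding text is preserved,
- the corresponding labels (df['Sentiment']) are preserved.
Any ideas on how to do this?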
You can try using the tf.data.Dataset API.
Create dummy data:
import numpy
from PIL import Image

# Write two random 64x64 images to disk as stand-ins for the real data.
for i in range(1, 3):
    imarray = numpy.random.rand(64, 64, 3) * 255
    im = Image.fromarray(imarray.astype('uint8')).convert('RGBA')
    im.save('result_image{}.png'.format(i))
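Processing:
import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt

# The image paths assume the dummy images were saved in /content
# (for example, the working directory of a Colab notebook).
df = pd.DataFrame(data={'Location': ['some.txt', 'some-other.txt'],
                        'Content': ['This road was ok', 'This was wonderful'],
                        'Score': [0.0353, -0.341],
                        'Sentiment': ['Neutral', 'Positive'],
                        'Image_location': ['/content/result_image1.png', '/content/result_image2.png']})

# Slicing features and labels together keeps row i of the features
# paired with label i.
features = df[['Content', 'Image_location']]
labels = df['Sentiment']
dataset = tf.data.Dataset.from_tensor_slices((features, labels))

def process_path(x):
    # x is a string tensor holding [content, image_path] for one row.
    content, image_path = x[0], x[1]
    img = tf.io.read_file(image_path)
    img = tf.io.decode_png(img, channels=3)  # decode as RGB, dropping the alpha channel
    return content, img

# map() transforms each element on its own, so the text, image, and label stay aligned.
dataset = dataset.map(lambda x, y: (process_path(x), y))

for x, y in dataset.take(1):
    content = x[0]
    image = x[1]
    print('Content -->', content)
    print('Sentiment -->', y)
    plt.imshow(image.numpy())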
Content --> tf.Tensor(b'This road was ok', shape=(), dtype=string)
Sentiment --> tf.Tensor(b'Neutral', shape=(), dtype=string)
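As the output shows, the label comes through as a raw string. Most models expect numeric labels, so one possible follow-up step is to map each sentiment string to an integer id before building the dataset. The following is only a minimal, library-free sketch of that idea: the `rows` list simply mirrors the example DataFrame, and `build_label_index` is a hypothetical helper name, not part of TensorFlow or pandas.

```python
# Rows mirroring the example DataFrame above (illustrative data only).
rows = [
    {"Content": "This road was ok",
     "Image_location": "/content/result_image1.png",
     "Sentiment": "Neutral"},
    {"Content": "This was wonderful",
     "Image_location": "/content/result_image2.png",
     "Sentiment": "Positive"},
]


def build_label_index(labels):
    """Map each distinct label to an integer id (hypothetical helper).

    Labels are sorted first so the mapping is the same on every run.
    """
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


label_index = build_label_index(row["Sentiment"] for row in rows)

# Encode labels row by row, so label_ids[i] still belongs to rows[i].
label_ids = [label_index[row["Sentiment"]] for row in rows]

print(label_index)
print(label_ids)
```

With a DataFrame, the same mapping could presumably be applied with `df['Sentiment'].map(label_index)` and the result passed as the labels to `from_tensor_slices`, which keeps the row order unchanged.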