How to use TensorDataset, DataLoader (PyTorch)

Release date: 2023/3/11


■Description

TensorDataset and DataLoader are classes in torch.utils.data. They are mainly used to bundle input data with correct answer (teacher) data and to feed that data to a machine learning model in batches.

■Description of TensorDataset

TensorDataset bundles the input data and the correct answer data into one dataset, pairing them row by row.

import numpy as np
import torch
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

input = np.random.rand(4, 2) # Input data: 4 samples, 2 features each (note: this name shadows Python's built-in input())
correct = np.random.rand(4, 1) # Correct answer data: 4 samples, 1 value each

input = torch.FloatTensor(input) # Convert to a float32 tensor that PyTorch can handle
correct = torch.FloatTensor(correct) # Same as above

print(input)

⇒ tensor([[0.7752, 0.9332],
              [0.5186, 0.1956],
              [0.1267, 0.1171],
              [0.3495, 0.5235]])

print(correct)

⇒ tensor([[0.2506],
              [0.9407],
              [0.9416],
              [0.8879]])
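The conversion step above can also be written with torch.from_numpy or torch.tensor. The following is a minimal sketch, not the article's own code; the names arr, shared, and copied are made up for illustration. It shows that np.random.rand produces float64 values, so an explicit cast is needed to get the float32 tensors that torch.FloatTensor returns.

```python
import numpy as np
import torch

# Hypothetical array standing in for the article's random input data
arr = np.random.rand(4, 2)  # NumPy creates float64 by default

# from_numpy keeps the NumPy dtype (float64) and shares memory with arr
shared = torch.from_numpy(arr)

# torch.tensor with an explicit dtype makes an independent float32 copy
copied = torch.tensor(arr, dtype=torch.float32)

# .float() is another way to cast an existing tensor to float32
casted = shared.float()

print(shared.dtype, copied.dtype, casted.dtype)
```

Most PyTorch layers create float32 parameters by default, so casting the data to float32 before building the dataset avoids dtype mismatch errors later.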

dataset = TensorDataset(input, correct) # Bundle the input and correct answer tensors into one dataset
print(vars(dataset)) # vars() returns the object's attributes as a dict

⇒ {'tensors': (tensor([[0.7752, 0.9332], # Input data
                       [0.5186, 0.1956],
                       [0.1267, 0.1171],
                       [0.3495, 0.5235]]),
               tensor([[0.2506], # Correct answer data
                       [0.9407],
                       [0.9416],
                       [0.8879]]))}
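Besides inspecting the object with vars, a TensorDataset can be indexed and measured directly. The sketch below uses fixed values instead of the random data above so that the results are reproducible; the names inputs, targets, x0, and t0 are chosen for illustration only.

```python
import torch
from torch.utils.data import TensorDataset

# Fixed example values (the article itself uses random data)
inputs = torch.tensor([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])
targets = torch.tensor([[1.0], [2.0], [3.0], [4.0]])

# All tensors must have the same size in the first dimension
dataset = TensorDataset(inputs, targets)

n_samples = len(dataset)  # number of samples (rows)
x0, t0 = dataset[0]       # indexing returns one (input, target) tuple

print(n_samples)            # 4
print(x0.shape, t0.shape)   # torch.Size([2]) torch.Size([1])
```

Each index returns the matching rows of every tensor, which is what lets DataLoader assemble inputs and correct answers into batches together.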

■Description of DataLoader

DataLoader reads out the input data and correct answer data stored in the dataset above. The batch_size argument specifies how many samples are read at one time.

train_load = DataLoader(dataset, batch_size=2, shuffle=False) # Set shuffle=True to read the samples in random order

for x, t in train_load:
    print(x)
    print(t)

⇒ tensor([[0.7752, 0.9332], # First read of x
          [0.5186, 0.1956]])
  tensor([[0.2506], # First read of t
          [0.9407]])

  tensor([[0.1267, 0.1171], # Second read of x
          [0.3495, 0.5235]])
  tensor([[0.9416], # Second read of t
          [0.8879]])
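In the example above the 4 samples divide evenly into batches of 2. As a rough sketch of what happens otherwise, the code below uses a made-up dataset of 5 samples (not the article's data) and compares the default behavior with the drop_last and shuffle options of DataLoader.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical dataset with 5 samples, so batch_size=2 does not divide evenly
inputs = torch.arange(10, dtype=torch.float32).reshape(5, 2)
targets = torch.arange(5, dtype=torch.float32).reshape(5, 1)
dataset = TensorDataset(inputs, targets)

# Default: the final batch holds the single leftover sample
loader = DataLoader(dataset, batch_size=2, shuffle=False)
batch_sizes = [x.shape[0] for x, t in loader]

# drop_last=True discards the incomplete final batch
loader_drop = DataLoader(dataset, batch_size=2, shuffle=False, drop_last=True)
batch_sizes_drop = [x.shape[0] for x, t in loader_drop]

# shuffle=True changes the order, but every sample still appears once per pass
gen = torch.Generator().manual_seed(0)  # seeded only to make the order repeatable
loader_shuffled = DataLoader(dataset, batch_size=2, shuffle=True, generator=gen)
seen = sorted(v.item() for _, t in loader_shuffled for v in t.flatten())

print(batch_sizes)       # [2, 2, 1]
print(batch_sizes_drop)  # [2, 2]
print(seen)              # [0.0, 1.0, 2.0, 3.0, 4.0]
```

Shuffling is typically turned on for training data and left off for evaluation data, where the order should stay fixed.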
