
I am working on the classic digits example. I want to build my first neural network that predicts the labels of digit images {0,1,2,3,4,5,6,7,8,9}. The first column of train.txt holds the labels, and all the other columns are the features of each sample. I have defined a class to import my data:

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader

class DigitDataset(Dataset):
    """Digit dataset."""

    def __init__(self, file_path, transform=None):
        """
        Args:
            file_path (string): Path to the text file with labels and pixel values.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.data = pd.read_csv(file_path, header=None, sep=" ")
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        if torch.is_tensor(idx):
            idx = idx.tolist()

        labels = self.data.iloc[idx, 0]
        images = self.data.iloc[idx, 1:-1].values.astype(np.uint8).reshape((1, 16, 16))

        if self.transform is not None:
            images = self.transform(images)
        return images, labels

And then I run these commands to split my dataset into batches and to define a model and a loss:

train_dataset = DigitDataset("train.txt")
train_loader = DataLoader(train_dataset, batch_size=64,
                        shuffle=True, num_workers=4)

from torch import nn

# Model creation with neural net Sequential model
model = nn.Sequential(nn.Linear(256, 128),   # layer 1: 256 inputs, 128 outputs
                      nn.ReLU(),             # rectified linear unit activation
                      nn.Linear(128, 64),    # layer 2: 128 inputs, 64 outputs
                      nn.Tanh(),             # hyperbolic tangent activation
                      nn.Linear(64, 10),     # layer 3: 64 inputs, 10 outputs (digits 0-9)
                      nn.LogSoftmax(dim=1))  # log softmax to get log-probabilities
                                             # for the last output unit

# defining the negative log-likelihood loss for calculating loss
criterion = nn.NLLLoss()

images, labels = next(iter(train_loader))
images = images.view(images.shape[0], -1)

logps = model(images) #log probabilities
loss = criterion(logps, labels) #calculate the NLL-loss

And I get this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-2-7f4160c1f086> in <module>
     47 images = images.view(images.shape[0], -1)
     48 
---> 49 logps = model(images) #log probabilities
     50 loss = criterion(logps, labels) #calculate the NLL-loss

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/container.py in forward(self, input)
    115     def forward(self, input):
    116         for module in self:
--> 117             input = module(input)
    118         return input
    119 

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    725             result = self._slow_forward(*input, **kwargs)
    726         else:
--> 727             result = self.forward(*input, **kwargs)
    728         for hook in itertools.chain(
    729                 _global_forward_hooks.values(),

~/anaconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py in forward(self, input)
     91 
     92     def forward(self, input: Tensor) -> Tensor:
---> 93         return F.linear(input, self.weight, self.bias)
     94 
     95     def extra_repr(self) -> str:

~/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py in linear(input, weight, bias)
   1688     if input.dim() == 2 and bias is not None:
   1689         # fused op is marginally faster
-> 1690         ret = torch.addmm(bias, input, weight.t())
   1691     else:
   1692         output = input.matmul(weight.t())

RuntimeError: expected scalar type Float but found Byte

Do you know what is wrong? Thank you for your patience and help!

3 Answers


This line is the cause of your error:

images = self.data.iloc[idx, 1:-1].values.astype(np.uint8).reshape((1, 16, 16))

The images are uint8 (byte), while the neural network needs its inputs as floating point in order to calculate gradients (you can't compute gradients for backprop from integers, as they are discrete and hence non-differentiable).

You can use torchvision.transforms.functional.to_tensor to convert the image to float and scale it into [0, 1] like this:

import torchvision

images = torchvision.transforms.functional.to_tensor(
    self.data.iloc[idx, 1:-1].values.astype(np.uint8).reshape((1, 16, 16))
)

or simply divide by 255 to get values into [0, 1].
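For the divide-by-255 route, a minimal sketch inside the question's __getitem__ (my assumption of where you would put the conversion; any point before the tensor reaches the model works):

# Cast to float32 first, then scale pixel values into [0, 1]
images = self.data.iloc[idx, 1:-1].values.astype(np.float32).reshape((1, 16, 16)) / 255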


4 Comments

It worked, but now I have a problem with the loss function, and I can't apply torchvision.transforms.functional.to_tensor to the labels, because they are not images. Now I get this error: RuntimeError: expected scalar type Long but found Double

Cast it to float via your_tensor.float()

I wrote labels = torch.from_numpy(np.array(self.data.iloc[idx,0])).float() but I keep getting RuntimeError: expected scalar type Long but found Float. I also tried labels.float() and it didn't work.

OK, I fixed it with labels = torch.from_numpy(np.array(self.data.iloc[idx,0])).long(). Thank you very much for your help!
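Pulling the comment thread together, a minimal sketch of a corrected __getitem__ (my assembly of the fixes above; the rest of the class stays as in the question). The linear layers need float32 images, and nn.NLLLoss needs int64 (long) class indices:

def __getitem__(self, idx):
    if torch.is_tensor(idx):
        idx = idx.tolist()

    # NLLLoss expects target class indices as int64 (long)
    labels = torch.from_numpy(np.array(self.data.iloc[idx, 0])).long()
    # The model expects float32 input; scale pixel values into [0, 1]
    images = torch.from_numpy(
        self.data.iloc[idx, 1:-1].values.astype(np.float32).reshape((1, 16, 16)) / 255
    )
    return images, labels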

Here is a simple solution.

Just add .float() to the image tensor, like this:

# Forward Pass
outputs = model(images.float())
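Applied to the snippet in the question, that is:

logps = model(images.float())  # cast the uint8 batch to float32 on the fly
loss = criterion(logps, labels)

Note this casts every batch at forward time; converting once inside the Dataset (as in the accepted answer) keeps the model code unchanged.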


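For reference, here is a complete end-to-end version of the same Sequential/NLLLoss pipeline on torchvision's built-in MNIST dataset. transforms.ToTensor already converts the uint8 images to float in [0, 1], so the dtype problem never comes up: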
import torch
import torchvision
import matplotlib.pyplot as plt
from time import time
from torchvision import datasets, transforms
from torch import nn, optim

transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                                ])

trainset = datasets.MNIST('data/train', download=True, train=True, transform=transform)
valset = datasets.MNIST('data/test', download=True, train=False, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
valloader = torch.utils.data.DataLoader(valset, batch_size=64, shuffle=True)

# Network: 784 inputs -> 128 -> 128 -> 64 -> 10 log-probability outputs
input_size = 784
hidden_sizes = [128,128,64]
output_size = 10

model = nn.Sequential(nn.Linear(input_size, hidden_sizes[0]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[0], hidden_sizes[1]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[1], hidden_sizes[2]),
                      nn.ReLU(),
                      nn.Linear(hidden_sizes[2], output_size),
                      nn.LogSoftmax(dim=1))
# print(model)

criterion = nn.NLLLoss()
images, labels = next(iter(trainloader))
images = images.view(images.shape[0], -1)

logps = model(images) #log probabilities
loss = criterion(logps, labels) #calculate the NLL loss

# Train with SGD plus momentum for 15 epochs
optimizer = optim.SGD(model.parameters(), lr=0.003, momentum=0.9)
time0 = time()
epochs = 15
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten MNIST images into a 784-long vector
        images = images.view(images.shape[0], -1)

        # Training pass
        optimizer.zero_grad()

        output = model(images)
        loss = criterion(output, labels)

        # This is where the model learns by backpropagating
        loss.backward()

        # And optimizes its weights here
        optimizer.step()

        running_loss += loss.item()
    print("Epoch {} - Training loss: {}".format(e, running_loss/len(trainloader)))

print("\nTraining Time (in minutes) =", (time()-time0)/60)

# Show the prediction for a single validation image
images, labels = next(iter(valloader))
img = images[0].view(1, 784)
with torch.no_grad():
    logps = model(img)

ps = torch.exp(logps)
probab = list(ps.numpy()[0])
print("Predicted Digit =", probab.index(max(probab)))
# view_classify(img.view(1, 28, 28), ps)

# Evaluate accuracy over the whole validation set
correct_count, all_count = 0, 0
for images, labels in valloader:
    for i in range(len(labels)):
        img = images[i].view(1, 784)
        with torch.no_grad():
            logps = model(img)

        ps = torch.exp(logps)
        probab = list(ps.numpy()[0])
        pred_label = probab.index(max(probab))
        true_label = labels.numpy()[i]
        if true_label == pred_label:
            correct_count += 1
        all_count += 1

print("Number Of Images Tested =", all_count)
print("\nModel Accuracy =", (correct_count/all_count))
# Save the whole model (architecture plus weights)
torch.save(model, './my_mnist_model.pt')
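To reload it later: torch.save on a whole module pickles it, so it comes back with torch.load (saving model.state_dict() is the more portable pattern, but for this snippet a direct load works):

# Load the pickled model back and switch to eval mode for inference
model = torch.load('./my_mnist_model.pt')
model.eval()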

