4

I'm using the coil-100 dataset which has images of 100 objects, 72 images per object taken from a fixed camera by turning the object 5 degrees per image. Following is the folder structure I'm using:

data/train/obj1/obj01_0.png, obj01_5.png ... obj01_355.png
.
.
data/train/obj85/obj85_0.png, obj85_5.png ... obj85_355.png
.
.
data/test/obj86/obj86_0.ong, obj86_5.png ... obj86_355.png
.
.
data/test/obj100/obj100_0.ong, obj100_5.png ... obj100_355.png

I have used the imageloader and dataloader classes. The train and test datasets loaded properly and I can print the class names.

train_path = 'data/train/'
test_path = 'data/test/'
data_transforms = {
    transforms.Compose([
    transforms.Resize(224, 224),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    ])
}

train_data = torchvision.datasets.ImageFolder(
    root=train_path,
    transform= data_transforms
)
test_data = torchvision.datasets.ImageFolder(
    root = test_path,
    transform = data_transforms
)
train_loader = torch.utils.data.DataLoader(
    train_data,
    batch_size=None,
    num_workers=1,
    shuffle=False
)
test_loader = torch.utils.data.DataLoader(
    test_data,
    batch_size=None,
    num_workers=1,
    shuffle=False
)

print(len(train_data))
print(len(test_data))
classes = train_data.class_to_idx
print("detected classes: ", classes)

In my model I wish to pass every image through pretrained resnet and make a dataset from the output of resnet to feed into a biderectional LSTM. For which I need to access the images by classname and index. for ex. pre_resnet_train_data['obj01'][0] should be obj01_0.png and post_resnet_train_data['obj01'][0] should be the resnet output of obj01_0.png and so on.
I'm a beginner in Pytorch and for the past 2 days, I have read many tutorials and stackoverflow questions about creating a custom dataset class but couldn't figure out how to achieve what I want. please help!

1 Answer 1

5

Assuming you only plan on running resent on the images once and save the output for later use, I suggest you write your own data set, derived from ImageFolder.
Save each resnet output at the same location as the image file with .pth extension.

class MyDataset(torchvision.datasets.ImageFolder):
  def __init__(self, root, transform):
    super(MyDataset, self).__init__(root, transform)

  def __getitem__(self, index):
    # override ImageFolder's method
    """
    Args:
      index (int): Index
    Returns:
      tuple: (sample, resnet, target) where target is class_index of the target class.
    """
    path, target = self.samples[index]
    sample = self.loader(path)
    if self.transform is not None:
      sample = self.transform(sample)
    if self.target_transform is not None:
      target = self.target_transform(target)
    # this is where you load your resnet data
    resnet_path = os.path.join(os.path.splitext(path)[0], '.pth')  # replace image extension with .pth
    resnet = torch.load(resnet_path)  # load the stored features
    return sample, resnet, target
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.