The Wayback Machine - https://web.archive.org/web/20200727144603/https://github.com/pytorch/text/issues/474
TypeError: '<' not supported between instances of 'Example' and 'Example' #474

Open
kidman99 opened this issue Nov 12, 2018 · 7 comments

@kidman99 kidman99 commented Nov 12, 2018

I got this error when running the following code. Do I need something like operator overloading for "<" here, or is there a workaround?

from torchtext.data import TabularDataset
from torchtext import data
from torchtext.vocab import GloVe

# TEXT and LABEL are Field objects defined earlier (definitions not shown in this post).
tv_datafields = [("id", None),  # we won't be needing the id, so we pass in None as the field
                 ("question_text", TEXT),
                 ("target", LABEL)]

trn = TabularDataset.splits(
    path="data/quora",  # the root directory where the data lies
    train='train.csv',
    format='csv',
    skip_header=True,  # if your csv file has a header, pass this so it doesn't get processed as data
    fields=tv_datafields)

TEXT.build_vocab(trn, vectors=GloVe(name='6B', dim=300))


@tu-artem tu-artem commented Jan 15, 2019

.splits() returns a tuple of datasets; in your case it has length 1. So

trn = TabularDataset.splits(
    ...
    ...
    ...
    fields=tv_datafields)[0]

should work here, or you can use the regular TabularDataset constructor instead.
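
As a rough illustration of the point above, here is a pure-Python sketch. `fake_splits` is a hypothetical stand-in, not torchtext's implementation; it only assumes, as described above, that the result is a tuple with one entry per file that was passed.

```python
def fake_splits(train=None, validation=None, test=None):
    """Hypothetical stand-in: return a tuple with one entry per split given."""
    return tuple(name for name in (train, validation, test) if name is not None)


result = fake_splits(train="train.csv")
print(type(result).__name__, len(result))  # still a tuple, even with one split

trn = result[0]           # indexing pulls the single entry out of the tuple
(trn_unpacked,) = result  # one-element unpacking does the same thing
print(trn == trn_unpacked)
```

Either form leaves `trn` bound to the dataset itself rather than to a tuple wrapping it, which is what the later `build_vocab` call in the question expects.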


@cheryllwl cheryllwl commented Jan 21, 2019

I had the same problem with TabularDataset too. This tutorial was helpful:
http://mlexplained.com/2018/02/08/a-comprehensive-tutorial-to-torchtext/
[image]
I added these two lines and it worked like a charm.

Contributor

@mttk mttk commented Jan 31, 2019

Thanks @cheryllwl, this should be documented properly.

@mttk mttk added the docs label Jan 31, 2019

@kunjmehta kunjmehta commented Oct 4, 2019

@tu-artem Can you please elaborate on what adding the index [0] does?
From what I gather, the splits() method returns a Dataset object as a tuple containing Example objects (instances/rows).
So, if I write:
train, val = torchtext.data.TabularDataset.splits(path='./', train="train.csv", test="test.csv", format='csv', fields=data_fields, skip_header=True)
I will get a Dataset object, which is a tuple containing all training instances, in the train variable, and another Dataset object containing all test instances in the val variable. Am I right?
If so, please help me understand what the indexing [0] does. Thanks.


@tu-artem tu-artem commented Oct 4, 2019

@kunjmehta In your case you are already doing tuple unpacking via multiple assignment (train, val = ...), so you don't need any further indexing.
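
A small sketch of that equivalence, using a plain tuple of strings as a stand-in for the value returned by splits(). The strings are placeholders for illustration, not real datasets.

```python
# Placeholder for the tuple returned when two files are passed (assumption).
splits_result = ("train-dataset", "test-dataset")

# Multiple assignment unpacks the tuple, one name per element ...
train, val = splits_result

# ... which is the same as indexing each position by hand.
same_as_indexing = (train is splits_result[0]) and (val is splits_result[1])
print(same_as_indexing)
```

With a single file there is only one element, so there is nothing to unpack into two names, and `[0]` is the shortest way to get at it.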


@aaronbriel aaronbriel commented Jan 25, 2020

What worked for me was to simply add sort=False, as sorting was not needed in my case.


@Sandesh10 Sandesh10 commented Feb 19, 2020

> What worked for me was to simply add sort=False, as sorting was not needed in my case.

This worked for me too. I added sort=False as a parameter in the BucketIterator.
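
For readers wondering why sorting matters here, the sketch below reproduces the same error message in plain Python. The `Example` class is a hypothetical stand-in with no ordering defined, not torchtext's class, and the thread does not include a traceback, so where exactly the library performs the comparison is not confirmed by this sketch.

```python
class Example:
    """Hypothetical stand-in for a row object that defines no ordering."""

    def __init__(self, text):
        self.text = text


examples = [Example("a b c"), Example("a"), Example("a b")]

# Sorting without a key makes Python compare the objects with "<".
try:
    sorted(examples)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Supplying a key avoids comparing the objects directly.
by_length = sorted(examples, key=lambda ex: len(ex.text.split()))
lengths = [len(ex.text.split()) for ex in by_length]
print(lengths)
```

In the thread's terms, passing sort=False corresponds to skipping the sort entirely, while supplying a key corresponds to telling the iterator what to sort by.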
