8

I have a huge dataframe with more than 100 million rows. It has a date column that unfortunately contains improperly formatted (mixed) date strings.

I converted it to datetime with:

df['TRX_DATE'] = pd.to_datetime(df['TRX_DATE'],coerce=True)
# runs without any error
# Now I want to calculate the weekday from that date column
df['day_type'] = [x.strftime('%A') for x in df['TRX_DATE']]
# ValueError: month out of range

If it were a single field I could manage with the dateutil parser. But in this case I am out of ideas on how to handle it.

I am just interested in whether the weekday conversion line could do something like: if anything is out of range, put in a default...

I have the idea, but as a newbie I don't have enough experience to implement it.

It would be a great help if someone could give a line of code to handle that.

2 Answers

21

I think you can parse with to_datetime and the parameter errors='coerce', which turns unparseable values into NaT, and then use dt.strftime to convert to the weekday as the locale’s full name:

print(df)
              TRX_DATE  some value
0  2010-08-15 13:00:00      27.065
1  2010-08-16 13:10:00      25.610
2  2010-08-17 02:30:00      17.000
3  2010-06-18 02:40:00      17.015
4  2010-18-19 02:50:00      16.910

df['TRX_DATE'] = pd.to_datetime(df['TRX_DATE'],errors='coerce')

df['day_type'] = df['TRX_DATE'].dt.strftime('%A')
print(df)
             TRX_DATE  some value day_type
0 2010-08-15 13:00:00      27.065   Sunday
1 2010-08-16 13:10:00      25.610   Monday
2 2010-08-17 02:30:00      17.000  Tuesday
3 2010-06-18 02:40:00      17.015   Friday
4                 NaT      16.910      NaT
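
The question also asks for a default in place of out-of-range dates. A minimal sketch of one way to do that, assuming a placeholder string "Unknown" (my choice, not something from the question) and a small made-up frame in the same shape as the sample above:

```python
import pandas as pd

# Small illustrative frame; the last row has month 18, which cannot be parsed.
df = pd.DataFrame({
    "TRX_DATE": [
        "2010-08-15 13:00:00",
        "2010-08-16 13:10:00",
        "2010-18-19 02:50:00",
    ],
})

# Unparseable strings become NaT instead of raising.
parsed = pd.to_datetime(df["TRX_DATE"], errors="coerce")

# Keep the weekday name where parsing succeeded, otherwise use the placeholder.
df["day_type"] = parsed.dt.strftime("%A").where(parsed.notna(), "Unknown")

print(df)
```

Using where with parsed.notna() keeps the result aligned with the original rows, so it can be assigned straight back as a column.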

5 Comments

@jezrael - Already tried that, getting the same error. I know there is some data issue, but identifying the flaws in such a huge dataset is difficult for me, so I need to avoid the error somehow.
What is the version of your pandas? print pd.show_versions() Is it possible to share the data?
Hmmm, but the latest version of pandas is now 0.18.0. Is it possible to update pandas?
What does NaT mean?
0
[x.strftime('%A') for x in df['TRX_DATE'] if not pd.isnull(x)]
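
Note that the comprehension above drops the NaT entries, so the resulting list is shorter than the column and cannot be assigned back as df['day_type']. A possible variant that keeps one entry per row is sketched below; the "Unknown" default and the two sample strings are assumptions for illustration only:

```python
import pandas as pd

# Two sample values: one valid date and one with an out-of-range month.
dates = pd.to_datetime(
    pd.Series(["2010-08-17 02:30:00", "2010-18-19 02:50:00"]),
    errors="coerce",
)

# Emit a default for NaT instead of skipping it, so the length is preserved.
day_type = [x.strftime("%A") if pd.notnull(x) else "Unknown" for x in dates]

print(day_type)
```

On 100 million rows a Python-level loop like this will likely be much slower than the vectorized dt.strftime call, so it is probably best kept for small checks.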
