2

I am trying to combine date and time columns of a csv file and convert them to timestamp using pandas.

Here is a sample of my csv file when read into a dataframe

Dataframe after reading

Id     Station        Month       Parameter    Date        From       To
1.0    ANANDVIHAR     Dec         ?PM2.5       2015-12-01  ?00:00:00  ?00:59:00

The Following Code:-

df['DateTime'] = df.apply(lambda row: datetime.datetime.strptime(row['Date']+ ':' + row['From'], '%Y.%m.%d:%H:%M:%S'), axis=1)

Is giving the following error:-

Traceback (most recent call last):

  File "project101.py", line 36, in <module>
    df['DateTime'] = df.apply(lambda row: datetime.datetime.strptime(row['Date']+ ':' + row['From'], '%Y.%m.%d:%H:%M:%S'), axis=1)

  File "c:\Python27\lib\site-packages\pandas\core\frame.py", line 4133, in apply
    return self._apply_standard(f, axis, reduce=reduce)

 File "c:\Python27\lib\site-packages\pandas\core\frame.py", line 4229, in _apply_standard
    results[i] = func(v)

  File "project101.py", line 36, in <lambda>
    df['DateTime'] = df.apply(lambda row: datetime.datetime.strptime(row['Date']+ ':' + row['From'], '%Y.%m.%d:%H:%M:%S'), axis=1)

  File "c:\Python27\lib\_strptime.py", line 332, in _strptime
    (data_string, format))

ValueError: ("time data '2015-12-01:\\xa000:00:00' does not match format '%Y.%m.%d:%H:%M:%S'", u'occurred at index 0')
1
  • Your question looks all messed up. Please use a correct format. Commented Nov 23, 2016 at 17:25

3 Answers 3

1

You can simply do:

df['DateTime'] = pd.to_datetime(df['Date'].str.cat(df['From'], sep=" "),
                                format='%Y-%m-%d \\xa%H:%M:%S', errors='coerce')

The '\\xa' in the format specifier will take care of the question marks. Those marks are for misinterpreted literal, which probably looks like '\\xa'

Sign up to request clarification or add additional context in comments.

2 Comments

I tried the solution but all the values in the DateTime column are "NaT"
Thank you kartik I got the solution finally
0

You can use pandas.Series.str.cat function.

Following code gives you a basic idea about this:

>>> Series(['a', 'b', 'c']).str.cat(['A', 'B', 'C'], sep=',')
0    a,A
1    b,B
2    c,C
dtype: object

For more information, please check this:

http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.Series.str.cat.html

Hope this solves your problem...

2 Comments

Can you see the image of the printed dataframe . The From and To columns of date contains question marks before data. How can I remove them
Thanks sansingh the pandas reference helped
0

I finally got a solution I stripped the question marks before the date column and applied to_datetime() to the column of the dataframe

df['From'] = df['From'].map(lambda x: str(x)[1:])
df['FromTime'] = pd.to_datetime(df['Date'].str.cat(df['From'], sep=" "),format='%Y-%m-%d %H:%M:%S', errors='coerce')

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.