147

I read Pandas change timezone for forex DataFrame but I'd like to make the time column of my dataframe timezone naive for interoperability with an sqlite3 database.

The data in my pandas dataframe is already converted to UTC data, but I do not want to have to maintain this UTC timezone information in the database.

Given a sample of the data derived from other sources, it looks like this:

print(type(testdata))
print(testdata)
print(testdata.applymap(type))

gives:

<class 'pandas.core.frame.DataFrame'>
                        time  navd88_ft  station_id  new
0  2018-03-07 01:31:02+00:00  -0.030332          13    5
1  2018-03-07 01:21:02+00:00  -0.121653          13    5
2  2018-03-07 01:26:02+00:00  -0.072945          13    5
3  2018-03-07 01:16:02+00:00  -0.139917          13    5
4  2018-03-07 01:11:02+00:00  -0.152085          13    5
                                     time        navd88_ft     station_id  \
0  <class 'pandas._libs.tslib.Timestamp'>  <class 'float'>  <class 'int'>   
1  <class 'pandas._libs.tslib.Timestamp'>  <class 'float'>  <class 'int'>   
2  <class 'pandas._libs.tslib.Timestamp'>  <class 'float'>  <class 'int'>   
3  <class 'pandas._libs.tslib.Timestamp'>  <class 'float'>  <class 'int'>   
4  <class 'pandas._libs.tslib.Timestamp'>  <class 'float'>  <class 'int'>   

             new  
0  <class 'int'>  
1  <class 'int'>  
2  <class 'int'>  
3  <class 'int'>  
4  <class 'int'>  

but

newstamp = testdata['time'].tz_convert(None)

gives an eventual error:

TypeError: index is not a valid DatetimeIndex or PeriodIndex

What do I do to replace the column with a timezone naive timestamp?

1

5 Answers 5

258

The column must be a datetime dtype, for example after using pd.to_datetime. Then, you can use tz_localize to change the time zone, a naive timestamp corresponds to time zone None:

testdata['time'].dt.tz_localize(None)

Unless the column is an index (DatetimeIndex), the .dt accessor must be used to access pandas datetime functions.

Sign up to request clarification or add additional context in comments.

Comments

15

When your data contains datetimes spanning different timezones or prior and after application of daylight saving time e.g. obtained from postges database with psycopg2, depending on pandas version you might end up in some of the scenarios where best method of conversion is:

testdata['time'].apply(lambda x: x.replace(tzinfo=None))

Scenarios when this works (note the usage of FixedOffsetTimezone with different offset) while usage of .dt.tz_localize(None) does not:

df = pd.DataFrame([
    datetime.datetime(2018, 5, 17, 21, 40, 20, 775854, 
                      tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=120, name=None)),
    datetime.datetime(2021, 3, 17, 14, 36, 13, 902741, 
                      tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=60, name=None))
])

pd.__version__
'0.24.2'


df[0].dt.tz_localize(None)

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py", line 1861, in objects_to_datetime64ns
    values, tz_parsed = conversion.datetime_to_datetime64(data)
  File "pandas/_libs/tslibs/conversion.pyx", line 185, in pandas._libs.tslibs.conversion.datetime_to_datetime64
ValueError: Array must be all same time zone
pd.__version__
'1.1.2'


df[0].dt.tz_localize(None)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/pandas/core/generic.py", line 5132, in __getattr__
    return object.__getattribute__(self, name)
  File "/usr/local/lib/python3.8/site-packages/pandas/core/accessor.py", line 187, in __get__
    accessor_obj = self._accessor(obj)
  File "/usr/local/lib/python3.8/site-packages/pandas/core/indexes/accessors.py", line 480, in __new__
    raise AttributeError("Can only use .dt accessor with datetimelike values")
AttributeError: Can only use .dt accessor with datetimelike values

Comments

11

I know that you mentioned that your timestamps are already in UTC, but just to be defensive, you might as well make your code impervious to the case where timestamps (some or all of them) were in a different timezone. This doesn't cost anything, and will be more robust:

newcol = testdata['time'].dt.tz_convert(None)

As per the docs:

A tz of None will convert to UTC and remove the timezone information.

This is safer than just dropping any timezone the timestamps may contain.

1 Comment

If you use this with TZ-naive data you get TypeError: Cannot convert tz-naive timestamps, use tz_localize to localize If you tz_localize to None it won't convert the TZ-aware data to UTC before it drops the TZ. -- pandas.pydata.org/docs/reference/api/… Neither seem good for mixed TZ-aware and TZ-naive data. (I recently had some TZ-aware data with +00 appended to a set of TZ-naive data in a sqlite3 DB, and the mixed data choked pd.read_sql_query(...) )
3

Here is a function that will

  • find all columns with any instance of pd.Timestamp in them
  • convert those columns to dtype datetime (to be able to use the .dt accessor on the Series')
  • Localize all timestamps with dt.tz_localize(None), which will keep the timeshift relative to UTC
def remove_tz_from_dataframe(df_in):
    df = df_in.copy()
    col_times = [ col for col in df.columns if any([isinstance(x, pd.Timestamp) for x in df[col]])]
    for col in col_times:
        df[col] = pd.to_datetime(
            df[col], infer_datetime_format=True) 
        df[col] = df[col].dt.tz_localize(None) 
    return df

2 Comments

There is an ) missing after pd.Timestamp: any([isinstance(x, pd.Timestamp) for x in df[col]])]
@Valentin I see.. Corrected it now.
0

Years later, the sqlite python module has deprecated the default sqlite conversions and adapters and recommended that users write/register their own.

I adjusted these samples into this adapter and converter:

""" adapter to sqlite and converter from sqlite per https://docs.python.org/3/library/sqlite3.html#sqlite3-adapter-converter-recipes """
def adapt_datetime_iso(val):
    """Adapt datetime.datetime to sqlite timezone-naive ISO 8601 date string.
Example:
id|transUTC|value
22|2025-03-28T07:36:00|-1.9199996471405
"""
    val = val.replace(tzinfo=None)
    return val.isoformat()

def convert_datetime(val):
    """Convert sqlite ISO 8601 datetime string to UTC datetime.datetime object.
Example:

df = pd.read_sql_query('select MAX(transUTC) as "transUTC [datetimeN]", id from mytable where id = ?;', con ,params=(station_id))

    """
    dt = datetime.datetime.fromisoformat(val.decode())
    return dt.replace(tzinfo=datetime.timezone.utc)

# register them
sqlite3.register_adapter(datetime.datetime, adapt_datetime_iso)
sqlite3.register_converter("datetimeN", convert_datetime)

...which allow the sqlite3 database to remain as a clean timezone naive/UTC-assumed field. The converter and adapter enforce the assumption of UTC on conversion from the field to a python tz-aware datetime.datetime, and tha adapt the python timestamp into a timezone naive, UTC conformant field in the database.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.