I am trying to plot a pandas.DataFrame, but I am getting a ValueError that I cannot explain. Here is sample code that reproduces the problem:
import pandas as pd
import matplotlib.pyplot as plt
from io import StringIO
import matplotlib.dates as mdates
weekday_fmt = mdates.DateFormatter('%a %H:%M')
test_csv = 'datetime,x1,x2,x3,x4,x5,x6\n' \
'2021-12-06 00:00:00,8,42,14,23,12,2\n' \
'2021-12-06 00:15:00,17,86,68,86,92,45\n' \
'2021-12-06 00:30:00,44,49,81,26,2,95\n' \
'2021-12-06 00:45:00,35,78,33,18,80,67'
test_df = pd.read_csv(StringIO(test_csv), index_col=0)
test_df.index = pd.to_datetime(test_df.index)
plt.figure()
ax = test_df.plot()
ax.set_xlabel('Weekly aggregation')
ax.set_ylabel('y-label')
fig = plt.gcf()
fig.set_size_inches(12.15, 5)
ax.get_legend().remove()
ax.xaxis.set_major_formatter(weekday_fmt) # This and the following line are the ones causing the issues
ax.xaxis.set_minor_formatter(weekday_fmt)
plt.show()
If the two formatter lines are removed, the code runs without errors. With them in place, I get: ValueError: Date ordinal 27312480 converts to 76749-01-12T00:00:00.000000 (using epoch 1970-01-01T00:00:00), but Matplotlib dates must be between year 0001 and 9999.
The reason seems to be that pandas and matplotlib convert datetimes in incompatible ways. This could probably be circumvented by not using the built-in plot function of pandas. Is there another way? Thanks!
My package versions are:
pandas 1.3.4
numpy 1.19.5
matplotlib 3.4.2
python 3.8.10
'2021-12-07 00:00:00,35,78,33,18,80,67') and it works fine. Not sure why; you should probably report this case to the matplotlib mailing list / tracker to make sure this is not a bug.
Use ax = test_df.plot(x_compat=True) to enable compatibility mode for the x axis, i.e. to use Python datetime instead of pandas datetime. That way matplotlib can do its job and format the ticks correctly. See the example in the pandas docs.
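For reference, here is a minimal sketch of the x_compat=True suggestion applied to a trimmed-down version of the question's data. Note that 27312480 / 1440 = 18967, which is the number of days from 1970-01-01 to 2021-12-06. This suggests the default pandas plot places points on a per-minute axis, which the matplotlib DateFormatter then reads as days. The two-column CSV, the Agg backend, and the explicit fig.canvas.draw() call are assumptions made only to keep the example short and runnable without a display; they are not part of the original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
from io import StringIO

# Trimmed-down version of the sample data from the question
test_csv = (
    "datetime,x1,x2\n"
    "2021-12-06 00:00:00,8,42\n"
    "2021-12-06 00:15:00,17,86\n"
    "2021-12-06 00:30:00,44,49\n"
    "2021-12-06 00:45:00,35,78\n"
)
test_df = pd.read_csv(StringIO(test_csv), index_col=0, parse_dates=True)

fig, ax = plt.subplots(figsize=(12.15, 5))

# x_compat=True makes pandas hand plain matplotlib date numbers to the axis
# instead of applying its own time-series tick handling
test_df.plot(ax=ax, x_compat=True, legend=False)

# With matplotlib date numbers on the axis, a matplotlib formatter is safe
ax.xaxis.set_major_formatter(mdates.DateFormatter("%a %H:%M"))
ax.set_xlabel("Weekly aggregation")
ax.set_ylabel("y-label")

# Force a render so the formatter actually runs (plt.show() would do the same)
fig.canvas.draw()
labels = [tick.get_text() for tick in ax.get_xticklabels()]
```

If the formatter and the axis units disagreed, the ValueError from the question would be raised during the draw call. Here the axis limits convert back to dates in 2021, so the formatter receives values it can handle.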