0

I'm not sure what's going on here, but when I try to do a scatter plot with a dataframe that has the index set to datetimes, I get a much wider range of dates in the plot for the x-axis. Here's an example:

import matplotlib.pyplot as plt
import pandas as pd

datetimes = ['2020-01-01 01:00:00', '2020-01-01 01:00:05',
             '2020-01-01 01:00:10', '2020-01-01 01:00:15',
             '2020-01-01 01:00:20', '2020-01-01 01:00:25',
             '2020-01-01 01:00:30', '2020-01-01 01:00:35',
             '2020-01-01 01:00:40', '2020-01-01 01:00:45']
datetimes = pd.to_datetime(datetimes)
values = [1,2,3,4,5,6,7,8,9,10]
df = pd.DataFrame()
df['values'] = values
df = df.set_index(datetimes)

fig, ax = plt.subplots(figsize=(16,9))
ax.scatter(df.index, df.values)
plt.show()

I get this: [image: "This does not plot correctly"]

Yet if I do a plot instead of a scatter

fig, ax = plt.subplots(figsize=(16,9))
ax.plot(df)
plt.show()

I get: [image: "This plots correctly"]

I don't understand why the x-axis has a huge date range on the scatter plot which is not included in the datetime range I gave it. It appears to work correctly using plot but not scatter. I'm guessing I'm missing something obvious here but I haven't had any success googling it. Any insight would be greatly appreciated!

4
  • I could not reproduce your chart with this code. Commented Sep 10, 2020 at 19:37
  • The scatter dots follow exactly the line -- from left bottom to right top -- in the charts I produced with your code. Commented Sep 10, 2020 at 19:41
  • The issue is the x-axis scale of the scatter plot. If you look at the line plot, the axis covers only the relevant time range. Make sure you're on matplotlib 3.3.1, or specify xlim. Commented Sep 10, 2020 at 19:51
  • Thanks! I really should have specified which version of matplotlib I'm using. I have matplotlib 3.0.3 so this is probably an issue with upgrading. Commented Sep 11, 2020 at 19:52
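
Since the comments above hinge on which library versions are installed, here is a minimal sketch for checking them. It only reads the standard `__version__` attributes and assumes both packages are importable in the current environment.

```python
import matplotlib
import pandas as pd

# Print the installed versions so they can be quoted in a question or bug report.
print("matplotlib", matplotlib.__version__)
print("pandas", pd.__version__)
```

The thread reports the stretched x-axis with matplotlib 3.0.3 and correct plots with 3.2.1 and 3.2.2, so an older version number here would be consistent with the problem described.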

3 Answers

2

Your code runs fine on my machine (matplotlib 3.2.2 and pandas 1.0.5). Which versions of matplotlib and pandas are you using?

Try updating your libraries, or set the x-axis limits explicitly:

ax.set_xlim(df.index[0], df.index[-1])
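
For context, a minimal sketch of where that line could go in the full script. It rebuilds the question's data with `pd.date_range` (an assumption made for brevity, the question used an explicit list of strings) and calls `set_xlim` after `scatter`, since the asker reported errors when the call came before the plot call.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same data as in the question: ten points, five seconds apart.
index = pd.date_range("2020-01-01 01:00:00", periods=10,
                      freq=pd.Timedelta(seconds=5))
df = pd.DataFrame({"values": range(1, 11)}, index=index)

fig, ax = plt.subplots(figsize=(16, 9))
ax.scatter(df.index, df["values"])

# Clamp the x-axis to the first and last timestamps, after the scatter call.
ax.set_xlim(df.index[0], df.index[-1])
```

Call `plt.show()` afterwards to display the figure. Note that with limits set exactly to the first and last timestamps, the outermost points sit on the axis edges.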

1 Comment

I'm using matplotlib 3.0.3 and pandas 1.0.3 so maybe upgrading will fix this. Thank you! I should have checked that but I appreciate the info.
1

I slightly shortened your code to:

datetimes = ['2020-01-01 01:00:00', '2020-01-01 01:00:05',
             '2020-01-01 01:00:10', '2020-01-01 01:00:15',
             '2020-01-01 01:00:20', '2020-01-01 01:00:25',
             '2020-01-01 01:00:30', '2020-01-01 01:00:35',
             '2020-01-01 01:00:40', '2020-01-01 01:00:45']
values = [1,2,3,4,5,6,7,8,9,10]
df = pd.DataFrame({'values': values}, index=pd.to_datetime(datetimes))
fig, ax = plt.subplots(figsize=(10,4))
ax.scatter(df.index, df['values'])
plt.show()

but it should not matter.

Another detail is that df.values is the DataFrame attribute that returns the whole frame as a 2-D NumPy array, whereas df['values'] (as I wrote) retrieves just the column of interest as a Series.
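
A small sketch of that difference, using a three-row frame made up for illustration (the variable names `whole_frame` and `column` are mine, not from the question):

```python
import pandas as pd

df = pd.DataFrame({"values": [1, 2, 3]})

whole_frame = df.values   # attribute access: the entire frame as a 2-D NumPy array
column = df["values"]     # item access: the column named "values", as a Series

print(type(whole_frame).__name__, whole_frame.shape)  # ndarray (3, 1)
print(type(column).__name__, column.shape)            # Series (3,)
```

With a single column the two hold the same numbers, which is probably why they seemed interchangeable, but with more columns df.values would return all of them.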

The plot I got is as expected.

Maybe it is a matter of the pandas and/or Matplotlib version. I use pandas 1.0.3 and Matplotlib 3.2.1. If you have older versions, maybe you should upgrade?

Another option: set the x-axis limits manually:

plt.xlim(pd.to_datetime('2020-01-01 00:59:55'),
    pd.to_datetime('2020-01-01 01:00:50'))
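
If you would rather not hard-code the timestamps, the same limits can be derived from the index. This is only a sketch: the five-second margin (`pad`) is my choice to match the hard-coded values above, and the data is rebuilt with `pd.date_range` for brevity.

```python
import matplotlib.pyplot as plt
import pandas as pd

index = pd.date_range("2020-01-01 01:00:00", periods=10,
                      freq=pd.Timedelta(seconds=5))
df = pd.DataFrame({"values": range(1, 11)}, index=index)

fig, ax = plt.subplots(figsize=(10, 4))
ax.scatter(df.index, df["values"])

# Margin on each side; 5 seconds reproduces 00:59:55 and 01:00:50.
pad = pd.Timedelta(seconds=5)
ax.set_xlim(df.index.min() - pad, df.index.max() + pad)
```

Call `plt.show()` afterwards to display the figure.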

2 Comments

I've always used df.values and df['values'] interchangeably in the past and have no idea why I didn't realize they are different. That is super useful info. I'm using matplotlib 3.0.3 and pandas 1.0.3 so I will upgrade and see if that helps. I had some weird errors when using set_xlim if it was called before the plot call but moving it after the plot call fixed that. Thanks for the great info!
General suggestion: Don't give columns the same names as existing pandas attributes. They should differ at least in letter case. E.g. "values" is an attribute of a DataFrame, so use e.g. "Values" (upper-case "V"). But a better solution is to use a different word, like "Val" or "Vals".
1

I don't know the reason without further research, but if you use plt.xlim(df.index[0], df.index[-1]) you can move on.

1 Comment

Yep, that was what I needed. I tried this but had it called right before the plot call and it gave me an odd error. I had previously used ax.set_ylim() before the plot call with no errors but for some reason I needed to move ax.set_xlim() to after the plot call. Thanks for the quick reply!
