I have a series of dataframes with identical structure that represent results of a simulation for each hour of the year. Each simulation contains results for a series of coordinates (x,y).
Each dataframe is imported from a CSV file that carries its time information only in the file name. Example:
results_YYMMDDHH.csv
contains data such as:
x y a b
0.0 0.0 0.318705 -0.871259
0.1 0.0 -0.937012 0.704270
0.1 0.1 -0.032225 -1.939544
0.0 0.1 -1.874781 -0.033073
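Since the time lives only in the file name, it has to be parsed from there. A minimal sketch, assuming YYMMDDHH means two-digit year, month, day and hour, and that the stamp is the part after the last underscore (the helper name `timestamp_from_name` is made up for illustration):

```python
from pathlib import Path

import pandas as pd


def timestamp_from_name(path):
    """Return the hour encoded in a name such as results_10010110.csv."""
    # Assumption: the stamp is everything after the last underscore in the stem.
    stamp = Path(path).stem.split("_")[-1]
    # %y = two-digit year, %m = month, %d = day, %H = hour (24h clock)
    return pd.to_datetime(stamp, format="%y%m%d%H")


ts = timestamp_from_name("results_10010110.csv")
print(ts)
```

The resulting `Timestamp` could then serve as the level-0 key for that file's rows.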
I would like to create a single MultiIndexed DataFrame (level 0 is time and level 1 is (x, y)), with one block of (x, y) rows for each time step, that would allow me to perform operations such as averages, sums, and maxima across these dataframes using the resampling or groupby methods.
The resulting dataframe should look something like this:
                  x    y    a           b
2010-01-01 10:00  0.0  0.0  0.318705    -0.871259
                  0.1  0.0  -0.934512   0.745270
                  0.1  0.1  -0.0334525  -1.963544
                  0.0  0.1  -1.835781   -0.067573
2010-01-01 11:00  0.0  0.0  0.318705    -0.871259
                  0.1  0.0  -0.923012   0.745670
                  0.1  0.1  -0.035225   -1.963544
                  0.0  0.1  -1.835781   -0.067573
.................
.................
2010-12-01 10:00  0.0  0.0  0.318705    -0.871259
                  0.1  0.0  -0.923012   0.723270
                  0.1  0.1  -0.034225   -1.963234
                  0.0  0.1  -1.835781   -0.067233
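One possible way to get a frame of this shape is to read each file, index it by (x, y), and let `pd.concat` add the timestamp as the outermost level. This is only a sketch under assumptions: the files are comma-separated with a header row `x,y,a,b`, the stamp format is `%y%m%d%H`, and `load_results` is a hypothetical helper name. Note that it yields three index levels (time, x, y) rather than a single (x, y) tuple level.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_results(folder):
    """Stack every results_YYMMDDHH.csv in `folder` into one frame
    indexed by (time, x, y)."""
    frames = {}
    for path in sorted(Path(folder).glob("results_*.csv")):
        stamp = pd.to_datetime(path.stem.split("_")[-1], format="%y%m%d%H")
        # Assumption: comma-separated file with header x,y,a,b
        frames[stamp] = pd.read_csv(path).set_index(["x", "y"])
    # The dict keys become the outermost index level, named "time"
    return pd.concat(frames, names=["time"])


# Demo with two made-up hourly files (placeholder values, not real results)
with tempfile.TemporaryDirectory() as tmp:
    for name, a in [("results_10010110.csv", 1.0), ("results_10010111.csv", 3.0)]:
        demo = pd.DataFrame({"x": [0.0, 0.1], "y": [0.0, 0.0], "a": [a, a], "b": [0.0, 0.0]})
        demo.to_csv(Path(tmp) / name, index=False)
    panel = load_results(tmp)

print(panel)
```

Keeping x and y as separate index levels makes it easy to group by coordinate later; whether reading a full year of hourly files this way fits in memory depends on the grid size.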
You can imagine this for each hour of the year. I would now like to be able to calculate, for example, the average for the whole year or the average for June, as well as other quantities such as the number of hours above a certain threshold or between a minimum and a maximum value. Please bear in mind that the result of any of these operations should itself be a DataFrame. For example, the monthly averages should look like:
         x    y    a     b
2010-01  0.0  0.0  0.45  -0.13
2010-02  0.1  0.0  0.55  -0.87
2010-03  0.1  0.1  0.24  -0.83
2010-04  0.0  0.1  0.11  -0.87
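To make the kind of operation concrete, here is a rough sketch of monthly averages and threshold counts on a frame indexed by (time, x, y). The data are synthetic stand-ins (column `a` holds the month number, column `b` the hour of day), the threshold of 20 is arbitrary, and the level names `time`, `x`, `y` are assumptions. Each result is again a DataFrame.

```python
import pandas as pd

# Synthetic stand-in: hourly rows for Jan-Feb 2010 at two coordinates
times = pd.date_range("2010-01-01", "2010-02-28 23:00", freq="h")
coords = [(0.0, 0.0), (0.1, 0.0)]
index = pd.MultiIndex.from_tuples(
    [(t, x, y) for t in times for x, y in coords], names=["time", "x", "y"]
)
t = index.get_level_values("time")
panel = pd.DataFrame(
    {"a": t.month.to_numpy(dtype=float), "b": t.hour.to_numpy(dtype=float)},
    index=index,
)

# Group by calendar month (labelled by month start) and by coordinate
by_month = [
    pd.Grouper(level="time", freq="MS"),
    pd.Grouper(level="x"),
    pd.Grouper(level="y"),
]

monthly_mean = panel.groupby(by_month).mean()        # average per month and point
hours_above = (panel > 20).groupby(by_month).sum()   # hours above the threshold
whole_period = panel.groupby(level=["x", "y"]).mean()  # average over all hours

print(monthly_mean)
```

Selecting a single month first, for example by filtering on the month of the `time` level, and then grouping by `x` and `y` would give the "average for June" case.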
How do I build this MultiIndexed DataFrame? I picture it as a time series of dataframes.