I have a fairly complex dataframe that looks like this :
import pandas as pd

df = pd.DataFrame({'0': {('Total Number of End Points', '0.01um', '0hr'): 12,
('Total Number of End Points', '0.1um', '0hr'): 8,
('Total Number of End Points', 'Control', '0hr'): 4,
('Total Number of End Points', '0.01um', '24hr'): 18,
('Total Number of End Points', '0.1um', '24hr'): 12,
('Total Number of End Points', 'Control', '24hr'): 6,
('Total Vessel Length', '0.01um', '0hr'): 12,
('Total Vessel Length', '0.1um', '0hr'): 8,
('Total Vessel Length', 'Control', '0hr'): 4,
('Total Vessel Length', '0.01um', '24hr'): 18,
('Total Vessel Length', '0.1um', '24hr'): 12,
('Total Vessel Length', 'Control', '24hr'): 6},
'1': {('Total Number of End Points', '0.01um', '0hr'): 12,
('Total Number of End Points', '0.1um', '0hr'): 8,
('Total Number of End Points', 'Control', '0hr'): 4,
('Total Number of End Points', '0.01um', '24hr'): 18,
('Total Number of End Points', '0.1um', '24hr'): 12,
('Total Number of End Points', 'Control', '24hr'): 6,
('Total Vessel Length', '0.01um', '0hr'): 12,
('Total Vessel Length', '0.1um', '0hr'): 8,
('Total Vessel Length', 'Control', '0hr'): 4,
('Total Vessel Length', '0.01um', '24hr'): 18,
('Total Vessel Length', '0.1um', '24hr'): 12,
('Total Vessel Length', 'Control', '24hr'): 6},
'2': {('Total Number of End Points', '0.01um', '0hr'): 12,
('Total Number of End Points', '0.1um', '0hr'): 8,
('Total Number of End Points', 'Control', '0hr'): 4,
('Total Number of End Points', '0.01um', '24hr'): 18,
('Total Number of End Points', '0.1um', '24hr'): 12,
('Total Number of End Points', 'Control', '24hr'): 6,
('Total Vessel Length', '0.01um', '0hr'): 12,
('Total Vessel Length', '0.1um', '0hr'): 8,
('Total Vessel Length', 'Control', '0hr'): 4,
('Total Vessel Length', '0.01um', '24hr'): 18,
('Total Vessel Length', '0.1um', '24hr'): 12,
('Total Vessel Length', 'Control', '24hr'): 6}})
print(df)
                                          0   1   2
Total Number of End Points 0.01um  0hr   12  12  12
                                   24hr  18  18  18
                           0.1um   0hr    8   8   8
                                   24hr  12  12  12
                           Control 0hr    4   4   4
                                   24hr   6   6   6
Total Vessel Length        0.01um  0hr   12  12  12
                                   24hr  18  18  18
                           0.1um   0hr    8   8   8
                                   24hr  12  12  12
                           Control 0hr    4   4   4
                                   24hr   6   6   6
I'm trying to divide each value by the average, taken across the columns, of the corresponding Control row (the row with the same measurement and time point). I tried the following, but it didn't work:
df2 = df.divide(df.xs('Control', level=1).mean(axis=1), axis='index')
I'm pretty new to Python and pandas, so I tend to think about this problem in MS Excel terms.
If it were in Excel, the formula for A1 ('Total Number of End Points', '0.01um', '0hr', 0) would be :
=A1 / AVERAGE($A$5:$C$5)
B1 ('Total Number of End Points', '0.01um', '0hr', 1) would be :
=B1 / AVERAGE($A$5:$C$5)
and A2 ('Total Number of End Points', '0.01um', '24hr', 0) would be
=A2 / AVERAGE($A$6:$C$6)
The desired result of this example would be :
                                          0   1   2
Total Number of End Points 0.01um  0hr    3   3   3
                                   24hr   3   3   3
                           0.1um   0hr    2   2   2
                                   24hr   2   2   2
                           Control 0hr    1   1   1
                                   24hr   1   1   1
Total Vessel Length        0.01um  0hr    3   3   3
                                   24hr   3   3   3
                           0.1um   0hr    2   2   2
                                   24hr   2   2   2
                           Control 0hr    1   1   1
                                   24hr   1   1   1
Note : There are many indexes and columns in the real data.
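For reference, the Excel logic described above might be sketched in pandas roughly as follows. This is only one possible approach and has not been checked against the real data. It assumes the three index levels are always ordered (measurement, treatment, time) and that every (measurement, time) pair has exactly one 'Control' row. The small frame built here is a stand-in with the same layout as the example, and the names control_mean, keys, divisor and result are made up for illustration.

```python
import pandas as pd

# Stand-in frame with the same index layout and values as the example above.
idx = pd.MultiIndex.from_product(
    [["Total Number of End Points", "Total Vessel Length"],
     ["0.01um", "0.1um", "Control"],
     ["0hr", "24hr"]]
)
values = [12, 18, 8, 12, 4, 6] * 2
df = pd.DataFrame({c: values for c in ["0", "1", "2"]}, index=idx)

# Average across the columns of each Control row.
# xs drops level 1, so the result is indexed by (measurement, time).
control_mean = df.xs("Control", level=1).mean(axis=1)

# For every row of df, build the (measurement, time) key of its Control row
# and look up the matching mean. Assumes control_mean has a unique index.
keys = df.index.droplevel(1)
divisor = control_mean.reindex(keys).to_numpy()

# Divide row-wise; the array is positional, so no index alignment is involved.
result = df.div(divisor, axis=0)
print(result)
```

Passing the divisor as a plain NumPy array sidesteps the alignment between a three-level row index and a two-level Series index, which is where an attempt like the one above can go wrong. Note that the result holds floats (3.0, 2.0, 1.0) rather than integers.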