This is a follow-up question to the answer to this question:
pandas performance issue - need help to optimize
The following suggestion for using a MultiIndex works:
import numpy as np
from pandas import DataFrame
df = DataFrame(np.arange(20).reshape(5,4))
df2 = df.set_index(keys=[0,1,2])
df2.ix[(4,5,6)]
So I created a file sample_data.csv that looks like this:
col1,col2,year,amount 
111111,3.5,2012,700 
111112,3.5,2011,600 
222221,4.0,2012,222 
... 
I then ran the following:
import numpy as np 
import pandas as pd 
sd=pd.read_csv('sample_data.csv') 
sd2=sd.set_index(keys=['col2','year']) 
sd2.ix[(4.0,2012)] 
But this produces the following error: IndexError: index out of bounds
Any ideas why it works in the former case but not the latter? This is what the error looks like:
IndexError                                Traceback (most recent call last)
<ipython-input-19-1d72a961db95> in <module>()
----> 1 sd2.ix[(4.0,2012)]
/Library/Python/2.7/site-packages/pandas-0.8.1-py2.7-macosx-10.7-intel.egg/pandas/core/indexing.pyc in __getitem__(self, key)
     31                 pass
     32 
---> 33             return self._getitem_tuple(key)
     34         else:
     35             return self._getitem_axis(key, axis=0)
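For comparison, here is a minimal sketch of the same lookup on a recent pandas release, where `.ix` no longer exists (it was removed in pandas 1.0) and `.loc` is used for label-based access. It does not explain the error under 0.8.1; it only shows how one might check the index level dtypes and confirm that the key is present before selecting it. The data is inlined with `io.StringIO` instead of reading `sample_data.csv`, and it assumes just the three rows shown above (the elided rows are not reproduced). The names `csv_text`, `level_dtypes`, `key`, `key_present`, and `row` are made up for this illustration.

```python
import io

import pandas as pd

# Assumption: only the three rows shown in the question; the elided rows are omitted.
csv_text = """col1,col2,year,amount
111111,3.5,2012,700
111112,3.5,2011,600
222221,4.0,2012,222
"""

sd = pd.read_csv(io.StringIO(csv_text))

# Build the two-level index and sort it so lookups behave predictably.
sd2 = sd.set_index(["col2", "year"]).sort_index()

# Inspect the dtype of each index level (col2 is parsed as float, year as int).
level_dtypes = [str(level.dtype) for level in sd2.index.levels]

# Check that the full key exists in the index before looking it up.
key = (4.0, 2012)
key_present = key in sd2.index

# .loc takes the place of .ix for label-based lookup; ":" selects all columns.
row = sd2.loc[key, :]

print(level_dtypes, key_present)
print(row)
```

If `key_present` comes back `False` on the real file, the index values probably differ from what the key assumes (for example in dtype), which would be worth checking before suspecting the lookup itself.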
