I have two questions. First, filling in the data at the end of my code triggers the error shown below. Second, since I am not very familiar with pandas, this code is probably quite unidiomatic. If you have any suggestions for making it more compact and efficient, feel free to share them.
The code is supposed to create a crosswalk from x to y. The database may contain the same x<->y relationship several times. However, the mapping should be unique. For every x, I check that the database is actually consistent: if there is more than one relation for that x, they must all map to the same y.
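For reference, the check described above could be sketched roughly as follows. This is only a sketch under assumptions: it uses a current pandas version, inlines a few of the sample rows instead of reading the real file, and reads y as a string (via `dtype`) so that values such as "00000000" keep their leading zeros. The names `csv_text`, `distinct_y`, `conflicts`, and `crosswalk` are illustrative and not part of the original code.

```python
import io
import pandas as pd

# A few sample rows taken from the excerpt; the real data would come from the CSV file.
csv_text = '''x,y
832,"6231"
0,"00000000"
0,"00000000"
840,"6214"
842,"6111"
'''
# Read y as text so leading zeros are not lost to integer parsing.
df = pd.read_csv(io.StringIO(csv_text), dtype={"y": str})

# Count the distinct y values seen for each x.
distinct_y = df.groupby("x")["y"].nunique()

# Any x with more than one distinct y breaks the uniqueness assumption.
conflicts = distinct_y[distinct_y > 1]
if not conflicts.empty:
    print("error! conflicting x values:", list(conflicts.index))

# Collapse repeated rows into a single y per x.
crosswalk = df.drop_duplicates(subset="x").set_index("x")["y"]
```

Grouping by x and counting distinct y values replaces the explicit loop, and `drop_duplicates` keeps the first row for each x.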
Beginning of the crosswalk.csv:
x,y
832,"6231"
0,"00000000"
0,"00000000"
0,"00000000"
0,"00000000"
0,"00000000"
0,"00000000"
840,"6214"
842,"6111"
The code:
data = pd.read_csv('data/crosswalk_short.csv')
df = pd.DataFrame(data)
xs = df.x.unique()
result = pd.DataFrame(index=xs)
result.fillna(NaN)
for x in xs:
    ys = df[df.x == x].y
    range = arange(0, len(ys.index))
    ys = ys.reindex(range)
    if (range[-1] > 0 and not isnan(ys[1]) ):
        print 'error!'
    result._ix[x] = ys[0]
The error:
File "<ipython-input-129-4cf0c04508c4>", line 1, in <module>
result._ix[x] = ys[0]
TypeError: 'NoneType' object does not support item assignment
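The message indicates that `result._ix` evaluated to `None` at that point, so the item assignment had nothing to write into. As a hedged sketch of assigning by label through a public indexer instead, assuming a current pandas version where `.loc` is available, and using a hypothetical column name `"y"` and made-up index values:

```python
import pandas as pd

xs = [832, 0, 840]

# An empty frame with one column; every cell starts out as NaN.
result = pd.DataFrame(index=xs, columns=["y"])

# Label-based assignment through the public .loc indexer.
result.loc[832, "y"] = "6231"
```

Note also that `fillna` returns a new object by default, so calling it without using the return value leaves `result` unchanged.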