Currently I have a dictionary that looks something like this:

{'a':[1,2,3,0,0],'b':[1,5,2,1,4], 'c':[1,2,4,12,1]}

I'm trying to create a covariance matrix out of this dictionary. I already have a covariance function defined, so ideally the output would look something like this (with the keys as the labels for the rows and columns):

   a   b   c
a   
b
c

The entry in row i, column j would be the result of calling the covariance function with the value (a vector) of key i and the value (a vector) of key j as its inputs. For example, the entry for row b, column c would be:

covariance([1,5,2,1,4],[1,2,4,12,1])

I'm doing something like this right now to print out all the covariances, but I'd prefer it in matrix form:

keys = list(dictionary.keys())
values = list(dictionary.values())
# Pairs each key with the one before it, so only adjacent pairs are printed.
for counter in range(1, len(values)):
    print('%s & %s: %.2f' % (keys[counter - 1], keys[counter], covariance(values[counter - 1], values[counter])))

which gives me:

a & b: 0.10
b & c: 0.20

but nothing for a & c.

Any help would be greatly appreciated. Thanks.

1 Answer

Sometimes the fantastic NumPy, SciPy and pandas family comes to the rescue. Taking a quick stab at this, you could try something like:

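import pandas as pd
# Each dictionary key becomes a column label.
df = pd.DataFrame({'a':[1,2,3,0,0],'b':[1,5,2,1,4], 'c':[1,2,4,12,1]})
# Named cov_matrix so it does not shadow the asker's covariance() function.
cov_matrix = df.cov()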
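
If the matrix has to come from your own covariance function rather than the built-in one, a minimal sketch might look like the following. The `covariance` function here is only a hypothetical stand-in (sample covariance, dividing by n - 1), since the original definition isn't shown in the question; `data`, `keys` and `matrix` are illustrative names.

```python
import pandas as pd

data = {'a': [1, 2, 3, 0, 0], 'b': [1, 5, 2, 1, 4], 'c': [1, 2, 4, 12, 1]}


def covariance(x, y):
    # Hypothetical stand-in for the asker's function:
    # sample covariance, normalised by n - 1.
    n = len(x)
    mean_x = sum(x) / float(n)
    mean_y = sum(y) / float(n)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


keys = list(data)

# Row i, column j holds covariance(data[i], data[j]); this covers every
# pair of keys, including a & c, which the adjacent-pairs loop misses.
matrix = pd.DataFrame(
    [[covariance(data[i], data[j]) for j in keys] for i in keys],
    index=keys,
    columns=keys,
)
print(matrix)
```

With this particular stand-in the numbers should agree with `df.cov()`, which also normalises by n - 1 by default. A function that divides by n instead would give different values.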

2 Comments

Any particular reason for the -1? I use pandas like this for principal component analysis all the time. It's fast, concise and widely used.
+1. I can't explain the -1 (or the claim that this was not an answer). This is the answer; NumPy has no concept of labelled rows/columns.
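
To illustrate the point about labels, here is a small sketch (variable names are illustrative, not from the thread): `np.cov` returns a plain array with no row or column names, so the keys have to be attached separately, for instance by wrapping the result in a DataFrame.

```python
import numpy as np
import pandas as pd

data = {'a': [1, 2, 3, 0, 0], 'b': [1, 5, 2, 1, 4], 'c': [1, 2, 4, 12, 1]}
keys = list(data)

# np.cov treats each row as one variable and returns an unlabelled ndarray.
raw = np.cov([data[k] for k in keys])

# Wrapping the array in a DataFrame is one way to put the keys back on.
labelled = pd.DataFrame(raw, index=keys, columns=keys)
print(labelled)
```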
