Timeline for Data set with many variables in Python, many indented dictionaries?
Current License: CC BY-SA 4.0
        6 events
    
| when | what | by | license | comment | |
|---|---|---|---|---|---|
| Jan 10, 2021 at 6:10 | comment | added | Parfait | There are many pandas methods for calculations. Pandas works best on sets of values rather than scalars, especially with vectorized calculations. It is not clear exactly what you need to do. Look into `groupby` to run calculations across all groups: `df.groupby(['x','y','channel'])['value'].agg(['sum','mean','median','min','max'])`. | |
| Jan 9, 2021 at 22:49 | comment | added | user171780 | Measured with `time.time()`, the data frame version takes about 150 s, while the dictionary version takes 20 s. | |
| Jan 9, 2021 at 22:26 | comment | added | user171780 | I have implemented my code with `pandas.DataFrame` and it works, but accessing single elements is incredibly slow. My df has about 1.5e6 rows, and later I need to group the elements according to the values of some of the columns, for example `df[(df['x']==1)&(df['y']==2)&(df['channel']=='left pad')]`. I need to do this for each value of x, y and channel, and this takes a considerable amount of time. The indented dictionaries implementation is super fast (although a bit too rigid). | |
| Jan 9, 2021 at 9:06 | vote | accept | user171780 | ||
| Jan 8, 2021 at 22:09 | comment | added | user171780 | Thanks, this looks better with data frames! | |
| Jan 8, 2021 at 21:56 | history | answered | Parfait | CC BY-SA 4.0 |
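
The comments above contrast filtering the frame once per (x, y, channel) combination with a single `groupby`. A minimal sketch of the difference, assuming toy data invented for illustration (only the column names `x`, `y`, `channel` and `value` come from the comments; the values and variable names are hypothetical):

```python
import pandas as pd

# Hypothetical toy data; only the column names are taken from the comments.
df = pd.DataFrame({
    "x": [1, 1, 1, 2, 2, 2],
    "y": [2, 2, 2, 3, 3, 3],
    "channel": ["left pad", "left pad", "right pad",
                "left pad", "right pad", "right pad"],
    "value": [1.0, 3.0, 5.0, 2.0, 4.0, 6.0],
})

# Pattern described as slow: one boolean mask per (x, y, channel)
# combination, each of which scans every row of the frame.
subset = df[(df["x"] == 1) & (df["y"] == 2) & (df["channel"] == "left pad")]
mask_mean = subset["value"].mean()

# Suggested alternative: split the frame once and aggregate every group.
stats = df.groupby(["x", "y", "channel"])["value"].agg(
    ["sum", "mean", "median", "min", "max"]
)

# Statistics for one group are then a lookup by its key tuple.
group_mean = stats.loc[(1, 2, "left pad"), "mean"]

# If the rows of each group are needed rather than summary statistics,
# iterate over the groups once instead of re-filtering the frame.
groups = {key: part for key, part in df.groupby(["x", "y", "channel"])}
```

Whether this closes the reported gap to the dictionary version depends on the real data and on what is done with each group; the sketch only shows that the per-combination masks can be replaced by one grouping pass.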