1

I have 2 dataframes (Thresholds and InfoTable) were the first row is a row of headers:

(Thresholds)

AA BB CC DD EE 
0  15 7  0  23

and

(InfoTable)

ID Xposition Yposition AA BB CC DD EE
1  1         1         10 20 5  10 50
2  2         2         20 12 10 20 2
3  3         3         30 19 17 30 26
4  4         4         40 35 3  40 38
5  5         5         50 16 5  50 16

I am trying to filter the data so that the columns in the Thresholds dataframe that contain a 0 are columns that are removed from the InfoTable dataframe. I am then trying to compare the values of each row in the Thresholds dataframe to the values in the InfoTable dataframe so that They can be replaced with either a 1 or a 0 in the Infotable. My desired output is below:

ID Xposition Yposition BB CC EE
1  1         1         1  0  1
2  2         2         0  1  0
3  3         3         1  1  1
4  4         4         1  0  1
5  5         5         1  0  0

This is the code that I have right now to filter each table.

with open('thresholds_test.txt' ) as a:
    Thresholds = pd.read_table(a, sep=',') 
print Thresholds 

with open('includedThresholds.txt') as b:
    IncludedThresholds = pd.read_table(b, sep=',' )
print IncludedThresholds

InterestingThresholds = IncludedThresholds.drop(IncludedThresholds.columns[~IncludedThresholds.iloc[0].astype(bool)],axis=1)
print InterestingThresholds 

with open('PivotTable.tab') as c:
    PivotTable = pd.read_table(c, sep='\t' )
print PivotTable

headers = InterestingThresholds.columns.append(pd.Index(['ID','XPostion','YPosition']))
InfoTable = PivotTable.loc[:, headers]
print InfoTable 

Any help would be greatly appreciated!

1 Answer 1

1

Find columns to keep and drop:

cols = Thresholds.columns[Thresholds.iloc[0].astype(bool)]
dcols = Thresholds.columns[~Thresholds.iloc[0].astype(bool)]

Do comparison:

comp_df = pd.DataFrame(InfoTable[cols].values >= Thresholds[cols].values, columns=cols).astype(int)

Assign comparison result to original dataframe and drop columns:

df_out = InfoTable.assign(**comp_df).drop(dcols, axis=1)
print(df_out)

Output:

  ID  Xposition  Yposition  BB  CC  EE
0   1          1          1   1   0   1
1   2          2          2   0   1   0
2   3          3          3   1   1   1
3   4          4          4   1   0   1
4   5          5          5   1   0   0
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.