import pandas as pd
A=pd.read_csv("C:/Users/amulya/Desktop/graves lab/main_now.csv", index_col=False, header=None)
DATA1=pd.DataFrame(A)
DATA1[0]
B=pd.read_csv("C:/Users/amulya/Desktop/graves lab/words.csv", index_col=False, header=None) 
DATA2=pd.DataFrame(B)
DATA2[0]
for xrow in range(1, len(DATA1)):
    for yrow in range(1, len(DATA2)):
        if DATA2 == DATA1:
            print(DATA1[3])

Column 1 of the DATA1 file contains the numbers 1-3000, and column 1 of DATA2 contains 465 random numbers. I want to search for these numbers in the DATA1 file and print the rest of the columns for each match.

1 Answer


You can use isin to check whether each value in col1 of Data1 also appears in col1 of Data2, and then slice Data1 by the resulting boolean Series.

import pandas as pd
df1 = pd.DataFrame({'col1': [1,2,3,4,5,6,7,8,9],
                    'col2': [1,3,5,7,9,11,13,15,17]})
df2 = pd.DataFrame({'col1': [1, 101, 6, 9, 4]})

We have the two DataFrames df1 and df2. You can select the first column of the first dataframe by its column name, using either df1['col1'] or, equivalently, df1.col1

df1.col1
#0    1
#1    2
#2    3
#3    4
#4    5
#5    6
#6    7
#7    8
#8    9

The condition you want is whether each value in df1.col1 appears in the first column of df2. This is accomplished with the isin method. The syntax reads as you would expect: it checks 'whether df1.col1 is in df2.col1' and returns a boolean (True/False) Series with one entry per row of df1.

df1.col1.isin(df2.col1)
#0     True
#1    False
#2    False
#3     True
#4    False
#5     True
#6    False
#7    False
#8     True

When you then slice df1 by this boolean Series, it returns only the rows where the value is True, in this case the rows at indices 0, 3, 5 and 8. It returns all columns, because you are only filtering the dataframe by rows.

df1[df1.col1.isin(df2.col1)]
#   col1  col2
#0     1     1
#3     4     7
#5     6    11
#8     9    17

6 Comments

That's great! But both my files are .csv files, with DATA1 having 103 columns and 2999 rows, and DATA2 having 1 column and 465 rows. So with the above solution, do we have to mention all the column names?
I am a beginner in Python, so I'm not clear how to go about it. Could you please elaborate on the solution?
No, not at all! df1.col1.isin(df2.col1) returns a single boolean Series that is just True or False, indicating whether each value was found anywhere in the first column of DATA2. When you then slice the first dataframe with df1[...], it returns all columns, but only the rows where the condition was true.
You will replace df1 with DATA1 and df2 with DATA2. The only other thing you need to specify is the name of the first column in each of your dataframes, so df1.col1 should be replaced by DATA1.whatever_your_column_is_named, and the same for df2.
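As a rough sketch of that substitution: because the files in the question are read with header=None, pandas labels the columns with the integers 0, 1, 2, ..., so the first column would be selected as DATA1[0] rather than by attribute access. The in-memory CSV text below is a made-up stand-in for main_now.csv and words.csv, not the asker's real data.

```python
import io
import pandas as pd

# Hypothetical stand-ins for main_now.csv and words.csv (no header row).
main_csv = io.StringIO("1,a,b\n2,c,d\n3,e,f\n4,g,h\n")
words_csv = io.StringIO("3\n1\n7\n")

DATA1 = pd.read_csv(main_csv, header=None)
DATA2 = pd.read_csv(words_csv, header=None)

# With header=None the columns are labelled 0, 1, 2, ...,
# so the first column of each frame is selected with [0].
mask = DATA1[0].isin(DATA2[0])

# Boolean indexing keeps every column, but only the matching rows.
matches = DATA1[mask]
print(matches)
```

With the real files, the two StringIO objects would presumably be replaced by the file paths from the question; nothing else should need to change.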
Worked! Thanks for the help. I also wanted to know how to write the output back to a .csv file?
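The thread does not include a reply to this, but one possible approach is DataFrame.to_csv. The sketch below uses small made-up frames and a hypothetical output file name (matches.csv in a temporary directory); index=False and header=False are assumptions chosen so the output mirrors the header-less input files.

```python
import os
import tempfile
import pandas as pd

# Made-up stand-ins for DATA1 and DATA2 as read with header=None.
DATA1 = pd.DataFrame({0: [1, 2, 3, 4], 1: ["a", "c", "e", "g"]})
DATA2 = pd.DataFrame({0: [3, 1, 7]})

matches = DATA1[DATA1[0].isin(DATA2[0])]

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.mkdtemp(), "matches.csv")

# index=False drops the row labels; header=False omits the 0, 1, ... column labels.
matches.to_csv(out_path, index=False, header=False)

# Read the file back to confirm what was written.
with open(out_path) as f:
    written = f.read()
print(written)
```

If the row labels or column labels are wanted in the file, the corresponding argument can simply be left at its default.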