2

I have two dataframes:

products

+------------+--------------------+
| item_name  | item_tags          |
+------------+--------------------+
| blue_shirt | summer,winter,blue |
|            |                    |
+------------+--------------------+
| red_skirt  | spring,summer      |
+------------+--------------------+

and orders

+------------+
| item       |
+------------+
| blue_shirt |
+------------+
| red_skirt  |
+------------+

and I want to create a new column in orders: when products.item_name == orders.item, I want to take the value of products.item_tags and add it to orders.

I've tried:

orders['ItemTags'] = products.query("{0}=={1}".format(orders['item'], products['item_name']))['Tags']

But it gives me an error.

2 Answers 2

1

One way we can do this is with creating a dictionary from your products table, with your item_name column as your key and your item_tags column as your value, and then map it onto your orders item column:

products_dict = dict(zip(products.item_name,products.item_tags))
orders['item_tags'] = orders['item'].map(products_dict)

Output

orders
Out[83]: 
         item           item_tags
0  blue_shirt  summer,winter,blue
1   red_skirt       spring,summer
Sign up to request clarification or add additional context in comments.

2 Comments

brilliant, exactly what I needed. I tried all sorts of things with map but didn't think about creating a dict first. thanks!
Welcome, glad it works. Another way to get there would be with merging the 2 datasets.
0

Another approach could be using of the merge function.

pd.merge(df_orders, df_products, left_on='item', right_on='item_name').drop(['item_name'], axis = 1)

By using merge method you can pass the both products and orders datasets and specify the column(s) for the join operations.

import pandas as pd
df_products = pd.DataFrame(data={'item_name': ['blue_shirt', 'red_skirt'], 'item_tags': ['summer,winter,blue','spring,summer']})
df_orders = pd.DataFrame(data={'item': ['blue_shirt','red_skirt']})
df_orders = pd.merge(df_orders, df_products, left_on='item', right_on='item_name').drop(['item_name'], axis = 1)
print(df_orders)

Output

   item           item_tags
0  blue_shirt  summer,winter,blue
1  red_skirt       spring,summer

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.