1

I have a list which looks like this:

[2,4,3,1,5]  # One to Five in a certain order

I have a df that looks like this

ID    Name    Address
1      A         ABC
3      B         DEF

I want to add which ever IDs are missing with the values being False for Name and Address (there are a lot of columns towards to right, I don't want to manually give false for it, so if there is a method that will fill False for rest of the columns then its better)

After adding it, I want the DF to be sorted based on the order of the list.

Expected result:

ID   Name    Address
2    False      False
4    False      False
3    B          DEF
1    A          ABC
5    False      False
4
  • Can you please show what you at least tried? There are multiple ways of doing this Commented Aug 20, 2022 at 19:02
  • I am pretty new to pandas, just learning things. I couldn't do much Commented Aug 20, 2022 at 19:03
  • What steps do you think might be relevant, though? If you don't try to think your way through the problem then the answer (which I predict will come within 2 minutes) doesn't make you consider anything and diminishes from your learning Commented Aug 20, 2022 at 19:04
  • I think pd.concat will work after converting the list to a df. But how to only take the needed rows and fill in False for others + sorting it. i have no clue Commented Aug 20, 2022 at 19:13

2 Answers 2

4

Getting your initial df:

import pandas as pd

df = pd.DataFrame({'ID': [1, 3],
                   'Name': ['A', 'B'],
                   'Address': ['ABC', 'DEF']})
print(df.head()

It's useful to note here that an integer index is made for you. The way you list your df doesn't show such an index. The required output suggests that 'ID' is the index but it's not totally clear how you want the output in that format.

The first thing to do is set the index as 'ID':

df = df.set_index('ID')

Then you can reindex() that to build a backbone of the resulting df, which will fill the remaining columns with NaN.

df = df.set_index('ID').reindex(index=lst)

Gives:

  Name Address
ID             
2   NaN     NaN
4   NaN     NaN
3     B     DEF
1     A     ABC
5   NaN     NaN

(Note that 'ID', being the index, is now on a different level to the other column names)

The final step is to .fillna() with False

Putting it together:

df = df.set_index('ID').reindex(index=lst).fillna(False).reset_index()

The last reset_index() is optional because, as I said, you don't appear to have an independent index in your expected output.

Sign up to request clarification or add additional context in comments.

Comments

1

One way to do this:

your_list = [2,4,3,1,5]
df1 = pd.DataFrame({'Name':[False]*len(your_list), 'Address':[False]*len(your_list)}, index=your_list)

df1.loc[df.index] = df

where df is the sample dataframe with two rows.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.