I have a .csv file of the following form: [image of the .csv file]

I need to parse the whole csv file and, whenever a color appears in the "Palette" section of a row, replace 0 with 1 in the corresponding color column of that row.

For example, for the first row, there are two values on the "Palette" section of the image, "Black" and "Blue". I need to replace the corresponding colors in the same row with 1 (so Black and Blue sections).

  • Can you provide a sample of the csv file in text form, if possible? Commented Dec 14, 2021 at 17:22

1 Answer

I have something, but I'm not sure how it'll scale.

Test dataframe:

import pandas as pd

df = pd.DataFrame({
    "image": ['photo1', 'photo2', 'photo3', 'photo4'],
    "palette": ['["Black", "Blue"]', 'Yellow', 'Black', '["Yellow", "Blue"]']
})

Output:

[image of the test dataframe]

First step: convert the strings to actual lists.

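import ast

def wrap_eval(x):
    # ast.literal_eval only parses Python literals, so it is safer than eval
    # on text read from a file.
    try:
        return ast.literal_eval(x)
    except (ValueError, SyntaxError):
        # A bare value such as 'Yellow' is not a literal: wrap it in a list.
        return [x]

df["palette"] = df["palette"].apply(wrap_eval)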

The output looks very similar, but if you check, for example, df.loc[0, "palette"], you'll see that it is now a list of strings rather than a string that happens to look like a list:

[image of the dataframe after the conversion]

Now we iterate down the rows and, for each colour in each row's "palette" list: (1) test whether a column for that colour exists, (2) if it doesn't, add the column, filled with zeros all the way down, and (3) since the column exists by now, set its value in this row to 1.

for i, row in df.iterrows():
    for colour in row["palette"]:
        if colour not in df.columns:   # (1) in the steps above.
            df[colour] = 0             # (2)
        df.loc[i, colour] = 1          # (3)
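
If the row-by-row loop turns out to be slow on a larger file, one possible alternative is to let pandas build the indicator columns in one go. The following is only a sketch under the same test data as above, not something I have timed; it swaps the loop for Series.explode plus pd.get_dummies, and the names to_list, flags and result are made up for this example.

```python
import ast

import pandas as pd


def to_list(value):
    """Parse a list-like string; wrap anything else in a one-item list."""
    try:
        parsed = ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return [value]
    return parsed if isinstance(parsed, list) else [parsed]


df = pd.DataFrame({
    "image": ["photo1", "photo2", "photo3", "photo4"],
    "palette": ['["Black", "Blue"]', "Yellow", "Black", '["Yellow", "Blue"]'],
})
df["palette"] = df["palette"].apply(to_list)

# One row per (image, colour) pair; the original row index is repeated.
exploded = df["palette"].explode()

# One 0/1 column per colour, then collapse back to one row per image.
flags = pd.get_dummies(exploded, dtype=int).groupby(level=0).max()

result = df.join(flags)
print(result)
```

As with the loop, a colour only gets a column here if it appears somewhere in the "palette" column.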

[image of the resulting dataframe, with one 0/1 column per colour]


4 Comments

If you try this please do let me know how many rows your dataframe has and how long it takes!
Thank you very much for your answer. It works wonders! Funny thing, I created the first .csv myself and put in all the zeroes. I'll fix that too. Your approach of adding them later is very clever. The .csv isn't very big yet (200 rows / 15 columns) so the execution is instant. Thanks again!
The only problem that may occur is a value that does NOT exist in the Palette column, so I guess the corresponding color column will never be created. I don't need to be so strict though :P
You're right, it won't. But if you know the list of colours beforehand, then you can pre-populate the columns with zeros all the way down (as you say you have done), and the code will still work the same, I'm pretty sure.
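
A rough sketch of that pre-populating idea, assuming a hypothetical known_colours list and a small made-up dataframe whose "palette" column already holds lists:

```python
import pandas as pd

# Hypothetical list of every colour that should get a column.
known_colours = ["Black", "Blue", "Yellow", "Red"]

df = pd.DataFrame({
    "image": ["photo1", "photo2"],
    "palette": [["Black", "Blue"], ["Yellow"]],
})

# Pre-populate: one zero-filled column per known colour.
for colour in known_colours:
    df[colour] = 0

# Same loop as in the answer. The existence check is kept so that a colour
# missing from known_colours would still get its own column.
for i, row in df.iterrows():
    for colour in row["palette"]:
        if colour not in df.columns:
            df[colour] = 0
        df.loc[i, colour] = 1

print(df)
```

In this example "Red" never appears in a palette, so its column exists but stays at zero in every row.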
