
I want to assign values in a pandas DataFrame based on the length of an existing value.

Specifically, I want to create two new columns: one receives the value when its length is greater than 8, and the other receives it when its length is 8 or less.

df_dr_dp_lj['kode_wilayah'] is the existing column that holds the values whose length I want to check. df_dr_dp_lj['kode_kelurahan'] should hold the values longer than 8 characters, and df_dr_dp_lj['kode_kecamatan'] should hold the values of 8 characters or fewer. My code looks like this:

if df_dr_dp_lj['kode_wilayah'].str.len() > 8:
    df_dr_dp_lj['kode_kelurahan']=df_dr_dp_lj['kode_wilayah']
else :
    df_dr_dp_lj['kode_kecamatan']=df_dr_dp_lj['kode_wilayah']

The error I get is:

Input In [47], in <cell line: 1>()
----> 1 if df_dr_dp_lj['kode_wilayah'].str.len() > 8:
      2     df_dr_dp_lj['kode_kelurahan']=df_dr_dp_lj['kode_wilayah']
      3 else :

File D:\Python\lib\site-packages\pandas\core\generic.py:1527, in NDFrame.__nonzero__(self)
   1525 @final
   1526 def __nonzero__(self):
-> 1527     raise ValueError(
   1528         f"The truth value of a {type(self).__name__} is ambiguous. "
   1529         "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   1530     )

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Update

I tried the suggested answer, but I still get values longer than 8 characters. [screenshot]

1 Answer

Here is one way to do it, using mask.

In your code example, you're using an if statement to test a whole Series. The comparison returns one boolean per row, so it has no single truth value, hence the error.
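To see where the error comes from, here is a minimal sketch on made-up sample codes (the real kode_wilayah values are not shown in the question, so these are assumptions). The comparison yields a boolean Series, and passing that to `if` raises the same ValueError:

```python
import pandas as pd

# Hypothetical sample values; the real data is not shown in the post.
s = pd.Series(["31.71.01", "31.71.01.1001"], name="kode_wilayah")

# Element-wise comparison: one True/False per row, not a single bool.
is_long = s.str.len() > 8
print(is_long.tolist())

try:
    if is_long:  # `if` needs exactly one truth value
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

The fix is therefore to select rows with the boolean Series rather than branch on it.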

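# mask() replaces values where the condition is True (with NaN by default),
# so each new column masks out the lengths it should not hold.
length = df_dr_dp_lj['kode_wilayah'].str.len()

# keep only the codes longer than 8 characters
df_dr_dp_lj['kode_kelurahan'] = df_dr_dp_lj['kode_wilayah'].mask(length <= 8)

# keep only the codes of 8 characters or fewer
df_dr_dp_lj['kode_kecamatan'] = df_dr_dp_lj['kode_wilayah'].mask(length > 8)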
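As a self-contained check on invented sample data (the column names come from the question, the values are assumptions), `Series.where` keeps a value where the condition holds and leaves NaN elsewhere, so each code lands in exactly one of the two new columns:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_dr_dp_lj = pd.DataFrame(
    {"kode_wilayah": ["31.71.01", "31.71.01.1001", "31.71.02"]}
)

# One boolean per row: True when the code is longer than 8 characters.
is_long = df_dr_dp_lj["kode_wilayah"].str.len() > 8

# where() keeps rows where the condition is True and sets the rest to NaN.
df_dr_dp_lj["kode_kelurahan"] = df_dr_dp_lj["kode_wilayah"].where(is_long)
df_dr_dp_lj["kode_kecamatan"] = df_dr_dp_lj["kode_wilayah"].where(~is_long)

print(df_dr_dp_lj)
```

Rows that do not match a column's condition show up as NaN in that column, which makes a wrong assignment easy to spot.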



4 Comments

I tried it. The error is gone, but df_dr_dp_lj['kode_kecamatan'] still contains values longer than 8 characters. I've added screenshots to the updated post in case they help.
Make sure you don't have any extra spaces in your data. What is the output of df[df["id"].isin(["PGW001","PGW002"])]["kode_wilayah"].apply(len)?
@PiPio, can you share the data as code, not images, please? Since you didn't provide the data with your question, I tried it on sample data of my own and it worked for me.
Sorry for the trouble, it's working. I think it didn't work before because I needed to restart the kernel in my Jupyter notebook.
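
On the whitespace suggestion in the comments: a small sketch, using invented values, of how trailing spaces inflate `str.len()` and how `str.strip()` removes them before the length check. Whether the real data contains such spaces is not known from the post.

```python
import pandas as pd

# Invented values: the second entry carries three trailing spaces.
kode = pd.Series(["31.71.01", "31.71.02   "])

raw_len = kode.str.len()                # counts the trailing spaces too
clean_len = kode.str.strip().str.len()  # length after trimming whitespace

print(raw_len.tolist())
print(clean_len.tolist())
```

If the two lists differ, stripping the column before computing lengths would change which new column a value ends up in.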
