I have a DataFrame; a sample looks like this:
Type          SKU      Description   FullDescription        Size      Price
Variable       2        Boots          Shoes on sale       XL,S,M       
Variation      2.5      Boots XL                             XL       330
Variation      2.6      Boots S                              S        330
Variation      2.7      Boots M                              M        330
Variable       3        Helmet           Helmet Sizes      E42,E41
Variation      3.8      Helmet E42                          E42       89
Variation      3.2      Helmet E41                          E41       89
I want to sort the Variation rows by Size within each Variable group, and also sort the comma-separated Size list on the Variable row, so the final DataFrame looks like this:
Type          SKU      Description   FullDescription        Size      Price
Variable       2        Boots          Shoes on sale       S,M,XL
Variation      2.6      Boots S                              S        330
Variation      2.7      Boots M                              M        330
Variation      2.5      Boots XL                             XL       330
Variable       3        Helmet           Helmet Sizes      E41,E42
Variation      3.2      Helmet E41                          E41       89
Variation      3.8      Helmet E42                          E42       89
I can get this result with the following code:
sizes, dig = ['S','M','XL','L',], ['000','111','333','222'] #make sure dig values do not exist as a substring anywhere in your dataframe
df = (df.assign(Size=df['Size'].replace(sizes, dig, regex=True))
        .assign(grp=(df['Type'] == 'Variable').cumsum()) 
        .sort_values(['grp', 'Type', 'Size']).drop('grp', axis=1))
df['Size'] = df['Size'].apply(lambda x: ','.join(sorted(x.split(',')))).replace(dig, sizes, regex=True)
df
The issue is that this code doesn't work on a DataFrame like this one:
Type          SKU      Description   FullDescription        Size      Price
Variable       2        Boots          Shoes on sale       XL,S,3XL       
Variation      2.5      Boots XL                             XL       330
Variation      2.6      Boots 3XL                            3XL        330
Variation      2.7      Boots S                              S        330
Variable       3        Helmet           Helmet Sizes      S19, S9
Variation      3.8      Helmet E42                          S19       89
Variation      3.2      Helmet E41                          S9       89
It gives 'S,3XL,XL' and 'S19,S9', whereas I want the result to be:
Type          SKU      Description   FullDescription        Size      Price
Variable       2        Boots          Shoes on sale       S,XL,3XL       
Variation      2.7      Boots S                             S          330
Variation      2.5      Boots XL                            XL        330
Variation      2.6      Boots 3XL                           3XL        330
Variable       3        Helmet           Helmet Sizes      S9,S19
Variation      3.2      Helmet E41                          S9        89
Variation      3.8      Helmet E42                          S19       89
Also, when there are more sizes, the order should be 'XXS,XS,S,M,L,XL,XXL,3XL,4XL,5XL', and for the second example 'S9,S19,M9,M19,L9' and so on.
This is what I have tried so far, but it gives the wrong order:
sizes, dig = ['XS','S','M','L','XL','XXL','3XL','4XL','5XL'], ['000','111','222','333','444','555','666','777','888'] #make sure dig values do not exist as a substring anywhere in your dataframe
df = (df.assign(Size=df['Size'].replace(sizes, dig, regex=True))
        .assign(grp=(df['Type'] == 'Variable').cumsum())
        .sort_values(['grp', 'Type', 'Size']).drop('grp', axis=1))
df['Size'] = df['Size'].apply(lambda x: ','.join(sorted(x.split(',')))).replace(dig, sizes, regex=True)
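One direction that might avoid the substring problems of the replace approach is a custom sort key instead of swapping sizes for digit strings. Below is a minimal sketch of the ordering described above. It assumes sizes are upper-case and are either a letter size (XXS to XXL), an nXL size (3XL, 4XL, ...), or letters followed by a number (S9, M19, E42). `size_key` and `sort_by_size` are hypothetical helper names, not pandas functions, and the sample frame keeps only the Type, SKU and Size columns.

```python
import re
import pandas as pd

# Assumed order of the plain letter sizes; nXL sizes (3XL, 4XL, ...) rank after these.
LETTERS = ['XXS', 'XS', 'S', 'M', 'L', 'XL', 'XXL']
LETTER_RANK = {name: rank for rank, name in enumerate(LETTERS)}
UNKNOWN = 10**6  # rank for anything that is not a recognised letter size


def size_key(size):
    """Hypothetical helper: map one size string to a sortable (rank, letters, number) tuple."""
    size = str(size).strip()
    m = re.fullmatch(r'(\d+)XL', size)          # 3XL, 4XL, 5XL ...
    if m:
        return (len(LETTERS) + int(m.group(1)), '', 0)
    m = re.fullmatch(r'([A-Z]+)(\d*)', size)    # S, XL, S9, M19, E42 ...
    if m:
        letters, number = m.group(1), int(m.group(2) or 0)
        if letters in LETTER_RANK:
            return (LETTER_RANK[letters], '', number)
        return (UNKNOWN, letters, number)       # unknown letters: order by letters, then number
    return (UNKNOWN, size, 0)


def sort_by_size(df):
    """Hypothetical helper: sort Variation rows by size inside each Variable group."""
    out = df.copy()
    # Reorder the comma-separated list inside each Size cell.
    out['Size'] = out['Size'].fillna('').astype(str).apply(
        lambda s: ','.join(sorted((p.strip() for p in s.split(',')), key=size_key)))
    grp = (out['Type'] == 'Variable').cumsum().tolist()   # group id per row
    is_variation = (out['Type'] != 'Variable').tolist()   # keeps the Variable row first
    sizes = out['Size'].tolist()
    order = sorted(range(len(out)),
                   key=lambda i: (grp[i], is_variation[i], size_key(sizes[i])))
    return out.iloc[order].reset_index(drop=True)


# Reduced version of the second sample (only the columns the sort needs).
df = pd.DataFrame({
    'Type': ['Variable', 'Variation', 'Variation', 'Variation',
             'Variable', 'Variation', 'Variation'],
    'SKU': [2, 2.5, 2.6, 2.7, 3, 3.8, 3.2],
    'Size': ['XL,S,3XL', 'XL', '3XL', 'S', 'S19, S9', 'S19', 'S9'],
})
result = sort_by_size(df)
print(result)
```

Returning a tuple from the key keeps the letter rank and the trailing number separate, so 'S9' sorts before 'S19' numerically rather than as text. Sizes the key does not recognise are placed after the known ones.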
