Return to Answer

deleted 2 characters in body

Source Link

edited Aug 11, 2022 at 13:25

71.1k
5
76
256

Almost certainly an X/Y problem: you're trying hard to pad-interleave NaN values when you shouldn't be doing that at all, and theThe premise for your problem assumes a structure for your dataframe that you shouldn't use, at least for data processing. It seems you have no influence over the format of the Excel file to which you need to export.

Source Link

answered Aug 10, 2022 at 13:04

Reinderien

71.1k
5
76
256

Almost certainly an X/Y problem: you're trying hard to pad-interleave NaN values when you shouldn't be doing that at all, and the premise for your problem assumes a structure for your dataframe that you shouldn't use.

Think about the "per"s in your frame description: per time, per KPI, per SKU, you start off with some unidentified floating-point value (maybe a price; who knows). Whenever you write "per", that should become a level in a multi-level index. This reduces your dataframe to one column, "Price".

Thinking about your "pad" operation, you're really just adding a second column (of even more unknown purpose):

import pandas as pd

# Our df for this example
dict_df = {
    "Time": [1,1,1,1,2,2,2,2,3,3,3,3],
    "KPI":  ["A","B","C","D","A","B","C","D","A","B","C","D"],
    "SKU1": [10,1,0.1,0.01,40,4,0.4,0.04,90,9,0.9,0.09],
    "SKU2": [20,2,0.2,0.02,50,5,0.5,0.05,100,10,1,0.1],
    "SKU3": [30,3,0.3,0.03,60,6,0.6,0.06,110,11,1.1,0.11],
    "SKU4": [70,7,0.7,0.07,80,8,0.8,0.08,120,12,1.2,0.12]
}
df_test = pd.DataFrame(dict_df)

# Reshape data to be in stacked, multi-index form
df_test.set_index(['Time', 'KPI'], inplace=True)
df_test.columns = df_test.columns.str.slice(3).astype(int)
df_test = df_test.stack()
df_test.index.set_names('SKU', level=-1, inplace=True)
df_test = pd.DataFrame({'Price': df_test})

filter_kpi = {"A", "B", "C"}
df_test = df_test[df_test.index.get_level_values('KPI').isin(filter_kpi)]

df_test['Mystery'] = pd.NA
print(df_test)

              Price Mystery
Time KPI SKU               
1    A   1     10.0    <NA>
         2     20.0    <NA>
         3     30.0    <NA>
         4     70.0    <NA>
     B   1      1.0    <NA>
         2      2.0    <NA>
         3      3.0    <NA>
         4      7.0    <NA>
     C   1      0.1    <NA>
         2      0.2    <NA>
         3      0.3    <NA>
         4      0.7    <NA>
2    A   1     40.0    <NA>
         2     50.0    <NA>
         3     60.0    <NA>
         4     80.0    <NA>
     B   1      4.0    <NA>
         2      5.0    <NA>
         3      6.0    <NA>
         4      8.0    <NA>
     C   1      0.4    <NA>
         2      0.5    <NA>
         3      0.6    <NA>
         4      0.8    <NA>
3    A   1     90.0    <NA>
         2    100.0    <NA>
         3    110.0    <NA>
         4    120.0    <NA>
     B   1      9.0    <NA>
         2     10.0    <NA>
         3     11.0    <NA>
         4     12.0    <NA>
     C   1      0.9    <NA>
         2      1.0    <NA>
         3      1.1    <NA>
         4      1.2    <NA>