I have a DataFrame with a lot of columns. Now I want to adjust the order of the columns.
A number of columns must come first (in a specific order), followed by the rest of the columns sorted by column name (not listed manually, because there are many).
How can I achieve this using PySpark?
My guess is to sort them first and then move some into a specific order:
df.orderBy(cols, ascending=True)
Assume current column order:
col_a, col_k, col_c, col_h, col_e, col_f, col_g, col_d, col_j, col_i, col_b
Desired new order:
col_c, col_j, col_a, col_g :: col_b, col_d, col_e, col_f, col_h, col_i, col_k
The columns before :: are in a specific order; those after it are the remaining columns sorted by column name.
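
One possible approach, sketched under assumptions: build the new column list in plain Python and pass it to `df.select`, since `orderBy` sorts rows rather than columns. The helper names `ordered_columns` and `reorder` are hypothetical, and `df` is assumed to be a `pyspark.sql.DataFrame`.

```python
def ordered_columns(all_columns, first_columns):
    """Return first_columns in the given order, then the remaining columns sorted by name."""
    missing = [c for c in first_columns if c not in all_columns]
    if missing:
        raise ValueError(f"columns not found: {missing}")
    first = set(first_columns)
    # Everything not pinned to the front is sorted alphabetically
    rest = sorted(c for c in all_columns if c not in first)
    return list(first_columns) + rest


def reorder(df, first_columns):
    # Assumes df is a pyspark.sql.DataFrame; select() only rearranges columns,
    # it does not touch the rows
    return df.select(*ordered_columns(df.columns, first_columns))


# The column order from the example above
current = ["col_a", "col_k", "col_c", "col_h", "col_e", "col_f",
           "col_g", "col_d", "col_j", "col_i", "col_b"]
print(ordered_columns(current, ["col_c", "col_j", "col_a", "col_g"]))
```

With a real DataFrame, this would be used as `df = reorder(df, ["col_c", "col_j", "col_a", "col_g"])`. Keeping the ordering logic separate from Spark makes it easy to check on a plain list of names first.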