
I would like to expand this DataFrame so that its depth column covers the full range of the depth array below:

import numpy as np
import pandas as pd

depth = np.array([0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5])    

df1 = pd.DataFrame({'depth': [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2],
           '400.0': [13.909261, 7.758734, 3.513627, 2.095409, 1.628918, 0.782643, 0.278548, 0.160153, -0.155895, -0.152373, -0.147820, -0.023997, 0.010729, 0.006050, 0.002356],
           '401.0': [14.581624, 8.173803, 3.757856, 2.223524, 1.695623, 0.818065, 0.300235, 0.173674, -0.145402, -0.144456, -0.142969, -0.022471, 0.010802, 0.006181, 0.002641],
           '402.0': [15.253988, 8.588872, 4.002085, 2.351638, 1.762327, 0.853486, 0.321922, 0.187195, -0.134910, -0.136539, -0.138118, -0.020945, 0.010875, 0.006313, 0.002927],
           '403.0': [15.633908, 8.833914, 4.146499, 2.431543, 1.798185, 0.874350, 0.333470, 0.192128, -0.130119, -0.134795, -0.136049, -0.019307, 0.012037, 0.006674, 0.003002],
           '404.0': [15.991816, 9.066159, 4.283401, 2.507818, 1.831721, 0.894119, 0.344256, 0.196415, -0.125758, -0.133516, -0.134189, -0.017659, -0.013281, 0.007053, 0.003061],
           '405.0': [16.349725, 9.298403, 4.420303, 2.584094, 1.865257, 0.913887, 0.355041, 0.200702, -0.121396, -0.132237, -0.132330, -0.016012, 0.014525, 0.007433, 0.003120]
           })

So what I need in this case is three additional rows at the bottom with NaN values.

Similarly, I have a df2 whose depth range runs from 1.1 to 2.5, and I need to fill the top three rows (depths 0.8 to 1.0) with NaN based on the extended depth range.

How do I do it?

3 Answers

3

You can use merge:

pd.DataFrame({'depth':depth}).merge(df1,how='left')

6 Comments

For some reason, in a more complex example it gives me all NaNs. Any ideas why?
@PEBKAC I have no idea about the "complex example". Would you like to show us?
It's the same type of DataFrame, only it has more than 300 rows and 450 columns.
@PEBKAC Your sample data does not reproduce your problem; please fix the sample data.
@PEBKAC Try depth = np.round(depth, 1)
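
The all-NaN result described in these comments is what a merge on float keys produces when the two depth columns differ only by rounding error. The following is a minimal sketch of that effect and of the rounding fix suggested above; the tiny frame, the 'value' column, and the names naive, left, right and fixed are made up for illustration and are not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical small example: 0.1 + 0.2 is stored as 0.30000000000000004,
# so it does not compare equal to the literal 0.3.
depth = np.array([0.1 + 0.2, 0.4, 0.5])
df = pd.DataFrame({'depth': [0.3, 0.4], 'value': [10.0, 20.0]})

# Exact float comparison: the first key finds no partner and becomes NaN.
naive = pd.DataFrame({'depth': depth}).merge(df, how='left', on='depth')

# Round both key columns to the grid resolution (one decimal here) first.
left = pd.DataFrame({'depth': np.round(depth, 1)})
right = df.assign(depth=df['depth'].round(1))
fixed = left.merge(right, how='left', on='depth')
```

Rounding only helps when the depths really lie on a regular grid; the number of decimals (1 here) is an assumption that has to match the data.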
2

One easy way to do it is to set the index to depth then reindex using your depth array:

df1.set_index('depth').reindex(depth).reset_index()


    depth      400.0      401.0      402.0      403.0      404.0      405.0
0     0.8  13.909261  14.581624  15.253988  15.633908  15.991816  16.349725
1     0.9   7.758734   8.173803   8.588872   8.833914   9.066159   9.298403
2     1.0   3.513627   3.757856   4.002085   4.146499   4.283401   4.420303
3     1.1   2.095409   2.223524   2.351638   2.431543   2.507818   2.584094
4     1.2   1.628918   1.695623   1.762327   1.798185   1.831721   1.865257
5     1.3   0.782643   0.818065   0.853486   0.874350   0.894119   0.913887
6     1.4   0.278548   0.300235   0.321922   0.333470   0.344256   0.355041
7     1.5   0.160153   0.173674   0.187195   0.192128   0.196415   0.200702
8     1.6  -0.155895  -0.145402  -0.134910  -0.130119  -0.125758  -0.121396
9     1.7  -0.152373  -0.144456  -0.136539  -0.134795  -0.133516  -0.132237
10    1.8  -0.147820  -0.142969  -0.138118  -0.136049  -0.134189  -0.132330
11    1.9  -0.023997  -0.022471  -0.020945  -0.019307  -0.017659  -0.016012
12    2.0   0.010729   0.010802   0.010875   0.012037  -0.013281   0.014525
13    2.1   0.006050   0.006181   0.006313   0.006674   0.007053   0.007433
14    2.2   0.002356   0.002641   0.002927   0.003002   0.003061   0.003120
15    2.3        NaN        NaN        NaN        NaN        NaN        NaN
16    2.4        NaN        NaN        NaN        NaN        NaN        NaN
17    2.5        NaN        NaN        NaN        NaN        NaN        NaN
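
The same call should also cover the df2 case from the question, where the missing rows are at the top rather than the bottom. A minimal sketch, assuming a df2 whose depths run from 1.1 to 2.5; the single '400.0' column holds placeholder numbers, not real measurements.

```python
import numpy as np
import pandas as pd

depth = np.array([0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6,
                  1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5])

# Hypothetical df2: depths 1.1 to 2.5 with placeholder values 0.0 .. 14.0.
df2 = pd.DataFrame({'depth': depth[3:],
                    '400.0': np.arange(15, dtype=float)})

# Reindexing on the full depth array adds NaN rows wherever df2 has no
# matching depth, here the first three rows (0.8, 0.9, 1.0).
out = df2.set_index('depth').reindex(depth).reset_index()
```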

4 Comments

For some reason, in a more complex example it gives me all NaNs already during reindexing. Any ideas why?
Not sure, but maybe in your complex example, there is no overlap between depth in your dataframe and your depth array?
Exactly, the difference between the two gives me something like -2.220446e-16. Any suggestions on how to fix that?
Anyway, it is very efficient as well. You spotted the error right away, thank you so much for your help!
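
For differences of the size mentioned in these comments (around 1e-16), one option besides rounding is to let reindex match keys within a tolerance. This is a sketch under assumptions: the small frame and the 1e-9 tolerance are illustrative choices, and method='nearest' requires the depth index to be sorted.

```python
import numpy as np
import pandas as pd

# Hypothetical small example with a rounding-error-sized mismatch in one key.
depth = np.array([0.1 + 0.2, 0.4, 0.5])   # first value is 0.30000000000000004
df = pd.DataFrame({'depth': [0.3, 0.4], 'value': [10.0, 20.0]})

# method='nearest' with a small tolerance treats keys within 1e-9 as equal;
# target depths with no neighbour inside the tolerance still become NaN.
out = (df.set_index('depth')
         .reindex(depth, method='nearest', tolerance=1e-9)
         .rename_axis('depth')
         .reset_index())
```

Note that the resulting depth column holds the values of the target array, not the original keys from the DataFrame.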
2

Using combine_first

>>> pd.DataFrame({'depth':depth}).combine_first(df1)

Using pd.concat

>>> pd.concat([pd.DataFrame({'depth': depth}), df1.iloc[:, 1:]], axis=1)

1 Comment

Note that combine_first is index-sensitive: with df = pd.DataFrame({'ID': [1, 2, 5], 'Va': [1, 2, 3]}) and IDs = [1, 2, 3, 4, 5], it returns a 'wrong' result, because rows are aligned by index rather than by ID.
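
The caveat in this comment can be checked directly with the example it gives. The sketch below uses the comment's df and IDs; the names by_position and by_key are made up here, and the key-based merge is shown only as a contrast.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 5], 'Va': [1, 2, 3]})
IDs = [1, 2, 3, 4, 5]

# combine_first aligns on the row index (0, 1, 2, ...), not on the ID values,
# so Va=3 (which belongs to ID 5) ends up in the row for ID 3.
by_position = pd.DataFrame({'ID': IDs}).combine_first(df)

# Aligning on the key column keeps Va=3 with ID 5.
by_key = pd.DataFrame({'ID': IDs}).merge(df, how='left', on='ID')
```

The same positional alignment applies to pd.concat along the columns, so both approaches in this answer only give the intended result when the missing rows are at the bottom, as with df1.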
