Let me start by saying that I am unsure whether this is the best way to do it, but I wrote some code that creates a pandas dataframe containing pairs of index values, one from my left dataframe and one from my right dataframe, for rows where specific spatial conditions match. This is a basic spatial join, but with some additional attributes. The index values are correct.
My question is: how can I join the left and right dataframes together using this third dataframe?
I need to support the following:
- If I want to keep all (from both df1 and df2), how do I do that?
- By default I want to keep all left dataframe values, so my join dataframe has values like:
[1, None]. Will this be a problem?
Example:
import pandas as pd

join_df = pd.DataFrame(data=[[0, 2], [1, 3], [2, None]], columns=['left_idx', 'right_idx'])
df1 = pd.DataFrame([["a", {5: 5}], ["b", {4: 5}], ["c", {12: 5}]], columns=['A1', 'A2'])
df2 = pd.DataFrame([["b", {'a': 5}], ["bbb", {'b': 5}], ["ccc", {'c': 5}]], columns=['B1', 'B2'])
So the join_df works like this:
- Each row of join_df pairs the index of a row in the left dataframe (df1), stored in left_idx, with the index of the row to join from df2, stored in right_idx.
- The join can be many-to-many, one-to-many, or many-to-one.
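To make the lookup described above concrete, here is a minimal sketch of one possible approach, not necessarily the best one: expose each dataframe's index as a column and chain two left merges through join_df. The names left, right, and result are placeholders introduced only for this illustration. Note that the None in right_idx is stored as NaN, which makes that column float64; pandas can still merge it against the integer index of df2, and the NaN row simply finds no match.

```python
import pandas as pd

join_df = pd.DataFrame(data=[[0, 2], [1, 3], [2, None]], columns=['left_idx', 'right_idx'])
df1 = pd.DataFrame([["a", {5: 5}], ["b", {4: 5}], ["c", {12: 5}]], columns=['A1', 'A2'])
df2 = pd.DataFrame([["b", {'a': 5}], ["bbb", {'b': 5}], ["ccc", {'c': 5}]], columns=['B1', 'B2'])

# Turn each index into a regular column named after the matching join_df column
# ("left" and "right" are illustrative names, not part of the original code).
left = df1.rename_axis('left_idx').reset_index()
right = df2.rename_axis('right_idx').reset_index()

# Start from df1 so that every left row survives, then follow join_df to df2.
# how='left' keeps rows whose right_idx is NaN or points at no row in df2.
result = (
    left
    .merge(join_df, on='left_idx', how='left')
    .merge(right, on='right_idx', how='left')
)
print(result)
```

Because each pair in join_df becomes its own output row, the same chain should cover one-to-many and many-to-many pairings without changes.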
The goal is that each row from df1 is joined to every df2 row it is paired with in join_df. Optionally (bonus question), if a df1 row has no match in df2, can df1's record be kept? Same for df2?
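For the keep-everything case, a hedged sketch under the same assumptions: switching both merges to how='outer' should retain unmatched rows from either side, and indicator=True adds a _merge column showing where each row came from. The names left, right, and full are again illustrative only.

```python
import pandas as pd

join_df = pd.DataFrame(data=[[0, 2], [1, 3], [2, None]], columns=['left_idx', 'right_idx'])
df1 = pd.DataFrame([["a", {5: 5}], ["b", {4: 5}], ["c", {12: 5}]], columns=['A1', 'A2'])
df2 = pd.DataFrame([["b", {'a': 5}], ["bbb", {'b': 5}], ["ccc", {'c': 5}]], columns=['B1', 'B2'])

# Expose the indexes as columns so they can be used as ordinary merge keys.
left = df1.rename_axis('left_idx').reset_index()
right = df2.rename_axis('right_idx').reset_index()

# Outer merges keep unmatched rows from both df1 and df2.
# The _merge column reports 'both', 'left_only', or 'right_only' per row.
full = (
    left
    .merge(join_df, on='left_idx', how='outer')
    .merge(right, on='right_idx', how='outer', indicator=True)
)
print(full)
```

Rows that exist only in df2 get NaN in left_idx and in the df1 columns, so left_idx becomes a float column in this variant.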
Thank you