Simplifying, I have the following data:
| Col1 | Col2 |
|---|---|
| A | X |
| A | Y |
| A | Z |
| B | X |
| B | Y |
| B | Z |
| C | Z |
I need to get the following result:
| Col1 | Col2 |
|---|---|
| A | X |
| B | Y |
| C | Z |
In other words: for each value in the left column, I need to assign the minimum UNUSED value from the right column, so that no right-column value is assigned twice. This was easy to do iteratively, i.e., with cursors. However, I would like something that's a thousand times faster.
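To make the rule concrete, here is a minimal Python sketch of the iterative logic described above. It is not the actual cursor code. The function name `assign_min_unused` and the assumption that left-column values are processed in ascending order are made up for illustration.

```python
def assign_min_unused(rows):
    """Greedy pass: for each left value (ascending), take the smallest
    right value that no earlier left value has already taken."""
    # Collect the candidate right values for every left value.
    candidates = {}
    for left, right in rows:
        candidates.setdefault(left, set()).add(right)

    used = set()
    result = []
    for left in sorted(candidates):
        free = sorted(candidates[left] - used)
        if not free:
            # This simple pass never revisits earlier picks, so it stops here.
            raise ValueError(f"no unused value left for {left!r}")
        result.append((left, free[0]))
        used.add(free[0])
    return result


rows = [
    ("A", "X"), ("A", "Y"), ("A", "Z"),
    ("B", "X"), ("B", "Y"), ("B", "Z"),
    ("C", "Z"),
]
print(assign_min_unused(rows))  # [('A', 'X'), ('B', 'Y'), ('C', 'Z')]
```

Each pick depends on the picks made before it, which is why a plain per-group aggregate cannot express the rule directly.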
What I've tried
Unsurprisingly, `SELECT L, MIN(R)` (where L and R are the left and right columns) gives the wrong result. I've tried several window functions with different partitioning, but I can't get the right combination. I always get the following incorrect result:
| Col1 | Col2 |
|---|---|
| A | X |
| B | X |
| C | Z |
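The incorrect result is what a per-group minimum produces, since each left value is evaluated on its own and cannot see what the others picked. A small Python sketch of that behaviour follows. The helper name `min_per_group` is hypothetical, and this only mimics the grouping logic, not the actual SQL.

```python
def min_per_group(rows):
    """Smallest right value per left value, each group handled independently."""
    best = {}
    for left, right in rows:
        if left not in best or right < best[left]:
            best[left] = right
    return sorted(best.items())


rows = [
    ("A", "X"), ("A", "Y"), ("A", "Z"),
    ("B", "X"), ("B", "Y"), ("B", "Z"),
    ("C", "Z"),
]
print(min_per_group(rows))  # [('A', 'X'), ('B', 'X'), ('C', 'Z')]
```

Both A and B get X because nothing records that X was already taken.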
I've loaded some of the data into https://dbfiddle.uk/6HbpdlYd.
There are 142 rows, built from 139 distinct L values and 139 distinct R values.
Since the input data is produced by a join, there is always exactly one correct solution.
C,(C,X)? Should it "look backward" and change the min Col2 you already picked for A?