Write rows to SQL server

Question

I have a table in SQL server. I am trying to append a spark dataframe from databricks to the SQL table. I am able to append the new rows but the problem is that it appends randomly to the SQL table. I want to append the new rows to the end of the SQL table. How do I do that?

SQL table

╔═════════════════════════════╗
║   Name      a   b   c   d   ║
╠═════════════════════════════╣
║1  Lisa      5   4   1   3   ║
║2  Timmy     3   2   7   2   ║  
║3  Rann      2   3   1   5   ║
║4  Bob       1   6   3   4   ║
╚═════════════════════════════╝

Spark dataframe

╔═════════════════════════════╗
║   Name      a   b   c   d   ║
╠═════════════════════════════╣
║1  Mark      10  3   2   4   ║
║2  Rudy      5   6   7   8   ║  
╚═════════════════════════════╝

What I want

╔═════════════════════════════╗
║   Name      a   b   c   d   ║
╠═════════════════════════════╣
║1  Lisa      5   4   1   3   ║
║2  Timmy     3   2   7   2   ║  
║3  Rann      2   3   1   5   ║
║4  Bob       1   6   3   4   ║
║5  Mark      10  3   2   4   ║
║6  Rudy      5   6   7   8   ║
╚═════════════════════════════╝

When I append using this code

df.write.mode("append").jdbc(url=jdbcurl, table="employee")

I get this result in the SQL server. Mark and Rudy should be at the bottom but it appends randomly. Is there a parameter that I am suppose to add?

╔═════════════════════════════╗
║   Name      a   b   c   d   ║
╠═════════════════════════════╣
║1  Lisa      5   4   1   3   ║
║2  Rudy      5   6   7   8   ║
║3  Timmy     3   2   7   2   ║
║4  Mark      10  3   2   4   ║
║5  Rann      2   3   1   5   ║
║6  Bob       1   6   3   4   ║
╚═════════════════════════════╝

You can have a column within your dataframe that is integer and represent a primary key in sql table then you can see it in order when you do select. This is just a hack. — Nikunj Kakadiya
– Nikunj Kakadiya, Commented Jan 14, 2021 at 14:14

Conor Cunningham MSFT · Accepted Answer · 2021-01-14 13:37:54Z

2

Tables in SQL are not ordered. It's a set. Put in other words, you can feel free to believe they are being added at the end, but there's no guarantee that they will come back out in any order based on insertion order.

If you want to get data out in a specific order, you should run a query with an ORDER BY on it it based on the order you wish to see them. That could be a column represented the time of insertion if you'd like.

answered Jan 14, 2021 at 13:37

Conor Cunningham MSFT

4,6092 gold badges18 silver badges21 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Write rows to SQL server

1 Answer 1

Comments

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Related