I am trying to insert data from a data frame into a Hive table. I have been able to do so successfully using df.write.insertInto("db1.table1", overwrite = True).

I am just a little confused about the overwrite = True part -- I tried running it multiple times and it seemed to append, not overwrite. There wasn't too much in the docs, but when should I set overwrite to False vs. True?

1 Answer

df.write.insertInto works only if the table already exists in Hive.

df.write.insertInto("db.table1", overwrite=False) appends the data to the existing Hive table.

df.write.insertInto("db.table1", overwrite=True) replaces the existing data in the Hive table with the contents of df.

Example:

df.show()
#+----+---+
#|name| id|
#+----+---+
#|   a|  1|
#|   b|  2|
#+----+---+

#save the table to hive (same table that the insertInto calls below write to)
df.write.saveAsTable("moch.table1")

#from hive
#hive> select * from table1;
#OK
#a       1
#b       2

#overwriting data in hive
df.write.insertInto("moch.table1",overwrite=True)

#from hive
#hive> select * from table1;
#OK
#a       1
#b       2

#appending data to hive
df.write.insertInto("moch.table1",overwrite=False)

#from hive
#hive> select * from table1;
#OK
#a       1
#b       2
#a       1
#b       2
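
Because the example above overwrites the table with the same rows it already holds, the result looks unchanged. A minimal sketch with a second DataFrame makes the effect easier to see. The names overwrite_table, demo and df2 are hypothetical, and demo assumes pyspark is installed, Hive support is configured, and moch.table1 already exists with columns (name, id):

```python
def overwrite_table(df, table_name):
    """Replace the rows of an existing table with the rows of df.

    insertInto resolves columns by position, not by name, so the
    column order of df must match the column order of the table.
    """
    df.write.insertInto(table_name, overwrite=True)


def demo():
    # Assumption: pyspark is installed, Hive support is configured,
    # and moch.table1 already exists with columns (name, id).
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    # Hypothetical second DataFrame whose rows differ from df
    df2 = spark.createDataFrame([("c", 3), ("d", 4)], ["name", "id"])

    overwrite_table(df2, "moch.table1")

    # Inspect the table to check whether the old rows (a, b) are still present
    spark.table("moch.table1").show()
```

If the overwrite takes effect, only the rows of df2 should remain; seeing the old rows as well would point to an append.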

3 Comments

How can I use insertInto with an S3 location for which no Hive table has been created? The ETL job only loads the data to S3 so that downstream processes can consume it; creating a table is not required.
Thanks for the helpful answer. A minor suggestion to improve the example: use a different dataframe to demonstrate df.write.insertInto("moch.table1",overwrite=True). Right now, it is not clear whether the table itself gets overwritten or the duplicate rows get updated.
What happens when the output table is partitioned? Does it overwrite the whole table or just the partition?
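
On the partition question, one setting worth knowing is spark.sql.sources.partitionOverwriteMode (Spark 2.3 and later). As I understand the Spark documentation, the default "static" mode deletes every partition matching the partition specification before writing, which for an insertInto call with no partition specification means all of them, while "dynamic" mode replaces only the partitions that receive data. The setting is documented for data source tables, so treat the behaviour for Hive SerDe tables as an assumption to verify on your own setup. A minimal sketch, with overwrite_written_partitions as a hypothetical helper name:

```python
def overwrite_written_partitions(spark, df, table_name):
    """Overwrite only the partitions of table_name that df contains rows for.

    Assumption: table_name is an existing partitioned table and the
    partition columns come last in df, since insertInto matches
    columns by position.
    """
    # Switch from the default "static" mode to "dynamic" for this session
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
    df.write.insertInto(table_name, overwrite=True)
```

Note that the setting stays in effect for the rest of the session unless it is set back to "static".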
