I have the following table in my MySQL database:

+----------------+------------+------+-----+---------+----------------+
| Field          | Type       | Null | Key | Default | Extra          |
+----------------+------------+------+-----+---------+----------------+
| id             | bigint(20) | NO   | PRI | NULL    | auto_increment |
| VERSION        | bigint(20) | NO   |     | NULL    |                |
| user_id        | bigint(20) | NO   | MUL | NULL    |                |
| measurement_id | bigint(20) | NO   | MUL | NULL    |                |
| day            | timestamp  | NO   |     | NULL    |                |
| hour           | tinyint(4) | NO   |     | NULL    |                |
| hour_timestamp | timestamp  | NO   |     | NULL    |                |
| value          | bigint(20) | NO   |     | NULL    |                |
+----------------+------------+------+-----+---------+----------------+

I'm trying to save a Spark DataFrame whose rows have the following case class structure:

case class Record(val id : Int,
                  val VERSION : Int,
                  val user_id : Int,
                  val measurement_id : Int,
                  val day : Timestamp,
                  val hour : Int,
                  val hour_timestamp : Timestamp,
                  val value : Long  )

When I try to save the DataFrame to MySQL through the JDBC driver using:

dataFrame.insertIntoJDBC(...)

I get a primary key violation error:

com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry '1' for key 'PRIMARY'
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)

I tried setting id=0 as the default value for all the rows, and I also tried removing the id field from the case class. Neither worked.

Can anyone help?

Thanks, Tomer

  • Are you sure that the only duplicate is the first record? Commented Nov 16, 2016 at 23:17

1 Answer

Found it. The problem was a mismatch between the SQL column types and the Java types used in my case class. According to: https://www.cis.upenn.edu/~bcpierce/courses/629/jdkdocs/guide/jdbc/getstart/mapping.doc.html

bigint SQL columns should be represented as Long in Java. After I changed my case class to:

import java.sql.Timestamp

// bigint columns map to Long; tinyint maps to Int; timestamp maps to java.sql.Timestamp
case class Record(id: Long,
                  VERSION: Long,
                  user_id: Long,
                  measurement_id: Long,
                  day: Timestamp,
                  hour: Int,
                  hour_timestamp: Timestamp,
                  value: Long)

and set id=0 for all the records in the DataFrame, it worked. Thanks
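
For anyone wanting to see the id=0 step spelled out, here is a minimal sketch in plain Scala (no Spark dependency). The helper name withGeneratedId and the sample values are made up for illustration; they are not from my actual job. It relies on MySQL's default behaviour of generating the next auto_increment value when 0 is inserted into such a column, which holds only if the NO_AUTO_VALUE_ON_ZERO SQL mode is not enabled.

```scala
import java.sql.Timestamp

// Same shape as the table: bigint -> Long, tinyint -> Int, timestamp -> Timestamp.
case class Record(id: Long,
                  VERSION: Long,
                  user_id: Long,
                  measurement_id: Long,
                  day: Timestamp,
                  hour: Int,
                  hour_timestamp: Timestamp,
                  value: Long)

object RecordExample {
  // Hypothetical helper: reset the id to 0 so that MySQL assigns the key
  // through auto_increment (assumes NO_AUTO_VALUE_ON_ZERO is not set).
  def withGeneratedId(r: Record): Record = r.copy(id = 0L)

  def main(args: Array[String]): Unit = {
    val day = Timestamp.valueOf("2016-11-16 00:00:00")
    val nextHour = Timestamp.valueOf("2016-11-16 01:00:00")

    // Illustrative rows that share the same id, which would collide on insert.
    val records = Seq(
      Record(1L, 1L, 42L, 7L, day, 0, day, 100L),
      Record(1L, 1L, 42L, 7L, day, 1, nextHour, 250L)
    ).map(withGeneratedId)

    // Every record now carries id = 0; the remaining fields are unchanged.
    records.foreach(println)
  }
}
```

The resulting sequence can then be turned into a DataFrame and written through JDBC in the same way as before.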
