Using pandas to write df to sqlite

Question

I'm trying to create a sqlite db from a csv file. After some searching it seems like this is possible using a pandas df. I've tried following some tutorials and the documentation but I can't figure this error out. Here's my code:

# Import libraries
import pandas, csv, sqlite3

# Create sqlite database and cursor
conn = sqlite3.connect('test.db')
c = conn.cursor()
# Create the table of pitches
c.execute("""CREATE TABLE IF NOT EXISTS pitches (
            pitch_type text,
            game_date text,
            release_speed real
            )""")

conn.commit()

df = pandas.read_csv('test2.csv')
df.to_sql('pitches', conn, if_exists='append', index=False)

conn.close()

When I run this code, I get the following error:

sqlite3.OperationalError: table pitches has no column named SL

SL is the first value in the first row in my csv file. I can't figure out why it's looking at the csv value as a column name, unless it thinks the first row of the csv should be the headers and is trying to match that to column names in the table? I don't think that was it either though because I tried changing the first value to an actual column name and got the same error.

EDIT:

When I have the headers in the csv, the dataframe looks like this:

     pitch_type  game_date  release_speed
0            SL  8/31/2017           81.9
1            SL  8/31/2017           84.1
2            SL  8/31/2017           81.9
...         ...        ...            ...
2919         SL   8/1/2017           82.3
2920         CU   8/1/2017           78.7

[2921 rows x 3 columns]

and I get the following error:

sqlite3.OperationalError: table pitches has no column named game_date

When I take the headers out of the csv file:

      SL  8/31/2017  81.9
0     SL  8/31/2017  84.1
1     SL  8/31/2017  81.9
2     SL  8/31/2017  84.1
...   ..        ...   ...
2918  SL   8/1/2017  82.3
2919  CU   8/1/2017  78.7

[2920 rows x 3 columns]

and I get the following error:

sqlite3.OperationalError: table pitches has no column named SL

EDIT #2:

I tried taking the table creation out of the code entirely, per this answer, with the following code:

# Import libraries
import pandas, csv, sqlite3

# Create sqlite database and cursor
conn = sqlite3.connect('test.db')
c = conn.cursor()

df = pandas.read_csv('test2.csv')
df.to_sql('pitches', conn, if_exists='append', index=False)

conn.close()

and still get the

sqlite3.OperationalError: table pitches has no column named SL

error

EDIT #3:

I changed the table creation code to the following:

# Create the table of pitches
dropTable = 'DROP TABLE pitches'
c.execute(dropTable)
createTable = "CREATE TABLE IF NOT EXISTS pitches(pitch_type text, game_date text, release_speed real)"
c.execute(createTable)

and it works now. Not sure what exactly changed, as it looks basically the same to me, but it works.
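One plausible explanation, consistent with the print(names) output reported in the comments on the answer (only ['pitch_type']): test.db already held a pitches table from an earlier run, and CREATE TABLE IF NOT EXISTS does nothing when a table with that name exists, so the old schema stayed in place until the DROP TABLE. The sketch below reproduces that under assumptions: it uses an in-memory database instead of test.db, and table_columns is a hypothetical helper, not part of sqlite3.

```python
import sqlite3

# In-memory stand-in for test.db (assumption for illustration).
conn = sqlite3.connect(":memory:")

CREATE_SQL = """CREATE TABLE IF NOT EXISTS pitches (
    pitch_type text,
    game_date text,
    release_speed real
)"""


def table_columns(conn, table):
    """Hypothetical helper: list column names via PRAGMA table_info."""
    # Each row describes one column; the column name is at index 1.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# Simulate a leftover table from an earlier run that has only one column.
conn.execute("CREATE TABLE pitches (pitch_type text)")

# No-op: the table already exists, so its schema is left untouched.
conn.execute(CREATE_SQL)
stale_columns = table_columns(conn, "pitches")
print(stale_columns)

# Dropping the table first lets the CREATE statement take effect.
conn.execute("DROP TABLE IF EXISTS pitches")
conn.execute(CREATE_SQL)
fresh_columns = table_columns(conn, "pitches")
print(fresh_columns)

conn.close()
```

If this is what happened, it would also explain why removing the CREATE TABLE statement in Edit #2 changed nothing: the old table was still in the file.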

Can you post what your dataframe looks like? You should definitely get a different error once you have assigned column names in your csv file.
– BernardL, Commented Oct 27, 2018 at 17:21

BernardL · Accepted Answer · 2018-10-27 19:45:47Z

5

Check your column names. I am able to replicate your code successfully with no errors. The names variable gets all the columns names from the sqlite table and you can compare them with the dataframe headers with df.columns.

# Import libraries
import pandas as pd, csv, sqlite3

# Create sqlite database and cursor
conn = sqlite3.connect('test.db')
c = conn.cursor()
# Create the table of pitches
c.execute("""CREATE TABLE IF NOT EXISTS pitches (
            pitch_type text,
            game_date text,
            release_speed real
            )""")
conn.commit()

test = conn.execute('SELECT * from pitches')
names = [description[0] for description in test.description]
print(names)

df = pd.DataFrame([['SL','8/31/2017','81.9']],columns = ['pitch_type','game_date','release_speed'])
df.to_sql('pitches', conn, if_exists='append', index=False)

conn.execute('SELECT * from pitches').fetchall()
>> [('SL', '8/31/2017', 81.9), ('SL', '8/31/2017', 81.9)]

I am guessing there might be some whitespace in your column headers.
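A minimal sketch of how that guess could be checked and worked around, assuming the CSV headers carry stray spaces. The sample CSV text and the in-memory database are made up for illustration. It strips the headers, then lists any dataframe columns the table does not have before appending.

```python
import io
import sqlite3

import pandas as pd

# Made-up CSV whose second header has stray spaces around it.
csv_text = "pitch_type, game_date ,release_speed\nSL,8/31/2017,81.9\n"
df = pd.read_csv(io.StringIO(csv_text))

# Normalise the headers so ' game_date ' becomes 'game_date'.
df.columns = df.columns.str.strip()

conn = sqlite3.connect(":memory:")  # stand-in for test.db
conn.execute(
    "CREATE TABLE pitches (pitch_type text, game_date text, release_speed real)"
)

# Column names the table actually has (name is field 1 of each row).
table_cols = [row[1] for row in conn.execute("PRAGMA table_info(pitches)")]

# Dataframe columns that to_sql would fail on because the table lacks them.
missing = [col for col in df.columns if col not in table_cols]
print(missing)

if not missing:
    df.to_sql("pitches", conn, if_exists="append", index=False)

rows = conn.execute("SELECT * FROM pitches").fetchall()
print(rows)
conn.close()
```

A non-empty missing list points at the exact header that does not match the table.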

edited Oct 27, 2018 at 19:45

answered Oct 27, 2018 at 18:00


5 Comments

jbf Over a year ago

It's also strange to me that the error says "no column named game_date" when that's the second column in the table. I can't figure out why it would skip over the pitch_type column, or decide that it was okay.

jbf Over a year ago

Using your answer I found that when I print(names) in line 17 of your code it only shows ['pitch_type'], so there must be something wrong with my syntax of creating the table. I can't figure out what though.

BernardL Over a year ago

Are your other columns numbered? You need to assign column headers for all of them. to_sql assigns the column values based on your column names in your dataframe in this case.

jbf Over a year ago

I changed the table creation code (added it to Edit #3 in the original post), and it works now. Not sure what was wrong in the initial code. Thanks for the help.

Rigoberta Raviolini Over a year ago

I got a similar error message with sqlite3 while using df.to_sql('table_name',engine,index=False, if_exists='append'). It turned out that in my case the pandas dataframe df had some multi-level indexes. After I dropped them, it worked fine.
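For the multi-level case in the last comment, here is a rough sketch. It assumes the multi-level index was on the columns, which the comment does not say. Instead of dropping levels, as the commenter did, it flattens them by joining the level names with an underscore. The column tuples are made up for illustration.

```python
import sqlite3

import pandas as pd

# Made-up dataframe with two-level column labels.
df = pd.DataFrame(
    [["SL", 81.9]],
    columns=pd.MultiIndex.from_tuples([("pitch", "type"), ("release", "speed")]),
)

# Flatten each (level0, level1) tuple into a single string such as 'pitch_type'.
df.columns = ["_".join(level for level in col if level) for col in df.columns]
flat_columns = list(df.columns)
print(flat_columns)

conn = sqlite3.connect(":memory:")  # stand-in for a real database
df.to_sql("pitches", conn, if_exists="append", index=False)
stored = conn.execute("SELECT pitch_type, release_speed FROM pitches").fetchall()
print(stored)
conn.close()
```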

It_is_Chris · Accepted Answer · 2018-10-27 17:27:36Z

3

If you are trying to create a table from a csv file you can just run sqlite3 and do:

sqlite> .mode csv
sqlite> .import c:/path/to/file/myfile.csv myTableName

answered Oct 27, 2018 at 17:27


Yura Beznos · Accepted Answer · 2018-10-27 17:06:13Z

As you can see from pandas read_csv docs:

header : int or list of ints, default 'infer'
    Row number(s) to use as the column names, and the start of the
    data.  Default behavior is to infer the column names: if no names
    are passed the behavior is identical to ``header=0`` and column
    names are inferred from the first line of the file, if column
    names are passed explicitly then the behavior is identical to
    ``header=None``. Explicitly pass ``header=0`` to be able to
    replace existing names. The header can be a list of integers that
    specify row locations for a multi-index on the columns
    e.g. [0,1,3]. Intervening rows that are not specified will be
    skipped (e.g. 2 in this example is skipped). Note that this
    parameter ignores commented lines and empty lines if
    ``skip_blank_lines=True``, so header=0 denotes the first line of
    data rather than the first line of the file.

That means read_csv is using your first row as the header names.
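A short sketch of the difference, using made-up CSV text with no header row. With the default header='infer', the first data row becomes the column names. Passing header=None together with names keeps every row as data and assigns the names expected by the table.

```python
import io

import pandas as pd

# Made-up CSV content without a header row.
csv_text = "SL,8/31/2017,81.9\nSL,8/31/2017,84.1\n"

# Default behaviour: the first line is consumed as the header.
inferred = pd.read_csv(io.StringIO(csv_text))
print(list(inferred.columns), len(inferred))

# Explicit names: every line is treated as data.
named = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["pitch_type", "game_date", "release_speed"],
)
print(list(named.columns), len(named))
```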
