
I'm trying to create a sqlite db from a csv file. After some searching it seems like this is possible using a pandas df. I've tried following some tutorials and the documentation but I can't figure this error out. Here's my code:

# Import libraries
import pandas, csv, sqlite3

# Create sqlite database and cursor
conn = sqlite3.connect('test.db')
c = conn.cursor()
# Create the table of pitches
c.execute("""CREATE TABLE IF NOT EXISTS pitches (
            pitch_type text,
            game_date text,
            release_speed real
            )""")

conn.commit()

df = pandas.read_csv('test2.csv')
df.to_sql('pitches', conn, if_exists='append', index=False)

conn.close()

When I run this code, I get the following error:

sqlite3.OperationalError: table pitches has no column named SL

SL is the first value in the first row in my csv file. I can't figure out why it's looking at the csv value as a column name, unless it thinks the first row of the csv should be the headers and is trying to match that to column names in the table? I don't think that was it either though because I tried changing the first value to an actual column name and got the same error.

EDIT:

When I have the headers in the csv, the dataframe looks like this:

     pitch_type  game_date  release_speed
0            SL  8/31/2017           81.9
1            SL  8/31/2017           84.1
2            SL  8/31/2017           81.9
...         ...        ...            ...
2919         SL   8/1/2017           82.3
2920         CU   8/1/2017           78.7

[2921 rows x 3 columns]

and I get the following error:

sqlite3.OperationalError: table pitches has no column named game_date

When I take the headers out of the csv file:

      SL  8/31/2017  81.9
0     SL  8/31/2017  84.1
1     SL  8/31/2017  81.9
2     SL  8/31/2017  84.1
...   ..        ...   ...
2918  SL   8/1/2017  82.3
2919  CU   8/1/2017  78.7

[2920 rows x 3 columns]

and I get the following error:

sqlite3.OperationalError: table pitches has no column named SL

EDIT #2:

I tried taking the table creation out of the code entirely, per this answer, with the following code:

# Import libraries
import pandas, csv, sqlite3

# Create sqlite database and cursor
conn = sqlite3.connect('test.db')
c = conn.cursor()

df = pandas.read_csv('test2.csv')
df.to_sql('pitches', conn, if_exists='append', index=False)

conn.close()

and I still get the same error:

sqlite3.OperationalError: table pitches has no column named SL

EDIT #3:

I changed the table creation code to the following:

# Create the table of pitches
dropTable = 'DROP TABLE pitches'
c.execute(dropTable)
createTable = "CREATE TABLE IF NOT EXISTS pitches(pitch_type text, game_date text, release_speed real)"
c.execute(createTable)

and it works now. Not sure what exactly changed, as it looks basically the same to me, but it works.
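A possible explanation, offered as a sketch rather than a confirmed diagnosis: test.db may already have contained a pitches table from an earlier run with a different set of columns (the print(names) check reported in the comments showed only ['pitch_type']). In that case CREATE TABLE IF NOT EXISTS silently does nothing, and to_sql with if_exists='append' tries to insert into the old table. The example below reproduces that situation with an in-memory database; the single-column stale schema and the sample row are assumptions made for illustration.

```python
import sqlite3

import pandas as pd

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# Assumption: an earlier run left behind a table with only one column.
conn.execute("CREATE TABLE pitches (pitch_type text)")

create_sql = """CREATE TABLE IF NOT EXISTS pitches (
    pitch_type text,
    game_date text,
    release_speed real
)"""

# No-op: a table named pitches already exists, so the schema is unchanged.
conn.execute(create_sql)
stale_columns = [row[1] for row in conn.execute("PRAGMA table_info(pitches)")]

df = pd.DataFrame(
    [["SL", "8/31/2017", 81.9]],
    columns=["pitch_type", "game_date", "release_speed"],
)

# Appending to the stale table fails because game_date is missing.
# The exact exception class can differ between pandas versions,
# so this sketch catches Exception and keeps only the message.
try:
    df.to_sql("pitches", conn, if_exists="append", index=False)
    append_error = None
except Exception as exc:
    append_error = str(exc)

# Dropping the stale table first lets the CREATE statement take effect.
conn.execute("DROP TABLE IF EXISTS pitches")
conn.execute(create_sql)
df.to_sql("pitches", conn, if_exists="append", index=False)
rows = conn.execute("SELECT * FROM pitches").fetchall()
conn.close()
```

Using DROP TABLE IF EXISTS instead of a bare DROP TABLE avoids a second error when the table is not there yet, for example on the very first run.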

  • Can you post what your dataframe looks like? You should definitely get a different error once you have assigned column names in your csv file. Commented Oct 27, 2018 at 17:21
  • Edited with the requested information. Commented Oct 27, 2018 at 17:40

3 Answers

5

Check your column names. I was able to run your code with no errors. In the code below, the names variable collects all the column names from the sqlite table, and you can compare them with the dataframe headers in df.columns.

# Import libraries
import pandas as pd, csv, sqlite3

# Create sqlite database and cursor
conn = sqlite3.connect('test.db')
c = conn.cursor()
# Create the table of pitches
c.execute("""CREATE TABLE IF NOT EXISTS pitches (
            pitch_type text,
            game_date text,
            release_speed real
            )""")
conn.commit()

test = conn.execute('SELECT * from pitches')
names = [description[0] for description in test.description]
print(names)

df = pd.DataFrame([['SL','8/31/2017','81.9']],columns = ['pitch_type','game_date','release_speed'])
df.to_sql('pitches', conn, if_exists='append', index=False)

conn.execute('SELECT * from pitches').fetchall()
>> [('SL', '8/31/2017', 81.9), ('SL', '8/31/2017', 81.9)]

My guess is that there is stray whitespace in your column headers.
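If whitespace in the headers is the cause, a minimal sketch of the check and the fix could look like this. The inline CSV text with padded header names is made up for illustration, and PRAGMA table_info is used here as an alternative way to list the table's columns.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical CSV whose second header is padded with spaces.
csv_text = "pitch_type, game_date ,release_speed\nSL,8/31/2017,81.9\n"

df = pd.read_csv(io.StringIO(csv_text))
raw_headers = list(df.columns)  # read_csv keeps the padding by default

# Remove leading and trailing whitespace from every header.
df.columns = df.columns.str.strip()

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE IF NOT EXISTS pitches (
    pitch_type text,
    game_date text,
    release_speed real
)""")

# Column names as sqlite sees them (second field of each table_info row).
table_columns = [row[1] for row in conn.execute("PRAGMA table_info(pitches)")]

# Any dataframe header that the table does not know about.
missing = [col for col in df.columns if col not in table_columns]

# Only append when every header matches a table column.
if not missing:
    df.to_sql("pitches", conn, if_exists="append", index=False)

rows = conn.execute("SELECT * FROM pitches").fetchall()
conn.close()
```

Printing raw_headers with repr makes the padding visible, which is easy to overlook when the dataframe is printed as a table.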


5 Comments

It's also strange to me that the error says "no column named game_date" when that's the second column in the table. I can't figure out why it would skip over the pitch_type column, or decide that it was okay.
Using your answer I found that when I print(names) in line 17 of your code it only shows ['pitch_type'], so there must be something wrong with my syntax of creating the table. I can't figure out what though.
Are your other columns numbered? You need to assign column headers for all of them. to_sql assigns the column values based on your column names in your dataframe in this case.
I changed the table creation code (added it to Edit #3 in the original post), and it works now. Not sure what was wrong in the initial code. Thanks for the help.
I got a similar error message with sqlite3 while using df.to_sql('table_name',engine,index=False, if_exists='append'). In my case it turned out that my pandas dataframe df had multi-level indexes. After I dropped them, it worked fine.
3

If you are trying to create a table from a csv file you can just run sqlite3 and do:

sqlite> .mode csv
sqlite> .import c:/path/to/file/myfile.csv myTableName


0

As you can see from the pandas read_csv docs:

header : int or list of ints, default 'infer'
    Row number(s) to use as the column names, and the start of the
    data.  Default behavior is to infer the column names: if no names
    are passed the behavior is identical to ``header=0`` and column
    names are inferred from the first line of the file, if column
    names are passed explicitly then the behavior is identical to
    ``header=None``. Explicitly pass ``header=0`` to be able to
    replace existing names. The header can be a list of integers that
    specify row locations for a multi-index on the columns
    e.g. [0,1,3]. Intervening rows that are not specified will be
    skipped (e.g. 2 in this example is skipped). Note that this
    parameter ignores commented lines and empty lines if
    ``skip_blank_lines=True``, so header=0 denotes the first line of
    data rather than the first line of the file.

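That means read_csv uses the first row of your file as the header names by default, so a csv without a header row loses its first data row to the column names.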
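For a csv file without a header row, one option is to pass header=None together with explicit names, so the first line is read as data. The sketch below uses inline CSV text in place of test2.csv and assumes the column names should match the pitches table from the question.

```python
import io

import pandas as pd

# Stand-in for a headerless csv file: two data rows, no header line.
csv_text = "SL,8/31/2017,81.9\nSL,8/31/2017,84.1\n"

# Default behaviour: the first data row is consumed as column names.
inferred = pd.read_csv(io.StringIO(csv_text))
print(list(inferred.columns))
print(len(inferred))

# header=None treats every line as data; names supplies the headers.
named = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["pitch_type", "game_date", "release_speed"],
)
print(list(named.columns))
print(len(named))
```

With the names in place, the dataframe headers line up with the table columns, which is what to_sql relies on when appending.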
