
The following code works:

import pandas as pd
import csv
import psycopg2

df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
df=df.rename(columns = {'Criteria ID':'Criteria_ID','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df.to_csv(r'C:\Users\Harshal\Desktop\tar.csv',index=False)

conn = psycopg2.connect(host='1.11.11.111',
                   dbname='postgres',
                   user='postgres',
                   password='myPassword',
                   port='1234')  
cur = conn.cursor()
f = open(r'C:\Users\Harshal\Desktop\tar.csv', 'r')
cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f)
conn.commit()
conn.close()
f.close()

But instead of saving the modified data frame to disk, I want to upload it directly into the PostgreSQL table. I tried cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", df), passing the data frame in place of the file handle, but it throws an error. Note: the copy_expert call cannot be avoided, as I'm saving the csv with some conditions applied. My table structure:

create table public.geotargets_india(
Criteria_ID integer not null,
Name character varying(50) COLLATE pg_catalog."default" NOT NULL,
Canonical_Name character varying(100) COLLATE pg_catalog."default" NOT NULL,
Parent_ID NUMERIC(10,2),
Country_Code character varying(10) COLLATE pg_catalog."default" NOT NULL,
Target_Type character varying(50) COLLATE pg_catalog."default" NOT NULL,
Status character varying(50) COLLATE pg_catalog."default" NOT NULL
)
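For reference, psycopg2's copy_expert accepts any file-like object, so an in-memory io.StringIO buffer can stand in for the intermediate file. A minimal sketch (the connection and cursor setup from the code above is assumed, and the actual COPY call is left commented out; the small frame here is a stand-in for the filtered one):

```python
import io

import pandas as pd

# Stand-in for the filtered data frame from the question.
df = pd.DataFrame({"Criteria_ID": [1023191], "Country_Code": ["IN"]})

buf = io.StringIO()
df.to_csv(buf, index=False)  # write the CSV text into memory instead of to disk
buf.seek(0)                  # rewind so copy_expert reads from the start

# With a live connection, the buffer is passed exactly where the file handle was:
# cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", buf)
print(buf.getvalue().splitlines()[0])  # Criteria_ID,Country_Code
```

This keeps the "save CSV with some condition" step intact; only the destination of to_csv changes from a path to a buffer.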


EDIT: I tried

import pandas as pd
import csv
import psycopg2
from sqlalchemy import create_engine

df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
df=df.rename(columns = {'Criteria ID':'Criteria_Id','Canonical         Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df['Canonical_Name']=df['Canonical_Name'].str.replace(',', " ")
engine = create_engine('postgresql+psycopg2://postgres:[email protected]:1234/postgres')
df.to_sql(
 'geotargets_india',
  con=engine,
  schema=None, 
  if_exists='append', 
  index=False
)

But getting error: UndefinedColumn: column "Criteria_Id" of relation "geotargets_india" does not exist LINE 1: INSERT INTO geotargets_india ("Criteria_Id", "Name", "Canoni...

EDIT2: The above code works if I drop my table first, and the new table the script creates is as follows:

CREATE TABLE public.geotargets_india
(
"Criteria_Id" bigint,
"Name" text COLLATE pg_catalog."default",
"Canonical_Name" text COLLATE pg_catalog."default",
"Parent_ID" double precision,
"Country_Code" text COLLATE pg_catalog."default",
"Target_Type" text COLLATE pg_catalog."default",
"Status" text COLLATE pg_catalog."default"
)

Why is it not working with a predefined table schema?

  • As @SarindraThérèse explained with the example below, the easiest way is to skip the intermediate "tar.csv" and update PostgreSQL straight away using df.to_sql() Commented Apr 26, 2021 at 8:26
  • @IODEV If you open the CSV at the above link, it has a column named Canonical Name containing data like "Kabul,Kabul,Afghanistan". Those ',' characters are treated as column separators, hence the undefined column error. Commented Apr 26, 2021 at 8:35
  • Does the table contain exactly the same column names and order, ie: Criteria ID, Name, Canonical Name, Parent ID, Country Code, Target Type, Status? Btw, what's the table name? Commented Apr 26, 2021 at 8:46
  • @IODEV I added an image of the table; it's named 'geotargets_india'. Commented Apr 26, 2021 at 8:56
  • @IODEV This line of my code does exactly that. df=df.rename(columns = {'Criteria ID':'Criteria_ID','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'}) Commented Apr 26, 2021 at 9:00

2 Answers


I tried your code, corrected a few lines, and it worked for me:

import pandas as pd
from sqlalchemy import create_engine

df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv', delimiter=',')
print(df)
df=df.rename(columns = {'Criteria ID':'Criteria_Id','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df['Canonical_Name']=df['Canonical_Name'].str.replace(',', " ")
engine = create_engine('postgresql+psycopg2://collaborateur1:nG@e3P@tapp581lv:2345/base_project')
df.to_sql('geotargets_india',con = engine,schema=None,if_exists='append',index=False)

I added delimiter=',' and corrected the extra whitespace in 'Canonical Name'.
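The UndefinedColumn error from the predefined table likely comes from PostgreSQL's identifier folding: unquoted column names in CREATE TABLE (as in the question's DDL) are stored lowercase, while to_sql double-quotes the frame's column names, making them case-sensitive, so "Criteria_Id" does not match the stored criteria_id. A minimal sketch of one workaround, lowercasing the frame's columns so the quoted names match the stored ones (small stand-in frame, no database needed):

```python
import pandas as pd

# Stand-in for the renamed data frame; mixed-case names as produced by the rename.
df = pd.DataFrame({"Criteria_Id": [1023191], "Country_Code": ["IN"]})

# to_sql emits INSERT INTO geotargets_india ("Criteria_Id", ...), and quoted
# identifiers are case-sensitive in PostgreSQL. Lowercasing the columns makes
# the quoted names match the lowercase names an unquoted CREATE TABLE stored.
df.columns = df.columns.str.lower()
print(list(df.columns))  # ['criteria_id', 'country_code']
```

This also explains EDIT2: with the table dropped, to_sql creates it with quoted mixed-case names, which then match on subsequent inserts.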


4 Comments

I copied your exact code except changed the engine to my own db. still getting undefined column error as mentioned in the question. Also, there is nothing visible in the above answer where you showed your table.
It worked when I dropped the table and reran the script. But I don't understand where my table schema went wrong?
Try dropping the table first, or replacing if_exists='append' with 'replace'.
@eras'q: Glad it worked out with the suggestion from Sarindra Thérèse. Regarding the initial error, maybe the column Criteria_Id was renamed or dropped by mistake.

I recommend using SQLAlchemy with to_sql; it's easy and simple:

import pandas as pd
from sqlalchemy import create_engine

df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
engine = create_engine('postgresql+psycopg2://user:password@host:port/database')
df.to_sql('geotargets_india', engine, if_exists='append', index=False)

3 Comments

I tried this approach and it doesn't work. As I mentioned above, cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f) is important to fit my csv properly into the table.
@eras'q: You can easily mangle the dataframe data to fit the sql table structure. Can you update your question with an example of the table structure?
@IODEV I updated my question with the table structure. If what cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f) does could be done with just pandas, it would make life easy.
