Python - adding multiple tables into a single CSV with pandas

Question

I'm wondering how to write the tables parsed by pandas into a single CSV. I have managed to write each table to its own CSV file, but I would like them all in one file. This is my current code, which produces multiple CSVs:

import pandas as pd
import csv

url = "https://fasttrack.grv.org.au/RaceField/ViewRaces/228697009?raceId=318809897"

data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )

for i, datas in enumerate(data):

    datas.to_csv("new{}.csv".format(i), header = False, index = False)

Is the schema for all tables the same?

– Sam Chats, Commented May 9, 2018 at 3:48

Yes, the schema is the same.

– user3170725, Commented May 9, 2018 at 4:33

jezrael · Accepted Answer · 2018-05-09 06:45:52Z

4

I think you only need concat, because data is a list of DataFrames:

df = pd.concat(data, ignore_index=True)
df.to_csv(file, header=False, index=False)
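As a minimal, self-contained sketch of the same idea, the snippet below concatenates a list of small DataFrames and writes the result as one CSV. The `tables` list is a hypothetical stand-in for what `pd.read_html` returns, and an in-memory `io.StringIO` buffer is assumed in place of a real file so the example runs without network or disk access.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the list of DataFrames returned by pd.read_html.
tables = [
    pd.DataFrame([["a", 1], ["b", 2]]),
    pd.DataFrame([["c", 3]]),
]

# Stack the tables vertically; ignore_index=True renumbers the rows 0..n-1.
combined = pd.concat(tables, ignore_index=True)

# Write to an in-memory buffer (an assumption for the demo; pass a path such
# as "new.csv" to write a real file instead).
buffer = io.StringIO()
combined.to_csv(buffer, header=False, index=False)
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)
```

Because the tables share a schema, the rows of every table end up in the same columns of the single output.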

edited May 9, 2018 at 6:45

answered May 9, 2018 at 6:34


1 Comment

Benares Over a year ago

You can use axis=1 in concat to put the dataframes side-by-side instead of one after the other (not sure which one you want).
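To illustrate the difference this comment describes, here is a small sketch with two made-up tables that share a schema (the column names and values are invented for the example). The default `axis=0` stacks rows, while `axis=1` places the tables next to each other and aligns them on the row index.

```python
import pandas as pd

# Two hypothetical tables with the same columns.
first = pd.DataFrame({"dog": ["Rex", "Fido"], "box": [1, 2]})
second = pd.DataFrame({"dog": ["Ace", "Bolt"], "box": [3, 4]})

# One after the other: 4 rows, 2 columns.
stacked = pd.concat([first, second], ignore_index=True)

# Side by side: 2 rows, 4 columns (column names repeat).
side_by_side = pd.concat([first, second], axis=1)

print(stacked.shape)
print(side_by_side.shape)
```

With `axis=1`, tables of different lengths would be padded with NaN where a row index is missing from one of them, so the side-by-side layout is probably most useful when every table has the same number of rows.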

klvmungai · Accepted Answer · 2018-05-09 12:21:35Z

You have 2 options:

1. Tell pandas to append to the CSV file each time it writes a table.

data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )
# mode='a' appends, so a new.csv left over from an earlier run is extended, not overwritten
for datas in data:
    datas.to_csv("new.csv", header=False, index=False, mode='a')

2. Merge all the tables into one DataFrame and then write that to the CSV file.

data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )
df = pd.concat(data, ignore_index=True)
df.to_csv("new.csv", header=False, index=False)
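The two options can be compared with a short sketch like the one below. The helper names `write_by_appending` and `write_by_concat` are hypothetical, the `tables` list stands in for the `pd.read_html` result, and a temporary directory is assumed so nothing is left on disk. The append helper also deletes a stale file first, which is one way to keep repeated runs from duplicating rows.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for the tables returned by pd.read_html.
tables = [pd.DataFrame([[1, 2]]), pd.DataFrame([[3, 4]])]


def write_by_appending(frames, path):
    """Append each table in turn, after removing any stale file."""
    if os.path.exists(path):
        os.remove(path)
    for frame in frames:
        frame.to_csv(path, header=False, index=False, mode="a")


def write_by_concat(frames, path):
    """Merge first, then write once (the default mode overwrites)."""
    pd.concat(frames, ignore_index=True).to_csv(path, header=False, index=False)


with tempfile.TemporaryDirectory() as tmp_dir:
    appended_path = os.path.join(tmp_dir, "appended.csv")
    concat_path = os.path.join(tmp_dir, "concat.csv")

    write_by_appending(tables, appended_path)
    write_by_appending(tables, appended_path)  # second run, still two rows
    write_by_concat(tables, concat_path)

    with open(appended_path) as handle:
        appended_lines = handle.read().splitlines()
    with open(concat_path) as handle:
        concat_lines = handle.read().splitlines()

print(appended_lines == concat_lines)
```

For tables with an identical schema, both approaches should give the same file contents; the concat version writes in a single call and does not depend on what was in the file before.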

Edit

To keep the DataFrames separated in the CSV file, stick with option #1 but with a few additions:

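data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )
# newline='' is what the pandas docs recommend when passing an open file object to to_csv
with open('new.csv', 'a', newline='') as csv_stream:
    for datas in data:
        datas.to_csv(csv_stream, header=False, index=False)
        csv_stream.write('\n')  # blank row after each table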
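A slight variation on the code above, sketched under the assumption that the blank row should appear only between tables and not after the last one. The `tables` list is a hypothetical stand-in for the `pd.read_html` result, and an `io.StringIO` buffer replaces the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the tables returned by pd.read_html.
tables = [
    pd.DataFrame([[1, 2], [3, 4]]),
    pd.DataFrame([[5, 6]]),
]

stream = io.StringIO()
for position, table in enumerate(tables):
    if position > 0:
        stream.write("\n")  # separator row before every table except the first
    table.to_csv(stream, header=False, index=False)

separated_lines = stream.getvalue().splitlines()
print(separated_lines)
```

Note that `pd.read_csv` skips blank lines by default, so a file written this way would read back as one continuous table unless it is split on the empty rows first.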

Thank you! Would you know how to keep the tables separated during the concat, so they aren't directly after one another? For example, with one blank row between them.

Sam Chats · Accepted Answer · 2018-05-09 03:50:57Z

0

all_dfs = []

for i, datas in enumerate(data):
    all_dfs.append(datas.to_csv("new{}.csv".format(i), header = False, index = False))

result = pd.concat(all_dfs)

answered May 9, 2018 at 3:50


2 Comments

Sam Chats Over a year ago

This can be a one-liner with list comprehension, but I chose the form above for clarity.

user3170725 Over a year ago

Thanks for your reply. I'm getting an error with that code: ValueError: All objects passed were None
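A likely explanation for this error: `to_csv` returns `None` whenever it is given a path or buffer to write to, so the list in the answer above ends up holding only `None` values, which `pd.concat` rejects. The sketch below shows the return value and then concatenates the DataFrames themselves; the `tables` list is a hypothetical stand-in for the `pd.read_html` result.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the tables returned by pd.read_html.
tables = [pd.DataFrame([[1, 2]]), pd.DataFrame([[3, 4]])]

# to_csv writes to its target and returns None when a path or buffer is given.
return_value = tables[0].to_csv(io.StringIO(), header=False, index=False)
print(return_value)

# Collect the DataFrames, not the return values of to_csv, then concatenate.
result = pd.concat(tables, ignore_index=True)
print(result.shape)
```

Once `result` exists, a single `result.to_csv(...)` call writes all tables to one file.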
