Making DataFrame from CSV output

Question

Here is my code:

import requests, re, pandas, csv
from bs4 import BeautifulSoup

r=requests.get("http://www.hltv.org/?pageid=188&statsfilter=2816&offset=0")
c=r.content

table=BeautifulSoup(c,"html.parser")

for row in table.find_all('div', style=re.compile(r'width:606px;height:22px;background-color')):
    data=row.get_text(strip=True, separator=',')
    print(data)

Here is the scraped output:

5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),nuke,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),overpass,IEM Katowice 2017
5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),nuke,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),train,IEM Katowice 2017
4/3 17,Immortals (10),FaZe (16),mirage,IEM Katowice 2017
2/3 17,Virtus.pro (14),Heroic (16),nuke,IEM Katowice 2017
2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017
2/3 17,SK (16),North (8),cbble,IEM Katowice 2017
2/3 17,Cloud9 (12),North (16),cbble,IEM Katowice 2017
2/3 17,Natus Vincere (12),Heroic (16),overpass,IEM Katowice 2017
2/3 17,Virtus.pro (16),SK (14),inferno,IEM Katowice 2017

What is a good way to make a pandas.DataFrame from this output?

The first column should be "Date" (like 5/3/17 in the output), the second "Team1" (like Astralis), the third "Team2" (like FaZe), the fourth "Map" (like inferno), and the fifth "Event" (like IEM Katowice).
– Juho M, Commented Mar 14, 2017 at 19:53

dmlicht · Accepted Answer · 2017-03-14 20:13:37Z

You can use the function pandas.read_csv. If you don't want to write your string to an actual file, you can make pandas treat it as one by wrapping the string in a StringIO object.

import pandas as pd
from io import StringIO

csv_string = '''
5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),nuke,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),overpass,IEM Katowice 2017
5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),nuke,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),train,IEM Katowice 2017
4/3 17,Immortals (10),FaZe (16),mirage,IEM Katowice 2017
2/3 17,Virtus.pro (14),Heroic (16),nuke,IEM Katowice 2017
2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017
2/3 17,SK (16),North (8),cbble,IEM Katowice 2017
2/3 17,Cloud9 (12),North (16),cbble,IEM Katowice 2017
2/3 17,Natus Vincere (12),Heroic (16),overpass,IEM Katowice 2017
2/3 17,Virtus.pro (16),SK (14),inferno,IEM Katowice 2017
'''

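csv_string_io = StringIO(csv_string)
# The data has no header row, so pass header=None and name the columns explicitly.
frame = pd.read_csv(csv_string_io, header=None,
                    names=["Date", "Team1", "Team2", "Map", "Event"])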
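
If the scraper hands you one string per row (as the loop in the question does), you may not need StringIO at all. The following is a minimal sketch, not a definitive implementation. It assumes that no field contains a comma, that the dates are day/month followed by a two-digit year, and that each team field looks like "Name (score)". The `rows` list and the extra `Team1Score`/`Team2Score` columns are hypothetical names introduced for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the strings produced by the scraping loop,
# one comma-separated string per table row.
rows = [
    "5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017",
    "5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017",
    "2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017",
]

columns = ["Date", "Team1", "Team2", "Map", "Event"]

# Assumption: no field contains a comma, so a plain split is enough.
frame = pd.DataFrame([row.split(",") for row in rows], columns=columns)

# Split "Astralis (16)" into a team name and an integer score.
for col in ["Team1", "Team2"]:
    parts = frame[col].str.extract(r"^(?P<name>.+?)\s*\((?P<score>\d+)\)$")
    frame[col] = parts["name"]
    frame[col + "Score"] = parts["score"].astype(int)

# Assumption: "5/3 17" means day 5, month 3, year 2017.
frame["Date"] = pd.to_datetime(frame["Date"], format="%d/%m %y")

print(frame)
```

In the real script you would append `data` to a list inside the loop instead of printing it, then pass that list in place of `rows`.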

edited Mar 14, 2017 at 20:13

answered Mar 14, 2017 at 19:46

6 Comments

Juho M Over a year ago

Yes, but then I would have to make a CSV file first. Is there a way to make a DataFrame without creating a CSV file first? The scraped output is only in CSV format.

parsethis Over a year ago

@JuhoM Well, you said your data is in CSV format. What is the exact format of the data coming from the scraper?

dmlicht Over a year ago

What format is your data currently in? Do you have it as a string?

Juho M Over a year ago

I'm sorry, that was confusing wording on my part. I scraped the data from a website and the output datatype is "<class 'str'>". I edited the first post so you can see my code.

dmlicht Over a year ago

@JuhoM I edited my answer to support converting strings directly
