Here is my code:

import requests, re, pandas, csv
from bs4 import BeautifulSoup

# Download the results page
r = requests.get("http://www.hltv.org/?pageid=188&statsfilter=2816&offset=0")
c = r.content

table = BeautifulSoup(c, "html.parser")

# Each result row is a <div> whose inline style matches this pattern
for row in table.find_all('div', style=re.compile(r'width:606px;height:22px;background-color')):
    # Join the text of the row's child elements with commas
    data = row.get_text(strip=True, separator=',')
    print(data)

Here is the scraped output:

5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),nuke,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),overpass,IEM Katowice 2017
5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),nuke,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),train,IEM Katowice 2017
4/3 17,Immortals (10),FaZe (16),mirage,IEM Katowice 2017
2/3 17,Virtus.pro (14),Heroic (16),nuke,IEM Katowice 2017
2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017
2/3 17,SK (16),North (8),cbble,IEM Katowice 2017
2/3 17,Cloud9 (12),North (16),cbble,IEM Katowice 2017
2/3 17,Natus Vincere (12),Heroic (16),overpass,IEM Katowice 2017
2/3 17,Virtus.pro (16),SK (14),inferno,IEM Katowice 2017

What is a good way to build a pandas.DataFrame from this output?

  • What would you like the data to look like when loaded? Commented Mar 14, 2017 at 19:46
  • The first column should be "Date" (like 5/3/17 in the output), the second "Team1" (like Astralis), the third "Team2" (like FaZe), the fourth "Map" (like inferno), and the fifth "Event" (like IEM Katowice). Commented Mar 14, 2017 at 19:53

1 Answer

You can use pandas.read_csv. If you don't want to write the string to an actual file, wrap it in a StringIO object, which read_csv accepts in place of a file.

import pandas as pd
from io import StringIO

csv_string = '''
5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),nuke,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),overpass,IEM Katowice 2017
5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),nuke,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),train,IEM Katowice 2017
4/3 17,Immortals (10),FaZe (16),mirage,IEM Katowice 2017
2/3 17,Virtus.pro (14),Heroic (16),nuke,IEM Katowice 2017
2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017
2/3 17,SK (16),North (8),cbble,IEM Katowice 2017
2/3 17,Cloud9 (12),North (16),cbble,IEM Katowice 2017
2/3 17,Natus Vincere (12),Heroic (16),overpass,IEM Katowice 2017
2/3 17,Virtus.pro (16),SK (14),inferno,IEM Katowice 2017
'''

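csv_string_io = StringIO(csv_string)
# The data has no header row, so name the columns explicitly
frame = pd.read_csv(csv_string_io, header=None,
                    names=["Date", "Team1", "Team2", "Map", "Event"])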
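
If the rows come straight from the scraping loop, the CSV step can also be skipped. The following is only a sketch under a few assumptions: the lines are collected into a list (called lines here, a name not in the question) instead of being printed, no field contains a comma, and "5/3 17" means day/month followed by a two-digit year. The Team1Score and Team2Score columns are an addition for illustration and were not asked for.

```python
import pandas as pd

# Sample of the scraped lines, copied from the output in the question.
# In the scraping loop these would be collected with lines.append(data)
# instead of print(data).
lines = [
    "5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017",
    "5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017",
    "2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017",
]

columns = ["Date", "Team1", "Team2", "Map", "Event"]

# Assumes no field contains a comma, so each line splits into exactly 5 parts.
frame = pd.DataFrame([line.split(",") for line in lines], columns=columns)

# Split values such as "Astralis (16)" into a team name and an integer score.
for col in ["Team1", "Team2"]:
    parts = frame[col].str.extract(r"^(?P<team>.+?)\s*\((?P<score>\d+)\)$")
    frame[col] = parts["team"]
    frame[col + "Score"] = parts["score"].astype(int)

# Assumption: "5/3 17" is day/month followed by a two-digit year.
frame["Date"] = pd.to_datetime(frame["Date"], format="%d/%m %y")

print(frame)
```

Splitting on commas is fragile if a team or event name ever contains one; in that case it would be safer to read each cell from the HTML separately rather than joining the row text with a separator.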
6 Comments

Yes, but then I would have to create a CSV file first. Is there some way to make a DataFrame without creating a CSV file first? The scraped output is only in CSV format.
@JuhoM Well, you said your data is in CSV format. What is the exact format of the data coming from the scraper?
What format is your data currently in? Do you have it as a string?
I'm sorry, that was confusingly worded. I scraped the data from a website and the output datatype is "<class 'str'>". I edited the first post so you can see my code.
@JuhoM I edited my answer to support converting strings directly.