1

Im trying to get the text from a webpage with Python 3.3 and then search through that text for certain strings. When I find a matching string I need to save the following text. For example I take this page: http://gatherer.wizards.com/Pages/Card/Details.aspx?name=Dark%20Prophecy and I need to save the text after each category (card text, rarity, etc) in the card info. Currently Im using beautiful Soup but get_text causes a UnicodeEncodeError and doesnt return an iterable object. Here is the relevant code:

               urlStr = urllib.request.urlopen('http://gatherer.wizards.com/Pages/Card/Details.aspx?name=' + cardName).read()

                htmlRaw = BeautifulSoup(urlStr)

                htmlText = htmlRaw.get_text

                for line in htmlText:
                    line = line.strip()
                    if "Converted Mana Cost:" in line:
                        cmc = line.next()
                        message += "*Converted Mana Cost: " + cmc +"* \n\n"
                    elif "Types:" in line:
                        type = line.next()
                        message += "*Type: " + type +"* \n\n"
                    elif "Card Text:" in line:
                        rulesText = line.next()
                        message += "*Rules Text: " + rulesText +"* \n\n"
                    elif "Flavor Text:" in line:
                        flavor = line.next()
                        message += "*Flavor Text: " + flavor +"* \n\n"
                    elif "Rarity:" in line:
                        rarity == line.next()
                        message += "*Rarity: " + rarity +"* \n\n"

1 Answer 1

1

consider using lxml and xpath instead, you will then be able to do things like:

>>> from lxml import html
>>> root = html.parse("http://gatherer.wizards.com/Pages/Card/Details.aspx?name=Dark%20Prophecy")
>>> root.xpath('//div[contains(text(), "Flavor Text")]/following-sibling::div/div/i/text()')
['When the bog ran short on small animals, Ekri turned to the surrounding farmlands.']
Sign up to request clarification or add additional context in comments.

1 Comment

How do I install this on windows? The instructions on the website seem to say just to download it but that doesnt work

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.