How can I load this data with the json module? It looks like I need to study encodings again; an explanation would be much appreciated.

_json = """{"sku":"02366  20","productRef":"02366  20@1401-B1","image":"http://media.wuerth.com/stmedia/shop/348px/1337528.jpg","shortInfo":"Stahl verzinkt<br>SHR-BLA-(A2K)-M6X20","pdfCatalogPage":["http://media.wuerth.com/stmedia/shop/masterpages0000/LANG_de/03721.pdf"],"catalogSheet":"http://eshop.wuerth.de/stmedia/Blaetterkatalog/Gesamtkatalog/index.php?mode=or&searchquery=03721&hook_url=http://eshop.wuerth.de/-/","documentInfoMap":{},"cadValue":null,"showCadValue":null,"msdsInformations":[],"technicalInformation":"<table class=\"tech_table\"><tbody><tr><td class=\"tech_col_left\"><p>Nenndurchmesser</p></td><td class=\"tech_col_left\"><p>6 mm</p></td></tr><tr><td class=\"tech_col_left\"><p>Werkstoff</p></td><td class=\"tech_col_left\"><p>Stahl</p></td></tr><tr><td class=\"tech_col_left\"><p>Oberfläche</p></td><td class=\"tech_col_left\"><p>Verzinkt</p></td></tr><tr><td class=\"tech_col_left\"><p>Lochdurchmesser</p></td><td class=\"tech_col_left\"><p>8 mm</p></td></tr><tr><td class=\"tech_col_left\"><p>Länge</p></td><td class=\"tech_col_left\"><p>20 mm</p></td></tr><tr><td class=\"tech_col_left\"><p>Blattdicke</p></td><td class=\"tech_col_left\"><p>3 mm</p></td></tr></tbody></table>"}"""

print json.loads(_json, encoding='utf-8')

SyntaxError: Non-ASCII character '\xc3' in file
    google/searchbar + your error line = result Commented Oct 30, 2014 at 17:28

1 Answer

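You'll need to declare the source encoding at the top of the Python file, and you'll need to use a raw string literal. In a regular string literal, Python interprets each \" escape as a plain " character, so the backslashes never reach the JSON parser and the quotes inside the string values are left unescaped: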

# encoding: utf-8  
import json

_json = r"""
{"sku":"02366  20","productRef":"02366  20@1401-B1","image":"http://media.wuerth.com/stmedia/shop/348px/1337528.jpg","shortInfo":"Stahl verzinkt<br>SHR-BLA-(A2K)-M6X20","pdfCatalogPage":["http://media.wuerth.com/stmedia/shop/masterpages0000/LANG_de/03721.pdf"],"catalogSheet":"http://eshop.wuerth.de/stmedia/Blaetterkatalog/Gesamtkatalog/index.php?mode=or&searchquery=03721&hook_url=http://eshop.wuerth.de/-/","documentInfoMap":{},"cadValue":null,"showCadValue":null,"msdsInformations":[],"technicalInformation":"<table class=\"tech_table\"><tbody><tr><td class=\"tech_col_left\"><p>Nenndurchmesser</p></td><td class=\"tech_col_left\"><p>6 mm</p></td></tr><tr><td class=\"tech_col_left\"><p>Werkstoff</p></td><td class=\"tech_col_left\"><p>Stahl</p></td></tr><tr><td class=\"tech_col_left\"><p>Oberfläche</p></td><td class=\"tech_col_left\"><p>Verzinkt</p></td></tr><tr><td class=\"tech_col_left\"><p>Lochdurchmesser</p></td><td class=\"tech_col_left\"><p>8 mm</p></td></tr><tr><td class=\"tech_col_left\"><p>Länge</p></td><td class=\"tech_col_left\"><p>20 mm</p></td></tr><tr><td class=\"tech_col_left\"><p>Blattdicke</p></td><td class=\"tech_col_left\"><p>3 mm</p></td></tr></tbody></table>"}

"""

print json.loads(_json, encoding='utf-8')

The above assumes that you actually did use UTF-8 as the encoding for the source file; the script then produces:

$ bin/python test.py 
{u'sku': u'02366  20', u'documentInfoMap': {}, u'catalogSheet': u'http://eshop.wuerth.de/stmedia/Blaetterkatalog/Gesamtkatalog/index.php?mode=or&searchquery=03721&hook_url=http://eshop.wuerth.de/-/', u'cadValue': None, u'msdsInformations': [], u'showCadValue': None, u'shortInfo': u'Stahl verzinkt<br>SHR-BLA-(A2K)-M6X20', u'technicalInformation': u'<table class="tech_table"><tbody><tr><td class="tech_col_left"><p>Nenndurchmesser</p></td><td class="tech_col_left"><p>6 mm</p></td></tr><tr><td class="tech_col_left"><p>Werkstoff</p></td><td class="tech_col_left"><p>Stahl</p></td></tr><tr><td class="tech_col_left"><p>Oberfl\xe4che</p></td><td class="tech_col_left"><p>Verzinkt</p></td></tr><tr><td class="tech_col_left"><p>Lochdurchmesser</p></td><td class="tech_col_left"><p>8 mm</p></td></tr><tr><td class="tech_col_left"><p>L\xe4nge</p></td><td class="tech_col_left"><p>20 mm</p></td></tr><tr><td class="tech_col_left"><p>Blattdicke</p></td><td class="tech_col_left"><p>3 mm</p></td></tr></tbody></table>', u'productRef': u'02366  20@1401-B1', u'pdfCatalogPage': [u'http://media.wuerth.com/stmedia/shop/masterpages0000/LANG_de/03721.pdf'], u'image': u'http://media.wuerth.com/stmedia/shop/348px/1337528.jpg'}
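
To see why the raw string literal matters, here is a minimal sketch using a shortened, made-up sample with the same \" escapes as the original data (the names plain, raw and plain_ok are only for illustration). It is written so that it should run on both Python 2 and Python 3, which is why print uses parentheses and no encoding argument is passed:

```python
# -*- coding: utf-8 -*-
import json

# Shortened, hypothetical sample; the real data has many more keys.
# Regular literal: Python turns every \" into a bare " before json sees it.
plain = """{"technicalInformation":"<p class=\"tech_col_left\">Länge</p>"}"""
# Raw literal: the backslashes are kept, so the text stays valid JSON.
raw = r"""{"technicalInformation":"<p class=\"tech_col_left\">Länge</p>"}"""

print('\\' in plain)  # False: the backslashes are gone
print('\\' in raw)    # True: the backslashes survived

# The regular literal is no longer valid JSON, so parsing fails.
try:
    json.loads(plain)
    plain_ok = True
except ValueError:  # json.JSONDecodeError subclasses ValueError on Python 3
    plain_ok = False
print(plain_ok)

# The raw literal parses, and the JSON escapes are resolved by the parser.
data = json.loads(raw)
print(data["technicalInformation"])
```

If the JSON text comes from a file or an HTTP response rather than from a literal in the source code, Python never processes the backslashes, so this problem does not arise.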
1 Comment

+1. It will work even on Python 3, where the encoding parameter is ignored (string literals are Unicode by default there), provided you use parentheses for print.
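
A rough Python 3 sketch of the same idea, assuming Python 3.6 or later. The encoding argument is left out entirely, since the keyword was removed from json.loads in Python 3.9 and passing it there raises a TypeError. The key name surface and the sample value are made up for illustration:

```python
import json

# Hypothetical shortened sample; raw literal keeps the \" escapes intact.
text = r"""{"surface":"<p class=\"tech_col_left\">Oberfläche</p>"}"""

# str input: already Unicode on Python 3, so no encoding argument is needed.
from_text = json.loads(text)

# bytes input: accepted since Python 3.6; UTF-8 is detected automatically.
from_bytes = json.loads(text.encode("utf-8"))

print(from_text == from_bytes)
print(from_text["surface"])
```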
