I have this code:

keys_file = open("keys.json")
keys = keys_file.read().encode('utf-8')
keys_json = json.loads(keys)
print(keys_json)

There are some non-English characters in keys.json, but as a result I get:

[{'category': 'мбт', 'keys': ['Блендер Philips',
'мультиварка Polaris']}, {'category': 'КБТ', 'keys':
['холод ильник атлант', 'посудомоечная
машина Bosch']}]

What do I do?

  • What do you want to do? Remove non-ASCII characters? Commented Sep 25, 2017 at 14:47
  • I want to display them in normal language, not as a list of random, unintelligible symbols. Commented Sep 25, 2017 at 14:49
  • @user2950593 You've specified the utf-8 encoding. Is that the encoding of the file? Commented Sep 25, 2017 at 14:53
  • what is the encoding of "keys.json", utf-8? Commented Sep 25, 2017 at 14:54
  • @Stefan Okay, a little bit of back-pedalling here. By default it is decoded, yes. What I should have said is that it depends on what it is decoded as (whatever locale.getpreferredencoding() returns). Additionally, I wasn't really advocating that OP call decode; I wanted to point out that they're doing the logically wrong operation to begin with. Commented Sep 25, 2017 at 15:16

1 Answer

encode means characters to binary. What you want when reading a file is binary to characters: decode. But really this entire process is way too manual; simply do this:

import json

with open('keys.json', encoding='utf-8') as fh:
    data = json.load(fh)

print(data)

with handles the correct opening and closing of the file, the encoding argument to open ensures the file is read using the correct encoding, and the load call parses straight from the file handle, so you don't have to read the contents into a string yourself first.

If this still outputs invalid characters, it means your source encoding isn't UTF-8 or your console/terminal doesn't handle UTF-8.
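To make the encode/decode distinction concrete, here is a minimal sketch of how garbled text like the output in the question can arise. The helper name roundtrip and the choice of latin-1 as the "wrong" encoding are assumptions for illustration only; the encoding actually involved on the asker's machine may well be a different one.

```python
import locale


def roundtrip(text, wrong_encoding="latin-1"):
    """Hypothetical helper: encode as UTF-8, then decode with the wrong codec."""
    raw = text.encode("utf-8")            # characters -> bytes
    garbled = raw.decode(wrong_encoding)  # bytes -> characters, wrong codec
    # Reversing the wrong step and decoding as UTF-8 restores the original.
    recovered = garbled.encode(wrong_encoding).decode("utf-8")
    return raw, garbled, recovered


sample = "мбт"
raw, garbled, recovered = roundtrip(sample)

# The default that open() falls back to when no encoding argument is given.
print(locale.getpreferredencoding(False))
print(raw)                  # the UTF-8 bytes
print(ascii(garbled))       # escaped so that any console can display it
print(recovered == sample)
```

If the preferred encoding printed on the first line is not UTF-8, opening a UTF-8 file without the encoding argument is a likely source of this kind of output.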


4 Comments

what does fh stand for?
"File handle". You can choose any other name you want. It's more descriptive than "keys_file" though. Also note that "data" is much more appropriate a name than "keys_json"; when you have loaded the JSON data it's not JSON anymore, it's a Python list/dict.
How would you achieve this in py2?
@Akbar No. load reads from a file pointer, efficiently. loads reads from a string, which is less efficient if you read the file into memory first and then parse it as a string from there.
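
As a small illustration of the load/loads distinction discussed in these comments, the sketch below writes a temporary file and parses it both ways. The sample records and the temporary path are made up for the example; only json.load, json.loads and json.dump come from the standard library.

```python
import json
import os
import tempfile

# Made-up sample data resembling the structure in the question.
records = [{"category": "мбт", "keys": ["Блендер Philips"]}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "keys.json")

    # Write the file as UTF-8, keeping the Cyrillic characters unescaped.
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False)

    # load: parse from the open file handle.
    with open(path, encoding="utf-8") as fh:
        from_load = json.load(fh)

    # loads: read the text into a string yourself, then parse the string.
    with open(path, encoding="utf-8") as fh:
        from_loads = json.loads(fh.read())

print(from_load == from_loads == records)
```

Both calls produce the same Python list; the difference is only in who reads the file contents.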
