I am developing a small app for managing my favourite recipes. I have two classes, Ingredient and Recipe. A Recipe consists of Ingredients and some additional data (preparation, etc.). The reason I have an Ingredient class is that I want to store some additional info in it (proper technique, etc.). Ingredients are unique, so there cannot be two with the same name.

Currently I am holding all ingredients in a "big" dictionary, using the name of the ingredient as the key. This is useful, as I can ask my model whether an ingredient is already registered and reuse it (including all its other data) for a newly created recipe.
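A minimal sketch of the setup described above, to make the discussion concrete. The field names (`technique`, `preparation`) and the `IngredientRegistry` helper with its `get_or_create` method are hypothetical names chosen for illustration, not the actual code of the app.

```python
from dataclasses import dataclass, field


@dataclass
class Ingredient:
    name: str
    technique: str = ""  # hypothetical extra info stored per ingredient


@dataclass
class Recipe:
    title: str
    ingredients: list = field(default_factory=list)
    preparation: str = ""


class IngredientRegistry:
    """Keeps exactly one Ingredient object per name (hypothetical helper)."""

    def __init__(self):
        self._by_name = {}  # the "big" dictionary: name -> Ingredient

    def __contains__(self, name):
        return name in self._by_name

    def get_or_create(self, name, technique=""):
        # Reuse the registered object if the name is already known.
        if name not in self._by_name:
            self._by_name[name] = Ingredient(name, technique)
        return self._by_name[name]


registry = IngredientRegistry()
garlic = registry.get_or_create("garlic", technique="crush, then mince")
same_garlic = registry.get_or_create("garlic")  # returns the existing object
pasta = Recipe("Aglio e olio", [same_garlic, registry.get_or_create("olive oil")])
```

Because the second call returns the already registered object, the recipe gets the ingredient together with all of its stored data.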

But thinking back to when I started programming (Java/C++), I always read that using strings as identifiers is bad practice. "The Magic String" was a term that I often came across (but I think that describes a different problem). I really like the string approach as it is right now. I don't have problems with encoding either, because all string generation/comparison is done within my program (Python 3 uses UTF-8 everywhere, if I am not mistaken), but I am not sure whether what I am doing is the right way to do it.

Is using strings as object identifiers bad practice? Are there differences between languages? Can strings become a performance issue as the amount of data increases? What are the alternatives?

2 Answers

3

No - in fact, identifiers in Python are always strings. That holds whether you keep them in a dictionary yourself (you say you are using a "big dictionary") or the object is used programmatically, with a name hard-coded into the source code. In the latter case, Python stores the name in one of its automatically managed internal namespaces (which can be inspected as the dictionaries returned by globals() or locals()).
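As a small illustration of that point, the sketch below inspects a few namespaces and checks that their keys are plain strings. The function `describe` and its variables are made up for the example; `math` is used only as a convenient module whose namespace can be inspected with `vars()`.

```python
import math


def describe():
    ingredient = "garlic"
    amount = 2
    # locals() exposes the function's local names as a dict keyed by strings.
    return locals()


local_names = describe()
print(local_names)

# A module's namespace is an ordinary dict as well; globals() returns the
# one belonging to the current module.
module_names = vars(math)
print(isinstance(globals(), dict))
print(module_names["pi"] == math.pi)
print(all(isinstance(key, str) for key in module_names))
```

So looking up an object by a string key in your own dictionary is the same kind of operation Python performs for ordinary names.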

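Moreover, Python does not use "utf-8" internally; it uses "unicode", which means a str is simply text, and you should not worry about how that text is represented in actual bytes. An encoding such as UTF-8 only comes into play when the text is converted to bytes, for example when it is written to a file.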
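A short sketch of the difference between text and its encoded bytes. The ingredient name is just an example value; the non-ASCII letters are written as escapes so the snippet does not depend on the source file's encoding.

```python
name = "cr\u00e8me fra\u00eeche"  # "crème fraîche" as text (str)

# len() on a str counts characters, independent of any encoding.
print(len(name))  # 13

# Bytes only exist once the text is explicitly encoded.
utf8_bytes = name.encode("utf-8")
print(len(utf8_bytes))  # 15, because the two accented letters take 2 bytes each

# Decoding gives back an equal str, so it finds the same dictionary entry.
pantry = {name: "dairy"}
print(pantry[utf8_bytes.decode("utf-8")])  # dairy
```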

2 Comments

I always forget how much Python uses dicts internally, good answer! I get that Python 3 uses unicode (Python 2 had a separate class for that, if I'm not mistaken), but isn't the actual encoding used UTF-8?
No. For performance reasons, the internal representation of text has to have a fixed size per character. Conceptually it is "UCS4", but since Python 3.3 (PEP 393) CPython picks a more compact fixed width of 1, 2, or 4 bytes per character for each string, depending on the widest character it contains.
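The per-string width can be observed indirectly on CPython with `sys.getsizeof`. This is only a sketch: the exact byte counts are an implementation detail that varies between versions, and other Python implementations may behave differently, so only the relative ordering is compared here.

```python
import sys

# Three strings of equal length whose widest character needs
# 1, 2 and 4 bytes respectively.
ascii_text = "a" * 1000
bmp_text = "\u20ac" * 1000         # euro sign, fits in 2 bytes
astral_text = "\U0001F35D" * 1000  # emoji outside the Basic Multilingual Plane

sizes = [sys.getsizeof(s) for s in (ascii_text, bmp_text, astral_text)]
print(sizes)

# Same number of characters, but the memory footprint grows with the
# widest character in each string.
print(len(ascii_text) == len(bmp_text) == len(astral_text))
print(sizes[0] < sizes[1] < sizes[2])
```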
1

Python relies on dictionaries for many of its core features. For that reason, the built-in dict already comes with a fast, well-tuned implementation out of the box (good hashing, etc.).

Considering that, dictionary performance itself should not be a concern for what you need (occasional reads and writes), although the way you store it (in a Python file, JSON, pickle, gzip, etc.) could affect load and access times.
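If you want to check this on your own machine, a rough measurement with `timeit` could look like the sketch below. The dictionary contents and the key format are invented for the example, and the resulting number depends on the hardware and Python version, so no particular figure is claimed.

```python
import timeit

# A made-up dictionary with 100,000 string keys.
pantry = {f"ingredient-{i}": i for i in range(100_000)}

lookups = 100_000
seconds = timeit.timeit(lambda: pantry["ingredient-54321"], number=lookups)

# Average cost of a single lookup, including the lambda call overhead.
print(f"{seconds / lookups * 1e9:.0f} ns per lookup")
```

Lookups by string key are hash-based, so the time per lookup should stay roughly constant as the dictionary grows.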

Maybe if you provide a few lines of code showing how you work with the dictionary, we could give more specific advice.

As for the string identifier, check jsbueno's answer; he gave a much better explanation than I could.

2 Comments

I use the dict in quite a straightforward way: just {"name": actual_obj, ..} in a factory-like object that supervises all of my objects, so that no object with the same name is created twice.
Sounds to me like you won't have to worry. In a project of my own I deal with a dictionary of a few thousand entries, each one containing lists and dictionaries, plus a huge multi-line string for its textual description. The only thing that bothered me was its size on disk, but I managed to solve that with pickle and gzip.
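For reference, combining pickle and gzip could look roughly like the sketch below. The sample data and the file name `recipes.pkl.gz` are invented for the example and are not the project mentioned in the comment. Note that pickle files should only be loaded from sources you trust.

```python
import gzip
import os
import pickle
import tempfile

# Made-up data: a few thousand entries with lists and long descriptions.
data = {
    f"recipe-{i}": {
        "steps": ["chop", "fry", "serve"],
        "description": f"Notes for recipe {i}. " * 40,
    }
    for i in range(2000)
}

path = os.path.join(tempfile.mkdtemp(), "recipes.pkl.gz")

# gzip.open returns a file-like object, so pickle can write straight into it.
with gzip.open(path, "wb") as fh:
    pickle.dump(data, fh, protocol=pickle.HIGHEST_PROTOCOL)

with gzip.open(path, "rb") as fh:
    restored = pickle.load(fh)

raw_size = len(pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL))
print(raw_size, os.path.getsize(path))  # uncompressed vs. compressed size in bytes
```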
