24

I'm coming from Java and learning Python. So far, what I've found very cool, yet very hard to adapt to, is that there's no need to declare types. I understand that each variable is a pointer to an object, but so far I can't work out how to design my code around that.

For example, I'm writing a function that accepts a 2D NumPy array. In the body of the function I call different methods of this array (which is an instance of NumPy's array class). But suppose I want to use this function in the future: by that time I might have totally forgotten what type I should pass to it. What do people normally do? Do they just write documentation for this? If that is the case, then it involves more typing and calls into question the point of not declaring the type.

Also, suppose I want to pass an object similar to an array in the future. Normally in Java one would define an interface and let both classes implement its methods. Then in the function parameters I would declare the variable to be of the interface type. How can this be solved in Python, or what approaches can be used to achieve the same idea?

12
  • 1
    The purpose of duck typing is not to write less code in the first place Commented Mar 2, 2014 at 13:55
  • 1
    If your code actually, for some reason, really depends on a specific type passed in, you can use assert isinstance(foo, Foo) as the first line of the function/method (this also serves as documentation when reading the code), but that often really just limits what can be done with that function later. Commented Mar 2, 2014 at 17:57
  • 4
    @ErikAllik: don't assert on that, raise TypeError Commented Mar 2, 2014 at 21:38
  • @NeilG: well, yeah, true; depends tho; sometimes you want assertions to make sure you've understood your own code correctly, but of course that's more general than parameter types anyway. Commented Mar 2, 2014 at 23:57
  • @ErikAllik: mail.python.org/pipermail/python-list/2013-November/660401.html Commented Mar 2, 2014 at 23:59

6 Answers

32

This is a very healthy question.

Duck typing

The first thing to understand about Python is the concept of duck typing:

If it walks like a duck, and quacks like a duck, then I call it a duck

Unlike Java, Python's types are never declared explicitly. There is no restriction, either at compile time or at runtime, on the type of object a variable can refer to.

What you do is simply treat objects as if they were of the perfect type for your needs. You don't ask or wonder about its type. If it implements the methods and attributes you want it to have, then that's that. It will do.

def foo(duck):
    duck.walk()
    duck.quack()

The only contract of this function is that duck exposes walk() and quack(). A more refined example:

def foo(sequence):
    for item in sequence:
        print(item)

What is sequence? A list? A numpy array? A dict? A generator? It doesn't matter. If it's iterable (that is, it can be used in a for ... in), it serves its purpose.
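As a rough sketch of that point (the names `collect` and `squares` are hypothetical, not from the answer), the same loop accepts quite different objects as long as they are iterable:

```python
def collect(sequence):
    """Return the items of any iterable as a list."""
    return [item for item in sequence]


def squares(n):
    """A generator: iterable, but not a list."""
    for i in range(n):
        yield i * i


print(collect([1, 2, 3]))          # a list
print(collect({"a": 1, "b": 2}))   # a dict iterates over its keys
print(collect(squares(4)))         # a generator
print(collect("ab"))               # a string iterates over its characters
```

The last call is worth noticing: a string is iterable too, so passing one where a list of strings was intended will not fail, it will simply iterate over the characters.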

Type hinting

Of course, no one can live in constant fear of objects being of the wrong type. This is addressed with coding style, conventions and good documentation. For example:

  • A variable named count should hold an integer
  • A variable Foo starting with an upper-case letter should hold a type (class)
  • An argument bar whose default value is False, should hold a bool too when overridden

Note that the duck typing concept can be applied to these 3 examples:

  • count can be any object that implements +, -, and <
  • Foo can be any callable that returns an object instance
  • bar can be any object that implements __bool__ (__nonzero__ in Python 2)

In other words, the type is never defined explicitly, but always strongly hinted at. Or rather, the capabilities of the object are always hinted at, and its exact type is not relevant.

It's very common to use objects of unknown types. Most frameworks expose types that look like lists and dictionaries but aren't.

Finally, if you really need to know, there's the documentation. You'll find Python documentation vastly superior to Java's. It's always worth the read.


10 Comments

This is a very healthy answer.
what does healthy question mean in the first place? :p
@AlexTwain In my opinion: "very interesting and good to ask. It's not a silly question about syntax error but a real question about the philosophy of Python that every Python programmer should know"
"A variable named count should hold an integer" -- or a European aristocrat. I joke of course, but actually in practice one does need to be careful in "self-documenting" code about real ambiguities or imprecise names.
Sometimes duck typing doesn't work out. If your function expects a sequence of strings and you pass it a string, your function will just end up iterating over the individual characters.
7

I've reviewed a lot of Python code written by Java and .Net developers, and I've repeatedly seen a few issues I might warn/inform you about:

Python is not Java

Don't wrap everything in a class:

It seems like even the simplest function winds up being wrapped in a class when Java developers start writing Python. Python is not Java. Don't write getters and setters; that's what the property decorator is for.
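A minimal sketch of the property decorator, using a hypothetical `Temperature` class that is not part of the original answer. Callers use plain attribute syntax, and validation or computed values can be added later without changing the calling code:

```python
class Temperature:
    """Hypothetical example class, for illustration only."""

    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):
        # Computed on access; callers read it like a plain attribute.
        return self._celsius * 9 / 5 + 32


t = Temperature(100)
t.celsius = 25        # plain attribute syntax, validated by the setter
print(t.fahrenheit)
```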

I have two predicates before I consider writing classes:

  1. I am marrying state with functionality
  2. I expect to have multiple instances (otherwise a module level dict and functions is fine!)

Don't type-check everything

Python uses duck-typing. Refer to the data model. Its builtin type coercion is your friend.

Don't put everything in a try-except block

Only catch exceptions you know you'll get. Using exceptions everywhere for control flow is computationally expensive and can hide bugs. Try to catch the most specific exception you expect. This leads to more robust code over the long run.
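For instance, here is a small sketch (the helper `parse_port` is hypothetical) that handles only the one failure it expects and lets everything else surface:

```python
def parse_port(text, default=8080):
    """Hypothetical helper: convert text to a port number."""
    try:
        return int(text)
    except ValueError:
        # Only the failure expected from int() on malformed text is handled.
        return default


print(parse_port("80"))
print(parse_port("not a number"))
```

Calling `parse_port(None)` still raises a TypeError, which is what you want here: it points at a bug in the caller rather than being silently swallowed by a bare `except`.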

Learn the built-in types and methods, in particular:

From the data-model

str

  • join
  • just do dir(str) and learn them all.

list

  • append (add an item on the end of the list)
  • extend (extend the list by adding each item in an iterable)

dict

  • get (provide a default that saves you from having to catch KeyError!)
  • setdefault (set from the default or the value already there!)
  • fromkeys (build a dict with default values from an iterable of keys!)
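A short sketch of the three dict methods listed above, with made-up data:

```python
inventory = {"apples": 3}

# get: return a default instead of raising KeyError for a missing key
pears = inventory.get("pears", 0)

# setdefault: insert the default only if the key is absent, then return the value
groups = {}
groups.setdefault("fruit", []).append("apple")
groups.setdefault("fruit", []).append("pear")

# fromkeys: build a dict from an iterable of keys, all mapped to one default
counts = dict.fromkeys(["a", "b", "c"], 0)

print(pears, groups, counts)
```

One caveat with fromkeys: every key shares the same default object, so a mutable default such as a list would be shared by all keys.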

set

Sets contain unique (no repetition) hashable objects (like strings and numbers). Thinking Venn diagrams? Want to know if a set of strings is contained in another set of strings, or what the overlaps are (or aren't)?

  • union
  • intersection
  • difference
  • symmetric_difference
  • issubset
  • isdisjoint
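To make the Venn-diagram idea concrete, here is a sketch with made-up sets of package names:

```python
installed = {"numpy", "scipy", "pandas"}
required = {"numpy", "pandas"}
optional = {"matplotlib", "scipy"}

all_known = installed.union(optional)                # everything in either set
have_optional = installed.intersection(optional)     # in both sets
missing = optional.difference(installed)             # in optional only
only_one_side = installed.symmetric_difference(optional)  # in exactly one set

print(required.issubset(installed))    # is every required package installed?
print(required.isdisjoint(optional))   # do the two sets share nothing?
```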

And just do dir() on every type you come across to see the methods and attributes in its namespace, and then do help() on the attribute to see what it does!

Learn the built-in functions and standard library:

I've caught developers writing their own max functions and set objects. It's a little embarrassing. Don't let that happen to you!

Important modules to be aware of in the Standard Library are:

  • os
  • sys
  • collections
  • itertools
  • pprint (I use it all the time)
  • logging
  • unittest
  • re (regular expressions are incredibly efficient at parsing strings for a lot of use-cases)

And peruse the docs for a brief tour of the standard library: here's Part I and here's Part II. In general, make skimming all of the docs an early goal.

Read the Style Guides:

You will learn a lot about best practices just by reading your style guides! I recommend:

Additionally, you can learn great style by Googling for the issue you're looking into with the phrase "best practice" and then selecting the relevant Stack Overflow answers with the greatest number of upvotes!

I wish you luck on your journey to learning Python!

Comments

2

For example I'm writing a function that accepts a 2D NumPy array. Then in the body of the function I'm calling different methods of this array (which is an object of array in NumPy). But then in the future suppose I want to use this function, by that time I might have totally forgotten what I should pass to the function as a type. What do people normally do? Do they just write documentation for this?

You write documentation and name the function and variables appropriately.

def func(two_d_array):
    pass  # do stuff
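A slightly fuller sketch of the same idea, where the docstring states what the function expects. The function `column_means` and its behaviour are hypothetical, chosen only to show the documentation style:

```python
def column_means(two_d_array):
    """Return the mean of each column as a list of floats.

    two_d_array: a non-empty sequence of equal-length rows of numbers,
    for example a list of lists. Anything that yields rows when
    iterated over should work.
    """
    # zip(*rows) regroups the values column by column.
    return [sum(column) / len(column) for column in zip(*two_d_array)]


print(column_means([[1, 2], [3, 4]]))
```

Calling `help(column_means)` prints the docstring, so the expected argument is easy to look up later.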

Also suppose I want in the future to pass an object similar to an array, normally in Java one would implement an interface and then let both classes implement the methods.

You could do this. Create a base class and inherit from it, so that multiple types have the same interface. However, quite often, this is overkill and you'd simply use duck typing instead. With duck typing, all that matters is that the object being evaluated defines the right properties and methods required to use it within your code.
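A sketch of both options side by side, using hypothetical classes. `Shape` plays the role of a Java-style interface via the standard `abc` module, while `Rectangle` does not inherit from anything and is accepted purely because it has the right method:

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical interface, similar in spirit to a Java interface."""

    @abstractmethod
    def area(self):
        """Return the area as a number."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Rectangle:
    """Does not inherit from Shape, but still provides area()."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    # Relies only on each object having an area() method (duck typing).
    return sum(shape.area() for shape in shapes)


print(total_area([Square(2), Rectangle(2, 3)]))
```

The base class does buy one thing: instantiating `Shape` directly, or a subclass that forgets to define `area`, raises a TypeError.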

Note that you can check for types in Python, but this is generally considered bad practice because it prevents you from using duck typing and other coding patterns enabled by Python's dynamic type system.

5 Comments

Your first argument is weakened by the fact that your example is invalid syntax, and when fixed remains (1) ugly and (2) "Systems Hungarian notation" with all its disadvantages. It's true that useful type information can be conveyed in names, but the names should still be meaningful beyond that.
Invalid syntax? You mean the do stuff line? I believe that bit is self-explanatory... As for your second point, the example is intended to be very generic. I would hope that the intention was clear here.
No, I refer to 2d_array, which is not an identifier.
Doh! Fixed. Good catch!
thx for pointing out! I forgot the "consenting adults" saying ;P
1

Yes, you should document what type(s) of arguments your methods expect, and it's up to the caller to pass the correct type of object. Within a method, you can write code to check the types of each argument, or you can just assume it's the correct type, and rely on Python to automatically throw an exception if the passed-in object doesn't support the methods that your code needs to call on it.

The disadvantage of dynamic typing is that the computer can't do as much up-front correctness checking, as you've noted; there's a greater burden on the programmer to make sure that all arguments are of the right type. But the advantage is that you have much more flexibility in what types can be passed to your methods:

  • You can write a method that supports several different types of objects for a particular argument, without needing overloads and duplicated code.
  • Sometimes a method doesn't really care about the exact type of an object as long as it supports a particular method or operation — say, indexing with square brackets, which works on strings, arrays, and a variety of other things. In Java you'd have to create an interface, and write wrapper classes to adapt various pre-existing types to that interface. In Python you don't need to do any of that.
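As a small sketch of the second point (the helper `first_and_last` is hypothetical), one function that only uses square-bracket indexing works unchanged on several unrelated types:

```python
def first_and_last(items):
    """Hypothetical helper: needs nothing beyond integer indexing."""
    return items[0], items[-1]


print(first_and_last("hello"))      # a string
print(first_and_last([1, 2, 3]))    # a list
print(first_and_last((4.5, 6.5)))   # a tuple
print(first_and_last(range(5)))     # a range object
```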

Comments

0

You can use assert to check if conditions match:

In [218]: def foo(arg):
     ...:     assert type(arg) is np.ndarray and arg.ndim == 2, \
     ...:         'the argument must be a 2D numpy array'
     ...:     print('good arg')

In [219]: foo(np.arange(4).reshape((2,2)))
good arg

In [220]: foo(np.arange(4))
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-220-c0ee6e33c83d> in <module>()
----> 1 foo(np.arange(4))

<ipython-input-218-63565789690d> in foo(arg)
      1 def foo(arg):
      2     assert type(arg) is np.ndarray and arg.ndim == 2, \
----> 3         'the argument must be a 2D numpy array'
      4     print('good arg')

AssertionError: the argument must be a 2D numpy array

It's always better to document what you've written completely as @ChinmayKanchi mentioned.

6 Comments

While technically correct and occasionally appropriate, type checking should always be used as a last resort in Python. If an object behaves the same as a numpy array as far as the function is concerned, the actual type hierarchy should not matter.
@ChinmayKanchi how do you check if the argument behaves totally the same as numpy arrays?
You don't, unless you have a specific reason to expect that a function may receive invalid arguments under normal operation of the program. You just assume that the caller knows what s/he is doing and let Python throw a TypeError/KeyError/NameError when the caller does something unexpected.
@zhangxaochen Typically, you don't. For some cases, there are abstract base classes. Note that a fair comparison to static type systems needs to be lenient about what counts as "totally the same behavior", because static type systems check some invariants, but far from all (e.g. exceptions thrown, interpretation of arguments).
I'd argue that most functions do however operate on either one type or a very limited set of types, in practice.
0

Here are a few pointers that might help you make your approach more 'Pythonic'.

The PEPs

In general, I recommend at least browsing through the PEPs. It helped me a lot to grok Python.

Pointers

Since you mentioned the word pointers, Python doesn't use pointers to objects in the sense that C uses pointers. I am not sure about the relationship to Java. Python uses names attached to objects. It's a subtle but important difference that can cause you problems if you expect similar-to-C pointer behavior.
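A short sketch of the difference between rebinding a name and mutating the object it refers to (the function names `rebind` and `mutate` are made up for illustration):

```python
a = [1, 2, 3]
b = a              # b is a second name for the same list object
b.append(4)        # mutation is visible through both names
print(a is b, a)

b = [0]            # rebinding: b now names a different object, a is untouched
print(a)


def rebind(items):
    items = []     # only rebinds the local name; the caller's list is unchanged


def mutate(items):
    items.clear()  # changes the object that the caller's name also refers to


rebind(a)
after_rebind = list(a)
mutate(a)
print(after_rebind, a)
```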

Duck Typing

As you said, yes, if you are expecting a certain type of input you put it in the docstring.

As zhangxaochen wrote, you can use assert to check the types of your arguments at runtime, but that's not really the Python way if you are doing it all the time for no particular reason. As others mentioned, if you have to do this, it's better to test and raise a TypeError. Python favors duck typing instead: if you send me something that quacks like a numpy 2D array, then that's fine.
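If a check really is needed, a sketch of the raise-an-exception approach might look like the following. The validator `require_2d` and the stand-in class `FakeArray` are hypothetical. The check looks at the `ndim` attribute, which NumPy arrays expose, rather than at the exact type, so array-like objects still pass. Unlike assert statements, which are skipped when Python runs with the -O flag, these checks always execute:

```python
def require_2d(array):
    """Hypothetical validator: accept any object that reports ndim == 2."""
    ndim = getattr(array, "ndim", None)
    if ndim is None:
        raise TypeError(
            "expected an array-like object with an 'ndim' attribute, "
            "got " + type(array).__name__
        )
    if ndim != 2:
        raise ValueError("expected a 2D array, got ndim=" + str(ndim))
    return array


class FakeArray:
    """Stand-in used here so the sketch runs without NumPy installed."""

    def __init__(self, ndim):
        self.ndim = ndim


grid = require_2d(FakeArray(2))   # passes and returns the same object
```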

4 Comments

Actually, as far as analogies with other languages' concepts go, "pointers to objects" is pretty damn good and I'm not aware of any misunderstandings or problems caused by it. But really, Python objects behave almost identically to Java objects in the sense of "names attached to objects".
In case you haven't seen it, have a look at this question and the miles of comments and you can see that there is actually quite a lot of disagreement there. Coming from a C background, it was an important step for me to differentiate between pointers and names. Comparing to C, names certainly are not pointers. I don't know how other languages use the word 'pointers' though.
It's true that Python's names are more restricted, there is no equivalent to a pointer-to-pointer (let alone deeper nesting). But aside from that, a name behaves like a pointer to an abstract struct PyObject, which is hardly surprising because that's exactly how they are implemented.
zhangxaochen's answer is incorrect. Don't use assertions this way.
