How to Write "Pythonic" Code
############################

:Author: Christopher Arndt
:Date: 2008-04-13
:Location: RuPy Conference Poznań, Poland

.. class:: handout

:Copyright: CC Attribution/Share-Alike

.. container:: handout

    Knowing the syntax and the standard library of Python alone doesn't make
    one a good Python programmer. In more than a decade of increasing usage
    and popularity of the language, Pythonistas have developed many typical
    Python idioms and pythonic ways "to do it". These are based either on the
    fact that in Python, being a highly dynamic language, many things
    necessary in more static languages just make no sense, or on the
    fundamental principles laid down in the so-called "Zen of Python".

    This talk tries to explain these principles and how they translate into
    actual code.

.. contents::
    :class: handout

.. |bullet| unicode:: U+02022

.. footer:: RuPy, Poznań |bullet| 2008-04-13 http://chrisarndt.de/talks/rupy


Acknowledgements
----------------

This talk is based in large part on the slides of David Goodger's talk
`Code Like a Pythonista. Idiomatic Python`_ and Jeff Hinrich's adaptation_ of
the same title. From these I picked the topics I could relate to most and
added a few of my own favorite Python idioms.

The presentation is released under the Creative Commons
`Attribution/Share-Alike License`_.

.. _Code Like a Pythonista. Idiomatic Python:
    http://python.net/~goodger/projects/pycon/2007/idiomatic/
.. _adaptation: http://www.omahapython.org/IdiomaticPython.html
.. _Attribution/Share-Alike License:
    http://creativecommons.org/licenses/by-sa/3.0/


The Zen of Python
-----------------

.. class:: section-title

The Zen of Python


The Zen of Python (1)
---------------------

.. class:: handout

Try this at your Python prompt::

    >>> import this

The Zen of Python, by Tim Peters

.. container:: incremental

    | Beautiful is better than ugly.
    | Explicit is better than implicit.
    | Simple is better than complex.
    | Complex is better than complicated.
    | Flat is better than nested.
    | Sparse is better than dense.
    | Readability counts.
    | Special cases aren't special enough to break the rules.
    | Although practicality beats purity.
    | Errors should never pass silently.
    | Unless explicitly silenced.


The Zen of Python (2)
---------------------

The Zen of Python, cont.

.. container:: incremental

    | In the face of ambiguity, refuse the temptation to guess.
    | There should be one -- and preferably only one -- obvious way to do it.
    | Although that way may not be obvious at first unless you're Dutch.
    | Now is better than never.
    | Although never is often better than *right* now.
    | If the implementation is hard to explain, it's a bad idea.
    | If the implementation is easy to explain, it may be a good idea.
    | Namespaces are one honking great idea -- let's do more of those!


Coding Style
------------

.. class:: section-title

Beautiful is better than ugly


Coding Style
------------

    Programs must be written for people to read, and only incidentally for
    machines to execute.

    -- Abelson & Sussman, Structure and Interpretation of Computer Programs


Read PEP 8!
-----------

Every Python programmer should know PEP 8:

http://www.python.org/dev/peps/pep-0008/

PEP = Python Enhancement Proposal

The Python community has its own standards for what source code should look
like, codified in PEP 8. These standards *are different* from those of other
communities, like C, C++, C#, Java, VisualBasic, etc.

Because indentation and whitespace are so important in Python, the Style Guide
for Python Code is as good as a standard.

.. class:: handout

Most open-source projects and (hopefully) in-house projects follow the style
guide quite closely, and there are even tools to check whether code adheres to
the standard.


Whitespace (1)
--------------

* 4 spaces per indentation level.
* No hard tabs.
* Never mix tabs and spaces.
* One blank line between functions.
* Two blank lines between classes.
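
A minimal sketch of how these rules look when applied; the class names ``Greeter`` and ``LoudGreeter`` are made up for illustration and are not part of the talk:

```python
class Greeter(object):
    """Hypothetical example: four spaces per indentation level, no tabs."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        # One blank line separates this method from the one above.
        return 'Hello, %s!' % self.name


class LoudGreeter(Greeter):
    """Two blank lines separate this class from the previous one."""

    def greet(self):
        return Greeter.greet(self).upper()
```

The only point here is the layout: each nested block moves in by exactly four spaces, and the blank lines make the boundaries between methods and classes visible at a glance.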

Whitespace (2)
--------------

* Add a space after ``","`` in dicts, lists, tuples, & argument lists, and
  after ``":"`` in dicts, but *not* before.
* Put spaces around assignments & comparisons (except in argument lists).
* *No* spaces just inside parentheses or just before argument lists.
* *No* spaces just inside docstrings.

.. sourcecode:: python

    def make_squares(key, value=0):
        """Return a dictionary and a list..."""
        d = {key: value}
        l = [key, value]
        return d, l


Naming Conventions
------------------

* ``joined_lower`` for functions, methods, attributes
* ``joined_lower`` or ``ALL_CAPS`` for constants
* ``StudlyCaps`` for classes
* ``camelCase`` *only* to conform to pre-existing conventions
* Attributes: ``interface``, ``_internal``, (``__private``)

.. class:: handout

I never use the ``__private`` form, and you probably won't either.


Long Lines & Continuations
--------------------------

Keep lines below 80 characters in length.

Use implied line continuation inside parentheses/brackets/braces:

.. sourcecode:: python

    def __init__(self, first, second, third,
                 fourth, fifth, sixth):
        output = (first + second + third
                  + fourth + fifth + sixth)

Use backslashes as a last resort:

.. sourcecode:: python

    VeryLong.left_hand_side \
        = even_longer.right_hand_side()

Backslashes are fragile; they must end the line they're on. If you add a space
after the backslash, it won't work any more. Also, they're ugly.


Long Strings (1)
----------------

Adjacent literal strings are concatenated by the parser:

.. sourcecode:: python

    >>> print 'o' 'n' "e"
    one

A string prefixed with ``r`` is a "raw" string. Backslashes are not evaluated
as escapes in raw strings. They're useful for regular expressions and Windows
filesystem paths.

Note that named string objects are not concatenated:

.. sourcecode:: python

    >>> a = 'three'
    >>> b = 'four'
    >>> a b
      File "<stdin>", line 1
        a b
          ^
    SyntaxError: invalid syntax


Long Strings (2)
----------------

That's because this automatic concatenation is a feature of the Python
parser/compiler, not the interpreter. You must use the "+" operator to
concatenate strings at run time.

.. sourcecode:: python

    text = ('Long strings can be made up '
            'of several shorter strings.')

The parentheses allow implicit line continuation.

Multiline strings use triple quotes:

.. sourcecode:: python

    """\
    Triple
    double
    quotes"""


Compound Statements
-------------------

.. container:: incremental

    .. class:: goodex

    Good:

    .. sourcecode:: python

        if foo == 'blah':
            do_something()
        do_one()
        do_two()
        do_three()

.. container:: incremental

    .. class:: badex

    Bad:

    .. sourcecode:: python

        if foo == 'blah': do_something()
        do_one(); do_two(); do_three()


Docstrings & Comments
---------------------

Docstrings = How to use code

Comments = Why (rationale) & how code works

.. container:: incremental

    Docstrings explain how to use code, and are for the users of your code.

.. container:: incremental

    Comments explain why, and are for the maintainers of your code. That
    includes yourself!


Simple is Better Than Complex
-----------------------------

.. class:: section-title

Simple is Better Than Complex


Simple is Better Than Complex
-----------------------------

    Debugging is twice as hard as writing the code in the first place.
    Therefore, if you write the code as cleverly as possible, you are, by
    definition, not smart enough to debug it.

    -- Brian W. Kernighan

In other words, KISS!


General Python Idioms
---------------------

.. class:: section-title

General Python Idioms


Swap Values
-----------

In other languages:

.. sourcecode:: python

    temp = a
    a = b
    b = temp

In Python:

.. sourcecode:: python

    b, a = a, b


Tuples
------

As the swap example shows, the comma is the tuple constructor, not the
parentheses. Example:

.. sourcecode:: python

    >>> 1,
    (1,)

The Python interpreter shows the parentheses for clarity, and I recommend you
use parentheses too:

.. sourcecode:: python

    >>> (1,)
    (1,)

Don't forget the comma!

.. sourcecode:: python

    >>> (1)
    1


Interactive ``"_"``
-------------------

This is a really useful feature that surprisingly few people know about.

In the interactive interpreter, whenever you evaluate an expression or call a
function, the result is bound to a temporary name, ``_`` (an underscore):

.. sourcecode:: python

    >>> 1 + 1
    2
    >>> _
    2

``_`` stores the last printed expression.

When a result is ``None``, nothing is printed, so ``_`` doesn't change. That's
convenient!

.. note::

    This only works in the interactive interpreter, not within a module.


Building Strings from Substrings
--------------------------------

Start with a list of strings and then ``join()`` it:

.. sourcecode:: python

    colors = ['red', 'blue', 'green', 'yellow']
    print ", ".join(colors)

We want to join all the strings together into one large string. ``join()``
matters most when the number of substrings is large, because repeated ``+=``
creates a new string object on every iteration.

.. container:: incremental

    .. class:: badex

    Don't do this:

    .. sourcecode:: python

        result = ''
        for s in colors:
            result += s


Testing for None
----------------

.. container:: incremental

    .. class:: goodex

    Good:

    .. sourcecode:: python

        if foo is None:
            do_something()

.. container:: incremental

    .. class:: okex

    Maybe:

    .. sourcecode:: python

        if not foo:
            do_something()

.. container:: incremental

    .. class:: badex

    Bad:

    .. sourcecode:: python

        if foo == None:
            do_something()


Iterating a list
----------------

.. container:: incremental

    .. class:: goodex

    Good:

    .. sourcecode:: python

        for i, item in enumerate(mylist):
            if i >= 1:
                print mylist[i-1] + item

.. container:: incremental

    .. class:: badex

    Bad:

    .. sourcecode:: python

        for i in range(len(mylist)):
            if i >= 1:
                print mylist[i-1] + item

.. container:: incremental

    .. class:: badex

    Also bad:

    .. sourcecode:: python

        i = 0
        for item in mylist:
            if i >= 1:
                print mylist[i-1] + item
            i += 1


Use ``in`` where possible
-------------------------

.. class:: goodex

Good:

.. sourcecode:: python

    for key in d:
        print key

* ``in`` is generally faster.
* This pattern also works for items in arbitrary containers (such as lists,
  tuples, and sets).
* ``in`` is also an operator (as we'll see).

.. container:: incremental

    .. class:: badex

    Bad:

    .. sourcecode:: python

        for key in d.keys():
            print key

    This is limited to objects with a ``keys()`` method.


Dictionary ``setdefault`` Method
--------------------------------

Dicts have a ``setdefault`` method that is very useful for initialising dicts:

.. sourcecode:: python

    navs = {}
    for (portfolio, equity, position) in data:
        navs.setdefault(portfolio, 0)
        navs[portfolio] += position * prices[equity]

``setdefault`` returns the value stored under the key (the default, if the key
was missing), but we ignore the return value here. We're taking advantage of
its side effect: it sets the dictionary value only if there is no value
already.

.. tip::

    Python 2.5 and later have the ``defaultdict`` class in the ``collections``
    module. Look it up in the standard library reference!


Other languages have "variables" (1)
------------------------------------

In many other languages, assigning to a variable puts a value into a box.

.. sourcecode:: c

    int a = 1;

.. image:: images/a1box.png

Box ``"a"`` now contains an integer 1.


Other languages have "variables" (2)
------------------------------------

Assigning another value to the same variable replaces the contents of the box:

.. sourcecode:: c

    a = 2;

.. image:: images/a2box.png

Now box ``"a"`` contains an integer 2.


Other languages have "variables" (3)
------------------------------------

Assigning one variable to another makes a copy of the value and puts it in the
new box:

.. sourcecode:: c

    int b = a;

.. image:: images/b2box.png
.. image:: images/a2box.png

``"b"`` is a second box, with a copy of integer 2. Box ``"a"`` has a separate
copy.

Python has "names" (1)
----------------------

In Python, a "name" or "identifier" is like a parcel tag (or nametag) attached
to an object.

.. sourcecode:: python

    a = 1

.. image:: images/a1tag.png

Here, an integer ``1`` object has a tag labelled ``"a"``.

If we reassign to ``"a"``, we just move the tag to another object:

.. sourcecode:: python

    a = 2

.. image:: images/a2tag.png
.. image:: images/1.png


Python has "names" (2)
----------------------

If we assign one name to another, we're just attaching another nametag to an
existing object:

.. sourcecode:: python

    b = a

.. image:: images/ab2tag.png

The name ``"b"`` is just a second tag bound to the same object as ``"a"``.


Default Parameter Values (1)
----------------------------

This is a common mistake that beginners often make. Even more advanced
programmers make this mistake if they don't understand Python names.

.. sourcecode:: python

    def bad_append(new_item, a_list=[]):
        a_list.append(new_item)
        return a_list

.. container:: incremental

    The problem here is that the default value of ``a_list``, an empty list,
    is evaluated at function definition time. So every time you call the
    function, you get the same default list object. Try it several times:

    .. sourcecode:: python

        >>> print bad_append('one')
        ['one']
        >>> print bad_append('two')
        ['one', 'two']


Default Parameter Values (2)
----------------------------

Lists are mutable objects; you can change their contents. The correct way to
get a default list (or dictionary, or set) is to create it at run time
instead, inside the function:

.. sourcecode:: python

    def good_append(new_item, a_list=None):
        if a_list is None:
            a_list = []
        a_list.append(new_item)
        return a_list


List Comprehensions
-------------------

List comprehensions ("listcomps" for short) are syntax shortcuts for this
general pattern:

The traditional way, with ``for`` and ``if`` statements:

.. sourcecode:: python

    new_list = []
    for item in a_list:
        if condition(item):
            new_list.append(fn(item))

As a list comprehension:

.. sourcecode:: python

    new_list = [fn(item) for item in a_list if condition(item)]


Generator Expressions (1)
-------------------------

Let's sum the squares of the numbers up to 100.

As a loop:

.. sourcecode:: python

    total = 0
    for num in range(1, 101):
        total += num * num

We can use the ``sum`` function to quickly do the work for us, by building the
appropriate sequence.

As a list comprehension:

.. sourcecode:: python

    total = sum([num * num for num in range(1, 101)])


Generator Expressions (2)
-------------------------

As a generator expression:

.. sourcecode:: python

    total = sum(num * num for num in xrange(1, 101))

Rule of thumb:

* Use a list comprehension when a computed list is the desired end result.
* Use a generator expression when the computed list is just an intermediate
  step.


Generators
----------

Here's a useful generator à la ``find(1)``:

.. sourcecode:: python

    import fnmatch
    import os

    def walkfiles(startdir, pattern=None):
        """Return generator for full paths of all files below startdir.

        Optionally filters out files not matching pattern.

        """
        for dirpath, dirlist, filelist in os.walk(startdir):
            for fname in filelist:
                if pattern and not fnmatch.fnmatch(fname, pattern):
                    continue
                yield os.path.join(dirpath, fname)


Sorting with DSU
----------------

DSU = Decorate-Sort-Undecorate

Instead of creating a custom comparison function, we create an auxiliary list
that will sort naturally:

.. sourcecode:: python

    alist = [(4, 5), (3, 2), (2, 1), (6, 7)]

    # Decorate (the sort key is the second element of each tuple):
    to_sort = [(item[1], item) for item in alist]

    # Sort:
    to_sort.sort()

    # Undecorate:
    alist = [item[-1] for item in to_sort]


Sorting with DSU
----------------

In Python 2.4 and above, you can use the ``key`` parameter to ``sort`` to do
this in one step.

.. sourcecode:: python

    from operator import itemgetter

    alist.sort(key=itemgetter(1))


EAFP vs. LBYL
-------------

.. class:: incremental

It's easier to ask forgiveness than permission

.. class:: incremental

Look before you leap

.. container:: incremental

    .. class:: goodex

    Good:

    .. sourcecode:: python

        try:
            return str(x)
        except TypeError:
            ...

    .. class:: badex

    Bad:

    .. sourcecode:: python

        if isinstance(x, basestring):
            do_something(x)


Program structure
-----------------

.. class:: section-title

Program structure


Program structure
-----------------

#. (Shebang)
#. Source encoding declaration
#. Module docstring
#. Imports (stdlib, third-party, private modules)
#. Global constants and initialization code
#. Exceptions
#. Module-level functions
#. Classes
#. ``main`` function


Command line scripts
--------------------

.. sourcecode:: python
    :linenos:

    #!/usr/bin/env python
    # examples/script-template.py

    def main(args):
        if not args:
            print "Usage: foo ARG1 [ARG2...]"
            return 2
        return 0

    if __name__ == '__main__':
        import sys
        status = main(sys.argv[1:])
        sys.exit(status)
        # or combined
        # sys.exit(main(sys.argv[1:]))


OO-Programming
--------------

.. class:: section-title

OO-Programming


Setters & Getters (1)
---------------------

Bad:

.. sourcecode:: python
    :linenos:

    class Foo:
        def __init__(self, spamm, eggs):
            self.spamm = spamm
            self.eggs = eggs

        def get_spamm(self):
            return self.spamm

        def set_spamm(self, value):
            self.spamm = value

        # et cetera

    f = Foo('bar', 'baz')
    myspamm = f.get_spamm()


Setters & Getters (2)
---------------------

Good:

.. sourcecode:: python
    :linenos:

    class Foo:
        def __init__(self, spamm, eggs):
            self.spamm = spamm
            self.eggs = eggs

    f = Foo('bar', 'baz')
    myspamm = f.spamm


Setters & Getters (3)
---------------------

But what if you need to make your attribute dynamic later?

.. container:: incremental

    Bad:

    .. sourcecode:: python
        :linenos:

        class Foo:
            # ...
            def get_spamm(self):
                return make_spamm()

        f = Foo()
        myspamm = f.get_spamm()


Setters & Getters (4)
---------------------

Solution: use the ``property`` builtin:

.. container:: incremental

    Good:

    .. sourcecode:: python
        :linenos:

        class Foo(object):
            # ...
            def _spamm(self):
                """Return fresh portion of spamm."""
                return make_spamm()
            spamm = property(_spamm)

        f = Foo()
        myspamm = f.spamm


Setters & Getters (5)
---------------------

Or, using ``property`` as a *decorator*:

.. container:: incremental

    Very good:

    .. sourcecode:: python
        :linenos:

        class Foo(object):
            # ...
            @property
            def spamm(self):
                return make_spamm()

        f = Foo()
        myspamm = f.spamm

.. container:: incremental

    .. warning::

        Both forms will turn ``spamm`` into a read-only attribute!


Setters & Getters (6)
---------------------

The same applies to setting attributes:

Bad:

.. sourcecode:: python
    :linenos:

    class Foo:
        def set_spamm(self, value):
            if is_valid(value):
                self.spamm = value
            else:
                raise ValueError('I want my spamm!')

    f = Foo()
    f.set_spamm("Eggs")


Setters & Getters (7)
---------------------

Good (again using ``property``):

.. sourcecode:: python
    :linenos:

    # examples/property_01.py

    class Foo(object):
        def _get_spamm(self):
            return make_spamm()

        def _set_spamm(self, value):
            if is_valid(value):
                self.__dict__['spamm'] = value
            else:
                raise ValueError('I want my spamm!')

        spamm = property(_get_spamm, _set_spamm, None, "Tasty spamm")


Static and Class methods
------------------------


Static methods
--------------

Static methods don't receive the instance as the first argument. They can be
thought of as functions living in the namespace of the class, similar to
static methods in Java or C++. They are not very useful in Python (just use a
normal function instead), but they can serve as helper functions in a class
that don't need access to ``self`` and are of no use outside the class.

.. container:: incremental

    .. sourcecode:: python
        :linenos:

        class Foo:
            # ...
            @staticmethod
            def _format_name(name):
                # Assumption: underscores are meant to become spaces.
                return name.strip().replace('_', ' ').capitalize()


Class methods (1)
-----------------

Class methods receive the *class* object as the first argument, not the
instance. It is therefore good practice to name the first parameter ``cls``
(``class`` is a keyword!) instead of ``self``.

A good use for class methods is *factory functions*, i.e. alternative,
convenient ways to create pre-configured class instances.


Class methods (2)
-----------------

.. sourcecode:: python
    :linenos:

    # examples/classmethod_01.py

    class Template:
        def __init__(self, template, **data):
            self.template = template
            self.data = data

        @classmethod
        def from_file(cls, filename, **data):
            """Return Template with template string read from filename."""
            return cls(open(filename).read(), **data)

        def render(self, **data):
            # Per-call data overrides the defaults given at creation time.
            subst = self.data.copy()
            subst.update(data)
            return self.template % subst


Class methods (3)
-----------------

.. note::

    Classmethods can be called on the class::

        Template.from_file(...)

    or the instance with the same effect::

        t.from_file(...)


Summary
-------

.. class:: section-title

Summary


Summary
-------

* Follow PEP 8
* Read the standard library reference
* Know your lists, dicts and iterators
* ``import this``


How to recognize trees from afar
--------------------------------

.. class:: section-title

And now for something completely different...


How to recognize trees from afar
--------------------------------

.. class:: section-title

Number 1: The Larch


EggBasket
---------

.. image:: images/eggbasket_screenshot.png

For more information, please visit http://chrisarndt.de/projects/Eggbasket.