Differences between revisions 1 and 18 (spanning 17 versions)
Revision 1 as of 2012-04-19 16:50:35
Size: 1403
Editor: barry
Comment:
Revision 18 as of 2012-04-27 15:34:52
Size: 11351
Editor: barry
Comment:
It is a release goal for Ubuntu 12.10 to have only Python 3 on the desktop CD images. We have a [[https://blueprints.launchpad.net/ubuntu/+spec/foundations-q-python-versions|Q-series]] blueprint for discussion of this goal at [[http://uds.ubuntu.com/|UDS-Q]] in Oakland, California, in May of 2012. There is a [[https://wiki.ubuntu.com/Python/FoundationsQPythonVersions|more detailed spec]] for this effort and a publicly shared [[http://tinyurl.com/7dsyywo|Google docs spreadsheet]] to track this effort. This is an ambitious effort that will only succeed with help from the greater Ubuntu, Debian, and Python communities. In other words, we need '''you'''!
== Before you start ==

Here are recommendations for you to follow before you start porting.

 * Target Python 3.2, 2.7, and optionally 2.6. Ignore anything older than that.
 * Use a single code base for both Python 2 and 3.
 * Do not rely on [[http://docs.python.org/py3k/library/2to3.html|2to3]], and use the third party [[http://pypi.python.org/pypi/six|six]] module only if absolutely necessary.
 * Modernize your Python 2 code first, getting it working in Python 2.7 before starting to port.
 * Clarify your data model: what are bytes (data) and what are strings (text)?

I cannot overemphasize the last point. Without a clear separation in your mind and data model between bytes and strings, your port will likely be much more painful than it needs to be. This is the biggest distinction between Python 2 and Python 3. Python 2 let you be sloppy: its 8-bit strings served as both data and ASCII text, with automatic (but error-prone) conversions between 8-bit strings and unicodes. In Python 3 there are only bytes and strings (i.e. unicodes), with no automatic conversion between the two. This is A Good Thing.

== Python source ==

=== Basic compatibility ===

Put the following at the top of all your Python files:

{{{
from __future__ import absolute_import, print_function, unicode_literals
}}}

This turns on three important compatibility flags.
 * Absolute imports are the default in Python 3 [[http://python3porting.com/differences.html#imports|[more info]]]
 * {{{print()}}} is a function in Python 3 [[http://python3porting.com/noconv.html#print-section|[more info]]]
 * Unadorned string literals are unicode type in Python 3 [[http://python3porting.com/problems.html#binary-section|[more info]]]

In your code, make these changes:
 * Change all your {{{print}}} statements to use {{{print()}}} functions, and remove all the {{{u''}}} prefixes from your strings.
 * If you have string literals in your code that represent data, prefix them all with the {{{b''}}} prefix [[http://python3porting.com/problems.html#byte-literals|[more info]]]
 * Remove all {{{L}}} suffixes from your long integers. [[http://python3porting.com/differences.html#long|[more info]]]
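
A minimal sketch of what a module might look like after these changes. The names {{{MAGIC}}}, {{{GREETING}}}, {{{BIG}}}, and {{{describe()}}} are made up for illustration and are not taken from any real package.

```python
from __future__ import absolute_import, print_function, unicode_literals

# Hypothetical names, for illustration only.
MAGIC = b'\x89PNG'            # data: explicit bytes literal
GREETING = 'hello'            # text: no u'' prefix needed
BIG = 12345678901234567890    # long integer: no L suffix


def describe():
    # print() is a function on both Python 2 and 3 with the __future__ import.
    print(GREETING, len(MAGIC), BIG)
    # Text and data are distinct types on both versions.
    return type(GREETING) is not type(MAGIC)
```

With the {{{__future__}}} import in place, {{{GREETING}}} is text and {{{MAGIC}}} is data on both Python 2.7 and Python 3, which is the separation the data-model advice above asks for.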

=== built-ins ===

 * Change usage of {{{xrange()}}} to {{{range()}}}. [[http://python3porting.com/differences.html#filter-map-range-and-xrange|[more info]]]

=== dictionaries ===

 * Change all your uses of the dictionary methods {{{iteritems()}}}, {{{iterkeys()}}}, and {{{itervalues()}}} to use the non-{{{iter}}} variety, e.g. {{{items()}}}, {{{keys()}}}, and {{{values()}}} respectively. These return dictionary views in Python 3, not concrete lists, so if you need a concrete list, wrap these calls in {{{list()}}} or {{{sorted()}}}. [[http://python3porting.com/differences.html#index-6|[more info]]]
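
To illustrate the dictionary advice (and the {{{range()}}} change from the previous section), here is a small sketch using made-up sample data. It assumes nothing beyond the standard built-ins.

```python
counts = {'spam': 3, 'eggs': 1, 'ham': 2}   # made-up sample data

# Plain iteration over values() works the same on Python 2 and 3.
total = sum(value for value in counts.values())

# sorted() gives a concrete, ordered list of the keys.
names = sorted(counts.keys())

# list() takes a snapshot, so the dict can be modified while looping.
for name in list(counts.keys()):
    if counts[name] < 2:
        del counts[name]

# range() instead of xrange(); indexing needs the concrete list from above.
first_two = [names[i] for i in range(2)]
```

On Python 2 the {{{list()}}} call is a harmless copy; on Python 3 it is what prevents a "dictionary changed size during iteration" error.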

=== strings/bytes/unicodes ===

 * {{{bytes}}} objects in Python 3 have no {{{.format()}}} method. Use concatenation instead.
 * For raw-bytes objects, use the {{{br''}}} string prefix ({{{rb''}}} was added to Python 3.3)
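
A short sketch of both points. The helper name {{{make_header()}}} and the sample values are assumptions made for the example.

```python
def make_header(name, value):
    # Hypothetical helper: both arguments are expected to be bytes.
    # bytes has no .format() method, so the pieces are concatenated.
    return name + b': ' + value + b'\r\n'


line = make_header(b'Content-Length', b'42')

# br'' keeps the backslashes literal in a bytes object. Unlike rb'',
# this spelling is accepted by Python 2.6+ and by every Python 3 release.
pattern = br'\d+\.\d+'
```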

=== iterators ===

 * Change your iterator classes from providing a {{{next()}}} method to providing a {{{__next__()}}} method. For cross-compatibility, in your class, set {{{next = __next__}}}. [[http://python3porting.com/differences.html#next|[more info]]]
 * Use {{{itertools.zip_longest()}}} in Python 3, with a conditional import for {{{itertools.izip_longest()}}} in Python 2.
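
Both iterator recommendations can be sketched together as follows. The {{{Countdown}}} class is a made-up example, not code from any particular project.

```python
try:
    from itertools import zip_longest                    # Python 3
except ImportError:
    from itertools import izip_longest as zip_longest    # Python 2


class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__    # Python 2 calls next() rather than __next__()


pairs = list(zip_longest(Countdown(3), 'ab', fillvalue=None))
```

After the conditional import, the rest of the module can use the name {{{zip_longest}}} unconditionally.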

=== operators ===

 * Python 3 has no {{{operator.isSequenceType()}}}. Use the following code for cross-compatibility.

{{{
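# collections.Sequence exists in Python 2.6+ and 3.2; from Python 3.3 it lives in collections.abc.
from collections import Sequence
def is_sequence(obj):
    return isinstance(obj, Sequence)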
}}}
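
The snippet above imports from {{{collections}}}, which matches the Python 2.7 and 3.2 targets of this page. Python 3.3 moved the abstract base classes to {{{collections.abc}}}, and later Python 3 releases dropped the old alias, so a sketch that also runs on newer interpreters could use a conditional import. The helper name {{{is_sequence()}}} is an assumption made for the example.

```python
try:
    from collections.abc import Sequence    # Python 3.3 and later
except ImportError:
    from collections import Sequence        # Python 2.6, 2.7 and 3.2


def is_sequence(obj):
    # Hypothetical helper: True for lists, tuples and strings,
    # False for dicts, sets and plain numbers.
    return isinstance(obj, Sequence)
```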

=== codecs ===

 * Python 2 codecs which do str-to-str conversions (e.g. ''rot-13'') do not work through the string {{{.encode()}}} method in Python 3. Look the codec up directly instead:

{{{
from codecs import getencoder
encoder = getencoder('rot-13')
rot13string = encoder(mystring)[0]
}}}

[[http://www.wefearchange.org/2012/01/python-3-porting-fun-redux.html|[more info]]]
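
Wrapped in a function, the same approach might look like the sketch below. The helper name {{{rot13()}}} is made up for the example.

```python
from codecs import getencoder


def rot13(text):
    # Hypothetical helper. The encoder returns a tuple of
    # (converted_text, length_consumed); only the first item is needed.
    encoder = getencoder('rot-13')
    return encoder(text)[0]


secret = rot13('Hello')
```

Because rot-13 is its own inverse, calling {{{rot13()}}} on the result gives back the original text.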

=== Metaclasses ===

 * Syntax for creating instances with different metaclasses is very different between Python 2 and 3. Use the ability to call {{{type}}} instances as a way to portably create such instances. [[http://cgit.freedesktop.org/dbus/dbus-python/tree/dbus/gobject_service.py|[example]]]
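
A minimal sketch of the technique, using a made-up {{{Registry}}} metaclass rather than the dbus-python code linked above. Calling the metaclass like a function creates a base class, and ordinary subclasses then inherit the metaclass.

```python
class Registry(type):
    """Hypothetical metaclass that records every class it creates."""
    classes = []

    def __init__(cls, name, bases, namespace):
        super(Registry, cls).__init__(name, bases, namespace)
        Registry.classes.append(name)


# Python 2 spells this `__metaclass__ = Registry` in the class body and
# Python 3 spells it `class Base(metaclass=Registry)`. Calling the
# metaclass directly is valid in both. The str() call keeps the class
# name a native string when unicode_literals is in effect on Python 2.
Base = Registry(str('Base'), (object,), {})


class Plugin(Base):
    pass
```

Only the one line creating {{{Base}}} looks unusual; everything that derives from it uses the normal class syntax.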

=== doctests ===

 * In your doctest's {{{setUp()}}}, add these globals to your test object's {{{globs}}} so that the doctests run in the same {{{__future__}}} environment as your code:

{{{
from __future__ import absolute_import, print_function, unicode_literals
def setUp(testobj):
    testobj.globs['absolute_import'] = absolute_import
    testobj.globs['print_function'] = print_function
    testobj.globs['unicode_literals'] = unicode_literals
}}}

[[http://www.wefearchange.org/2012/01/python-3-porting-fun-redux.html|[more info]]]

 * Bytes have different reprs in Python 2 and Python 3. This convenience function is used to print bytes objects in cross-compatible ways:

{{{
def print_bytes(obj):
    if bytes is not str:
        obj = repr(obj)[2:-1]
    print(obj)
}}}

=== zope.interface ===

 * The {{{implements()}}} function does not work in Python 3. Use the {{{@implementer}}} class decorator instead. [[http://www.wefearchange.org/2012/01/python-3-porting-fun-redux.html|[more info]]]

== Python extension modules ==

 * Define a {{{PY3}}} macro which you can later {{{#ifdef}}} on for C code which cannot be written portably for both Python 2 and Python 3. [[http://cgit.freedesktop.org/dbus/dbus-python/tree/include/dbus-python.h#n35|[example]]]

=== Compatibility macros ===

 * The {{{PyInt_*}}} functions are gone in Python 3. In your extension module, change all of these to {{{PyLong_*}}} functions, which will work in both versions. [[http://python3porting.com/cextensions.html#changes-in-python|[more info]]]
 * {{{#include <bytesobject.h>}}} and replace all {{{PyString_*}}} functions with their {{{PyBytes_*}}} equivalents, changing those that really operate on unicodes to use {{{PyUnicode_*}}} functions. [[http://python3porting.com/cextensions.html#strings-and-unicode|[more info]]]
 * Use the {{{Py_TYPE()}}} macro instead of explicitly dereferencing {{{ob_type}}}. [[http://python3porting.com/cextensions.html#object-initialization|[more info]]]

=== C types ===

There are lots of differences you need to be aware of when defining types in C extensions. A few important ones:

 * Use {{{PyVarObject_HEAD_INIT()}}} and don't define the {{{ob_size}}} slot separately [[http://python3porting.com/cextensions.html#object-initialization|[more info]]]
 * Remove references to {{{Py_TPFLAGS_HAVE_WEAKREFS}}} and {{{Py_TPFLAGS_HAVE_ITER}}} since these are unnecessary (and undefined) in Python 3. If you need to support both Python 2 and 3, you'll need an {{{#ifdef}}}.

=== PyArg_Parse() ===

 * {{{PyArg_Parse()}}} and friends lack a {{{y}}} code (for bytes objects) in Python 2, so you will have to {{{#ifdef}}} around these.
 * In Python 3, there's no equivalent of the {{{z}}} code for bytes objects (accepting {{{None}}} as well). Write an {{{O&}}} converter.

=== PyCObject ===

 * Rewrite these to use {{{PyCapsule}}} instead. If you can drop Python 2.6, there's no need to {{{#ifdef}}} these, since {{{PyCapsule}}} is available in Python 2.7. [[http://cgit.freedesktop.org/dbus/dbus-python/tree/_dbus_bindings/module.c#n411|[example]]]

=== reprs ===

 * If you derive new types from builtin C types, e.g. PyBytes, and you want to override the repr in the subclass, you'll find you have a problem with cross-compatibility. In Python 2, the super class's repr will return bytes (a.k.a. 8-bit strings), while in Python 3 it will return unicodes. Python's C API has a little-known format code {{{%V}}} which can be used to bridge this gap. Add this macro:

{{{
#define REPRV(obj) \
    (PyUnicode_Check(obj) ? (obj) : NULL), \
    (PyUnicode_Check(obj) ? NULL : PyBytes_AS_STRING(obj))
}}}

and use it like this:

{{{
return PyUnicode_FromFormat("...%V...", REPRV(parent_repr));
}}}

[[http://www.wefearchange.org/2011/12/lessons-in-porting-to-python-3.html|[more info]]]

== Resources ==

 * {{{#python3}}} IRC channel on Freenode
 * [[http://mail.python.org/mailman/listinfo/python-porting|python-porting mailing list]]
 * [[http://wiki.debian.org/Python/LibraryStyleGuide|Debian Python packaging style guide (covers Python 2 and Python 3)]]
 * '''Excellent''' [[http://python3porting.com/|in-depth Python 3 porting guide]]
 * [[http://python3wos.appspot.com/|Python 3 "Wall of Shame"]]
 * [[http://pypi.python.org/pypi?:action=browse&c=533&show=all|Cheeseshop packages explicitly claiming Python 3 support]]
 * [[http://wiki.python.org/moin/PortingPythonToPy3k|Python wiki porting guide (pure-Python)]]
 * [[http://wiki.python.org/moin/PortingExtensionModulesToPy3k|Python wiki porting guide (extension modules)]]
 * Barry Warsaw's blog
   * [[http://www.wefearchange.org/2012/01/debian-package-for-python-2-and-3.html|Debian packaging for Python 2 and 3]]
   * [[http://www.wefearchange.org/2011/12/lessons-in-porting-to-python-3.html|Python 3 porting (part 1)]]
   * [[http://www.wefearchange.org/2012/01/python-3-porting-fun-redux.html|Python 3 porting (part 2)]]
   * [[http://www.wefearchange.org/2012/04/python-3-on-desktop-for-quantal-quetzal.html|Python 3 plans for Ubuntu 12.10 Quantal Quetzal]]
   * [[http://www.wefearchange.org/2011/11/update-on-ubuntus-python-plans.html|Python 3 plans for Ubuntu 12.04 Precise Pangolin]]
 * Ned Batchelder's PyCon 2012 talk [[http://pyvideo.org/video/948/pragmatic-unicode-or-how-do-i-stop-the-pain|Pragmatic Unicode, or How Do I Stop the Pain?]] '''''Watch this NOW'''''

== Q/A ==

 * Why not rely on 2to3 or the six module? 2to3 is a pretty slow tool, so it can slow down your development cycle. six is (IMHO) a mostly unnecessary extra dependency.
 

Python 3 on Ubuntu

It is a release goal for Ubuntu 12.10 to have only Python 3 on the desktop CD images. We have a Q-series blueprint for discussion of this goal at UDS-Q in Oakland, California, in May of 2012. There is a more detailed spec for this effort and a publicly shared Google docs spreadsheet to track this effort. This is an ambitious effort that will only succeed with help from the greater Ubuntu, Debian, and Python communities. In other words, we need you!

At the bottom of this page, you will find various resources for diving more into aspects of supporting Python 3, from the pure-Python, C extension module, Debian packaging, and other perspectives. The intent of this page is to provide specific guidelines in a quick reference format, for when you are already familiar with the basic concepts and approaches but need a refresher on specific coding techniques. This is a wiki page, and you are encouraged to contribute, but try to keep your recommendations tightly focused on accomplishing the release goal of Python 3 only on the 12.10 CDs.

Before you start

Here are recommendations for you to follow before you start porting.

  • Target Python 3.2, 2.7, and optionally 2.6. Ignore anything older than that.
  • Use a single code base for both Python 2 and 3.
  • Do not rely on 2to3, and use the third party six module only if absolutely necessary.

  • Modernize your Python 2 code first, getting it working in Python 2.7 before starting to port.
  • Clarify your data model: what are bytes (data) and what are strings (text)?

I cannot overemphasize the last point. Without a clear separation in your mind and data model between bytes and strings, your port will likely be much more painful than it needs to be. This is the biggest distinction between Python 2 and Python 3. Python 2 let you be sloppy: its 8-bit strings served as both data and ASCII text, with automatic (but error-prone) conversions between 8-bit strings and unicodes. In Python 3 there are only bytes and strings (i.e. unicodes), with no automatic conversion between the two. This is A Good Thing.

Python source

Basic compatibility

Put the following at the top of all your Python files:

from __future__ import absolute_import, print_function, unicode_literals

This turns on three important compatibility flags.

  • Absolute imports are the default in Python 3 [more info]

  • print() is a function in Python 3 [more info]

  • Unadorned string literals are unicode type in Python 3 [more info]

In your code, make these changes:

  • Change all your print statements to use print() functions, and remove all the u'' prefixes from your strings.

  • If you have string literals in your code that represent data, prefix them all with the b'' prefix [more info]

  • Remove all L suffixes from your long integers. [more info]

built-ins

  • Change usage of xrange() to range(). [more info]

dictionaries

  • Change all your uses of the dictionary methods iteritems(), iterkeys(), and itervalues() to use the non-iter variety, e.g. items(), keys(), and values() respectively. These return dictionary views in Python 3, not concrete lists, so if you need a concrete list, wrap these calls in list() or sorted(). [more info]

strings/bytes/unicodes

  • bytes objects in Python 3 have no .format() method. Use concatenation instead.

  • For raw-bytes objects, use the br'' string prefix (rb'' was added to Python 3.3)

iterators

  • Change your iterator classes from providing a next() method to providing a __next__() method. For cross-compatibility, in your class, set next = __next__. [more info]

  • Use itertools.zip_longest() in Python 3, with a conditional import for itertools.izip_longest() in Python 2.

operators

  • Python 3 has no operator.isSequenceType(). Use the following code for cross-compatibility.

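# collections.Sequence exists in Python 2.6+ and 3.2; from Python 3.3 it lives in collections.abc.
from collections import Sequence
def is_sequence(obj):
    return isinstance(obj, Sequence)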

codecs

  • Python 2 codecs which do str-to-str conversions (e.g. rot-13) do not work through the string .encode() method in Python 3. Look the codec up directly instead:

from codecs import getencoder
encoder = getencoder('rot-13')
rot13string = encoder(mystring)[0]

[more info]

Metaclasses

  • Syntax for creating instances with different metaclasses is very different between Python 2 and 3. Use the ability to call type instances as a way to portably create such instances. [example]

doctests

  • In your doctest's setUp(), add these globals to your test object's globs so that the doctests run in the same __future__ environment as your code:

from __future__ import absolute_import, print_function, unicode_literals
def setUp(testobj):
    testobj.globs['absolute_import'] = absolute_import
    testobj.globs['print_function'] = print_function
    testobj.globs['unicode_literals'] = unicode_literals

[more info]

  • Bytes have different reprs in Python 2 and Python 3. This convenience function is used to print bytes objects in cross-compatible ways:

def print_bytes(obj):
    if bytes is not str:
        obj = repr(obj)[2:-1]
    print(obj)

zope.interface

  • The implements() function does not work in Python 3. Use the @implementer class decorator instead. [more info]

Python extension modules

  • Define a PY3 macro which you can later #ifdef on for C code which cannot be written portably for both Python 2 and Python 3. [example]

Compatibility macros

  • The PyInt_* functions are gone in Python 3. In your extension module, change all of these to PyLong_* functions, which will work in both versions. [more info]

  • #include <bytesobject.h> and replace all PyString_* functions with their PyBytes_* equivalents, changing those that really operate on unicodes to use PyUnicode_* functions. [more info]

  • Use the Py_TYPE() macro instead of explicitly dereferencing ob_type. [more info]

C types

There are lots of differences you need to be aware of when defining types in C extensions. A few important ones:

  • Use PyVarObject_HEAD_INIT() and don't define the ob_size slot separately [more info]

  • Remove references to Py_TPFLAGS_HAVE_WEAKREFS and Py_TPFLAGS_HAVE_ITER since these are unnecessary (and undefined) in Python 3. If you need to support both Python 2 and 3, you'll need an #ifdef.

PyArg_Parse()

  • PyArg_Parse() and friends lack a y code (for bytes objects) in Python 2, so you will have to #ifdef around these.

  • In Python 3, there's no equivalent of the z code for bytes objects (accepting None as well). Write an O& converter.

PyCObject

  • Rewrite these to use PyCapsule instead. If you can drop Python 2.6, there's no need to #ifdef these, since PyCapsule is available in Python 2.7. [example]

reprs

  • If you derive new types from builtin C types, e.g. PyBytes, and you want to override the repr in the subclass, you'll find you have a problem with cross-compatibility. In Python 2, the super class's repr will return bytes (a.k.a. 8-bit strings), while in Python 3 it will return unicodes. Python's C API has a little-known format code %V which can be used to bridge this gap. Add this macro:

#define REPRV(obj) \
    (PyUnicode_Check(obj) ? (obj) : NULL), \
    (PyUnicode_Check(obj) ? NULL : PyBytes_AS_STRING(obj))

and use it like this:

return PyUnicode_FromFormat("...%V...", REPRV(parent_repr));

[more info]

Resources

Q/A

  • Why not rely on 2to3 or the six module? 2to3 is a pretty slow tool, so it can slow down your development cycle. six is (IMHO) a mostly unnecessary extra dependency.

Python/3 (last edited 2016-02-26 01:40:43 by cmawebsite)