Embedding Python with Boost.Python Part 1
by Howard "SiCrane" Jeng

Building Applications that Embed Python

Python is an increasingly popular programming language used in a variety of contexts. Python has been called a language optimized for development speed. This puts it in contrast with compiled languages like C and C++, which can be called languages optimized for execution speed.

This contrast between Python and C/C++ often fuels a development methodology in Python circles: code the application in Python, profile the application, and rewrite the performance-critical parts in C or C++. As a result, developing hybrid systems by extending Python is very well covered in various Python documentation. Less well covered is the situation where you have a C++ program in which you want to embed Python. Python's ease of use makes it well suited to game development in areas where the ability to quickly modify code is more important than execution speed. These include artificial intelligence, event scripting and level design.

This article series covers embedding Python in a C++ application with Boost.Python, one of the Boost project's many useful libraries. This particular article focuses on the basics of embedding, specifically on the process of building applications that embed Python.

Embedding Python with Boost.Python Part 2 focuses more on the interface between Python and C++ code, including an extensive example of embedding based on an SDL application where the majority of the core logic is done in Python. The last article shows the flexibility of Python in an embedded context by further extending the application presented in the second article, with what may be considered "bells and whistles", including incorporating an interactive Python interpreter in the application.

This article series assumes familiarity with the basics of Python and intermediate knowledge of C++. 1) Also, the series relies on Boost.Python, which in turn requires Python 2.2 or higher. The code in these articles has been tested with Boost 1.32.0, 1.33.0 and 1.33.1 as well as Python 2.3 and 2.4.

The First Embedded Application

The absolute minimal application that you can create that can claim to have embedded Python looks like: 2)

#include <Python/Python.h>

int main(int, char **) {
  Py_Initialize();
  
  Py_Finalize();
  return 0;
}

All this program does is initialize Python and then clean it up. However, even a program as simple as this can be instructive. For one thing, unless you've already set up Python for your C++ development environment, it shouldn't compile or link.

In order to get this program to build you need to have Python installed on your computer. Your computer may already have Python installed. For example, recent versions of Mac OS X ship with Python as a framework. If not, there are pre-built binaries for Windows and Mac available from http://www.python.org, and source archives available for other operating systems. The source archive builds and installs much like you would normally expect for an automake project: ./configure ; make ; make install. Among the outputs is the libpythonx.y.a archive (where x.y is the version number, e.g. libpython2.4.a), which you need to link against when extending or embedding Python. It is put in the root directory of the source archive. 3)

Once you have Python installed, you still need to get the headers recognized by your compiler. I put the headers in a path so that they reside in the Python/ virtual directory, which is also how the Python framework for Mac sets things up by default. This is in contrast with normal Python extending/embedding sources, which assume that the Python headers are put in a place that you can access without a directory specification (i.e. #include <Python.h> instead of #include <Python/Python.h>). For the purposes of this article the difference in directory structure is minor, since for the most part we will be including boost/python.hpp instead of the Python.h header directly. You will also need to add the Python lib directory to your linker path or otherwise directly add the Python library to your project (either pythonxy.lib or libpythonx.y.a, or by adding the Python framework in OS X).

Also, in debug mode, with a default Python installation, the program will not link in MSVC. This is because the Python MSI installer comes with pythonxy.lib, but not the pythonxy_d.lib requested as an input file to the linker via a header pragma. You can get around this by including boost/python.hpp instead of Python.h. boost/python.hpp does some preprocessor magic to change the pragma to link against pythonxy.lib instead. However, don't include both. That may cause your program to try to link against both pythonxy.lib and pythonxy_d.lib. 4)
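The library file names mentioned so far follow a simple pattern, which the small helper below spells out. It is only an illustrative sketch: the function python_library_name and the Platform enum are invented for this example and are not part of Python or Boost, and the helper does nothing more than format the names quoted in the text.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical names for illustration only; not part of Python or Boost.
enum Platform { WindowsMsvc, UnixStatic };

// Formats the Python link library file name for a given version.
// The debug flag only affects the MSVC name, which gains a _d suffix.
std::string python_library_name(Platform platform, int major, int minor,
                                bool debug = false) {
  assert(major >= 0 && minor >= 0);
  std::ostringstream name;
  if (platform == WindowsMsvc) {
    // pythonxy.lib or pythonxy_d.lib
    name << "python" << major << minor << (debug ? "_d" : "") << ".lib";
  } else {
    // libpythonx.y.a
    name << "libpython" << major << '.' << minor << ".a";
  }
  return name.str();
}
```

For Python 2.4 this gives python24.lib and python24_d.lib for MSVC, and libpython2.4.a for a build from the source archive.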

One final thing to note is that with older Python versions, on Mac platforms you needed to use PyMac_Initialize() instead of Py_Initialize(). This should only matter if you are using Python 2.1 or earlier. However, this article covers Boost.Python, which requires Python 2.2 or later, so this should be a non-issue.

Once you can get it to compile and link, try running the program. Hopefully it will do what it's supposed to do: have no apparent effect. 5)

An Embedded Program That Does Something

Now let's modify the program to actually do something. Tradition dictates that the next step be Hello World. So let's create a Python file, hello.py with the contents:

def hello():
  print "Hello World!"

Then modify our main() function to:

#include <Python/Python.h>

int main(int, char **) {
  Py_Initialize();

  PyRun_SimpleString("import hello");
  PyRun_SimpleString("hello.hello()");
  
  Py_Finalize();
  return 0;
}

This does more or less what you would expect if you ran the two statements in Python. 6) PyRun_SimpleString() can be used to run a variety of Python code; however, it has several limitations. First off, PyRun_SimpleString() can only be used to run statements, not evaluate expressions. It also doesn't communicate its result to the calling code: the return value only reports success (0) or that an exception was raised (-1). In order to adjust the inputs to the function, you need to alter the C string passed to PyRun_SimpleString(), which can be cumbersome. Finally, the Python code is always run in the namespace of module __main__.
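To make the point about altering the C string concrete, here is a minimal sketch of assembling a statement at run time. The helper make_call_statement is a hypothetical name introduced for this example, and it only handles a single integer argument; passing strings this way would also require escaping quotes, which is part of what makes the approach cumbersome.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: builds the Python source "module.function(argument)".
std::string make_call_statement(const std::string & module,
                                const std::string & function,
                                long argument) {
  assert(!module.empty() && !function.empty());
  std::ostringstream statement;
  statement << module << '.' << function << '(' << argument << ')';
  return statement.str();
}
```

The resulting string would then be handed to the interpreter with something like PyRun_SimpleString(make_call_statement("hello", "greet", 3).c_str()), where greet is an assumed one-argument function that does not appear in hello.py above.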

For this Hello World program these limitations are not a big issue. Now let's look at something where we want to extract a value from Python. First we need to put a value into Python in order to extract it. So first we change PyRun_SimpleString() to take an assignment statement. Then in order to get at the assigned variable, we use the low level Python API:

#include <Python/Python.h>
#include <cassert>
#include <iostream>

int main(int, char **) {
  Py_Initialize();
  
  PyRun_SimpleString("result = 5 ** 2");
  
  PyObject * module = PyImport_AddModule("__main__"); // borrowed reference

  assert(module);                                     // __main__ should always exist
  PyObject * dictionary = PyModule_GetDict(module);   // borrowed reference
  assert(dictionary);                                 // __main__ should have a dictionary
  PyObject * result
    = PyDict_GetItemString(dictionary, "result");     // borrowed reference

  assert(result);                                     // just added result
  assert(PyInt_Check(result));                        // result should be an integer
  long result_value = PyInt_AS_LONG(result);          // already checked that it is an int
  
  std::cout << result_value << std::endl;
  
  Py_Finalize();
  return 0;
}

The first thing you might notice is the strange comments about borrowed references. In the Python C API, most functions return PyObject pointers, which are reference counted. These pointers can either be new references or borrowed references. A new reference means that the Python runtime has already incremented the reference count for you before returning the PyObject pointer. When you receive a new reference from the API you need to eventually call Py_DECREF() or Py_XDECREF() to decrement the reference count after you are finished with it. For borrowed references you don't need to decrement the reference count; however, this also means that since the reference count hasn't been incremented, the Python runtime may garbage collect the object while you still have a pointer to it. For these borrowed references, the objects should last all the way until the Py_Finalize() call, so incrementing the reference count is not necessary.
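The ownership rules can be illustrated without the interpreter at all. The following is a toy model, not the Python API: ToyObject, toy_new(), toy_incref(), toy_decref() and ToySlot are invented stand-ins for PyObject, an API call that returns a new reference, Py_INCREF(), Py_DECREF() and a container such as a dictionary.

```cpp
#include <cassert>

// Toy stand-in for PyObject: just a reference count and a payload.
struct ToyObject {
  int refcount;
  long value;
};

// Number of ToyObjects currently alive, so that leaks are observable.
static int toy_live_objects = 0;

// Returns a NEW reference: the count already includes the caller's share.
ToyObject * toy_new(long value) {
  ToyObject * object = new ToyObject;
  object->refcount = 1;
  object->value = value;
  ++toy_live_objects;
  return object;
}

void toy_incref(ToyObject * object) { ++object->refcount; }

// Gives up one reference; the object is destroyed when none remain.
void toy_decref(ToyObject * object) {
  assert(object->refcount > 0);
  if (--object->refcount == 0) {
    delete object;
    --toy_live_objects;
  }
}

// Toy container that owns one reference to whatever it stores.
// Copying a ToySlot is deliberately not handled in this sketch.
struct ToySlot {
  ToyObject * stored;
  ToySlot() : stored(0) {}
  ~ToySlot() { if (stored) toy_decref(stored); }

  // Takes its own reference; the caller keeps the one it already had.
  void set(ToyObject * object) {
    toy_incref(object);
    if (stored) toy_decref(stored);
    stored = object;
  }

  // Returns a BORROWED reference: valid only while the slot holds the object.
  ToyObject * get() const { return stored; }
};
```

In this model a pointer obtained from toy_new() must be balanced by exactly one toy_decref(), while a pointer obtained from ToySlot::get() must not be released by the caller and stays valid only for as long as the slot keeps the object.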

The actual function calls are a bit more straightforward. They are equivalent to the Python code:

result = 5 ** 2
import __main__
__main__.__dict__["result"]

The result is then transferred to the C++ program and printed. Now let's try the other half of the equation: setting a variable in Python from the C++ program. We'll just add a few lines between the cout and Py_Finalize() call.

  std::cout << result_value << std::endl;
  
  PyObject * twenty = PyInt_FromLong(20);             // new reference
  PyDict_SetItemString(dictionary, "result", twenty);
  Py_DECREF(twenty);                                  // release reference count
  PyRun_SimpleString("print result");
  
  Py_Finalize();

This creates a new Python integer object with the value of 20 and assigns it to the dictionary. 7) In this case instead of getting a borrowed reference from Python, we get a new reference that we need to track, so we call Py_DECREF() after finishing with the object.

Introducing Boost.Python

At this point, all the uses of raw pointers as well as explicit management of reference counts may be triggering instincts of wrapping PyObject pointers with smart pointers. Fortunately, Boost.Python does that for you as well as wrapping some of the API calls so that you don't need to deal with the low level API directly as much.
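As a rough sketch of that instinct, the class below adjusts a reference count in its constructor, copy operations and destructor so that the calling code never has to. It works on a toy Counted struct rather than PyObject, and the names Counted and CountedHandle are invented for this example; Boost.Python's own wrappers are considerably more elaborate. The sketch always increments on construction, which corresponds to wrapping a borrowed reference.

```cpp
#include <cassert>

// Toy reference-counted object standing in for PyObject.
struct Counted {
  int refcount;
  long value;
  explicit Counted(long v) : refcount(0), value(v) {}
};

// Minimal RAII wrapper: holds exactly one reference while it is alive.
class CountedHandle {
 public:
  explicit CountedHandle(Counted * object) : object_(object) { acquire(); }

  CountedHandle(const CountedHandle & other) : object_(other.object_) {
    acquire();
  }

  CountedHandle & operator=(const CountedHandle & other) {
    if (this != &other) {
      Counted * old = object_;
      object_ = other.object_;
      acquire();      // take the new reference first
      release(old);   // then give up the old one
    }
    return *this;
  }

  ~CountedHandle() { release(object_); }

  Counted * get() const { return object_; }

 private:
  void acquire() {
    if (object_) ++object_->refcount;
  }

  static void release(Counted * object) {
    if (!object) return;
    assert(object->refcount > 0);
    --object->refcount;  // a real wrapper would destroy the object at zero
  }

  Counted * object_;
};
```

With a wrapper like this, every path out of a scope, including one taken because of an exception, releases the reference automatically.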

Unfortunately, Boost.Python still seems mostly geared toward writing Python extensions rather than embedding, so we still need to use some API calls directly. A direct translation of the previous program looks like:

#include <boost/python.hpp>
#include <iostream>

int main(int, char **) {
  using namespace boost::python;

  Py_Initialize();
  
  try {
    PyRun_SimpleString("result = 5 ** 2");
    
    object module(handle<>(borrowed(PyImport_AddModule("__main__"))));
    object dictionary = module.attr("__dict__");
    object result = dictionary["result"];
    int result_value = extract<int>(result);
    
    std::cout << result_value << std::endl;
    
    dictionary["result"] = 20;

    PyRun_SimpleString("print result");
  } catch (error_already_set const &) {
    PyErr_Print();
  }
  
  Py_Finalize();
  return 0;
}

The call to PyImport_AddModule() becomes slightly more complex, as what was originally a comment about the borrowed reference becomes part of the code itself. However, once we have a reference to the __main__ module, instead of using an API call, we use the attr() member function to get at the __dict__ attribute of the __main__ module, which more closely resembles the equivalent Python code. Similarly, getting the reference to the result variable is done with operator[], and we can get at the value with the extract<>() function, whose usage resembles a cast.

Also added is a slightly better error handling scheme than before. Instead of assuming that everything works properly, now it traps Python exceptions inside a try/catch block.

Now the challenge is to get this to build. For directions on how to get Boost.Python built with Boost.Jam go to http://www.boost.org/more/getting_started.html. If you can't get that to work, I've included an appendix to this article about building a project with Boost.Python without bjam. Even if you can get bjam to work, there are a couple of notes for building your project against the Boost.Python library. Firstly, exception handling and RTTI are required to be enabled for Boost.Python to function properly. Secondly, under MSVC, you should build your project with the Multithread DLL versions of the C runtime library or you can get some very subtle and strange runtime errors.

Hopefully once you get this built and running, it will display 25 and 20 as expected.

Creating a Module

The last topic for this article is creating a module in the application that can be imported from the embedded Python interpreter. To define a Python module with Boost.Python you declare the module with the BOOST_PYTHON_MODULE() macro, and put the various initializers inside the curly braces that follow it. The next article deals more with the initializers and how to export classes. Right now just focus on the one commented line in the following source.

#include <boost/python.hpp>

using namespace boost::python;

int add_five(int x) {
  return x + 5;
}

BOOST_PYTHON_MODULE(Pointless)
{
    def("add_five", add_five);
}

int main(int, char **) {

  Py_Initialize(); 
  
  try {
    initPointless(); // initialize Pointless
  
    PyRun_SimpleString("import Pointless");
    PyRun_SimpleString("print Pointless.add_five(4)");
  } catch (error_already_set const &) {
    PyErr_Print();
  }
  
  Py_Finalize();
  return 0;
}

This program shouldn't require any new measures to build or run properly. The important detail here is that in order for the module to be properly registered with the embedded Python interpreter, the init function for the module needs to be called. The function is called init followed by the module name. In this case it is initPointless(). So if you declare a module with BOOST_PYTHON_MODULE(Foo) you should call initFoo() sometime after Py_Initialize(), but before any Python code might try to import the module. 8)
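The ordering requirement can be pictured with a toy model. Nothing below is Python or Boost.Python API: init_function_name and ToyInterpreter are invented for illustration, and a real import of an unknown name would search the module path and raise ImportError rather than return false.

```cpp
#include <cassert>
#include <set>
#include <string>

// Name of the init function for a module: "init" followed by the module name.
std::string init_function_name(const std::string & module_name) {
  assert(!module_name.empty());
  return "init" + module_name;
}

// Toy interpreter that only knows about modules that have been initialized.
class ToyInterpreter {
 public:
  // Stands in for calling initFoo() after Py_Initialize().
  void run_init(const std::string & module_name) {
    loaded_.insert(module_name);
  }

  // Stands in for "import Foo": succeeds only for initialized modules.
  bool import_module(const std::string & module_name) const {
    return loaded_.count(module_name) != 0;
  }

 private:
  std::set<std::string> loaded_;
};
```

In this model, import_module("Pointless") fails until run_init("Pointless") has been called, which is the same ordering the real program has to respect.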

Wrapping Up

This should cover all the minor details required to successfully build an application with an embedded Python interpreter with and without Boost.Python. In the next article I cover more details on how to perform actual integration between Python and C++.

Appendix

Setting up embedded Boost.Python projects if you can't (or don't want to) get Boost.Jam to work.

While later versions of Boost have been very good about installing properly with bjam, sometimes it's hard to get it to work right if you don't know how to set up the right environment variables, or if you have an otherwise eccentric build system. Alternately, you may not want to dynamically link against the Boost.Python library. For those cases, here are a couple of examples of setting up projects without using Boost.Jam to put together an executable that statically links against Boost.Python. I've chosen the two IDEs I've seen people have the most trouble getting bjam to work with: MSVC .NET 2003 for Windows and XCode for Mac OS X.

These instructions have been tested with Boost 1.32, 1.33.0 and 1.33.1 with Python 2.3 and 2.4, though they should also work with Python 2.2.

In Windows with MSVC .NET 2003

The Windows environment presents special challenges in getting Boost.Python to build if you can't get bjam to work. The primary problem is that Boost.Python on Windows is designed to be built as a dynamic link library, so simply including the Boost.Python source files in your project isn't sufficient; you also need to sprinkle in the correct preprocessor macros.

The first step is to download Boost and Python. Boost just needs to be unpacked. For Windows, there is an MSI installer for Python that contains the necessary headers and link libraries for use with MSVC. On my computer I put the boost_1_32_0/ directory in C:\include, and the C header files from the include/ directory of the Python installation in the C:\include\Python directory.

Then create an empty console project in MSVC, and add the boost_1_32_0/ directory and the directory containing the Python header directory to your include path in project settings. Next, add some Boost.Python test code as a main.cpp file:

#include <boost/python.hpp>
#include <iostream>

int main(int, char **) {
  using namespace boost::python;

  Py_Initialize();
  
  try {
    PyRun_SimpleString("result = 5 ** 2");
    
    object module(handle<>(borrowed(PyImport_AddModule("__main__"))));
    object dictionary = module.attr("__dict__");
    object result = dictionary["result"];
    int result_value = extract<int>(result);
    
    std::cout << result_value << std::endl;
    
    dictionary["result"] = 20;

    PyRun_SimpleString("print result");
  } catch (error_already_set const &) {
    PyErr_Print();
  }
  
  Py_Finalize();
  return 0;
}

Next, you need to add BOOST_PYTHON_STATIC_LIB to the list of preprocessor defines in the project settings. This prevents Boost.Python from defining functions as __declspec(dllimport) or __declspec(dllexport). Don't forget to do this for both Debug and Release mode. Also, Boost.Python needs RTTI enabled to function properly, so turn that on, and switch to the multi-threaded DLL versions of the C runtime library. You can get very strange and annoying runtime errors if you neglect either of those steps.

If you follow the same structure for your include directory hierarchy you then get a number of "unable to open header" errors. You just need to prefix Python/ to the header file names in the problem files. So #include <patchlevel.h> becomes #include <Python/patchlevel.h>. Alternately you can add the actual Python header directory to your include path and not follow the way I do things.

While we're messing with the Boost files, you should uncomment the #define BOOST_ALL_NO_LIB line in boost/config/user.hpp to tell Boost that we aren't trying to use the bjam-built DLLs.

Now add all the files in the boost_1_32_0/libs/python/src directory to your project. You may get a few more "unable to open header" errors, but they have the same fix as before.

After all that, you can run the program and it should produce the number 25 and the number 20 on the console.

As a side note, on MSVC it is important not to include Python.h directly. Doing this in debug mode may cause the linker to try to load a version of the Python link library that doesn't come with the Python MSI installation package. Even if you have the Python source and have built both versions of the Python link library, it may cause your program to try to link against both versions of the Python library simultaneously, which will cause no end of unhappy linker messages.

In OS X with XCode

Building an application that embeds Python for Mac OS X with Boost.Python is slightly different. In OS X 10.4 (Tiger), Python 2.3 comes as a Framework. The first step is to actually download the boost library and uncompress and untar the archive.

In XCode create a new project as a "C++ Command Line Tool". Now add some Boost.Python test code to the main.cpp file:

#include <boost/python.hpp>
#include <iostream>

int main(int, char **) {
  using namespace boost::python;

  Py_Initialize();
  
  try {
    PyRun_SimpleString("result = 5 ** 2");
    
    object module(handle<>(borrowed(PyImport_AddModule("__main__"))));
    object dictionary = module.attr("__dict__");
    object result = dictionary["result"];
    int result_value = extract<int>(result);
    
    std::cout << result_value << std::endl;
    
    dictionary["result"] = 20;

    PyRun_SimpleString("print result");
  } catch (error_already_set const &) {
    PyErr_Print();
  }
  
  Py_Finalize();
  return 0;
}

Now go to the project window and Add/Existing Framework/Python.Framework. Then in source files, add a new group, call it something like "Boost Python Sources" and then Add / Existing Files / boost_1_32_0/libs/python/src, highlight all the files and the folders in the directory and hit Add. Select the radio button for "Recursively create groups for added folders" and hit Add again.

Now go to Project Settings and add boost_1_32_0/ to the Header Search Paths under Styles. (At this point I also disable ZeroLink, but that's a personal preference.)

At this point it should give you several build errors in the Boost headers. In order to build properly when using the Python framework, the #include directives in those headers that reference Python headers need the Python/ prefix prepended to the file name. So #include <patchlevel.h> becomes #include <Python/patchlevel.h>. Alternately you can add the directory containing the Python framework's header files to the list of Header Search Paths under Styles in the Project Settings.

Either way, once that is finished you should be able to build the project, and 25 and 20 should show up in the log window as expected.

As a note, the project should also have RTTI enabled, which XCode should do by default.

Of course, you can use a non-framework version of Python as well. To do so, I suggest building Python from source as normal: ./configure ; make ; make install will build and install Python on your computer. Then, instead of adding the Python Framework to your project, add libpythonx.y.a (e.g. libpython2.4.a) from the root of the Python source directory. You may also need to add the /usr/local/include directory to your header search path, and you may need to create a symlink named Python/ that points to the python2.4/ directory so that the headers can be found. Or just add /usr/local/include/python2.4 to your header search path.


1) For those unfamiliar with Python and reading this article to evaluate embedding Python as a scripting language, I recommend Dive into Python as an introduction to the Python programming language.

2) Technically you can also leave off the Py_Finalize(), though that would be bad manners.

3) Unless you chose a different method when invoking the configure script.

4) The preprocessor contortions that Boost.Python uses to link to the Python library in debug mode also prevent it from compiling on MSVC .NET 2005 with Boost 1.32 or 1.33.0. This seems to have been fixed with Boost 1.33.1, and release mode builds will also work with Boost 1.32 or 1.33.0.

5) As opposed to generating a runtime assertion, dumping core or otherwise generating a runtime error.

6) This includes compiling the code and generating the hello.pyc file.

7) In actuality, the current C implementation of the Python interpreter pre-creates integer objects for the range of -1 to 100. So if you create an integer value in that range you get a reference to one of the pre-existing integer objects. The Python/C API mentions that it would be theoretically possible to change the value of 20 via this returned object; the resulting behavior being undefined.

8) This seems to be one of the most frequently encountered problems with embedding Python with Boost.