Speed up Python with C++

Python is a popular programming language known for its simplicity and readability. One of its drawbacks, however, is slow execution, which becomes a real problem with complex computations or large datasets. Fortunately, there are several techniques that can make Python programs faster.

To improve performance, it is possible to develop the core application in Python and write the critical parts in C++.

That way, you get the best of both worlds: the ease of Python and the speed of C++ where it is needed.

There are many third-party libraries that enable Python – C++ interaction, but I will focus on the standard library module “ctypes”.

Project structure

  • Create a new Python project folder, including its virtual environment (venv).
  • Create a file called main.py (or any other name), which will contain our Python code.
  • Create a folder called clib (or any other name), which will contain our C++ stuff.

C++ coding

Create the file clib/kerem.cpp (or any other name), and put the following code inside.

#include <iostream>

class Hello {
public:
    // Prints a greeting so we can verify the call from Python
    void sayHello() {
        std::cout << "Hello from C++" << std::endl;
    }

    // Runs a loop 10M times; used for the performance measurement
    void countBig() {
        int a;
        for (int n = 0; n < 10000000; n++) {
            a = 1;
        }
    }
};

int main() {
    Hello hello;
    return 0;
}

// C interface for ctypes: plain functions that take the object pointer
extern "C" {
    Hello* hello_new() { return new Hello(); }
    void Hello_sayHello(Hello* hello) { hello->sayHello(); }
    void Hello_countBig(Hello* hello) { hello->countBig(); }
}

  • In the main section, we wrote a C++ class with two public methods. One simply prints “Hello from C++”, and the other one runs a loop 10M times. We will use this for performance measurement later on.
  • In the extern "C" section, we define a public interface targeting the C language. “ctypes” can only call functions with C linkage (it cannot call C++ methods directly), so each method gets a small C wrapper that takes the object pointer.

On the terminal, run the following commands inside the clib folder to compile kerem.cpp into a shared library file. The compiler on your OS may differ from mine, though; the commands below are for macOS.

g++ -c -fPIC kerem.cpp -o kerem.o
g++ -shared -o keremlib.so kerem.o

This will produce clib/keremlib.so, a shared library that Python can load through ctypes.

Some hints:

  • If you move your app to another OS / processor / etc., you will probably need to re-compile your C++ files there.
  • You could make the initial compilation process part of your setup.py if needed.
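The second hint could look roughly like the sketch below. It simply wraps the two compiler commands shown above so that a setup script can run them. The helper names build_commands and build are made up for this illustration, and the sketch assumes that g++ is on your PATH and that these flags suit your platform.

```python
import subprocess
from pathlib import Path


def build_commands(source, library, compiler="g++"):
    """Return the two compiler invocations from above as argument lists."""
    source = Path(source)
    obj = source.with_suffix(".o")  # kerem.cpp -> kerem.o
    return [
        [compiler, "-c", "-fPIC", str(source), "-o", str(obj)],
        [compiler, "-shared", "-o", str(library), str(obj)],
    ]


def build(source, library, compiler="g++"):
    """Run both steps and stop with an exception if the compiler fails."""
    for command in build_commands(source, library, compiler):
        subprocess.run(command, check=True)
```

Calling build("kerem.cpp", "keremlib.so") from inside the clib folder would then run the same two steps as the manual commands.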

Python coding

Open main.py and write the following code. Obviously, you need to change the path(s).

""" Main """
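from ctypes import cdll, c_void_p
import timeit

def runme():
    # Load the library. ctypes assumes int arguments and return values by
    # default, so we declare the object pointer explicitly as c_void_p.
    lib = cdll.LoadLibrary('/Users/kerem/python_cpp/clib/keremlib.so')
    lib.hello_new.restype = c_void_p
    lib.Hello_sayHello.argtypes = [c_void_p]
    lib.Hello_countBig.argtypes = [c_void_p]
    kerem_obj = lib.hello_new()
    lib.Hello_sayHello(kerem_obj)

    # python loop
    start = timeit.default_timer()
    for i in range(10000000):
        a = 1
    stop = timeit.default_timer()
    print("python: ", stop - start)

    # c loop
    start = timeit.default_timer()
    lib.Hello_countBig(kerem_obj)
    stop = timeit.default_timer()
    print("c++: ", stop - start)

if __name__ == "__main__":
    runme()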

  • First, we load the C++ library and let it print “Hello from C++”.
  • For performance measurement, we run a loop of 10M iterations using Python and C++.
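If you call the library from several places, it may be tidier to hide the raw function calls behind a small Python class. The following is only a sketch: the class name Hello and its method names are my own choice for illustration, and lib is assumed to be the object returned by cdll.LoadLibrary for keremlib.so.

```python
from ctypes import c_void_p


class Hello:
    """Thin Python wrapper around the C interface exported by keremlib.so."""

    def __init__(self, lib):
        # Declare pointer types once; ctypes would otherwise treat the
        # object pointer as a C int, which can truncate it on 64-bit systems.
        lib.hello_new.restype = c_void_p
        lib.Hello_sayHello.argtypes = [c_void_p]
        lib.Hello_countBig.argtypes = [c_void_p]
        self._lib = lib
        self._handle = lib.hello_new()

    def say_hello(self):
        self._lib.Hello_sayHello(self._handle)

    def count_big(self):
        self._lib.Hello_countBig(self._handle)
```

With this in place, the calling code shrinks to something like hello = Hello(lib) followed by hello.say_hello(). Note that the C interface above exports no function to delete the object, so the sketch never frees it.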


When we execute the Python code, we get the following terminal output.

Hello from C++
python:  0.22564460199999997
c++:  0.014554413999999988
  • The first line indicates that we were able to call the library successfully. “Hello from C++” is produced by our C++ library.
  • The second line shows the measurement of our Python loop covering 10M iterations.
  • The third line shows the measurement of our C++ loop covering 10M iterations.

It is plain to see that the C++ loop is roughly 15x faster than Python in this run (0.226 s vs. 0.015 s)!
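A single timing run can be noisy, so the exact ratio will vary from run to run and from machine to machine. As a rough sketch, the standard timeit module can repeat a measurement and keep the fastest run. The names python_loop and best_time are made up for this illustration.

```python
import timeit


def python_loop(iterations):
    """The same empty assignment loop as in main.py."""
    for _ in range(iterations):
        a = 1


def best_time(func, *args, repeats=5):
    """Run func(*args) several times and return the fastest run in seconds."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeats, number=1))
```

For example, best_time(python_loop, 10000000) measures the Python loop. Assuming lib and kerem_obj are set up as in main.py, best_time(lib.Hello_countBig, kerem_obj) would measure the C++ loop the same way.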

More Sophisticated Example

Here is an example which can manipulate strings. It follows the same logic. No defensive programming though.


replacer.hpp:

#ifndef replacer_hpp
#define replacer_hpp

class Replacer {
public:
    void replace2(const char* source, char* target);
};

#endif /* replacer_hpp */


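replacer.cpp:

#include <string.h>
#include "replacer.hpp"

// Copies source into target and overwrites the second character with 'a'.
// No bounds checking: the caller must pass a target buffer large enough
// for source plus its terminating null byte.
void Replacer::replace2(const char* source, char* target) {
    strcpy(target, source);
    target[1] = 'a';
}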


main.cpp (compiled into main.so):

#include <iostream>
#include "replacer.cpp"  // pulls in the implementation, so main.cpp compiles on its own

int main(int argc, const char * argv[]) {
    std::cout << "Hello, World!\n";
    Replacer rep;
    char replaced2[100];
    rep.replace2("zzz", replaced2);
    std::cout << replaced2;
    return 0;
}

// C interface for ctypes
extern "C" {
    Replacer* get_replacer() { return new Replacer(); }
    void replace2(Replacer* replacer, const char* source, char* target) {
        replacer->replace2(source, target);
    }
}


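app.py:

from ctypes import cdll, c_void_p, c_char_p, create_string_buffer

lib = cdll.LoadLibrary("/Users/kerem/Desktop/pytest/pytest/main.so")
lib.get_replacer.restype = c_void_p
lib.replace2.argtypes = [c_void_p, c_char_p, c_char_p]

replacer = lib.get_replacer()
from_str = b"12345"
# bytes objects are immutable, so we hand C++ a writable buffer instead
to_str = create_string_buffer(b"67890")
lib.replace2(replacer, from_str, to_str)
print(to_str.value)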

Output of app.py is:


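Since the string example skips defensive programming, here is a sketch of the checks one might add on the Python side. It does not call the C++ library at all. It only mimics what replace2 does, using a writable ctypes buffer, so that the checks are easy to try out. The function name replace_second_char and the capacity parameter are hypothetical.

```python
import ctypes


def replace_second_char(source: bytes, capacity: int = 100) -> bytes:
    """Copy source into a writable buffer and set its second byte to 'a'."""
    if len(source) < 2:
        raise ValueError("source needs at least two bytes")
    if len(source) >= capacity:
        raise ValueError("source does not fit into the target buffer")
    # create_string_buffer returns a zero-filled, mutable char array
    target = ctypes.create_string_buffer(capacity)
    target.value = source
    target[1] = b"a"
    return target.value


print(replace_second_char(b"12345"))  # prints b'1a345'
```

The same two checks (minimum length and buffer capacity) are what the C++ replace2 would need before it could be called safely with arbitrary input.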

There are many programming languages with overlapping functionality, but each language has its own pros & cons. No language can do everything equally well.

For instance, writing an entire web back end in C++ is theoretically possible and would run pretty fast if done right, but the development & support effort would be much higher than with a more suitable language that offers ready-to-use infrastructure, such as Python, Node.js or Java.

Runtime speed is not always the number one priority either. If something runs fast enough, other priorities typically take over.

In today's world, I believe that it makes sense to know multiple languages and use them where they really excel. We can usually make them interact easily anyway. Using “Python where we can, C++ where we must” (ref: Google) is a good example of this philosophy.







