About Charles Law

Charles started out in Systems Engineering, designing algorithms, but became a programmer once discovering Python's clean syntax. He has experience using many popular products and services in production environments including web2py, uwsgi, AWS, and OpenShift, as well as experience setting up front-ends in Javascript/Python. He tends to cover his experiences, and notes, from setting up servers, and talks about his experiments with different front-end tools.

Imports, Namespaces, and References

References I had a decent idea of how importing and namespaces worked, but it wasn’t until I started setting up unit tests that everything clicked. False assumptions I had about namespaces and imports caused test failures. It forced me to do things the right way, so that I was changing the right objects instead of creating objects in random places. I learned how most things are references, and when you think of variables that way, namespaces and imports make a lot of sense.

Some basics

When you write a module, you start defining things: functions, variables, classes, etc.. When you define something, Python creates an object in memory. The variable name you use is how you refer to that chunk of memory. Everything is references behind the scenes.

A namespace is a container for these references. It maps variable names to addresses in memory. Any module you write has its own namespace. When you define variable mylist in your module, you are defining mylist in your module’s namespace. Variables defined in other modules, or I should say in other modules’ namespaces, are not accessible in your module – unless you import them.

You will work with open source libraries, and to unlock their power you need to access their functions and classes which are in the library’s namespace. You gain access though importing. At a high level Python imports modules by doing the following:

  1. See if the module name is in sys.modules – this like a cache of imported modules.
  2. If not, look for the file that matches the module name
  3. Run the module being imported. This setups up all global variables, including variables for classes & functions
  4. Put the loaded module into sys.modules
  5. Setup references so you can access imports – where the fun happens.

I want to explain these in a little more detail as well as how they affect modifying variables inside of an import. I will go through the steps out of order because step 1, loading from the cache, doesn’t mean much except after step 4, putting a module into the cache.

Step 2: Finding what to import

Finding the modules is pretty useful and interesting, but does not have a lot to do with references so I am skipping the details here. Also, the Python docs outline this clearly and succulently:

https://docs.python.org/3.3/tutorial/modules.html#the-module-search-path

Step 3: Executing Code

After finding the file, Python loads and executes it. To give a concrete example, I have 2 simple modules:

cars.py

car_list = [1]

def myfun():
    return car_list

print "Importing..."

dealer.py

import cars

If you run cardealer.py, you see the following:

> python cardealer.py
Importing...

So by loading and executing I mean Python runs everything at the top level. That includes setting variables, defining functions, and running any other code, such as the print statement in cars.

It also does this in each module’s own namespace. Defining car_list in cars basically adds a pointer in cars to this new list. It does not affect dealer, which could have its own, completely independent variable named car_list. After the import is done, dealer will also not have direct access to car_list. cars has the reference/pointer to that object in memory, and the only way dealer can access the list is to go through cars: cars.car_list.

Step 4 & Step 1

Caching takes place after loading the module. Python stores the module in a module class object, which it puts into the sys.modules dictionary. The key is usually the name of the module, with 1 exception. If the module is the the one you are directly running, its key is ‘main‘. (Aside: this is why we use if name == ‘main‘).

What’s cool is that you can access modules through sys.modules. So back to the simple module. After importing it, you can access the list as cars.car_list. You can also access it through sys.modules:

>>> import sys
>>> sys.modules['cars'].car_list
[1]

The caching in sys.modules has many effects. Importing a cached module means it doesn’t get executed again. Reimporting something is cheap! It also means that all the classes, functions, and variables defined at the top level are only defined once, and statements only execute once. The print statement in cars won’t print if it is imported again, or imported anywhere else in the same process.

Step 5: Reference Setup

This is where the most interesting stuff happens. Depending on how you import something, references are setup in your module’s namespace to let you access what you imported. There are a few variations, and I have some code that shows what Python effectively does. Remember, the modules are already loaded into sys.modules at this point.

  1. import cars

    cars= sys.modules['cars']
  2. import cars as vehicles

    vehicles= sys.modules['cars']
  3. from cars import car_list

    car_list = sys.modules['cars'].car_list

No matter how you import something, code from a module always runs in its own namespace. So when cars looks up a variable, it looks it up in cars’s own namespace, not who imported.

Mutability

Imported variables behave is the same way as regular variables, which depends on mutability.

Remember, when Python creates an object, it puts it somewhere in memory. For mutable objects (dict, list, most other objects), when you make a change to that object, you change the value in memory. For immutable objects (int, float, str, tuple), you can never change the value in memory. For immutable objects you can reassign the variable to point to a new value, but you cannot change the original value in memory.

To illustrate this I can use the id function which prints an address in memory for a variable. I modify the value of an int and a list.

Immutable:

>>> a = 1
>>> id(a)
140619792058920
>>> a += 1
>>> id(a)
140619792058896

Mutable

>>> b = [1]
>>> id(b)
4444200832
>>> b += [2]
>>> id(b)
4444200832

When I change the value of a, the immutable integer, the id changes. But when I do the same for a list, the id remains the same.

There is a sneaky catch though! Many times when you assign a variable, even a mutable one, you change the reference.

>>> b = [1]
>>> id(b)
4320637944
>>> b = b + [2]
>>> id(b)
4320647936

Behind the scene, Python is creating a new list, and changing where the b reference is pointing. The difference can be subtle between this case and the += case above. I wouldn’t worry about it much, it’s never affected me, but it’s good to be aware.

Modifying Imports

So how can I change values in imported modules? And how does mutability affect imports?

The way you access, and sometimes change, variables in imported modules may, or may not, have an effect on the module. This is where unit tests taught me a lot, since my tests imported modules and tried to set variables in order to setup certain test cases. I learned it all comes down to references. There are a few key rules.

Modules execute their code in their own namespace

When a function in cars tries to use car_list, it looks up car_list in cars namespace. That means:

The module is only affected if:

  • you update the object data behind that reference, or
  • **you change where the reference points in the module’s namespace **

There are some simple ways to update what’s behind the reference – which works no matter how you import. Let’s run this code:

tester_1.py

import cars
from cars import car_list, myfun
# update data through the cars module
cars.car_list.append(2)
print "updated through cars:", cars.myfun()
# update through my namespace
car_list.append(3)
print "updated imported var:", myfun()

This outputs:

> python tester_1.py
Importing...
updated through cars: [1, 2]
updated imported var: [1, 2, 3]

It’s trickier to change cars’s reference, which does depends on how you import:

tester_2.py

import cars
from cars import car_list, myfun
# change referece through the cars module
cars.car_list = [2]
print "changed cars reference    :", cars.myfun()
# the imported variable still points to the original value
print "test original import      :", cars_list
# changing the imported variable will not affect the "real" value
# so this should print the same as the first call
car_list = [3]
print "changed imported reference:", myfun()

This outputs:

> python tester_2.py
changed cars reference    : [2]
test original import      : [1]
changed imported reference: [2]

In both cases, we are creating a new list, and changing what the variables are referencing. But remember, myfun is in cars, and gets variables from the cars namespace. In the first case we are going to the cars namespace, and setting its car_list to this new list. When myfun runs, it gets the reference we just updated and returns our new list.

When we from cars import car_list we create a car_list in the tester_2 namespace which points to the same list cars has. When cars’s reference is set to [2], tester_2′s reference stays unchanged, which is why it keeps the value [1]. When we update car_list in tester_2′s namespace, it only changes where tester_2 is pointing. It is not changing where cars points (and not changing the value in memory where cars points), so it has no effect on how the cars module runs. So when we check myfun, it returns [2].

Final thoughts

When you run into a problem where you set a variable in one module, then access it though another, knowing what’s going on behind the scenes helps make sense of it all. With imports it’s easy to figure out what’s going on when you realize that variables are pointers to objects in memory. If your code is not doing what you expect when setting variables, then something is pointing to the wrong object. Knowing how Python works behind the scenes with namespaces and imports will help you figure out what’s going on. It’s really not bad once you realize it’s all just references.

Python & C

Python CAs promised, I took a script written in Python and ported parts of it to C. The Python interface is the same, but behind the scenes is blazing fast C code. The example is a little different from what I usually do, but I discovered a few things I had never seen.

I am working off the same example as the previous post, How to Write Faster Python, which has the full original code. The example takes an input image and resizes it down by a factor of N. It takes NxN blocks and brings them down to 1 pixel which is the median RGB of the original block.

Small Port

This example is unusual for me because it’s doing 1 task, resizing, and the core of this 1 task is a very small bit of logic – the median function. I first ported the median code to C, and checked the performance. I know the C code is better because C’s sorting function lets you specify how many elements from the list you want to sort. In Python, you sort the whole list (as far as I know) and so I was able to avoid checks on the size of the list vs. the number of elements I want to sort.

Below is the code, but I have a few changes I want to mention as well.

myscript2.py:

import cv2
import numpy as np
import math
from datetime import datetime
import ctypes

fast_tools = ctypes.cdll.LoadLibrary('./myscript_tools.so')
fast_tools.median.argtypes = (ctypes.c_void_p, ctypes.c_int)


def _median(list, ne):
    """
    Return the median.
    """

    ret = fast_tools.median(list.ctypes.data, ne)
    return ret


def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    num_elems = block_size * block_size
    block_b = np.zeros(num_elems, dtype=np.uint8)
    block_g = np.zeros(num_elems, dtype=np.uint8)
    block_r = np.zeros(num_elems, dtype=np.uint8)
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            num_elems = (r_max-row_min) * (c_max-col_min)
            block_i = 0
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b[block_i] = img[r_i, c_i, 0]
                    block_g[block_i] = img[r_i, c_i, 1]
                    block_r[block_i] = img[r_i, c_i, 2]
                    block_i += 1

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b, num_elems)),
                int(_median(block_g, num_elems)),
                int(_median(block_r, num_elems))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

myscript_tools.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // for malloc
#include <math.h>

int cmpfunc (const void * a, const void * b)
{
   return ( *(uint8_t*)a - *(uint8_t*)b );
}

int _median(uint8_t * list, int ne){
    //sort inplace
    qsort(list, ne, sizeof(uint8_t), cmpfunc);
    int i;
    if (ne % 2 == 0){
        i = (int)(ne / 2);
        return ((int)(list[i-1]) + (int)(list[i])) / 2;
    } else {
        i = (int)(ne / 2);
        return list[i];
    }
}

int median(const void * outdatav, int ne){
    int med;
    uint8_t * outdata = (uint8_t *) outdatav;
    med = _median(outdata, ne);
    return med;
}

and to compile:

gcc -fPIC -shared -o myscript_tools.so myscript_tools.c

Note: I discovered the .so filename should not match any Python script you plan to import. Otherwise you will get a “ImportError: dynamic module does not define init function” because Python is trying to import the C code instead of your Python module.

The main change I made was to port the median function to C. I also switched from using a list of colors, to using a numpy array of colors. I did this because it’s easier to pass numpy arrays, and because that is more useful in general. Also, so I wouldn’t have to convert the list to a numpy array on every median call, I crated the numpy array in the resize_median code itself. Before this last change, my benchmark showed I was actually running slower. The faster C code was not making up for all the list –> numpy array conversions!

After the 2nd fix I timed it

python -m timeit "import myscript2; myscript2.run()"

and got

500 loops, best of 3: 2.04 sec per loop

Nice! That’s compared to 2.51 from my last post, so it’s around 18% faster.

All C (sort of)

I wasn’t happy with the 18% boost – I thought it would be faster! But, it was only a very small portion ported. I decided to port more, but there wasn’t a good natural place to split the resize function up any further, so I ported the whole thing. This way I could also show how to send and receive a whole image back from C – which I do by reference/pointer.

myscript2.py

import cv2
import numpy as np
import math
from datetime import datetime
import ctypes

fast_tools = ctypes.cdll.LoadLibrary('./myscript_tools.so')

def c_resize_median(img, block_size):
    height, width, depth = img.shape

    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create a numpy array where the C will write the output
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    fast_tools.resize_median(ctypes.c_void_p(img.ctypes.data),
                             ctypes.c_int(height), ctypes.c_int(width),
                             ctypes.c_int(block_size),
                             ctypes.c_void_p(out_img.ctypes.data))
    return out_img


def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = c_resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

myscript_tools.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // for malloc
#include <math.h>

int cmpfunc (const void * a, const void * b)
{
   return ( *(uint8_t*)a - *(uint8_t*)b );
}

int _median(uint8_t * list, int ne){
    //sort inplace
    qsort(list, ne, sizeof(uint8_t), cmpfunc);
    int i;
    if (ne % 2 == 0){
        i = (int)(ne / 2);
        return ((int)(list[i-1]) + (int)(list[i])) / 2;
    } else {
        i = (int)(ne / 2);
        return list[i];
    }
}

void resize_median(const void * bgr_imgv, const int height, const int width,
                   const int block_size, void * outdatav) {
    const uint8_t * bgr_img = (uint8_t  *) bgr_imgv;
    uint8_t * outdata = (uint8_t *) outdatav;

    int col, row, c_max, r_max, c_si, r_si;
    int max_val;
    const int num_elems = block_size * block_size;
    int j, offset;

    uint8_t *b_values;
    uint8_t *g_values;
    uint8_t *r_values;
    b_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);
    g_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);
    r_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);

    int out_i = 0;

    row = 0;
    while (row < height) {
        col = 0;
        r_max = row + block_size;
        if (r_max > height) {
            r_max = height;
        }
        while (col < width) {
            c_max = col + block_size;
            if (c_max > width) {
                c_max = width;
            }

            // block info:
            j = 0;
            for (r_si = row; r_si < r_max; ++r_si) {
                for (c_si = col; c_si < c_max; ++c_si) {
                    offset = ((r_si*width)+c_si)*3;
                    b_values[j] = bgr_img[offset];
                    g_values[j] = bgr_img[offset+1];
                    r_values[j] = bgr_img[offset+2];
                    j += 1;
                }
            }

            // have the block info, get medians
            max_val = j;
            outdata[out_i]   = _median(b_values, max_val);
            outdata[out_i+1] = _median(g_values, max_val);
            outdata[out_i+2] = _median(r_values, max_val);

            // update indexes
            out_i += 3;
            col += block_size;
        }
        row += block_size;
    }

    free(b_values);
    free(g_values);
    free(r_values);
}

This one is a little more complicated. With multi-dimensional arrays, numpy flattens the data. The new array is [pixel00_blue, pixel00_green, pixel00_red, pixel01_blue, ...] – see the offset variable above for the row/column equation. I also pass pointers with no type, so before using the arrays, I have to typecast them. I realize this example is unusual because the whole script is basically ported, but it illustrates many things that are non-trivial and required some time to figure out.

And for the moment of truth…

500 loops, best of 3: 440 msec per loop

82% faster!

Although it’s a mostly C at this point.

Final Thoughts

So in this case, the Python took about 5 times as long to run compared to the C version. When something takes a few milliseconds, 5-10x isn’t that important, but when it gets more complex, it starts to matter (assuming you’re not I/O bound). Here I ported pretty much the whole module. I think this only makes sense with a library, and not code that you will refer to often – Python is just so easy to read.

An idea popped in my head as I was writing this as well. This should be easy to port to RapydScript. JavaScript has the Image object I could use in place of opencv. I wonder where JavaScript would fall on the spectrum.

How to Write Faster Python

Peugeot F1 This is a high level guide on how to approach speeding up slow apps. I have been writing a computer vision app on the backend of a server and came to the realization that it is easy to write slow Python, and Python itself is not fast. If you’re iterating over a 1000 x 1000 pixels (1 megapixel) image, whatever you’re doing inside will run 1 million times. My code ran an iterative algorithm, and it took over 2 minutes to run the first version of the code. I was able to get that time down to under 10 seconds – and it only took a few hours time with minimal code changes. The hardest part was converting some snippets of code into C, which I plan to detail in the future. The basic approach I followed was:

  1. Set a baseline with working code & test
  2. Find the bottlenecks
  3. Find a way to optimize bottlenecks
  4. Repeat

And fair warning, these sorts of optimizations only work if you have selected a good algorithm. As we used to say at work “You can’t polish a turd”.

Slow, but working code

I am going to start with an example program. This resizes an image using the medians of each region. If you resize to 1/4th of the original size the new image will be the medians of 4×4 blocks. It is not too important you understand the details, but I still included code for reference:

import cv2
import numpy as np
import math
from datetime import datetime

def _median(list):
    """
    Return the median
    """

    sorted_list = sorted(list)
    if len(sorted_list) % 2 == 0:
        #have to take avg of middle two
        i = len(sorted_list) / 2
        #int() to convert from uint8 to avoid overflow
        return (int(sorted_list[i-1]) + int(sorted_list[i])) / 2.0
    else:
        #find the middle (remembering that lists start at 0)
        i = len(sorted_list) / 2
        return sorted_list[i]

def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    #TODO: optimize this, maybe precalculate height & widths
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            block_b = []
            block_g = []
            block_r = []
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b.append(img[r_i, c_i, 0])
                    block_g.append(img[r_i, c_i, 1])
                    block_r.append(img[r_i, c_i, 2])

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b)),
                int(_median(block_g)),
                int(_median(block_r))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

This takes around 3 seconds to run against a random 1000×1000 image from Wikipedia (link):

500 loops, best of 3: 3.09 sec per loop

So this is a good starting point – it works! I am also saving the output image, so that when I make changes, I can spotcheck that I didn’t break something.

Profiling

Python’s standard library has a profiler that works very well (see The Python Profilers). Although I’m not a fan of the text output – I feel it does not intuitively roll up results – you can dump stats to a file which you can view in a UI.

So I can profile the script:

python -m cProfile -o stats.profile myscript.py

Note that it can sometimes add a lot of overhead.

And after installing runsnakerun I can view the stats:

runsnake stats.profile

which shows me: Runsnakerun GUI

So there are a few things that jump out as easy fixes. First, that app spends around 1/3rd of the time appending colors. Second, finding the median takes a significant amount of time as well – most of it from the sorted call – this is not visible in the image above. The other lines of code are not significant enough to display.

Optimizing

I have a few quick rules of thumb for optimizing:

  • creating/appending to lists is slow due to memory allocation – try to use list comprehensions
  • try not to run operations that create copies of objects unless it’s required – this is also slow due to memory allocation
  • dereferencing is slow: if you’re doing mylist[i] several times, just do myvar = mylist[i] up front
  • use libraries as much as possible – many are written in C/C++ and fast

Other than that, make use of search like Google or DuckDuckGo. You can tweak your Python code (you might discover you’re doing something wrong!), use Cython, write C libraries, or find another solution.

So profiling tells me that appending is slowing my code. I can get around this problem by declaring the list once and keeping track of how many elements I “add”. I also know that sorted() is not preferred because it creates a copy of the original list. I can instead use list.sort() and sort the list in place. I make these changes and run the code, and see the output is still good so I probably did not break it. Let’s time it.

500 loops, best of 3: 2.51 sec per loop

That’s almost 20% faster! Not bad for a few minutes of effort.

For completeness, here is the modified code:

import cv2
import numpy as np
import math
from datetime import datetime

def _median(list, ne):
    """
    Return the median.
    """

    #sort inplace
    if ne != len(list):
        list = list[0:ne]
    list.sort()
    if len(list) % 2 == 0:
        #have to take avg of middle two
        i = len(list) / 2
        #int() to convert from uint8 to avoid overflow
        return (int(list[i-1]) + int(list[i])) / 2.0
    else:
        #find the middle (remembering that lists start at 0)
        i = len(list) / 2
        return list[i]

def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    #TODO: optimize this, maybe precalculate height & widths
    num_elems = block_size * block_size
    block_b = [0] * num_elems
    block_g = [0] * num_elems
    block_r = [0] * num_elems
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            num_elems = (r_max-row_min) * (c_max-col_min)
            block_i = 0
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b[block_i] = img[r_i, c_i, 0]
                    block_g[block_i] = img[r_i, c_i, 1]
                    block_r[block_i] = img[r_i, c_i, 2]
                    block_i += 1

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b, num_elems)),
                int(_median(block_g, num_elems)),
                int(_median(block_r, num_elems))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

Repeat

Now that I optimized the code a little, I repeat the process to make it even better. I try to decide what should be slow, and what should not. So, for example, sorting is not very fast here, but that makes sense. Sorting is the most complex part of this code – the rest is iterating and keeping track of observed colors.

Final Thoughts

After having optimized a few projects I have a few final thoughts and lessons learned:

  • Optimizing a slow/bad algorithm is a waste of time, you’ll need to redesign it anyways
  • The granularity of the profiler is sometimes a little funny. You can get around this by making sections of code into functions – metrics on functions are usually captured.
  • The profile will group the calls to the library functions together even if called from different places. So be careful with metrics for libraries/utils – i.e. numpy.
  • Use search to find ways to optimize your code.

If all else fails, you can implement slow parts in C relatively painlessly. This is the option I usually go with, and I will detail it out with some code in the next few weeks.

Have fun speeding up your code!

Bringing a Project to Life

I have always found that building something out of nothing is hard – whether it be a new feature, or a new product. It’s much easier to fix a bug or make an improvement to working code. Yet, I’ve been on some ambitious projects that got done (building a completely new API for a large ad serving company & a radar program that was designed & built from scratch in 3 years, to name the most difficult). Even though many people thought it was a miracle we got these done, I noticed some techniques that were useful in bringing these projects to fruition which apply to open source projects as well.

Before getting started, you must first give up the idea of building a finished product. You need to make something just usable to start. If you have something useable (and useful), people will start using it – and for open source projects, some of those uses will help make it better. After a project is brought to life, it can evolve into something more – a finished product. The difficultly then lies in making that first usable product.

Fail Fast

Nothing has been accomplished yet. You have to start planning and try to eliminate bad ideas before they take up too much time. The most efficient way of doing this is by getting feedback from others. If you find experts, or even just people with different perspectives, they can draw upon their experiences to quickly point out problems, and save you spending days or weeks working on a solution that will never work.

In addition to getting feedback, prototype things and make proof of concept models. Most importantly, don’t invest a lot of time here. You don’t need pretty code – but it should be accurate. The goal is to wrap your head around the problem and discover potential solutions and failures early – not to write something maintainable. The models themselves aren’t important, the data/information you learn from these experiments is what matters.

Build a pre-alpha version

You need to get something working that you can build off of. Most projects have more than 1 component. Even small web apps have a view & controller. Connecting the different parts together is difficult because it is easy to make assumptions about how & what data passed between interfaces.

On the radar system I worked on, many properties of the raw input signal were not planned to be passed along to processes further down the line. But after building the first version of the code, that format of that data quickly changed to include additional information. And after the first version of every component was done, we could look at a display, albeit a very crude one, and see a real object – that is very motivating.

Focus, Iterate & Test

The radar system I worked on was also hard because we were short on time due to an accelerated schedule – time was our most valuable resource. The main reasons we were still able to finish was because the motto of the project was an idea from Voltaire: “perfect is the enemy of good enough”. We were told not to spend any time working on something that already met the specs – we could not spare it. Instead we had to find something that was not good enough, and focus there.

This list of areas we needed to focus became our list of improvements/bugs. We made it past the difficult part of building something from nothing with our pre-alpha version, and now we were working on fixes and improvements – the easy part. After repeating the focus & fix process many times, we got to the point where we had a useable product.

Also crucial here is testing. You need to test so you can confidently make changes without causing regressions. The less effort required to run them, better (automated ;))

Share your project

Alex Martelli gave a talk at PyCon 2013, “Good enough is good enough”. He said, if you make something people need, they will use it, even if it’s not perfect. The catch here is it has to be something people find useful. If enough people use your open source project, some of them will contribute back, fix bugs, and maybe even clean up and optimize code. Even if people do not fix anything, they can file issue tickets. This last weekend we had someone file several issues with example code – perfect test cases.

Erlang & Ejabberd on OpenShift

I’m mixing it up a little here – this post is about the backend for our GrafPad site and not related to Python or RapydScript. We will probably be using an XMPP for some new GrafPad features. Sure, there are Python XMPP servers, but we won’t touch the XMPP server code, we’ll only talk to it, so language doesn’t matter. We’re going to use the most proven XMPP server, ejabberd, which happens to written in Erlang. Unfortunately, setting up ejabberd, or any Erlang application for that matter, is a nontrivial task on OpenShift. In most places Erlang likes to automatically bind to 0.0.0.0 and/or 127.0.0.1, which is something not accessible on OpenShift. Even when you provide an IP list to serve on, Erlang interally adds 127.0.0.1. On top of that, Erlang likes to use ports that are not allowed on OpenShift.

ejabberd on OpenShift After messing around for a day or 2, I discovered that these issues cannot be overcome by configs – the changes had to be made in the original Erlang source. Also, fair warning – my goal was to get something working. The solution I have works, but it’s not production ready. I have a repo with the modified Erlang/OTP source, so feel free to make this better!

There are 2 core problems that need to be solved. It turns out the issues all come from 2 things:

  • You cannot bind to 0.0.0.0 or 127.0.0.1
  • The only usuable ports are 8080, and 15000-30000.

Solving these problems is tricky. I originally set out to find everywhere where something binded to 0.0.0.0 or 127.0.0.1 and change where they bind to, but there were still issues starting it up even after changing the values everywhere obvious in the sources (grepping for {0,0,0,0} and 0.0.0.0). I ended up chaging the binding code to change the IP to the $OPENSHIFT_DIY_IP if the address is 0.0.0.0 or 127.0.0.1. I liked this solution a lot better since it accomplishes exactly what I want, and I won’t have to make changes to libraries that try to bind to 0.0.0.0.

So how can you get this setup? It should be pretty easy using my Erlang source. I thought about putting this in a repo that would build when the repo was pushed, but the build takes more than 1 hr so it always gets stopped midway though. I saw messages like Shell command '.../.openshift/action_hooks/build' exceeded timeout of 3516. Instead you have to SSH onto the machine and run the following script manually:

cd $OPENSHIFT_TMP_DIR
wget https://github.com/charleslaw/otp/archive/openshift.zip -O openshift.zip
unzip openshift.zip
cd $OPENSHIFT_TMP_DIR/otp-openshift
sed -i 's/{0,0,0,0}/'"{${OPENSHIFT_DIY_IP//[.]/,}}"'/g'  ./lib/erl_interface/src/connect/eirecv.c
ERL_ROOT=$OPENSHIFT_DATA_DIR/erl_home
./otp_build autoconf
./configure --prefix=$ERL_ROOT --without-termcap
make
make install

After it is done, you can test it by starting epmd and seeing that it works:

$OPENSHIFT_DATA_DIR/erl_home/bin/epmd -address $OPENSHIFT_DIY_IP -debug

Next, you have to install ejabberd. You can do this running this script:

ERL_ROOT=$OPENSHIFT_DATA_DIR/erl_home

cd $OPENSHIFT_TMP_DIR
wget http://downloads.sourceforge.net/expat/expat-2.1.0.tar.gz
tar xzvf expat-2.1.0.tar.gz
cd expat-2.1.0
./configure --prefix=$ERL_ROOT
make
make install

cd $OPENSHIFT_TMP_DIR
wget http://github.com/processone/ejabberd/archive/v2.1.13.tar.gz -O v2.1.13.tar.gz
tar xvf v2.1.13.tar.gz
cd ejabberd-2.1.13/src/
export PATH=$ERL_ROOT/bin:$PATH
./configure --prefix=$ERL_ROOT
make
make install

sed -i 's/localhost/$OPENSHIFT_DIY_IP/g' $ERL_ROOT/sbin/ejabberdctl
sed -i 's/localhost/'"$HOSTNAME"'/g' $ERL_ROOT/etc/ejabberd/inetrc
sed -i 's/127,0,0,1/'"${OPENSHIFT_DIY_IP//[.]/,}"'/g' $ERL_ROOT/etc/ejabberd/inetrc
sed -i 's/127,0,0,1/'"${OPENSHIFT_DIY_IP//[.]/,}"'/g' $ERL_ROOT/etc/ejabberd/ejabberdctl.cfg
sed -i 's/5280/8080/g' $ERL_ROOT/etc/ejabberd/ejabberd.cfg

EDIT: (added based on feedback) Make sure port 8080 is free. If you run ps -ef and see a ruby app running, you’ll need to kill it (If you really want to make sure it’s running on port 8080 run netstat -tulpn | grep $OPENSHIFT_DIY_IP).

Next you can start ejabberd running the following 2 commands, which you’ll want to put in your .openshift/action_hooks/start script:

$OPENSHIFT_DATA_DIR/erl_home/bin/epmd -address $OPENSHIFT_DIY_IP &
$OPENSHIFT_DATA_DIR/erl_home/sbin/ejabberdctl start

If there are no errors, you should be all set! At this point you’ll want to configure ejabberd to your own liking. You will need a user to login to the admin interface. If you don’t want to keep localhost as an XMPP host, change it in $OPENSHIFT_DATA_DIR/erl_home/etc/ejabberd.cfg. You’ll need an admin user on one of your hosts, so run the following command (using your host instead of localhost):

$OPENSHIFT_DATA_DIR/erl_home/sbin/ejabberdctl register admin localhost password1234

Connect to OpenShift My app was at http://erl-jabberserver.rhcloud.com so I went to http://erl-jabberserver.rhcloud.com/admin and logged in using username [email protected] / password1234. Success! I can also connect to it using any XMPP software that supports BOSH. The default setup does not have encryption, so make sure not to require it for testing.

So this works for us since we’re only experimenting with this at this point. But it could be better. Specifically, right now when port 0 is specified, I pick a random port between 20001 & 30000. This should probably instead pick a port that is not used in that range. There may be other issues that I just haven’t run into, but this is a good start.

Python to Javsacript: Compilers vs. Translators

One thing Alex and I always say about RapydScript is that it is really JavaScript with a Pythonic syntax. Following that, people often ask me what that means for them – they want to know how this affects how they develop code, and how it is different than something like Pyjamas/Pyjs. I want to answer that here to for anyone that has wondered what “RapydScript is Pythonic JavaScript” means, and how compilers like Pyjs are different from translators like RapydScript, and why I (full disclosure) prefer translators.

An Example

A clear example of the difference is with division. Say your source looks like:

a = 5
b = 0
c = a / b

The translator will output JavaScript like:

var a = 5;
var b = 0;
var c = a / b;

The translator process is very simple to understand – it’s pretty much just changing the syntax, but this leads to some gotcha’s. When this code runs, c will be set to Infinity, a JavaScript constant, while the original Pythonic source would have raised an Exception.

A compiler, on the other hand, attempts to mimic Python exactly so it may have an output more similar to:

var py_int = function(val){
    this.val = val;
};
py_int.prototype.div = function(denom){
    if (denom.val == 0) {
        throw ZeroDivisionError;
    }
    return py_int(self.val / denom.val);
};
var a = py_int(5);
var b = py_int(0);
var c = a.div(b);

The variables here will all be objects that include methods for all the operations. The division doesn’t directly divide 2 numbers, it runs the divide method in the objects. So when this code runs it will throw a ZeroDivisionError exception just like Python does.

So what are the tradeoffs?

Writing using a compiler is nice because you get to think like a Python developer, which can abstract away some things like cross browser support. It also means that, in many cases, code can be moved between the frontend and backend with no changes. So it’s easy to have Python code compile to Javascript. But if you’re doing something that’s JavaScript specific, like getting HTML elements, taking in keyboard inputs, etc, the compiler you’re using will have to have a working and documented API for accessing these functions.

The real drawback, though, is with the output code is slower, significantly heavier, and, with the compilers I’ve used, unreadable. There are several issues I have with this, but it really boils down unreadable code leads to usless tracebacks when running code in a browser, and apps don’t run (or run well) on mobile devices.

Translators take a very different approach and don’t try to run just like Python. The idea here is to do 80% of the work for 20% of the cost. Your code may look like Python but it will run like JavaScript, as with the division example. There’s a lot of overlap between the two languages, but they’re not exactly the same, so the main drawback here is that you might see some unexpected, but predictable, behavior. This is very easy though if you know JavaScript, and if not, it’s easy to learn the differences.

There are some nice benefits to using a translator. Your input code and output code will look very similar. They will be roughly the same size and it will be easy to map a line with an error in the JavaScript output back to the Pythonic input for easy debugging. The output will be on the order of kB instead of MB and will run faster.

So those are the main differences between compilers and translators like RapydScript. RapydScript is really JavaScript behind the scenes so it will behave differently than Python, which lets it run a lot more efficiently.

Which is right for you?

The choice of which to use comes down to a few things:

  • First, something I have not mentioned, libraries. In general, JavaScript libraries work better with translators and Python libraries work better with compilers. The one caveat is compilers can only translate Pure Python, so if you’re using something like numpy, which uses C, there’s no easy answer for you.
  • Second, if you don’t know any JavaScript, you will have a tougher time with a translator. Speaking from experience though, JavaScript is not very different from Python, and I encourage you to try a translator because you’ll save time debugging, and your app will be more maintainable in the long term.
  • Lastly is performance requirements. If performance is important, you will want to use a translator over a compiler.

I think it makes sense for a beginner who may writing a simple internal app that won’t have any performance requirements to use a compiler. But if you’ll be writing many apps, or even a complex one, I would go with a translator. Having picked up the differences between JavaScript and Python, I now exclusively use RapydScript, a translator, even if I don’t need the performance. It’s easier to debug in the browser where it actually runs, which saves me time.

Pyjamas and Web2py

UPDATE: Pyjamas has since been renamed Pyjs and is under new leadership. Everything is still backwards compatible.

At this point, if you’ve been following along the posts, you should know how to create a simple web2py application. In this post I’m going to describe how to write a page that can connect to a backend written in web2py. This is another step on the path of having an app on GAE, in fact, the code we write here will get deployed on GAE. This code will also run on any system, including your own computer, and avoids the lock-in some people experience when developing for GAE.

There are 2 ways that I use for communicating with a web server are RESTful JSON calls, and JSON-RPC calls. Instead of covering both I plan on just showing how to use JSON-RPC. I suggest when you design your app first search online for comparisons between REST & JSON-RPC to see the trade-offs.

Before we get into the code, I also have a slight curve-ball. I originally wrote this code using Pyjs, but I’ve switched to using Rapydscript for all my development. In a future post I’ll show how to connect Rapydscript to GAE, which I plan to link here. I suggest reading Alex’s earlier post https://blogs-pyjeon.rhcloud.com/?p=301 for a good summary of various Python to JS compilers and their trade-offs.

web2py Services

Time to code!

First you need to setup your web2py app to use web2py’s services module. This lets your application work as a web service so it can respond to calls from clients, including support for JSON-RPC calls. Inside of models/db.py add the following lines:

from gluon.tools import Service
service = Service()

In the controller, add a call function that returns the service. Then all you have to do is add a decorator to each functions you want to act as services receiving jsonrpc calls and web2py handles the rest. In controllers/default.py I have added the following:

@service.jsonrpc
def myfun(data_from_JSON):
    return data_from_JSON.upper()

def call():
    return service()

The example function here, myfun, will make everything uppercase. Also worth nothing is data_from_JSON is already decoded data from the JSON request.

To access this service use the URI ‘/cyborg/default/call/jsonrpc’. For more information on services check out http://web2py.com/books/default/chapter/29/10#Remote-procedure-calls.

Pyjs Clients

I’ve had this code floating around my computer for a couple years now. I’ve made several minor changes, some just because I wanted slightly lighter code, and some because of changes in Pyjs, but it was originally based on code by Amund Tveit on his blog (http://amundblog.blogspot.com/2008/12/ajax-with-python-combining-pyjs-and.html). This is a simple page with a text area for sending text to a JSON-RPC service.

from pyjamas.ui.RootPanel import RootPanel
from pyjamas.ui.TextArea import TextArea
from pyjamas.ui.Label import Label
from pyjamas.ui.Button import Button
from pyjamas.ui.VerticalPanel import VerticalPanel
from pyjamas.JSONService import JSONProxy


class JSONExample:
    def onModuleLoad(self):
        self.rpc_service = JSONProxy('/cyborg/default/call/jsonrpc', ['myfun'])

        self.text_area = TextArea()
        self.text_area.setText(r"Hello World")
        self.text_area.setCharacterWidth(80)
        self.text_area.setVisibleLines(4)

        button = Button('Send to Server', self)

        self.status = Label()

        panel = VerticalPanel()
        panel.add(self.text_area)
        panel.add(button)
        panel.add(self.status)

        RootPanel().add(panel)


    def onClick(self, sender):
        print('sending to server')
        self.status.setText('Waiting for response...')

        textarea_txt = self.text_area.getText()
        if self.rpc_service.myfun(textarea_txt, self) < 0:
            self.status.setText('Server Error')


    def onRemoteResponse(self, response, request_info):
        print('good response')
        self.status.setText(response)


    def onRemoteError(self, code, message, request_info):
        print('error')
        print(code)
        print(message)
        self.status.setText("Server Error or Invalid Response: ERROR  - " + str(message))


if __name__ == '__main__':
    app = JSONExample()
    app.onModuleLoad()

To make the call, you need an instance of the JSONProxy class, so I have the line

self.rpc_service = JSONProxy('/cyborg/default/call/jsonrpc', ['myfun'])

Then call your function using that instance, along with passing in an object/variable (which Pyjs encodes) to send to the service, and a reference to an instance of a class with onRemoteResponse and onRemoteError methods.

self.rpc_service.myfun(data, response_class)

In the example code above, response_class is self, so the JSON response comes through onRemoteResponse.

Putting it Together

Believe it or not, this was the trickiest part for me when I first got this working. Figuring out the correct URIs to use, and having code listening for the call was tough to debug. Luckily for anyone following along, the example code here already has everything setup correctly :).

We can start out just by making sure everything ties together. Go to your web2py folder then applications\cyborg\static. This is the static directory for the cyborg app. Now take all the output from Pyjs and put it there. I called my Pyjs file JSONExample.py, which generated JSONExample.html, so I access this using http://127.0.0.1:8000/cyborg/static/JSONExample.html. The ‘Send to Server’ button on the page should work. If it doesn’t, make sure you followed every step exactly. There might have also been changes/bugs in Pyjs or web2py – it’s not likely, but it has happened.

I personally wanted my main app to call the JSON-RPC service, not a file in the static director. I am going to have the index load the JSON page. First thing to do is cleanup the index function in default.py. I just want index to return the view, so index now returns an empty dictionary:

def index():
    return {}

I then replace the contents of applications\cyborg\views\default\index (the view for index() in the default controller) with the contents of the main Pyjs output file, JSONExample.html in my case.

The compiled html/js files are still in the static dir though. Remember, users are accessing the files in the views directory by accessing controllers. So if the other Pyjs files were in the views, you would still need to have a controller function to access them. There is no clean, simple way to put all your Pyjs cache files in the views directory. Instead we need to modify the new index.html to point to the files in the static dir. So the index.html code gets modified once for the module:

<meta name="pygwt:module" content="/cyborg/static/JSONExample">

and, in several places, for bootstrap.js:

<script language="javascript" src="/cyborg/static/bootstrap.js"></script>

Now you can visit http://127.0.0.1:8000/cyborg/ and send JSON-RPC calls!

One thing to note is that with the default routes calling /cyborg/, /cyborg/default, and /cyborg/default/index all load the same view/controller. If your JSONProxy class uses a relative link it might only work when visiting 1 of these URIs. That is why, in the Pyjs code, I refer to URI starting from /. When I was learning how to do this, I set everything using relative URI’s, like ../static/JSONExample, and ../default/call/jsonrpc, and that made everything difficult, so I stay away from that.

A Simple web2py App

I know this is waaaaay overdue, but better late than never. Over the last year and a half I’ve switch over to using Linux and away from Pyjamas (I’m still using python though). I’ll have more info in future posts over the next couple weeks. Don’t worry if you’re using Windows though, the steps here work in both Linux and Windows (thank you Python!).

Installing and Running web2py

Start by downloading web2py (source – version 2.2.1 for me). I am doing this in XP, so I extracted the code to C:\Projects\web2py. Open a command window, navigate to your web2py dir and start it up with the command

> python web2py.py -a tmp_pass

This starts up web2py on 127.0.0.1:8000 with the admin password set to tmp_pass. You can use the -h option to see how to set web2py up in other ways. One thing to note is if the server is running on 127.0.0.1 you won’t be able to access it using your real IP address. If you want to test your server using external computer have web2py use the IP 0.0.0.0.

With web2py running, I could then visit http://127.0.0.1:8000 where an sort of hidden admin interface button lets me login using my admin password, tmp_pass.

Creating an App

There is a panel on the right, with a section called “New simple application” which you can use to create an app. This sets up all the template files for you. In my case I created a program called cyborg.

The server shows a list of files which I could edit. It’s a lot of code, definately more than I wanted for my app, but that should be easy to cleanup later on. With the web2py server up, I navigated to http://127.0.0.1:8000/cyborg/. which showed a page with some interesting bullets, including:

  • You visited the url /cyborg/
  • Which called the function index() located in the file web2py/applications/cyborg/controllers/default.py
  • The output of the file is a dictionary that was rendered by the view web2py/applications/cyborg/views/default/index.html

These are the MVC files discussed in my previous post.

Simplifying the App

I decided to investigate each of the steps taken to run the code and try to trim unneeded code: 1) Well, this one is obvious

2) The default code has calls to use databases, and uploading files, etc. Actions I don’t plan on supporting, at least just yet. First, I want to get comfortable with everyting. I changed default.py to be the simplest function possible:

# -*- coding: utf-8 -*-

def index():
    return dict(message="Hello World", message2="How's it going?")

3) The view should get the dictionary from index() and render it. Really whats happening is the view is loaded, and any python code in the index uses the dictionary from index(). Python code is embedded between {{}}

{{if 'message' in globals():}}
<h3>{{=message}}</h3>
{{pass}}
<br>
{{if 'message' in globals():}}
<h3>{{=message2}}</h3>
{{pass}}

The only kind of gotcha I found was that you need to have a pass aligned with every if. I think this is because there isn’t a way to unindent in the html files.

Cleanup

Since I only plan on using a couple function to start, I want to remove all the files that seem unnecessary. I started by removing and checking the page still worked: I removed these folders completely (I think most get recreated by web2py when the app runs, but with just dummy files):

  • databases (this folder has the database files)
  • errors (this is a logs folder)
  • languages (translation files)
  • models (usually where you put information about your database)
  • private
  • static
  • uploads

And cleaned the folders:

controllers: Only kept default.py views: only kept views\default\index.html

I realoaded http://127.0.0.1:8000/cyborg/, and I was happy to see the messages I passed in through the dict.

Using Web2py

As promised in the GAE post, here is part 1 of using JSONRPC calls on GAE. I originally intended to write more of a how-to, but now that I’m halfway through, this seems like it should be it’s own post, and I’ll write a how-to part later.

If you’re reading this blog, you probably know how to write Pyjamas apps, but you might not know about web2py, and why or how to use it. I will give a quick overview on how web2py works in the 2nd half of this post, but if you want to just know about using web2py with Pyjamas, or deploying to GAE (on windows), skip to the next post.

I want to start out with a warning. You MUST be careful if you plan to develop an app on GAE. Let’s say hypothetically you have an idea for a business using webapps. You start out with no money and no users. GAE looks really appealing because you can get reliable hosting for free. A few months later you’ve grown and you’re ready to move to your own server. This is where you run into issues. All the server code was written for GAE. If you were good and your code is clean, you can copy several large chunks of code directly over to your new server, but interfaces for things like JSON calls, and interfaces to your database will all need to be rewritten. If you look around, you can find a lot of talk about “portability” and “lock in” and ways to get around this issue. It’s definately a hot topic when it comes to using GAE.

Lock in only happens if you aren’t careful up front. You don’t have to develop using GAE directly, instead I suggest using a web application framework, and build your code on top of that. At the very least, this will protect your code. I’m working on a small game, so I haven’t looked at all into moving database info, but this is something you might want to research. Worst case, you could write a web app to print out all your db info on a web page in a format that you can import to your new server. If anyone knows any good ways to move databases, feel free to leave a comment!

Back to frameworks. There are a couple frameworks available that support GAE. The two most popular are web2py and django. From what I read, neither framework is clearly better – each has its own strenths. It’s more about finding a framework that fits your needs more. For hobbyists, I’d recommend web2py. Web2py supports quicker development. A lot of people say it’s more intuitive and you can be more productive. I feel it aligns better with the Python philosophy. Django has it’s own strengths though. It gives you more control as a developer. Sometimes you’ll want that extra control, it’s really up to you.

If you’ve decided to go with web2py, or even if you didn’t, here is a quick overview of how it works from my non-SW perspective. There are 4 areas I heavily use inside each of my applications – model, controller, view, and the static dir. With that in mind, I want to start with the web2py URLs which take on the form of http://domain/application/controller/function (where function is closely related to view). You also have a static directory which can be accessed using http://domain/application/static/. The static dir is where you’ll have website images, Pyjamas apps, and other static files. There are some defaults setup so if someone visits http://domain/ your server will load something, but most URLs will take on the form http://domain/application/controller/function.

So what do controllers and functions do, and how are they related to views? Well you have a “controllers” folder in your application’s directory which is full of controller scripts made up of several functions. If someone visits http://domain/application/game/loadGame the server will call the game.py file in your controllers directory and run the loadGame() function in that file.

This leads to views. Some functions will be used to handle forms, or handle JSON calls, but your most important functions will return a dict(). When you return a dictionary, it is really returning a reference to the view with the same name as the function. Let’s say this loadGame() function returns a dictionary. In your views directory there should be a html file game/loadGame.html. This is what the server sends users when they visit http://domain/application/game/loadGame.

This views page will probably include links to js. files, images and other static files. This is the 3rd area, the static dir. When you have static files, you will host them here.

There’s one last area – models. The models area has the setup for any databases you’re using, and the setup for other things like the JSONRPC interface. When I tried adding users and storing high scores, I was using this file a lot.

So that’s web2py in a nutshell.

Pyjamas Applications on Google App Engine

The next topic I want to cover is writing an app on top of GAE. This one took me a while to originally figure out. I had absolutely no framework experience before, and no idea how GAE worked. Initially, I was very confused at first so I want to discuss how GAE works and what I learned. This post assumes you have the GAE SDK installed and you’ve created an application using the Google App Engine dashboard.

Let’s start with the most important file for your web app – app.yaml. Anytime someone visits any URL under your domain, http://yourapp.appspot.com, their request will go through the app.yaml file to figure out what to load. In proper terms, the app.yaml file forwards the URL path to a “request handler”. The app.yaml file only complicates things for web app with static files, but it becomes very powerful when you start building more complex apps that use services like JSONRPC.

My first GAE app used the helloworld example from Pyjamas. I’ll post the app.yaml file I used and explain what the different sections mean, and then I’ll explain how to package everything to upload it.

application: myapp
version: 1
runtime: python
api_version: 1
handlers:
- url: /
  static_files: output/Hello.html
  upload: output/Hello.html

- url: /*
  static_dir: output

The first 4 lines are standard. The only unique thing you’ll want is your application name. This application should already be created.

The rest of the file is a list of handlers. Each handler starts with a URL pattern. When a user visits a URL on your domain, GAE uses the first handler with a URL patterns that matches the client request. In the app.yaml file above the URL pattern for the first handler is /, which will catch requests for http://yourapp.appspot.com/. The line after the URL pattern tells us we have a static file handler. This static file handler will load output/Hello.html – the main HTML file for the helloworld app. This means that when a user visits http://yourapp.appspot.com/, they load output/Hello.html.

The 2nd handler is for catching calls to http://yourapp.appspot.com/*. This handler is for a static directory, not just a single file. This handler is here because Pyjamas compiles into multiple javascript/cache files which Hello.html loads. Files like Hello.nocache.html need to be accessible from URLs like http://yourapp.appspot.com/Hello.nocache.html in order for the app to work.

There’s one last type of handler: script handlers. I personally use the script handlers to add support for JSONRPC calls, which I will demonstrate in a future post. For the full details on how to configure your app.yaml file visit the GAE documentation at http://code.google.com/appengine/docs/python/config/appconfig.html .

Finally, with the app.yaml file complete, all you need to do is put everything into 1 nice package. I put the app.yaml file at the same level as the output directory (to be clear, 1 level above all the output files). Add this existing project to the Google App Engine Launcher, Deploy it, and you’re good to go!