Python & C

As promised, I took a script written in Python and ported parts of it to C. The Python interface is the same, but behind the scenes is blazing fast C code. The example is a little different from what I usually do, but I discovered a few things I had never seen.

I am working off the same example as the previous post, How to Write Faster Python, which has the full original code. The example takes an input image and resizes it down by a factor of N. It takes NxN blocks and brings them down to 1 pixel which is the median RGB of the original block.

Small Port

This example is unusual for me because it’s doing 1 task, resizing, and the core of this 1 task is a very small bit of logic – the median function. I first ported the median code to C, and checked the performance. I know the C code is better because C’s sorting function lets you specify how many elements from the list you want to sort. In Python, you sort the whole list (as far as I know) and so I was able to avoid checks on the size of the list vs. the number of elements I want to sort.
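
To make the "sort only the first N elements" point concrete, here is a minimal sketch that calls the C library's qsort directly from Python 3 through ctypes. It is not the code from this post: it assumes a C standard library can be found on the platform, and the names sort_prefix, _compare, and CMPFUNC are made up for illustration.

```python
import ctypes
import ctypes.util

# Assumption: a C standard library can be located on this platform.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Comparison callback type: int (*)(const uint8_t *, const uint8_t *)
U8_PTR = ctypes.POINTER(ctypes.c_uint8)
CMPFUNC = ctypes.CFUNCTYPE(ctypes.c_int, U8_PTR, U8_PTR)

libc.qsort.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CMPFUNC)
libc.qsort.restype = None


def _compare(a, b):
    # a and b each point at a single uint8 value
    return a[0] - b[0]


def sort_prefix(buf, ne):
    """Sort only the first ne elements of a ctypes uint8 array, in place."""
    libc.qsort(buf, ne, ctypes.sizeof(ctypes.c_uint8), CMPFUNC(_compare))


buf = (ctypes.c_uint8 * 6)(9, 3, 7, 1, 200, 100)
sort_prefix(buf, 4)  # the last two slots are left untouched
print(list(buf))
```

The second argument to qsort is the element count, so a buffer that is only partly filled (as happens for blocks at the image edge) can be sorted without slicing or copying it first.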

Below is the code, but I have a few changes I want to mention as well.

myscript2.py:

import cv2
import numpy as np
import math
from datetime import datetime
import ctypes

fast_tools = ctypes.cdll.LoadLibrary('./myscript_tools.so')
fast_tools.median.argtypes = (ctypes.c_void_p, ctypes.c_int)


def _median(list, ne):
    """
    Return the median.
    """

    ret = fast_tools.median(list.ctypes.data, ne)
    return ret


def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = int(math.ceil(float(width)/block_size))
    num_h = int(math.ceil(float(height)/block_size))

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    num_elems = block_size * block_size
    block_b = np.zeros(num_elems, dtype=np.uint8)
    block_g = np.zeros(num_elems, dtype=np.uint8)
    block_r = np.zeros(num_elems, dtype=np.uint8)
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            num_elems = (r_max-row_min) * (c_max-col_min)
            block_i = 0
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b[block_i] = img[r_i, c_i, 0]
                    block_g[block_i] = img[r_i, c_i, 1]
                    block_r[block_i] = img[r_i, c_i, 2]
                    block_i += 1

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b, num_elems)),
                int(_median(block_g, num_elems)),
                int(_median(block_r, num_elems))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

myscript_tools.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // for malloc
#include <math.h>

int cmpfunc (const void * a, const void * b)
{
   return ( *(uint8_t*)a - *(uint8_t*)b );
}

int _median(uint8_t * list, int ne){
    //sort inplace
    qsort(list, ne, sizeof(uint8_t), cmpfunc);
    int i;
    if (ne % 2 == 0){
        i = (int)(ne / 2);
        return ((int)(list[i-1]) + (int)(list[i])) / 2;
    } else {
        i = (int)(ne / 2);
        return list[i];
    }
}

int median(const void * outdatav, int ne){
    int med;
    uint8_t * outdata = (uint8_t *) outdatav;
    med = _median(outdata, ne);
    return med;
}

and to compile:

gcc -fPIC -shared -o myscript_tools.so myscript_tools.c

Note: I discovered the .so filename should not match any Python script you plan to import. Otherwise you will get an “ImportError: dynamic module does not define init function” because Python is trying to import the C code instead of your Python module.

The main change I made was to port the median function to C. I also switched from using a list of colors to using a numpy array of colors. I did this because it’s easier to pass numpy arrays, and because that is more useful in general. Also, so I wouldn’t have to convert the list to a numpy array on every median call, I created the numpy array in the resize_median code itself. Before this last change, my benchmark showed I was actually running slower. The faster C code was not making up for all the list –> numpy array conversions!

After the 2nd fix I timed it

python -m timeit "import myscript2; myscript2.run()"

and got

500 loops, best of 3: 2.04 sec per loop

Nice! That’s compared to 2.51 from my last post, so it’s around 18% faster.

All C (sort of)

I wasn’t happy with the 18% boost – I thought it would be faster! But, it was only a very small portion ported. I decided to port more, but there wasn’t a good natural place to split the resize function up any further, so I ported the whole thing. This way I could also show how to send and receive a whole image back from C – which I do by reference/pointer.

myscript2.py

import cv2
import numpy as np
import math
from datetime import datetime
import ctypes

fast_tools = ctypes.cdll.LoadLibrary('./myscript_tools.so')

def c_resize_median(img, block_size):
    height, width, depth = img.shape

    num_w = int(math.ceil(float(width)/block_size))
    num_h = int(math.ceil(float(height)/block_size))

    #create a numpy array where the C will write the output
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    fast_tools.resize_median(ctypes.c_void_p(img.ctypes.data),
                             ctypes.c_int(height), ctypes.c_int(width),
                             ctypes.c_int(block_size),
                             ctypes.c_void_p(out_img.ctypes.data))
    return out_img


def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = c_resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

myscript_tools.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // for malloc
#include <math.h>

int cmpfunc (const void * a, const void * b)
{
   return ( *(uint8_t*)a - *(uint8_t*)b );
}

int _median(uint8_t * list, int ne){
    //sort inplace
    qsort(list, ne, sizeof(uint8_t), cmpfunc);
    int i;
    if (ne % 2 == 0){
        i = (int)(ne / 2);
        return ((int)(list[i-1]) + (int)(list[i])) / 2;
    } else {
        i = (int)(ne / 2);
        return list[i];
    }
}

void resize_median(const void * bgr_imgv, const int height, const int width,
                   const int block_size, void * outdatav) {
    const uint8_t * bgr_img = (uint8_t  *) bgr_imgv;
    uint8_t * outdata = (uint8_t *) outdatav;

    int col, row, c_max, r_max, c_si, r_si;
    int max_val;
    const int num_elems = block_size * block_size;
    int j, offset;

    uint8_t *b_values;
    uint8_t *g_values;
    uint8_t *r_values;
    b_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);
    g_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);
    r_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);

    int out_i = 0;

    row = 0;
    while (row < height) {
        col = 0;
        r_max = row + block_size;
        if (r_max > height) {
            r_max = height;
        }
        while (col < width) {
            c_max = col + block_size;
            if (c_max > width) {
                c_max = width;
            }

            // block info:
            j = 0;
            for (r_si = row; r_si < r_max; ++r_si) {
                for (c_si = col; c_si < c_max; ++c_si) {
                    offset = ((r_si*width)+c_si)*3;
                    b_values[j] = bgr_img[offset];
                    g_values[j] = bgr_img[offset+1];
                    r_values[j] = bgr_img[offset+2];
                    j += 1;
                }
            }

            // have the block info, get medians
            max_val = j;
            outdata[out_i]   = _median(b_values, max_val);
            outdata[out_i+1] = _median(g_values, max_val);
            outdata[out_i+2] = _median(r_values, max_val);

            // update indexes
            out_i += 3;
            col += block_size;
        }
        row += block_size;
    }

    free(b_values);
    free(g_values);
    free(r_values);
}

This one is a little more complicated. With multi-dimensional arrays, numpy flattens the data. The new array is [pixel00_blue, pixel00_green, pixel00_red, pixel01_blue, ...] – see the offset variable above for the row/column equation. I also pass pointers with no type, so before using the arrays, I have to typecast them. I realize this example is unusual because the whole script is basically ported, but it illustrates many things that are non-trivial and required some time to figure out.
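
The offset arithmetic is easy to check in plain Python. This is only a sketch: it assumes a row-major (C-contiguous) layout with 3 channels per pixel, and it uses a tiny made-up 2x3 image instead of a real numpy array. The helper name flat_offset is hypothetical.

```python
def flat_offset(row, col, width, channels=3):
    """Index of the first channel of pixel (row, col) in a row-major flat buffer."""
    return (row * width + col) * channels


# A hypothetical 2x3 BGR image, one [b, g, r] triple per pixel.
image = [
    [[0, 1, 2], [10, 11, 12], [20, 21, 22]],
    [[30, 31, 32], [40, 41, 42], [50, 51, 52]],
]
height, width = 2, 3

# Flatten it in the same order a contiguous array is laid out in memory.
flat = [value for row in image for pixel in row for value in pixel]

offset = flat_offset(1, 2, width)
bgr = flat[offset:offset + 3]
print(offset, bgr)
```

The same expression appears in the C loop as ((r_si*width)+c_si)*3, with offset, offset+1, and offset+2 giving the blue, green, and red values.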

And for the moment of truth…

500 loops, best of 3: 440 msec per loop

82% faster!

Although it’s mostly C at this point.

Final Thoughts

So in this case, the Python took about 5 times as long to run compared to the C version. When something takes a few milliseconds, 5-10x isn’t that important, but when it gets more complex, it starts to matter (assuming you’re not I/O bound). Here I ported pretty much the whole module. I think this only makes sense with a library, and not code that you will refer to often – Python is just so easy to read.
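
The percentages and the "5 times" figure all come from the same three timings. As a quick worked check (the helper names are made up for illustration, the numbers are the ones reported above):

```python
def percent_saved(before, after):
    """Share of the original run time that was removed, in percent."""
    return 100.0 * (before - after) / before


def times_faster(before, after):
    """How many times longer the old version takes than the new one."""
    return before / after


tuned_python = 2.51  # seconds, tuned pure Python from the previous post
c_median = 2.04      # seconds, median ported to C
all_c = 0.440        # seconds, whole resize ported to C

print("median in C: {0:.1f}% less time".format(percent_saved(tuned_python, c_median)))
print("all in C:    {0:.1f}% less time".format(percent_saved(tuned_python, all_c)))
print("Python / C:  {0:.1f}x".format(times_faster(tuned_python, all_c)))
```

Note that "82% faster" here means 82% less run time, which is the same thing as the Python version taking between 5 and 6 times as long.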

An idea popped in my head as I was writing this as well. This should be easy to port to RapydScript. JavaScript has the Image object I could use in place of opencv. I wonder where JavaScript would fall on the spectrum.

How to Write Faster Python

This is a high level guide on how to approach speeding up slow apps. I have been writing a computer vision app on the backend of a server and came to the realization that it is easy to write slow Python, and Python itself is not fast. If you’re iterating over a 1000 x 1000 pixel (1 megapixel) image, whatever you’re doing inside will run 1 million times. My code ran an iterative algorithm, and it took over 2 minutes to run the first version of the code. I was able to get that time down to under 10 seconds – and it only took a few hours’ time with minimal code changes. The hardest part was converting some snippets of code into C, which I plan to detail in the future. The basic approach I followed was:

  1. Set a baseline with working code & test
  2. Find the bottlenecks
  3. Find a way to optimize bottlenecks
  4. Repeat
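
The loop above can be sketched in a few lines of Python 3. This is an illustration of the workflow, not code from my app: median_baseline and median_candidate are hypothetical stand-ins for "the working version" and "the version I am trying to make faster".

```python
import random
import timeit


def median_baseline(values):
    """Reference version: sorts a copy of the input."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        return (ordered[mid - 1] + ordered[mid]) / 2.0
    return ordered[mid]


def median_candidate(values):
    """Candidate version: sorts in place (this mutates the caller's list)."""
    values.sort()
    mid = len(values) // 2
    if len(values) % 2 == 0:
        return (values[mid - 1] + values[mid]) / 2.0
    return values[mid]


def check_same_output(trials=200):
    """Step 1: the candidate must agree with the baseline on random inputs."""
    rng = random.Random(0)
    for _ in range(trials):
        data = [rng.randrange(256) for _ in range(rng.randrange(1, 50))]
        if median_baseline(data) != median_candidate(list(data)):
            return False
    return True


assert check_same_output()

# Steps 2-3: measure both versions on the same input before keeping a change.
data = [random.randrange(256) for _ in range(256)]
for func in (median_baseline, median_candidate):
    seconds = timeit.timeit(lambda: func(list(data)), number=2000)
    print("{0}: {1:.4f} s".format(func.__name__, seconds))
```

The timings will differ from machine to machine, so the correctness check is the part worth automating; the numbers only tell you whether a change was worth keeping.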

And fair warning, these sorts of optimizations only work if you have selected a good algorithm. As we used to say at work “You can’t polish a turd”.

Slow, but working code

I am going to start with an example program. This resizes an image using the medians of each region. If you resize to 1/4th of the original size the new image will be the medians of 4×4 blocks. It is not too important you understand the details, but I still included code for reference:

import cv2
import numpy as np
import math
from datetime import datetime

def _median(list):
    """
    Return the median
    """

    sorted_list = sorted(list)
    if len(sorted_list) % 2 == 0:
        #have to take avg of middle two
        i = len(sorted_list) / 2
        #int() to convert from uint8 to avoid overflow
        return (int(sorted_list[i-1]) + int(sorted_list[i])) / 2.0
    else:
        #find the middle (remembering that lists start at 0)
        i = len(sorted_list) / 2
        return sorted_list[i]

def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = int(math.ceil(float(width)/block_size))
    num_h = int(math.ceil(float(height)/block_size))

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    #TODO: optimize this, maybe precalculate height & widths
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            block_b = []
            block_g = []
            block_r = []
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b.append(img[r_i, c_i, 0])
                    block_g.append(img[r_i, c_i, 1])
                    block_r.append(img[r_i, c_i, 2])

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b)),
                int(_median(block_g)),
                int(_median(block_r))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

This takes around 3 seconds to run against a random 1000×1000 image from Wikipedia:

500 loops, best of 3: 3.09 sec per loop

So this is a good starting point – it works! I am also saving the output image, so that when I make changes, I can spotcheck that I didn’t break something.

Profiling

Python’s standard library has a profiler that works very well (see The Python Profilers). Although I’m not a fan of the text output – I feel it does not intuitively roll up results – you can dump stats to a file which you can view in a UI.

So I can profile the script:

python -m cProfile -o stats.profile myscript.py

Note that it can sometimes add a lot of overhead.
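
The profiler can also be driven from inside a script, which is handy when you only want to measure one call. A minimal Python 3 sketch, where slow_sum is a hypothetical stand-in workload and not part of the resize code:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Stand-in workload (hypothetical): appends in a loop, then sums."""
    values = []
    for i in range(n):
        values.append(i * i)
    return sum(values)


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100000)
profiler.disable()

# profiler.dump_stats("stats.profile") would write the same kind of file
# that the -o option produces on the command line.

# Print the five most expensive entries by cumulative time.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Sorting by cumulative time puts the functions that own the most total run time at the top, which is usually where I start looking.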

And after installing runsnakerun I can view the stats:

runsnake stats.profile

which shows me: [screenshot: RunSnakeRun GUI]

So there are a few things that jump out as easy fixes. First, the app spends around 1/3rd of the time appending colors. Second, finding the median takes a significant amount of time as well – most of it from the sorted call – this is not visible in the image above. The other lines of code are not significant enough to display.

Optimizing

I have a few quick rules of thumb for optimizing:

  • creating/appending to lists is slow due to memory allocation – try to use list comprehensions
  • try not to run operations that create copies of objects unless it’s required – this is also slow due to memory allocation
  • dereferencing is slow: if you’re doing mylist[i] several times, just do myvar = mylist[i] up front
  • use libraries as much as possible – many are written in C/C++ and fast
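
Two of these rules are easy to try out with timeit. The sketch below is Python 3 and uses made-up helper names; it shows the pattern rather than promising any particular speedup, so measure on your own code before trusting it.

```python
import timeit


def squares_append(n):
    """Build the list with repeated append calls."""
    out = []
    for i in range(n):
        out.append(i * i)
    return out


def squares_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * i for i in range(n)]


def total_repeated_lookup(table, key, n):
    total = 0
    for _ in range(n):
        total += table[key]  # looks the value up on every pass
    return total


def total_cached_lookup(table, key, n):
    value = table[key]       # look it up once, up front
    total = 0
    for _ in range(n):
        total += value
    return total


# Both variants must give the same answer before timing means anything.
assert squares_append(1000) == squares_comprehension(1000)
table = {"weight": 3}
assert total_repeated_lookup(table, "weight", 1000) == total_cached_lookup(table, "weight", 1000)

for label, stmt in [
    ("append", lambda: squares_append(1000)),
    ("comprehension", lambda: squares_comprehension(1000)),
    ("repeated lookup", lambda: total_repeated_lookup(table, "weight", 1000)),
    ("cached lookup", lambda: total_cached_lookup(table, "weight", 1000)),
]:
    print("{0}: {1:.4f} s".format(label, timeit.timeit(stmt, number=200)))
```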

Other than that, make use of a search engine like Google or DuckDuckGo. You can tweak your Python code (you might discover you’re doing something wrong!), use Cython, write C libraries, or find another solution.

So profiling tells me that appending is slowing my code. I can get around this problem by declaring the list once and keeping track of how many elements I “add”. I also know that sorted() is not preferred because it creates a copy of the original list. I can instead use list.sort() and sort the list in place. I make these changes and run the code, and see the output is still good so I probably did not break it. Let’s time it.

500 loops, best of 3: 2.51 sec per loop

That’s almost 20% faster! Not bad for a few minutes of effort.

For completeness, here is the modified code:

import cv2
import numpy as np
import math
from datetime import datetime

def _median(list, ne):
    """
    Return the median.
    """

    #sort inplace
    if ne != len(list):
        list = list[0:ne]
    list.sort()
    if len(list) % 2 == 0:
        #have to take avg of middle two
        i = len(list) / 2
        #int() to convert from uint8 to avoid overflow
        return (int(list[i-1]) + int(list[i])) / 2.0
    else:
        #find the middle (remembering that lists start at 0)
        i = len(list) / 2
        return list[i]

def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = int(math.ceil(float(width)/block_size))
    num_h = int(math.ceil(float(height)/block_size))

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    #TODO: optimize this, maybe precalculate height & widths
    num_elems = block_size * block_size
    block_b = [0] * num_elems
    block_g = [0] * num_elems
    block_r = [0] * num_elems
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            num_elems = (r_max-row_min) * (c_max-col_min)
            block_i = 0
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b[block_i] = img[r_i, c_i, 0]
                    block_g[block_i] = img[r_i, c_i, 1]
                    block_r[block_i] = img[r_i, c_i, 2]
                    block_i += 1

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b, num_elems)),
                int(_median(block_g, num_elems)),
                int(_median(block_r, num_elems))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

Repeat

Now that I optimized the code a little, I repeat the process to make it even better. I try to decide what should be slow, and what should not. So, for example, sorting is not very fast here, but that makes sense. Sorting is the most complex part of this code – the rest is iterating and keeping track of observed colors.

Final Thoughts

After having optimized a few projects I have a few final thoughts and lessons learned:

  • Optimizing a slow/bad algorithm is a waste of time, you’ll need to redesign it anyways
  • The granularity of the profiler is sometimes a little funny. You can get around this by making sections of code into functions – metrics on functions are usually captured.
  • The profiler will group the calls to the library functions together even if called from different places. So be careful with metrics for libraries/utils – i.e. numpy.
  • Use search to find ways to optimize your code.

If all else fails, you can implement slow parts in C relatively painlessly. This is the option I usually go with, and I will detail it out with some code in the next few weeks.

Have fun speeding up your code!

RapydScript by Example

With RapydScript gaining popularity, I figured it’s about time for a blog post talking about an example app. After all, I learn fastest by observing examples, and many others probably do as well.

Let’s imagine we wanted to write a game using RapydScript. Since I’m not feeling particularly creative today, let’s concentrate on an existing game, rather than coming up with an idea from scratch. For this example, I’ll use a game called Chip’s Challenge.

Chip's Challenge

Looking at the original game, we quickly notice that the environment is a grid of blocks. Each block type has a corresponding image, and some have an effect on the character stepping on them. If we assume that we’ll use a canvas element for drawing the grid, then each block will need to track its coordinates in the grid and the url of the image corresponding to this block. Let’s assume we created a canvas with id #canvas in our document using html. We now simply need to create a reference to it in RapydScript:

CANVAS = document.getElementById('canvas').getContext('2d')

Since canvas is not only an object, but also a module containing methods for creating canvas-compatible objects (like images and patterns), we will use a global to reference it rather than properly abstracting it. Next, let’s create a simple block:

BLOCK_SIZE = 32 #pixels
NUM_X_BLOCKS = 21 # horizontal blocks on a field
NUM_Y_BLOCKS = 21 # vertical blocks
NORMAL_BLOCK = 0 # enum identifying block type

class Block:
    def __init__(self, x, y, image_url):
        self.x = x
        self.y = y
        img = new Image()
        img.src = image_url
        self.type = NORMAL_BLOCK
        self.blockPattern = CANVAS.createPattern(img, 'repeat')

    def redraw(self):
        CANVAS.save()
        CANVAS.fillStyle = self.blockPattern
        CANVAS.fillRect(self.x*BLOCK_SIZE, self.y*BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE)
        CANVAS.restore()

Now that we have a basic block, let’s create advanced blocks that can affect Chip in some way. For example, Chip’s Challenge has a water block that Chip can drown in unless he’s wearing flippers. There is also an ice block that makes Chip slide to the next one unless he’s wearing ice skates. Let’s create these two blocks:

WATER_BLOCK = 1
ICE_BLOCK = 2

class WaterBlock(Block):
    def __init__(self, x, y):
        Block.__init__(self, x, y, 'images/water.jpg')
        self.type = WATER_BLOCK

    def effect(self, unit):
        if not unit.hasItem('flippers'):
            unit.die() # drown

class IceBlock(Block):
    def __init__(self, x, y):
        Block.__init__(self, x, y, 'images/ice.jpg')
        self.type = ICE_BLOCK

    def effect(self, unit):
        if not unit.hasItem('skates'):
            unit.move(unit.direction) #slip

Let’s add a character that can move around. We will use Block as a base class since a character is really just another block drawn on top of the existing block. Unlike a regular block, the picture could change depending on the direction the character is facing:

LEFT = 0
UP = 1
RIGHT = 2
DOWN = 3

class Character(Block):
    def __init__(self, x, y, images):
        # set the position here, since Block.__init__ is not called
        self.x = x
        self.y = y
        self.patterns = [CANVAS.createPattern(img, 'repeat') for img in images]
        self.direction = DOWN # default facing direction
        self.blockPattern = self.patterns[self.direction]

    def move(self, direction):
        self.direction = direction
        self.blockPattern = self.patterns[direction]
        if direction == DOWN:
            self.y += 1
        elif direction == UP:
            self.y -= 1
        elif direction == LEFT:
            self.x -= 1
        else:
            self.x += 1
        self.redraw()

This character can now serve as the base class both for Chip himself and the monsters that roam around. If we wanted to create a typical monster, for example, we would write:

MONSTER = 10

class Monster(Character):
    def __init__(self, x, y, redrawCallback):
        images = [new Image() for i in range(4)]
        images[DOWN].src = 'image/monsterDown.jpg'
        images[UP].src = 'image/monsterUp.jpg'
        images[LEFT].src = 'image/monsterLeft.jpg'
        images[RIGHT].src = 'image/monsterRight.jpg'
        self.type = MONSTER
        Character.__init__(self, x, y, images)

        # have the monster move randomly every 0.5 seconds
        main = self
        window.setInterval(def(): main.move(Math.floor(Math.random()*4)); redrawCallback();, 500)

    def die(self):
        pass    # monsters don't die

Similarly, let’s create Chip:

CHIP = 11

class Chip(Character):
    def __init__(self, die=False):
        # center Chip
        self.x = int(NUM_X_BLOCKS/2)+1
        self.y = int(NUM_Y_BLOCKS/2)+1
        self.items = {}

        if not die:
            self.deaths = 0    # keep the death count across respawns
        images = [new Image() for i in range(4)]
        images[DOWN].src = 'image/chipDown.jpg'
        images[UP].src = 'image/chipUp.jpg'
        images[LEFT].src = 'image/chipLeft.jpg'
        images[RIGHT].src = 'image/chipRight.jpg'
        self.type = CHIP
        Character.__init__(self, self.x, self.y, images)

    def die(self):
        self.deaths += 1
        self.__init__(True)

    def hasItem(self, item):
        return self.items[item]

    def getItem(self, item):
        self.items[item] = True

We now have most of the basics done. Let’s start putting the pieces together by creating the actual class that renders these. Our main class (which we will call Field) will need to create the grid and populate it. We’ll pass it a matrix of blocks to create the field, and the number of monsters to create.

class Field:
    def __init__(self, grid, numMonsters):
        self.grid = grid
        self.monsters = []
        for i in range(numMonsters):
            monster = Monster(
                Math.floor(Math.random()*NUM_X_BLOCKS),
                Math.floor(Math.random()*NUM_Y_BLOCKS),
                self.redraw
            )
            self.monsters.append(monster)
        self.chip = Chip()
        main = self
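        moveChip = def(event):
            # arrow key codes run from 37 (left) to 40 (down)
            main.chip.move(event.keyCode - 37)
            block = main.grid[main.chip.x][main.chip.y]
            while block.type != NORMAL_BLOCK:
                block.effect(main.chip)
                next_block = main.grid[main.chip.x][main.chip.y]
                if next_block == block:
                    break    # Chip stayed put (e.g. he has flippers), so stop
                block = next_block

            for monster in main.monsters:
                if monster.x == main.chip.x and monster.y == main.chip.y:
                    main.chip.die()
            main.redraw()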
        window.addEventListener("keydown", moveChip)

    def redraw(self):
        for x_array in self.grid:
            for block in x_array:
                block.redraw()
        for monster in self.monsters:
            monster.redraw()
        self.chip.redraw()

We now have an almost complete game (although inefficient due to redrawing the entire field every time a monster or the user generates an event). The only thing left to do is to create blocks that provide Chip with flippers and skates. I will leave that exercise to the reader. These items should disappear after being picked up; I recommend using the Character base class for those and making them die() after Chip runs into them. As you can see, RapydScript code is easy to follow, which is the main benefit of the language.

Pyjamas and Web2py

UPDATE: Pyjamas has since been renamed Pyjs and is under new leadership. Everything is still backwards compatible.

At this point, if you’ve been following along the posts, you should know how to create a simple web2py application. In this post I’m going to describe how to write a page that can connect to a backend written in web2py. This is another step on the path of having an app on GAE, in fact, the code we write here will get deployed on GAE. This code will also run on any system, including your own computer, and avoids the lock-in some people experience when developing for GAE.

The 2 ways that I use for communicating with a web server are RESTful JSON calls and JSON-RPC calls. Instead of covering both, I plan on just showing how to use JSON-RPC. I suggest that when you design your app, you first search online for comparisons between REST & JSON-RPC to see the trade-offs.

Before we get into the code, I also have a slight curve-ball. I originally wrote this code using Pyjs, but I’ve switched to using RapydScript for all my development. In a future post I’ll show how to connect RapydScript to GAE, which I plan to link here. I suggest reading Alex’s earlier post https://blogs-pyjeon.rhcloud.com/?p=301 for a good summary of various Python to JS compilers and their trade-offs.

web2py Services

Time to code!

First you need to set up your web2py app to use web2py’s services module. This lets your application work as a web service so it can respond to calls from clients, including support for JSON-RPC calls. Inside of models/db.py add the following lines:

from gluon.tools import Service
service = Service()

In the controller, add a call function that returns the service. Then all you have to do is add a decorator to each function you want to act as a service receiving JSON-RPC calls, and web2py handles the rest. In controllers/default.py I have added the following:

@service.jsonrpc
def myfun(data_from_JSON):
    return data_from_JSON.upper()

def call():
    return service()

The example function here, myfun, will make everything uppercase. Also worth noting is that data_from_JSON is already decoded data from the JSON request.

To access this service use the URI ‘/cyborg/default/call/jsonrpc’. For more information on services check out http://web2py.com/books/default/chapter/29/10#Remote-procedure-calls.
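
To see what travels over the wire, here is a rough Python 3 sketch of a JSON-RPC style round trip. It does not use web2py at all: handle_request is a toy dispatcher standing in for what service() does, and the field names (method, params, id, result, error) are my assumption of the usual JSON-RPC layout, so check the web2py book before relying on them.

```python
import json


def myfun(data_from_JSON):
    """Local copy of the service function from the controller above."""
    return data_from_JSON.upper()


def build_request(method, params, request_id=1):
    """Serialize a JSON-RPC style call (field names are an assumption)."""
    return json.dumps({"method": method, "params": params, "id": request_id})


def handle_request(body, functions):
    """Toy dispatcher, a hypothetical stand-in for web2py's service()."""
    request = json.loads(body)
    try:
        result = functions[request["method"]](*request["params"])
        reply = {"result": result, "error": None, "id": request["id"]}
    except Exception as exc:
        reply = {"result": None, "error": str(exc), "id": request["id"]}
    return json.dumps(reply)


body = build_request("myfun", ["Hello World"])
reply = json.loads(handle_request(body, {"myfun": myfun}))
print(body)
print(reply)
```

The point is only that the client names a function and sends its arguments as JSON, and the function on the server receives already decoded Python values.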

Pyjs Clients

I’ve had this code floating around my computer for a couple years now. I’ve made several minor changes, some just because I wanted slightly lighter code, and some because of changes in Pyjs, but it was originally based on code by Amund Tveit on his blog (http://amundblog.blogspot.com/2008/12/ajax-with-python-combining-pyjs-and.html). This is a simple page with a text area for sending text to a JSON-RPC service.

from pyjamas.ui.RootPanel import RootPanel
from pyjamas.ui.TextArea import TextArea
from pyjamas.ui.Label import Label
from pyjamas.ui.Button import Button
from pyjamas.ui.VerticalPanel import VerticalPanel
from pyjamas.JSONService import JSONProxy


class JSONExample:
    def onModuleLoad(self):
        self.rpc_service = JSONProxy('/cyborg/default/call/jsonrpc', ['myfun'])

        self.text_area = TextArea()
        self.text_area.setText(r"Hello World")
        self.text_area.setCharacterWidth(80)
        self.text_area.setVisibleLines(4)

        button = Button('Send to Server', self)

        self.status = Label()

        panel = VerticalPanel()
        panel.add(self.text_area)
        panel.add(button)
        panel.add(self.status)

        RootPanel().add(panel)


    def onClick(self, sender):
        print('sending to server')
        self.status.setText('Waiting for response...')

        textarea_txt = self.text_area.getText()
        if self.rpc_service.myfun(textarea_txt, self) < 0:
            self.status.setText('Server Error')


    def onRemoteResponse(self, response, request_info):
        print('good response')
        self.status.setText(response)


    def onRemoteError(self, code, message, request_info):
        print('error')
        print(code)
        print(message)
        self.status.setText("Server Error or Invalid Response: ERROR  - " + str(message))


if __name__ == '__main__':
    app = JSONExample()
    app.onModuleLoad()

To make the call, you need an instance of the JSONProxy class, so I have the line

self.rpc_service = JSONProxy('/cyborg/default/call/jsonrpc', ['myfun'])

Then call your function using that instance, along with passing in an object/variable (which Pyjs encodes) to send to the service, and a reference to an instance of a class with onRemoteResponse and onRemoteError methods.

self.rpc_service.myfun(data, response_class)

In the example code above, response_class is self, so the JSON response comes through onRemoteResponse.
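To make the round trip a little more concrete, here is a minimal sketch of what a JSON-RPC exchange for myfun could look like, using only Python's json module. The exact message fields Pyjs sends are an assumption on my part (a plain id/method/params request), and handle_jsonrpc and METHODS are hypothetical names for illustration, not part of Pyjs or web2py.

```python
import json


def myfun(text):
    # Stand-in for the server-side function; it just echoes the text back.
    return "Server received: " + text


# Only methods listed here can be called, mirroring the ['myfun'] list
# passed to JSONProxy on the client (hypothetical registry).
METHODS = {"myfun": myfun}


def handle_jsonrpc(raw_request):
    """Decode a JSON-RPC request string, call the method, encode the reply."""
    request = json.loads(raw_request)
    func = METHODS.get(request["method"])
    if func is None:
        reply = {"id": request["id"], "result": None,
                 "error": "unknown method: " + request["method"]}
    else:
        reply = {"id": request["id"],
                 "result": func(*request["params"]),
                 "error": None}
    return json.dumps(reply)


# Assumed shape of what the client sends for myfun('Hello World', self).
raw = json.dumps({"id": 1, "method": "myfun", "params": ["Hello World"]})
response = json.loads(handle_jsonrpc(raw))
print(response["result"])
```

The "result" value is what would end up as the response argument of onRemoteResponse, while a non-empty "error" corresponds to the onRemoteError path.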

Putting it Together

Believe it or not, this was the trickiest part for me when I first got this working. Figuring out the correct URIs to use and getting code to listen for the call was tough to debug. Luckily for anyone following along, the example code here already has everything set up correctly :).

We can start out just by making sure everything ties together. Go to your web2py folder then applications\cyborg\static. This is the static directory for the cyborg app. Now take all the output from Pyjs and put it there. I called my Pyjs file JSONExample.py, which generated JSONExample.html, so I access this using http://127.0.0.1:8000/cyborg/static/JSONExample.html. The ‘Send to Server’ button on the page should work. If it doesn’t, make sure you followed every step exactly. There might have also been changes/bugs in Pyjs or web2py – it’s not likely, but it has happened.

I personally wanted my main app to call the JSON-RPC service, not a file in the static directory, so I am going to have the index load the JSON page. The first thing to do is clean up the index function in default.py. I just want index to return the view, so index now returns an empty dictionary:

def index():
    return {}

I then replace the contents of applications\cyborg\views\default\index.html (the view for index() in the default controller) with the contents of the main Pyjs output file, JSONExample.html in my case.

The compiled html/js files are still in the static dir though. Remember, users are accessing the files in the views directory by accessing controllers. So if the other Pyjs files were in the views, you would still need to have a controller function to access them. There is no clean, simple way to put all your Pyjs cache files in the views directory. Instead we need to modify the new index.html to point to the files in the static dir. So the index.html code gets modified once for the module:

<meta name="pygwt:module" content="/cyborg/static/JSONExample">

and, in several places, for bootstrap.js:

<script language="javascript" src="/cyborg/static/bootstrap.js"></script>

Now you can visit http://127.0.0.1:8000/cyborg/ and send JSON-RPC calls!
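If you recompile often, the two path edits to index.html described above could be scripted. This is only a sketch: it assumes the generated file refers to the module and to bootstrap.js by bare relative names, which may not match your Pyjs version, and point_to_static is a hypothetical helper name.

```python
def point_to_static(html, static_prefix="/cyborg/static/"):
    """Prefix the Pyjs module name and bootstrap.js with the static path."""
    # Assumed original form of the module meta tag.
    html = html.replace('content="JSONExample"',
                        'content="%sJSONExample"' % static_prefix)
    # bootstrap.js is referenced in several places; replace() handles all.
    html = html.replace('src="bootstrap.js"',
                        'src="%sbootstrap.js"' % static_prefix)
    return html


sample = ('<meta name="pygwt:module" content="JSONExample">\n'
          '<script language="javascript" src="bootstrap.js"></script>\n')
rewritten = point_to_static(sample)
print(rewritten)
```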

One thing to note is that, with the default routes, /cyborg/, /cyborg/default, and /cyborg/default/index all load the same view/controller. If your JSONProxy class uses a relative link, it might only work when visiting one of these URIs. That is why, in the Pyjs code, I refer to the URI starting from /. When I was learning how to do this, I set everything using relative URIs, like ../static/JSONExample and ../default/call/jsonrpc, and that made everything difficult, so I stay away from that.
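The relative-link problem is easy to see with Python's urljoin, which resolves links roughly the way a browser does. This is just an illustration (Python 3 import path assumed); the host and paths are the ones used in this post.

```python
from urllib.parse import urljoin

BASE = "http://127.0.0.1:8000"
# Three URIs that all load the same view with the default routes.
pages = ["/cyborg/", "/cyborg/default", "/cyborg/default/index"]

relative = "../default/call/jsonrpc"
absolute = "/cyborg/default/call/jsonrpc"

# Where each kind of link ends up, depending on the page it is used from.
resolved_relative = {page: urljoin(BASE + page, relative) for page in pages}
resolved_absolute = {page: urljoin(BASE + page, absolute) for page in pages}

for page in pages:
    print(page)
    print("  relative ->", resolved_relative[page])
    print("  absolute ->", resolved_absolute[page])
```

Only /cyborg/default/index resolves the relative link to the real service; the other two drop the /cyborg part. The link starting from / resolves to the same place from all three.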

A Simple web2py App

I know this is waaaaay overdue, but better late than never. Over the last year and a half I’ve switched over to using Linux and away from Pyjamas (I’m still using Python though). I’ll have more info in future posts over the next couple weeks. Don’t worry if you’re using Windows though, the steps here work in both Linux and Windows (thank you Python!).

Installing and Running web2py

Start by downloading web2py (source – version 2.2.1 for me). I am doing this in XP, so I extracted the code to C:\Projects\web2py. Open a command window, navigate to your web2py dir and start it up with the command

> python web2py.py -a tmp_pass

This starts up web2py on 127.0.0.1:8000 with the admin password set to tmp_pass. You can use the -h option to see how to set web2py up in other ways. One thing to note is that if the server is running on 127.0.0.1, you won’t be able to access it using your real IP address. If you want to test your server from an external computer, have web2py use the IP 0.0.0.0.

With web2py running, I could then visit http://127.0.0.1:8000 where a sort of hidden admin interface button lets me log in using my admin password, tmp_pass.

Creating an App

There is a panel on the right, with a section called “New simple application” which you can use to create an app. This sets up all the template files for you. In my case I created a program called cyborg.

The server shows a list of files which I could edit. It’s a lot of code, definitely more than I wanted for my app, but that should be easy to clean up later on. With the web2py server up, I navigated to http://127.0.0.1:8000/cyborg/, which showed a page with some interesting bullets, including:

  • You visited the url /cyborg/
  • Which called the function index() located in the file web2py/applications/cyborg/controllers/default.py
  • The output of the file is a dictionary that was rendered by the view web2py/applications/cyborg/views/default/index.html

These are the MVC files discussed in my previous post.

Simplifying the App

I decided to investigate each of the steps taken to run the code and try to trim unneeded code:

1) Well, this one is obvious

2) The default code has calls to use databases, upload files, etc. These are actions I don’t plan on supporting, at least not just yet. First, I want to get comfortable with everything. I changed default.py to be the simplest function possible:

# -*- coding: utf-8 -*-

def index():
    return dict(message="Hello World", message2="How's it going?")

3) The view should get the dictionary from index() and render it. Really what’s happening is the view is loaded, and any Python code in index.html uses the dictionary from index(). Python code is embedded between {{}}

{{if 'message' in globals():}}
<h3>{{=message}}</h3>
{{pass}}
<br>
{{if 'message2' in globals():}}
<h3>{{=message2}}</h3>
{{pass}}

The only kind of gotcha I found was that you need to have a pass aligned with every if. I think this is because there isn’t a way to unindent in the html files.

Cleanup

Since I only plan on using a couple of functions to start, I wanted to remove all the files that seemed unnecessary, checking that the page still worked as I went. I removed these folders completely (I think most get recreated by web2py when the app runs, but with just dummy files):

  • databases (this folder has the database files)
  • errors (this is a logs folder)
  • languages (translation files)
  • models (usually where you put information about your database)
  • private
  • static
  • uploads

And cleaned the folders:

  • controllers: only kept default.py
  • views: only kept views\default\index.html

I reloaded http://127.0.0.1:8000/cyborg/, and I was happy to see the messages I passed in through the dict.

Adding True Internet Explorer 9 support to Pyjamas

About two months ago, Rich Newpol added Internet Explorer 9 support to pyjamas. Before then, pyjamas tried to use old mozilla (pre-3.5 version) format whenever it detected IE9, which would result in an epic fail. One small problem, however, is that Rich’s solution was not good enough for the app I’m developing. The solution was to tell IE9 to render the website the same way previous versions of IE would render it. Since my app relies heavily on HTML5 canvas element, this means I’d be stuck with old crappy VML. Instead I have started examining pyjamas to add true IE9 support. In the end, it turned out simpler than I thought. This post talks about how I did it, as well as the quirks I had to address.

Setting up ie9 directory

To maintain at least partial support in older IE browsers, I decided to create a new __ie9__ directory inside pyjamas library. My first goal was to get pyjamas to compile a separate cache file for IE9 (*.ie9.cache.html). To do so, I had to modify home.nocache.html file inside of boilerplate to detect ie9:

...
else if (ua.indexOf('msie 7.0') != -1) {
    return 'ie6';
}
else if (ua.indexOf('msie 8.0') != -1) {
    return 'ie6';
}
else if (ua.indexOf('msie 9.0') != -1) {
    return 'ie9';
}
...
window["prop$user.agent"] = function() {
    var v = window["provider$user.agent"]();
    switch (v) {
        case "ie9":
        case "ie6":
        case "mozilla":
        case "oldmoz":
        case "opera":
        case "safari":
            return v;
        default:
            parent.__pygwt_onBadProperty("%(app_name)s", "user.agent",
                ["ie9", "ie6", "mozilla", "oldmoz", "opera", "safari"], v);
            throw null;
    }
};
...

This tells the website to load *.ie9.cache.html if the browser identifies itself as ‘msie 9.0’. The next task is to modify pyjamas so it actually generates the *.ie9.cache.html file for us. The script responsible for this is pyjs/src/pyjs/browser.py. Here are the changes I made to it:

...
AVAILABLE_PLATFORMS = ('IE6', 'IE9', 'Opera', 'OldMoz', 'Safari', 'Mozilla')
...
class BrowserLinker(linker.BaseLinker):

    # parents are specified in most-specific last
    platform_parents = {
        'mozilla':['browser'],
        'ie6':['browser'],
        'ie9':['browser'],
        'safari':['browser'],
        'oldmoz':['browser'],
        'opera':['browser'],
    }
...

With these files modified, pyjamas should now generate *.ie9.cache.html file for us, that the *.nocache.html file will load if you visit the page in IE9. The problem is that there is nothing in our ie9 folder yet.

Populating ie9 directory

The obvious starting point would probably be to copy the contents of __ie6__ directory into __ie9__ and start tweaking it from there. However, after playing with that approach for a while, I started realizing that __ie9__ is so different from previous versions that the best approach is to actually leave __ie9__ blank and start troubleshooting from there. It turns out that if you leave __ie9__ directory empty, the compiled *.ie9.cache.html file is actually somewhat usable. After some testing, I found that the only methods that did not work as expected were DOM.eventGetButton(), DOM.getAbsoluteLeft(), DOM.getAbsoluteTop() (it’s possible I missed some, but my very complex web app seems to be working fine in IE9 now). After writing my own implementations of those, I noticed that DOM.py from __mozilla__ implements all 3 of those methods similar to my personal implementation, and all 3 implementations fix IE9 as well. This is not too much of a surprise, since Opera 11 now seems to run just fine with mozilla cache file as well, as I reported on pyjamas mailing list here. Perhaps __mozilla__ should be renamed to __default__ or __compliant__? Anyway, if you’ve been following along, chances are that even with all these changes you still might have issues with your web-app layout. The next section is designed to help you troubleshoot those.

Layout and appearance issues

First of all, IE9 will automatically throw you into ‘Quirks Mode’ if you don’t define a correct DOCTYPE. Quirks mode is designed to help with rendering non-W3C-compliant webpages. Its purpose is to render pages that were designed for older browsers and are no longer compliant with today’s standards. You can find more about it on Wikipedia. Actually, all browsers have a quirks mode, and they will all throw you into it if you don’t define a valid DOCTYPE. And if you look through pyjamas examples, you’ll probably notice a consistent lack of DOCTYPE. That’s right, most of pyjamas examples actually run in quirks mode when opened inside a browser (and not just in IE). This is usually not a big deal, unless you happen to be running IE9.

It turns out IE9 does not support event handlers when in quirks mode. Unfortunately, event handling is one of the most important building blocks for pyjamas, as it handles just about all user interaction. So to make your web app compatible with IE9, you will have to make it W3C-compliant. This is something that will affect all browsers, not just IE9. Most of your changes will be in your app’s main html file itself and your css stylesheet, although it’s possible that you will need some slight code changes as well (in particular if you used DOM.setStyleAttribute() method).

First of all, we need to define a valid DOCTYPE. To do so, open up your app’s main html file (should be located in your app’s public directory) and add this line at the very top (before the <html> tag):

<!DOCTYPE html>

This tells it to render your app using the latest html standard (which would be HTML5). If your app looks and works great, congratulations, you know your html and can probably stop reading now. If you’re someone like me, however, you’re probably staring at a deformed version of your app while scratching your head. First of all, let’s address the issues that are common across all browsers. To do so, read my previous post about making your app HTML5-compliant. Once done, your app should appear correctly in all browsers (except maybe some glitches in IE9). Here are the last few fixes for the annoyances with IE9.

If you use a text area element (pyjamas’ TextArea() class), IE9 seems to override the width and height you set in the stylesheet unless you use the ‘!important’ tag. It also seemed to add a scrollbar for me, so I had to set the ‘overflow’ property to auto as well. I also used this opportunity to prevent Chrome from making my text area resizable (that little handle in the bottom-right corner that the user can drag) by setting ‘resize’ to none:

.gwt-TextArea {
    overflow: auto;
    resize: none;
    width: 300px !important; /* important tag needed for IE9 */
    height: 100px !important; /* important tag needed for IE9 */
}

The second annoyance seems to be that, unlike other browsers, IE9 (when operating in strict mode) does not assume that a single value for a field should be applied to both dimensions. This means that to center a background image, for example (as I needed to do with my loading icon), you’d need to specify “center” twice in the CSS stylesheet. That should be pretty much it. I have to admit that for the most part IE9 is actually more standards compliant than Safari or Chrome. There might be other quirks which I haven’t run into with my app. Feel free to mention them to me so I can update this post.

Code Tags and Syntax Highlighting

UPDATE: Now that I’m using WordPress, I no longer rely on this. With markdown and markdown syntax highlighting plugins, all this logic is automatically handled for me at page render time. This is still a good reference for people using blogspot or implementing their own syntax highlighting on the back-end. Also, as Adam has mentioned, for blogspot there are other alternatives available that are simpler to set up if you don’t want to do your own implementation.

I really like the flexibility of blogspot. It’s one of very few free blogging platforms that allows me to customize just about any feature of my blog, from custom CSS stylesheets to unique Javascript snippets. Naturally, I was surprised that there was no <code> tag to wrap code blocks in despite so many existing programmer blogs. Luckily, a quick search revealed another blog that resolved this via a simple CSS tag. So I followed the same advice, but somehow staring at monotone text in a code block was about as satisfying as coding in notepad.

As I started investigating, I found multiple solutions, ranging from online parsers to various language libraries. My favorite was pygments, a python library (are you surprised?) for syntax highlighting. After a little more searching, I found the following article that not only explains how to use pygments but also writes a quick script for parsing Python files. The script had a small glitch: instead of indir=r'', I had to use indir=r'.', so the code I started with ended up looking like this:

from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
import os

formatter = HtmlFormatter()

indir= r'.'
for x in os.listdir(indir):
    if x.endswith('.py'):
        infile = os.path.join(indir, x)
        outfile = infile.replace('.py', '.html')
        text = ''.join(list(open(infile)))
        html = highlight(text, PythonLexer(), formatter)
        f = open(outfile, 'w')
        f.write(html)
        f.close()
        print 'finished :',x

I then decided to go a step further and update the script to handle multiple programming/markup languages, since I also use CSS, HTML, XML, and Javascript in my blog. But creating a separate file for every snippet of code I decide to use in my blog post didn’t seem like a good solution.

Instead, I decided to parse out and highlight all the <code> tags from a single file. This would allow me to type an entire blog as one text file and run it through the parser once instead of combining multiple files into a single blog post. I was initially thinking of using Python’s regular expression library, but decided to use XML parsing library instead. It just seemed like the right tool for the job, it can handle additional attributes added to the <code> tag, as well as edit the document tree in-place. In other words, with XML parser I’m much less likely to screw something up with my future “improvements”.

There are several XML libraries available in Python, and I decided to go with xml.dom. The main reason I chose it is that its naming scheme is the same as that of Pyjamas and Javascript’s DOM methods (unlike lxml, whose method names feel awkward to me). xml.dom also happens to be the library Pyjamas Desktop uses for XML parsing (I’ll write another tutorial on that soon).

As far as applying correct syntax highlighting based on the language, there are a few ways to go about doing it. One is to use the guess_lexer() method from pygments, which can usually detect the correct language automatically. However, a lot of my code-blocks are two-liners that would probably look the same in many different languages, so I decided against that. As a blog writer, I should know what language I’m posting the code in anyway, so it’s very easy for me to specify it myself. Since I’m using a real XML parser, I can add as many attributes to my <code> tag as I wish. So I decided to specify the language via the “class” attribute (i.e. <code class='python'>).

I then imported the lexers I wanted to use, and defined a dictionary mapping “class” names to lexers (note that omitting the class attribute entirely defaults to non-highlighting TextLexer):

from pygments.lexers import (HtmlLexer, JavascriptLexer, CssLexer, XmlLexer,
                             PythonLexer, TextLexer)
import xml.dom.minidom

lexers = {'': TextLexer,
          'html': HtmlLexer,
          'js': JavascriptLexer,
          'css': CssLexer,
          'xml': XmlLexer,
          'python': PythonLexer}
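The '' key does the defaulting because minidom's getAttribute() returns an empty string when the attribute is missing. Here is a small check of that behaviour. To keep it runnable without pygments, plain strings stand in for the lexer classes, so this is an illustration rather than the real dictionary.

```python
import xml.dom.minidom

# Strings stand in for the pygments lexer classes (illustration only).
lexers = {'': 'TextLexer', 'python': 'PythonLexer', 'css': 'CssLexer'}

doc = xml.dom.minidom.parseString(
    "<root><code class='python'>x = 1</code><code>plain text</code></root>")

# A <code> tag without a class attribute yields '' and so maps to TextLexer.
chosen = [lexers[tag.getAttribute('class')]
          for tag in doc.getElementsByTagName('code')]
print(chosen)
```

A class name that is not in the dictionary would raise a KeyError, which is arguably a reasonable way to catch a typo in the attribute.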

The next step was to loop through all of these code tags, selecting the correct lexer based on the “class” attribute, and applying it to the contents. The last step was to convert the XML DOM tree back to xml format before outputting it to the file. Trying something like this, however, will not work:

for element in code_tags:
    lexer = lexers[element.getAttribute('class')]()
    element.firstChild.nodeValue = highlight(
        element.firstChild.nodeValue, lexer, formatter)
html = doc.toxml()

The problem with the above code is that xml.dom automatically converts < and > angle brackets to &lt; and &gt;, respectively, to avoid interpreting the node value as part of the tree. This means that all of the pretty syntax highlighting pygments added will be shown to the user as code instead of getting interpreted by the browser. Since I’m outputting all of the “XML” back to the document, I figured there is no harm in xml.dom “interpreting” the pygments output, and replaced the last statement of the for loop with the following:

element.replaceChild(xml.dom.minidom.parseString(
    highlight(element.firstChild.nodeValue, lexer, formatter))
    .documentElement, element.firstChild)
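Both the escaping problem and the parse-and-replace fix can be reproduced with the standard library alone. In this sketch a hard-coded <span> string stands in for whatever highlight() would return, so pygments is not needed to run it.

```python
import xml.dom.minidom

doc = xml.dom.minidom.parseString("<root><code>x</code></root>")
code = doc.getElementsByTagName("code")[0]

# Stand-in for the HTML that highlight() would return.
markup = "<span>x</span>"

# Attempt 1: assign the markup as text; minidom escapes the angle brackets.
code.firstChild.nodeValue = markup
escaped = code.toxml()

# Attempt 2: parse the markup and swap the resulting element into the tree.
new_node = xml.dom.minidom.parseString(markup).documentElement
code.replaceChild(new_node, code.firstChild)
replaced = code.toxml()

print(escaped)
print(replaced)
```

The first print shows the markup turned into &lt;span&gt; text, while the second shows a real <span> element inside the <code> tag.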

This worked great for Python, CSS, and Javascript parsing, but HTML and XML got “interpreted” into the tree before I could pass them into pygments (element.firstChild.nodeValue did not exist). The trick was to apply the reverse of the solution I applied to get normal highlighting working. We do this via toxml() call, which converts the tree structure into a string. The problem is, if you call toxml() on the entire element, you’ll end up also syntax-highlighting the <code> tag itself, which we’re trying to keep hidden from the viewer. And since nodeList doesn’t have toxml() method, you’ll simply have to loop through each child individually as follows:

for child in element.childNodes:
    element.replaceChild(xml.dom.minidom.parseString(
        highlight(child.toxml(), lexer, formatter))
        .documentElement, child)

The work is almost complete. If you’ve been following along, adding these lines to your code and attempting to run it, you probably noticed that xml.dom throws a syntax error when you try to parse your text file. The reason is that xml.dom requires all of the file contents to be inside a root element, and this is true of all XML content (i.e. webpages are always inside an <html> tag, svg images are always inside an <svg> tag). My initial solution while testing this was to create a fake <root> tag around the text body. It works, but this wouldn’t be much of a convenience tool if it made me jump through extra hoops. For my final solution I decided to get rid of them. It’s really not hard: all you have to do is add <root> to the beginning of the string and </root> to the end before parsing it with xml.dom. And in the end, simply remove the last 7 characters and the first 28 (to account for <root> as well as the <?xml version="1.0" ?> that xml.dom slaps on). The final version of the code ended up looking as follows:

from pygments import highlight
from pygments.lexers import (HtmlLexer, JavascriptLexer, CssLexer, XmlLexer,
                             PythonLexer, TextLexer)
from pygments.formatters import HtmlFormatter
import os
import xml.dom.minidom

lexers = {'': TextLexer,
          'html': HtmlLexer,
          'js': JavascriptLexer,
          'css': CssLexer,
          'xml': XmlLexer,
          'python': PythonLexer}

formatter = HtmlFormatter()

indir= r'.'
for x in os.listdir(indir):
    if x.endswith('.txt'):
        infile = os.path.join(indir, x)
        outfile = infile.replace('.txt', '.html')
        data = list(open(infile))
        data.insert(0, '<root>')
        data.append('</root>')
        text = ''.join(data)
        doc = xml.dom.minidom.parseString(text)
        code_tags = doc.getElementsByTagName('code')
        for element in code_tags:
            lexer = lexers[element.getAttribute('class')]()
            for child in element.childNodes:
                element.replaceChild(xml.dom.minidom.parseString(
                    highlight(child.toxml(), lexer, formatter))
                    .documentElement, child)
        html = doc.toxml()[28:-7]
        f = open(outfile, 'w')
        f.write(html)
        f.close()
        print 'finished :',x
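The [28:-7] slice in the script above can be checked in isolation. This sketch wraps a sample string in <root> tags, serializes it with minidom, and confirms where the 28 and 7 come from. The sample text is made up for the test.

```python
import xml.dom.minidom

# Made-up blog text with one <code> tag, standing in for a real .txt file.
text = 'Some prose and <code class="python">x = 1</code> in one file.'

doc = xml.dom.minidom.parseString('<root>' + text + '</root>')
serialized = doc.toxml()

# The XML declaration minidom prepends, plus the opening <root> tag.
prefix = '<?xml version="1.0" ?><root>'
suffix = '</root>'
print(len(prefix), len(suffix))

# Stripping both ends gives back the original text.
stripped = serialized[28:-7]
print(stripped)
```

If a different declaration were ever emitted (for example by passing an encoding to toxml()), the 28 would no longer be right, so slicing by len(prefix) after a startswith() check might be a safer variation.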

If you want, you can take this a step further by customizing the <code> tag for each language you use. If you want something like line numbers, just add the following to the formatter initialization line:

formatter = HtmlFormatter(linenos=True)

Also, don’t forget to generate new classes responsible for actually defining the colors you highlight with and placing them into your blog’s CSS stylesheet. To generate the stylesheet, just run the following command in bash:

pygmentize -S default -f html > style.css

And just to prove that my new parser works, I’ve used it to highlight this entire blog post (note, this was originally posted on Blogspot, this WordPress post no longer uses this mechanism). When I have the time, I might go back and highlight my old posts the same way as well.

Pyjamas Applications on Google App Engine

The next topic I want to cover is writing an app on top of GAE. This one took me a while to originally figure out. I had absolutely no framework experience before, and no idea how GAE worked. I was very confused at first, so I want to discuss how GAE works and what I learned. This post assumes you have the GAE SDK installed and you’ve created an application using the Google App Engine dashboard.

Let’s start with the most important file for your web app – app.yaml. Anytime someone visits any URL under your domain, http://yourapp.appspot.com, their request will go through the app.yaml file to figure out what to load. In proper terms, the app.yaml file forwards the URL path to a “request handler”. The app.yaml file only complicates things for a web app with just static files, but it becomes very powerful when you start building more complex apps that use services like JSONRPC.

My first GAE app used the helloworld example from Pyjamas. I’ll post the app.yaml file I used and explain what the different sections mean, and then I’ll explain how to package everything to upload it.

application: myapp
version: 1
runtime: python
api_version: 1
handlers:
- url: /
  static_files: output/Hello.html
  upload: output/Hello.html

- url: /*
  static_dir: output

The first 4 lines are standard. The only unique thing you’ll want is your application name. This application should already be created.

The rest of the file is a list of handlers. Each handler starts with a URL pattern. When a user visits a URL on your domain, GAE uses the first handler with a URL pattern that matches the client request. In the app.yaml file above the URL pattern for the first handler is /, which will catch requests for http://yourapp.appspot.com/. The line after the URL pattern tells us we have a static file handler. This static file handler will load output/Hello.html – the main HTML file for the helloworld app. This means that when a user visits http://yourapp.appspot.com/, they load output/Hello.html.

The 2nd handler is for catching calls to http://yourapp.appspot.com/*. This handler is for a static directory, not just a single file. This handler is here because Pyjamas compiles into multiple javascript/cache files which Hello.html loads. Files like Hello.nocache.html need to be accessible from URLs like http://yourapp.appspot.com/Hello.nocache.html in order for the app to work.
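The "first matching handler wins" idea can be modelled in a few lines of Python. This is a deliberately simplified toy, not GAE's actual matcher: it treats the first handler as an exact match on / and the second as a plain prefix match, and HANDLERS and resolve are hypothetical names.

```python
# Toy model of the two handlers, in the same order as in app.yaml.
HANDLERS = [
    # (kind, url, target)
    ("static_files", "/", "output/Hello.html"),
    ("static_dir", "/", "output"),
]


def resolve(path):
    """Return the file the first matching handler would serve for a path."""
    for kind, url, target in HANDLERS:
        if kind == "static_files" and path == url:
            return target
        if kind == "static_dir" and path.startswith(url):
            # Map the rest of the path into the static directory.
            return target + "/" + path[len(url):]
    return None


print(resolve("/"))
print(resolve("/Hello.nocache.html"))
```

Swapping the two handlers in this model would make / resolve into the directory instead of Hello.html, which is why the order in the file matters.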

There’s one last type of handler: script handlers. I personally use the script handlers to add support for JSONRPC calls, which I will demonstrate in a future post. For the full details on how to configure your app.yaml file visit the GAE documentation at http://code.google.com/appengine/docs/python/config/appconfig.html .

Finally, with the app.yaml file complete, all you need to do is put everything into 1 nice package. I put the app.yaml file at the same level as the output directory (to be clear, 1 level above all the output files). Add this existing project to the Google App Engine Launcher, Deploy it, and you’re good to go!

Installing Pyjamas on Windows XP

Now for the good stuff! I followed a wiki post and got an old Pyjamas version working. The old Pyjamas version was missing some features, so I figured out a way to build the latest version from scratch. For documentation purposes, here are the steps I followed on Windows XP:

  1. Download and run a Python 2.6 installer (here’s a link to the latest: http://www.python.org/download/releases/2.6.6/)
  2. Install comtypes 0.6.1. (comtypes-0.6.1.win32.exe in http://sourceforge.net/projects/comtypes/files/comtypes/0.6.1/)
  3. Permanently update the path for Python (and Pyjamas while you’re at it)
    1. Go to System Properties through Control Panel
    2. Open the Advanced tab, then click on the Environment Variables button
    3. Add the following to your “PATH” system variable: c:\python26;c:\Pyjamas\pyjs\bin; (the 2nd dir will exist after you install pyjamas)
  4. Install Git for Windows (If you have Git skip this step)
    1. Install msysgit from http://code.google.com/p/msysgit/downloads/detail?name=Git-1.7.3.1-preview20101002.exe&can=2&q=
    2. Install TortoiseGit from http://code.google.com/p/tortoisegit/downloads/detail?name=TortoiseGit-1.5.8.0-32bit.msi&can=2&q=
  5. Git the latest Pyjamas code.
    1. Create a directory C:\Pyjamas\
    2. Open an explorer window, and navigate to the C:\ drive.
    3. Right click on the Pyjamas directory and select Git Clone.
    4. Enter the URL: https://github.com/pyjs/pyjs.git (previously git://pyjs.org/git/pyjamas.git)
  6. Open a new command line (this has to be done after updating the path):
    1. > cd C:\Pyjamas\pyjs (previously C:\Pyjamas\pyjamas)
    2. > python bootstrap.py Note: The bootstrap won’t print anything to the screen, but it will create the bin directory (which was added to your path in step 3)
  7. Now test an application:
    1. > cd examples\helloworld
    2. > python __main__.py
    3. Use Firefox or IE and open output\Hello.html Note: Chrome won’t load AJAX pages off your local machine for security reasons. You’ll have to upload it somewhere to see it in Chrome. For more info see http://code.google.com/p/chromium/issues/detail?id=40787
  8. If you find anything different, update the wiki! Pyjamas Wiki

I ran through steps 1-4 once about 6 months ago. Since then I periodically rerun steps 5 and 6 to get the latest Pyjamas code.

One last note. I wouldn’t recommend pyjamas desktop on Windows if you’re doing Canvas apps. It will be good when a webkit pyjamas desktop version is out, but if you want pyjamas desktop, I’d go with Ubuntu 9.10. Because I have an Ubuntu box, I’ve never put in the time into figuring out how to get pyjamas desktop working on Windows with the latest Pyjamas code, but I do know others that are using it.

SVG text transform (translate, scale, rotate)

I was playing with the idea of saving data from canvas element into an SVG file, and experienced first-hand just how much of a pain the SVG standard is. In this post I’ll concentrate on the most annoying part I had to deal with so far, text transformation. I wanted to share it with other interested readers since none of the online tutorials I found explain the order of operations well enough and also to document it for myself in case I need to deal with it in the future, since I spent a good 5 hours playing with it until I figured all the math out. First of all, the text tag is saved as follows:

<text x="left_x_coord" y="btm_y_coord" fill="color" font-size="10" 
font-family="sans-serif" transform="translate(...) scale(...) rotate(...)">
actual_text</text>

Items in red are to be specified by the user. Items in yellow, like font size and family, are configurable in the HTML5Canvas wrapper I submitted to Pyjamas. If you’re using GWTCanvas, assume a font-size of 10 and a sans-serif font-family, since those are the values it defaults to. Be careful not to define the font as 10pt, which is not the same as 10 pixels. Items in pink identify the transformation matrix. They can be specified in any order, but the order will affect how the coordinates are applied, and that is what this blog post is about. While each tag by itself is pretty simple, it’s their interaction that can be counter-intuitive. The following is an explanation of how to use them based on order:

translate(X Y)

Translate translates the frame of reference by X and Y pixels, and affects the x, y positions of text. So the following two statements will produce text in the same position:

<text x="x_coord0" y="y_coord0" transform="translate(x_offset, y_offset)">text
</text>


<text x="x_coord0+x_offset" y="y_coord0+y_offset">text</text>

scale(X [Y])

Scale takes in the ratio to scale the X and Y coordinates by. If the Y ratio is not provided, the same scale ratio is applied to both. Scale occurs relative to the 0, 0 coordinate (top-left corner) of the drawing, which means that scale(2) will not only double the size of the text but also double its distance from the top-left corner. So if you want to scale the text in-place, you have to either apply translate() beforehand to offset it by -coord0*(scale-1) or apply it afterwards with an offset of -coord0*(scale-1)/scale. Essentially, either of the following two statements would scale the text without moving its position:

<text x="x_coord0" y="y_coord0" transform="translate(-x_coord0*(x_scale-1), 
-y_coord0*(y_scale-1)) scale(x_scale, y_scale)">
text</text>


<text x="x_coord0" y="y_coord0" transform="scale(x_scale, y_scale) 
translate(-x_coord0*(x_scale-1)/x_scale, -y_coord0*(y_scale-1)/y_scale)">
text
</text>
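Here is a small Python sketch to check the two offsets numerically. The helpers translate(), scale() and transform() are hypothetical stand-ins for the SVG transformations. transform() applies the list the way SVG does, with the rightmost entry hitting the point first. The coordinates and scale factors are made up.

```python
def translate(tx, ty):
    """Model SVG translate(tx, ty)."""
    return lambda p: (p[0] + tx, p[1] + ty)


def scale(sx, sy=None):
    """Model SVG scale(sx [sy]); a missing sy reuses sx."""
    sy = sx if sy is None else sy
    return lambda p: (p[0] * sx, p[1] * sy)


def transform(*ops):
    """Model a transform list: the rightmost operation is applied first."""
    def apply(p):
        for op in reversed(ops):
            p = op(p)
        return p
    return apply


x0, y0 = 40.0, 120.0        # made-up text position
sx, sy = 2.0, 3.0           # made-up scale factors
anchor = (x0, y0)

# scale() alone drags the text away from its original position
scaled_only = transform(scale(sx, sy))(anchor)

# translate() listed before scale()
fix_a = transform(translate(-x0 * (sx - 1), -y0 * (sy - 1)),
                  scale(sx, sy))(anchor)

# translate() listed after scale()
fix_b = transform(scale(sx, sy),
                  translate(-x0 * (sx - 1) / sx, -y0 * (sy - 1) / sy))(anchor)

print(scaled_only)  # (80.0, 360.0)
print(fix_a, fix_b)  # both stay at (40.0, 120.0)
```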

rotate(degs [X Y])

Rotate applies rotation around the X, Y coordinate specified by the user. If no coordinate is specified, the rotation occurs around 0, 0 coordinate of the drawing. Therefore if we want the text rotated around its own bottom-left corner, we have to call rotate(degs x_coord0 y_coord0). The ability to add these two coordinates makes dealing with rotate much more intuitive than scale, and we can place rotate() in any order relative to the other 2 transformations without having to change these parameters (assuming we’re rotating around bottom-left corner).
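As a sanity check, rotate(degs X Y) can be modeled in Python as a rotation around 0, 0 wrapped in a pair of translations. The sketch below uses hypothetical helper names and made-up coordinates. It shows that the pivot itself stays put while a point 10 pixels to its right swings to 10 pixels below it for a 90 degree rotation (positive angles turn clockwise on screen because y grows downward).

```python
import math


def translate(tx, ty):
    """Model SVG translate(tx, ty)."""
    return lambda p: (p[0] + tx, p[1] + ty)


def rotate(degs, cx=0.0, cy=0.0):
    """Model SVG rotate(degs [cx cy]); the default pivot is 0, 0."""
    a = math.radians(degs)

    def apply(p):
        dx, dy = p[0] - cx, p[1] - cy
        return (cx + dx * math.cos(a) - dy * math.sin(a),
                cy + dx * math.sin(a) + dy * math.cos(a))
    return apply


def transform(*ops):
    """Model a transform list: the rightmost operation is applied first."""
    def apply(p):
        for op in reversed(ops):
            p = op(p)
        return p
    return apply


x0, y0 = 40.0, 120.0                 # made-up bottom-left corner of the text
around_corner = rotate(90, x0, y0)

# the same rotation spelled out with the origin-based rotate()
spelled_out = transform(translate(x0, y0), rotate(90), translate(-x0, -y0))

pivot = around_corner((x0, y0))
moved = around_corner((x0 + 10, y0))
moved_too = spelled_out((x0 + 10, y0))

print(pivot)             # the pivot does not move
print(moved, moved_too)  # both land (up to rounding) at (40, 130)
```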

Putting it all together (rotation/scale around text center)

Rotation around the origin is pretty easy. But if we need to rotate the text around its center, we’ll still need some of the magic discovered in the scale section. First, we’ll need to know the text dimensions. The height is font-size*y_scale/2 (the actual font size accounts for whitespace below and above the letters, used by letters like “f” and “g”; other letters like “o” only use 1/2 of the font height). The width can be obtained via the HTML5Canvas measureText() method and will also need to be multiplied by x_scale. Since translate() centered the drawing at the bottom-left corner of the text, the midpoint is subject to the same position distortion due to scale() as everything else. This means that if we just apply width/2 and -height/2 offsets, we’re essentially applying width/2*current_x_scale and -height/2*current_y_scale (where the scale parameters are 1 if the rotation is listed before the scale() transformation, or the same as specified by scale() if it is listed after). So we’d need to divide by that scale parameter. The following two statements produce an equivalent result (remember that text_width and text_height correspond to the text AFTER it has been scaled, even if the rotate transform comes before scale()):

<text x="x_coord0" y="y_coord0" transform="rotate(degs, x_coord0+text_width/2, 
y_coord0-text_height/2) translate(-x_coord0*(x_scale-1), -y_coord0*(y_scale-1))
scale(x_scale, y_scale)">
text</text>


<text x="x_coord0" y="y_coord0" transform="translate(-x_coord0*(x_scale-1), 
-y_coord0*(y_scale-1)) scale(x_scale, y_scale) rotate(degs,
x_coord0+text_width/(2*x_scale), y_coord0-text_height/(2*y_scale))">
text</text>
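The two statements above can be compared numerically with a short Python sketch. The helpers are hypothetical models of the SVG transformations, the text width stands in for a measureText() result, and all numbers are made up. I only check a uniform scale (x_scale equal to y_scale) here. With different x and y factors, scaling and rotation do not commute in general, so the two forms may diverge.

```python
import math


def translate(tx, ty):
    """Model SVG translate(tx, ty)."""
    return lambda p: (p[0] + tx, p[1] + ty)


def scale(sx, sy=None):
    """Model SVG scale(sx [sy])."""
    sy = sx if sy is None else sy
    return lambda p: (p[0] * sx, p[1] * sy)


def rotate(degs, cx=0.0, cy=0.0):
    """Model SVG rotate(degs [cx cy])."""
    a = math.radians(degs)

    def apply(p):
        dx, dy = p[0] - cx, p[1] - cy
        return (cx + dx * math.cos(a) - dy * math.sin(a),
                cy + dx * math.sin(a) + dy * math.cos(a))
    return apply


def transform(*ops):
    """Model a transform list: the rightmost operation is applied first."""
    def apply(p):
        for op in reversed(ops):
            p = op(p)
        return p
    return apply


x0, y0 = 40.0, 120.0          # made-up text position
s = 2.0                       # uniform scale, x_scale == y_scale
degs = 30.0
text_width = 30.0 * s         # stand-in for measureText(), already scaled
text_height = 10.0 * s / 2    # font-size*y_scale/2 with font-size 10

# first statement: rotate() listed before translate() and scale()
rotate_first = transform(
    rotate(degs, x0 + text_width / 2, y0 - text_height / 2),
    translate(-x0 * (s - 1), -y0 * (s - 1)),
    scale(s, s))

# second statement: rotate() listed last, offsets divided by the scale
rotate_last = transform(
    translate(-x0 * (s - 1), -y0 * (s - 1)),
    scale(s, s),
    rotate(degs, x0 + text_width / (2 * s), y0 - text_height / (2 * s)))

# compare both statements on a few sample points around the text
samples = [(x0, y0), (x0 + 30, y0), (x0, y0 - 5), (0.0, 0.0)]
max_diff = max(abs(a - b)
               for p in samples
               for a, b in zip(rotate_first(p), rotate_last(p)))
print(max_diff)  # rounding noise only
```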

Now what if, in addition to applying rotation around the center, we also wanted the scale to apply around the center? The logic is the same. The problem is that this logic messes up rotate(), since we’d need to apply it to translate(). Let’s ignore rotation for now and apply the same offsets to each of the statements from the scale() section:

<text x="x_coord0" y="y_coord0" transform="translate(
-(x_coord0*(x_scale-1)+text_width/2), -(y_coord0*(y_scale-1)-text_height/2))
scale(x_scale, y_scale)">
text</text>


<text x="x_coord0" y="y_coord0" transform="scale(x_scale, y_scale) translate(
-(x_coord0*(x_scale-1)+text_width/2)/x_scale,
-(y_coord0*(y_scale-1)-text_height/2)/y_scale)">
text</text>

Now the text is scaled around the center. The problem is that translate() moved our drawing origin, so the rotate() offset we calculated for the midpoint is no longer valid. The good news is that only the first of the 2 rotate statements above is affected: in the second statement rotate() comes after scale(), so it already factors in the transformation from scale. To fix the first statement, all we need to do is divide the text_width and text_height used in rotation by their scale factors. If we apply this logic to the two earlier rotation statements we get the following (notice that the 2nd statement hasn’t changed, and that the rotate() statement is now identical between the two):

<text x="x_coord0" y="y_coord0" transform="rotate(degs, 
x_coord0+text_width/(2*x_scale), y_coord0-text_height/(2*y_scale)) translate(
-(x_coord0*(x_scale-1)+text_width/2), -(y_coord0*(y_scale-1)-text_height/2))
scale(x_scale, y_scale)">
text</text>


<text x="x_coord0" y="y_coord0" transform="translate(
-(x_coord0*(x_scale-1)+text_width/2), -(y_coord0*(y_scale-1)-text_height/2))
scale(x_scale, y_scale) rotate(degs, x_coord0+text_width/(2*x_scale),
y_coord0-text_height/(2*y_scale))">
text</text>

If you do the math (or just play with some numbers), you’ll notice that the two statements in rotate also can be represented as -translate_input/(scale-1). So, if we wanted to create an SVG tag for text we drew using HTML5Canvas in Pyjamas, we’d write something as follows:

def textToSVG(x, y, rotation, scale_x, scale_y, textString, canvas):
    """Creates an SVG tag for textString located at x, y on the canvas scaled
    horizontally by a factor of scale_x, vertically by a factor of scale_y
    around the center and rotated by rotation degrees around the center"""

    transform_tag = ""
    scaled = False
    width = canvas.measureText(textString)
    font = canvas.getFont().split('px ') #assumes font-size in pixels
    height = float(font[0])
    #defaults for an axis that is not scaled (my assumption): no translate
    #offset, and rotation around the unscaled text midpoint
    offset_x = offset_y = 0
    rotate_offset_x = x+width/2
    rotate_offset_y = y-height/4   #accounts for whitespace
    if scale_x != 1:
        scale_diff_x = scale_x-1
        offset_x = x*scale_diff_x+width/2
        rotate_offset_x = offset_x/scale_diff_x
        scaled = True
    if scale_y != 1:
        scale_diff_y = scale_y-1
        offset_y = y*scale_diff_y-height/4   #accounts for whitespace
        rotate_offset_y = offset_y/scale_diff_y
        scaled = True
    if scaled or rotation:
        if scaled:
            #negate the numbers themselves so that a negative offset
            #can't produce "--" in the output
            transform_tag = strcat("translate(", str(-offset_x), " ",
                                   str(-offset_y), ") scale(", str(scale_x),
                                   " ", str(scale_y), ") ")
        if rotation:
            #append to whatever translate/scale part was built above
            transform_tag = strcat(transform_tag, "rotate(", str(rotation),
                                   " ", str(rotate_offset_x), " ",
                                   str(rotate_offset_y), ")")
        transform_tag = strcat(" transform=\"", transform_tag, "\"")
    return strcat("<text x=\"", str(x), "\" y=\"", str(y),
                  "\" fill=\"black\" font-size=\"", font[0],
                  "\" font-family=\"", font[1], "\"", transform_tag,
                  ">", textString, "</text>")

strcat() is my implementation of string concatenation for Pyjamas, which I explained in an earlier blog post. The above function can quickly generate an SVG text tag with any desired transformation, so try it out. There is also a matrix() transformation, which can combine the 3 transformations described above into a single operation. I will not cover it in this post since it’s already getting too big, but I might describe it at a later date.