About Alexander Tsepkov

Founder and CEO of Pyjeon. He started out with C++, but switched to Python as his main programming language due to its clean syntax and productivity. He often uses other languages for his work as well, such as JavaScript, Perl, and RapydScript. His posts tend to cover user experience, design considerations, languages, web development, Linux environment, as well as challenges of running a start-up.

Scoping in JavaScript

I do a lot of programming in Perl, not because I like it, but because the company I work for uses it as its main language. In fact, I hate Perl (it tries to be overly implicit), but I do like how it handles scoping. Any variable declared inside a block (anything surrounded by {} brackets) will be local to that inner-most block. This means that variables declared inside loops, conditionals or even stand-alone {} will not be seen from the outside.

JavaScript will not localize variables like this. Any var declaration is scoped to the inner-most function. That doesn’t mean, however, that you can’t use (function() {})(); the same way you would use the brackets in Perl. In fact, you’re probably already familiar with wrapping chunks of code in the above pattern. Many developers do it to prevent leaking variables into global scope, including Facebook’s Like button:

(function(d, s, id) {
  var js, fjs = d.getElementsByTagName(s)[0];
  if (d.getElementById(id)) return;
  js = d.createElement(s); js.id = id;
  js.src = "//connect.facebook.net/en_US/all.js#xfbml=1";
  fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));

A similar pattern, however, can also be applied anywhere else in your code. Consider, for example, a page with multiple elements sharing a similar element ID structure, varying only by the index used within the ID. We then want to iterate through these elements, giving them all a click-handler. The following jQuery-based code seems like it will do the job:

for (var i=0; i<n; i++) {
    var cachedi = i;
    $('#' + i + '-element').click(function() {
        $('#' + cachedi + '-popup').show();
    })
}

At first glance, this code looks fine. We made sure to cache the index so that cachedi gets the value of i at the time of the function creation rather than using i directly, which would use i at the time of function call (after the loop terminates and i is set to n). However, running the above code we still find every element trying to show the same popup, the one with the [n-1]-popup ID, since n-1 is the last value assigned to cachedi. The problem is that JavaScript hoists our declaration of cachedi out of the for loop into the enclosing function, so the same instance of the variable gets used in every single closure we generate inside the loop. There is an easy work-around, however:

for (var i=0; i<n; i++) {
    (function() {
        var cachedi = i;
        $('#' + i + '-element').click(function() {
            $('#' + cachedi + '-popup').show();
        })
    })();
}

Now our code works as expected. This is a handy trick for anyone wishing saner scoping in JavaScript. In fact, I’d prefer that RapydScript would scope things this way too, but that would contradict Python’s loop scoping. An alternative to this trick (and probably a more orthodox solution in JavaScript) would be to move the cachedi declaration inside the function making use of this closure.
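The difference between the two loops can be reproduced without jQuery or a page. The following is a minimal sketch, not code from the original post: the function names are made up for illustration, and the last variant assumes an engine that supports the ES2015 let keyword, which gives each iteration its own binding.

```javascript
// Build n callbacks in a loop and report which index each one "remembers".
function sharedVar(n) {
  var handlers = [];
  for (var i = 0; i < n; i++) {
    var cachedi = i; // hoisted: a single cachedi for the whole function
    handlers.push(function () { return cachedi; });
  }
  return handlers.map(function (h) { return h(); });
}

function wrappedInFunction(n) {
  var handlers = [];
  for (var i = 0; i < n; i++) {
    (function () {
      var cachedi = i; // a fresh cachedi for every invocation of the wrapper
      handlers.push(function () { return cachedi; });
    })();
  }
  return handlers.map(function (h) { return h(); });
}

function blockScoped(n) {
  var handlers = [];
  for (let i = 0; i < n; i++) { // assumption: ES2015 let, one binding per iteration
    handlers.push(function () { return i; });
  }
  return handlers.map(function (h) { return h(); });
}

console.log(sharedVar(3));         // [ 2, 2, 2 ]
console.log(wrappedInFunction(3)); // [ 0, 1, 2 ]
console.log(blockScoped(3));       // [ 0, 1, 2 ]
```

Every closure in the first function sees the one shared cachedi, which holds n-1 once the loop is done. The other two give each callback its own variable.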

Implicit Logic Is Not Your Friend

When creating RapydML and RapydScript, I had to make quite a few design choices – similar design choices other developers make when coming up with a new language, or even an API. For inspiration, I’ve looked into Python, existing JavaScript abstraction languages like CoffeeScript, and even JavaScript itself. While doing so, I’ve noticed a few features in CoffeeScript and related languages that should never have been borrowed from Ruby, and that Ruby in turn should never have borrowed from Perl. Most of these features relate to implicit logic, where the compiler makes assumptions for you. While they seem like nice shortcuts at first, more often than not, they harm your productivity more than they help. In fact, they’re not shortcuts at all, but rather branching paths in a maze that often lead to a dead end.

You’ve probably already been bitten by a few of these implicit “shortcuts” in the past, such as JavaScript’s “optional” semi-colons. If this feature didn’t exist, the compiler would complain about the missing semi-colon as soon as the page loads, and you would be able to fix the bug right away. But since it’s a “feature”, JavaScript tries to guess where to insert the semi-colon for you. As a rule of thumb, whenever you have the compiler guessing anything, you’re asking for trouble. You’ve probably already seen an example bug resulting from this logic, something along the lines of:

return
    {
        font: 'Verdana',
        size: 10,
        type: ['italic', 'bold']
    };

The intent here was to return the object literal, but JavaScript instead assumes a semi-colon at the end of the return statement and returns nothing. While I would disagree with such alignment of the return statement anyway, I can definitely understand the frustration a programmer writing this would go through. An easy solution would be to move the bracket to the same line as the return statement, but a novice programmer unaware of this, trying to follow a simple code convention that says a curly bracket must have the same indentation as its matching bracket, will likely let this one slip through the cracks.
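The effect is easy to check in isolation. This is a minimal sketch with hypothetical function names. It uses a single-property literal on purpose: the stray block then parses as a labelled statement and the function quietly returns undefined, whereas with several comma-separated properties the engines I am aware of reject the orphaned block with a syntax error instead.

```javascript
// A semi-colon is inserted right after `return`, so the braces that follow
// are parsed as a block containing the labelled statement `font: 'Verdana'`.
function brokenStyle() {
  return
  {
    font: 'Verdana'
  }
}

// Opening brace on the same line as `return`: the literal is returned.
function fixedStyle() {
  return {
    font: 'Verdana'
  };
}

console.log(brokenStyle()); // undefined
console.log(fixedStyle());  // { font: 'Verdana' }
```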

As you can see, implicit semi-colons prevented an easy-to-find bug we could have fixed at compile time at the cost of a more annoying one that we won’t find until several hours of debugging later. Some might argue that this is an easy bug to prevent if the programmer knows the language, but the truth is most bugs are easy to prevent if you design your code conventions around them. All code conventions do is train the eye to notice errors, and in this case JavaScript does the reverse. In most languages it’s either the semi-colon or the newline that finalizes a statement, and your eye is trained to look for them. In JavaScript, it’s the semi-colon, unless there is a newline, unless the statement is incomplete. Your eye can’t do that kind of logic, and your brain should be scanning for more serious bugs. This is a common trend I noticed with implicit logic: it prevents easily-detectable bugs at the expense of more devious ones later on.

Let’s look at a few more examples. CoffeeScript introduced optional parentheses (like Ruby and Perl). At first it seems like a cool feature, the code has less clutter in it and we save a character. The problems start occurring when we wrap function calls, or even use multiple arguments. For example, let’s say you’ve written some code and a few weeks later noticed a bug. You traced the bug to this line:

a b,c d

Without additional context, you have no way of telling what the bug is by glancing at this line, or even what the line is trying to do. Was d supposed to be a third argument to a and you accidentally omitted the comma? Was the comma placed there in error and b is a method that was supposed to take c(d) as an argument? Was the comma supposed to be between c and d instead? Had you used parentheses, the error would immediately be obvious without looking at the definitions of these variables. In fact, you probably wouldn’t have made it in the first place.

Sure, this example uses poor variable names, but if you’ve been developing for a while, you’ve probably noticed that unless there are strict code conventions, many projects’ variable names aren’t much better. And even if you do use good naming conventions, you’re not immune from this. Imagine if the line you were debugging looked like this instead:

my_function MyClass ['item']

Was the intent here to pass a new instance of MyClass (whose constructor was initialized using an array consisting of 1 string) or to pass the item attribute of MyClass? LiveScript takes this “feature” a step further, making commas implicit as well for non-callable arguments (strings, numbers, arrays), making things even more ambiguous. Take a look at the following line of valid LiveScript, and try to figure out who’s calling who with what arguments:

a b c 1 [d 2] 3 e [f 'g' h] 4 i [j 5] k 'l' m

This is great for code golf and maybe riddles, but I definitely don’t want to see this kind of code in my project.

Shall we continue with more examples? How about implicit returns. Automatically returning last-performed operation of a function seems like a great idea, because we can’t be bothered with putting 6 extra characters at the bottom of our function to signify a proper return. Too bad you (or another developer) could miss the subtle returns when modifying the function later.

For example, let’s imagine you have a function with an implicit return whose return value is used by another function. Several months later you notice a bug due to the function not resetting some global setting or a setting in the class it belongs to. Being a busy guy, you delegate this task to another developer. Sure enough, he goes and fixes the bug by setting that global/class setting correctly at the end of the function. Too bad he forgot to check that another function was using this function’s return value. If you’re lucky, the code will break as soon as it runs, and the developer will notice his error and fix it before submitting the change. If you’re not lucky, the affected logic won’t get triggered during the test (not all tests have 100% coverage), the developer will submit broken code and you will pat him on the back for doing a good job.
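JavaScript itself has a mild form of this hazard in expression-bodied arrow functions, which the scenario above can be sketched with. This is only an illustration under that assumption: the settings object and the totalV1 to totalV3 names are hypothetical, and arrow functions are not the CoffeeScript-style "last expression" returns the post is criticizing, just the closest native analogue.

```javascript
const settings = { dirty: true };

// Original version: an expression body, so the sum is returned implicitly.
const totalV1 = (prices) => prices.reduce((sum, p) => sum + p, 0);

// Later "fix" that resets a setting at the end. The body is now a block,
// and the implicit return has silently disappeared.
const totalV2 = (prices) => {
  prices.reduce((sum, p) => sum + p, 0);
  settings.dirty = false;
};

// The same fix with the return spelled out.
const totalV3 = (prices) => {
  const sum = prices.reduce((acc, p) => acc + p, 0);
  settings.dirty = false;
  return sum;
};

console.log(totalV1([1, 2, 3])); // 6
console.log(totalV2([1, 2, 3])); // undefined
console.log(totalV3([1, 2, 3])); // 6
```

Nothing fails at the point of the edit. The breakage only shows up in whichever caller still expects a number back.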

Even if you’re perfect, and never make mistakes, code is rarely developed in isolation. It’s in your interest to make code easy to understand to other developers, not just yourself. But if you’re like the rest of us, mortals, you will probably break your own code if you have to deal with it several months later. As another example, let’s imagine you have a long function with the following format (assuming implicit returns):

def fun(args):
    ...
    some_var = ...
    ...
    if SOME_GLOBAL_VAR == True:
        if some_var:
            ...
        else:
            ...
    else:
        ...

Let’s also imagine that you’re calling it from multiple places, one of which uses its return value for additional computations. Now suppose you’ve modified the logic in one of the other places calling this function (one that previously didn’t need the return value, and that always sets SOME_GLOBAL_VAR to True before calling fun()) such that it now needs to know if some_var got set or not. “No problem,” you tell yourself, slapping “return some_var” at the end of the outer “if” block, breaking the implicit return that one of the other functions was expecting.

There are countless other examples of implicit logic in languages that seemed like a great idea at first, but with time proved to do more harm than good. Some examples are:

  • JavaScript/Perl functions automatically discarding extra arguments
  • JavaScript/Perl functions automatically setting missing arguments to undefined
  • JavaScript implicitly converting operand types when using + operator
  • JavaScript implicitly converting unrelated types when using ==
  • JavaScript/C++ making brackets optional for single-line conditional statements
  • Switch statements without break in JavaScript/C++ automatically falling through to next case
  • Object attributes defaulting to public in Python
  • JavaScript assuming global scope when var isn’t used
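Several of the JavaScript items in this list can be observed directly. The snippet below is a small sketch with made-up function names (add, labelsFor), covering extra and missing arguments, operand conversion with + and ==, and switch fall-through.

```javascript
function add(a, b) { return a + b; }

console.log(add(1, 2, 3));      // 3: the extra argument is silently discarded
console.log(add(1));            // NaN: b is undefined, and 1 + undefined is NaN
console.log(1 + '2');           // 12 (the string '12', not the number 3)
console.log(0 == '');           // true: the empty string is converted to 0
console.log(null == undefined); // true
console.log(0 === '');          // false: strict equality does not convert

function labelsFor(n) {
  var seen = [];
  switch (n) {
    case 1:
      seen.push('one'); // no break here, so execution falls through
    case 2:
      seen.push('two');
      break;
    default:
      seen.push('other');
  }
  return seen;
}

console.log(labelsFor(1)); // [ 'one', 'two' ]
console.log(labelsFor(2)); // [ 'two' ]
```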

There are very few cases when implicit logic doesn’t cause confusion. A couple that come to mind are tuple packing/unpacking in Python and implicit boolean typecasting in many languages’ if statements without having to say == True. As a rule of thumb, if you’re asking yourself whether you should make something implicit, you probably should not.

To summarize, here are all the reasons why implicit anything is bad:

  • It saves time when writing the code at the expense of time spent debugging it
  • It makes code more ambiguous to other developers as well as yourself in the future
  • In cases when it relies on compiler inferring your intent, it can be inferred incorrectly (or rather your assumptions about how it will be interpreted could be incorrect)
  • It makes the code depend on nearby context, increasing the likelihood that something will break when you add more logic
  • It hides some of the logic from untrained eye, increasing the likelihood that something will break when you add more logic and you won’t notice it
  • It hides some of the logic from untrained eye, increasing the likelihood that something will be lost in translation when refactoring the code, or rewriting it in a different language

Even if you never make mistakes, you probably have other developers on the team. It’s in your interest to make the code clear to them, not just yourself. You want to decrease ambiguity, and implicit logic does the opposite.

RapydScript II

A while back I mentioned that I wanted to rewrite RapydScript to rely on an internal AST (Abstract Syntax Tree) structure rather than PyMeta, which would give it several advantages, including code transformations prior to output, more consistent parsing, and better error detection/handling.

I have looked at several tools for helping me accomplish this. One of the most promising seemed to be CoffeeScript, that is until I actually started converting its source code into RapydScript (via an automated AST-to-RapydScript generating script I put together). Having discovered that the output is very ugly and unpythonic, I gave up on the idea. After some more searching, however, I found the right tool for the job. Ironically, it was UglifyJS, a tool designed to make your JavaScript less readable.

UglifyJS code-base is very clean, and although it uses its own format that doesn’t directly translate into Python, the logic is well-structured and it’s easy to see where the code could be replaced by Python/RapydScript constructs. The code does a good job being explicit rather than relying on various JavaScript hacks, like CoffeeScript does.

Using UglifyJS as a base, I rewrote the compiler to convert RapydScript into native JavaScript. You can get it here: https://github.com/atsepkov/RapydScript. There are a couple minor features from the original compiler RapydScript II doesn’t support yet, but it’s already better than the original in several ways:

  • It compiles code much faster (on the order of milliseconds instead of seconds)
  • It handles leading whitespace much better now (same way as native Python would), so it will not complain if comments or secondary lines mix spaces and tabs
  • It will properly localize outer-scope variables when using module wrapper
  • It’s a single-pass compiler that uses consistent logic to identify tokens, whereas the original RapydScript would sometimes try to perform the same logic in Python stage that it would later do using PyMeta, resulting in potential parsing inconsistencies in certain special cases
  • It supports some notations that the original does not (such as the code shown later in this post)
  • It handles if val in array the same way Python would, rather than JavaScript, checking for existence of the value rather than the index
  • It’s more resistant to bad code, reporting errors that are more relevant and identifying the exact line/column number that caused the error
  • It’s more resistant to valid code that doesn’t follow typical syntax conventions, parsing it correctly rather than erroring out
  • The class-parsing logic has been improved to allow better durability and some requirements of the original compiler no longer apply (the __init__ function no longer needs to be first in the class body, for example, classes can now be nested and even declared inside functions)
  • It is written in JavaScript, which means it can eventually be ported to native RapydScript, it also means it can be modified to run in a browser, and that we can actually test function rather than structure in our unit and integration tests
  • It’s AST-aware, so it doesn’t need to generate excessive parentheses in the output for safety like the original compiler
  • Since the compiler is AST-based, it doesn’t need to generate output at the time of parsing, which makes the compiler more flexible and allows various code transformations and analysis to take place before the output is produced. This also makes it easier to make changes to the way output is generated in the future, should we need to do so.

The compiler already supports most of the features of the original, the only features that aren’t yet supported are:

  • (UPDATE: these now work) Chained Comparisons (1 < x < 5)
  • (UPDATE: these now work) List Comprehensions
  • (UPDATE: these now work) Inline Functions

Everything else should work at least as well as the original, and in some cases much better. For example, you can now do the following:

(def(a):
    return a
)(1)

Here is a list of other features that the original compiler does not support:

  • Logic such as 5 in [5,6,7] now correctly returns True
  • stdlib is no longer required for using for loops, len() or print(), more functions will be moved out of stdlib with time
  • Negative array indices now compile to arr[arr.length-1] instead of using slices, this allows assignment to them and not just referencing them (you still can’t use variables as negative indices)
  • Simple minifier is now built into the compiler (it will remove whitespace and unnecessary punctuation)
  • Pythonic imports are now supported, use them via --namespace-imports flag (experimental, some code will probably break, star-imports are not supported yet, you can set variables in global scope but not modify them)
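For context on the membership and negative-index items in the lists above, this is how plain JavaScript behaves without any help from a compiler. It is a sketch of the underlying semantics only, not the compiler's actual output.

```javascript
var arr = [5, 6, 7];

// JavaScript's `in` tests keys (for arrays, the indices), not the values.
console.log(5 in arr);              // false: there is no index 5
console.log(1 in arr);              // true: index 1 exists
// Membership by value, which is what Python's `in` means for a list.
console.log(arr.indexOf(5) !== -1); // true

// A negative index is just a missing property...
console.log(arr[-1]);               // undefined
// ...so the last element has to be addressed relative to the length,
console.log(arr[arr.length - 1]);   // 7
// which, unlike the result of a slice, is also a valid assignment target.
arr[arr.length - 1] = 70;
console.log(arr);                   // [ 5, 6, 70 ]
```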

I can’t yet guarantee that your code will work correctly with RapydScript II, but all RapydScript II test cases (which are more rigorous than original RapydScript tests, in that they also test that the code performs correct logic after compilation) as well as the examples from original (with slight modifications to remove the unsupported features) seem to work. And as a present for those afraid of GPL, I’m releasing the entire compiler for RapydScript II under Apache 2.0 license.

RapydScript by Example

With RapydScript gaining popularity, I figured it’s about time for a blog post talking about an example app. After all, I learn fastest by observing examples, many others probably do as well.

Let’s imagine we wanted to write a game using RapydScript. Since I’m not feeling particularly creative today, let’s concentrate on an existing game, rather than coming up with an idea from scratch. For this example, I’ll use a game called Chip’s Challenge.


Looking at the original game, we quickly notice that the environment is a grid of blocks. Each block type has a corresponding image, and some have an effect on the character stepping on them. If we assume that we’ll use a canvas element for drawing the grid, then each block will need to track its coordinates in the grid and the URL of the image corresponding to this block. Let’s assume we created a canvas with the id canvas in our document using HTML. We now simply need to create a reference to its drawing context in RapydScript:

CANVAS = document.getElementById('canvas').getContext('2d')

Since canvas is not only an object, but also a module containing methods for creating canvas-compatible objects (like images and patterns), we will use a global to reference it rather than properly abstracting it. Next, let’s create a simple block:

BLOCK_SIZE = 32 #pixels
NUM_X_BLOCKS = 21 # horizontal blocks on a field
NUM_Y_BLOCKS = 21 # vertical blocks
NORMAL_BLOCK = 0 # enum identifying block type

class Block:
    def __init__(self, x, y, image_url):
        self.x = x
        self.y = y
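        img = new Image()
        img.src = image_url
        self.type = NORMAL_BLOCK
        # createPattern() needs a repetition argument; the image generally
        # has to finish loading before the pattern is usable
        self.blockPattern = CANVAS.createPattern(img, 'repeat')

    def redraw(self):
        CANVAS.save()
        # the canvas context property is fillStyle (there is no fillColor)
        CANVAS.fillStyle = self.blockPattern
        CANVAS.fillRect(self.x*BLOCK_SIZE, self.y*BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE)
        CANVAS.restore()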

Now that we have a basic block, let’s create advanced blocks that can affect Chip in some way. For example, Chip’s Challenge has a water block that Chip can drown in unless he’s wearing flippers. There is also an ice block that makes Chip slide to the next one unless he’s wearing ice skates. Let’s create these two blocks:

WATER_BLOCK = 1
ICE_BLOCK = 2

class WaterBlock(Block):
    def __init__(self, x, y):
        Block.__init__(self, x, y, 'images/water.jpg')
        self.type = WATER_BLOCK

    def effect(self, unit):
        if not unit.hasItem('flippers'):
            unit.die() # drown

class IceBlock(Block):
    def __init__(self, x, y):
        Block.__init__(self, x, y, 'images/ice.jpg')
        self.type = ICE_BLOCK

    def effect(self, unit):
        if not unit.hasItem('skates'):
            unit.move(unit.direction) #slip

Let’s add a character that can move around. We will use Block as a base class since a character is really just another block drawn on top of the existing block. Unlike a regular block, the picture could change depending on the direction the character is facing:

LEFT = 0
UP = 1
RIGHT = 2
DOWN = 3

class Character(Block):
    def __init__(self, x, y, images):
        self.x = x
        self.y = y
        # one pattern per facing direction, indexed by LEFT/UP/RIGHT/DOWN
        self.patterns = [CANVAS.createPattern(img, 'repeat') for img in images]
        self.direction = DOWN # default facing direction
        self.blockPattern = self.patterns[self.direction]

    def move(self, direction):
        self.direction = direction
        self.blockPattern = self.patterns[direction]
        if direction == DOWN:
            self.y += 1
        elif direction == UP:
            self.y -= 1
        elif direction == LEFT:
            self.x -= 1
        else:
            self.x += 1
        self.redraw()

This character can now serve as the base class both for Chip himself and the monsters that roam around. If we wanted to create a typical monster, for example, we would write:

MONSTER = 10

class Monster(Character):
    def __init__(self, x, y, redrawCallback):
        images = [new Image() for i in range(4)]
        images[DOWN].src = 'image/monsterDown.jpg'
        images[UP].src = 'image/monsterUp.jpg'
        images[LEFT].src = 'image/monsterLeft.jpg'
        images[RIGHT].src = 'image/monsterRight.jpg'
        self.type = MONSTER
        Character.__init__(self, x, y, images)

        # have the monster move randomly every 0.5 seconds
        main = self
        # Math.floor keeps the direction in 0..3; Math.round could also produce 4
        window.setInterval(def(): main.move(Math.floor(Math.random()*4)); redrawCallback();, 500)

    def die(self):
        pass    # monsters don't die

Similarly, let’s create Chip:

CHIP = 11

class Chip(Character):
    def __init__(self, die=False):
        # center Chip
        self.x = int(NUM_X_BLOCKS/2)+1
        self.y = int(NUM_Y_BLOCKS/2)+1
        self.items = {}

        if not die:
            self.deaths = 0
            # keep the images on the instance so a respawn can reuse them
            self.images = [new Image() for i in range(4)]
            self.images[DOWN].src = 'image/chipDown.jpg'
            self.images[UP].src = 'image/chipUp.jpg'
            self.images[LEFT].src = 'image/chipLeft.jpg'
            self.images[RIGHT].src = 'image/chipRight.jpg'
            self.type = CHIP
        Character.__init__(self, self.x, self.y, self.images)

    def die(self):
        self.deaths += 1
        self.__init__(True)

    def hasItem(self, item):
        return self.items[item]

    def getItem(self, item):
        self.items[item] = True

We now have most of the basics done. Let’s start putting the pieces together by creating the actual class that renders these. Our main class (which we will call Field) will need to create the grid and populate it. We’ll pass it a matrix of blocks to create the field, and the number of monsters to spawn.

class Field:
    def __init__(self, grid, numMonsters):
        self.grid = grid
        self.monsters = []
        for i in range(numMonsters):
            monster = Monster(
                Math.floor(Math.random()*NUM_X_BLOCKS), # 0..NUM_X_BLOCKS-1
                Math.floor(Math.random()*NUM_Y_BLOCKS), # 0..NUM_Y_BLOCKS-1
                self.redraw
            )
            self.monsters.append(monster)
        self.chip = Chip()
        main = self
        moveChip = def(event):
            main.chip.move(event.keyCode - 37)
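            # compare the block's type (not the block object) to the enum;
            # this loop assumes effect() eventually moves Chip onto a normal block
            while main.grid[main.chip.x][main.chip.y].type != NORMAL_BLOCK:
                main.grid[main.chip.x][main.chip.y].effect(main.chip)

            for monster in main.monsters:
                if monster.x == main.chip.x and monster.y == main.chip.y:
                    main.chip.die()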
            main.redraw()
        window.addEventListener("keydown", moveChip)

    def redraw(self):
        for x_array in self.grid:
            for block in x_array:
                block.redraw()
        for monster in self.monsters:
            monster.redraw()
        self.chip.redraw()
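The direction arithmetic the game relies on can be checked on its own in plain JavaScript. This is a minimal sketch, assuming the four directions are numbered 0 through 3 (which is what the event.keyCode - 37 expression and the four-element image arrays both depend on). The helper names directionFromKey, randomDirection, and step are hypothetical and do not appear in the game code.

```javascript
var LEFT = 0, UP = 1, RIGHT = 2, DOWN = 3;

// Arrow keys report key codes 37 (left), 38 (up), 39 (right), 40 (down).
function directionFromKey(keyCode) {
  var direction = keyCode - 37;
  return direction >= LEFT && direction <= DOWN ? direction : null;
}

// Math.floor(r * 4) yields 0..3 for r in [0, 1); Math.round could also yield 4.
// The random source is a parameter only so that the function can be tested.
function randomDirection(random) {
  random = random || Math.random;
  return Math.floor(random() * 4);
}

// Grid offset for each direction, indexed by the direction constant.
var OFFSETS = [
  { dx: -1, dy: 0 }, // LEFT
  { dx: 0, dy: -1 }, // UP
  { dx: 1, dy: 0 },  // RIGHT
  { dx: 0, dy: 1 }   // DOWN
];

function step(pos, direction) {
  var offset = OFFSETS[direction];
  return { x: pos.x + offset.dx, y: pos.y + offset.dy };
}

console.log(directionFromKey(40));       // 3
console.log(directionFromKey(65));       // null
console.log(step({ x: 10, y: 10 }, UP)); // { x: 10, y: 9 }
```

Returning null for other keys is a design choice of this sketch. It lets the caller ignore key presses that are not arrow keys instead of indexing past the end of the image array.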

We now have an almost complete game (although an inefficient one, since it redraws the entire field every time a monster or the user generates an event). The only thing left to do is to create blocks that provide Chip with flippers and skates. I will leave that exercise to the reader. These items should disappear after being picked up, so I recommend using the Character base class for them and making them die() after Chip runs into them. As you can see, RapydScript code is easy to follow, which is the main benefit of the language.

Why GPL?

Recently I received an email suggesting that I change the license for RapydScript to something other than GPL. I understand that no matter how powerful, a language is useless if no one wants to use it. That’s why it makes sense not to keep it proprietary, it might have worked for Oracle back in 1977, but it will no longer work today. You might be able to convince corporations to use your platform, but not people working on hobby projects. Likewise, the GPL license might scare those who intend to write commercial software. This is not my intent, I do not wish to hurt either group, which is why the RapydScript libraries are already licensed under Apache license. Part of the fear of the GPL license comes from misunderstanding its terms (especially by those not used to dealing with open-source). The work created using a product under GPL license is not itself subject to GPL license. This is important, GPL is not a cancer that keeps spreading through your tools to your product. Rather, GPL is a way to protect your own work from getting stolen and repackaged as someone else’s for profit (i.e. Cedega building upon Wine). Basically, GPL is a way to protect open-source work from plagiarism.

This protection becomes even more important when the plagiarist has the ability to hurt your project in some way. He could steal your product and then damage your ability to work on your own branch of it in an attempt to get rid of the competition. For example, imagine that a fictional ACME Corporation writes an operating system. People try it out, they like it, and soon all devices end up running it. Years go by, ACME becomes rich, and many of the devices evolve. Instead of rewriting the OS for new devices, however, ACME decides to keep patching the original OS. They also realize that they could outsource part of the development to further cut costs. Eventually ACME OS code becomes an unmaintainable mess, and the software itself becomes buggy. Then a few guys, frustrated by the bugs in ACME OS, develop an alternative operating system in the basement of their home, call it FREE OS, and release it as open-source. FREE OS immediately becomes popular. ACME, noticing a loss in its profits, decides to repackage FREE OS as a new version of their own. This in itself might hurt FREE OS by stealing its users (and removing the incentive for developers to work on it). But even if FREE OS already got enough traction such that people stick with it, ACME, having much bigger pockets, could go to hardware manufacturers and pay them to add DRM capability using piracy prevention as an excuse (ironically, piracy is exactly what ACME is doing). ACME could then add a proprietary driver to their OS that is able to play files encrypted with this DRM.

Fortunately, RapydScript is not an OS, and as a language it’s not in danger of being affected by a DRM. That doesn’t mean it can’t be ruined by a corporation, however. The compiler aims to generate fast, lightweight code, compatible with various JavaScript obfuscation mechanisms (it should in theory be compatible with advanced mode of Google Closure, although I haven’t tested this). In fact, the new version of Grafpad itself uses an obfuscator based on RapydScript code (which I haven’t released yet) in combination with UglifyJS. The only thing preventing a 3rd party from coming in and selling their own proprietary obfuscator/optimizer based on RapydScript code without contributing anything back is the GPL license of the compiler. This in itself would not actually hurt the language, but if they start breaking backwards compatibility with RapydScript, that could in fact hurt the community. Once the RapydScript community grows and the language gains some initial popularity, this will not be as much of an issue anymore (the obfuscator will be unlikely to gain traction without full support for RapydScript). At that time I won’t mind changing the license to Apache for the entire project.

I’m not strongly attached to the GPL license, and the libraries are already licensed under Apache, allowing companies to build their own private implementations on top of them. Only the compiler itself is licensed under GPL, and I want to encourage that code to stay open to benefit everyone. GPL seemed like the right license for the job. So far the arguments I hear against GPL from other developers are either due to it being used in the wrong places (the APIs/libraries) or unfounded paranoia due to misunderstanding how the license works (people assuming that work created via GPLed product is subject to GPL as well).

Eventually, I do plan to release both, RapydML and RapydScript under Apache license. If you believe I should do so now, I would love to hear your argument. As of yet, I do not see a legitimate case of how the GPL license could hurt a company deciding to use RapydScript (aside from their legal department getting paranoid about the ‘GPL’ acronym). If they wish to use RapydScript as a compiler to create proprietary work, the GPL license does not affect their own code. If they wish to reuse RapydScript libraries in their proprietary code, the Apache license of the libraries will allow them to do that. If they wish to make changes to the compiler that will only be used internally by the company, the GPL license will not affect them. If they wish to release a stand-alone tool to be used with RapydScript, RapydScript’s license does not apply. If they wish to make changes to the compiler that would affect the rest of the community, then they have to release the source code for these changes.

RapydScript in a Nutshell

My last post mentioned RapydScript, a JavaScript variant with Pythonic syntax which I’ve come up with to speed up web-development. I haven’t had much free time to post since then, but I have been updating RapydScript this whole time. The bitbucket repository for the language started drawing some followers, and I am now getting questions in my inbox regarding the feasibility of developing large-scale applications in RapydScript. To address these concerns, and explain how to leverage RapydScript, I decided to write this post.

Before starting your project, it’s important to understand what RapydScript is and what it is not. RapydScript does not aim to emulate Python in JavaScript like most other Python-to-JS compilers such as Pyjamas, Pyjaco, and Skulpt. RapydScript to Python is what CoffeeScript is to Ruby (actually, RapydScript and Python have more in common). The point is, it’s important to understand that when writing RapydScript, you’re still using JavaScript. The code doesn’t get abstracted into a sandbox or wrapped in special types. This has a few advantages and disadvantages.

Advantages

The main advantage of this approach is closer integration with JavaScript. Your code will load and run faster than similar code written for one of the other Python compilers (except PyvaScript, which will work just as fast). Additionally, your code doesn’t need hacks or wrappers to invoke or create native JavaScript and DOM objects: anything JavaScript has access to, RapydScript has access to. Drawing its power from JavaScript’s prototype, RapydScript’s class inheritance system is more intuitive than Python’s. RapydScript can create object literals (similar to C structs), create anonymous multi-line functions, and do some other goodies that pure Python cannot. You can use $ in your variable names and use jQuery (or any other JS framework) without hacks. You can even debug your code by reading the generated JavaScript, which works and looks almost the same way as the original code. Basically, RapydScript has all the advantages of native JavaScript. In fact, it’s very easy to debug RapydScript using Chrome’s Developer Tools or Firefox’s Firebug; other Pythonic solutions don’t have this benefit.

The other advantage is that while RapydScript does have a standard library, it does not rely on it to be usable. This means that even if I decide to abandon the project tomorrow for some reason, you will not be stuck with a half-functional solution for your front-end. Even with a limited subset of Python library implemented, RapydScript is 100% usable and capable of building large-scale web apps, because its main advantage isn’t in its standard library, but in cleaning up JavaScript syntax. In fact, you can forego RapydScript’s stdlib.js in your project altogether and replace it with underscore.js. You will also automatically benefit from new JavaScript features as they appear and can continue using the compiler regardless of how old it is.

Disadvantages

The disadvantage of RapydScript (when compared to other Python-to-JS compilers, not counting PyvaScript) is also its close integration with JavaScript. RapydScript does not aim to catch errors JavaScript ignores. If you access an out-of-bounds cell in an array, you will get undefined, not an IndexError (likewise, 1/0 is Infinity rather than a ZeroDivisionError). You can’t use negative indexes to traverse arrays in reverse (you can use negative indexes via list.__getitem__(), however). It turns out, however, that for properly written code, these disadvantages aren’t a problem. As long as you understand that the language is not pure Python, and that the coding style is a bit different, you will not have problems with RapydScript.
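For comparison, here is a minimal sketch of the plain Python behavior that the paragraph above says RapydScript does not reproduce. It is ordinary Python, not RapydScript, and the helper name describe is only there for illustration.

```python
# Plain Python behavior, for comparison with the JavaScript semantics described above.
items = [10, 20, 30]

def describe(action):
    """Run a zero-argument callable and report its value or the exception name."""
    try:
        return action()
    except Exception as exc:
        return type(exc).__name__

out_of_bounds = describe(lambda: items[5])   # Python raises; JavaScript yields undefined
division = describe(lambda: 1 / 0)           # Python raises; JavaScript yields Infinity
last_item = describe(lambda: items[-1])      # Python counts from the end of the list
print(out_of_bounds, division, last_item)
```

Code that relies on catching these exceptions for control flow is the kind that needs restructuring before it will behave the same way once compiled to JavaScript.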

Why is it better than JavaScript?

If RapydScript has the same advantages and disadvantages as JavaScript, the question you might then have is “Why use it instead of JavaScript?”. The answer is “Because it brings many of the same benefits Python programmers already enjoy into JavaScript”. These benefits include all of the following and more:

  • Classes and Inheritance
  • Python Standard Library (only a portion of it is included so far, but more will be with time)
  • Clean, easy-to-read, Pythonic syntax
  • Better Variable Scoping (and Variable Shadowing)
  • Implicit Tuple Packing/Unpacking
  • Better Optional Argument Implementation
  • List comprehensions
  • Ability to import multiple modules into a single chunk of code (allowing easier code reuse)

If you spend a few minutes coding in RapydScript, or even checking out its examples, you will notice that once the ugly parts are removed from JavaScript, it’s actually a very beautiful language (allowing for code that is even cleaner than its Python equivalent). And that’s exactly what RapydScript does: it cleans up the language so you can enjoy its true potential.

So who should use RapydScript?

The audience that will enjoy it the most is probably Python developers who want to write JavaScript code. There is a reason many Python companies choose to do their JavaScript development in CoffeeScript instead of using one of the existing Python-to-JavaScript compilers. The reason is performance, debugging ability, and integration with other JavaScript. RapydScript shares all of the advantages of CoffeeScript without introducing messy and confusing syntax.

RapydScript in Commercial Projects

I have no problem with people profiting from the code they write in RapydScript. Both your code and the JavaScript you generate are yours to do whatever you want with. RapydScript itself is licensed under the GPL, but all of its current libraries are covered by the Apache license. This is to allow you to import those libraries into your project without being forced to open-source your entire front-end. I ask that other developers submitting new libraries for RapydScript also use a permissive license, but that is up to the developer.

Pyjamas Alternatives for Web Development

In my last post I mentioned that I’m switching away from Pyjamas. In this one, I will explain the alternatives I looked into, as well as the pros and cons of each one. While doing my research (which started almost half a year ago), I’ve stumbled upon multiple Python-inspired alternatives to JavaScript, many of which have been abandoned, and only a few of which have reached the stage where they’re suitable for large projects. Out of those I have settled on a solution that works for me, but there is no perfect framework/language, so your choice could be different than mine. Let’s look at a few frameworks and decide how usable each one is for writing a full application.

Pyjamas

While I might have issues with how Pyjamas approaches web development, I have to admit that it has gotten quite far, and it’s quite possible to write a usable application with it.

Verdict: usable – if overhead is not a problem

Skulpt

Skulpt does not process your Python code at all until execution time; this is a unique approach that I haven’t seen any other framework take. It’s essentially like a browser-based eval() for your Python. This is a neat concept, since it can be used to extend another framework to support Python-like eval() (something that even Pyjamas doesn’t do), allowing your script to write and execute code on the fly. While I’m not a big fan of the eval() function, this would definitely be neat for a framework that aims to achieve complete Python compatibility. On the other hand, one of Skulpt’s major disadvantages is having to include the entire “compiler” as a script in your page. If this project ever grows to have the same functionality as Pyjamas, the overhead of loading the extra logic could be enormous. The second disadvantage is speed: Python interpreted and executed in the browser on the fly will never be as fast as even boilerplate-based Pyjamas, let alone pure JavaScript. The final problem is that you can’t obfuscate/compress your code at all, since the interpreter needs to make sense of your Python code (including the whitespace), although this might be a non-issue for open-source projects.

Verdict: not usable

pyxc-js

Like Skulpt, this compiler is unsuitable for any large-scale web development, but it has an elegant import mechanism, which is something many other frameworks/compilers lack. Just like Skulpt can be used to implement eval(), this compiler can be used to implement proper import in your compiler (I know Pyjamas has proper import, but it’s heavily integrated into the rest of the framework). This project has been abandoned and the author does not respond to emails. Aside from its import mechanism (which is based on a directed graph the compiler builds at compile time, eliminating any unused and repeated imports), this project doesn’t have much to offer. The documentation is lacking, it took me a while (and some tweaks) to be able to compile anything, the included test cases don’t work, and the generated code is rather messy (it often spits out multiple semi-colons at the end of the line and blank lines get semicolons too, and if I remember correctly there are occasional bugs in how your code gets compiled). On the plus side, it offers an internal compression mechanism that can shorten your code further (although I’m not sure how much I trust it, given the quality of the non-compressed code it generates).

Verdict: not usable

PyCow

This was the first framework aside from Pyjamas that I felt could support large projects. PyCow compiles Python code into MooTools-based JavaScript. More importantly, its source code is cleanly laid out and the original developer is very responsive and helpful (despite having abandoned the project). It relies on templates for many of its conversions rather than hard-coding them in the source or implementing alternative names for JavaScript functions. For example, list.append() gets converted to list.push() at compile time using a template specified in a hash table. This is a great idea, since it makes the compiler code cleaner and the final output stays closer to real JavaScript, introducing less overhead (and less need for a large stdlib). The disadvantage of this approach is that we have no good way of knowing whether a certain method is part of our language and needs to be replaced by its JavaScript equivalent, or if it belongs to another API and should be left alone. Here is an example where replacing the append() would break our code:

container = $('#element'); // jQuery used to create a container of elements
...
container.append(item);    // replace this append() and jQuery will complain
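To make the problem concrete, here is a minimal sketch of a purely name-based rename. It is not PyCow’s actual code: the METHOD_TEMPLATES table and the rename_methods helper are hypothetical, and a regular expression stands in for the AST pass a real compiler would use.

```python
import re

# Hypothetical template table in the spirit of a hash of method conversions.
METHOD_TEMPLATES = {"append": "push"}

def rename_methods(source):
    """Rename every `.name(` call listed in METHOD_TEMPLATES, whatever the receiver is."""
    def swap(match):
        return "." + METHOD_TEMPLATES[match.group(1)] + "("
    names = "|".join(re.escape(name) for name in METHOD_TEMPLATES)
    return re.sub(r"\.(" + names + r")\(", swap, source)

list_call = rename_methods("items.append(item)")        # intended conversion
jquery_call = rename_methods("container.append(item)")  # unintended: jQuery has no push()
print(list_call)
print(jquery_call)
```

Both calls get rewritten because the rule only sees the method name. Without type information about the receiver, the compiler cannot tell a list from a jQuery container.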

PyCow is also MooTools-based, and I’d prefer a language that doesn’t tie me to a certain framework. I investigated rewriting it so that it generates classes using pure JavaScript, but later realized it would not be worth the effort. One other minor pet peeve is that PyCow was written for Python 2.5, which happened to be around the same time Python developers were rewriting their AST parser. As a result, PyCow uses a transitional AST package and needs to be rewritten to support the later one that came after 2.6 (2.5 is still closer to the current AST, however, than the AST parser Pyjamas uses from the compiler module, which has been deprecated as of Python 2.6). This framework also doesn’t support any importing of modules, but I managed to fix that in my local version by ripping out the import mechanism from pyxc-js. Unfortunately, I’ve moved on since then, abandoning my PyCow enhancements (let me know if you’re interested in continuing from where I left off, however; I can send you the source code).

Verdict: usable

py2js (Variation 1)

There are at least 3 different frameworks by this name, and they’re only vaguely related to each other. The first one has been abandoned since 2009, but it impressed me for several reasons. First of all, it takes PyCow’s templating one step further, allowing you to map virtually any function/method using more complex filtering. For instance, %0 represents the class that the method is being called on, %1 represents the first argument, and %* represents all of the arguments. You could use this to add very powerful templating. For example, if you want to output a list of elements as a comma-delimited string in Python, you could use join() as follows: ', '.join(list). In JavaScript, however, join() is a method of array/list (which actually makes more sense to me) rather than a string, requiring you to do list.join(', ') instead. But say you wanted to auto-convert your Python joins to JavaScript: all you would have to do is add the following rule to your template hash:

'%0.join(%1)' : '%1.join(%0)'

I was able to extend this variation of py2js to support multiple Python features with JavaScript alternatives this way. As mentioned earlier, however, this method is susceptible to renaming functions whose names you have no control over.
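As a rough illustration of how a rule like the join() one could be applied, here is a minimal sketch. The matching logic is my own simplification and not the py2js implementation: it works on a single expression string with regular expressions, whereas the real compiler operates on the AST. The names TEMPLATES, template_to_regex and apply_templates are hypothetical.

```python
import re

# The rule from the text; everything else in this sketch is an assumption.
TEMPLATES = {"%0.join(%1)": "%1.join(%0)"}

def template_to_regex(pattern):
    """Turn '%0.join(%1)' into a regex with one named group per placeholder."""
    pieces = []
    for part in re.split(r"(%\d)", pattern):
        if re.fullmatch(r"%\d", part):
            pieces.append("(?P<arg" + part[1] + ">.+?)")  # placeholder becomes a group
        else:
            pieces.append(re.escape(part))                # literal text matches as-is
    return re.compile("".join(pieces))

def apply_templates(expression):
    """Rewrite the expression with the first template that matches it entirely."""
    for pattern, replacement in TEMPLATES.items():
        match = template_to_regex(pattern).fullmatch(expression)
        if match:
            return re.sub(r"%(\d)",
                          lambda m: match.group("arg" + m.group(1)),
                          replacement)
    return expression  # no template applies, so leave the code alone

converted = apply_templates("', '.join(names)")
untouched = apply_templates("len(names)")
print(converted)
print(untouched)
```

Note that this sketch has the same weakness mentioned above: it would happily rewrite any join() call, including one that belongs to an unrelated API.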

This compiler relies on templates heavily, introducing another advantage most other compilers lack. Once the code has been converted to AST, it can be manipulated almost entirely using templates (with little need to write additional code). With some tweaks, it can become a universal source-to-source compiler (well, universal is a bit strong of a word here, statically typed languages are a whole other animal). Imagine being able to convert Python to Perl, Ruby, or any other dynamically typed language just by generating a map of logic equivalents. Something like this, for example, could allow you to convert the syntax of a Python function to Perl (block would then further get interpreted by similar templates for loops, conditionals, etc.):

### Input Template
def <method>(%*):
    <block>

### Output Template
sub <method>{
    my (%*) = @_;
    <block>;
}

Admittedly, the fancier you get with the language (e.g. using eval()), the more likely you are to run into problems with the template breaking. This compiler was inspired by py2py, another project designed to clean up Python code. In theory, any input language can be supported as well, as long as you write your own AST parser that converts the code to a Pythonic AST (or a subset of it). This is a fun project in itself, and I might continue it when I have more time. As a compiler, however, this project is not yet ready for prime time, lacking a lot of functionality, including classes.

Verdict: not usable

py2js (Variation 2)

The only similarity between this py2js and the previous one is the name. They share nothing else (well… they do now, keep reading). Variation 2 just happens to be another project by the same name as variation 1. In fact, even the compile mechanism for this py2js is very different from the other compilers mentioned before. The compiler works by “running” your code at compile time. You write your function/class in Python, and then add the @JavaScript decorator to it if you want it to be included in the compiled output. You then “run” your program and it dumps JavaScript to STDOUT. While it’s a bit awkward to add the @JavaScript decorator to every single function, this offers something most other compilers (including Pyjamas) lack. Since it actually runs your code, you get the benefit of partial “compile-time” error catching, a feature typically only seen in statically-typed languages. You can be sure, for example, that all syntax errors will be caught, as well as undeclared variables/methods. The disadvantage is that this compiler is a bit too strict: all your code has to be pure Python and all your variables (or stubs for them) have to exist at compile time. So, for example, if you want to use something from jQuery, you had better have a jQuery stub written in Python (it doesn’t need to mimic jQuery functionality, however, only to have the method names defined). This project was also abandoned several years ago at an immature stage, which is where the 3rd variation comes in.
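The stub requirement is easier to see with a toy version. The sketch below is not how py2js is implemented; it only mimics the idea that decorated functions are collected while the program runs, and that every global name they use must already exist in Python. The decorator body, the undefined_globals helper and the jQuery stub are all hypothetical.

```python
import builtins
import dis

REGISTERED = []  # functions marked for inclusion in the compiled output

def JavaScript(func):
    """Hypothetical stand-in for the @JavaScript decorator: it only records the function."""
    REGISTERED.append(func)
    return func

def undefined_globals(func):
    """List the global names a function reads that are not defined anywhere yet."""
    names = {instruction.argval
             for instruction in dis.get_instructions(func)
             if instruction.opname == "LOAD_GLOBAL"}
    return sorted(name for name in names
                  if name not in func.__globals__ and not hasattr(builtins, name))

class jQueryStub:
    """Stub with method names only; it does not mimic jQuery behavior."""
    def hide(self):
        return self

def jQuery(selector):
    return jQueryStub()

@JavaScript
def hide_banner():
    jQuery("#banner").hide()          # fine: a stub named jQuery exists

@JavaScript
def show_banner():
    missingLibrary("#banner").show()  # no stub, so this name gets reported

report = {func.__name__: undefined_globals(func) for func in REGISTERED}
print(report)
```

The second function is flagged before any JavaScript would be produced, which is the “compile-time” benefit described above, at the price of having to write a stub for every external name.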

Verdict: not usable

py2js (Variation 3) (now Pyjaco)

The 3rd variation started off as a fork of the 2nd variation, after 3 developers decided to take over the original, abandoned, project. Over time, this variation morphed into a powerful compiler, whose functionality is only outdone by Pyjamas itself. It added support for Pythonic assertions, tuples, classes, static methods, getattr()/setattr(), *args/**kwargs, better integration with JavaScript (allowing you to call JavaScript logic by wrapping it inside JS() or automatically handling the conversion at compile time if you use one of predefined globals (document, window, etc.)) as well as many other features you would expect from Python. It has its own standard library with most Python stdlib implemented.

It, however, also falls short in a few areas. Like Pyjamas, it tries to build Pythonic error handling on top of JavaScript. Admittedly, this task might be much easier here, due to partial “compile-time” error catching, which Pyjamas lacks. It also doesn’t properly translate Python’s math methods to JavaScript, a task that shouldn’t be too hard, however.

Additionally, while it’s convenient to have document and window be treated like special classes, this can add confusion when coding. In some cases your strings and numbers will get converted to JavaScript equivalents automatically, while in others you have to do so manually (when invoking a native JavaScript method and passing it a string, for example). A py2js string is a wrapper around a native JavaScript string with additional Python functionality, and the same applies to many other objects. This is a good thing, since it allows support for Python’s methods without having to overwrite native objects (and I believe Pyjamas takes the same approach for its primitive objects, such as arrays and strings). This is a minor pet peeve, however, and would not be a problem most of the time.

Another annoyance, which I already mentioned earlier, is having to explicitly define which functions you want compiled via the @JavaScript tag plus a str() call on the function or class (but this is just a matter of personal preference).

The biggest problem with this third variation, however, is copyright issues (which the developers are trying to resolve now – they could already have been resolved). In its quest to acquire functionality similar to Pyjamas, this compiler has “borrowed” a lot of code from other projects (including the first py2js I mentioned). This py2js project itself carries the MIT license; not all of the projects it borrows from, however, are compatible with that license. Additionally, some of these projects were not given proper mention in the copyright notice due to an oversight by one of the developers. As a result, the developers had a falling out and have now forked a couple of alternative versions of this project, one of which is still maintained and aims to rewrite the code in question and/or get proper authorization to use it.

Additionally, if this project joins forces with the new Pyjamas, I believe both projects will benefit a lot. The code generated by py2js is significantly smaller (Pyjamas averages a 1:10 ratio for non-compiled:compiled code, while py2js is closer to 1:2 – not counting stdlib) and more readable than that of Pyjamas, while retaining most of Pyjamas’ features. Pyjamas, on the other hand, already has its own implementation for most of the py2js code that’s subject to copyright issues, which could be used to solve the biggest problem with py2js. Finally, py2js’ compile-time error catching makes it easier to build Pythonic exception handling on top of JavaScript. And to remedy the annoyance of @JavaScript decorators, the AST parser can be used to append them automatically to a temporary version of the code. This can also be used as an opportunity to update Pyjamas to use the latest AST parser implementation, which it will need anyway if it ever wants to be compatible with Python 3 (which is missing the deprecated AST implementation).
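The decorator idea can be sketched with Python’s own ast module. This is only an illustration under my own assumptions, not code from py2js or Pyjamas: it decorates top-level functions and classes only, and it relies on ast.unparse, which requires Python 3.9 or later.

```python
import ast

class AddJavaScriptDecorator(ast.NodeTransformer):
    """Append a @JavaScript decorator to top-level functions and classes lacking one."""

    def _decorate(self, node):
        # No recursion here, so nested definitions and methods are left untouched.
        already = any(isinstance(d, ast.Name) and d.id == "JavaScript"
                      for d in node.decorator_list)
        if not already:
            node.decorator_list.append(ast.Name(id="JavaScript", ctx=ast.Load()))
        return node

    visit_FunctionDef = _decorate
    visit_ClassDef = _decorate

def add_decorators(source):
    """Return a temporary copy of the source with the decorators filled in."""
    tree = AddJavaScriptDecorator().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

original = "def greet(name):\n    return 'Hello ' + name\n"
decorated = add_decorators(original)
print(decorated)
```

The original file stays free of decorators, and only the generated copy would be handed to the compiler.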

Verdict: usable

UPDATE: I’ve been informed that this project has now become Pyjaco, and the copyright is no longer an issue. So for those who want to stay closer to Python, this is a very solid alternative. Christian, the project leader, also informed me that not all of the details I mentioned are accurate. Also, the developers seem to have misinterpreted the original post as me claiming that RapydScript (my own compiler) is the best for everything. That was not my intent, and I tried to avoid this issue by mentioning in the first paragraph of the original article that my choice is based on my own projects and the flexibility they need (mostly Grafpad), even stating that “your choice could be different than mine”. I hope they don’t hold a grudge against me because of this misunderstanding.

PyvaScript

PyvaScript is the opposite extreme of Pyjamas. It’s perhaps the closest to pure JavaScript out of all the Python-like languages. It was one of the first alternatives I looked into, and admittedly I originally decided it would be unsuitable for large projects. There are numerous disadvantages compared to other Python frameworks. Its stdlib is puny, and most of the Python methods you would want aren’t available. It not only lacks import logic, but even classes aren’t implemented. It can’t handle negative indexes for arrays. Its compiler isn’t even based on the Python AST (it uses OMeta); as a result, it’s often oblivious to errors other compilers would catch, and at other times it chokes on issues that other compilers have no problems with.

Its advantages, however, provide the perfect fit for the niche that Pyjamas left open. It’s the most JavaScript-compatible solution, supporting almost all JavaScript functionality out of the box. I can manipulate other JavaScript and DOM without the need of any wrappers or special syntax. Need to use jQuery? Just do it the same way you would in JavaScript: $(element).method(), the compiler won’t complain about the dollar sign. PyvaScript also supports anonymous functions, something regular Python lacks (except 1-liner lambdas), which makes it easier to write your code like JavaScript (I didn’t see the advantage to this at first, but believe me, when writing web apps, it really helps – especially when using other JavaScript logic as a tutorial).

Since PyvaScript adds no bells or whistles to your code and doesn’t require any sort of wrappers, your code behaves a lot more like native JavaScript, allowing you to use closures more effectively and to bind functions to objects dynamically without the language complaining. At first this seemed like a bad idea, but after playing with it, I realized that with proper use of this, I can actually write cleaner code than with Python itself. Most importantly, PyvaScript is extremely light-weight, adding almost no overhead to the generated code, and it does not force a framework (MooTools, jQuery, etc.) on the user. I also realized that PyvaScript’s lazy compilation has its own advantages (which I will explain later).

Verdict: usable – but doesn’t alleviate much pain from plain JS

CoffeeScript

Sharing a structure similar to that of Python, it deserves a mention as well. If you ignore its ugly syntax, and poor choice of variable scoping (preferring globals over locals), you will see that it has all the same advantages as PyvaScript, which makes it a good candidate for Pyjamas replacement as well. It has similar feel to Python (although it feels closer to Ruby), introduces list comprehensions, and even adds classes (something PyvaScript does not). It also adds namespaces, preventing variables in different modules from interfering. If I invert the scoping design, remove all the junk variables like on/off (synonyms for true/false), and modify the syntax to use Python tokens, this will be my ideal language for web development, but that’s a project for later.

Verdict: usable

And the winner is…

RapydScript, which I didn’t even mention yet. RapydScript (best described as PyvaScript++) is my wrapper around PyvaScript that provides the best of both worlds (for me at least). I’ve made a few enhancements (abusing PyvaScript’s lazy compilation to allow me to auto-generate broken Python code that becomes proper JavaScript once compiled by PyvaScript) that achieve almost everything I would want, and just about all of CoffeeScript’s functionality. Some of the new RapydScript features include:

  • Pythonic classes, including inheritance (single inheritance only, but you can bind functions from any class to fake multiple inheritance)
  • Pythonic optional function arguments
  • Anonymous functions in dictionaries/object literals (something PyvaScript chokes on)
  • A beefed-up stdlib (with optimized versions of methods already implemented in PyvaScript’s stdlib)
  • Multi-line comments/doc-strings
  • Preliminary (compile-time) checking for syntax errors and for issues that PyvaScript chokes on (because of minor bugs in PyvaScript)
  • Module importing (currently everything gets dumped into a single namespace, but the compiler does warn about conflicting names)
  • Checking for proper use of Math functions
  • Automatic insertion of the ‘new’ keyword when creating an object (not sure why CoffeeScript doesn’t already do the same)

To me, the main advantages of RapydScript over PyvaScript are the ability to break down my modules into separate files (like I would in Python), an easier time building large projects due to a proper class implementation (class declaration is done the same way as in native Python), Pythonic declaration of optional arguments to a function (I’m not a big fan of JavaScript’s solution for optional arguments), and support for anonymous functions as hash values (which allows me to build object literals the same way as in JavaScript). Compared to the other projects, the main advantages of RapydScript are seamless integration with the DOM and other JavaScript libraries/modules (just treat them like regular Python objects), the ability to use both Python and JavaScript best practices as well as rely on JavaScript tutorials (one of the biggest problems for projects in their alpha stage is lack of documentation; for RapydScript you really don’t need any), and lack of bloat (RapydScript gets me as close to Python as I need without forcing any boilerplate on me). As a nice bonus, I think I can add support for the advanced compilation mode of Google’s Closure compiler with only minor tweaks.

If you want to give RapydScript a try, you can find it in the following repository: https://bitbucket.org/pyjeon/rapydscript. Keep in mind, this is still an early alpha version and some of the functionality might change. Chances are, I will eventually rewrite this, using CoffeeScript as a base, but the functionality should stay very similar (after all, I am rewriting Grafpad in this, and I don’t want to rewrite it a third time). In my next post, I will dive into more detail about RapydScript, for those interested in using it and needing a tutorial.

Why Pyjamas Isn’t a Good Framework for Web Apps

Earlier this week, I stated that Pyjamas no longer seems like a viable solution for Grafpad (or many other web apps for that matter). In this post, I will explain the flaws with Pyjamas that ultimately made me decide to switch away from it. I’m aware that the Pyjamas project is currently getting an overhaul, and I hope that these flaws get addressed in the upcoming Pyjamas releases. Before I go any further into bashing Pyjamas, I want to mention that I’ve been using Pyjamas for several years, writing over 20,000 lines of Python code that runs inside the browser (as well as several Pyjamas wrappers for extending its functionality). I appreciate the problem Pyjamas is trying to solve, and I definitely think it’s a useful tool. Perhaps one day Pyjamas will be good enough for the browser; unfortunately, it has a lot of issues to solve before that’s the case.

Experienced JavaScript developers might already be familiar with many of the points I will bring up. To summarize Pyjamas’ flaws in one sentence, it basically assumes that JavaScript is still the joke of a language it was several years ago and tries to apply outdated solutions that don’t scale well. Today’s JavaScript, however, can run circles around its predecessor, both in terms of performance and functionality. Many innovative design patterns have also been posted for keeping JavaScript code clean and object-oriented. In some ways, JavaScript’s design has even surpassed that of Python, which still lacks proper private variables, for example. So what are some of the big offenders in Pyjamas?

Browser Detection instead of Feature Detection

Many of you are probably familiar with Pyjamas’ compilation scheme. If not, it basically creates multiple versions of the JavaScript code, one for each major browser (IE, Firefox, Safari/Chrome, Opera) and serves the appropriate one depending on your user-agent string. A quick Google search will reveal thousands of pages explaining the problems with this technique (called browser detection), so there is really no point for me to go into much detail here. The first problem with browser detection is that we assume that the user will be using one of the browsers we’re detecting (sorry Konqueror). The second problem is that we’re assuming the user is using one of the versions of this browser that still has the same issues/functionality. I’ve already posted about the changes I had to make in Pyjamas to make it use IE9 properly, which has full canvas support, yet Pyjamas still treats it like IE6 (ironically, IE9 actually behaves more like WebKit than IE6). The third problem is that many browsers spoof the user-agent string, pretending to be a different browser (for various reasons). These browsers may support features that the spoofed browser doesn’t support and vice versa, forcing us to use an unnecessary work-around for a feature that the browser supports natively (just like IE9 being forced to use VML instead of canvas), or preventing the feature from working altogether (imagine if Chrome, with no VML support, spoofed IE6 user-agent string).
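The difference between the two approaches can be simulated in a few lines. This is only a toy model, not Pyjamas code: the build names, the user-agent strings and the capability sets are illustrative assumptions. In a real page, feature detection would be done in JavaScript by probing the browser itself (for example, checking whether a canvas element exposes getContext).

```python
# User-agent strings below are illustrative examples, not an exhaustive list.
IE9 = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
KONQUEROR = "Mozilla/5.0 (compatible; Konqueror/4.5; Linux) KHTML/4.5.0 (like Gecko)"

def build_by_browser(user_agent):
    """Browser detection: pick a build from the user-agent string alone."""
    if "MSIE" in user_agent:
        return "vml"       # treats every IE like IE6, even one with canvas
    for marker in ("Firefox", "WebKit", "Opera"):
        if marker in user_agent:
            return "canvas"
    return None            # unlisted browser: no build to serve

def build_by_feature(supported):
    """Feature detection: pick a build from what the browser can actually do."""
    if "canvas" in supported:
        return "canvas"
    if "vml" in supported:
        return "vml"
    return None

# Assumed capability sets: both browsers are taken to support canvas here.
ie9_by_browser = build_by_browser(IE9)
ie9_by_feature = build_by_feature({"canvas", "vml"})
konqueror_by_browser = build_by_browser(KONQUEROR)
konqueror_by_feature = build_by_feature({"canvas"})
print(ie9_by_browser, ie9_by_feature, konqueror_by_browser, konqueror_by_feature)
```

Under these assumptions, the user-agent check pushes IE9 onto the VML path and leaves Konqueror with nothing, while the capability check gives both the canvas build.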

Bloat and Boilerplate Hell

If you’ve peeked at Grafpad’s JavaScript, you probably saw 80,000 lines of code in a 3.5MB file. But did you know that the pre-compiled version of the Grafpad front-end is only about 8,000 lines of code? We have 10 times the needed code just to pretend like we’re still using Python. What’s worse, most of that code is only there to support obscure Python functionality most of us are never going to use in a web app anyway. Pyjamas has become the most complete Python framework for the browser; unfortunately, it has also become the most bloated one, with most other frameworks (such as py2js) only needing to generate 1.5 lines of JavaScript for each line of Python code. You can see the 80/20 principle at work here, where 20% of Python’s features account for 80% of Pyjamas’ boilerplate. In my opinion, it would make a lot more sense to only support the commonly used features of Python, allowing the user to rewrite the bits that don’t work well for JavaScript. After all, the most tedious things to port between languages are the algorithms, not the object structure.

Debugging

In theory, Pyjamas is much easier to debug than JavaScript. Unlike JavaScript, which either throws vague errors or worse yet, silently fails a block of code and continues execution like nothing happened, Pyjamas throws Pythonic exceptions, which most of the time do a very good job pinpointing the exact line that caused the problem… at least when you run your program through Pyjamas Desktop. The problem is, Pyjamas Desktop has been broken for almost 3 years now, requiring you to either use a 3-year old Linux distribution (last known version to have support for python-hulahop) or rely on WebKit or MSXML implementations, neither of which supports canvas.

Alternatively, you can debug your code directly in the browser. Pyjamas sports a good set of Python exceptions emulated in JavaScript through clever use of try/catch blocks. Unfortunately, this alternative lacks not only a proper stack trace, but also the original code (and compiling your Pyjamas app with any of the debug modes doesn’t solve this, regardless of what various outdated posts on the mailing list claim). Needless to say, the errors raised by Pyjamas in the browser are not very useful. If you made an IndexError on line 50 of your code by referencing “object.array[5]”, for example, expect Pyjamas to throw some weird error (that’s right, chances are it won’t even realize it’s an IndexError – or at least won’t report it well, even though the except blocks seem to work correctly) on line 30,000 of your compiled JavaScript, which will reference $p['getattr'](object, 'array').__getitem__(5), among a bunch of other boilerplate which could have caused an error as a result of an earlier error in your code or a Pyjamas bug. Even when debugging using Pyjamas Desktop, the browser errors can occasionally be inconsistent with normal Python (usually due to a bug in Pyjamas), and it’s a pain to troubleshoot these. And there is really not much Pyjamas can do to remedy this, in my opinion.

Adding additional assertions to catch every possible case to throw Pythonic errors is a fool’s errand no different than trying to parse HTML using regular expressions. Python’s ability to throw relevant assertions stems from its fundamental design. It’s very strict about using non-existing/undefined variables and comparison of irrelevant types. JavaScript, on the other hand, is very lazy/permissive about these, much like Perl. Python is proactive about its assertions; Pyjamas tries to be reactive. It’s unrealistic to foresee every special case that could arise and account for it with an assertion the same way Python would. Even if you manage to do so, you will have added even more code to Pyjamas’ already large chunk of boilerplate (not to mention potential for new Pyjamas bugs). One option is to compile these assertions away when the debug flag isn’t set, but even then you would be doing the exercise of examining all possible errors that Python could throw in each case, plugging in more “reactive” logic to make JavaScript work the same way. Instead, we should make the framework easy to debug in the environment it’s meant to be in. Since we can’t make JavaScript behave like Python, and we can’t do compile-time debugging like we would with C++ or Java, we should make the output easy to understand, so that we can map it back to the original code.

Python is not Java, DOM is not a Desktop

This brings me to my next point. GWT (the original inspiration for Pyjamas) might be more bloated than Pyjamas, but there is something it can do that Pyjamas can’t: compile-time error catching. If it wasn’t for Python being a dynamically-typed language, a lot of my rant in the previous section about debugging would be irrelevant. Additionally, I don’t feel that Pyjamas is approaching the problem from the right angle. Python has the advantage of being much more similar to JavaScript than Java ever will, and a lot of Pyjamas’ wrapper logic wouldn’t even be necessary if Pyjamas didn’t try to pretend to be GWT (in addition to pretending to be Python). GWT was designed to make web development similar to Desktop GUI development, since that’s the background many Java developers come from. What other purpose is there to fake MouseListener and KeyboardListener in an environment that wasn’t designed to need either (KeyboardListener, by the way, is another source of grief for Pyjamas – it’s what makes the keyboard pop-up all over the place on mobile devices, it also attaches a fake input element to the current element, pretending like they’re the same element, adding even more boilerplate and wrappers to the code)? What other purpose is there to build the entire DOM dynamically (which, by the way, is also extremely inefficient)? The browser page was not designed to function the same way your Desktop calculator app does. Anyone who has taken a few minutes to learn how the DOM works probably agrees that it’s actually superior to the old-fashioned Desktop way of writing the GUI. I’m lazy (otherwise I wouldn’t have written my front-end in Python), so when a new technology comes along that clearly makes my life easier, why ignore it?

If it wasn’t for trying to fake a Desktop GUI, Pyjamas wouldn’t need all these wrappers. Most other Python-faking frameworks allow one to invoke JavaScript logic as if it was a regular Python object/function. Pyjamas, on the other hand, requires one to first write a wrapper for Pyjamas Desktop using Python, then for the browser using some limbo version of a Python/JavaScript hybrid (where you can’t even access elements of an array using standard indexing), and finally rewrite a separate version of your limbo code for each non-compliant browser (definitely IE, and possibly some others). This wrapping might have been necessary in Java, but should not be needed for Python at all, and could have been prevented with better design. But wait, there IS an alternative! You can put raw JavaScript in your code by calling the JS() method and passing it one giant string of JavaScript code. Unfortunately, that chunk of the code will get completely ignored in Pyjamas Desktop (which you’re using to debug your entire app, since the browser debugger is no help at all), and to actually reference anything from this chunk of code in the browser, you will need to reference these variables the same way: “a = JS(‘a’)” (again, don’t expect “a” to get set in Pyjamas Desktop). Oh, and don’t try to modify any of the DOM elements created by Pyjamas from anything other than Pyjamas, or you will run into object state sync issues. Pyjamas wraps each DOM element in a Python object, which then stores the element’s state as a set of variables, and assumes it doesn’t change without Pyjamas’ permission. Pyjamas plays well with other JavaScript frameworks… as long as they don’t touch any portion of the DOM Pyjamas uses.
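The state sync problem is easier to see in code. The sketch below is a simplified illustration under my own assumptions: Widget is a hypothetical wrapper, not Pyjamas’ real class, and a plain object stands in for the DOM element so the snippet runs outside a browser.

```javascript
// Stand-in for a DOM element so the sketch runs outside a browser.
var element = { style: { display: "" } };

// Hypothetical wrapper that caches the element's state in its own
// variables, as described above (not Pyjamas' real class).
function Widget(el) {
    this.element = el;
    this.visible = true;          // cached copy of the DOM state
}
Widget.prototype.setVisible = function (visible) {
    this.visible = visible;
    this.element.style.display = visible ? "" : "none";
};
Widget.prototype.isVisible = function () {
    return this.visible;          // answers from the cache, never re-reads the DOM
};

var widget = new Widget(element);
element.style.display = "none";   // another library hides the element directly
var stale = widget.isVisible();   // still true: wrapper and DOM now disagree
```

Once the wrapper and the DOM disagree, any logic that trusts the wrapper will act on state that is no longer true.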

JavaScript has its Strengths

JavaScript might not be the cleanest language, and I still much prefer Python to it. But I must give it credit where credit is due. First of all, it integrates the DOM into itself really well. I can take any DOM element, assign a function to its onMouseDown event as if it was a regular JavaScript object, and all of a sudden I’ve got an element that reacts to my mouse clicks. No need for complicated ClickHandlers.
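As a rough sketch of what that looks like, the snippet below attaches a handler by plain property assignment. Note that the actual DOM property is spelled in lowercase (onmousedown). countPresses and the "canvas" id are hypothetical names used only for this example:

```javascript
// Attach a handler by plain property assignment. The DOM property name
// is lowercase ("onmousedown").
function countPresses(element) {
    var state = { presses: 0 };
    element.onmousedown = function (event) {
        state.presses += 1;       // runs every time the element is pressed
    };
    return state;
}

// In a browser (the "canvas" id is a hypothetical example):
// var state = countPresses(document.getElementById("canvas"));
```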

Pyjamas has a lot of abstraction layers, both to hide JavaScript inconsistencies, and make it easier to build widgets. However, native JavaScript libraries, like jQuery, do a much better job at both. Yes, jQuery doesn’t scale well for larger projects, but there are libraries that do, like MooTools (which, by the way, was inspired by Python). But realistically, if you create a simple wrapper for generating classes (or loot one from John Resig’s blog – the same guy who wrote jQuery), even jQuery becomes good enough for creating large projects. Pyjamas, on the other hand, adds so much abstraction, that sometimes I need hacks just to manipulate the DOM. If you look at the DOM of a typical Pyjamas app, you will notice layers of unnecessary elements: images wrapped in divs, wrapped in more divs, placed inside some table that resides inside yet another table. When I try to render my app on a tablet, it often crashes due to the DOM bloat.
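For reference, a class-generating wrapper can be very small. The following is my own minimal sketch, not John Resig’s code; createClass, Shape and Circle are hypothetical names:

```javascript
// Hypothetical helper: build a constructor from a plain object of methods,
// optionally inheriting from a parent constructor.
function createClass(parent, methods) {
    function Class() {
        // Call an "init" method, if one exists, as the constructor body.
        if (typeof this.init === "function") {
            this.init.apply(this, arguments);
        }
    }
    if (parent) {
        Class.prototype = Object.create(parent.prototype);
        Class.prototype.constructor = Class;
    }
    for (var name in methods) {
        if (Object.prototype.hasOwnProperty.call(methods, name)) {
            Class.prototype[name] = methods[name];
        }
    }
    return Class;
}

var Shape = createClass(null, {
    init: function (name) { this.name = name; },
    describe: function () { return "shape " + this.name; }
});
var Circle = createClass(Shape, {
    area: function (r) { return Math.PI * r * r; }
});
```

Here new Circle("c1") inherits init and describe from Shape, which is about as much structure as a mid-sized jQuery project needs.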

Pyjamas also assumes that JavaScript is slow, which was true when the project first started. As a result, it duplicates parts of its boilerplate code to avoid an extra function call (while adding excessive function calls and abstractions in other places). Ironically, JavaScript engines have come a long way, and a lot of Pyjamas’ optimizations are no longer relevant (such as using object["property"] instead of object.property). In fact, a quick paint app I wrote in Pyjamas actually runs faster in Chrome than in Pyjamas Desktop. That same app runs faster still when written in pure JavaScript. It’s especially noticeable when using the paint-bucket tool, which works by pixel-scanning and takes a couple of seconds in Pyjamas yet is almost instantaneous in JavaScript.

Summary

While Pyjamas is the most complete Python emulation in a browser, it has become a very bloated and brittle framework. It doesn’t embrace any portion of JavaScript, nor the DOM, trying to hide them away like some sort of deformed beast. By pretending to be pure Python, it not only puts unrealistic expectations on itself, but also fails to make use of good parts of JavaScript. Instead, Pyjamas embraces a solution designed for a statically-typed language, favoring a GUI structure that should have died a decade ago.

So What’s The Alternative?

I did mention that I am porting Grafpad away from Pyjamas. However, I’m not crazy enough to rewrite the entire project in pure JavaScript. Rewriting all the code in a language with different quirks and troubleshooting differences like division rounding and modulo signage is not my idea of fun. I also still prefer to keep my front-end code interchangeable with the back-end (more or less), which has already provided multiple advantages, such as moving the proprietary clipping and recognition algorithms to the back-end in just a few hours of work. I happen to have another ace up my sleeve. In the next post, I will review multiple alternatives for Pyjamas and explain the solution I’ve chosen.
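To illustrate the kind of quirks I mean, here is a short sketch of how division and modulo differ. floorDiv and pyMod are hypothetical helper names, and the Python results in the comments assume Python 2 integer semantics:

```javascript
// JavaScript: "/" is always floating-point division, and "%" takes the
// sign of the dividend.
var jsQuotient = 7 / 2;        // 3.5 (Python 2 gives 3 for 7 / 2)
var jsRemainder = -7 % 3;      // -1  (Python gives 2 for -7 % 3)

// Hypothetical helpers reproducing Python's behavior.
function floorDiv(a, b) {
    return Math.floor(a / b);  // rounds toward negative infinity, like Python
}
function pyMod(a, b) {
    return ((a % b) + b) % b;  // result takes the sign of the divisor
}
```

Every arithmetic expression in a ported code base would need to be audited for cases like these, which is why a straight rewrite is less mechanical than it sounds.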

Touch Gestures in Pyjamas

Before I continue, I wanted to apologize for the lack of updates. To pay the bills, I got a new job in October, which took away a lot of my free time. You also may be familiar with the current state of Pyjamas from the mailing list. Before I go further, I wanted to mention that I realized in November of last year that Pyjamas is not the best API for Grafpad, and have been working on alternative solutions (I will write up a detailed post explaining this soon). However, for those still using Pyjamas, I wanted to provide my solution for handling touch events on mobile devices.

Some of you probably noticed that Grafpad has had support for iPad/Android devices for almost a year now despite Pyjamas still lacking support for touch gestures. Some of you may have even seen the communication between Luke and me where I was trying to figure out how to write a proper touch event wrapper for Pyjamas. Alas, it has been almost a year, and the wrapper is nowhere to be seen, yet Grafpad got support for touch gestures within a few weeks of that conversation. So what happened?

To summarize what happened, after a few weeks of beating my head against the wall, I gave up on writing the wrapper. I had updated a dozen or so different Pyjamas modules, and all the pieces seemed to be there. I’d made sure that all places responsible for triggering mouse-event logic also triggered touch events; I even made sure my logic was consistent with the GWT way of triggering touch events. Pyjamas wasn’t throwing any errors, but my touch handlers just weren’t firing. I could’ve spent more time to debug this correctly, but my main priority was Grafpad, and I couldn’t afford to spend more time on this.

In my frustration, I came up with a hack based on another set of blogs I saw (I no longer have the links to them, so if you recognize them from the source code, please reply to this post and I will add the link). The solution is actually quite simple, and required only some understanding of how the browsers already handle touch events.

Since most websites aren’t designed for mobile devices, mobile browsers have adopted some work-arounds, faking some mouse events to achieve sane behavior on most websites. For many websites these fake events work fine; for many web-apps they do not. In particular, when the user clicks an element on a website, the mobile browser first sends an onMouseOver event (that’s right, not onMouseDown). The mobile browser then checks for a change in the DOM; if no change occurs, it sends an onMouseDown event. This is actually pretty smart, and in most cases works pretty well. Unfortunately for Grafpad, this triggers a tooltip and requires a second press to actually activate the option being clicked on. The next annoyance is due to how onMouseMove works. Unlike PCs/Macs, mobile devices do not send onMouseMove events continuously while the mouse/finger is pressed down. Instead, they send a single onMouseMove event when the mouse/finger is released (simultaneously with onMouseUp). My guess is that this is done to conserve CPU resources, since continuous onMouseMove events can get pretty resource intensive. As a side-effect, Grafpad would refuse to draw the actual shape the user doodled, and instead produce only a straight line from the point of touch-start to the end. The last problem is the keyboard listener. Grafpad relies on it for keyboard shortcuts, having it active at all times. Mobile devices, unfortunately, assume that the website expects keyboard input from the user, activating a keyboard that takes up half of the already scarce screen real-estate. So how did Grafpad deal with these issues?

Apparently, mobile browsers will only try to fake real mouse events if touch gestures aren’t already handled by the website itself. Fortunately, touch gestures are very easy to detect in pure JavaScript, when you don’t have to rely on additional wrappers. To summarize my solution in one sentence, I added some JavaScript code to detect real touch events, and fake mouse events at the same coordinates. This solved all issues except the keyboard. For the keyboard, I modified the keyboard handler CSS display style to none, which seemed to have worked on the iPad. Unfortunately, this didn’t do the trick on Android devices. While I still haven’t addressed those, an easy hack (I mean solution) would be to detect the browser/device from the user-agent string (or better yet, as soon as we notice a touch occur) and disable keyboard handler entirely (we know the user isn’t going to use keyboard shortcuts on a tablet anyway).

Now to the actual solution, here is the code that I added for faking mouse events:

var TOUCHDEVICE = false; // set on the first touch; reserved for disabling the keyboard handler
var LASTTOUCH = 0;       // time (ms) of the last tap that could start a double-tap

function touchHandler(event)
{
    TOUCHDEVICE = true;
    var touches = event.changedTouches,
        first = touches[0],
        type = "",
        ctrl = false;

    // map the touch event to the mouse event we want to fake
    switch(event.type)
    {
        case "touchstart":
            var d = new Date();
            var current = d.getTime();
            if (current - LASTTOUCH <= 500) {
                ctrl = true; //second tap within 0.5s: simulate ctrl+click
                event.preventDefault(); //prevent zoom-in lens on double-click
            }
            else {
                LASTTOUCH = current;
            }
            type = "mousedown";
            break;
        case "touchmove":
            type = "mousemove";
            event.preventDefault(); //prevent screen dragging
            break;
        case "touchend":
            type = "mouseup";
            break;
        default: return; //touchcancel and anything else: no mouse event is faked
    }

    // build a mouse event at the same coordinates as the touch
    var simulatedEvent = document.createEvent("MouseEvent");
    simulatedEvent.initMouseEvent(type, true, true, window, 1,
                              first.screenX, first.screenY,
                              first.clientX, first.clientY, ctrl,
                              false, false, false, 0/*left*/, null);

    // deliver it to the element that was touched
    first.target.dispatchEvent(simulatedEvent);
}

function init()
{
    document.addEventListener("touchstart", touchHandler, true);
    document.addEventListener("touchmove", touchHandler, true);
    document.addEventListener("touchend", touchHandler, true);
    document.addEventListener("touchcancel", touchHandler, true);
}

The above code does several things to make touch events behave the way a user would expect. First of all, once init() gets called, we map every touch event to our function, preventing default behavior from the mobile browser. The actual touchHandler function then decides how to handle each touch event. There are a few things to note in the code. First, the TOUCHDEVICE boolean. This variable was never actually used, but was initially added for the purpose of disabling the keyboard handler on Android devices. The idea is that it gets set the first time a touch event occurs. On a PC/Mac such an event will never occur, thus the user will not be affected by this functionality. On a mobile device, this event will occur as soon as the user performs the first touch, allowing us to disable the keyboard handler before it has any effect.
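Since the flag was never wired up, here is one way the idea could be sketched. This is an untested-on-device illustration under my own assumptions: onFirstTouch is a hypothetical helper, and disableKeyboardHandler is a placeholder for whatever turns off the app’s keyboard listener.

```javascript
// Hypothetical helper: run `callback` once, on the first touchstart
// seen on `target`, then stop listening.
function onFirstTouch(target, callback) {
    function handler(event) {
        target.removeEventListener("touchstart", handler, true);
        callback(event);
    }
    target.addEventListener("touchstart", handler, true);
}

// In a browser; disableKeyboardHandler is a placeholder for whatever
// turns off the app's keyboard listener:
// onFirstTouch(document, disableKeyboardHandler);
```

Because the listener removes itself, desktop users never pay for it and touch users only trigger it once.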

The next item of interest is the LASTTOUCH variable. This variable stores the time of the last touch event. As you can see from the code, it’s used to simulate the right-click event (ctrl+click) in case two taps are detected within 0.5 seconds of each other.
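The same timing logic can be pulled out into a small function that is easier to test in isolation. This is a sketch rather than the code Grafpad runs; makeDoubleTapDetector is a hypothetical name, and timestamps are passed in explicitly instead of being read from Date:

```javascript
// Returns a function that reports whether a tap at time `now` (in ms)
// falls within `windowMs` of the previously recorded tap.
function makeDoubleTapDetector(windowMs) {
    var lastTouch = 0;
    return function (now) {
        if (now - lastTouch <= windowMs) {
            return true;       // second tap: treat as ctrl+click
        }
        lastTouch = now;       // first tap: remember when it happened
        return false;
    };
}

var isDoubleTap = makeDoubleTapDetector(500);
```

As in the handler above, the stored time is only updated by a tap that is not a double-tap, so a third quick tap inside the same window also counts as a ctrl+click.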

The above logic should be sufficient for single-touch events. Multi-touch events are somewhat more complicated, and I will not cover them in this post (although you should be able to rewrite the above logic to handle those too, if needed; complete documentation for those is available on the HTML5Rocks website). One last thing: to make the above code work, make sure to add the following attribute to your body tag:

<body ... onLoad="init();" >

Lean Startup Challenge: Weekly Reflection 4

AdWords
Last week, some members of the Lean Startup Challenge (including me) attended an AdWords workshop by Google. We were told some basics about how to use the ads, that it helps to group the keywords into categories with other similar words, and that a good click-through-rate to aim for is 1%. This week I started using Google AdWords. I’ve also managed to learn a few important lessons about the AdWords system that weren’t covered during the workshop.

For example, the ad “quality” for a given keyword (which is actually more like a keyphrase) seems to depend on how many words from the keyword appear in the ad itself. At first I didn’t realize that, thinking I’d bundle all diagram-related keywords under my diagram ad, figuring that the people seeing the ad would understand that UML is a type of diagram. Unfortunately, Google penalized my quality score heavily for not having “UML” in the ad. I now have 3 different ads with 51 enabled keywords (a good rule of thumb is to have about 15 keywords per group and a unique ad for each group) and seem to be doing better in terms of my quality score. For those unfamiliar with Google Ads, quality score factors into the ad’s position on the page. Bid amount is the other component that factors into the ad position. So if you want your ad to appear at the top (where it’s much more likely to get noticed), you can either improve the ad quality or throw a lot of money at it. However, if the quality is too low for a given keyword, no amount of money will force the ad to show up (McDonald’s, for example, will never show up when you’re looking for a new laptop).

While the quality score depends on your ad alone, the position of the ad also depends on the quality of the competition (as well as the size of their pockets). As a result, I started sometimes ignoring the quality score if my rank is in the top 3 and I am getting clicks from the keyword. Other times my quality score is high (7/10) but I get no clicks and Google throws me in position 11. In those cases it’s time to disable the keyword before the quality score drops. Playing with AdWords is a lot like tweaking the skill points of a character in an RPG: you don’t control the damage output directly, but it’s a combination of dexterity, strength and a number of other factors that you might not be able to control. The trick is to test out your build in a real environment and analyze when you’re seeing diminishing returns.
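To show how quality and bid can trade off against each other, here is a toy ranking sketch. It assumes position is decided by bid multiplied by quality score, which is a simplification on my part and not a description of Google’s actual auction. rankAds and all the numbers are made up for illustration:

```javascript
// Simplified assumption: rank = bid * qualityScore.
// This is NOT Google's actual formula, just a toy model.
function rankAds(ads) {
    return ads
        .map(function (ad) {
            return { name: ad.name, rank: ad.bid * ad.qualityScore };
        })
        .sort(function (a, b) { return b.rank - a.rank; });
}

// Made-up numbers for illustration.
var ranked = rankAds([
    { name: "big spender", bid: 2.0, qualityScore: 3 },
    { name: "relevant ad", bid: 1.0, qualityScore: 7 }
]);
```

Under this model the relevant ad wins the top position despite bidding half as much, which matches the behavior described above.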

There are also some important gotchas I stumbled into that hurt my click-through-rate quite a bit. First of all, don’t use generic single-word keywords, no matter how high your rank/quality is for them (there are probably exceptions, just that this didn’t work for me). For example, somehow I got a quality score of 6/10 for the term “graphics” and I also had position 3 for it (despite it seemingly being such a popular search term). I was very happy about this and let that keyword run for the first day. When I checked my campaign the next day, this keyword got over 8,500 impressions, but only 20 clicks (about a fifth of the 1% Google recommends). My other keywords were barely shown at all since this one was so much more popular in search terms. Out of those 20 visits, I’ve received no sign-ups. Needless to say, that was the first keyword to get disabled.

The second gotcha I ran into is the display network. These are the ads that appear on other people’s blogs, websites, and news stories. Google recommends leaving them on for inexperienced AdWords users (so that your ad can appear in those places as well), and I did just that in the beginning. I was told at the AdWords workshop that the click-through rate for the display network is lower, somewhere around 0.3%, so I was expecting fewer clicks there. I figured that whatever clicks I get there would just be a bonus. So the next day (after disabling “graphics” and other generic terms) I noticed that the number of impressions for my other terms on Google had not increased much, but I managed to get about 9000 impressions on the display network, with only 1 click-through. That’s when I decided that the display network needs to go before it hurts my quality score on Google as well. My guess is that the click-through rate is this low because people are actively engaged in the website, whereas with Google they’re still searching for something to engage in. It’s a lot easier for an ad to be more interesting than other search results than to be more interesting than the article the person already clicked on.

After addressing those issues, my campaign started doing a lot better. I get fewer impressions now, but my click-through rate is way higher, not to mention that visitors actually sign up to use Grafpad. I still check my AdWords account every day, disabling words for which my position is low and tweaking my ad wording if I see a trend of all keywords getting a low quality value compared to my other ads.

App and Website
There haven’t been major changes to Grafpad this week, although I have been slowly fixing bugs and other annoyances. I also observed several first-time users play with Grafpad, which was even more revealing than reading the feedback. I noticed, for example, that several users got confused with how right-click works. They expected that right-clicking outside of the right-click menu while it’s active would close the menu instead of opening it in a new place. Perhaps I need to replace the right-click menu with a “figure” menu that becomes active when a figure is selected (the menu can either appear inside the selection box or above the two other menus).


Metrics
Acquisition: 0.17%
There were 34 visitors to my website this week out of 19,878 ad impressions. As I mentioned earlier, it was the use of generic terms and the display network that hurt me the most (costing me about 17,000 impressions with virtually no clicks). I believe next week I will be getting much closer to the 1% click-through Google recommends. Already I’m seeing 2-3% click-through for some search terms I’m using.

Activation: 14.71%
Since I haven’t done anything this week to improve activation, I would say that last week’s experiment is confirmed. Giving people free storage and comparing the free account to the paid one on the main page does improve the activation rate.


Retention: 40%
Out of 5 people that signed up this week, 2 used their account more than once within the same week. I still don’t want to draw any conclusions since the percentages are marginally different between the weeks and the number of people is in the single digits.

Referral: 0%
No referrals again, although I do seem to be getting more and more people following me on Twitter. I’m not sure whether it’s my blog, Grafpad, or simply common interests.

Revenue: 0%
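
For anyone who wants to check the arithmetic, the percentages above can be reproduced from the raw counts. The sketch assumes acquisition is visitors over impressions, activation is sign-ups over visitors, and retention is returning users over sign-ups, which matches the reported figures. percent is a hypothetical helper name:

```javascript
// Percentage rounded to two decimal places.
function percent(part, whole) {
    return Math.round((part / whole) * 10000) / 100;
}

// Raw counts reported for this week.
var impressions = 19878, visitors = 34, signups = 5, returning = 2;

var acquisition = percent(visitors, impressions); // 0.17
var activation = percent(signups, visitors);      // 14.71
var retention = percent(returning, signups);      // 40
```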

Plans for Next Week
I started trying to add touch-gesture support to Grafpad last week, to make Grafpad usable with iPad and Android devices. So far I’ve been unsuccessful (I tried to add it cleanly via Pyjamas, the same way mouse and keyboard handlers work). Next week I might try adding it via a hack, a JavaScript hook that triggers Pyjamas mouse handlers. I’ve been getting a lot of feedback about making my app iPad-compliant. If I can at least add partial support for that, I can gauge the popularity of that idea. I do believe the iPad market could be its own niche, and I want to try concentrating on it since many people don’t seem to like drawing with a mouse.