RapydScript is Self-Hosting

While the RapydScript community is small, our members are passionate about the project, and I’m grateful for that. Salvatore, for example, has put together a number of demos showcasing RapydScript integration with WebGL, NodeBox, GlowScript, and a number of other JavaScript projects. Charles made a Chip’s Challenge clone. There are also examples of RapydScript integration with D3, and a Paint app that uses HTML5 canvas and jQuery. When we started a little over a year ago, members would ask us “can RapydScript work with ___?”. Now instead of just saying “yes” we can often point them to a demo.

To take things one step further, I decided to port the compiler itself (originally in JavaScript) to RapydScript as well. This effort took a few weeks and a lot of tweaking to a Decompiler project that was put together a while ago. The decompilation wasn’t without issues, and I had to manually tweak some code. Along the way, I also discovered a few minor bugs in RapydScript, which I fixed in the process. Overall, however, I was pleasantly surprised by how easy it was to port the code. The ported version is not written in the Pythonic style that RapydScript tries to encourage, because the code generation was mostly automated, yet it already looks more legible than the JavaScript version it was ported from. I’m very proud of this, since self-hosting is an important milestone for a compiler, and I’m not yet aware of any other Python-to-JS compiler that can say the same (they’re all written in Python, but the subset of Python that they support isn’t enough to compile the compiler itself).

I hope that this will make future enhancements and tweaks to RapydScript easier, both for myself and other members of the community. As a bonus, I’ve also added a kwargs implementation to RapydScript. It works similarly to Python’s, but not exactly like it (for performance reasons), so I suggest you read the manual on it before use to avoid surprises.

What is Pythonic?

Occasionally I get told that RapydScript is not Pythonic enough because I don’t support a certain feature of Python, for example, string interpolation out of the box. I don’t do this because there are already JavaScript alternatives that do a very good job (http://www.diveintojavascript.com/projects/javascript-sprintf), and I don’t like reinventing the wheel. The purpose of RapydScript is not to mimic Python functionality exactly, but to make JavaScript development more sane for those who’re used to Python (or can’t stand CoffeeScript’s Perl-like syntax).

RapydScript takes a different approach from other Python-in-a-browser frameworks: I do not try to reproduce Python functionality in a browser. In fact, Guido himself argued against that for the same reason I argued against Pyjamas. See his answer to “Python in the browser?” from this interview: http://developers.slashdot.org/story/13/08/25/2115204/interviews-guido-van-rossum-answers-your-questions

The %-based syntax simply doesn’t make sense to implement, because the % operator is already reserved for modulo division both in JavaScript and in Python. Overriding that would require a new function, something along the lines of _$rapyd$_interpolation_or_modulo(), which would need to check at runtime what the user meant to do based on the arguments. This kind of overhead, for logic that’s already achievable via a more explicit sprintf, is why Pyjamas apps were a pain to debug. The "{}".format("foo") syntax, on the other hand, I have no problem with: it requires no compiler changes and does not conflict with existing functionality. I don’t need that syntax because the sprintf plugin does the job for me, but feel free to submit a library that implements it yourself if you need it. That is, after all, how open source works.
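To make that runtime cost concrete, here is a rough sketch of what such a dispatch helper might have to look like. The name interpolationOrModulo and the tiny %s/%d formatter are made up for illustration; this is not code that RapydScript generates.

```javascript
// Hypothetical helper a compiler would have to call for every `%` in the source.
function interpolationOrModulo(left, right) {
    if (typeof left === 'string') {
        // Treat it as interpolation; accept a single value or an array of values.
        var args = Array.isArray(right) ? right.slice() : [right];
        return left.replace(/%[sd]/g, function () {
            return String(args.shift());
        });
    }
    // Anything else falls back to plain modulo.
    return left % right;
}

var remainder = interpolationOrModulo(7, 3);
var message = interpolationOrModulo('%s has %d items', ['cart', 3]);
```

Even in this toy form, every plain a % b now pays for a function call and a type check, and a stack trace would point into the helper rather than at the line you wrote.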

At the end of the day, you should ask yourself: What is “pythonic”? Is it something that reproduces Python’s functionality exactly? No, because Python itself evolves. Is Python 2 more pythonic than Python 3 because it was there first or is Python 3 more pythonic than Python 2 because it simplifies parts that might not have made sense? Is something pythonic because it follows The Zen of Python? If so, then isn’t an explicit sprintf more pythonic than % operator, which can be confused with modulo division based on context? I certainly think so.

Since its inception, Python’s main selling point was how similar it was to pseudocode that you would see in textbooks, not its shortcuts. RapydScript tries to stay true to that philosophy by cleaning up parts of JavaScript that don’t make sense and leaving the parts that do alone. That’s the main difference between RapydScript and regular Python-in-a-browser compilers. RapydScript doesn’t try to replicate all of Python’s syntax; that’s a fool’s errand. RapydScript tries to introduce the clarity of Python into JavaScript.

At the end of the day, you have to realize that JavaScript did not start developing its main feature-set until a few years ago. As a result, it could learn from mistakes of many other languages, including Python. If “pythonic” is something that follows Python’s philosophy and simplicity, then in some ways, the new JavaScript is actually more pythonic than Python. If you don’t believe me, look at CasperJS and compare its simplicity against any of Python’s web-scraping alternatives. Better yet, compare the simplicity of building a GUI with HTML5 and jQuery vs building a GUI in Python.

Scoping in JavaScript

I do a lot of programming in Perl, not because I like it, but because the company I work for uses it as its main language. In fact, I hate Perl (it tries to be overly implicit), but I do like how it handles scoping. Any variable declared inside a block (anything surrounded by {} brackets) will be local to that inner-most block. This means that variables declared inside loops, conditionals or even stand-alone {} will not be seen from the outside.

JavaScript will not localize variables like this. Any var declaration will simply be scoped to the inner-most function. That doesn’t mean, however, that you can’t use (function() {})(); the same way you would use the brackets in Perl. In fact, you’re probably already familiar with wrapping chunks of code in the above pattern. Many developers do it to prevent leaking variables into the global scope, including Facebook’s Like button:

(function(d, s, id) {
  var js, fjs = d.getElementsByTagName(s)[0];
  if (d.getElementById(id)) return;
  js = d.createElement(s); js.id = id;
  js.src = "//connect.facebook.net/en_US/all.js#xfbml=1";
  fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));

A similar pattern, however, can also be applied anywhere else in your code. Consider, for example, a page with multiple elements sharing a similar ID structure, varying only by the index used within the ID. We then want to iterate through these elements, giving them all a click-handler. The following jQuery-based code seems like it will do the job:

for (var i=0; i<n; i++) {
    var cachedi = i;
    $('#' + i + '-element').click(function() {
        $('#' + cachedi + '-popup').show();
    })
}

At first glance, this code looks fine. We made sure to cache the index so that cachedi gets the value of i at the time of the function creation rather than using i directly, which would use i at the time of the function call (after the loop terminates and i is set to n). However, running the above code, we still get every element trying to show the same popup, the one with the [n-1]-popup ID (the last value assigned to cachedi). The problem is that our declaration of cachedi gets hoisted out of the for loop, so the same instance of the variable is shared by every closure we generate inside the loop. There is an easy work-around, however:

for (var i=0; i<n; i++) {
    (function() {
        var cachedi = i;
        $('#' + i + '-element').click(function() {
            $('#' + cachedi + '-popup').show();
        })
    })();
}

Now our code works as expected. This is a handy trick for anyone wishing saner scoping in JavaScript. In fact, I’d prefer that RapydScript would scope things this way too, but that would contradict Python’s loop scoping. An alternative to this trick (and probably a more orthodox solution in JavaScript) would be to move the cachedi declaration inside the function making use of this closure.
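That more orthodox alternative can be sketched without jQuery. In the snippet below, makeHandler is a hypothetical stand-in for the click-handler setup: instead of showing a popup, each handler just returns the ID it would have used, which makes the result easy to inspect.

```javascript
// Each call to makeHandler gets its own `index`, because function
// parameters are local to that particular call.
function makeHandler(index) {
    return function () {
        return '#' + index + '-popup';
    };
}

var handlers = [];
for (var i = 0; i < 3; i++) {
    handlers.push(makeHandler(i));
}

// Invoke every handler after the loop has finished, as a click would.
var popupIds = handlers.map(function (handler) {
    return handler();
});
```

Each handler keeps the index it was created with, so popupIds ends up with three different IDs rather than three copies of the last one.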

Erlang & Ejabberd on OpenShift

I’m mixing it up a little here – this post is about the backend for our GrafPad site and not related to Python or RapydScript. We will probably be using XMPP for some new GrafPad features. Sure, there are Python XMPP servers, but we won’t touch the XMPP server code, we’ll only talk to it, so the language doesn’t matter. We’re going to use the most proven XMPP server, ejabberd, which happens to be written in Erlang. Unfortunately, setting up ejabberd, or any Erlang application for that matter, is a nontrivial task on OpenShift. In most places Erlang likes to automatically bind to 0.0.0.0 and/or 127.0.0.1, neither of which is accessible on OpenShift. Even when you provide an IP list to serve on, Erlang internally adds 127.0.0.1. On top of that, Erlang likes to use ports that are not allowed on OpenShift.

After messing around for a day or two, I discovered that these issues cannot be overcome by configs – the changes had to be made in the original Erlang source. Also, fair warning – my goal was to get something working. The solution I have works, but it’s not production ready. I have a repo with the modified Erlang/OTP source, so feel free to make this better!

There are two core problems that need to be solved. All of the issues come down to two restrictions:

  • You cannot bind to 0.0.0.0 or 127.0.0.1
  • The only usable ports are 8080 and 15000-30000.

Solving these problems is tricky. I originally set out to find every place where something bound to 0.0.0.0 or 127.0.0.1 and change the address it binds to, but there were still issues starting it up even after changing the values everywhere obvious in the sources (grepping for {0,0,0,0} and 0.0.0.0). I ended up changing the binding code itself so that it swaps the IP for $OPENSHIFT_DIY_IP whenever the requested address is 0.0.0.0 or 127.0.0.1. I like this solution a lot better since it accomplishes exactly what I want, and I won’t have to make changes to libraries that try to bind to 0.0.0.0.

So how can you get this set up? It should be pretty easy using my Erlang source. I thought about putting this in a repo that would build when the repo was pushed, but the build takes more than an hour, so it always gets stopped midway through. I saw messages like Shell command '.../.openshift/action_hooks/build' exceeded timeout of 3516. Instead, you have to SSH onto the machine and run the following script manually:

cd $OPENSHIFT_TMP_DIR
wget https://github.com/charleslaw/otp/archive/openshift.zip -O openshift.zip
unzip openshift.zip
cd $OPENSHIFT_TMP_DIR/otp-openshift
sed -i 's/{0,0,0,0}/'"{${OPENSHIFT_DIY_IP//[.]/,}}"'/g'  ./lib/erl_interface/src/connect/eirecv.c
ERL_ROOT=$OPENSHIFT_DATA_DIR/erl_home
./otp_build autoconf
./configure --prefix=$ERL_ROOT --without-termcap
make
make install

After it is done, you can test it by starting epmd and seeing that it works:

$OPENSHIFT_DATA_DIR/erl_home/bin/epmd -address $OPENSHIFT_DIY_IP -debug

Next, you have to install ejabberd. You can do this by running this script:

ERL_ROOT=$OPENSHIFT_DATA_DIR/erl_home

cd $OPENSHIFT_TMP_DIR
wget http://downloads.sourceforge.net/expat/expat-2.1.0.tar.gz
tar xzvf expat-2.1.0.tar.gz
cd expat-2.1.0
./configure --prefix=$ERL_ROOT
make
make install

cd $OPENSHIFT_TMP_DIR
wget http://github.com/processone/ejabberd/archive/v2.1.13.tar.gz -O v2.1.13.tar.gz
tar xvf v2.1.13.tar.gz
cd ejabberd-2.1.13/src/
export PATH=$ERL_ROOT/bin:$PATH
./configure --prefix=$ERL_ROOT
make
make install

sed -i 's/localhost/$OPENSHIFT_DIY_IP/g' $ERL_ROOT/sbin/ejabberdctl
sed -i 's/localhost/'"$HOSTNAME"'/g' $ERL_ROOT/etc/ejabberd/inetrc
sed -i 's/127,0,0,1/'"${OPENSHIFT_DIY_IP//[.]/,}"'/g' $ERL_ROOT/etc/ejabberd/inetrc
sed -i 's/127,0,0,1/'"${OPENSHIFT_DIY_IP//[.]/,}"'/g' $ERL_ROOT/etc/ejabberd/ejabberdctl.cfg
sed -i 's/5280/8080/g' $ERL_ROOT/etc/ejabberd/ejabberd.cfg

EDIT (added based on feedback): Make sure port 8080 is free. If you run ps -ef and see a ruby app running, you’ll need to kill it. (If you want to confirm that it’s the one bound to port 8080, run netstat -tulpn | grep $OPENSHIFT_DIY_IP.)

Next, you can start ejabberd by running the following two commands, which you’ll want to put in your .openshift/action_hooks/start script:

$OPENSHIFT_DATA_DIR/erl_home/bin/epmd -address $OPENSHIFT_DIY_IP &
$OPENSHIFT_DATA_DIR/erl_home/sbin/ejabberdctl start

If there are no errors, you should be all set! At this point you’ll want to configure ejabberd to your own liking. If you don’t want to keep localhost as an XMPP host, change it in $OPENSHIFT_DATA_DIR/erl_home/etc/ejabberd/ejabberd.cfg. You will also need an admin user on one of your hosts to log in to the admin interface, so run the following command (using your host instead of localhost):

$OPENSHIFT_DATA_DIR/erl_home/sbin/ejabberdctl register admin localhost password1234

My app was at http://erl-jabberserver.rhcloud.com, so I went to http://erl-jabberserver.rhcloud.com/admin and logged in using username admin@localhost and password password1234. Success! I can also connect to it using any XMPP software that supports BOSH. The default setup does not have encryption, so make sure not to require it for testing.

So this works for us since we’re only experimenting with this at this point. But it could be better. Specifically, right now when port 0 is specified, I pick a random port between 20001 & 30000. This should probably instead pick a port that is not used in that range. There may be other issues that I just haven’t run into, but this is a good start.
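The port-selection improvement could follow a rule like the one sketched below. This is only an illustration of the idea in JavaScript; the actual change would live in the Erlang/OTP source, and both the pickFreePort name and the assumption that a list of used ports is available are mine.

```javascript
// Hypothetical helper: return the first port in [low, high] that is not
// listed in usedPorts, or null when the whole range is taken.
function pickFreePort(usedPorts, low, high) {
    var used = {};
    usedPorts.forEach(function (port) {
        used[port] = true;
    });
    for (var port = low; port <= high; port++) {
        if (!used[port]) {
            return port;
        }
    }
    return null;
}

var chosenPort = pickFreePort([20001, 20002], 20001, 30000);
```

In practice it may be simpler to attempt the bind and move on to the next port when it fails, since a list of used ports can go stale between the check and the bind.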

Python to JavaScript: Compilers vs. Translators

One thing Alex and I always say about RapydScript is that it is really JavaScript with a Pythonic syntax. Following that, people often ask me what that means for them – they want to know how this affects how they develop code, and how it is different from something like Pyjamas/Pyjs. I want to answer that here for anyone who has wondered what “RapydScript is Pythonic JavaScript” means, how compilers like Pyjs are different from translators like RapydScript, and why I (full disclosure) prefer translators.

An Example

A clear example of the difference is with division. Say your source looks like:

a = 5
b = 0
c = a / b

The translator will output JavaScript like:

var a = 5;
var b = 0;
var c = a / b;

The translation process is very simple to understand – it’s pretty much just changing the syntax, but this leads to some gotchas. When this code runs, c will be set to Infinity, a JavaScript constant, while the original Pythonic source would have raised an exception.

A compiler, on the other hand, attempts to mimic Python exactly so it may have an output more similar to:

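// Minimal stand-in for Python's ZeroDivisionError
function ZeroDivisionError(message) {
    this.name = 'ZeroDivisionError';
    this.message = message;
}
var py_int = function(val){
    this.val = val;
};
py_int.prototype.div = function(denom){
    if (denom.val == 0) {
        throw new ZeroDivisionError('division by zero');
    }
    // wrap the result so it stays a py_int
    return new py_int(this.val / denom.val);
};
var a = new py_int(5);
var b = new py_int(0);
var c = a.div(b);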

The variables here will all be objects that include methods for all the operations. The division doesn’t directly divide 2 numbers, it runs the divide method in the objects. So when this code runs it will throw a ZeroDivisionError exception just like Python does.

So what are the tradeoffs?

Writing with a compiler is nice because you get to think like a Python developer, which can abstract away some things like cross-browser support. It also means that, in many cases, code can be moved between the frontend and backend with no changes. So it’s easy to have Python code compile to JavaScript. But if you’re doing something that’s JavaScript-specific, like getting HTML elements, taking in keyboard inputs, etc., the compiler you’re using will have to have a working and documented API for accessing these functions.

The real drawback, though, is that the output code is slower, significantly heavier, and, with the compilers I’ve used, unreadable. There are several issues I have with this, but it really boils down to two things: unreadable code leads to useless tracebacks when running code in a browser, and apps don’t run (or don’t run well) on mobile devices.

Translators take a very different approach and don’t try to run just like Python. The idea here is to do 80% of the work for 20% of the cost. Your code may look like Python but it will run like JavaScript, as with the division example. There’s a lot of overlap between the two languages, but they’re not exactly the same, so the main drawback here is that you might see some unexpected, but predictable, behavior. This is easy to deal with if you know JavaScript, and if you don’t, the differences are easy to learn.
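A few of those differences are shown below as plain JavaScript, with the Python result noted in the comments. Whether a particular translator passes each of these through untouched is an assumption on my part; the point is only what “runs like JavaScript” means in practice.

```javascript
// Division by zero yields Infinity; Python raises ZeroDivisionError.
var quotient = 5 / 0;

// An empty array is truthy; in Python an empty list is falsy.
var emptyIsTruthy = Boolean([]);

// The remainder keeps the sign of the left operand; Python gives 2 here.
var remainder = -7 % 3;

// + on arrays converts both sides to strings; Python concatenates lists.
var joined = [1, 2] + [3];
```

None of these are random: each follows a documented JavaScript rule, which is why I call the behavior unexpected but predictable.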

There are some nice benefits to using a translator. Your input code and output code will look very similar. They will be roughly the same size and it will be easy to map a line with an error in the JavaScript output back to the Pythonic input for easy debugging. The output will be on the order of kB instead of MB and will run faster.

So those are the main differences between compilers and translators like RapydScript. RapydScript is really JavaScript behind the scenes so it will behave differently than Python, which lets it run a lot more efficiently.

Which is right for you?

The choice of which to use comes down to a few things:

  • First, something I have not mentioned, libraries. In general, JavaScript libraries work better with translators and Python libraries work better with compilers. The one caveat is compilers can only translate Pure Python, so if you’re using something like numpy, which uses C, there’s no easy answer for you.
  • Second, if you don’t know any JavaScript, you will have a tougher time with a translator. Speaking from experience though, JavaScript is not very different from Python, and I encourage you to try a translator because you’ll save time debugging, and your app will be more maintainable in the long term.
  • Lastly is performance requirements. If performance is important, you will want to use a translator over a compiler.

I think it makes sense for a beginner who may be writing a simple internal app that won’t have any performance requirements to use a compiler. But if you’ll be writing many apps, or even a complex one, I would go with a translator. Having picked up the differences between JavaScript and Python, I now exclusively use RapydScript, a translator, even if I don’t need the performance. It’s easier to debug in the browser where it actually runs, which saves me time.

Implicit Logic Is Not Your Friend

When creating RapydML and RapydScript, I had to make quite a few design choices – similar design choices other developers make when coming up with a new language, or even an API. For inspiration, I’ve looked into Python, existing JavaScript abstraction languages like CoffeeScript, and even JavaScript itself. While doing so, I’ve noticed a few features in CoffeeScript and related languages that should never have been borrowed from Ruby, and that Ruby in turn should never have borrowed from Perl. Most of these features relate to implicit logic, where the compiler makes assumptions for you. While they seem like nice shortcuts at first, more often than not, they harm your productivity more than they help. In fact, they’re not shortcuts at all, but rather branching paths in a maze that often lead to a dead end.

You’ve probably already been bitten by a few of these implicit “shortcuts” in the past, such as JavaScript’s “optional” semi-colons. If this feature didn’t exist, the compiler would complain about the missing semi-colon as soon as the page loads, and you would be able to fix the bug right away. But since it’s a “feature”, JavaScript tries to guess where to insert the semi-colon for you. As a rule of thumb, whenever you have the compiler guessing anything, you’re asking for trouble. You’ve probably already seen an example bug resulting from this logic, something along the lines of:

return
    {
        font: 'Verdana'
    };

The intent here was to return the object literal. Instead, JavaScript assumes a semi-colon at the end of the return statement and returns nothing. While I would disagree with such alignment of the return statement anyway, I can definitely understand the frustration of a programmer who wrote this. An easy solution would be to move the bracket to the same line as the return statement, but a novice programmer who is unaware of this and is following a simple code convention that says a curly bracket must have the same indentation as its matching bracket will likely let this one slip through the cracks.
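Here is a trimmed-down, runnable version of the same mistake next to the fix. The function names brokenStyle and fixedStyle are just labels for this sketch.

```javascript
function brokenStyle() {
    // A semi-colon is inserted right after `return`, so the function
    // returns undefined and the block below is never reached.
    return
    {
        font: 'Verdana'
    };
}

function fixedStyle() {
    // With the opening bracket on the same line as `return`,
    // the object literal is returned as intended.
    return {
        font: 'Verdana'
    };
}

var broken = brokenStyle();
var fixed = fixedStyle();
```

Nothing complains when this loads; broken simply ends up undefined, and the failure only surfaces later, wherever the missing font attribute is first used.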

As you can see, implicit semi-colons traded an easy-to-find bug that we could have fixed at compile time for a more annoying one that we won’t find until several hours of debugging later. Some might argue that this is an easy bug to prevent if the programmer knows the language, but the truth is most bugs are easy to prevent if you design your code conventions around them. All code conventions do is train the eye to notice errors, and in this case JavaScript does the reverse. In most languages it’s either the semi-colon or the newline that finalizes a statement, and your eye is trained to look for them. In JavaScript, it’s the semi-colon, unless there is a newline, unless the statement is incomplete. Your eye can’t do that kind of logic, and your brain should be scanning for more serious bugs. This is a common trend I’ve noticed with implicit logic: it prevents easily-detectable bugs at the expense of more devious ones later on.

Let’s look at a few more examples. CoffeeScript introduced optional parentheses (like Ruby and Perl). At first it seems like a cool feature, the code has less clutter in it and we save a character. The problems start occurring when we wrap function calls, or even use multiple arguments. For example, let’s say you’ve written some code and a few weeks later noticed a bug. You traced the bug to this line:

a b,c d

Without additional context, you have no way of telling what the bug is by glancing at this line, or even what the line is trying to do. Was d supposed to be a third argument to a and you accidentally omitted the comma? Was the comma placed there in error and b is a method that was supposed to take c(d) as an argument? Was the comma supposed to be between c and d instead? Had you used parentheses, the error would immediately be obvious without looking at the definitions of these variables. In fact, you probably wouldn’t have made it in the first place.

Sure, this example uses poor variable names, but if you’ve been developing for a while, you’ve probably noticed that unless there are strict code conventions, many projects’ variable names aren’t much better. And even if you do use good naming conventions, you’re not immune from this. Imagine if the line you were debugging looked like this instead:

my_function MyClass ['item']

Was the intent here to pass a new instance of MyClass (whose constructor was initialized using an array consisting of 1 string) or to pass the item attribute of MyClass? LiveScript takes this “feature” a step further, making commas implicit as well for non-callable arguments (strings, numbers, arrays), making things even more ambiguous. Take a look at the following line of valid LiveScript, and try to figure out who’s calling whom with what arguments:

a b c 1 [d 2] 3 e [f 'g' h] 4 i [j 5] k 'l' m

This is great for code golf and maybe riddles, but I definitely don’t want to see this kind of code in my project.

Shall we continue with more examples? How about implicit returns? Automatically returning the last-performed operation of a function seems like a great idea, because we can’t be bothered with putting 6 extra characters at the bottom of our function to signify a proper return. Too bad you (or another developer) could miss the subtle returns when modifying the function later.

For example, let’s imagine you have a function with an implicit return whose return value is used by another function. Several months later you notice a bug due to the function not resetting some global setting or a setting in the class it belongs to. Being a busy guy, you delegate this task to another developer. Sure enough, he goes and fixes the bug by setting that global/class setting correctly at the end of the function. Too bad he forgot to check that another function was using this function’s return value. If you’re lucky, the code will break as soon as it runs, and the developer will notice his error and fix it before submitting the change. If you’re not lucky, the affected logic won’t get triggered during the test (not all tests have 100% coverage), the developer will submit broken code, and you will pat him on the back for doing a good job.

Even if you’re perfect, and never make mistakes, code is rarely developed in isolation. It’s in your interest to make code easy to understand to other developers, not just yourself. But if you’re like the rest of us, mortals, you will probably break your own code if you have to deal with it several months later. As another example, let’s imagine you have a long function with the following format (assuming implicit returns):

def fun(args):
    ...
    some_var = ...
    ...
    if SOME_GLOBAL_VAR == True:
        if some_var:
            ...
        else:
            ...
    else:
        ...

Let’s also imagine that you’re calling it from multiple places, one of which uses its return value for doing additional computations. Now imagine that you’ve modified the logic in one of the other places calling this function (one that previously didn’t need the return value, and that always sets SOME_GLOBAL_VAR to True before calling fun()) such that it now needs to know if some_var got set or not. “No problem,” you say to yourself, slapping “return some_var” at the end of the outer “if” block, breaking the implicit return that one of the other functions was expecting.

There are countless other examples of implicit logic in languages that seemed like a great idea at first, but with time proved to do more harm than good. Some examples are:

  • JavaScript/Perl functions automatically discarding extra arguments
  • JavaScript/Perl functions automatically setting missing arguments to undefined
  • JavaScript implicitly converting operand types when using + operator
  • JavaScript implicitly converting unrelated types when using ==
  • JavaScript/C++ making brackets optional for single-line conditional statements
  • Switch statements without break in JavaScript/C++ automatically falling through to next case
  • Object attributes defaulting to public in Python
  • JavaScript assuming global scope when var isn’t used
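Several of the JavaScript items from this list can be reproduced in a few lines. The function names second and describe are placeholders for this sketch.

```javascript
function second(a, b) {
    return b;
}

var extraIgnored = second(1, 2, 3);  // 2: the third argument is silently dropped
var missingArg = second(1);          // undefined: no error for the missing argument
var concatenated = '5' + 1;          // '51': the number is converted to a string
var subtracted = '5' - 1;            // 4: here the string is converted to a number
var looseEqual = (0 == '');          // true: == converts both sides before comparing

function describe(n) {
    var seen = [];
    switch (n) {
        case 1:
            seen.push('one');        // no break, so execution falls through
        case 2:
            seen.push('two');
            break;
        default:
            seen.push('other');
    }
    return seen;
}

var fellThrough = describe(1);       // ['one', 'two']
```

None of these lines raises an error or a warning, which is exactly the problem: the mistake only shows up later as a wrong value somewhere else.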

There are very few cases when implicit logic doesn’t cause confusion. A couple that come to mind are tuple packing/unpacking in Python and implicit boolean typecasting in many languages’ if statements without having to say == True. As a rule of thumb, if you’re asking yourself whether you should make something implicit, you probably should not.

To summarize, here are all the reasons why implicit anything is bad:

  • It saves time when writing the code at the expense of time spent debugging it
  • It makes code more ambiguous to other developers as well as yourself in the future
  • In cases when it relies on compiler inferring your intent, it can be inferred incorrectly (or rather your assumptions about how it will be interpreted could be incorrect)
  • It makes the code depend on nearby context, increasing the likelihood that something will break when you add more logic
  • It hides some of the logic from the untrained eye, increasing the likelihood that something will break when you add more logic and you won’t notice it
  • It hides some of the logic from the untrained eye, increasing the likelihood that something will be lost in translation when refactoring the code, or rewriting it in a different language

Even if you never make mistakes, you probably have other developers on the team. It’s in your interest to make the code clear to them, not just yourself. You want to decrease ambiguity, and implicit logic does the opposite.

RapydScript II

A while back I mentioned that I wanted to rewrite RapydScript to rely on an internal AST (Abstract Syntax Tree) structure rather than PyMeta, which would give it several advantages, including code transformations prior to output, more consistent parsing, and better error detection/handling.

I have looked at several tools for helping me accomplish this. One of the most promising seemed to be CoffeeScript, that is, until I actually started converting its source code into RapydScript (via an automated AST-to-RapydScript generating script I put together). Having discovered that the output is very ugly and unpythonic, I gave up on the idea. After some more searching, however, I found the right tool for the job. Ironically, it was UglifyJS, a tool designed to make your JavaScript less readable.

The UglifyJS code base is very clean, and although it uses its own format that doesn’t directly translate into Python, the logic is well-structured and it’s easy to see where the code could be replaced by Python/RapydScript constructs. The code does a good job of being explicit rather than relying on various JavaScript hacks, like CoffeeScript does.

Using UglifyJS as a base, I rewrote the compiler to convert RapydScript into native JavaScript. You can get it here: https://github.com/atsepkov/RapydScript. There are a couple minor features from the original compiler RapydScript II doesn’t support yet, but it’s already better than the original in several ways:

  • It compiles code much faster (on the order of milliseconds instead of seconds)
  • It handles leading whitespace much better now (same way as native Python would), so it will not complain if comments or secondary lines mix spaces and tabs
  • It will properly localize outer-scope variables when using module wrapper
  • It’s a single-pass compiler that uses consistent logic to identify tokens, whereas the original RapydScript would sometimes try to perform the same logic in Python stage that it would later do using PyMeta, resulting in potential parsing inconsistencies in certain special cases
  • It supports some notations that the original does not (such as the code shown later in this post)
  • It handles if val in array the same way Python would, rather than JavaScript, checking for existence of the value rather than the index
  • It’s more resistant to bad code, reporting errors that are more relevant and identifying the exact line/column number that caused the error
  • It’s more resistant to valid code that doesn’t follow typical syntax conventions, parsing it correctly rather than erroring out
  • The class-parsing logic has been improved to allow better durability and some requirements of the original compiler no longer apply (the __init__ function no longer needs to be first in the class body, for example, classes can now be nested and even declared inside functions)
  • It is written in JavaScript, which means it can eventually be ported to native RapydScript. It also means it can be modified to run in a browser, and that we can actually test function rather than structure in our unit and integration tests
  • It’s AST-aware, so it doesn’t need to generate excessive parentheses in the output for safety like the original compiler
  • Since the compiler is AST-based, it doesn’t need to generate output at the time of parsing, which makes the compiler more flexible and allows various code transformations and analysis to take place before the output is produced. This also makes it easier to make changes to the way output is generated in the future, should we need to do so.

The compiler already supports most of the features of the original. The only features that aren’t yet supported are:

  • (UPDATE: these now work) Chained Comparisons (1 < x < 5)
  • (UPDATE: these now work) List Comprehensions
  • (UPDATE: these now work) Inline Functions
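
For readers unfamiliar with the first two features, here is what they mean in plain Python, which is the behaviour RapydScript aims to mirror. This is an illustrative sketch only; the variable names are made up for the example and it says nothing about the JavaScript the compiler actually emits.

```python
x = 3

# A chained comparison is shorthand for two comparisons joined by `and`,
# with the middle operand evaluated only once.
chained = 1 < x < 5
expanded = (1 < x) and (x < 5)

# A list comprehension is shorthand for building a list in a loop.
squares = [n * n for n in range(5) if n % 2 == 0]

squares_loop = []
for n in range(5):
    if n % 2 == 0:
        squares_loop.append(n * n)
```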

Everything else should work at least as well as the original, and in some cases much better. For example, you can now do the following:

(def(a):
    return a
)(1)

Here is a list of other features that the original compiler does not support:

  • Logic such as 5 in [5,6,7] now correctly returns True
  • stdlib is no longer required for using for loops, len() or print(); more functions will be moved out of stdlib with time
  • Negative array indices now compile to arr[arr.length-1] instead of using slices, which allows assigning to them and not just reading them (you still can’t use variables as negative indices)
  • Simple minifier is now built into the compiler (it will remove whitespace and unnecessary punctuation)
  • Pythonic imports are now supported via the --namespace-imports flag (experimental: some code will probably break, star-imports are not supported yet, and you can set variables in global scope but not modify them)
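
The negative-index rewrite described above amounts to replacing a literal negative index with "length plus index". A minimal Python sketch of that idea follows; `resolve_index` is a hypothetical name used for illustration and is not part of the compiler.

```python
def resolve_index(literal_index, length):
    """Turn a negative literal index into the equivalent non-negative one."""
    return length + literal_index if literal_index < 0 else literal_index

arr = ['a', 'b', 'c']

# -1 resolves to the same position as arr.length-1 would in JavaScript
last = resolve_index(-1, len(arr))

# Because the result is an ordinary index, assignment works as well as lookup
arr[last] = 'z'
```

A slice, by contrast, produces a new array, so assigning through it would not change the original. That is why the index form allows assignment.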

I can’t yet guarantee that your code will work correctly with RapydScript II, but all RapydScript II test cases (which are more rigorous than the original RapydScript tests, in that they also test that the code performs correct logic after compilation) as well as the examples from the original (with slight modifications to remove the unsupported features) seem to work. And as a present for those afraid of GPL, I’m releasing the entire compiler for RapydScript II under the Apache 2.0 license.

RapydScript by Example

With RapydScript gaining popularity, I figured it’s about time for a blog post talking about an example app. After all, I learn fastest by observing examples, and many others probably do as well.

Let’s imagine we wanted to write a game using RapydScript. Since I’m not feeling particularly creative today, let’s concentrate on an existing game, rather than coming up with an idea from scratch. For this example, I’ll use a game called Chip’s Challenge.

Chip's Challenge

Looking at the original game, we quickly notice that the environment is a grid of blocks. Each block type has a corresponding image, and some have an effect on the character stepping on them. If we assume that we’ll use a canvas element for drawing the grid, then each block will need to track its coordinates in the grid and the URL of the image corresponding to this block. Let’s assume we created a canvas element with the id canvas in our HTML document. We now simply need to create a reference to its drawing context in RapydScript:

CANVAS = document.getElementById('canvas').getContext('2d')

Since canvas is not only an object, but also a module containing methods for creating canvas-compatible objects (like images and patterns), we will use a global to reference it rather than properly abstracting it. Next, let’s create a simple block:

BLOCK_SIZE = 32 #pixels
NUM_X_BLOCKS = 21 # horizontal blocks on a field
NUM_Y_BLOCKS = 21 # vertical blocks
NORMAL_BLOCK = 0 # enum identifying block type

class Block:
    def __init__(self, x, y, image_url):
        self.x = x
        self.y = y
        img = new Image()
        img.src = image_url
        self.type = NORMAL_BLOCK
        # createPattern() takes the image and a repetition mode
        self.blockPattern = CANVAS.createPattern(img, 'repeat')

    def redraw(self):
        CANVAS.save()
        CANVAS.fillStyle = self.blockPattern
        CANVAS.fillRect(self.x*BLOCK_SIZE, self.y*BLOCK_SIZE, BLOCK_SIZE, BLOCK_SIZE)
        CANVAS.restore()

Now that we have a basic block, let’s create advanced blocks that can affect Chip in some way. For example, Chip’s Challenge has a water block that Chip can drown in unless he’s wearing flippers. There is also an ice block that makes Chip slide to the next one unless he’s wearing ice skates. Let’s create these two blocks:

WATER_BLOCK = 1
ICE_BLOCK = 2

class WaterBlock(Block):
    def __init__(self, x, y):
        Block.__init__(self, x, y, 'images/water.jpg')
        self.type = WATER_BLOCK

    def effect(self, unit):
        if not unit.hasItem('flippers'):
            unit.die() # drown

class IceBlock(Block):
    def __init__(self, x, y):
        Block.__init__(self, x, y, 'images/ice.jpg')
        self.type = ICE_BLOCK

    def effect(self, unit):
        if not unit.hasItem('skates'):
            unit.move(unit.direction) #slip

Let’s add a character that can move around. We will use Block as a base class since a character is really just another block drawn on top of the existing block. Unlike a regular block, the picture could change depending on the direction the character is facing:

LEFT = 0
UP = 1
RIGHT = 2
DOWN = 3

class Character(Block):
    def __init__(self, x, y, images):
        self.x = x
        self.y = y
        self.patterns = [CANVAS.createPattern(img, 'repeat') for img in images]
        self.direction = DOWN # default facing direction
        self.blockPattern = self.patterns[self.direction]

    def move(self, direction):
        self.direction = direction
        self.blockPattern = self.patterns[direction]
        if direction == DOWN:
            self.y += 1
        elif direction == UP:
            self.y -= 1
        elif direction == LEFT:
            self.x -= 1
        else:
            self.x += 1
        self.redraw()
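
The if/elif chain in move() can also be written as a lookup table of per-direction offsets. The sketch below is plain Python rather than RapydScript, assumes the four directions are numbered 0 through 3, and uses hypothetical names (`OFFSETS`, `step`) that do not appear in the game code.

```python
LEFT, UP, RIGHT, DOWN = 0, 1, 2, 3

# (dx, dy) for each direction; y grows downward, as it does on a canvas
OFFSETS = {
    LEFT: (-1, 0),
    UP: (0, -1),
    RIGHT: (1, 0),
    DOWN: (0, 1),
}

def step(x, y, direction):
    """Return the grid coordinates one block away in the given direction."""
    dx, dy = OFFSETS[direction]
    return x + dx, y + dy
```

One side effect of the table is that an unexpected direction value raises a KeyError instead of silently falling through to the else branch and moving right.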

This character can now serve as the base class both for Chip himself and the monsters that roam around. If we wanted to create a typical monster, for example, we would write:

MONSTER = 10

class Monster(Character):
    def __init__(self, x, y, redrawCallback):
        images = [new Image() for i in range(4)]
        images[DOWN].src = 'image/monsterDown.jpg'
        images[UP].src = 'image/monsterUp.jpg'
        images[LEFT].src = 'image/monsterLeft.jpg'
        images[RIGHT].src = 'image/monsterRight.jpg'
        self.type = MONSTER
        Character.__init__(self, x, y, images)

        # have the monster move randomly every 0.5 seconds
        main = self
        window.setInterval(def(): main.move(Math.floor(Math.random()*4)); redrawCallback();, 500)

    def die(self):
        pass    # monsters don't die

Similarly, let’s create Chip:

CHIP = 11

class Chip(Character):
    def __init__(self, die=False):
        # center Chip
        self.x = int(NUM_X_BLOCKS/2)+1
        self.y = int(NUM_Y_BLOCKS/2)+1
        self.items = {}

        if not die:
            self.deaths = 0
            images = [new Image() for i in range(4)]
            images[DOWN].src = 'image/chipDown.jpg'
            images[UP].src = 'image/chipUp.jpg'
            images[LEFT].src = 'image/chipLeft.jpg'
            images[RIGHT].src = 'image/chipRight.jpg'
            self.type = CHIP
            # patterns are only built on first creation; x and y were set above
            Character.__init__(self, self.x, self.y, images)

    def die(self):
        self.deaths += 1
        self.__init__(True)

    def hasItem(self, item):
        return self.items[item]

    def getItem(self, item):
        self.items[item] = True

We now have most of the basics done. Let’s start putting the pieces together by creating the actual class that renders these. Our main class (which we will call Field) will need to create the grid and populate it. We’ll pass it a matrix of blocks to create the field, and the number of monsters to spawn.

class Field:
    def __init__(self, grid, numMonsters):
        self.grid = grid
        self.monsters = []
        for i in range(numMonsters):
            monster = Monster(
                Math.floor(Math.random()*NUM_X_BLOCKS),
                Math.floor(Math.random()*NUM_Y_BLOCKS),
                self.redraw
            )
            self.monsters.append(monster)
        self.chip = Chip()
        main = self
        moveChip = def(event):
            main.chip.move(event.keyCode - 37)
            block = main.grid[main.chip.x][main.chip.y]
            while block.type != NORMAL_BLOCK:
                block.effect(main.chip)
                landed = main.grid[main.chip.x][main.chip.y]
                if landed == block:
                    break # the effect didn't move Chip (he has the right item)
                block = landed

            for monster in main.monsters:
                if monster.x == main.chip.x and monster.y == main.chip.y:
                    main.chip.die()
            main.redraw()
        window.addEventListener("keydown", moveChip)

    def redraw(self):
        for x_array in self.grid:
            for block in x_array:
                block.redraw()
        for monster in self.monsters:
            monster.redraw()
        self.chip.redraw()
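
The expression event.keyCode - 37 relies on the arrow keys having consecutive key codes: 37 for left, 38 for up, 39 for right and 40 for down. Any other key produces a value outside the range of valid directions. The plain-Python sketch below shows that mapping with a guard for other keys; `direction_from_key` and `ARROW_LEFT` are hypothetical names, not part of the game code.

```python
ARROW_LEFT = 37  # keyCode of the left arrow; up, right and down follow as 38, 39, 40

def direction_from_key(key_code):
    """Map an arrow-key code to a direction 0..3, or None for any other key."""
    offset = key_code - ARROW_LEFT
    return offset if 0 <= offset <= 3 else None
```

A handler built this way could simply return early when the result is None, so that pressing Enter or a letter key does not move Chip.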

We now have an almost complete game (although an inefficient one, since it redraws the entire field every time a monster or the user generates an event). The only thing left to do is to create blocks that provide Chip with flippers and skates. I will leave that exercise to the reader. These items should disappear after being picked up, so I recommend using the Character base class for them and making them die() after Chip runs into them. As you can see, RapydScript code is easy to follow, which is the main benefit of the language.

Why GPL?

Recently I received an email suggesting that I change the license for RapydScript to something other than GPL. I understand that no matter how powerful, a language is useless if no one wants to use it. That’s why it makes sense not to keep it proprietary. That might have worked for Oracle back in 1977, but it will no longer work today. You might be able to convince corporations to use your platform, but not people working on hobby projects. Likewise, the GPL license might scare those who intend to write commercial software. This is not my intent. I do not wish to hurt either group, which is why the RapydScript libraries are already licensed under the Apache license. Part of the fear of the GPL license comes from misunderstanding its terms (especially by those not used to dealing with open-source). The work created using a product under the GPL license is not itself subject to the GPL license. This is important: GPL is not a cancer that keeps spreading through your tools to your product. Rather, GPL is a way to protect your own work from getting stolen and repackaged as someone else’s for profit (e.g. Cedega building upon Wine). Basically, GPL is a way to protect open-source work from plagiarism.

This protection becomes even more important when the plagiarist has the ability to hurt your project in some way. He could steal your product and then damage your ability to work on your own branch of it in an attempt to get rid of the competition. For example, imagine that a fictional ACME Corporation writes an operating system. People try it out, they like it, and soon all devices end up running it. Years go by, ACME becomes rich, and many of the devices evolve. Instead of rewriting the OS for new devices, however, ACME decides to keep patching the original OS. They also realize that they could outsource part of the development to further cut costs. Eventually ACME OS code becomes an unmaintainable mess, and the software itself becomes buggy. Then a few guys, frustrated by the bugs in ACME OS, develop an alternative operating system in the basement of their home, call it FREE OS, and release it as open-source. FREE OS immediately becomes popular. ACME, noticing a loss in its profits, decides to repackage FREE OS as a new version of their own. This in itself might hurt FREE OS by stealing its users (and removing the incentive for developers to work on it). But even if FREE OS already got enough traction such that people stick with it, ACME, having much bigger pockets, could go to hardware manufacturers and pay them to add DRM capability using piracy prevention as an excuse (ironically, piracy is exactly what ACME is doing). ACME could then add a proprietary driver to their OS that is able to play files encrypted with this DRM.

Fortunately, RapydScript is not an OS, and as a language it’s not in danger of being affected by a DRM. That doesn’t mean it can’t be ruined by a corporation, however. The compiler aims to generate fast, lightweight code, compatible with various JavaScript obfuscation mechanisms (it should in theory be compatible with advanced mode of Google Closure, although I haven’t tested this). In fact, the new version of Grafpad itself uses an obfuscator based on RapydScript code (which I haven’t released yet) in combination with UglifyJS. The only thing preventing a 3rd party from coming in and selling their own proprietary obfuscator/optimizer based on RapydScript code without contributing anything back is the GPL license of the compiler. This in itself would not actually hurt the language, but if they start breaking backwards compatibility with RapydScript, that could in fact hurt the community. Once the RapydScript community grows and the language gains some initial popularity, this will not be as much of an issue anymore (the obfuscator will be unlikely to gain traction without full support for RapydScript). At that time I won’t mind changing the license to Apache for the entire project.

I’m not strongly attached to the GPL license, and the libraries are already licensed under Apache, allowing companies to build their own private implementations on top of them. Only the compiler itself is licensed under GPL, and I want to encourage that code to stay open to benefit everyone. GPL seemed like the right license for the job. So far the arguments I hear against GPL from other developers are either due to it being used in the wrong places (the APIs/libraries) or unfounded paranoia due to misunderstanding how the license works (people assuming that work created via a GPLed product is subject to GPL as well).

Eventually, I do plan to release both RapydML and RapydScript under the Apache license. If you believe I should do so now, I would love to hear your argument. As of yet, I do not see a legitimate case of how the GPL license could hurt a company deciding to use RapydScript (aside from their legal department getting paranoid about the ‘GPL’ acronym). If they wish to use RapydScript as a compiler to create proprietary work, the GPL license does not affect their own code. If they wish to reuse RapydScript libraries in their proprietary code, the Apache license of the libraries will allow them to do that. If they wish to make changes to the compiler that will only be used internally by the company, the GPL license will not affect them. If they wish to release a stand-alone tool to be used with RapydScript, RapydScript’s license does not apply. If they wish to make changes to the compiler that would affect the rest of the community, then they have to release the source code for these changes.

RapydScript in a Nutshell

My last post mentioned RapydScript, a JavaScript variant with Pythonic syntax which I’ve come up with to speed up web development. I haven’t had much free time to post since then, but I have been updating RapydScript this whole time. The bitbucket repository for the language started drawing some followers, and I am now getting questions in my inbox regarding the feasibility of developing large-scale applications in RapydScript. To address these concerns, and explain how to leverage RapydScript, I decided to write this post.

Before starting your project, it’s important to understand what RapydScript is and what it is not. RapydScript does not aim to emulate Python in JavaScript like most other Python-to-JS compilers such as Pyjamas, Pyjaco, and Skulpt. RapydScript to Python is what CoffeeScript is to Ruby (actually, RapydScript and Python have more in common). The point is, it’s important to understand that when writing RapydScript, you’re still using JavaScript. The code doesn’t get abstracted into a sandbox or wrapped in special types. This has a few advantages and disadvantages.

Advantages

The main advantage of this approach is closer integration with JavaScript. Your code will load and run faster than similar code written for one of the other Python compilers (except PyvaScript, which will work just as fast). Additionally, your code doesn’t need hacks or wrappers to invoke or create native JavaScript and DOM objects; anything JavaScript has access to, RapydScript has access to. Drawing its power from JavaScript’s prototype, RapydScript’s class inheritance system is more intuitive than Python’s. RapydScript can create object literals (similar to C structs), create anonymous multi-line functions, and do some other goodies that pure Python cannot. You can use $ in your variable names and use jQuery (or any other JS framework) without hacks. You can even debug your code by reading generated JavaScript that works and looks almost the same way as the original code. Basically, RapydScript has all the advantages of native JavaScript. In fact, it’s very easy to debug RapydScript using Chrome’s Developer Tools or Firefox’s Firebug; other Pythonic solutions don’t have this benefit.

The other advantage is that while RapydScript does have a standard library, it does not rely on it to be usable. This means that even if I decide to abandon the project tomorrow for some reason, you will not be stuck with a half-functional solution for your front-end. Even with only a limited subset of the Python library implemented, RapydScript is 100% usable and capable of building large-scale web apps, because its main advantage isn’t in its standard library, but in cleaning up JavaScript syntax. In fact, you can forgo RapydScript’s stdlib.js in your project altogether and replace it with underscore.js. You will also automatically benefit from new JavaScript features as they appear and can continue using the compiler regardless of how old it is.

Disadvantages

The disadvantage of RapydScript (when compared to other Python-to-JS compilers, not counting PyvaScript) is also its close integration with JavaScript. RapydScript does not aim to catch errors JavaScript ignores. If you access an out-of-bounds cell in an array, you will get undefined, not an IndexError (likewise, 1/0 is Infinity). You can’t use negative indexes to traverse arrays in reverse (you can use negative indexes via list.__getitem__(), however). It turns out, however, that for properly written code, these disadvantages aren’t a problem. As long as you understand that the language is not pure Python, and that the coding style is a bit different, you will not have problems with RapydScript.
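
For comparison, this is the Python behaviour that RapydScript deliberately does not reproduce. The sketch is plain Python, and `get_or_default` is a hypothetical helper showing the kind of explicit guard you would write yourself where a missing element matters.

```python
values = [10, 20, 30]

# Python raises on an out-of-bounds read; the compiled JavaScript yields undefined
try:
    values[5]
    out_of_bounds = 'no error'
except IndexError:
    out_of_bounds = 'IndexError'

# Python raises on division by zero; JavaScript evaluates 1/0 to Infinity
try:
    1 / 0
    division = 'no error'
except ZeroDivisionError:
    division = 'ZeroDivisionError'

def get_or_default(seq, index, default=None):
    """Explicit bounds check for code that cannot tolerate a silent miss."""
    return seq[index] if 0 <= index < len(seq) else default
```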

Why is it better than JavaScript?

If RapydScript has the same advantages and disadvantages as JavaScript, the question you might then have is “Why use it instead of JavaScript?”. The answer is “Because it brings many of the same benefits Python programmers already enjoy into JavaScript”. These benefits include all of the following and more:

  • Classes and Inheritance
  • Python Standard Library (only a portion of it is included so far, but more will be with time)
  • Clean, easy-to-read, Pythonic syntax
  • Better Variable Scoping (and Variable Shadowing)
  • Implicit Tuple Packing/Unpacking
  • Better Optional Argument Implementation
  • List comprehensions
  • Ability to import multiple modules into a single chunk of code (allowing easier code reuse)
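
A few of these features are easiest to appreciate in code. The snippet below is plain Python showing the idioms RapydScript borrows (tuple packing and unpacking, and optional arguments with defaults). The function names are invented for the example, and RapydScript’s own behaviour may differ in details, so treat it as an illustration rather than a specification.

```python
def min_max(numbers):
    # the two return values are packed into a single tuple
    return min(numbers), max(numbers)

# ...and unpacked into separate names at the call site
low, high = min_max([4, 1, 9])

# the same mechanism allows swapping without a temporary variable
a, b = 1, 2
a, b = b, a

def greet(name, greeting='Hello'):
    # `greeting` is optional; callers can omit it or pass it by name
    return greeting + ', ' + name

default_greeting = greet('Chip')
custom_greeting = greet('Chip', greeting='Bye')
```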

If you spend a few minutes coding in RapydScript, or even checking out its examples, you will notice that once the ugly parts are removed from JavaScript, it’s actually a very beautiful language (allowing for code that is even cleaner than its Python equivalent). And that’s exactly what RapydScript does: it cleans up the language so you can enjoy its true potential.

So who should use RapydScript?

The audience that will enjoy it the most is probably Python developers who want to write JavaScript code. There is a reason many Python companies choose to do their JavaScript development in CoffeeScript instead of using one of the existing Python-to-JavaScript compilers. The reason is performance, debugging ability, and integration with other JavaScript. RapydScript shares all of the advantages of CoffeeScript without introducing messy and confusing syntax.

RapydScript in Commercial Projects

I have no problem with people profiting from the code they write in RapydScript. Both your code and the JavaScript you generate are yours to do whatever you want with. RapydScript itself is licensed under GPL, but all of its current libraries are covered by the Apache license. This is to allow you to import those libraries into your project without being forced to open-source your entire front-end. I ask that other developers submitting new libraries for RapydScript also use a permissive license, but that is up to the developer.