RapydScript II

Building New Compiler

A while back I mentioned that I wanted to rewrite RapydScript to rely on an internal AST (Abstract Syntax Tree) structure rather than PyMeta. This would give it several advantages, including code transformations prior to output, more consistent parsing, and better error detection/handling.

I looked at several tools to help me accomplish this. One of the most promising seemed to be CoffeeScript, until I actually started converting its source code into RapydScript (via an automated AST-to-RapydScript generating script I put together). Having discovered that the output was very ugly and unpythonic, I gave up on the idea. After some more searching, however, I found the right tool for the job. Ironically, it was UglifyJS, a tool designed to make your JavaScript less readable.

The UglifyJS codebase is very clean, and although it uses its own format that doesn’t directly translate into Python, the logic is well-structured and it’s easy to see where the code could be replaced by Python/RapydScript constructs. The code does a good job of being explicit rather than relying on various JavaScript hacks the way CoffeeScript does.

Using UglifyJS as a base, I rewrote the compiler to convert RapydScript into native JavaScript. You can get it here: https://github.com/atsepkov/RapydScript. There are a couple of minor features from the original compiler that RapydScript II doesn’t support yet, but it’s already better than the original in several ways:

  • It compiles code much faster (on the order of milliseconds instead of seconds)
  • It handles leading whitespace much better now (same way as native Python would), so it will not complain if comments or secondary lines mix spaces and tabs
  • It will properly localize outer-scope variables when using module wrapper
  • It’s a single-pass compiler that uses consistent logic to identify tokens, whereas the original RapydScript would sometimes perform the same logic in the Python stage that it would later repeat using PyMeta, resulting in potential parsing inconsistencies in certain special cases
  • It supports some notations that the original does not (such as the code shown later in this post)
  • It handles if val in array the way Python does rather than the way JavaScript does, checking for the existence of the value rather than the index
  • It’s more resistant to bad code, reporting errors that are more relevant and identifying the exact line/column number that caused the error
  • It’s more resistant to valid code that doesn’t follow typical syntax conventions, parsing it correctly rather than erroring out
  • The class-parsing logic is more robust, and some requirements of the original compiler no longer apply: for example, the __init__ function no longer needs to come first in the class body, and classes can now be nested and even declared inside functions
  • It is written in JavaScript, which means it can eventually be ported to native RapydScript. It also means it can be modified to run in a browser, and that our unit and integration tests can check what the compiled code does rather than what it looks like
  • It’s AST-aware, so it doesn’t need to generate excessive parentheses in the output for safety like the original compiler
  • Since the compiler is AST-based, it doesn’t need to generate output at the time of parsing, which makes the compiler more flexible and allows various code transformations and analysis to take place before the output is produced. This also makes it easier to make changes to the way output is generated in the future, should we need to do so.
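The difference in how the two languages treat in (mentioned in the list above) is easy to miss, so here is a minimal sketch of it in plain JavaScript. The contains helper is hypothetical and written only for illustration; it is not taken from the compiler's output and the real generated code may look different.

```javascript
const values = [5, 6, 7];

// JavaScript's `in` tests property names, which for arrays are the indices.
const jsIn = 5 in values;       // false: the array has no index 5
const jsInIndex = 1 in values;  // true: index 1 exists

// Hypothetical helper approximating Python's `in` for arrays, strings
// and plain objects (an assumption for illustration only).
function contains(container, value) {
    if (Array.isArray(container) || typeof container === "string") {
        // Membership (or substring) test, as in Python lists and strings.
        return container.indexOf(value) !== -1;
    }
    // For a plain object, mimic Python's dict behavior and check the keys.
    return Object.prototype.hasOwnProperty.call(container, value);
}

const pyStyleIn = contains(values, 5);  // true: the value 5 is present
```

With a helper along these lines, membership tests on arrays look at values while tests on plain objects still look at keys, which is the behavior a Python programmer would expect.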

The compiler already supports most of the features of the original. The only features that weren’t supported when this post was first written were:

  • (UPDATE: these now work) Chained Comparisons (1 < x < 5)
  • (UPDATE: these now work) List Comprehensions
  • (UPDATE: these now work) Inline Functions
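Chained comparisons are worth a closer look because the same expression is valid in both languages but means different things. The sketch below is plain JavaScript with made-up function names; it shows the semantics a Python-style compiler has to produce, not the actual RapydScript II output.

```javascript
// Plain JavaScript parses 1 < x < 5 as (1 < x) < 5, so the boolean result
// of the first comparison is coerced to 0 or 1 and then compared with 5.
function naiveChain(x) {
    return 1 < x < 5;
}

// Python-style semantics: every adjacent pair of operands must satisfy
// its comparison.
function pythonChain(x) {
    return 1 < x && x < 5;
}

const naiveResult = naiveChain(10);    // true, because (1 < 10) becomes 1 and 1 < 5
const pythonResult = pythonChain(10);  // false, because 10 < 5 does not hold
```

Python also evaluates the middle operand only once, which matters when it is a function call rather than a simple variable; the sketch above sidesteps that by using a plain variable.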

Everything else should work at least as well as the original, and in some cases much better. For example, you can now do the following:

    return a

Here is a list of other features that the original compiler does not support:

  • Logic such as 5 in [5,6,7] now correctly returns True
  • stdlib is no longer required for using for loops, len() or print(); more functions will be moved out of stdlib with time
  • Negative array indices such as arr[-1] now compile to arr[arr.length-1] instead of using slices. This allows assigning to them and not just referencing them (you still can’t use variables as negative indices)
  • A simple minifier is now built into the compiler (it will remove whitespace and unnecessary punctuation)
  • Pythonic imports are now supported via the --namespace-imports flag. This is experimental and some code will probably break: star-imports are not supported yet, and you can set variables in global scope but not modify them
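To illustrate the negative-index item in the list above, here is a small sketch in plain JavaScript of why the length-based form supports assignment while a slice does not. It is hand-written for illustration and is not copied from compiler output.

```javascript
const arr = ["a", "b", "c"];

// In plain JavaScript, arr[-1] looks up a property named "-1",
// which does not exist on a normal array.
const plain = arr[-1];             // undefined

// The length-based form reads the real last element.
const last = arr[arr.length - 1];  // "c"

// A slice-based read returns the same value but works on a copy,
// so it cannot serve as an assignment target.
const viaSlice = arr.slice(-1)[0]; // "c"

// The length-based form can be assigned to directly.
arr[arr.length - 1] = "z";         // arr is now ["a", "b", "z"]
```

Because the rewrite depends on knowing at compile time that the index is negative, it is consistent with the limitation noted above that variables can't be used as negative indices.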

I can’t yet guarantee that your code will work correctly with RapydScript II. However, all RapydScript II test cases seem to pass, as do the examples from the original compiler (slightly modified to remove the unsupported features). These tests are more rigorous than the original RapydScript tests in that they also check that the code performs the correct logic after compilation. And as a present for those afraid of the GPL, I’m releasing the entire RapydScript II compiler under the Apache 2.0 license.

This entry was posted in Languages by Alexander Tsepkov.

About Alexander Tsepkov

Founder and CEO of Pyjeon. He started out with C++, but switched to Python as his main programming language due to its clean syntax and productivity. He often uses other languages for his work as well, such as JavaScript, Perl, and RapydScript. His posts tend to cover user experience, design considerations, languages, web development, Linux environment, as well as challenges of running a start-up.

8 thoughts on “RapydScript II”

  1. I’m learning both Python and Javascript, and I had discovered your interesting project recently. So today I saw the following website that helps compare various frameworks by how they each present the same small example app. http://todomvc.com/ You might want to check it out and see if it would be helpful (or even relevant) for RapydScript II to be included in their list.

  2. Hi, I just started using RapydScript and enjoy it. Question for you: are Python standard library I/O operations accessible with RapydScript? I tried using Python’s “open()” command to read in a file, to no avail.

  3. Hi Alexander, Thanks for Rapydscript, I’m enjoying playing with it.

    When trying class inheritance, I seem to be getting code run in the parent, though I never instantiated anything (tried posting to google group but it didn’t take my post for some reason, sorry for posting this here..)

    if I run this

    class Bird: def init(self): self.about = “I’m a bird – Tweet!” print (self.about)

    class Parrot(Bird): def init(self): self.about = “I’m a parrot – Squawk!”

    print (“This program creates nothing and should do nothing”)


    I’m a bird – Tweet! This program creates nothing and should do nothing

    I never use any class/object code, just a print, yet the parent seems to be getting called. If this is intended behavior, then maybe it should be noted somewhere in your tutorial/notes, as users may get unintended code being run just by including an inherited class.

    Hope this is useful – I’m still enjoying using everything else… Thanks, Steve

    • (sorry for formatting above, trying this – please delete/edit as you see fit :)

      if run this

      class Bird:
          def __init__(self):
              self.about = "I'm a bird - Tweet!"
              print (self.about)
      class Parrot(Bird):
          def __init__(self):
              self.about = "I'm a parrot - Squawk!"
      print ("This program creates nothing and should do nothing")


      I'm a bird - Tweet!
      This program creates nothing and should do nothing
      • I apologize for not replying earlier, I didn’t notice this post. The problem is due to JavaScript’s prototypal inheritance. In order for RapydScript to emulate Python-like inheritance correctly, it needs to bind an instance of the parent to the prototype property, so the logic in the class constructor has to fire for every inheriting class, not just for the objects you create manually from that class. Since instances are typically independent of each other, this shouldn’t hurt your logic aside from a few glitches, like the print statement firing. And yes, I will document this in the README when I get a chance. Thanks for pointing this out.
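The behavior described in the reply above can be reproduced in plain JavaScript. This is a hand-written sketch that assumes the compiled output binds a parent instance to the child's prototype, as the reply describes; the log array stands in for print(), and the Canary class is a hypothetical addition used only to show a common alternative.

```javascript
const log = [];

function Bird() {
    this.about = "I'm a bird - Tweet!";
    log.push(this.about);  // stands in for the print() call
}

function Parrot() {
    this.about = "I'm a parrot - Squawk!";
}
// Binding a parent instance to the prototype runs Bird's constructor once,
// right here, even though the program never creates a Bird or a Parrot.
Parrot.prototype = new Bird();
Parrot.prototype.constructor = Parrot;

// Hypothetical alternative (not what the reply describes): Object.create
// links the prototypes without calling the parent constructor.
function Canary() {
    this.about = "I'm a canary - Chirp!";
}
Canary.prototype = Object.create(Bird.prototype);
Canary.prototype.constructor = Canary;

const messagesLogged = log.length;  // 1: only the Parrot setup triggered Bird()
```

The single log entry comes from setting up Parrot, which matches the unexpected "I'm a bird - Tweet!" line in the question; setting up Canary adds nothing to the log.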
