Using Web2py

As promised in the GAE post, here is part 1 of using JSONRPC calls on GAE. I originally intended to write more of a how-to, but now that I’m halfway through, this seems like it should be its own post, and I’ll write a how-to part later.

If you’re reading this blog, you probably know how to write Pyjamas apps, but you might not know about web2py, and why or how to use it. I will give a quick overview of how web2py works in the 2nd half of this post, but if you just want to know about using web2py with Pyjamas, or deploying to GAE (on Windows), skip to the next post.

I want to start out with a warning. You MUST be careful if you plan to develop an app on GAE. Let’s say hypothetically you have an idea for a business using webapps. You start out with no money and no users. GAE looks really appealing because you can get reliable hosting for free. A few months later you’ve grown and you’re ready to move to your own server. This is where you run into issues. All the server code was written for GAE. If you were good and your code is clean, you can copy several large chunks of code directly over to your new server, but interfaces for things like JSON calls, and interfaces to your database, will all need to be rewritten. If you look around, you can find a lot of talk about “portability” and “lock in” and ways to get around this issue. It’s definitely a hot topic when it comes to using GAE.

Lock in only happens if you aren’t careful up front. You don’t have to develop using GAE directly; instead, I suggest using a web application framework and building your code on top of that. At the very least, this will protect your code. I’m working on a small game, so I haven’t looked at all into moving database info, but this is something you might want to research. Worst case, you could write a web app to print out all your db info on a web page in a format that you can import to your new server. If anyone knows any good ways to move databases, feel free to leave a comment!

Back to frameworks. There are a couple of frameworks available that support GAE. The two most popular are web2py and Django. From what I read, neither framework is clearly better; each has its own strengths. It’s more about finding the framework that fits your needs. For hobbyists, I’d recommend web2py. Web2py supports quicker development. A lot of people say it’s more intuitive and you can be more productive. I feel it aligns better with the Python philosophy. Django has its own strengths though. It gives you more control as a developer. Sometimes you’ll want that extra control; it’s really up to you.

If you’ve decided to go with web2py, or even if you didn’t, here is a quick overview of how it works from my non-SW perspective. There are 4 areas I heavily use inside each of my applications: model, controller, view, and the static dir. With that in mind, I want to start with the web2py URLs, which take the form http://domain/application/controller/function (where function is closely related to view). You also have a static directory which can be accessed using http://domain/application/static/. The static dir is where you’ll have website images, Pyjamas apps, and other static files. There are some defaults set up so if someone visits http://domain/ your server will load something, but most URLs will take the form http://domain/application/controller/function.

So what do controllers and functions do, and how are they related to views? Well you have a “controllers” folder in your application’s directory which is full of controller scripts made up of several functions. If someone visits http://domain/application/game/loadGame the server will call the game.py file in your controllers directory and run the loadGame() function in that file.

This leads to views. Some functions will be used to handle forms, or handle JSON calls, but your most important functions will return a dict(). When a function returns a dictionary, web2py renders the view with the same name as the function, and the dictionary’s entries become the variables available in that view. Let’s say this loadGame() function returns a dictionary. In your views directory there should be an HTML file game/loadGame.html. This is what the server sends users when they visit http://domain/application/game/loadGame.
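To make the naming convention concrete, here is a minimal sketch in plain Python. It is not web2py’s actual router (it ignores the defaults and any extra URL arguments), and the resolve_web2py_url helper is a hypothetical name made up for illustration. It only shows which controller file, function, and view file a URL of the form described above points at, with paths given relative to the folder that holds the application.

```python
import posixpath


def resolve_web2py_url(path):
    """Map '/application/controller/function' to the files it refers to.

    Hypothetical helper for illustration only; web2py does this internally.
    """
    parts = [p for p in path.split('/') if p]
    if len(parts) < 3:
        raise ValueError('expected /application/controller/function')
    application, controller, function = parts[:3]
    return {
        'application': application,
        # controllers/<controller>.py holds the function to call
        'controller_file': posixpath.join(
            application, 'controllers', controller + '.py'),
        'function': function,
        # views/<controller>/<function>.html is rendered when a dict is returned
        'view_file': posixpath.join(
            application, 'views', controller, function + '.html'),
    }


info = resolve_web2py_url('/application/game/loadGame')
```

For the example URL from the text, info points at the game.py controller, the loadGame function, and the game/loadGame.html view.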

This views page will probably include links to .js files, images and other static files. This is the 3rd area, the static dir. When you have static files, you will host them here.

There’s one last area – models. The models area has the setup for any databases you’re using, and the setup for other things like the JSONRPC interface. When I tried adding users and storing high scores, I was using this file a lot.

So that’s web2py in a nutshell.

Code Tags and Syntax Highlighting

UPDATE: Now that I’m using WordPress, I no longer rely on this. With markdown and markdown syntax highlighting plugins, all this logic is automatically handled for me at page render time. This is still a good reference for people using Blogspot or implementing their own syntax highlighting on the back-end. Also, as Adam has mentioned, for Blogspot there are other alternatives available that are simpler to set up if you don’t want to do your own implementation.

I really like the flexibility of Blogspot. It’s one of very few free blogging platforms that allows me to customize just about any feature of my blog, from custom CSS stylesheets to unique Javascript snippets. Naturally, I was surprised that there was no <code> tag to wrap code blocks in despite so many existing programmer blogs. Luckily, a quick search revealed another blog that resolved this via a simple CSS tag. So I followed the same advice, but somehow staring at monotone text in a code block was about as satisfying as coding in notepad.

As I started investigating, I found multiple solutions, ranging from online parsers to various language libraries. My favorite was pygments, a Python library (are you surprised?) for syntax highlighting. After a little more searching, I found the following article that not only explains how to use pygments but also writes a quick script for parsing Python files. The script had a small glitch: instead of indir=r'', I had to use indir=r'.', so the code I started with ended up looking like this:

from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter
import os

formatter = HtmlFormatter()

indir = r'.'
for x in os.listdir(indir):
    if x.endswith('.py'):
        infile = os.path.join(indir, x)
        outfile = infile.replace('.py', '.html')
        # read the whole source file as one string
        with open(infile) as f:
            text = f.read()
        html = highlight(text, PythonLexer(), formatter)
        with open(outfile, 'w') as f:
            f.write(html)
        # parenthesized form works in both Python 2 and 3
        print('finished : %s' % x)

I then decided to go a step further and update the script to handle multiple programming/markup languages, since I also use CSS, HTML, XML, and Javascript in my blog. But creating a separate file for every snippet of code I decide to use in my blog post didn’t seem like a good solution.

Instead, I decided to parse out and highlight all the <code> tags from a single file. This would allow me to type an entire blog as one text file and run it through the parser once instead of combining multiple files into a single blog post. I was initially thinking of using Python’s regular expression library, but decided to use an XML parsing library instead. It just seemed like the right tool for the job: it can handle additional attributes added to the <code> tag, as well as edit the document tree in-place. In other words, with an XML parser I’m much less likely to screw something up with my future “improvements”.

There are several XML libraries available in Python; I decided to go with xml.dom. The main reason I chose it is that its naming scheme is the same as that of Pyjamas and Javascript’s DOM methods (unlike lxml, whose method names feel awkward to me). xml.dom also happens to be the library Pyjamas Desktop uses for XML parsing (I’ll write another tutorial on that soon).

As far as applying correct syntax highlighting based on the language, there are a few ways to go about doing it. One is to use the guess_lexer() function from pygments, which can usually detect the correct language automatically. However, a lot of my code-blocks are two-liners that would probably look the same in many different languages, so I decided against that. As a blog writer, I should know what language I’m posting the code in anyway, so it’s very easy for me to specify it myself. Since I’m using a real XML parser, I can add as many attributes to my <code> tag as I wish. So I decided to specify the language via the “class” attribute (e.g. <code class='python'>).

I then imported the lexers I wanted to use, and defined a dictionary mapping “class” names to lexers (note that omitting the class attribute entirely defaults to non-highlighting TextLexer):

from pygments.lexers import (HtmlLexer, JavascriptLexer, CssLexer, XmlLexer,
                             PythonLexer, TextLexer)
import xml.dom.minidom

lexers = {'': TextLexer,
          'html': HtmlLexer,
          'js': JavascriptLexer,
          'css': CssLexer,
          'xml': XmlLexer,
          'python': PythonLexer}

The next step was to loop through all of these code tags, selecting the correct lexer based on the “class” attribute, and applying it to the contents. The last step was to convert the XML DOM tree back to xml format before outputting it to the file. Trying something like this, however, will not work:

for element in code_tags:
    lexer = lexers[element.getAttribute('class')]()
    element.firstChild.nodeValue = highlight(
        element.firstChild.nodeValue, lexer, formatter)
html = doc.toxml()

The problem with the above code is that xml.dom automatically converts < and > angle brackets to &lt; and &gt;, respectively, to avoid interpreting the node value as part of the tree. This means that all of the pretty syntax highlighting pygments added will be shown to the user as code instead of getting interpreted by the browser. Since I’m outputting all of the “XML” back to the document, I figured there is no harm in xml.dom “interpreting” the pygments output, and replaced the last line of the for loop with the following:

element.replaceChild(xml.dom.minidom.parseString(
    highlight(element.firstChild.nodeValue, lexer, formatter))
    .documentElement, element.firstChild)
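The difference between the two approaches can be seen in isolation with a small sketch that uses only the standard library. Here the string '<b>x</b>' is a stand-in for the markup pygments would produce; that substitution is an assumption made to keep the example free of third-party imports.

```python
import xml.dom.minidom

doc = xml.dom.minidom.parseString('<root><code>x</code></root>')
code = doc.getElementsByTagName('code')[0]

# Assigning markup as a text value: the angle brackets are escaped on output
code.firstChild.nodeValue = '<b>x</b>'
escaped = doc.documentElement.toxml()

# Parsing the markup first and swapping in the resulting element keeps it as markup
fragment = xml.dom.minidom.parseString('<b>x</b>').documentElement
code.replaceChild(fragment, code.firstChild)
parsed = doc.documentElement.toxml()
```

After the first assignment the serialized tree contains &lt;b&gt; text; after replaceChild() it contains a real <b> element.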

This worked great for Python, CSS, and Javascript parsing, but HTML and XML got “interpreted” into the tree before I could pass them into pygments (element.firstChild.nodeValue did not exist). The trick was to apply the reverse of the solution I applied to get normal highlighting working. We do this via a toxml() call, which converts the tree structure into a string. The problem is, if you call toxml() on the entire element, you’ll end up also syntax-highlighting the <code> tag itself, which we’re trying to keep hidden from the viewer. And since a nodeList doesn’t have a toxml() method, you’ll simply have to loop through each child individually as follows:

for child in element.childNodes:
    element.replaceChild(xml.dom.minidom.parseString(
        highlight(child.toxml(), lexer, formatter))
        .documentElement, child)
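As a possible variation on the loop above (not what the script in this post does), the serialized children can be joined back into one string, which recovers the snippet as it was typed inside the <code> tag. This is a sketch using only the standard library; the sample markup is made up for illustration.

```python
import xml.dom.minidom

doc = xml.dom.minidom.parseString(
    '<root><code class="html"><p>Hi</p> there</code></root>')
element = doc.getElementsByTagName('code')[0]

# The parser already turned <p> into an element node, so there is no text value
first_value = element.firstChild.nodeValue

# Serializing every child and joining recovers the original snippet text
snippet = ''.join(child.toxml() for child in element.childNodes)
```

The joined string could then be passed to highlight() once, so a snippet made of several nodes ends up in a single highlighted block instead of one block per child.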

The work is almost complete. If you’ve been following along, adding these lines to your code and attempting to run it, you probably noticed that xml.dom throws a syntax error when you try to parse your text file. The reason is that xml.dom requires all of the file contents to be inside a single root element, and this is true of all XML content (e.g. webpages are always inside an <html> tag, SVG images are always inside an <svg> tag). My initial solution while testing this was to put a fake <root> tag around the text body by hand. It works, but this wouldn’t be much of a convenience tool if it made me jump through extra hoops. For my final solution I decided to have the script add and remove the <root> tags itself. It’s really not hard: all you have to do is add <root> to the beginning of the string and </root> to the end before parsing it with xml.dom. At the end, simply remove the last 7 characters (</root>) and the first 28 (to account for <root> as well as the <?xml version="1.0" ?> declaration xml.dom slaps on). The final version of the code ended up looking as follows:

from pygments import highlight
from pygments.lexers import (HtmlLexer, JavascriptLexer, CssLexer, XmlLexer,
                             PythonLexer, TextLexer)
from pygments.formatters import HtmlFormatter
import os
import xml.dom.minidom

lexers = {'': TextLexer,
          'html': HtmlLexer,
          'js': JavascriptLexer,
          'css': CssLexer,
          'xml': XmlLexer,
          'python': PythonLexer}

formatter = HtmlFormatter()

indir= r'.'
for x in os.listdir(indir):
    if x.endswith('.txt'):
        infile = os.path.join(indir, x)
        outfile = infile.replace('.txt', '.html')
        data = list(open(infile))
        data.insert(0, '<root>')
        data.append('</root>')
        text = ''.join(data)
        doc = xml.dom.minidom.parseString(text)
        code_tags = doc.getElementsByTagName('code')
        for element in code_tags:
            lexer = lexers[element.getAttribute('class')]()
            for child in element.childNodes:
                element.replaceChild(xml.dom.minidom.parseString(
                    highlight(child.toxml(), lexer, formatter))
                    .documentElement, child)
        html = doc.toxml()[28:-7]
        with open(outfile, 'w') as f:
            f.write(html)
        # parenthesized form works in both Python 2 and 3
        print('finished : %s' % x)
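The wrap-and-slice step can be checked on its own with the standard library. The sketch below also shows one alternative that avoids the 28 and 7 offsets by serializing only the children of <root>; that alternative is a suggestion, not part of the script above, and the sample text is made up.

```python
import xml.dom.minidom

text = 'Intro <code class="python">x = 1</code> outro'
doc = xml.dom.minidom.parseString('<root>' + text + '</root>')

# Slicing, as in the script: 22 characters of XML declaration plus 6 for
# <root> at the front, and 7 for </root> at the back
sliced = doc.toxml()[28:-7]

# Alternative without fixed offsets: serialize only the children of <root>
joined = ''.join(child.toxml() for child in doc.documentElement.childNodes)
```

Both approaches give back the original text here. The second one does not depend on the exact length of the declaration that toxml() emits.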

If you want, you can take this a step further by customizing the <code> tag for each language you use. If you want something like line numbers, just add the following to the formatter initialization line:

formatter = HtmlFormatter(linenos=True)

Also, don’t forget to generate new classes responsible for actually defining the colors you highlight with and placing them into your blog’s CSS stylesheet. To generate the stylesheet, just run the following command in bash:

pygmentize -S default -f html > style.css

And just to prove that my new parser works, I’ve used it to highlight this entire blog post (note, this was originally posted on Blogspot, this WordPress post no longer uses this mechanism). When I have the time, I might go back and highlight my old posts the same way as well.

Pyjamas Applications on Google App Engine

The next topic I want to cover is writing an app on top of GAE. This one took me a while to figure out originally. I had absolutely no framework experience before, and no idea how GAE worked. I was very confused at first, so I want to discuss how GAE works and what I learned. This post assumes you have the GAE SDK installed and you’ve created an application using the Google App Engine dashboard.

Let’s start with the most important file for your web app: app.yaml. Anytime someone visits any URL under your domain, http://yourapp.appspot.com, their request will go through the app.yaml file to figure out what to load. In proper terms, the app.yaml file forwards the URL path to a “request handler”. The app.yaml file only complicates things for a web app with just static files, but it becomes very powerful when you start building more complex apps that use services like JSONRPC.

My first GAE app used the helloworld example from Pyjamas. I’ll post the app.yaml file I used and explain what the different sections mean, and then I’ll explain how to package everything to upload it.

application: myapp
version: 1
runtime: python
api_version: 1
handlers:
- url: /
  static_files: output/Hello.html
  upload: output/Hello.html

- url: /*
  static_dir: output

The first 4 lines are standard. The only unique thing you’ll want is your application name. This application should already be created.

The rest of the file is a list of handlers. Each handler starts with a URL pattern. When a user visits a URL on your domain, GAE uses the first handler with a URL pattern that matches the client request. In the app.yaml file above, the URL pattern for the first handler is /, which will catch requests for http://yourapp.appspot.com/. The line after the URL pattern tells us we have a static file handler. This static file handler will load output/Hello.html, the main HTML file for the helloworld app. This means that when a user visits http://yourapp.appspot.com/, they load output/Hello.html.

The 2nd handler is for catching calls to http://yourapp.appspot.com/*. This handler is for a static directory, not just a single file. This handler is here because Pyjamas compiles into multiple javascript/cache files which Hello.html loads. Files like Hello.nocache.html need to be accessible from URLs like http://yourapp.appspot.com/Hello.nocache.html in order for the app to work.
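The “first matching handler wins” rule can be modelled in a few lines of Python. This is a simplified sketch, not GAE’s implementation: the HANDLERS list, the resolve function, and the /(.*) pattern standing in for the static directory handler are all assumptions made for illustration.

```python
import re

# (pattern, target) pairs in the same order as the handlers in app.yaml.
# The second pattern is an illustrative stand-in for the static_dir handler.
HANDLERS = [
    (r'/', 'output/Hello.html'),
    (r'/(.*)', r'output/\1'),
]


def resolve(path):
    """Return the file served for a request path, using the first match."""
    for pattern, target in HANDLERS:
        match = re.fullmatch(pattern, path)
        if match:
            # substitute any captured part of the path into the target
            return match.expand(target)
    return None


root_file = resolve('/')
cache_file = resolve('/Hello.nocache.html')
```

A request for / stops at the first handler, while /Hello.nocache.html falls through to the second one and is served from the output directory. Swapping the order of the two entries would make the second pattern swallow the request for / as well, which is why the more specific handler comes first.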

There’s one last type of handler: script handlers. I personally use the script handlers to add support for JSONRPC calls, which I will demonstrate in a future post. For the full details on how to configure your app.yaml file visit the GAE documentation at http://code.google.com/appengine/docs/python/config/appconfig.html .

Finally, with the app.yaml file complete, all you need to do is put everything into 1 nice package. I put the app.yaml file at the same level as the output directory (to be clear, 1 level above all the output files). Add this existing project to the Google App Engine Launcher, Deploy it, and you’re good to go!

Installing Pyjamas on Windows XP

Now for the good stuff! I followed a wiki post and got an old Pyjamas version working. The old Pyjamas version was missing some features, so I figured out a way to build the latest version from scratch. For documentation purposes, here are the steps I followed on Windows XP:

  1. Download and run a Python 2.6 installer (here’s a link to the latest: http://www.python.org/download/releases/2.6.6/)
  2. Install comtypes 0.6.1. (comtypes-0.6.1.win32.exe in http://sourceforge.net/projects/comtypes/files/comtypes/0.6.1/)
  3. Permanently update the path for Python (and Pyjamas while you’re at it)
    1. Go to System Properties through Control Panel
    2. Open the Advanced tab, then click on the Environment Variables button
    3. Add the following to your “PATH” system variable: c:\python26;c:\Pyjamas\pyjs\bin; (the 2nd dir will exist after you install pyjamas)
  4. Install Git for Windows (If you have Git skip this step)
    1. Install msysgit from http://code.google.com/p/msysgit/downloads/detail?name=Git-1.7.3.1-preview20101002.exe&can=2&q=
    2. Install TortoiseGit from http://code.google.com/p/tortoisegit/downloads/detail?name=TortoiseGit-1.5.8.0-32bit.msi&can=2&q=
  5. Git the latest Pyjamas code.
    1. Create a directory C:\Pyjamas\
    2. Open an explorer window, and navigate to the C:\ drive.
    3. Right click on the Pyjamas directory and select Git Clone.
    4. Enter the URL: https://github.com/pyjs/pyjs.git (formerly git://pyjs.org/git/pyjamas.git)
  6. Open a new command line (this has to be done after updating the path):
    1. > cd C:\Pyjamas\pyjs (formerly C:\Pyjamas\pyjamas)
    2. > python bootstrap.py (Note: the bootstrap won’t print anything to the screen, but it will create the bin directory, which was added to your path in step 3.)
  7. Now test an application:
    1. > cd examples\helloworld
    2. > python __main__.py
    3. Use Firefox or IE and open output\Hello.html (Note: Chrome won’t load AJAX pages off your local machine for security reasons. You’ll have to upload it somewhere to see it in Chrome. For more info see http://code.google.com/p/chromium/issues/detail?id=40787)
  8. If you find anything different, update the wiki! Pyjamas Wiki
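If the python or pyjs commands are not found after step 6, the PATH update from step 3 is the first thing to check. The sketch below is one way to do that from Python; missing_path_entries is a hypothetical helper written for this post, and example_path is a made-up value used only to show the behaviour.

```python
def missing_path_entries(path_value, required, sep=';'):
    """Return the required directories that do not appear in a PATH string.

    Comparison ignores case and a trailing backslash, as Windows does.
    """
    def normalize(entry):
        return entry.strip().rstrip('\\').lower()

    present = set(normalize(e) for e in path_value.split(sep) if e.strip())
    return [r for r in required if normalize(r) not in present]


# The two directories added in step 3
REQUIRED = [r'c:\python26', r'c:\Pyjamas\pyjs\bin']

# Made-up PATH value for illustration: Python is present, Pyjamas is not
example_path = r'C:\WINDOWS\system32;C:\Python26\;'
missing = missing_path_entries(example_path, REQUIRED)
```

On the real machine, passing os.environ['PATH'] instead of example_path (from a newly opened command line) lists whatever still needs to be added.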

I ran through steps 1-4 once about 6 months ago. Since then I periodically rerun steps 5 and 6 to get the latest Pyjamas code.

One last note. I wouldn’t recommend Pyjamas Desktop on Windows if you’re doing Canvas apps. It will be good when a webkit Pyjamas Desktop version is out, but if you want Pyjamas Desktop, I’d go with Ubuntu 9.10. Because I have an Ubuntu box, I’ve never put in the time to figure out how to get Pyjamas Desktop working on Windows with the latest Pyjamas code, but I do know others that are using it.

Thoughts on Pyjamas, Python and Web Apps

I was looking for a fun project to work on in my spare time, and while hanging out with my buddy Alex we started talking about coding and web apps. Even though I got an Electrical Engineering degree, I applied to college as a CS major and I still have an interest in CS and especially algorithms. What turned me off to CS was all the extra “fluff” you needed to write a real program. In school I’d always get templates that said “//Insert your algorithm here” and I never liked that.

That’s where Alex comes in. He was talking about Python, and how clean and understandable the code is. I looked into it and there are a few features that really make coding simple. Python doesn’t require variable declarations, it includes lots of nice features like accessing the last element in an array using a [-1] index, the standard types have really uniform syntax, and overall the Python syntax is clean. All those features made it simple to focus on algorithms, which is exactly what I wanted.

What’s even better is Alex also talked about using a program called Pyjamas. With Pyjamas, you can write a web app in Python, and “compile” it to javascript so it runs in a standard browser. The idea sounded so clever. You could write an app using a language that lets you be very productive, and then compile and run your app in a regular browser.

I got a copy of Pyjamas (it’s open source code so it’s free at http://pyjs.org) and tried it out. After getting through the installation, which was really dated at the time (I’ll list all the steps I went through in my next post), I was writing really nice web apps. It was great! I started out creating a small game. I created a player which I could draw on the screen, and I could move around. I then created objects, like walls, that the player could interact with. Within a couple days I had a small game coded up using a language that was brand new to me.

Loading Notification for Web Apps

If you fooled around with some of the more complex Pyjamas examples, such as GChartTestApp, you probably noticed that they take quite a while to load. My apps too have been steadily growing in size and can take up to several seconds to load. It makes sense to inform the user that the app is loading and the browser is not frozen. One way to do so is by doing exactly what GChartTestApp does, displaying text that says “Please wait, the app is loading”. But since this is an app after all, I wanted something a bit more elegant than static text. At the same time, since the app is already large enough, I didn’t want to add any overhead to it by writing a new module for the loading screen. My solution was pretty simple: I decided to go find an animated .gif image that looks like a loading bar or an hourglass and slap it on as a background image in the middle of the screen. It uses no resources (aside from loading the image to begin with) and the best part is your app will load on top of it, hiding the image after the load completes (assuming your app background is not transparent).

First I decided to go find an image. What I found, however, was even better. http://ajaxload.info is a loading .gif generator that lets you choose a template and customize the color to fit your website. Then I wrote a quick body rule for my CSS stylesheet to set up the image properly:

body {
   height:100%;
   /* one shorthand: a background-color declared before it would be reset */
   background: white url(loading.gif) no-repeat center;
}

Height is necessary to make sure the image appears in the center of the screen, otherwise HTML defaults to height of zero and expands it as more content appears (since there is no content, the background image will appear at height of 0, with half of it cut off). If you’re using the standard CSS style sheets from examples, most apps don’t need more work than that, since the opaque background of the panels will hide the loading image. If you’ve picked one of the panels that don’t have a style sheet, or GWTCanvas/HTML5Canvas (which has transparent background by default), you can either create a new style sheet and assign it to the main panel, or use DOM to change the background of the panel without creating a style sheet:

DOM.setStyleAttribute(self.getElement(), "background", "white")

Now just place the loading image in your output directory and reload the page.