State of RapydScript

Many of you probably noticed that I've been slow to contribute to RapydScript lately. It's hard to summarize all of the reasons, but here are the top few:

  • Despite the occasional bug, the project is in a solid state and quite usable in production (I do eat my own dog food, by the way).
  • I've been working on other projects, the most recent one being TileZone.
  • I am one man, and my time doesn't scale. I wish I could continue improving the language and fixing the bugs people report, but at the end of the day I don't get paid for it, and as a result I prioritize my other projects and my job over this. I'm grateful for the contributions from other members, but the community isn't large enough to encourage others to contribute regularly.
  • With ECMAScript 6 and Babel, the future of RapydScript seems somewhat uncertain. Babel is an awesome library that brings many features similar to RapydScript's (although with uglier syntax, in my opinion). I think, however, that RapydScript, if continued, will be more relevant in an ES6 world than something like CoffeeScript.
  • As mentioned before, I don't believe Python is a panacea. It's one of my favorite languages, but I'll be the first one to admit that it made quite a few mistakes design-wise. Some issues that come to mind are global-scope collection methods (len, map, filter), lambda functions, the inconsistent naming scheme for native objects (why are native classes called int and str when the convention is to use title case for them?), exposing optimization methods (xrange) to the user when they should really be handled behind the scenes, and so on. As a result, it's frustrating to see new RapydScript users complain that the language should be more like Python, especially the parts of Python that I hate.

With that said, I love RapydScript, and will continue using it. I prefer it to both Python and JavaScript. Ironically, I started RapydScript to avoid dealing with JavaScript, but in the process learned to appreciate it more. Some design patterns I use in RapydScript would not be possible in Python. JavaScript is a powerful language, and yet it's messy. This StackExchange answer to a question about language verbosity does an amazing job explaining what I like about RapydScript. If we were to grade the 4 languages I mentioned on those 2 categories (signal vs noise, cryptic vs clear), I would rate them as follows:

Language | Signal vs Noise | Cryptic vs Clear
JavaScript | | ⭐⭐⭐⭐⭐
CoffeeScript | ⭐⭐⭐⭐⭐ |
Python | ⭐⭐⭐⭐ | ⭐⭐⭐⭐
RapydScript | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐

JavaScript is definitely the most verbose of the 4: having to type out all of those prototypes, optional argument checks, array[array.length-1], and brackets gets repetitive very fast, and looking at all that code gets tiring even faster. As for CoffeeScript, there is no arguing that it's terse. Its signal-to-noise ratio is amazing; every character serves a purpose. The only problem is making sense of all that code later, remembering the differences between the fat and skinny arrows, and wrapping your head around all those invisible parentheses. CoffeeScript solved JavaScript's verbosity problem, but introduced a new one. Its cryptic syntax is the reason many JavaScript developers recommend steering clear of it. Python has a good balance of the two, and as a result has gained quite a large following from developers who're tired of dealing with the visual clutter of other languages. But, to be honest, it's not as clear as modern JavaScript. Let me show you a few examples:

Python | JavaScript | Explanation
len(a) | a.length | Global scope, unnecessary abbreviation, should I say more?
def | function | Another vague abbreviation, but at least def makes it very clear what type of object we're defining
lambda | function | At least they didn't abbreviate it
*rest | ...rest | They went out of their way to rename all && and || and then introduced this?
**kw | {foo: "bar"} | Nothing says dictionary like 2 multiply signs next to each other
','.join(array) | array.join(',') | Are we joining elements via delimiter or delimiters via elements?
str | String | Another useless abbreviation that ignores the convention of capitalizing classes
list | Array | Linked list?
ZeroDivisionError | Infinity | Sure, it's mathematically correct, but when I divide by zero, 9 times out of 10 infinity is what I want; I'll end up assigning 999999 to the number in the catch block anyway

You may argue that Python's operators are clearer than JavaScript's || and &&, but which ones do most languages follow? Developers are so used to seeing those operators that it really doesn't matter. RapydScript does as well as Python on signal, and sometimes a bit better (thanks to closures and the ability to inject JS). I gave it an extra point for clarity because it can often leverage the cleaner JS approach, as I usually do in my code (although to be fair, JavaScript still inches ahead a bit here).

Going Forward

So where does that leave us? As I said, I love RapydScript, but I don't have the manpower to keep fixing bugs or make RapydScript more like Python when I have no need for Python; my backends are all in node.js now. It also seems like Kovid has been more active in enabling Python users to feel at home with his fork, so I'll let him continue that. The philosophies have diverged somewhat: I disagree with adding extra syntax and special-casing variables for Python's sake, he disagrees with breaking Python rules. At this point, Python is feeling like a bit of a shackle; there is no reason I should repeat the same mistakes as Guido in the name of backwards compatibility with code that wasn't even written for the same platform.

My goal is to make RapydScript relevant in ES6 age, and here is how I plan to do that (at my own pace of course, given other projects, but you’re more than welcome to help me out):

  1. Fix low-hanging fruit bugs.
  2. Separate tokenizer from lexer (I’ll explain why in a bit).
  3. Rewrite the output module to generate ES6 instead, letting Babel handle the special cases rather than us.
  4. Modify the output module to fill in templates rather than handcrafting output code.
  5. Drop pythonisms from the core (no more int, str, bool) and embrace native JS types (Number, String, Boolean).
  6. Add optional static typing and the ability to infer types, letting users leverage this information to create/optimize better templates.
  7. Now that the tokenizer is separate from the lexer (step 2 accomplished this), add a macro-processing step in between that can mutate the code as the AST is being formed. This will give RapydScript powers similar to sweet.js, but able to handle whitespace.
After all 7 of those steps are complete, users can enhance the language as they see fit, without having to modify the core. If they want a new operator, or pythonic variable names, it would be as simple as creating a macro. If they want different output/format, it would be as easy as writing an output template. Admittedly, this will probably take me a while. But if anyone wants to help, we can probably get this done much sooner.

Equality Operator

One thing that always bothered me, and many others, is how useless the == operator is in JavaScript. It seems to combine the worst of both worlds from equality and identity:

  • When comparing primitive types, it acts as a lax equality operator
  • When comparing objects, it acts as an identity operator

I can’t think of any use case for which such behavior is desired. In fact, reversing this behavior would produce a much more useful operator since the concept of “equals” tends to relax as data gets more complex because some dimensions are simply irrelevant. Do I care that two objects were not created at the same exact time? Do I care that their IDs differ? Do I care if the measured length is 4.00000000001 inches rather than 4? When it comes to equality comparison, JavaScript fails miserably. This is actually one of Douglas Crockford’s pet peeves about the language, which he explains in more detail in his book: JavaScript: The Good Parts.

Fortunately, there is also a more consistent identity operator (===), often erroneously called “strict equality” by people who don't understand the difference between identity and equality. Unfortunately, as I just mentioned in the previous paragraph, there is no concept of an equality operator in the language at all. For those unfamiliar with these concepts, identity answers the question “Do A and B refer to the same object?” while equality answers the question “Is B a clone of A?”.

Naturally, both of these questions come up in programming a lot, so it’s a shame that only one can be easily answered in JavaScript. You probably already stumbled into this problem if you ever tried something like this:

new Date('1/1/2000') == new Date('1/1/2000')

or this:

{"foo": 1} == {"foo": 1}

The problem is not unique to JavaScript; many other languages lack an equality operator as well. Typically, however, those languages have an alternative mechanism for handling this case, such as operator overloading. Indeed, there are cases when operator overloading is actually superior – such as when objects contain metadata that one wishes to omit from the comparison (e.g. exact creation time, unique id, etc.). Unfortunately, JavaScript doesn't allow operator overloading either, and herein lies the problem. While one can easily roll a proper equality function (and many libraries, such as underscore, already include one), you would still have to decide whether it's worth using on a case-by-case basis.

We're not in the FORTRAN age anymore, where developers had to tweak each operation by hand. Today we're spoiled: we can often get away with simply telling the compiler what to do rather than how, and enjoy optimal performance 90% of the time with negligible overhead. One such example is the sort function. When was the last time you had to wonder whether you should use merge sort or insertion sort? You simply use the built-in sort and assume that unless there is something very special about your data, you're better off moving on to other things rather than attempting to optimize the last 10% out of the algorithm. In most modern languages the equality operator falls into the same category. Sure, you'd be able to shave off a few microseconds by replacing that == (equality) with is (identity) for certain comparisons in Python, but is it worth the extra brain cycles? Most of the time the answer is "No".
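For what it's worth, Python itself is a handy illustration of the two concepts coexisting (a small example for readers less familiar with it):

a = [1, 2, 3]
b = [1, 2, 3]

print(a == b)   # True  -- equality: same contents
print(a is b)   # False -- identity: two distinct list objects
print(a is a)   # True  -- identity: the very same object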

Why then do we have to be explicitly aware of this in JavaScript? Moreover, why can’t RapydScript compile == into deep equality? First, I should mention that, like some other libraries/languages for JavaScript, RapydScript already has a deep equality comparison via the inbuilt eq function. Yes, eq(a, b) (previously called deep_eq) has been supported for years. The problem is that (up until recently) I wasn’t able to make the decision for you of whether you want eq or ==. The issue boils down to the fact that if I decide to compile == to equality across the board for the developer, I effectively introduce enormous overhead (about 700% according to jsperf) on the user in cases where he/she expected identity comparison instead (which is about 90% of the time, since primitives are a lot more common). There has been a lot of discussion about this, which you can follow in issues 93 and 94 on my github page.

As you probably already guessed from the previous paragraph, I'm now able to bridge that gap. Effective last month (that's right, I snuck a change in without anyone noticing), RapydScript became the first JavaScript-based transcompiler to support high-performance deep equality. How performant is this operation, you may ask? Well, take a look for yourself:

[jsPerf screenshot: the deep-equality pattern benchmarked against plain identity comparison]

According to jsperf, the overhead is negligible (that’s right, the measurement noise is greater than any visible difference – as you can see from the deep-equality version outperforming identity). So what changed? Why am I suddenly able to blow the doors off of typical deep equality? Well, nothing new happened in JavaScript world. What changed is my approach. I decided to think about the problem creatively and had a sudden eureka moment (which I’m sure other compilers will copy in the future). Unfortunately you’ve probably already looked at the code from JsPerf above, ruining my surprise. But in case you haven’t, here is the pattern I came up with:

A === B || typeof A === "object" && eq(A, B)

How does it work? As you probably learned in your introductory programming class (assuming it was a legitimate C/Java class rather than an online tutorial about JavaScript), the logical operators short-circuit in just about all languages. This means that if the left-hand side is truthy for the || (or) operator, or falsy for && (and), the rest of the expression is ignored (it's as if it's not there). That means that if A and B are primitives that are equal, the above is equivalent to a simple A === B comparison. That handles the positive case, but the negative one is a bit trickier. I was struggling with it for a while (yes, the above expression makes it seem simpler than it really is – everything seems easy in hindsight), until I found a performant check that works in both browser and node (my first attempt was to check whether A.constructor exists, which doesn't work on all platforms). Fortunately, the very same "feature" that makes typeof useless in most cases (the fact that every non-primitive is an object) becomes the saving grace of this operation. As you can see from the JsPerf test, the overhead of this check is negligible as well.
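If the short-circuit part sounds abstract, here is the same guard spelled out in Python rather than JavaScript (purely an illustration of the principle; the code RapydScript actually emits is the expression shown above):

def eq(a, b):
    # stand-in for an expensive structural comparison
    print("deep comparison ran")
    return a == b

def fast_equal(a, b):
    # `or` short-circuits: when the cheap identity test succeeds,
    # eq() is never evaluated
    return a is b or eq(a, b)

x = [1, 2, 3]
fast_equal(x, x)           # identity hit, eq() never runs
fast_equal(x, [1, 2, 3])   # falls through to the deep comparison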

The best part is that unlike native JavaScript, where the developer would have to type that out by hand (because hiding it in a function call introduces the 700% overhead we're trying to avoid), RapydScript can unroll the == operator into that magic automatically. You're probably wondering, then, how is it that this change has been in RapydScript for over a month if == still compiles to ===? Well, for safety I've hidden this operator behind an import. If you wish all your == operators to compile to the proper equality test shown above, add the following line at the top of your file:

from danger_zone import equality

That's it. Now all your == will compile to the magic shown above, and != will compile to its inverse (thanks to De Morgan's law). The above import will also tweak the implementations of indexOf and lastIndexOf to perform deep equality as well, which in turn makes tests like if a in arr rely on deep equality (delivering a consistent experience across the board). As before, the identity operators are still there as well, via is and is not, which compile to === and !==, respectively. In the future, I also want to add an optimizer to the compiler that will strip away the remainder of the expression at compile time if one of the operands is a constant.

Regret and Opportunity

I apologize for the lack of updates. A lot has happened since my last post: the lawsuit I was involved in was finally settled. It wasn't settled favorably, and I ended up losing most of my security deposit anyway – a large chunk of it to attorney fees. The seller (who I assume spent just as much on her attorney) probably ended up close to break-even as well. It's unfortunate that she didn't come to her senses until 2 years into this ordeal, after the fees started to outweigh the amount disputed over.

Perhaps I was wrong to fight her. Perhaps I should’ve just walked away. After 2 years of dread and stress, I ended up losing almost as much as if I simply gave up on day one and let her have the entire amount. I was ready to compromise, I was ready to split my hard-earned savings in half with a stranger just to preserve something, she was not. As a kid I was told once “Don’t get into an argument with an idiot, they’ll take you down to their level and beat you with experience”. Now I knew what that meant.

This was the most unpleasant experience of my life, 2 years of dread where you know that you’ll lose money, you just don’t know how much. I’ve never felt such disgust for a person before. Not hate, not anger, but disgust. With several million in assets (according to her own agent) from her husband’s estate, she decided that the best use of her time and resources was to go after me (because it’s my fault that flood laws changed as I was buying the house) for what was effectively pocket change to her. To me on the other hand, this was a 3rd of my savings, hence the panic I went through. The rest I already mentioned before. “You could get your deposit back if we end up in court”, my attorney told me, “but you’ll lose more in legal fees, don’t throw good money after bad”. That was my justification for wanting to settle.

However, after I was done sulking, I realized that something amazing happened. In these 2 years that the lawsuit lasted I learned more about real estate than most people do in a decade (partially from the lawsuit, partially from absorbing all the books I could get my hands on – determined not to get into same situation again), I learned more about investing than most do in a lifetime (including derivatives, various quirks of US financial structure, strategies and their pitfalls), I bought my first property to rent out, made connections with local and remote investors to do deals with in the future, learned Objective-C and Swift (which I wanted to learn for a while but didn’t have the motivation), wrote an iOS app, advanced to a lead developer at work, read 43 books, and met an amazing goal-oriented girl, whom I still date today.

What happened? I got the gift of motivation. I realized that life was short, that a single event could wipe me out unless I build a financial fortress out of my remaining assets, and that I was already a decade behind great men who realized this in early twenties. I had to catch up. All of a sudden fear turned from a blockade into rocket fuel, all because my perception changed. I still hesitate, I’m still indecisive, but I now realize that doing nothing is rarely the safest choice. The best analogy would be the Slime Climb level of Donkey Kong Country. I still have to climb up, but now I realize that there is rising water below me, and things in it that could hurt me. This is true for all of us, but most choose to ignore this fact until well into our fifties.

Imports, Namespaces, and References

I had a decent idea of how importing and namespaces worked, but it wasn't until I started setting up unit tests that everything clicked. False assumptions I had about namespaces and imports caused test failures, and fixing them forced me to do things the right way, so that I was changing the right objects instead of creating objects in random places. I learned that most things are references, and when you think of variables that way, namespaces and imports make a lot of sense.

Some basics

When you write a module, you start defining things: functions, variables, classes, etc. When you define something, Python creates an object in memory, and the variable name you use is how you refer to that chunk of memory. Everything is references behind the scenes.

A namespace is a container for these references. It maps variable names to addresses in memory. Any module you write has its own namespace. When you define variable mylist in your module, you are defining mylist in your module’s namespace. Variables defined in other modules, or I should say in other modules’ namespaces, are not accessible in your module – unless you import them.
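You can actually peek at this mapping from inside a module; a quick illustration:

mylist = [1, 2, 3]

# globals() returns the module's namespace as a dict
print(globals()['mylist'])             # the same list, reached through the mapping
print(globals()['mylist'] is mylist)   # True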

You will work with open source libraries, and to unlock their power you need to access their functions and classes, which live in the library's namespace. You gain access through importing. At a high level, Python imports a module by doing the following (a rough sketch in code follows the list):

  1. See if the module name is in sys.modules – this is like a cache of imported modules.
  2. If not, look for the file that matches the module name.
  3. Run the module being imported. This sets up all of its global variables, including the variables for classes & functions.
  4. Put the loaded module into sys.modules.
  5. Set up references so you can access the imports – this is where the fun happens.
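Roughly speaking, import cars boils down to something like the sketch below (simplified, using Python 3's importlib.util; the real machinery also handles packages, import hooks, and circular imports):

import sys
import importlib.util

def my_import(name):
    # Step 1: check the cache
    if name in sys.modules:
        return sys.modules[name]
    # Step 2: find the file that matches the module name
    spec = importlib.util.find_spec(name)
    # Step 4 is done early here so the module can be found while it loads
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # Step 3: run the module's top-level code
    spec.loader.exec_module(module)
    return module

# Step 5: bind a name in *this* module's namespace
math = my_import('math')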

I want to explain these steps in a little more detail, as well as how they affect modifying variables inside an imported module. I will go through the steps out of order, because step 1, loading from the cache, doesn't mean much until after step 4, putting a module into the cache.

Step 2: Finding what to import

Finding the module is pretty useful and interesting, but it does not have much to do with references, so I am skipping the details here. Also, the Python docs outline this clearly and succinctly:

https://docs.python.org/3.3/tutorial/modules.html#the-module-search-path

Step 3: Executing Code

After finding the file, Python loads and executes it. To give a concrete example, I have 2 simple modules:

cars.py

car_list = [1]

def myfun():
    return car_list

print "Importing..."

dealer.py

import cars

If you run dealer.py, you see the following:

> python dealer.py
Importing...

So by loading and executing I mean Python runs everything at the top level. That includes setting variables, defining functions, and running any other code, such as the print statement in cars.

It also does this in each module’s own namespace. Defining car_list in cars basically adds a pointer in cars to this new list. It does not affect dealer, which could have its own, completely independent variable named car_list. After the import is done, dealer will also not have direct access to car_list. cars has the reference/pointer to that object in memory, and the only way dealer can access the list is to go through cars: cars.car_list.
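In other words, if dealer.py tried to use the bare name it would fail; a quick illustration of what dealer.py could contain:

import cars

print(cars.car_list)    # works: goes through the cars namespace -> [1]

try:
    print(car_list)     # NameError: dealer never defined this name
except NameError as e:
    print(e)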

Step 4 & Step 1

Caching takes place after loading the module. Python stores the module in a module object, which it puts into the sys.modules dictionary. The key is usually the name of the module, with one exception: if the module is the one you are directly running, its key is '__main__'. (Aside: this is why we use if __name__ == '__main__'.)

What’s cool is that you can access modules through sys.modules. So back to the simple module. After importing it, you can access the list as cars.car_list. You can also access it through sys.modules:

>>> import sys
>>> sys.modules['cars'].car_list
[1]

The caching in sys.modules has many effects. Importing a cached module means it doesn’t get executed again. Reimporting something is cheap! It also means that all the classes, functions, and variables defined at the top level are only defined once, and statements only execute once. The print statement in cars won’t print if it is imported again, or imported anywhere else in the same process.
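If you really do want the module body to run again, you have to ask for it explicitly, for example with a reload (importlib.reload in Python 3, the builtin reload in Python 2):

import cars            # prints "Importing..." only the first time
import cars            # cache hit in sys.modules: nothing is printed

import importlib
importlib.reload(cars) # re-executes cars.py, so "Importing..." prints again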

Step 5: Reference Setup

This is where the most interesting stuff happens. Depending on how you import something, references are setup in your module’s namespace to let you access what you imported. There are a few variations, and I have some code that shows what Python effectively does. Remember, the modules are already loaded into sys.modules at this point.

  1. import cars

    cars = sys.modules['cars']
  2. import cars as vehicles

    vehicles = sys.modules['cars']
  3. from cars import car_list

    car_list = sys.modules['cars'].car_list

No matter how you import something, code from a module always runs in its own namespace. So when cars looks up a variable, it looks it up in cars's own namespace, not in the namespace of whoever imported it.

Mutability

Imported variables behave the same way as regular variables, and what happens when you modify them depends on mutability.

Remember, when Python creates an object, it puts it somewhere in memory. For mutable objects (dict, list, most other objects), when you make a change to that object, you change the value in memory. For immutable objects (int, float, str, tuple), you can never change the value in memory: you can reassign the variable to point to a new value, but you cannot change the original value.

To illustrate this I can use the id function, which prints an object's address in memory. I modify the value of an int and of a list.

Immutable:

>>> a = 1
>>> id(a)
140619792058920
>>> a += 1
>>> id(a)
140619792058896

Mutable:

>>> b = [1]
>>> id(b)
4444200832
>>> b += [2]
>>> id(b)
4444200832

When I change the value of a, the immutable integer, the id changes. But when I do the same for a list, the id remains the same.

There is a sneaky catch though! Many times when you assign a variable, even a mutable one, you change the reference.

>>> b = [1]
>>> id(b)
4320637944
>>> b = b + [2]
>>> id(b)
4320647936

Behind the scenes, Python is creating a new list and changing where the b reference points. The difference between this case and the += case above can be subtle. I wouldn't worry about it much – it's never affected me – but it's good to be aware of.
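The reason the ids behave differently is that for lists b += [2] mutates the object in place (roughly b.extend([2])), while b = b + [2] builds a new list and rebinds the name. A small sketch of why that distinction matters once a second reference exists:

b = [1]
alias = b          # a second reference to the same list

b += [2]           # in-place: both names still point to the same object
print(alias)       # [1, 2]

b = b + [3]        # new list: only b is rebound, alias is left behind
print(alias)       # [1, 2]
print(b)           # [1, 2, 3]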

Modifying Imports

So how can I change values in imported modules? And how does mutability affect imports?

The way you access, and sometimes change, variables in imported modules may or may not have an effect on the module. This is where unit tests taught me a lot, since my tests imported modules and tried to set variables in order to set up certain test cases. I learned it all comes down to references. There are a few key rules.

Modules execute their code in their own namespace

When a function in cars tries to use car_list, it looks up car_list in the cars namespace. That means the module is only affected if:

  • you update the object data behind that reference, or
  • you change where the reference points in the module's own namespace

There are some simple ways to update what's behind the reference, which work no matter how you import. Let's run this code:

tester_1.py

import cars
from cars import car_list, myfun
# update data through the cars module
cars.car_list.append(2)
print "updated through cars:", cars.myfun()
# update through my namespace
car_list.append(3)
print "updated imported var:", myfun()

This outputs:

> python tester_1.py
Importing...
updated through cars: [1, 2]
updated imported var: [1, 2, 3]

It's trickier to change cars's reference, which does depend on how you import:

tester_2.py

import cars
from cars import car_list, myfun
# change reference through the cars module
cars.car_list = [2]
print "changed cars reference    :", cars.myfun()
# the imported variable still points to the original value
print "test original import      :", cars_list
# changing the imported variable will not affect the "real" value
# so this should print the same as the first call
car_list = [3]
print "changed imported reference:", myfun()

This outputs:

> python tester_2.py
changed cars reference    : [2]
test original import      : [1]
changed imported reference: [2]

In both cases, we are creating a new list, and changing what the variables are referencing. But remember, myfun is in cars, and gets variables from the cars namespace. In the first case we are going to the cars namespace, and setting its car_list to this new list. When myfun runs, it gets the reference we just updated and returns our new list.

When we from cars import car_list, we create a car_list in the tester_2 namespace which points to the same list cars has. When cars's reference is set to [2], tester_2's reference stays unchanged, which is why it keeps the value [1]. When we update car_list in tester_2's namespace, it only changes where tester_2 is pointing. It is not changing where cars points (and not changing the value in memory where cars points), so it has no effect on how the cars module runs. So when we check myfun, it returns [2].

Final thoughts

When you run into a problem where you set a variable in one module, then access it through another, knowing what's going on behind the scenes helps make sense of it all. If your code is not doing what you expect when you set a variable, then something is pointing to the wrong object. Knowing how Python handles namespaces and imports will help you figure out where. It's really not bad once you realize it's all just references.

Using Angular Rapydly

I've been asked a few times if RapydScript works with Angular. The answer is the same as with any other JavaScript framework/library/etc. – of course! RapydScript takes the same approach to compilation/abstraction as CoffeeScript, so there really isn't anything JavaScript can do that RapydScript can't. Salvatore has already proven this with multiple demos using frameworks I haven't even thought of. I urge any developer with similar questions to just try it out; it's easier (and more fun) to spend an hour building an app than to hesitate about it for weeks.

With that said, I decided to put together this tutorial by cloning the 5 examples shown here via RapydScript. I suggest you read the original article before this post. I should also mention that I myself have not used AngularJS before, and decided to learn it through these examples as well. I was a bit disappointed to see that a big chunk of AngularJS is handled magically through its HTML templates rather than in JavaScript itself. This means that there is very little actual JavaScript (and hence RapydScript) involved. As for the HTML itself, you could either write it directly, or use RapydML or Jade (both will have similar syntax). For the purposes of this post, I will leave the CSS alone, although you can easily convert it to sass as well. Let's get started.

Setup (Only Needed for RapydML)

Looking at the first example, we immediately notice Django-like variables in the HTML code. That's a sign that we may want to use a template engine in RapydML. So let's slap one together. Glancing through the examples, we see that most AngularJS variables follow this format:

{{var}}

Doesn’t look like they’re doing anything fancy, so let’s create a very simple template engine in a new pyml file:

angular = TemplateEngine('{%s}')
angular.js = create('{%s}')

I should also mention that you can use AngularJS from RapydML directly, without creating a template engine; I chose to create one for clarity. But nothing prevents you from writing this:

div:
    'This is an angular var: {{myvar}}'

Example 1 (Navigation Menu)

This one doesn’t require any JavaScript. We’ll just need to slap together a RapydML template, leveraging the 2-line template engine we just wrote:

import angular

div(id="main", ng-app):
    nav(class="angular.js(active)", ng-click="\$event.preventDefault()"):
        for $name in [home, projects, services, contact]:
            a(href='#', class="$name", ng-click="active='$name'"):
                python.str.title('$name')

    p(ng-hide='active'):
        'Please click a menu item'
    p(ng-show='active'):
        'You choose'
        b:
            "angular.js(active)"

That's it – compile it and plug it into AngularJS. You will need the most recent version of RapydML: older versions did not handle template engine calls within strings, while the new one handles them in double-quoted strings and ignores them in single-quoted strings.

Example 2 (Inline Editor)

As before, we start with the RapydML code:

import angular

div(id='main', ng-app, ng-controller='InlineEditorController', ng-click='hideTooltip()'):
    div(class='tooltip', ng-click='\$event.stopPropagation()', ng-show='showtooltip'):
        input(type='text', ng-model='value')
    p(ng-click='toggleTooltip(\$event)'):
        "angular.js(value)"

Next, let’s add some RapydScript:

def InlineEditorController($scope):
    $scope.showtooltip = False
    $scope.value = 'Edit me.'

    $scope.hideTooltip = def():
        $scope.showtooltip = False

    $scope.toggleTooltip = def(e):
        e.stopPropagation()
        $scope.showtooltip = not $scope.showtooltip

Once again, after compiling the first block of code via RapydML and second via RapydScript, we’ll have a fully-working example.

Example 3 (Order Form)

In this example, we see a new way of hooking into AngularJS. Note the {active: service.active} class tag. We could either reference it in the code verbatim (since it is quoted), or add a new method to the angular template engine for consistency:

angular.active = create('active: %s')

We can now continue with the 3rd example, first the RapydML code:

import angular

form(ng-app, ng-controller='OrderFormController'):
    h1:
        'Services'
    ul:
        li(ng-repeat='service in services', ng-click='toggleActive(service)', ng-class="angular.active(service.active)"):
            "angular.js(service.name)"
            span:
                "angular.js(service.price | currency)"

    div(class='total'):
        'Total: '
        span:
            "angular.js(total() | currency)"

Finally, let’s complement it with RapydScript:

def OrderFormController($scope):
    $scope.services = [
        {
            name: 'Web Development',
            price: 300,
            active: True
        }, {
            name: 'Design',
            price: 400,
            active: False
        }, {
            name: 'Integration',
            price: 250,
            active: False
        }, {
            name: 'Training',
            price: 220,
            active: False
        }
    ]

    $scope.toggleActive = def(s):
        s.active = not s.active

    $scope.total = def():
        total = 0
        angular.forEach($scope.services, def(s):
            nonlocal total
            if s.active:
                total += s.price
        )
        return total

As usual, it looks similar to the JavaScript version. RapydScript did prevent us from shooting ourselves in the foot by accidentally declaring total as a global, something you'd need to be careful about in the native JavaScript version.

Example 4 (Instant Search)

For this example, I'll actually show alternative RapydML code that doesn't use our template engine:

div(ng-app='instantSearch', ng-controller='InstantSearchController'):
    div(class='bar'):
        input(type='text', ng-model='searchString', placeholder='Enter your search terms')

    ul:
        li(ng-repeat='i in items | searchFor:searchString'):
            a(href='{{i.url}}'):
                img(ng-src='{{i.image}}')
            p:
                '{{i.title}}'

That was even easier than the template engine version; I think I'll stick to this approach for my next example as well. Now the RapydScript portion:

app = angular.module('instantSearch', [])
app.filter('searchFor', def():
    return def(arr, searchString):
        if not searchString:
            return arr

        result = []
        searchString = searchString.toLowerCase()
        angular.forEach(arr, def(item):
            if item.title.toLowerCase().indexOf(searchString) != -1:
                result.push(item)
        )
        return result
)

def InstantSearchController($scope):
    $scope.items = [
        {
            url: 'http://tutorialzine.com/2013/07/50-must-have-plugins-for-extending-twitter-bootstrap/',
            title: '50 Must-have plugins for extending Twitter Bootstrap',
            image: 'http://cdn.tutorialzine.com/wp-content/uploads/2013/07/featured_4-100x100.jpg'
        }, {
            url: 'http://tutorialzine.com/2013/08/simple-registration-system-php-mysql/',
            title: 'Making a Super Simple Registration System With PHP and MySQL',
            image: 'http://cdn.tutorialzine.com/wp-content/uploads/2013/08/simple_registration_system-100x100.jpg'
        }, {
            url: 'http://tutorialzine.com/2013/08/slideout-footer-css/',
            title: 'Create a slide-out footer with this neat z-index trick',
            image: 'http://cdn.tutorialzine.com/wp-content/uploads/2013/08/slide-out-footer-100x100.jpg'
        }, {
            url: 'http://tutorialzine.com/2013/06/digital-clock/',
            title: 'How to Make a Digital Clock with jQuery and CSS3',
            image: 'http://cdn.tutorialzine.com/wp-content/uploads/2013/06/digital_clock-100x100.jpg'
        }, {
            url: 'http://tutorialzine.com/2013/05/diagonal-fade-gallery/',
            title: 'Smooth Diagonal Fade Gallery with CSS3 Transitions',
            image: 'http://cdn.tutorialzine.com/wp-content/uploads/2013/05/featured-100x100.jpg'
        }, {
            url: 'http://tutorialzine.com/2013/05/mini-ajax-file-upload-form/',
            title: 'Mini AJAX File Upload Form',
            image: 'http://cdn.tutorialzine.com/wp-content/uploads/2013/05/ajax-file-upload-form-100x100.jpg'
        }, {
            url: 'http://tutorialzine.com/2013/04/services-chooser-backbone-js/',
            title: 'Your First Backbone.js App – Service Chooser',
            image: 'http://cdn.tutorialzine.com/wp-content/uploads/2013/04/service_chooser_form-100x100.jpg'
        }
    ]

On to the last example.

Example 5 (Switchable Grid)

First the RapydML code, as usual:

def layout($type, $imgtype):
    ul(ng-show="layout == '$type'", class="$type"):
        li(ng-repeat='p in pics'):
            a(href='{{p.link}}', target='_blank'):
                img(ng-src="{{p.images.$imgtype.url}}")
            p:
                '{{p.caption.text}}'

div(ng-app='switchableGrid', ng-controller='SwitchableGridController'):
    div(class='bar'):
        for $icon in [list, grid]:
            a(class='$icon-icon', ng-class="{active: layout == '$icon'}", ng-click="layout = '$icon'")
    layout(list, low_resolution)
    layout(grid, thumbnail)

Notice that I chose to be a little more DRY and created an extra function; you don't have to do that if you believe it hurts readability. Next, the RapydScript code:

app = angular.module('switchableGrid', ['ngResource'])
app.factory('instagram', def($resource):
    return {
        fetchPopular: def(callback):
            api = $resource('https://api.instagram.com/v1/media/popular?client_id=:client_id&callback=JSON_CALLBACK', {
                client_id: '642176ece1e7445e99244cec26f4de1f'
            }, {
                fetch: { method: 'JSONP' }
            })

            api.fetch(def(response):
                callback(response.data)
            )
    }
)

def SwitchableGridController($scope, instagram):
    $scope.layout = 'grid'
    $scope.pics = []
    instagram.fetchPopular(def(data):
        $scope.pics = data
    )

Conclusion

While the RapydScript code above might be a bit cleaner than the original JavaScript code, it's hard to argue that significant value was gained from porting it. The main benefit of RapydScript comes into play when building larger-scale applications that leverage classes and use numerous variables. Such code increases the chances of shooting yourself in the foot or getting confused when dealing with native JavaScript. Indeed, if you look at some of the jQuery plugins, you'll see that the code itself seems unmaintainable; after a few layers of anonymous functions it's hard to figure out what's being passed around where. AngularJS keeps the code short by handling a lot of the logic for you in the background; RapydScript allows you to keep your own logic readable as it grows. No matter how awesome a framework is, eventually you'll be writing a lot of your own logic as well.

Python & C

As promised, I took a script written in Python and ported parts of it to C. The Python interface is the same, but behind the scenes is blazing fast C code. The example is a little different from what I usually do, but I discovered a few things I had never seen.

I am working off the same example as the previous post, How to Write Faster Python, which has the full original code. The example takes an input image and resizes it down by a factor of N. It takes NxN blocks and brings them down to 1 pixel which is the median RGB of the original block.

Small Port

This example is unusual for me because it's doing one task, resizing, and the core of this one task is a very small bit of logic – the median function. I first ported the median code to C and checked the performance. I knew the C code would do better because C's sorting function lets you specify how many elements of the array you want to sort; in Python you sort the whole list (as far as I know), so in C I was able to avoid the checks on the size of the list vs. the number of elements I want to sort.

Below is the code, but I have a few changes I want to mention as well.

myscript2.py:

import cv2
import numpy as np
import math
from datetime import datetime
import ctypes

fast_tools = ctypes.cdll.LoadLibrary('./myscript_tools.so')
fast_tools.median.argtypes = (ctypes.c_void_p, ctypes.c_int)


def _median(list, ne):
    """
    Return the median.
    """

    ret = fast_tools.median(list.ctypes.data, ne)
    return ret


def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    num_elems = block_size * block_size
    block_b = np.zeros(num_elems, dtype=np.uint8)
    block_g = np.zeros(num_elems, dtype=np.uint8)
    block_r = np.zeros(num_elems, dtype=np.uint8)
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            num_elems = (r_max-row_min) * (c_max-col_min)
            block_i = 0
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b[block_i] = img[r_i, c_i, 0]
                    block_g[block_i] = img[r_i, c_i, 1]
                    block_r[block_i] = img[r_i, c_i, 2]
                    block_i += 1

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b, num_elems)),
                int(_median(block_g, num_elems)),
                int(_median(block_r, num_elems))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

myscript_tools.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // for malloc
#include <math.h>

int cmpfunc (const void * a, const void * b)
{
   return ( *(uint8_t*)a - *(uint8_t*)b );
}

int _median(uint8_t * list, int ne){
    //sort inplace
    qsort(list, ne, sizeof(uint8_t), cmpfunc);
    int i;
    if (ne % 2 == 0){
        i = (int)(ne / 2);
        return ((int)(list[i-1]) + (int)(list[i])) / 2;
    } else {
        i = (int)(ne / 2);
        return list[i];
    }
}

int median(const void * outdatav, int ne){
    int med;
    uint8_t * outdata = (uint8_t *) outdatav;
    med = _median(outdata, ne);
    return med;
}

and to compile:

gcc -fPIC -shared -o myscript_tools.so myscript_tools.c

Note: I discovered that the .so filename should not match the name of any Python script you plan to import. Otherwise you will get an "ImportError: dynamic module does not define init function", because Python tries to import the C library instead of your Python module.

The main change I made was to port the median function to C. I also switched from using a list of colors to using a numpy array of colors. I did this because it's easier to pass numpy arrays, and because that is more useful in general. Also, so I wouldn't have to convert the list to a numpy array on every median call, I created the numpy array in the resize_median code itself. Before this last change, my benchmark showed I was actually running slower: the faster C code was not making up for all the list-to-numpy-array conversions!

After the 2nd fix I timed it

python -m timeit "import myscript2; myscript2.run()"

and got

500 loops, best of 3: 2.04 sec per loop

Nice! That’s compared to 2.51 from my last post, so it’s around 18% faster.

All C (sort of)

I wasn't happy with the 18% boost – I thought it would be faster! But then, only a very small portion had been ported. I decided to port more, but there wasn't a good natural place to split the resize function up any further, so I ported the whole thing. This way I could also show how to send a whole image to C and receive one back – which I do by reference/pointer.

myscript2.py

import cv2
import numpy as np
import math
from datetime import datetime
import ctypes

fast_tools = ctypes.cdll.LoadLibrary('./myscript_tools.so')

def c_resize_median(img, block_size):
    height, width, depth = img.shape

    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create a numpy array where the C will write the output
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    fast_tools.resize_median(ctypes.c_void_p(img.ctypes.data),
                             ctypes.c_int(height), ctypes.c_int(width),
                             ctypes.c_int(block_size),
                             ctypes.c_void_p(out_img.ctypes.data))
    return out_img


def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = c_resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

myscript_tools.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h> // for malloc
#include <math.h>

int cmpfunc (const void * a, const void * b)
{
   return ( *(uint8_t*)a - *(uint8_t*)b );
}

int _median(uint8_t * list, int ne){
    //sort inplace
    qsort(list, ne, sizeof(uint8_t), cmpfunc);
    int i;
    if (ne % 2 == 0){
        i = (int)(ne / 2);
        return ((int)(list[i-1]) + (int)(list[i])) / 2;
    } else {
        i = (int)(ne / 2);
        return list[i];
    }
}

void resize_median(const void * bgr_imgv, const int height, const int width,
                   const int block_size, void * outdatav) {
    const uint8_t * bgr_img = (uint8_t  *) bgr_imgv;
    uint8_t * outdata = (uint8_t *) outdatav;

    int col, row, c_max, r_max, c_si, r_si;
    int max_val;
    const int num_elems = block_size * block_size;
    int j, offset;

    uint8_t *b_values;
    uint8_t *g_values;
    uint8_t *r_values;
    b_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);
    g_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);
    r_values = (uint8_t *) malloc(sizeof(uint8_t) * num_elems);

    int out_i = 0;

    row = 0;
    while (row < height) {
        col = 0;
        r_max = row + block_size;
        if (r_max > height) {
            r_max = height;
        }
        while (col < width) {
            c_max = col + block_size;
            if (c_max > width) {
                c_max = width;
            }

            // block info:
            j = 0;
            for (r_si = row; r_si < r_max; ++r_si) {
                for (c_si = col; c_si < c_max; ++c_si) {
                    offset = ((r_si*width)+c_si)*3;
                    b_values[j] = bgr_img[offset];
                    g_values[j] = bgr_img[offset+1];
                    r_values[j] = bgr_img[offset+2];
                    j += 1;
                }
            }

            // have the block info, get medians
            max_val = j;
            outdata[out_i]   = _median(b_values, max_val);
            outdata[out_i+1] = _median(g_values, max_val);
            outdata[out_i+2] = _median(r_values, max_val);

            // update indexes
            out_i += 3;
            col += block_size;
        }
        row += block_size;
    }

    free(b_values);
    free(g_values);
    free(r_values);
}

This one is a little more complicated. With multi-dimensional arrays, numpy stores the data as one flat, contiguous block, so the C code sees it as [pixel00_blue, pixel00_green, pixel00_red, pixel01_blue, ...] – see the offset variable above for the row/column equation. I also pass the pointers with no type, so before using the arrays, I have to typecast them. I realize this example is unusual because the whole script is basically ported, but it illustrates many things that are non-trivial and required some time to figure out.
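As a sanity check of that offset equation, here is a tiny numpy snippet (a made-up 2x4 image standing in for the real one; it assumes the array is C-contiguous, which is what cv2.imread returns):

import numpy as np

img = np.arange(2 * 4 * 3, dtype=np.uint8).reshape((2, 4, 3))  # 2x4 "image", 3 channels
height, width = img.shape[:2]
flat = img.ravel()                        # the same flat buffer the C code walks

r, c = 1, 2
offset = ((r * width) + c) * 3
assert flat[offset]     == img[r, c, 0]   # blue
assert flat[offset + 1] == img[r, c, 1]   # green
assert flat[offset + 2] == img[r, c, 2]   # red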

And for the moment of truth…

500 loops, best of 3: 440 msec per loop

82% faster!

Although it's mostly C at this point.

Final Thoughts

So in this case, the Python version took about 5 times as long to run as the C version. When something takes a few milliseconds, 5-10x isn't that important, but when it gets more complex, it starts to matter (assuming you're not I/O bound). Here I ported pretty much the whole module. I think this only makes sense for a library, and not for code that you will refer to often – Python is just so easy to read.

An idea popped into my head as I was writing this as well: this should be easy to port to RapydScript. JavaScript has the Image object I could use in place of OpenCV. I wonder where JavaScript would fall on the spectrum.

Displaced Aggression

I'm going through a very tough situation in my life – hence the slower development of Grafpad and RapydScript. I won't get into the details, but basically I'm being sued over a house purchase that fell through. It's a very stressful situation to be in, and I don't believe the other party is enjoying it any more than I do. The question then is, why don't we just discuss it like human beings?

I tried to reason with the other party repeatedly to minimize damages to both. She is not interested in negotiating, her attorney says “she is angry”. She insists that she wants to recover her damages. I still fail to see what those damages are. The mortgage is paid off, she may have incurred a few hundred dollars of extra utility bills she wouldn’t have paid if she moved earlier. I already offered several thousand to make up for it and avoid headaches, we also suggested that she offer her own “just” amount. She refused, her response was that she is angry and intends to take it to court – a court that will take over a year to get to and is just as likely to rule in my favor as in hers.

As a result of the lawsuit, both parties have already incurred several thousand dollars more in lawsuit fees. What did I do that she is so angry about, you may ask? I simply wasn’t able to complete the transaction due to a perfect storm of unfortunate events (recent change in local laws, problems with the bank due to these laws, as well as misinformation from a third party). Another buyer walked away right before me as well, and the seller was angry with them too. I should have seen that as a red flag.

Why am I sharing this? Because I don’t believe that the seller is trying to make a quick buck off of me. If the goal was to maximize her own good, she would have taken the settlement or offered a counter, minimizing harm to both parties, while still maximizing her gain. That is how rational people think. Unfortunately, it seems like it’s not what the seller is after. She is angry. It’s not her own good or damages she cares about, it’s maximizing harm to me – even if it means incurring the same harm herself in the process. Why? Because she is angry, because she wants to “punish” someone for the pain she feels.

When you think like this, you will eventually find someone to punish. Problem is, you’re punishing another victim, you’re not addressing the real cause of the problem. She was angry at the previous buyer, that same anger was later projected at me, amplified by additional headaches. Why was she angry at me? Because this house was a pain to sell (it was on the market for over a year) and because I happened to be the easiest target in this interaction. When you’re angry, you don’t think rationally. When you’re angry, you’re simply searching for a scapegoat, you’re not searching for a solution. If we asked ourselves “how to solve this problem?” more often and “who caused the problem?” less often, we’d create less stress for everyone and we’d need less lawsuits to settle things.

How to Write Faster Python

This is a high-level guide on how to approach speeding up slow apps. I have been writing a computer vision app that runs on the backend of a server, and I came to the realization that it is easy to write slow Python, and that Python itself is not fast. If you're iterating over a 1000 x 1000 pixel (1 megapixel) image, whatever you're doing inside that loop will run 1 million times. My code ran an iterative algorithm, and the first version took over 2 minutes to run. I was able to get that time down to under 10 seconds – and it only took a few hours of time with minimal code changes. The hardest part was converting some snippets of code into C, which I plan to detail in the future. The basic approach I followed was:

  1. Set a baseline with working code & test
  2. Find the bottlenecks
  3. Find a way to optimize bottlenecks
  4. Repeat

And fair warning, these sorts of optimizations only work if you have selected a good algorithm. As we used to say at work “You can’t polish a turd”.

Slow, but working code

I am going to start with an example program. It resizes an image using the medians of each region: if you resize to 1/4th of the original size, the new image will be the medians of 4×4 blocks. It is not too important that you understand the details, but I still included the code for reference:

import cv2
import numpy as np
import math
from datetime import datetime

def _median(list):
    """
    Return the median
    """

    sorted_list = sorted(list)
    if len(sorted_list) % 2 == 0:
        #have to take avg of middle two
        i = len(sorted_list) / 2
        #int() to convert from uint8 to avoid overflow
        return (int(sorted_list[i-1]) + int(sorted_list[i])) / 2.0
    else:
        #find the middle (remembering that lists start at 0)
        i = len(sorted_list) / 2
        return sorted_list[i]

def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    #TODO: optimize this, maybe precalculate height & widths
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            block_b = []
            block_g = []
            block_r = []
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b.append(img[r_i, c_i, 0])
                    block_g.append(img[r_i, c_i, 1])
                    block_r.append(img[r_i, c_i, 2])

            #have the block info, sort by L
            median_colors = [
                int(_median(block_b)),
                int(_median(block_g)),
                int(_median(block_r))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

This takes around 3 seconds to run against a random 1000×1000 image from Wikipedia:

500 loops, best of 3: 3.09 sec per loop

So this is a good starting point – it works! I am also saving the output image, so that when I make changes, I can spotcheck that I didn’t break something.

Profiling

Python’s standard library has a profiler that works very well (see The Python Profilers). Although I’m not a fan of the text output – I feel it does not intuitively roll up results – you can dump stats to a file which you can view in a UI.

So I can profile the script:

python -m cProfile -o stats.profile myscript.py

Note that it can sometimes add a lot of overhead.
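If you would rather stay in the terminal instead of a GUI, the same stats.profile dump can be inspected with the standard pstats module. A minimal sketch:

import pstats

#load the dump written by the cProfile command above
stats = pstats.Stats('stats.profile')

#sort by cumulative time and show the 10 most expensive calls
stats.sort_stats('cumulative').print_stats(10)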

And after installing runsnakerun I can view the stats:

runsnake stats.profile

which shows me the following breakdown: [RunSnakeRun GUI screenshot]

So there are a few things that jump out as easy fixes. First, the app spends around a third of its time appending colors. Second, finding the median also takes a significant amount of time – most of it in the sorted() call, which is not visible in the image above. The other lines of code are not significant enough to display.

Optimizing

I have a few quick rules of thumb for optimizing:

  • creating/appending to lists is slow due to memory allocation – try to use list comprehensions or pre-allocate the list (see the micro-benchmark sketch after this list)
  • try not to run operations that create copies of objects unless it’s required – this is also slow due to memory allocation
  • repeated lookups are slow: if you’re using mylist[i] several times, assign myvar = mylist[i] up front
  • use libraries as much as possible – many are written in C/C++ and fast

Other than that, make use of search engines like Google or DuckDuckGo. You can tweak your Python code (you might discover you’re doing something wrong!), use Cython, write C libraries, or find another solution.

So profiling tells me that appending is slowing my code down. I can get around this problem by declaring the list once and keeping track of how many elements I “add”. I also know that sorted() is not preferred because it creates a copy of the original list; I can use list.sort() instead and sort the list in place. I make these changes, run the code, and see that the output still looks right, so I probably did not break anything. Let’s time it.

500 loops, best of 3: 2.51 sec per loop

That’s almost 20% faster! Not bad for a few minutes of effort.

For completeness, here is the modified code:

import cv2
import numpy as np
import math
from datetime import datetime

def _median(list, ne):
    """
    Return the median of the first ne elements of list.
    """

    #sort inplace
    if ne != len(list):
        list = list[0:ne]
    list.sort()
    if len(list) % 2 == 0:
        #have to take avg of middle two
        i = len(list) / 2
        #int() to convert from uint8 to avoid overflow
        return (int(list[i-1]) + int(list[i])) / 2.0
    else:
        #find the middle (remembering that lists start at 0)
        i = len(list) / 2
        return list[i]

def resize_median(img, block_size):
    """
    Take the original image, break it down into blocks of size block_size
    and get the median of each block.
    """

    #figure out how many blocks we'll have in the output
    height, width, depth = img.shape
    num_w = math.ceil(float(width)/block_size)
    num_h = math.ceil(float(height)/block_size)

    #create the output image
    out_img = np.zeros((num_h, num_w, 3), dtype=np.uint8)

    #iterate over the img and get medians
    row_min = 0
    #TODO: optimize this, maybe precalculate height & widths
    num_elems = block_size * block_size
    block_b = [0] * num_elems
    block_g = [0] * num_elems
    block_r = [0] * num_elems
    while row_min < height:
        r_max = row_min + block_size
        if r_max > height:
            r_max = height

        col_min = 0
        new_row = []
        while col_min < width:
            c_max = col_min + block_size
            if c_max > width:
                c_max = width

            #block info:
            num_elems = (r_max-row_min) * (c_max-col_min)
            block_i = 0
            for r_i in xrange(row_min, r_max):
                for c_i in xrange(col_min, c_max):
                    block_b[block_i] = img[r_i, c_i, 0]
                    block_g[block_i] = img[r_i, c_i, 1]
                    block_r[block_i] = img[r_i, c_i, 2]
                    block_i += 1

            #have the block info, take the median of each channel
            median_colors = [
                int(_median(block_b, num_elems)),
                int(_median(block_g, num_elems)),
                int(_median(block_r, num_elems))
            ]

            out_img[int(row_min/block_size), int(col_min/block_size)] = median_colors
            col_min += block_size
        row_min += block_size
    return out_img

def run():
    img = cv2.imread('Brandberg_Massif_Landsat.jpg')
    block_size = 16
    start_time = datetime.now()
    resized = resize_median(img, block_size)
    print "Time: {0}".format(datetime.now() - start_time)
    cv2.imwrite('proc.png', resized)

if __name__ == '__main__':
    run()

Repeat

Now that I have optimized the code a little, I repeat the process to make it even better. I try to decide what should be slow and what should not. For example, sorting is still not fast here, but that makes sense: sorting is the most complex part of this code – the rest is just iterating and keeping track of observed colors.
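One possible next iteration – not something measured in the original timings above, so treat it as a hedged sketch – is to lean on the “use libraries” rule and let numpy compute each block’s per-channel median in C instead of sorting Python lists:

import numpy as np

def _block_median(img, row_min, r_max, col_min, c_max):
    #hypothetical helper: the slice bounds mirror the variables in resize_median
    #flatten the block to (num_pixels, 3) so np.median works per channel in C
    block = img[row_min:r_max, col_min:c_max].reshape(-1, 3)
    return np.median(block, axis=0)

Whether this actually wins depends on the block size – for very small blocks the per-call overhead can dominate – and that is exactly the kind of question the next profiling pass answers.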

Final Thoughts

After having optimized a few projects I have a few final thoughts and lessons learned:

  • Optimizing a slow/bad algorithm is a waste of time – you’ll need to redesign it anyway
  • The granularity of the profiler is sometimes a little funny. You can get around this by making sections of code into functions – metrics on functions are usually captured.
  • The profiler will group calls to library functions together even when they come from different places, so be careful with metrics for libraries/utils – e.g. numpy.
  • Use search to find ways to optimize your code.

If all else fails, you can implement the slow parts in C relatively painlessly. This is the option I usually go with, and I will detail it with some code in the next few weeks.

Have fun speeding up your code!

Business Ping Pong

Ping Pong I used to be a notorious procrastinator. I still procrastinate, but I have gotten much better about it. The problem was my thought process. We all think in terms of pain: we avoid new pain or seek to minimize existing pain, and procrastination is simply a mechanism for avoiding new pain. What we fail to realize, however, is that the same pain keeps dwelling in our minds. That mental to-do list eats away at our brain capacity and distracts us from other tasks. It’s a numbing pain that keeps coming back – like storing clutter in your house, where the house is your brain. After thinking about my own procrastination, I realized that the pain of dwelling on a task is bigger than the pain of actually doing it.

As I started tackling these tasks, doing them became easier. The hardest part is starting, and I only got better once I began forcing myself to start. The other trick is that you don’t have to do it all in one go – once you start working on a problem, it gets smaller. The other side of the coin is tasks that involve other people. We tend to put off our responses until the nagging from the other side becomes severe enough. But what if you handled them as soon as you could? I recently heard a term for this in a conversation: business ping pong. When a ball comes your way, hit it back as soon as you can. Don’t procrastinate – be the one waiting on others, not the other way around. The sooner you reply, the less time you spend thinking about it, and the less clutter you keep in your house.

Bringing a Project to Life

I have always found that building something out of nothing is hard – whether it’s a new feature or a new product. It’s much easier to fix a bug or improve working code. Yet I’ve been on some ambitious projects that got done (building a completely new API for a large ad serving company, and a radar program designed & built from scratch in 3 years, to name the most difficult). Many people thought it was a miracle we finished these, but along the way I noticed some techniques that helped bring the projects to fruition – and they apply to open source projects as well.

Before getting started, you must first give up the idea of building a finished product. You need to make something that is just usable to start. If you have something usable (and useful), people will start using it – and for open source projects, some of those users will help make it better. After a project is brought to life, it can evolve into something more – a finished product. The difficulty lies in making that first usable version.

Fail Fast

At this point nothing has been accomplished yet. You have to start planning and try to eliminate bad ideas before they eat up too much time. The most efficient way of doing this is getting feedback from others. Experts, or even just people with different perspectives, can draw on their experience to quickly point out problems and save you from spending days or weeks on a solution that will never work.

In addition to getting feedback, prototype things and make proof-of-concept models. Most importantly, don’t invest a lot of time here. You don’t need pretty code – but it should be accurate. The goal is to wrap your head around the problem and discover potential solutions and failures early – not to write something maintainable. The models themselves aren’t important; what you learn from these experiments is what matters.

Build a pre-alpha version

You need to get something working that you can build on. Most projects have more than one component – even small web apps have a view & controller. Connecting the different parts is difficult because it is easy to make assumptions about how data is passed between interfaces, and about what that data looks like.

On the radar system I worked on, many properties of the raw input signal were not originally planned to be passed along to processes further down the line, but after building the first version of the code, the format of that data quickly changed to include the additional information. And once the first version of every component was done, we could look at a display, albeit a very crude one, and see a real object – that is very motivating.

Focus, Iterate & Test

The radar system I worked on was also hard because the accelerated schedule left us short on time – time was our most valuable resource. The main reason we were still able to finish was that the motto of the project was an idea borrowed from Voltaire: “perfect is the enemy of good enough”. We were told not to spend any time on something that already met the specs – we could not spare it. Instead we had to find something that was not yet good enough, and focus there.

This list of areas we needed to focus on became our list of improvements/bugs. With our pre-alpha version we had made it past the difficult part of building something from nothing, and now we were working on fixes and improvements – the easy part. After repeating the focus & fix process many times, we got to the point where we had a usable product.

Also crucial here is testing. You need tests so you can confidently make changes without causing regressions. The less effort required to run them, the better (automated ;)).

Share your project

Alex Martelli gave a talk at PyCon 2013, “Good enough is good enough”. He said that if you make something people need, they will use it, even if it’s not perfect. The catch is that it has to be something people actually find useful. If enough people use your open source project, some of them will contribute back, fix bugs, and maybe even clean up and optimize the code. Even those who don’t fix anything can file issue tickets. Just last weekend we had someone file several issues with example code – perfect test cases.