The Montreal Python 26 Talk

December 21, 2011

Hi Folks

I recently gave a talk for Montréal Python 26 (Jolly Justifications!) appropriately called “How Do You Solve A Problem Like Santa: Prototyping Join Patterns with stackless.py for Stackless Python.” Here are the slides for How Do You Solve A Problem Like Santa. Once again, I would like to thank l’équipe de Montréal Python and the wonderful audience for providing me with an opportunity to share.

Season's Greetings,
Andrew

The Santa Claus Problem with Join Patterns

November 30, 2011

Here is the complete code for an implementation of the Santa Claus Problem with join patterns, or at least my initial stab at something approximating a join pattern. This code accompanies the previous post How Do You Solve a Problem Like Santa?.

One may ask, “What is with the Twisted?” In this case, I use Twisted to ensure that tasklets sleep for the correct amount of time. Simply calling time.sleep() would put the entire thread (and all the tasklets running in that thread) to sleep. It would not take much effort to make the elves and reindeer represent network connections.

Yes Virginia, one can make Stackless and Twisted interact….

One of these days I will write a stacklessreactor 🙂

Now to find a WordPress theme that has a wider margin…..

#!/usr/bin/env python
"""
joinSanta.py
Andrew Francis

November 17th, 2011

The purpose of joinSanta is as follows:

Show a solution to the Santa Claus problem using a join object
"""

import random

import time
import sys

from twisted.internet                     import reactor
from twisted.internet                     import task
from twisted.python.failure               import Failure

import stackless

# we can use ETIME and RTIME to control how long reindeer
# and elves wait before returning and/or asking questions

RTIME = (10,12)
ETIME = (1,12)


def tick(seconds):
    tickCh = stackless.channel()
    reactor.callLater(seconds, tickCh.send, None)
    tickCh.receive()


def startTwisted():
    reactor.run()


def stopTwisted():
    reactor.callLater(1, reactor.stop)
    print "that's all folks"


"""
a way of telling the tasklets that Santa tasklet is done
"""
def acceptChannels(pattern, status):
    print "[ACCEPTING]"
    for chanop in pattern:
        chanop.channel.send(status)


def worker(ch, name, theRange):
    while True:
       waitTime = random.randint(theRange[0], theRange[1])
       print name, "waiting ", waitTime, " seconds"
       tick(waitTime)
       ch.send(name)
       answer = ch.receive()


def deliveryToys(reindeer):
    print "All the reindeer have arrived - delivering toys"


def consultWithSanta(elves):
    print "Santa consulting with elves"
    acceptChannels(elves, True)


def harness(reindeer):
    for deer in reindeer:
        print "harnessing ", deer.value 


def unharness(reindeer):
    print "unharnessing"
    acceptChannels(reindeer, True)
    

def santa(reindeer, elves):
    print "in santa"
    joinObject = stackless.join().addPattern([ch for _, ch, _ in reindeer]).\
                                  addPattern([ch for _, ch, _ in elves],3)

    reindeerPattern, elfPattern = joinObject.patterns

    while True:
        pattern = joinObject.join()
        print "-->", pattern ,"<---"
        if reindeerPattern.ready():
            print "*** REINDEER GET PRIORITY ***"
            reindeerPattern.join()
            pattern = reindeerPattern
        if pattern is reindeerPattern:
            harness(reindeerPattern)
            deliveryToys(reindeerPattern)
            unharness(reindeerPattern)
        elif pattern is elfPattern:
            consultWithSanta(elfPattern)

    stopTwisted()
    print "Twisted is dead"


def makeWorkers(workers):
    for name, ch, waitTime in workers:
        stackless.tasklet(worker, name)(ch, name, waitTime)
    return 


if __name__ == "__main__":
   random.seed()
   reindeers = [(reindeer, stackless.channel(reindeer), RTIME) for reindeer in \
               ["DANCER", "PRANCER", "VIXEN", "COMET", "CUPID", "DONER", \
                "DASHER", "BLITZEN", "RUDOLPH"]]

   elves = [(elf, stackless.channel(elf), ETIME) for elf in \
            ['A','B','C','D','E','F','G','H','I','J']]
 
   makeWorkers(reindeers)
   makeWorkers(elves)

   l = task.LoopingCall(stackless.schedule)
   l.start(.001)
   stackless.tasklet(startTwisted, "TWISTED")()
   stackless.tasklet(santa,"SANTA")(reindeers, elves)
   stackless.run()

How Do You Solve A Problem Like Santa? : Join Patterns in Stackless Python

November 29, 2011

Santa repeatedly sleeps until wakened by either all of his nine reindeer, back from their holidays, or by a group of three of his ten elves. If awakened by the reindeer, he harnesses each of them to his sleigh, delivers toys with them and finally unharnesses them (allowing them to go off on holiday). If awakened by a group of elves, he shows each of the group into his study, consults with them on toy R&D and finally shows them each out (allowing them to go back to work). Santa should give priority to the reindeer in the case that there is both a group of elves and a group of reindeer waiting.

Introduction

I discovered the Santa Claus Problem and join patterns innocently enough: through a post on the golang-nuts mailing list. I regularly read golang-nuts since Stackless Python and the Go programming language’s concurrency model share a common ancestor: the Bell Labs family of languages (i.e., Newsqueak, Limbo). A comment (Stackless Python has no select!) in Richard Tew’s blog Stuff What I Posted comparing Stackless Python to Go was the inspiration for my EuroPython 2010 and PyCon 2011 talks “Prototyping Go’s Select with stackless.py for Stackless Python.”

Googling join patterns led to Join Patterns for Visual Basic (Visual Basic!), which in turn led to “Jingle Bells: Solving the Santa Claus Problem in Polyphonic C#” and “Modern Concurrency Abstractions for C#.” Along the way, I downloaded JoCaml. Eventually I encountered Simon Peyton Jones’s STM-based Haskell solution featured in the book Beautiful Code. I read the original John Trono solution featured in the paper A New Exercise in Concurrency: whoa! By the way, there is an interesting YouTube video on The Santa Claus Problem.

Join patterns are a synchronisation and communication concurrency construct. Based on the join calculus of Cédric Fournet and Georges Gonthier, the essential features of join patterns are asynchrony, message passing and synchronisation based on pattern matching involving channels. Join patterns promise a more declarative and safer way of implementing a variety of concurrency solutions.

An Initial Stab at Santa

For the PyCon 2011 version of “Prototyping Go’s Select,” I illustrated the utility of the select statement through an implementation of the Santa Claus Problem (the code can be found here). Here is the nucleus of the code:

def santa(reindeer, elves):
    reindeerCases = [ch.receiveCase() for name, ch, waitTime in reindeer]
    elvesCases = [ch.receiveCase() for name, ch, waitTime in elves]
    reindeerQueue = []
    elvesQueue = []

    cases = reindeerCases + elvesCases

    while True:
        ch = stackless.select(cases)
        print ch.value
        if ch in reindeerCases: 
           reindeerQueue.append(ch.value)
           if len(reindeerQueue) == 9:
              harness(reindeerQueue)
              deliveryToys(reindeerQueue)
              unharness(reindeerQueue)
              reindeerQueue = []
        elif ch in elvesCases:
           elvesQueue.append(ch.value)
           if len(elvesQueue) == 3:
              consultWithSanta(elvesQueue)
              elvesQueue = []
        else:
           print "WHAT?"

(My solution is fashioned after Benton’s Polyphonic C# solution in “Jingle Bells.”)

Though the solution is terse, it is incomplete because the code to give reindeer priority was omitted. The Stackless version (and probably a Go version) is relatively straightforward because the non-preemptive scheduler and high-level constructs like channels eliminate most of the problems that solutions using lower-level concurrency mechanisms need to explicitly address (i.e., locking).

The main issue with my select-based solution is that the Santa Claus tasklet does a lot of housekeeping in the form of maintaining lists and counters for reindeer and elves. This is clutter. Also, what would happen if another process, say Mrs Claus, were competing for reindeer and elves?

An understanding of the underlying implementation would reveal that these data structures already exist in the form of the respective elf and reindeer channel queues. The trick to a cleaner solution would be to keep the elves and reindeer enqueued until needed…

Isn’t this what join patterns do?

Could a Stackless Python based solution win, place, or show against the Haskell “Beautiful Code” solution?

The Adventure Gets Under Way

Baby Steps/Game Plan

At PyCon 2011, I finally had the opportunity to meet Christian Tismer. Everything that could go wrong went wrong at PyCon 2011; meeting Christian and the Stackless sprint made the trip worth it. Amongst the many subjects discussed with Christian was extending the select algorithm to encompass some form of join patterns. This discussion was in the context of figuring out how to move more functionality out of Stackless Python’s C code base and into Python. A piece of advice Christian gave was to take shortcuts. For instance, allow a channel to appear in only one join pattern. I felt there was an 80/20 rule at work with join patterns: I could reap 80% of join patterns’ power easily by not implementing the difficult 20%.

Brainstorming

On returning, with the help of my friends Kevin Bulušek and Ted Ewanchyna, I started work on join patterns. Join patterns have been the toughest project to date. Unlike select, join patterns do not cleanly map onto Stackless Python’s model. I did not have an implementation to examine (I suppose I could have looked at JoCaml’s code). I pored over papers that implemented join patterns in various languages (surely I can’t be the first to attempt join patterns in Python?).

As in the case with select, the project began with a flurry of email activity as Kevin and I exchanged ideas about how to go about implementing join patterns. Often Kevin would attach code to an email. After scratching my head looking at Kevin’s Haskell masquerading as Python (I am so not a functional programmer), I found myself saying “thank God he’s on our side!” The functional programming and the algebraic approach echoed what I would eventually see in the literature.

Since we worked on separate code bases, the parallel prototyping allowed for greater freedom to explore alternatives. Some of Kevin’s ideas don’t mesh well with my present design. Other solutions would never have crossed my mind (i.e., wow, you connect the queue to the chanop! Interesting!) and I would like to incorporate them in a future branch.

The important point is when one can prototype quickly in a language like Python, exploring alternatives to see what works and what doesn’t becomes feasible.

“A new point of view is worth 80 IQ points.”
–Alan Kay

Beginning of a New Model

By mid-May, I realised I was engaged in a game of whack-a-mole. I abandoned my prototypes and went back to the drawing board. Up to then, school had occupied much of my time and defocused me. I started to carefully categorise the problem space:

  • Interactions between “regular” channels and channels participating in a join pattern (inter-tasklet).
  • Interactions between patterns in a join/select (intra-tasklet).
  • Interactions between patterns and channels (channels are treated as a special case of a join pattern).
  • Interactions between join patterns (inter-tasklet).
  • Interactions that would cause deadlock.

Eventually I was able to enumerate the cases; fortunately there was a manageable number of them. Some edge cases I originally did not think were possible. Now, however, I could decide the right behaviour and write tests. A post on golang-nuts, and a response from Ian Lance Taylor, persuaded me to add a new feature to the design: transferring data only when the pattern matched (“atomic transactions”). The original plan was to have join patterns work more like UNIX’s select. That design would have made join patterns vulnerable to race conditions, which is just wrong.

Also at this time, I read the paper “Scalable Join Patterns.” Much of the paper I didn’t think was relevant to my case; however, I was exposed to concepts such as message stealing and operations having transaction states. I was also consulting Gregory Andrews’s “Concurrent Programming: Principles and Practice” for insights on message-passing constructs.

Note: in a separate blog post, I will examine the matching rules in more detail. I am very slow when it comes to making diagrams. Perhaps I will simply scan my log notes.

With every mistake, we must surely be learning
– George Harrison

Two-Man Programming

I can’t emphasize enough the power of two-man programming. Eventually Ted entered the fray. Just having Ted look over my shoulder, point out silly mistakes, ask pointed questions when a particular piece of code didn’t make sense, and tell me to hang in there during particularly rough stretches made a huge difference. Ted also pushed me to write additional debugging code. One function, dumpPatternState(), was indispensable. Also, the Wing IDE (I received a free copy as a result of participating in the PyCon sprints) saved our gravy on numerous occasions. More work got done over the course of three weeks than in the prior two months.

To run fast, run alone. To run far, run together.
– African Proverb

To add to the surrealism of the project, much of the programming was done at the Notman House mansion, either at night or on weekends when the place was deserted. We worked under the cover of quiet. Visions of Tyler Durden’s dilapidated house in “Fight Club” sprang to mind whenever Ted and I recessed in the lounge, only to be annoyed by the hum of the loud vending machine. Thinking about the loud vending machine reminded me of a line from Tom Waits’ “Frank’s Wild Years”:

Never could stand that dog.

The Approach Part I: Interpreting Polyphonic C# Join Patterns in Stackless Python

To give the reader a flavour of join patterns, here is an example from “Modern Concurrency Abstractions for C#”:

public class Buffer 
{ 
    public string Get() & public async Put(string s) {
      return s;
    }
}

A join pattern or chord consists of a header (in this case Get() and Put()) and a body (return s). The join pattern’s header methods either send messages or receive them. Furthermore, these methods are invoked either synchronously or asynchronously. The body (or lambda) is invoked only when all of the pattern’s methods have been called. What is nice is that the and operator (‘&’) indicates that this is a conjunction.

Since the thread invoked Get() synchronously, it would block until the pattern’s remaining methods are called. If the asynchronous method Put() were invoked first, the thread would go about its merry way and the message would be enqueued. When there is a match, the body (return s) is executed in the receiver’s thread; the receiver is unblocked and the value returned. If the receiver happened to be asynchronous, a new thread would be created.

Something that worried me was that Stackless Python does not have direct support for asynchrony. As a matter of fact, in the literature, synchronous channels are implemented as a join pattern. This is an indication of the expressive power of join patterns. In Stackless Python, it is possible to achieve asynchrony in two ways:

  • Subclass the stackless.channel class and create a buffered channel.
  • Create a new tasklet to handle a potentially blocking operation and let the parent tasklet proceed on its merry way (see the sketch after this list).
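
Here is a minimal sketch of the second approach. async_send is my own name rather than part of the Stackless API; the release() helper in the Dining Philosophers code later in this post does the same thing.

import stackless

def async_send(channel, value):
    # fire-and-forget: a throwaway tasklet blocks on the synchronous
    # send() so the calling tasklet can continue immediately
    def _send():
        channel.send(value)
    stackless.tasklet(_send)()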

I made two additional assumptions:

  • Functions that return values are equivalent to a channel’s receive() method. That is, buffer.put() and buffer.get() are equivalent to buffer.send() and buffer.receive(). If this is so, why specify buffer.send() in the method signature, since a receive() can only be triggered by a matching send()?
  • Pattern bodies exist to do computations on internal messages and then return a value to a sender.

What if, upon pattern completion, rather than execute a function inside the sender’s tasklet, all the internally queued messages were transferred to the sender? What sort of expressiveness would be compromised with this simplification? Could this mechanism still be called a join pattern?
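
To make that simplification concrete, here is how the Polyphonic C# Buffer might be rendered with the join API used later in this post. This is only a sketch under the assumption that a join object built over a single channel behaves as it does in the Santa and Dining Philosophers examples (join() returning the matched channel operations); the Buffer class itself is my own illustration.

import stackless

class Buffer(object):
    def __init__(self):
        self.channel = stackless.channel()
        # a one-channel "chord": get() blocks until a put() has arrived
        self.chord = stackless.join().addPattern([self.channel])

    def put(self, s):
        # asynchronous in spirit: a throwaway tasklet blocks on the
        # synchronous send so the caller continues immediately
        stackless.tasklet(self.channel.send)(s)

    def get(self):
        # block until the pattern matches; the queued message is
        # transferred to us rather than a body being executed
        for chanop in self.chord.join():
            return chanop.value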

For now, my only real test is to see how many join pattern examples I can replicate with my implementation shortcuts……

The Approach Part II: Expanding stackless.select() to Support Join Patterns

stackless.select()

To understand how join patterns are implemented, it is necessary to understand select…..

select allows a tasklet to wait on multiple channels without resorting to busy polling. select is required due to the blocking nature of synchronous channels. The main difference between Go’s implementation and the Stackless Python prototype is that the former is a language feature and the latter is provided via classes. This turns out to have its strengths and weaknesses.

Internally, the Stackless Python prototype’s implementation of stackless.select() comes directly from Plan9’s libthread/channel.c …

Here is the select algorithm (Python is a great language for pseudo-code):

def select(operations):
    """
    The select function searches a list of channel operations
    if one or more channels are ready, one is randomly selected
    if no channel is ready, the tasklet suspends until a peer
    tasklet is ready

    returns a channel operation (_chanop)
    """

    choice = None
    source = getcurrent()
    numberReady = 0

    """
    check for operations for one or more ready channels
    use Pike's one pass algorithm
    """
    for operation in operations:
        if operation.ready():
            numberReady += 1
            if nrand(numberReady) == 0:
                choice = operation
 
    if choice:
       debug("[START OPERATION]")
       choice.action()
       debug("END OPERATION]")
    else:
       debug("START SUSPEND PROCESS")

       for operation in operations:
           debug("entering operation add")
           operation.add()
           debug("exiting operation add")

       schedule_remove()
       schedule()
 
       choice = source._operation
       source._operation = None
       debug("RESUME PROCESS")
 
    debug("exiting select")
    return choice 

Changes to stackless.py to support stackless.select()

To support select, it was necessary to represent channel operations as classes. Hence the chanop (channel operation) class and the channel.receiveCase() and channel.sendCase() methods that generate the proper chanop (for receive and send operations, respectively). The Stackless channel object’s queue was modified to be populated by chanops rather than tasklets.

The biggest change was the implementation of the select algorithm. Note that in the select algorithm, it is the chanop via its ready() method that determines whether a tasklet ought to block. A chanop is akin to Dijkstra’s guard (guarded command). In turn, select is a disjunction of guards.

If so, why can’t a guard be composed of guards? For instance, if we want to wait for three elf channels to be ready for receiving, why can’t we express this as some variation of:

elf[0].receiveCase().ready() and elf[1].receiveCase().ready() and elf[2].receiveCase().ready()

It turns out we can; huzzah!

aJoinPattern = stackless.joinPattern([list of channels])

Consequently, the joinPattern and chanop classes share a common interface: abstractGuard. The joinPattern class has methods such as the following (a sketch of the shared interface appears after the list):

  • ready() : is the channel operation ready. That is, is there a matching channel operation on the queue?
  • add() : add the channel operation to the appropriate channel queue
  • action() : transfer data
  • remove() : remove unused channel operations from the remainder of select’s channel queues and tear down other data structures.
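
For illustration, the shared interface might be sketched as follows; the method names mirror the list above, but the class itself is my own rendering rather than the prototype's actual code.

class AbstractGuard(object):

    def ready(self):
        # is there a matching channel operation on the queue?
        raise NotImplementedError

    def add(self):
        # enqueue this operation on the appropriate channel queue(s)
        raise NotImplementedError

    def action(self):
        # transfer data once the guard has fired
        raise NotImplementedError

    def remove(self):
        # withdraw unused operations from the remaining channel queues
        # and tear down any other bookkeeping
        raise NotImplementedError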

(Hint: follow along with the select algorithm.)

In this fashion, anywhere a chanop is used in the select algorithm, a joinPattern can be substituted. Hence the select algorithm functions unaltered. An added bonus is that one can also support a disjunction of conjunctions. In other words, an orElse allowing for multiple join patterns.

It is interesting to note that in the Plan9 implementation, a channel method is implemented as a select with one operation.

def send(self, msg):
    """
    channel.send(value) -- send a value over the channel.
    If no other tasklet is already receiving on the channel,
    the sender will be blocked. Otherwise, the receiver will
    be activated immediately, and the sender is put at the end of
    the runnables list.
    """
    return select([SendChanop(self, msg)]).value

Consequently, a join pattern can be implemented as a select with one operation, albeit a compound operation.

def join(self):
    """ wait until all cases are ready """
    return select([self])

Join Patterns

My main design goal was to maintain a synchronous style of programming, in keeping with Stackless Python. I also wanted to maintain backward compatibility with the Stackless API. In the process I also wanted a “you only pay for what you use” policy: if one does not want join patterns, regular code should not suffer a noticeable penalty. Finally, I wanted to localise changes as much as possible. This meant not altering the select algorithm. The scheduler was off limits, as was removing anything from the stackless.channel API.

The Santa Claus Problem with Join Patterns

The following is the nucleus of the Santa Claus Problem implemented with an initial version of join patterns. Although properties like patterns exist because of the lack of language support, this version contains much less clutter than the select-based version.

def santa(reindeer, elves):
    print "in santa"
    joinObject = stackless.join().addPattern([ch for _, ch, _ in reindeer]).\
                                  addPattern([ch for _, ch, _ in elves],3)

    reindeerPattern, elfPattern = joinObject.patterns

    while True:
        pattern = joinObject.join()
        if reindeerPattern.ready():
            reindeerPattern.join()
            pattern = reindeerPattern
        if pattern is reindeerPattern:
            harness(reindeerPattern)
            deliveryToys(reindeerPattern)
            unharness(reindeerPattern)
        elif pattern is elfPattern:
            consultWithSanta(elfPattern)

An interesting note: there are two cases that have to be checked. The first case is when both reindeer and elves are ready from the get-go and the select algorithm randomly picks the elves, removing the reindeer pattern as part of the select algorithm’s clean-up. The select algorithm has no notion of priority. Consequently the Santa tasklet must check the reindeer pattern to see if indeed there are nine reindeer ready.

The second case is more interesting. Let’s pretend eight of the reindeer have arrived, but three elves have arrived, completing the elf pattern. As in the first case, the reindeer pattern is removed and stackless.select() schedules Santa. Again, the algorithm does not give Santa scheduling priority (perhaps it ought to; this would be easy enough to do with the Stackless API). More importantly, exiting select() does not mean the eight reindeer are thrown away. Rather, the eight reindeer are still sleeping on their channels. Now assume the runnable scheduler queue looks like this:

(ELF[1], ELF[2], RUDOLPH, SANTA)

By the time Santa is the currently executing tasklet, all nine reindeer are ready. However the Santa tasklet will not know this unless it explicitly checks the reindeer pattern.

Got to love concurrent programming….

Two for the Price of One: The Dining Philosophers Problem

It is safe to say that implementing the Santa Claus Problem with join patterns was a success. The only change to the existing Stackless Python API was rendering a property, channel.balance, irrelevant (internally I moved to double queues). After a “sneak peek” got a bad reception, I concentrated more on the API. I was torn between the more verbose but descriptive style used by Concurrent Visual Basic and the chaining style used by Twisted’s Deferred. The Twisted style has won for now.

In retrospect, the Santa Claus Problem was not the stress tester. The must-have features were barrier synchronisation and the ability to atomically move the data while still operating in the safety of the “scheduler” (for conceptual simplicity, I include the channel as a part of the scheduler).

The distinction of ball breaker went to the Dining Philosophers problem. I tried to write an implementation similar to the one posted in “Scalable Join Patterns.” My initial solution failed. Why? Because I needed asynchrony. It didn’t help that I initially forgot to set the chopsticks! In lieu of an asynchronous send, I launched a tasklet that would block on the synchronous send. Meanwhile the philosopher tasklet in question could go on its merry way.

The Dining Philosophers

This is the nucleus of a solution to the Dining Philosophers problem with join patterns:

def release(aChannel, name):
    def __release():
        aChannel.send(name)
    stackless.tasklet(__release)()

def thinker(name, thinking, left, right):
    print "Philosopher %s left=%s right=%s" % (name, left.label, right.label)
    utensils = stackless.join().addPattern([left, right])

    while True:
       print name, "thinking"
       print name, "ready to eat"    
       result = utensils.join()
       for p in result:
           print p.value, " relinquished chopstick ", p.channel.label

       print name, " finished eating"
       release(left, name)
       release(right, name)

Originally I had the philosophers wait an arbitrary amount of time before attempting to eat. However, I removed the timing to better ensure that the round-robin scheduling produced the pathological case of all philosophers grabbing their left chopstick and then their right chopstick. Still, the solution worked. I examined the logging to see whether message stealing occurred. Nope. This initially threw me for a loop. I have to admit, I was skeptical about why join patterns work for the Dining Philosophers.

To the best of my knowledge, the following is happening. As per join patterns’ behaviour, the philosophers do not hold the chopsticks unless both can be acquired simultaneously. I suspect the join pattern machinery inside the scheduler takes on the role of the butler. However, the non-preemptive nature of the scheduler plays a role too: it allows the pattern matching in the active tasklet to check the availability of both chopsticks without producing a context switch. This greatly simplifies matters. Even if there were a context switch, the solution ought to still work since the message stealing rules would come into play. Regardless, the scheduler is acting like the butler of the Dijkstra solution. The important point is that join patterns’ declarative style allows one to get the butler for free.

Join Patterns, Concurrent ML, Transactional Events and Software Transactional Memory

I corresponded with the authors of “Scalable Join Patterns”: Claudio Russo of Microsoft and Aaron Turon of Northeastern University. Claudio and Aaron were kind enough to recommend references and put up with my pestering about why join patterns work in terms of the classic deadlock conditions. Two of the major takeaways from the “Scalable Join Patterns” paper are the use of atomic transactions and lock-free data structures. Fine-grained locking is the great killer of performance and composability.

Amongst the recommendations were John Reppy’s papers on Concurrent and Parallel ML and Kevin Donnelly and Matthew Fluet’s Transactional Events. I will admit that my reading of the papers leaves much to be desired, since I tend to skim over the formal definitions and fixate on the parts that have a family resemblance to what I am doing.

Like Stackless Python, Concurrent ML uses synchronous channels. Concurrent ML’s goal is to create a model that separates the description of synchronisation mechanisms from their operation. Concurrent ML sports a small number of operations (i.e., spawn) and primitives (i.e., channels, events). Events (receiveEvt, sendEvt) in Concurrent ML are very much like sendCase/sendChanop and receiveCase/receiveChanop in my prototypes. Combinators such as guard, choose, wrap and sync (which synchronises processes) are used to create more complex concurrency constructs.

The problem Concurrent ML is trying to address is that of composability. Composability involves the scalability of concurrency constructs. Issues surrounding composability appear in the Stackless section of the PyPy literature.

Transactional Events builds on the Concurrent ML model. I won’t get into Transactional Events because I currently do not understand the nuances between it and Concurrent ML well enough to explain them coherently. Like Concurrent ML, Transactional Events aims to allow more flexible code modularisation (with an emphasis on protocol design). Like software transactional memory, atomic transactions are used in lieu of fine-grained locking. And if I understand the literature and code examples, it is easier to express thread interactions with Concurrent ML and Transactional Events than with software transactional memory.

I will only truly know once I start writing examples with STM…..

Takeaways

I feel some important takeaways are:

  • There are languages with concurrency models similar enough to Stackless Python’s. Moreover, these more mature systems have tackled problems that Stackless Pythonistas have encountered and answered with questionable solutions (i.e., lock convoys). There is a lot to learn. Why re-invent the wheel for want of reading a research paper?
  • There are numerous language implementations out there. It is important to play with them. This is a cheap way of learning what can and can’t work with Stackless Python (or Python).
  • stackless.py and PyPy ought to play a bigger role in prototyping new solutions for Stackless Python (if not Python). As was shown in “Prototyping Go’s Select,” the difference between prototyping a feature in Python and in C was roughly 150 lines versus 1000 lines, not to mention the difference in time and skill required.

Status of stackless.py with Join Patterns

There is still much work to be done on the stackless_V3.py module. For now, I will move my work into GitHub and give access to whoever wishes to look at it. When things are sufficiently stabilised, I will move a version into the Stackless repository. In the immediate future, this is my TODO:

  1. Write tests for join patterns that share common channels.
  2. Remove internal counting for M out of N joins (implementation detail).
  3. Automate the tests.
  4. Implement channel preferences for join patterns.
  5. Add asynchronous buffers.
  6. Refine internal channel operations (implementation detail).
  7. Support greenlets and continulets.

Conclusions

It may be questionable whether what I implemented is indeed join patterns. Perhaps message passing along with barrier synchronisation and atomicity is good enough to cost-effectively solve a wide variety of concurrency problems? Nevertheless, I have been able to implement two notorious concurrency problems, the Santa Claus Problem being one of them. I was able to do this by modifying the existing stackless.py code base. stackless.py accompanied by Python provides a surprisingly rich toolbox for experimentation.

It is possible to write a faithful implementation of join patterns (I have ideas). However, I don’t know if the gains would be worth the effort (why create machinery to build a synchronous channel when Stackless already has one?). If anything, I believe more bang for the buck can be gained from exploring the concepts featured in Concurrent ML.

stackless.py

As Stackless features are better integrated with the JIT, this will open up new opportunities. I believe that if performance is not an issue, then a designer stackless.py, rather than a C extension to Stackless Python, may be the answer. If one has a need for join patterns or some other esoteric concurrency feature, then a custom-tailored stackless.py becomes entirely feasible. If one does not want these features, then use stock stackless.py (or Stackless Python). At the end of the day, Stackless Python is not so much an application as a family of related products and technologies that can be tailored for different purposes. Otherwise, how does one explain the success of gevent and other greenlet-based solutions?

For myself, I can see join patterns and some of the Transactional Events features simplifying the construction of SOA (Service Oriented Architecture) based applications. It was the desire to prototype WS-BPEL processors that got me involved with Stackless Python.

Final Thoughts

Once again, I found it a fulfilling experience to implement join patterns for Stackless Python. Whether join patterns get into Stackless Python is another story. I can’t describe the thrill of reading a paper, saying “yeah, I think I can implement that!” and having a language that allows one to fearlessly dive in. Okay, I exaggerate: one takes a deep breath first….

Season's Greetings,
Andrew

Prototyping Go’s Select and Beyond …

July 24, 2010

Hi Folks:

On Monday July 19th, I gave the talk “Prototyping Go’s Select in stackless.py for Stackless Python” as a part of EuroPython 2010. It was my pleasure to be invited to speak. Moreover it was great to see Christian Tismer receive an award for his contributions to the Python community.

This final version of the talk is sufficiently different from the original dry run. In large part this is due to Kevin Bulušek’s amazing efforts in implementing a Stackless Python version of select, which expanded the talk’s size. I also wanted to better emphasize the power of prototyping Python in Python. Finally, I wanted to better articulate the differences between Stackless Python’s and Go’s concurrency models. How could two languages sharing a common ancestor (Limbo) and a very similar API and object model be so different?

Regardless of whether one is a newbie curious about Stackless or a seasoned Stackless Python developer, I feel there is something for everyone in “Prototyping Select.” I also can’t stress enough the fun I had preparing the material for this talk. As for the reliance on Rob Pike’s Newsqueak paper and Plan9’s libthread/channel.c: well, imitation is the sincerest form of flattery…..

Here is a link to the revised slides of Prototyping Go’s Select with stackless.py for Stackless Python, and the original. Soon I will place examples, a new version of stackless.py and a Stackless Python patch in the Google repository. Since there are many things said during the talk that didn’t make it into the slides, I will try to write a paper within the next two weeks.

Beyond

As I stated in the talk, neither stackless.py nor the C code is ready for prime time. I am writing new tests and debugging. Hopefully the PyPy and Stackless Python communities will tell me the proper way to introduce the new code so others can learn, use and improve it.

What’s next? Well, I attended the PyPy sprint and took the first baby steps towards learning how to implement select as a language statement. This is what it looks like:

compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | select_stmt
select_stmt: 'select' ':' select_suite

def test_select(self):
        input = """select:
                          case a:
                                  pass
                          case b:
                                  pass
                """
        tree = self.parse(input)

        import pdb;pdb.set_trace()

Thanks Antonio and Armin! I will also soon start experimenting with implementing join patterns and event pattern matching. So stay tuned!

Cheers,
Andrew

The Initial Version of Stackless.py with Select

June 9, 2010

Hi Folks:

Introduction:

Enclosed is a link to the initial copy of stackless.py with select. This is different from the EventHandler class I presented at the dry run at Montreal Python 13 on April 26th. That version was completely clean-roomed from Rob Pike’s paper The Implementation of Newsqueak. Although I wrote the implementation, most of the credit has to be given to my friend Kevin Bulušek for doing a clever Stackless Python mockup that I referenced. I also have to thank Russ Cox for pointing me to the Plan9 implementation at http://swtch.com/usr/local/plan9/src/libthread/channel.c and answering a few questions.

Usage

Select is now a tasklet method.


tasklet.select(list of chanops)   # returns (channel, operation, value)

channel.receiveCase()             # returns a receive chanop

channel.sendCase()                # returns a send chanop

An example:

import stackless

def sender(ch, val):
    ch.send(val)

def receiver(ch):
    val = ch.receive()

def selector(a, b, c):

    print "selector: begin"
    count = 0
    while count < 3:
        ch, operation, value =  stackless.getcurrent().select([a.sendCase("A"), \
                                                       b.receiveCase(), \
                                                       c.sendCase("C")])
        if ch == a:
           print "sender A completed"
        elif ch == b:
           print "received ", value, "from receiver B"
        elif ch == c:
           print "sender C completed"
        else:
           print "should not get here"

        count += 1

    print "selector: end"

if __name__ == '__main__':
    a = stackless.channel("CHANNEL-A")
    b = stackless.channel("CHANNEL-B")
    c = stackless.channel("CHANNEL-C")

    stackless.tasklet(selector, "SELECTOR")(a, b, c)
    stackless.tasklet(receiver, "RECEIVER-A")(a)
    stackless.tasklet(sender,"SENDER-B")(b, "B")
    stackless.tasklet(receiver,"RECEIVER-C")(c)
    stackless.run()

Of course, a dictionary (hash table) can be used in lieu of the if statements. Also, unlike the Go/Newsqueak version, cases can be dynamically added and removed. I believe the only real limitation is that a channel cannot do both a sendCase and a receiveCase at the same time.
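
For example, the if/elif chain in selector() could be replaced with a dictionary keyed on the channel. The handlers below are my own sketch (sys.stdout.write is used only because Python 2's print is a statement and cannot appear in a lambda):

import sys

handlers = {
    a: lambda value: sys.stdout.write("sender A completed\n"),
    b: lambda value: sys.stdout.write("received %s from sender B\n" % value),
    c: lambda value: sys.stdout.write("sender C completed\n"),
}

ch, operation, value = stackless.getcurrent().select([a.sendCase("A"),
                                                      b.receiveCase(),
                                                      c.sendCase("C")])
handlers[ch](value)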

Under the hood, the main changes are: 1) there is a new object, chanop (channel operation); 2) channel has two new methods, receiveCase() and sendCase(), that return chanops (a chanop is meant to be manipulated by stackless, not by the programmer); 3) channel now has a queue of operations rather than tasklets.

I would appreciate it if folks could download the code, play with it and make comments. Remember, the code is still rather rough. It would be nice to see how easy or hard it is to implement select in other stackless.py based implementations 🙂 In the meanwhile, I will write tests and clean up the code. I also want to further experiment with implementing join objects à la JoCaml and Polyphonic C#.

Cheers,
Andrew

Prototyping Go’s Select for Stackless Python with Stackless.py

May 2, 2010

Dear Folks:

On April 26th, 2010, I did a dry run of the talk Prototyping Go’s Select for Stackless Python with Stackless.py. I thank everyone who attended my talk despite the hockey game being on at the same time. I also deeply appreciate the feedback! Hopefully this talk will be accepted for EuroPython 2010. The motivation for the talk was simple. Stackless Python and Google’s new language, Go, borrow their concurrency model from Bell Labs’ Limbo. However, Stackless Python does not have the select statement. Select allows a coroutine to monitor multiple channels without resorting to polling. I find the idea of Stackless Python written in Python to be a powerful one. How hard would it be to implement select functionality in Stackless Python? How easy would it be to use PyPy’s stackless.py package to prototype the feature? Does stackless.py deviate so much from Stackless Python as not to be a good estimator of the required time and effort?

As Stackless Pythonistas pointed out, select-like functionality could be implemented with additional tasklets (a sketch of this approach follows). I understand this approach fits in with the tao of Stackless Python: customise through subclassing and the API, and keep Stackless Python small. For the most part I agree. However, in the case of select, how efficient would this be? It has also been argued that no one wants select. Well, we all know the adage about hackers wanting to write software they will use. Based on my work with WS-BPEL and business process orchestration, I have implemented workarounds, so I see uses for select-like abilities in the areas of complex event processing and workflow patterns. Perhaps these features could make Stackless Python attractive to a new audience?
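
For reference, the workaround with additional tasklets might look something like the following sketch: one forwarding tasklet per channel funnels messages onto a single merge channel, so the consumer only ever waits on one channel. The names forward, select_like and merge are mine, and the sketch only covers the receive direction, which hints at the efficiency question raised above.

import stackless

def forward(source, merge):
    # one of these per monitored channel; it blocks so the consumer doesn't
    while True:
        merge.send((source, source.receive()))

def select_like(channels):
    merge = stackless.channel()
    for ch in channels:
        stackless.tasklet(forward)(ch, merge)
    return merge

# consumer side:
#     merge = select_like([a, b, c])
#     source, value = merge.receive()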

We will soon be in a position to experiment and benchmark….

Incidentally, I forgot to tell the audience that I intended to submit the talk to EuroPython 2010 and that I wanted to iron out the bugs. My apologies. Based on feedback, I will restructure the talk. However, I will post the original slides and code soon. Really. For better or worse, the video will be out there.

Prototyping is still very much a work in progress. I am still learning many things about the Go language and Stackless Python internals. For instance, Go exposes more of its concurrency system than I thought, through its runtime and reflect packages. Although it doesn’t alter the big picture, this new information will be reflected in the updated talk and paper.

As of April 26th, I have implemented a few versions of a select-like feature in stackless.py. For now, select (called eventHandler) is a class rather than a language feature. These implementations have been very conservative. I will write more versions in the days and weeks to follow. Meanwhile, my friend Kevin Bulušek has bravely taken the plunge and is implementing the changes in the Stackless Python implementation needed to support select. Working with Kevin reminded me of a line uttered by Ben Affleck’s character in the film Good Will Hunting: “My boy’s wicked smart.”

To date, Prototyping is the most esoteric of the three talks I have given. It has been the most fun to research and do. A lot is happening in the background, and some of the action won’t fit into a thirty or forty-five minute talk. For instance, I think the Agile technique of two-man programming is really effective. I am also excited about Christian’s work on integrating Stackless Python with Psyco: yet another reason to avoid programming Python extensions in C. And my C is really rusty. It was a blast to take the algorithms out of Rob Pike’s paper The Implementation of Newsqueak and implement them in Python. Although stackless.py is a low-priority concern for the PyPy team (and not maintained), I think there is much potential in using stackless.py. And each time I look at the PyPy blog, the team seems to be making significant gains.

Most important for the current project, Stephan Diehl and Carl Bolz pointed out an important trick of the trade: use stackless.py with the greenlet package running on top of standard Python. Previously I used the PyPy interpreter, which was so slow. And forget compiling to pypy-c. It is all about high clockspeed (a book I am currently reading, recommended by a former professor). Thanks guys! Christian Tismer also gave helpful advice and was correct in saying that the C code base would have to be altered to support select. I originally thought we could get away with no changes….. My first prototype gave me the wrong impression.

During the course of Prototyping, I attended a McGill management school talk by John Seely Brown called “Power of Pull: How Small Moves, Smartly Made, Can Set Big Things in Motion.” That is the name of Brown’s new book in collaboration with John Hagel. I had heard of John Hagel before because I read “Out of the Box,” which dealt with web services in business. This topic was right up my alley. A central theme of Brown’s talk was how to create high-performance organisations. Drawing from cases ranging from elite surfers to Amazon to WoW guilds, Brown listed some simple principles:

  • Deep collaboration.
  • Fail fast.
  • Use frame-by-frame analysis to improve.
  • Look at adjacencies.
  • Look for spikes of capabilities.

A lot of the principles applied to my current adventure. For instance, prototyping with stackless.py allowed me to fail fast. Believe me, I made many mistakes; the revision control system has the bones of many versions nipped in the bud. It helped that Kevin acted as a springboard and a sanity check. Occasionally Kevin would say, “remember the KISS principle,” and all I could do was sheepishly grin (funny, I just read a New York Times article called “It’s Complicated: Making Sense out of Complexity” that talked about Americans’ love affair with the complicated). As for adjacencies, I closely looked at Rob Pike’s Newsqueak paper and reviewed the various Google videos on Newsqueak and Go. And I have dabbled with Erlang. Perhaps the principle that didn’t dawn on me earlier was “look for spikes of capabilities.” I decided to be audacious, subscribe to the golang-nuts newsgroup and post questions. I was pleasantly surprised when ‘Commander’ Rob Pike himself, Russ Cox and Ian Taylor kindly indulged me and answered my questions. Looking at the Plan 9 code was really, really helpful, although there are parts, like the lock acquisition, that I still haven’t wrapped my head around. At the end of the day, Stackless Python and Go are *different* languages built for different purposes despite having a common ancestor. Nevertheless, I feel Newsqueak, Plan 9 and Go will show where the dragons may be for a C Stackless Python implementation. Perhaps once I have enough of a framework in place, I can find optimisations: one can hope to dream 🙂

Here is a copy of the original talk. Warts and all.

Cheers,
Andrew

A Survey of Stackless Python First Edition

October 14, 2009

Hi Folks:

On September 30th, 2009, I gave a talk entitled A Survey of Stackless Python to the Montreal Python Group. I thank the Montreal Python Group for having me, and it is always a pleasure to talk about Stackless Python. The talk was well received despite hiccups. I am sure I will cringe when I watch the video. (As a teenager, I cringed watching my performance on a TV game show, even though my team won.) I believe I fumbled the part on channel preferences, execution and response times, as well as the pronunciation of Gregorio’s name. I kept on thinking ‘Grogono’ (a computer science professor at Concordia University). I am sure there are more gaffes.

I am tempted to say that the slides are better than the video.

A number of cool talks inspired “Survey”: Joe Gregorio’s (Lack of) Design Patterns in Python, David Beazley’s A Curious Course on Coroutines and Rob Pike’s Advanced Topics in Programming Languages: Concurrency and Message Passing in Newsqueak. Also at the time, I had design patterns on my mind as a result of reading Bruce Eckel’s work-in-progress Python 3 Idioms and Design Patterns. One of my favourite computer science courses was Software Design Patterns with Dwight Deugo.

What I liked about the three talks is the way they interconnect. Towards the end of his really informative talk, David Beazley raises some legitimate questions and comments: “Is it really worth it to write your own multi-tasker?” and “Threads are a well understood model.” With Gregorio, the answer is “well, the language should incorporate the multi-tasker design pattern as a feature.” And with Pike: “threads are low level” and “(channels as a part of) CSP is a well understood model.”

I feel the combination of the ideas in these talks provides a framework for discussing Stackless Python and other concurrency frameworks in a more systematic fashion. Essentially, Stackless Python provides a powerful built-in design pattern: the Active Object. And the relatively easy-to-implement but lesser-known Half-Sync/Half-Async pattern gives the Stackless programmer (and his or her choice of asynchronous networking package) the best of both worlds between synchronous and asynchronous concurrency models. This is a major theme of the talk.

I know I am bad at following up on blogs and talks. However, in the weeks to come I will be posting a new version of the talk. I want to clarify some facts, i.e., memory usage and a tasklet’s state machine (there is a little more to it). I want to provide benchmarks between, say, Stackless Python and a generator-based solution. I want to include a section on pickling. I want to tie all the networking techniques together through the use of a small web server (I would like to see what a Newsqueak web server looks like). To provide context, I will discuss what got me into the wonderful world of Stackless Python (Service Oriented Architectures). Finally, I want to do some learning (then again, I am always learning something new with Stackless) in the form of seeing how easy (or difficult) it is to add new language features via the future of Stackless, PyPy.

Cheers,

Andrew

The Adventures in Stackless Python and Twisted Continue…

March 18, 2008

Hello Folks:

I recently gave two talks at PyCon 2008. The first was “Stackless 101,” where I was a substitute teacher for Christian Tismer. If it weren’t for Christian Tismer and the wonderful work he has done, I wouldn’t have had any Python adventures worth telling. My own talk was entitled “Adventures in Stackless Python/Twisted Integration.” It was difficult to cover all the issues in a twenty-five minute talk. Unfortunately there are problems with the original slides.

However this is the Web. There is no reason why I can’t use the feedback to correct mistakes and strengthen arguments. It would be a bigger mistake to leave the slides as is and let important things go unsaid.

In the days and weeks to follow, I will post a new edition of “Adventures in Stackless Python / Twisted Integration.” I will also post coding examples.

That said, I cannot stress enough the fearless programming aspects of Python, Stackless Python and Twisted. I first learnt about the notion of “Fearless Programming” in the May 1987 IEEE Software article “Experimental Prototyping in Smalltalk.” I have long since forgotten Smalltalk, but I still remember the sense of fearlessness, a fearlessness that I now experience with Python. Moreover, prototyping WS-BPEL constructs with Stackless Python and Twisted allowed me to get a far better grasp of the issues than sitting in committee meetings!

I guess the best way to learn a specification is to implement it….

Yes, there is some investment involved in learning Stackless Python and Twisted. However, one will be amazed at the type of programming projects one can tackle with even a little Stackless Python and Twisted. One of the goals of this blog is to encourage the newbie to hang in there.

Cheers,
Andrew