This is a blog about Joose and my other mostly JavaScript-related projects. The official Joose homepage is located at Google Code.
Showing newest posts with label bespin. Show older posts
Showing newest posts with label bespin. Show older posts

Wednesday, August 26, 2009

Running the Bespin command line from the, well, the command line

I felt a little masochistic today which left two possibilities open:
  1. fix IE6 bugs in (pick any project)
  2. program in AppleScript
I chose #2.

The following script will execute it's command line arguments as commands inside bespin (if it is running in the current tab of Safari).
on run argv
tell application "Safari"

set docs to every document

set cmd to ""
repeat with c in argv
set cmd to cmd & c & " "
end repeat

set js to "(function () {if(typeof bespin != 'undefined') { return bespin.get('commandLine').executeCommand('" & cmd & "').output.replace(/]*>/g, '\\n').replace(/<[^>]*>/g, '') } else { return '' }})()"

set out to ""

repeat with doc in docs

set output to do JavaScript js in doc

set out to out & "
" & output

end repeat

out

end tell
end run


Wrap this in a little shell script:
#!/bin/bash
cp bespin.scpt bespin.work.scpt
osascript bespin.work.scpt "$@"

(Don't ask about line two (it is part of the masochism experience :))

... and now stuff like this works:
-bash-3.2$ ./bespin set | grep collab
collaborate = off

Wednesday, July 22, 2009

Google Chrome's Very Incomplete Web Worker Support

In bespin we use a facade that allows us running JavaScript code in Web Workers, Gears Workers and if those are not available in the main thread.

Google Chrome does not yet support Web Workers but since it has Google Gears built in it should use the appropriate fallback. Turns out it didn't with our original code.

I naively implemented object detection to check for the availability of the Worker-object like this:
if (typeof Worker == "undefined") {
For some reason the statement above is actually true in Chrome 2, even though as stated above support for the Worker API has not been implemented.

I then tried to instantiate the Worker object. All this does is to throw an exception with the message "Worker is not enabled". This looks like an unfinished implementation that was only partially removed or something in that direction.

This code handles the special case:
var noWorker = typeof Worker == "undefined" ? true : false;
if(!noWorker) {
try { new Worker("test.js") }
catch(e) {
e = e + ""
if(e.indexOf("Worker is not enabled") != -1) {
noWorker = true;
}
}
}
I'd be very interested why this fragment of the worker implementation was left in the code? Most likely it is a bug but it is questionable whether it will be fixed since the next version of Chrome will actually support the Worker API.

Thursday, May 28, 2009

Transparent Client-Server PubSub implementation details

For those of you interested in source code, here are the links to the implementation of my prototype of a transparent publish-subscribe system between client and server:

Server
  • three new URLs are "exposed":
  • /event/forward accepts events published on the client and publishes them on the server
  • /event/bind send the information about which events the server would like to subscribe to to the client
  • /event/listen is a comet connection which is used to transfer events published by the server to the client
  • lib/bespinserv/pubsub.js implements the actual pubsub mechanism in the server.
  • a java.util.concurrent.locks.ReentrantLock() is used to implement the lock for the comet implementation. This was taken and slightly modified from the ServerJS comet example.
  • Whenever a subscription is fired on the server, the event queue (and thus the associated client) is saved in a global variable (remember JS is single threaded) so that when the server does a publish we know where to send the event.
  • Multicast publishes are not supported.
Client
  • Upon "/event/bind" the client creates stub event subscriptions which forward the actual event to the server using /event/forward
  • The client connects to /event/listen. Once it returns the client publishes the returned event and starts listening again immeditately.
Please excuse the messed up indentation, I was using a new editor (not bespin :).

I'm personally not convinced that this architecture should be used because the beauty of REST goes away. On the other hand, I feel that the artificial border between client and server does not make sense for some applications and thus it is great if it goes away. Opinions?

Wednesday, May 27, 2009

One event loop to rule them all and in the darkness bind them

Based on my earlier work to bring transparent pub-sub semantics to web workers I implemented a pub-sub framework that does the same for server sided JavaScript.
The implementation is built on the excellent ServerJS platform and Kevin Dangoor's early prototype ServerJS bespin backend.

This means that you can now do something like this on the server side:
bespin.subscribe("test", function (event) {
bespin.publish("test:receive", event)
})
and then do
bespin.publish("test", { some: "data" })
on the client. Because the server listens for that event, the event defined above will now fire. In the code snippet above the server immediately fires "test:receive". If now the client is subscriped to that event, this subscription will fire.

Events are transferred to the server via simple HTTP requests. Also the client maintains a comet connection to the server which is used to transfer events which are published on the server to the client.

Based on my twitter stream the implementation of this protoype took under 2 hours. Considering it was my first ever ServerJS application, I can only applaud the people behind ServerJS to the awesomeness of their work. While the current state is still somewhat experimental, it is amazing how much was achieved in so little time.

PS: It is only bespin.subscribe above because I did: var bespin = require("./pubsub").pubsub;

Monday, May 11, 2009

One event loop to rule them all

While experimenting with web workers in bespin I made a small change that makes working with them a lot nicer. In bespin we make heavy use of custom events to loosely bind things together in the application. So you see a lot of
bespin.publish("something:happened", { ... })
and
bespin.subscribe("something:happened", function () { ... })
The change I made is that you can now do the same thing in workers and it will Do The Right Thing (tm). Meaning you can subscribe to events that might be triggered by the UI or anywhere else from within the worker and also publish events that might be observed by handlers which live in the "main page". Overall this makes working with workers much more seamless and first class from a bespin perspective because it means that as long as one is not doing any direct UI work (as opposed to sending events to UI components) one can do everything from the worker without building custom interfaces.

For me working on bespin is really an experiment on how to design an event driven client side web applications (or postmodern web application). One of the things that felt kinda awkward up until now was how to know the right point in time to initialize a particular component. You might want to wait until the settings have been loaded (asynchronously via Ajax) and another component has been initialized. All these events are, of course, signaled by custom events but the order might be totally random and they might have already happened when we start looking for them. The solution I came up with (whether it is a good one remains to be seen) is to have a function that checks a list of events against all events that have fired already and then waits for the rest of the events to fire and finally calls a callback when this is done:
    // ** {{{ assertEvents }}} **
//
// Given an array of topics, fires given callback as soon as all of the topics have
// fired at least once
assertEvents: function (topics, callback) {
var count = topics.length;
var done = function () {
if(count == 0) {
callback();
}
};
for(var i = 0; i < topics.length; ++i) {
var topic = topics[i];
if (this.eventLog[topic]) {
--count;
} else {
bespin.subscribe(topic, function () {
--count;
done()
})
}
done();
}
},

Wednesday, April 1, 2009

A structured outline view for bespin

After some major parse tree wrestling I finished an experimental structured outline view for JavaScript files in bespin. It's primary feature is that it can be modified at runtime to adapt to your favorite dialect of JavaScript.


While most programming languages have fix constructs like class, namespace or even event declarations, with JavaScript you have to build all these things from simple building blocks. On the other hand, real-world JS code usually follows quite strict rules to build these things. E.g. bespin itself uses dojo to declare "classes" (or constructors, whatever you want to call them) like this:
dojo.declare("bespin.parser.CodeInfo", null, {})
and this pattern to subscribe to and publish events:
bespin.subscribe("parser:error", function(error) {
bespin.publish("message", {
msg: 'Syntax error: ' + error.message + ' on line ' + error.row,
tag: 'autohide'
})
})
With my personal favorite object framework Joose you will find code like this:
Module("Bank", function (m) {
Class("Account", {})
})
While a programmer, who knows nothing about JavaScript will easily recognize the meaning of these statements, JavaScript has to actually execute them to derive the meaning.

My new outline generator works around this problem by being configurable through a kind of DSL which tells the parser more about the specific dialect of JavaScript that you are speaking. Here is the code to tell the parser about the specific pattern that bespin uses to publish and subscribe to events:

bespinEventPublish: {
declaration: "bespin.publish",
description: "Publish"
},
bespinEventSubscription: {
declaration: "bespin.subscribe",
description: "Subscribe to"
}

How it works:
At the heart of the outline generator is a visitor function that gets invoked for every node. It receives a stack of nodes that are lexically before and above it. It also receives a stack of indexes that tell us where a node lies within the child nodes of its parents. This allows finding predessors and successors of nodes above us. Long story short: Once we find a string like "subscribe", we can check our lexical predecessors for "bespin" and assume the next string node is the name of the event. Voila.

Other improvements:
I added an emulation for importScripts for Safari 4 workers that allows loading external source code from the worker. While this is specced, it is missing from Safari. Because Safari workers are also missing XMLHttpRequest I simply abuse the hosting page to do the requests and post the source back to the worker.
This allows removing the narcissus library from the worker bootstrap script and might soon enable us to remove packaging narcissus with bespin altogether.

And an improvement for people using my worker facade: You can now use console.log to write to the console from inside the worker.

Saturday, March 14, 2009

On-the-fly Syntax Checker and Code Outline View for Bespin

I continued to work on bespin and submitted my first complex patch for review. The patch builds on my work about moving JS objects into web workers and implements syntax checking as well as a simple code outline view for JavaScript. All these features are automatically disabled when you do not have access to web workers because syntax checking is too complex to be performed inside the UI thread.

The outline view is activated by typing the outline command:

The functions names are clickable and move the cursor to the line where the function is defined.

Syntax errors are displayed as short info on the bottom of the screen:

It might be possible to underline the relevant code, but I need to dive deeper into bespin to do this (plus get more info out of the JS parser).

While working on the code I made some discoveries about Gears and Web Workers:
  • Apparently you are not allowed to define a constant called Block inside a Gears worker
  • Safari 4 currently does not support any way to load source code into a worker (both importScripts and XMLHttpRequest are not implemented). For now I pasted all source into a bootstrap-script. Another possible solution would be to send a message to the main page asking it to make the http request and send back the result.
  • I removed the ugly hack to send source to the worker inside the hash part of the url and instead send the source to the worker via postMessage immediately after loading.
I tested the system in Safari 4, FF 3.1 beta and FF 3.0.x with Google Gears. The current patch only supports syntax checking for JavaScript. Other languages could be added in a similar fashion to language-dependend syntax highlighters, though.

Tuesday, March 10, 2009

bespin learns JavaScript

Today I hooked up bespin to Brendan Eich's JavaScript parser written in JavaScript.
Some evil recursive tree wrestling later, we can now find out the names, line and column numbers of all functions in a JS document. I also added some logic, to look up the tree for anonymous functions to see whether they were declared in an object literal, so we can use the key as the inferred name.

The parse tree could be used for all kinds of interesting things, like outlines and even IntelliSense(TM)(R). JS being a very dynamic language this, however, only goes so far. In particular systems like Joose or even dojo's simplistic object system do not go so well with static analysis.

As a next step, I'll move the parser to a web worker (I designed the API to be async, so this should be easy) and implement a simple outline view.

Monday, March 9, 2009

bespin extensibility / auto-indenting

bespin allows you to extend itself with JavaScript. Obviously this JavaScript is edited with bespin itself (I love StrangeLoops(TM)).
The easiest way to do this is to edit the config.js file within the standard BespinSettings project. Here is a sample that enables indenting after line breaks. When you hit enter the next line will have the same leading white space as the current line and when you hit enter after an opening { it will add an extra two spaces (Note that because this is my personal config.js I don't have to worry about configurable tab width :)
bespin.publish("bespin:editor:bindkey", {
key: "ENTER",
action: function (args) {
console.log("Pressed enter")
var editor = bespin.get("editor");
editor.ui.actions.newline(args);
var line = editor.model.getRowArray(args.modelPos.row).join("");
var match = line.match(/^(\s*)/)
var leadingWs = match[1];
var chunk = leadingWs;
var newBlock = line.match(/{\s*$/) ? true : false;

if(newBlock) chunk += " "

args.chunk = chunk;
editor.ui.actions.insertChunk(args)
}
})
Note1: More Info on the bespin custom events
Note2: For this to work you may need to enable the setting 'autoconfig': 'on' first
Note3: Seems like there currently is no easy way to say "In this case ignore me" for an event key event handler which would make it easy to do the right thing in case of an active selection.

Saturday, March 7, 2009

Offloading "arbitrary" JS Objects to Workers / Async drawing for bespin

While doing the experiment to marry Joose and bespin I got sucked directly into bespin development. I started digging into ways to offload part of bespin's painting work into web workers. Web workers are independent JavaScript processes which communicate with the primary page using message events. Because they run in different threads they don't block the UI of the web page and can thus safely do complex calculations without hampering it's responsiveness.

Offloading arbitrary objects into workers
I started with a prototype to offload the work bespin is doing to do syntax highlighting into workers. bespin currently cheats by only looking at the lines which are actually visible (many editors do this). This strategy is fast but might lead to errors when there is not enough information within the currently visible lines.
I'm a lazy person and I had some free time so I decided to build a more general solution which could be used to put all kinds of things into workers. Thus I build a framework that creates a facade for any given JS object that acts like the original object but delegates all work to another objects which actually has all the functions and state of the original object but lives inside a worker. Clear? :)


Turning the whole syntax highlighting framework of bespin into a system that runs in the background is now as easy as this:
this.syntaxModel = new bespin.worker.WorkerFacade(new bespin.syntax.Model());

Of course, there is a little more to it:
There are some important limitations for the objects which can be passed to workers:
  • The objects may not reference DOM nodes (because workers have no access to them)
  • The objects may not have circular references (this restriction could be lifted)
  • And most importantly: None of the functions may be closures. This might seem harsher than it is. It basically means that you need to maintain all state inside the object.
Also, all method calls on the facade will be async. I chose to do a jQuery-style fluid-interface for registering callbacks.

Aside: The Web Worker API
Creating a web worker is as easy as calling new Worker("myWorker.js"). The WorkerPool implementation is Google Gears also included an API to create Workers from a string of JavaScript source. Unfortunately this part did not make it into the official spec. My system needs this feature, so I tried a couple of work arounds. First I tried to put the source into a data URI and use that. I found out that this was even suggested when discussing the spec. Unfortunately at least Safari 4 does not seem to support data URIs for workers. (Should this be reported as a bug to the WebKit team?) The next solution I came up with was to use a small bootstrapping worker that loads the source from the hash-part of its URI. This works well but might have security implications.

Building the Facade
The actual facade object is basically a copy of the original object with all methods exchanged with methods to call into the background worker. The source is quite scary. If you, like me, like source code, please enjoy, otherwise no need to read it:
createFacade: function (obj) {

var facade = this;

for(var prop in obj) {
if(prop.charAt(0) != "_") { // supposedly we dont need "private" methods. Delete if assumption is wrong
(function () { // make a lexical scope
var val = obj[prop];
var method = prop
if(typeof val == "function") { // functions are replaced with code to call the worker
facade[prop] = function () {
var self = this;
var index = CALL_INDEX++ // each call gets a globally unique index
var paras = Array.prototype.slice.call(arguments);
if(this.__hasWorkers__) {
var data = {
callIndex: index,
method: method,
paras: paras
}
if(!USE_GEARS) data = dojo.toJson(data) // here we should really test whether our postMessage supports structured data. Safari 4 does not
// send the method to a worker
this.__getWorker__().postMessage(data)
} else {
// No worker implementation available. Use an async call using
// setTimeout instead
var self = this;
window.setTimeout(function () {
var retVal = self.__obj__[method].apply(self.__obj__, paras);
var callback = self.__callbacks__[index];
delete self.__callbacks__[index]
if(callback) {
callback(retVal)
}
}, 0)
}
// Return an object to create a "fluid-interface" style callback generator
// callback will be applied against context
// callback will be part of the mutex
// paras is an array of extra paras for the callback
return {
and: function (context, mutex, paras, callback) {
var func = callback
if(mutex instanceof bespin.worker.Mutex) {
mutex.start()
func = function () {
callback.apply(this, arguments)
mutex.stop()
}
}

self.__callbacks__[index] = function () {
paras = Array.prototype.slice.call(arguments).concat(paras)
func.apply(context, paras)
}
}
}
}
}
else {
// put instance vars here, too?
}
})()
}
}
},
Inside the source you see this:
this.__getWorker__()
By encapsulating access to the actual worker behind the facade it is possible to put multiple workers behind one object. I currently implemented a simple round robin scheduling, but this could of course be extended to use a more sophisticated mechanism (e.g. picking any currently idle worker) (Beware that multiple workers only work reliably for "stateless" objects).

Gears WorkerPool VS. Web Workers
The Gears WorkerPool API predates the Web Workers API. It is very likely that the Gears API will eventually change to more closely follow the W3C API. For now I implemented a simple facade for the Gears API to make it look like the Web Worker API.
This works quite well with some limitations:
  • DOM Message Events are currently not supported (Support could be added)
  • The current postMessage interface which is used to communicate with workers does not support structured data. My implementation uses Gear's native support for structured data and uses JSON for standard workers
Again here is the source for the facade if your care:
// If there is no Worker API (http://www.whatwg.org/specs/web-workers/current-work/) yet,
// try to build one using Google Gears API
if(typeof Worker == "undefined") {
BespinGearsInitializeGears() // this functions initializes Gears only if we need it
if(window.google && google.gears) {
USE_GEARS = true; // OK, gears is here

var wp = google.gears.factory.create('beta.workerpool');
var workers = {};
Worker = function (uri) { // The worker class
this.isGears = true;

// We can pass the source directly. So we decode the source ourselves.
// To make this more general purpose we could of course also load the
// actual JS file.
var source = uriDecodeSource(uri)

this.id = wp.createWorker(source)
workers[this.id] = this;
}

Worker.prototype = { // we can post messages to the worker
postMessage: function (data) {
wp.sendMessage(data, this.id)
}
}

// upon receiving a message we call our onmessage callback
// DOM-Message-Events are not supported
wp.onmessage = function (a, b, message) {
var worker = workers[message.sender];
var cb = worker.onmessage;
if(cb) {
cb.call(worker, {
data: message.body
})
}
}
}
}

Mutexes / Semaphores / etc.
Once you do real multi-process programming with JavaScript you need to start to worry about some of the issues that come with programming parallel systems (not all of them, because workers are more like processes so there are no issues related to pure multi-threading).
Bespin has a quite complicated paint() method which does the actual drawing of the editing area using a canvas-tag. This paint method calls the syntax highlighter for every line that is draws upon every paint. Because all those calls happen asynchronously, potentially in parallel and in undefined order, I needed a mechanism which basically says: Execute this code right after all those async calls finished.
My current solutions is to have an object which maintains a counter of method calls belonging to the same group. Starting an async method increments the counter, calling the method's callback decrements it. You can schedule yet more callbacks which execute once the counter reaches zero again (when all async method calls are finished). I called the class Mutex which is probably wrong but at least on topic.
Even more source for the interested:
//** {{{ bespin.worker.Mutex }}} **
//
// Object that maintains a counter of running workers/async processes.
// Calling after(callback) schedules a function to be called until all
// async processes are finished
//
// Is Mutex the correct term?
dojo.declare("bespin.worker.Mutex", null, {
constructor: function(name, options) {
this.name = name;
this.count = 0;
this.afterJobs = [];
this.options = options || {}
},
start: function () {
this.count = this.count + 1
},
stop: function () {
this.count = this.count - 1
if(this.count == 0) {
if(this.options.onlyLast) {
var last = this.afterJobs[this.afterJobs.length-1];
if(last) {
last()
}
} else {
for(var i = 0; i < this.afterJobs.length; ++i) {
var job = this.afterJobs[i];
job()
}
}
this.afterJobs = []
}
},
after: function (context, func) {
this.afterJobs.push(function () {
func.call(context)
})
}
})

Conclusion
While my current patch will not make it into bespin (because the current syntax highlighter is so fast that the overhead of the Worker is larger than the win) I still think that the approach above has a lot of potential (e.g. analyzing more lines than the currently visible ones). "Transparently" moving objects into workers while for the most part maintaining the original interface at least gives JavaScript a nice Erlang touch (as @psvensson put it).

Update:
I should clarify that it is no problem for functions to pass the worker-bounday which generate closures. Non of the functions which are already present inside the object which is sent to a worker may include a closure, though.