Search This Blog

Tuesday, April 21, 2015

Between Poverty and Universality lies Structure

Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days, even if you never actually use Lisp itself           —  Eric Raymond

In ancient times people set each other puzzles such as:

       Can God make a stone so heavy that he can't lift it?

These puzzles-of-omnipotence can be rephrased in theory-of-computation lingo:

       Can God compute the uncomputable?
       If he can, how is it uncomputable?
       If he cant, how is he God?

So what are those limits of/by structure?  Unsurprisingly related to God-el's theorem:
God-el's Theorem says that for any record player, there are records which it cannot play because they will cause it to self-destruct
Gödel-Escher-Bach

And like record players what about programming languages whose abstractions can be arranged to break the language?

Structure is good because it reduces breakage; its bad because it imprisons us into precooked forms.

Following I explore the space between poverty and universality; a space which for want of a better word I will simply call structure, the most elusive being the structure of syntax.

On the python list Chris Angelico asked me:
I definitely don't see how a non-text source code format would improve on it [breaking into the Python implementation]. Feel like elaborating?
This was in the context of a thread about the benefits of programmer customizable syntax:
[In the context of the question: Why is python space/indent sensitive? A classic troll-bait in python circles when 'BartC' said:]

Actually I would quite like to know why languages don't have switchable syntax anyway to allow for people's personal preferences.
To which at least one perspicuous answer was that You want Lisp.
 
I thought I'd elaborate further on this
But first what do we mean by poverty first-classness and universality?

Poverty

just means 'not having...' in this case technology — alternate words would be Ludditeism, technical illiteracy, blub programmer etc.

When people stop being poor they start travelling…

First Class

Although this post is more about first-class syntax let me briefly talk about the more usual first-class semantics using python¹ as example.

Of course anyone who has studied a bit of CS knows about things like syntax trees (ASTs), REPL, debugging etc.

However python goes one step further than the classic languages like C by providing ASTs, interactive introspection (help/dir/type etc) and live inspection under the same hood with little difficulty.

Ned Batchelder has a nice presentation on this and calls it Python-aware Python

The limit of first-classness is…

Universality 

This is less metaphor, more literal, but less understood.

It is not too much exaggeration to say universality is about being God except that God too must function according to his own laws.

Following I explore the spectrum from poverty to universality with some examples, though the focus is more on syntax for this post

HTML

Html is not 'not-text' – you can edit it in a text editor alright. However if you edit it in an html-editor – mozilla-composer, seamonkey, or any of the web versions – you get (at least) two views – text vs structured.

In the text (ironically also called html) view it behaves like a text-editor – mostly a half-assed one.

In the structured view it renders the html structure – after a fashion.

So a table would show as <table> <tr><td>1<td>2…</table> etc in text view, and like this

12
34

ie as a rectangular array of cell in structural view.
This of course prevents trivial counting/off-by-one errors but more importantly gives a neat overview.

After the gross-layout is taken care of you can switch to text mode and add fancy html attributes if needed.

So html is structured-text and plain-text depending on how you squint.

In analogous manner having ready access to the AST of a program would be neat and impinges on many things, for instance…

Editing

When I used to do a lot of lisp, my editor was set up so that
  • cursor arrows behaved like normal cursor arrows
  • keypad arrows moved up/down/along S-exps ie they were tree movements
IOW cursor for unstructured-text; keypad for the structured-view.

A feeble attempt in the C world is gccxml

Wordperfect's Show Codes

Most people who produce documents today assume that the way to do it is Word.
A few may know of something arcane and frightening called Latex.

Years ago there used to be a system called WordPerfect, which occupied the middle ground between Latex and Word.  By default the user added formatting in a way similar to Word, but then he could choose to show codes (ie press F3) and see the formatting-codes in a split-window, something like the way Latex would look

Conversely in Latex a user can produce any effect… if he is wizard enough to know all the arcane incantations.
The holy grail is a single system in which one can slide smoothly between user-friendly – and dumbed down – paradise and ultimate but impossibly arcane wizardry.  For that

Syntax needs to be First Class

WordPerfect's nice but not first class ie one can see the formatting codes but not type them in.

My friend Emanuel Berg from the emacs list gave this short and sweet demo of

First-classness in emacs

As he himself cautioned this is not really about firstclass syntax, but the first-classness is still worth exploring.

Emacs is  an editor – supposedly.
So one can edit for example C files.
Now programmers find it pleasant (and also error-catching) to show different language elements in different colors, or faces as emacs calls them.
So there is a face for keywords, for builtins, for comments and so on.
But now different people like different color-sets – so we set them in elisp

Would it not be neat if we could see the face (color) that we set?
IOW we would like to program emacs in emacs just like we use emacs to program C. Thats first-classness²

Emanuel's screenshot shows this: And if someone wants to try it in emacs, here's the file; load it as
$ emacs -l captain-mr-king.el captain-mr-king.el

Syntax Transcendence

One thing that lispers hear from non-lispers is: Lisp… Gawd! Horrible syntax…  Which causes no end of irritation to the lisper because he retorts: Syntax? What syntax? Define any syntax you like. Lisp has NO Syntax!

I would say that it is only the Lisp community that gets

The fundamental koan of computer science

         You cannot do programming without syntax
         Syntax is irrelevant to programming

In fact Eric Raymond's famous quote I started with is at least in part about getting this element of transcending syntax.

As a simple example consider the problem of adding local-bindings to a 'normal' language like python – usually called let.
ie we would like to write something like

let x = a+b
    y = a-b
in x/y

without having x,y added to the current scope

This can be simulated by: the expression:

   (lambda x,y : x/y)(a+b, a-b)

The simulation is neat; the readability is terrible.

In most languages adding such a let would be a many-year long process of arguing with the guardians-of-the-language.

In Lisp you just write a macro³ that transforms the let into the lambda – the same grade of triviality as the vanilla programmer writing a function.

All this would become clear if framed rightly…

Theory of Computation

The theory of computation is usually taken to start with Turing's 1936 paper.
I think the Turing paper is really an elaboration on Gödel's 1931 paper on the incompleteness theorem.
Most presentations of these, focus too much on the results and not enough on the methodology or better

The Technology of the Theory

Gödel used the only data-structure he had – numbers.
And he showed how to make numbers – really humongous ones – that could encode theorems about numbers.  This is often considered very profound, very difficult, very

Mysterious

But is it?

Imagine a programmer manipulating a data-structure that sits between memory locations 1000 and 2000.  Most of us brought up on von Neumann machines would do this as 'memory-manipulations'.  However one can also see the memory 1000-2000 as a 1000-byte large number and the manipulations can all be looked upon as arithmetic on that number.
In short...

Gödel's theorem is less of difficult math, more of messy programming

Turing corrected Gödel's faux pas by replacing arcane arithmetic manipulations on humongous numbers by trivial symbol-shuffling on strings.
And so a number that is both a number and a theorem about numbers (for Gödel) becomes for Turing, a machine whose tape can contain another Turing machine — a Universal Turing Machine (UTM).

A programmer would immediately recognize that Turing's approach is way cleaner and easier than Gödel's [see Hehner].

But even here the object Turing machine sitting as data on the tape of a UTM needs to be parsed and unparsed.

What would it mean to delete this completely and only do the essence of theory of computation without the irrelevant parsing/unparsing fluff?

What we need is a data structure that can encode everything — a universal data structure, but it should also be convenient and as natural as feasible. Possible??

Enter Lisp


In Computability and Complexity from a Programming Perspective Neil Jones shows that the use of Lisp streamlines computability and complexity theory almost to the same degree that Turing cleaned up Gödel's theorems.

The cost is this — treat S-Exprs as fundamental, universal just as Turing treats strings, assembly programmers treat memory, and most vanilla programmers treat text.

To the lisper this cost is nothing⁴.
To the non-lisper its way too high.
Neither understands that its the cost of illiteracy

TL;DR

If you Lisp, you dont need this summary (or this blog post).
If you dont, read Eric Raymond's quote above and start Lisping instead of only clumthily lithping in other languages!


¹ Actually python comes closer to supporting first-classness in syntax than say C++, eg by decoupling the syntax x+y from its internal rep x.__add__(y) and making the latter available to the programmer. Builtin is is an exception and is for me one of python's bugbears – but that's more because of an infelicitous name than the fact that it has no first-class rendering
² Well some would say meta-circularity as against first-classness. Scheme supports the distinction better than any other language and therefore also the confusion!! See applying si on sicp
³ To which the C programmer says: C also has macros. To which the Lisper mutters under his breath: Blub programmer!
⁴ Ok close to nothing. Enumeration of ℕ is trivial, strings is easy, S-Exprs is a bit non-trivial – Sect. 5.7 of Neil Jones shows how to do it

No comments:

Post a Comment