
Thursday, March 26, 2015

CS History 0

Are real numbers real?

Wait!! What does this have to do with programming? Or even computer science??

Sounds like angels-on-the-head-of-a-pin philosophy, no?
NO!  CS came into existence because of this question!

Monday, March 2, 2015

Unicode: Universal or Whimsical?

Unicode Classification

In my last post, I wrote about two sides to unicode — a universal side and a babel side. Some readers, while agreeing with this classification, were jarred by a passing reference to 'gibberish' in unicode⁵.
Since I learnt some things from those comments, this post expands that classification into these¹.
  1. Babel
  2. Universal
  3. Legacy
  4. Unavoidable mess
  5. Political mess
  6. Whimsical

Thursday, February 26, 2015

Universal Unicode

What is the 'uni-' in unicode? According to the official records, it comes from Unique, Uniform, and Universal.

Unicode starts out with the realization that ASCII is ridiculously restrictive: the world is larger than the two sides of the Atlantic¹. This gives rise to all the blocks from Arabic to Zhuang.

However the greatest promise of unicode lies not in catering to this tower of babel but rather in those areas that are more universal. Yeah, I know: technically this distinction between universal and international will not stand up to scrutiny.

Tuesday, January 6, 2015

Unicode and the Universe

If you're trilingual you speak three languages, if you're bilingual you speak two languages, if you're monolingual you're American.

Mark Harris on the python list
Well, if one reads that thread above, one finds that people were rather uptight with Mark Harris for that statement. And yet they have the same insular attitude towards ASCII-in-programming that Mark describes in Americans towards English (or more correctly, Americanese); to wit, they consider that programming with ASCII (alone) is natural, easy, convenient, obvious, universal, inevitable, etc.

Is it mere coincidence that the 'A' of ASCII is short for American?

Not so long ago the world lay from a few kilometers east of The Garden of Eden to a few hundred kilometers west.  And then it stretched to a spherical globe of 40,000 km circumference.  At that time the gods used to light lamps at night called 'stars'.

And then things changed a wee little bit: the stars and our world – suddenly grown quite small – became more 'similar', and the wider world now stretches to a few billion light-years across.

In many respects the story of ASCII to Unicode is similar. Pragmatically both represent a 0 → ∞ jump: with ASCII it was natural to use the whole of the printable set [many of us even used to know the ASCII code-points quite well!], whereas with unicode not only is any one person knowing all 1,114,112 code-points unrealistic, even knowing what blocks exist is infeasible.

At base this is

The problem of meaning

The smaller world is naturally more meaningful than the larger one.  Just as one can have a warmer, fuzzier feeling about Momma than about woman-kind, one can at least imagine a God who selects a chosen people and is solicitous and possessive about them, as long as the world is comprehensible on my scale. When it becomes so large that life itself looks like a freak accident, such beliefs are harder to maintain.

As an example, consider Amerigo Vespucci:
We saw more wild animals—such as wild hogs, kids, deer, hares, and rabbits—than could ever have entered the ark of Noah; but we saw no domestic animals whatever… I fancied myself near the terrestrial paradise…
Vespucci was an adventurer, not a religious man.  By contrast, today even a committed religious person would not ask whether a specific animal of the mundane world is found in the scripture of his choice. And I dare say Vespucci talks of paradise with a literalness that is not possible for a modern.

In effect our world has become so large it is difficult to give it meaning.

Likewise, even considering only extant languages…

Unicode is too large

People want to stick to ASCII because of the unending, terrifying swathes of undecipherable characters.  An argument I often hear is
Given that I have only ten fingers and a hundred or so keys in front of me, how am I to invoke a specific symbol from the hundred thousand or so that are available in Unicode?
Well… Dunno what to say… If I can go from 100 characters to 200 I am twice as rich. Why worry about the million I have no use for?

But it is really much worse

Unicode has plain gibberish

You don't play with Mahjong characters? How crude!
You don't know about cuneiform? How illiterate!
You don't compose poetry with Egyptian hieroglyphs? How rude!
Shavian has not reformed you? How backward!
In short, to make effective use of unicode, it may be worthwhile to distinguish the international blocks (also called the tower of babel) from the universal parts of unicode, viz. math.

That is,

Unicode is like the universe

in the sense that in the pre-unicode era, the universe was so small that parochialism was unavoidable. Today it is so big that meaninglessness is inevitable.

In the medieval ASCII world one could choose between being one of:
1 Dummy
To sell one's computer and work (and soul?) to a proprietary format and word-processing software
2 Wizard
To master something intricate and complicated such as latex (or mathml, lilypond, troff…)
3 Programmer
Everything that is worth expressing can be expressed in ASCII.

IOW…

God made ASCII. All the rest is the work of man.
And so we had before us a delicious à la carte offering:
  1. idiocy of ignorance
  2. slavery to savantery
  3. prison of penury
Now while we are not completely free from these 'blessings' yet, we are better off than before, thanks to Unicode.

To see why 1 and 2 need not be the case any more, see some suggestions made in the context of python.  Now while the suggestions are not quite serious and are unlikely to be taken seriously, as we go from established/old languages towards the bleeding edge they become more realistic.  Here are Julia and Agda.

As for not having to choose between 2 and 3, here's something I recently asked on the (la)tex list:

Here is the wikipedia page on the ε-δ definition of limit, where we see the well-known definition:


Editing it produces this excerpt [note this is input text]
(\forall \varepsilon > 0)(\exists \ \delta > 0) (\forall x \in D)
(0 < |x − c | < \delta \ \Rightarrow \ |f(x) - L| < \varepsilon)


Now compare it with the following – also input text:

(∀ ε > 0) (∃ δ > 0) (∀ x ∈ D) (0 < |x − c| < δ  ⇒  |f(x) - L| < ε)

[Note particularly the real minus sign (−) between x and c and the ASCII hyphen-minus (-) between f(x) and L]

In this age of unicode, when we have xetex/luatex, why do we use the first when the second is so much closer to the desired result?
Hopefully most people would agree the latter is more readable than the former.
The questions that remain are
  1. Typing it in.
  2. Is it close to luatex/xetex? 
For 2. I'd welcome help/suggestions ;-)
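As a partial answer to 2, here is a minimal sketch of what already works today with the unicode-math package under lualatex/xelatex (the surrounding boilerplate is only for illustration):

% compile with lualatex or xelatex, not pdflatex
\documentclass{article}
\usepackage{unicode-math}  % makes ∀ ∃ ∈ ⇒ ε δ − valid math-mode input
\begin{document}
\[ (∀ ε > 0) (∃ δ > 0) (∀ x ∈ D) (0 < |x − c| < δ ⇒ |f(x) − L| < ε) \]
\end{document}

So the machinery is largely there; the open question is really 1.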

For 1., I've just recently discovered pointless-xcompose, which goes a good way towards solving this, at least on linux¹.

And I suggest we distinguish these

Levels of Input Methods

  1. Cut-and-paste a character after searching with google
  2. Select a character from a local app like gucharmap (emacs: C-x 8 Ret)
  3. Use an editor abbrev(iation)
  4. Use an editor input method, e.g. emacs' tex input-method will convert \forall into ∀, etc.
  5. Use the compose-key (Windows users may try this – dunno…)
  6. Switch keyboard layouts in software with something like ibus
  7. Use a special-purpose hardware keyboard
As we go from 1 to 7, expertise and efficiency increase, but so do the expense of setup and hardware and, most important, of learning. The cost of assuming that only the extreme choices – 1 and 7 – are available, and not all the other interim possibilities, is the binary choice between meaninglessness and parochialism.

IOW placing the slider effectively along this spectrum represents an efficient…

Huffman coding

applied to keystrokes and mouse gestures (in analogy to bits)

For a while now I've used 1 and 3.

Combining 3 and 4 thanks to pointless-xcompose is, I expect, going to be more convenient and effective, especially when it is tailored to the subset of characters one needs frequently.

The one thing not clear is how to set up the compose key. I'm a complete noob myself, but on a recent linux¹ this may work:

$ setxkbmap -option compose:menu

to make the menu key behave like compose.  Replace the 'menu' by 'rwin' or 'ralt' to get the same behavior out of the right-windows or right-alt keys.
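
Once compose works, the Huffman idea above can be applied quite literally: give the symbols you use most the shortest sequences. Here is a minimal sketch of a user-level ~/.XCompose (the particular sequences are my own invention, not pointless-xcompose's; note that some toolkits only consult this file when using the XIM input method):

# ~/.XCompose
include "%L"                           # keep the locale's default sequences
<Multi_key> <f> <a>           : "∀"  U2200   # FOR ALL
<Multi_key> <e> <x>           : "∃"  U2203   # THERE EXISTS
<Multi_key> <i> <n>           : "∈"  U2208   # ELEMENT OF
<Multi_key> <minus> <greater> : "→"  U2192   # RIGHTWARDS ARROW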

Acknowledgements

  1. Thomas Reuben for writing pointless-xcompose
  2. David de la Harpe Golden for introducing me to xkb (setxkbmap)


¹ Thomas Reuben, author of pointless-xcompose, points out to me that saying linux is inappropriate where X-windows would be more correct. He is right.
Left the linux there as more people are likely to know they are using linux than that they are using X-windows  ☺

Friday, September 26, 2014

Pugofer

In the early 90s I taught FP in the introductory programming class at the University of Pune.  At first I used Miranda/Scheme, then gofer. I was also impressed with Dijkstra's philosophy of making function application explicit with a dot ('.') and decided to incorporate this into gofer.  This changed gofer was called pugofer.

The philosophy of these changes is here. A summary of the changes:

Tuesday, August 12, 2014

Universities starting with functional programming

Here's a list of some universities that are using functional languages to teach programming. As I find more data, it will be added. So please let me know (with links!!) what I've missed – lists are particularly welcome, but individual universities are also welcome, as are other languages that have some claim to being functional.
Haskell
Haskell – official list (list)
At quora (also scheme and ML dialects) 
ML
Carnegie Mellon 

Wednesday, July 9, 2014

ACM FDP – Invited Talk

I was an invited speaker at the ACM faculty development program (FDP) organized jointly by ACM and VIT Pune on 9th July 2014.
The stuff of my talk — and a good deal of other stuff that I did not manage to cover for lack of time :D — is put up at github.

To view, you will need

Tuesday, May 13, 2014

Unicode in Haskell Source

After writing Unicoded Python, I discovered that Haskell can do some of this already.  No, it's not even halfway there, but I am still mighty pleased!
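
For the curious, here is roughly what GHC's UnicodeSyntax extension already buys you (a minimal sketch; the function name is mine):

{-# LANGUAGE UnicodeSyntax #-}

-- With UnicodeSyntax, ∷ may replace :: and → may replace ->
-- (in types and in lambdas). λ itself is not part of the
-- extension; the backslash remains.
identity ∷ α → α
identity = \x → x

main ∷ IO ()
main = print (identity (42 ∷ Int))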

Tuesday, April 29, 2014

Unicode and the Unix Assumption

Once upon a time, file was a rich, profound, daunting and wondrously messy concept. It involved ideas like
  • record orientation
  • blocking factor
  • partitioned data sets
and other wonders of computer (rocket) science.

Then there came along two upstarts, playing around in their spare time with a machine that their lab had junked. They were having a lot of fun…

They decided that for them File was just List of Bytes.
type File = [Byte]
Oh the fun of it!
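
In today's Haskell one can make the slogan quite literal. A sketch (rendering Byte as Word8; the function name is mine):

import qualified Data.ByteString as B
import Data.Word (Word8)

-- The Unix view: a file is nothing but a sequence of bytes.
type Byte = Word8
type File = [Byte]

readFileAsBytes :: FilePath -> IO File
readFileAsBytes path = B.unpack <$> B.readFile path

main :: IO ()
main = do
  bytes <- readFileAsBytes "/etc/hostname"  -- any small file will do
  print (length bytes)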

Saturday, April 19, 2014

Unicode in Python

1 Introduction

Python has been making long strides in embracing unicode. With python 3 we are at a stage where python programs can support unicode well. However, python program source is still completely drawn from the ASCII subset of unicode.
Well… Actually with python 3 (not 2) this is already possible:
from math import sqrt

def solvequadratic(a, b, c):
    Δ = b*b - 4*a*c           # discriminant
    α = (-b + sqrt(Δ))/(2*a)
    β = (-b - sqrt(Δ))/(2*a)
    return (α, β)

>>> solvequadratic(1,-5,6)
(3.0, 2.0)
>>>
Now to move ahead!

Tuesday, September 17, 2013

Haskell: From unicode friendly to unicode embracing

Doesn't λ x ⦁ x  :  α → α look better and communicate more clearly than \ a -> a :: a -> a  ?

What are the problems with the second (current Haskell) form?
  1. The a in the value world is the same as the a in the type world -- a minor nuisance and avoidable -- one can use different names
  2. λ looks like \
  3. The purely syntactic -> that separates a lambda-variable and its body is the same token that denotes a deep semantic concept -- the function space constructor
APL is one of the oldest programming languages and is still one of the most visually striking.  It did not succeed for various reasons, the most notable being that its heyday came too long before unicode.

While APL was the first to use mathematical notation in programming, Squiggol, Bananas, and Agda are more recent precedents in this direction.

In short, it's time for programming languages to move from unicode-friendly to unicode-embracing.

Some stray thoughts on incorporating these ideas into Haskell.

Tuesday, September 10, 2013

Computer Science: Technology or Philosophy?

A computer is like a violin. You can imagine a novice trying first a phonograph and then a violin. The latter, he says, sounds terrible. That is the argument we have heard from our humanists and most of our computer scientists. Computer programs are good, they say, for particular purposes, but they aren't flexible. Neither is a violin, or a typewriter, until you learn how to use it.
Marvin Minsky – Programming clarifies poorly-understood and sloppily-formulated Ideas

Computer science is not a science and it has little to do with computers. It's a revolution in the way we think and in the way we express what we think. The essence of this change is procedural epistemology — the study of the structure of knowledge from an imperative point of view, as opposed to the declarative point of view taken by math.
Mathematics provides a framework for dealing precisely with notions of «what is»
Computation provides a framework for dealing precisely with notions of «how to»

Abelson and Sussman — Preface to Structure and Interpretation of Computer Programs

Computer Science is no more about computers than astronomy is about telescopes, biology is about microscopes or chemistry is about beakers and test tubes.
There is an essential unity of mathematics and computer science.

Michael Fellows — usually attributed to Dijkstra


The above three quotes are interesting as much in their agreement – the irrelevance of computers to computer-science – as in the difference of emphasis: Minsky sees CS from the intelligence/learning pov, Fellows/Dijkstra as math, Abelson/Sussman as something contrasting to math…

So what actually is CS about??

Following is an article I wrote for a newspaper in 1995 on the wider-than-mere-technology significance of CS — reposting here for historical interest.

Monday, September 9, 2013

The Poorest Computer Users are Programmers

In the old days programmers programmed computers. Period.

Nowadays, when everything is a computer and the traditional computer is about a decade and a half behind the curve, describing a programmer as someone who programs computers is narrow and inaccurate. Instead we should think of programmers as working at effecting and improving the human-X interface, where X may be 'computer'. But it could also be IT, or technology, or the network, and through that last, interaction with other humans.

Now the classic 'nerdy' programmer was (by stereotype) always poor at 'soft' questions like these:  Interaction? Synergy?! What's all that manager-PR talk got to do with programming?

And so today…

Programmers are inept as users of computers

Some examples:

Tuesday, August 27, 2013

Apply-ing SI on SICP

Abelson and Sussman wrote a legendary book: SICP. The book has a famous wizard cover:

Unfortunately the cover misses some key content of the book.  What is it?
If we remove the other wizardly stuff, three main artifacts stand out on that cover:  eval and apply on the crystal ball and a magical λ.  Let's put these into a table:

| apply  | eval |
| lambda |      |

The fourth empty square seems to stand out, doesn't it?  Let's dig a little into this square.

Sunday, June 23, 2013

Functional Programming invades the Mainstream

Kewl-kids in love with their favorite language will often bring up how wonderful some non-trivial app written in their language is.

Kewl, Kewt, Ardent… And the producer of yawns…

So sometimes it is good to invert the perspective and ask about cross-fertilization:  What ideas/features of these fashionable languages are becoming compelling enough to enter the mainstream?

This post is about how the boring mainstream is giving in – feature-by-feature – to Functional Programming:
  • Almost every modern language supports garbage collection. Origin: Lisp.
  • From that followed the fact that any value, not just scalars, can be first-class (see the small Python sketch below).
  • Systems as widely disparate as Python, R, Groovy, VBA, and Mathematica share a common idea – using the interpreter interactively as an exploratory tool. This started with Lisp's REPL.
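
As a tiny Python illustration of the first-class-values point (the names are mine, purely for illustration):

# Functions are first-class values: they can be stored in data
# structures, passed as arguments, and returned as results.
def compose(f, g):
    return lambda x: f(g(x))

inc = lambda x: x + 1
double = lambda x: 2 * x

ops = [inc, double, compose(inc, double)]  # functions living in a list
print([op(10) for op in ops])              # prints [11, 20, 21]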