
Thursday, March 26, 2015

CS History 0

Are real numbers real?

Wait!! What does this have to do with programming? Or even computer science??

Sounds like angels-on-the-head-of-a-pin philosophy, no?
NO!  CS came into existence because of this question!

Monday, March 2, 2015

Unicode: Universal or Whimsical?

Unicode Classification

In my last post, I wrote about two sides to unicode — a universal side and a babel side. Some readers, while agreeing with this classification, were jarred by a passing reference to 'gibberish' in unicode⁵.
Since I learnt some things from those comments, this post expands that classification into these¹.
  1. Babel
  2. Universal
  3. Legacy
  4. Unavoidable mess
  5. Political mess
  6. Whimsical

Thursday, February 26, 2015

Universal Unicode

What is the 'uni-' in unicode? According to the official records, it comes from Unique, Uniform, and Universal.

Unicode starts out with the realization that ASCII is ridiculously restrictive: the world is larger than the two sides of the Atlantic¹. This gives rise to all the blocks from Arabic to Zhuang.

However the greatest promise of unicode lies not in catering to this tower of babel but rather in those areas that are more universal. Yeah, I know: technically this distinction between universal and international will not stand up to scrutiny.

Tuesday, January 6, 2015

Unicode and the Universe

If you're trilingual you speak three languages, if you're bilingual you speak two languages, if you're monolingual you're American.

Mark Harris on the python list
Well, if one reads that thread above, one finds that people were rather uptight with Mark Harris for that statement. And yet they have the same insular attitude towards ASCII-in-programming that Mark describes in Americans towards English (or more correctly, Americanese); to wit, they consider that programming with ASCII (alone) is natural, easy, convenient, obvious, universal, inevitable, etc.

Is it mere coincidence that the 'A' of ASCII is short for American?

Not so long ago the world lay from a few kilometers east of The Garden of Eden to a few hundred kilometers west.  And then it stretched to a spherical globe of 40,000 km circumference.  At that time the gods used to light lamps at night called 'stars'.

And then things changed a wee little bit: the stars and our world – suddenly grown quite small – became more 'similar', and the wider world now stretches to a few billion light-years across.

In many respects the story of ASCII to Unicode is similar. Pragmatically both represent a 0 → ∞ jump: with ASCII it was natural to use the whole of the printable set [many of us even used to know the ASCII code-points quite well!], whereas with unicode not only is any one person knowing all 1,114,112 code-points unrealistic, even knowing what blocks exist is infeasible.

At base this is

The problem of meaning

The smaller world is naturally more meaningful than the larger one.  Just as one can have a warmer, fuzzier feeling about Momma than about woman-kind, one can at least imagine a God who selects a chosen people and is solicitous and possessive about them, as long as the world is comprehensible on my scale. When it becomes so large that life itself looks like a freak accident, such beliefs are harder to maintain.

As an example, consider Amerigo Vespucci:
We saw more wild animals—such as wild hogs, kids, deer, hares, and rabbits—than could ever have entered the ark of Noah; but we saw no domestic animals whatever… I fancied myself near the terrestrial paradise…
Vespucci was an adventurer, not a religious man.  By contrast, today even a committed religious person would not ask whether a specific animal of the mundane world is found in the scripture of his choice. And I dare say Vespucci talks of paradise with a literalness that is not possible for a modern.

In effect our world has become so large it is difficult to give it meaning.

Likewise, even considering only extant languages…

Unicode is too large

People want to stick to ASCII because of the unending, terrifying swathes of undecipherable characters.  An argument I often hear is
Given that I have only ten fingers and a hundred or so keys in front of me, how am I to invoke a specific symbol from the hundred thousand or so that are available in Unicode?
Well… Dunno what to say… If I can go from 100 characters to 200 I am twice as rich. Why worry about the million I have no use for?

But it is really much worse

Unicode has plain gibberish

You don't play with Mahjong characters? How crude!
You don't know about cuneiform? How illiterate!
You don't compose poetry with Egyptian hieroglyphs? How rude!
Shavian has not reformed you? How backward!
In short, to make effective use of unicode, it may be worthwhile to distinguish the international blocks (also called the tower of babel) from the universal parts of unicode, viz. math.

That is,

Unicode is like the universe

in the sense that in the pre-unicode era, the universe was so small that parochialism was unavoidable. Today it is so big that meaninglessness is inevitable.

In the medieval ASCII world one could choose between being one of:
1 Dummy
To sell one's computer and work (and soul?) to a proprietary format and word-processing software
2 Wizard
To master something intricate and complicated such as latex (or mathml, lilypond, troff…)
3 Programmer
Everything that is worth expressing can be expressed in ASCII.

IOW…

God made ASCII. All the rest is the work of man.
And so we had before us a delicious à la carte offering:
  1. idiocy of ignorance
  2. slavery to savantery
  3. prison of penury
Now while we are not completely free from these 'blessings' yet, we are better off than before, thanks to Unicode.

To see why 1 and 2 need not be the case any more, see some suggestions made in the context of python.  Now while the suggestions are not quite serious and are unlikely to be taken seriously, as we go from established/old languages towards the bleeding edge they become more realistic.  Here are Julia and Agda.

As for not having to choose between 2 and 3, here's something I recently asked on the (la)tex list:

Here is the wikipedia page on the ε-δ definition of limit, where we see the well-known definition:


Editing it produces this excerpt [note this is input text]
(\forall \varepsilon > 0)(\exists \ \delta > 0) (\forall x \in D)
(0 < |x − c | < \delta \ \Rightarrow \ |f(x) - L| < \varepsilon)


Now compare it with the following – also input text:

(∀ ε > 0) (∃ δ > 0) (∀ x ∈ D) (0 < |x − c| < δ  ⇒  |f(x) - L| < ε)

[Note particularly the real minus sign (−) between x and c and the ASCII hyphen-minus (-) between f(x) and L]

In this age of unicode, when we have xetex/luatex, why do we use the first when the second is so much closer to the desired result?
Hopefully most people would agree the latter is more readable than the former.
The questions that remain are
  1. Typing it in.
  2. Is it close to luatex/xetex? 
For 2. I'd welcome help/suggestions ;-)
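As a partial answer to 2, here is a minimal sketch of what already works today with the unicode-math package under lualatex/xelatex (the surrounding boilerplate is only for illustration):

% compile with lualatex or xelatex, not pdflatex
\documentclass{article}
\usepackage{unicode-math}  % makes ∀ ∃ ∈ ⇒ ε δ − valid math-mode input
\begin{document}
\[ (∀ ε > 0) (∃ δ > 0) (∀ x ∈ D) (0 < |x − c| < δ ⇒ |f(x) − L| < ε) \]
\end{document}

So the machinery is largely there; the open question is really 1.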

For 1., I've just recently discovered pointless-xcompose, which goes a good way towards solving this, at least on linux¹.

And I suggest we distinguish these

Levels of Input Methods

  1. Cut-and-paste a character after searching with google
  2. Select a character from a local app like gucharmap (emacs: C-x 8 Ret)
  3. Use an editor abbrev(iation)
  4. Use an editor input method, e.g. emacs' tex input-method will convert \forall into ∀, etc.
  5. Use the compose-key (Windows users may try this – dunno…)
  6. Switch keyboard layouts in software with something like ibus
  7. Use a special-purpose hardware keyboard
As we go from 1 to 7, expertise and efficiency increase, but so do the expense of setup and hardware and, most important, of learning. The cost of assuming that only the extreme choices – 1 and 7 – are available, and not all the other interim possibilities, is the binary choice between meaninglessness and parochialism.

IOW placing the slider effectively along this spectrum represents an efficient…

Huffman coding

applied to keystrokes and mouse gestures (in analogy to bits)

For a while now I've used 1 and 3.

Combining 3 and 4 thanks to pointless-xcompose is, I expect, going to be more convenient and effective, especially when it is tailored to the subset of characters one needs frequently.

The one thing not clear is how to set up the compose key. I'm a complete noob myself, but on a recent linux¹ this may work:

$ setxkbmap -option compose:menu

to make the menu key behave like compose.  Replace the 'menu' by 'rwin' or 'ralt' to get the same behavior out of the right-windows or right-alt keys.
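
Once compose works, the Huffman idea above can be applied quite literally: give the symbols you use most the shortest sequences. Here is a minimal sketch of a user-level ~/.XCompose (the particular sequences are my own invention, not pointless-xcompose's; note that some toolkits only consult this file when using the XIM input method):

# ~/.XCompose
include "%L"                           # keep the locale's default sequences
<Multi_key> <f> <a>           : "∀"  U2200   # FOR ALL
<Multi_key> <e> <x>           : "∃"  U2203   # THERE EXISTS
<Multi_key> <i> <n>           : "∈"  U2208   # ELEMENT OF
<Multi_key> <minus> <greater> : "→"  U2192   # RIGHTWARDS ARROW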

Acknowledgements

  1. Thomas Reuben for writing pointless-xcompose
  2. David de la Harpe Golden for introducing me to xkb (setxkbmap)


¹ Thomas Reuben, author of pointless-xcompose, points out to me that saying linux is inappropriate where X-windows would be more correct. He is right.
Left the linux there as more people are likely to know they are using linux than that they are using X-windows  ☺

Friday, September 26, 2014

Pugofer

In the early 90s I taught FP in the introductory programming class at the University of Pune.  At first I used Miranda/Scheme, then gofer. I was also impressed with Dijkstra's philosophy of making function application explicit with a dot ('.') and decided to incorporate this into gofer.  This changed gofer was called pugofer.

The philosophy of these changes is here. A summary of the changes:

Tuesday, August 12, 2014

Universities starting with functional programming

Here's a list of some universities that are using functional languages to teach programming. As I find more data, it will be added. So please let me know (with links!!) what I've missed – lists are particularly welcome, but individual universities are also welcome, as are other languages that have some claim to being functional.
Haskell
Haskell – official list (list)
At quora (also scheme and ML dialects) 
ML
Carnegie Mellon 

Wednesday, July 9, 2014

ACM FDP – Invited Talk

I was an invited speaker at the ACM faculty development program (FDP) organized jointly by ACM and VIT Pune on 9th July 2014.
The stuff of my talk — and a good deal of other stuff that I did not manage to cover for lack of time :D — is put up at github.

To view, you will need

Tuesday, May 13, 2014

Unicode in Haskell Source

After writing Unicoded Python, I discovered that Haskell can do some of this already.  No, it's not even halfway there, but I am still mighty pleased!
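
For the curious, here is roughly what GHC's UnicodeSyntax extension already buys you (a minimal sketch; the function name is mine):

{-# LANGUAGE UnicodeSyntax #-}

-- With UnicodeSyntax, ∷ may replace :: and → may replace ->
-- (in types and in lambdas). λ itself is not part of the
-- extension; the backslash remains.
identity ∷ α → α
identity = \x → x

main ∷ IO ()
main = print (identity (42 ∷ Int))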

Tuesday, April 29, 2014

Unicode and the Unix Assumption

Once upon a time, file was a rich, profound, daunting and wondrously messy concept. It involved ideas like
  • record orientation
  • blocking factor
  • partitioned data sets
and other wonders of computer (rocket) science.

Then there came along two upstarts, playing around in their spare time with a machine that their lab had junked. They were having a lot of fun…

They decided that for them File was just List of Bytes.
type File = [Byte]
Oh the fun of it!
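
In today's Haskell one can make the slogan quite literal. A sketch (rendering Byte as Word8; the function name is mine):

import qualified Data.ByteString as B
import Data.Word (Word8)

-- The Unix view: a file is nothing but a sequence of bytes.
type Byte = Word8
type File = [Byte]

readFileAsBytes :: FilePath -> IO File
readFileAsBytes path = B.unpack <$> B.readFile path

main :: IO ()
main = do
  bytes <- readFileAsBytes "/etc/hostname"  -- any small file will do
  print (length bytes)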

Saturday, April 19, 2014

Unicode in Python

1 Introduction

Python has been making long strides in embracing unicode. With python 3 we are at a stage where python programs can support unicode well. However, python program source is still completely drawn from the ASCII subset of unicode.
Well… Actually with python 3 (not 2) this is already possible:
from math import sqrt

def solvequadratic(a, b, c):
    Δ = b*b - 4*a*c           # discriminant
    α = (-b + sqrt(Δ))/(2*a)
    β = (-b - sqrt(Δ))/(2*a)
    return (α, β)

>>> solvequadratic(1,-5,6)
(3.0, 2.0)
>>>
Now to move ahead!

Tuesday, September 17, 2013

Haskell: From unicode friendly to unicode embracing

Doesn't λ x ⦁ x  :  α → α look better and communicate more clearly than \ a -> a :: a -> a  ?

What are the problems with the second (current Haskell) form?
  1. The a in the value world is the same as the a in the type world -- a minor nuisance and avoidable -- one can use different names
  2. λ looks like \
  3. The purely syntactic -> that separates a lambda-variable and its body is the same token that denotes a deep semantic concept -- the function space constructor
APL is one of the oldest programming languages and is still one of the most visually striking.  It did not succeed for various reasons, the most notable being that its heyday came too long before unicode.

While APL was the first to use mathematical notation in programming, Squiggol, Bananas, and Agda are more recent precedents in this direction.

In short, it's time for programming languages to move from unicode-friendly to unicode-embracing.

Some stray thoughts on incorporating these ideas into Haskell.

Tuesday, September 10, 2013

Computer Science: Technology or Philosophy?

A computer is like a violin. You can imagine a novice trying first a phonograph and then a violin. The latter, he says, sounds terrible. That is the argument we have heard from our humanists and most of our computer scientists. Computer programs are good, they say, for particular purposes, but they aren't flexible. Neither is a violin, or a typewriter, until you learn how to use it.
Marvin Minsky – Programming clarifies poorly-understood and sloppily-formulated Ideas

Computer science is not a science and it has little to do with computers. It's a revolution in the way we think and in the way we express what we think. The essence of this change is procedural epistemology — the study of the structure of knowledge from an imperative point of view, as opposed to the declarative point of view taken by math.
Mathematics provides a framework for dealing precisely with notions of «what is»
Computation provides a framework for dealing precisely with notions of «how to»

Abelson and Sussman — Preface to Structure and Interpretation of Computer Programs

Computer Science is no more about computers than astronomy is about telescopes, biology is about microscopes or chemistry is about beakers and test tubes.
There is an essential unity of mathematics and computer science.

Michael Fellows — usually attributed to Dijkstra


The above three quotes are interesting as much in their agreement – the irrelevance of computers to computer-science – as in the difference of emphasis: Minsky sees CS from the intelligence/learning pov, Fellows/Dijkstra as math, Abelson/Sussman as something contrasting to math…

So what actually is CS about??

Following is an article I wrote for a newspaper in 1995 on the wider-than-mere-technology significance of CS — reposting here for historical interest.

Monday, September 9, 2013

The Poorest Computer Users are Programmers

In the old days programmers programmed computers. Period.

Nowadays, when everything is a computer and the traditional computer is about a decade and a half behind the curve, describing a programmer as someone who programs computers is narrow and inaccurate. Instead we should think of programmers as working at effecting and improving the human-X interface, where X may be 'computer'. But it could also be IT, or technology, or the network, and through that last, interaction with other humans.

Now the classic 'nerdy' programmer was (by stereotype) always poor at 'soft' questions like these:  Interaction? Synergy?! What's all that manager-PR talk got to do with programming?

And so today…

Programmers are inept as users of computers

Some examples:

Tuesday, August 27, 2013

Apply-ing SI on SICP

Abelson and Sussman wrote a legendary book: SICP. The book has a famous wizard cover:

Unfortunately the cover misses some key content of the book.  What is it?
If we remove the other wizardly stuff, three main artifacts stand out on that cover:  eval and apply on the crystal ball and a magical λ.  Let's put these into a table:

| apply  | eval |
| lambda |      |

The fourth empty square seems to stand out, doesn't it?  Let's dig a little into this square.

Sunday, June 23, 2013

Functional Programming invades the Mainstream

Kewl-kids in love with their favorite language will often bring up how wonderful some non-trivial app written in their language is.

Kewl, Kewt, Ardent… And the producer of yawns…

So sometimes it is good to invert the perspective and ask about cross-fertilization:  What ideas/features of these fashionable languages are becoming compelling enough to enter the mainstream?

This post is about how the boring mainstream is giving in – feature-by-feature – to Functional Programming:
  • Almost every modern language supports garbage collection. Origin: Lisp.
  • From that followed the fact that any value, not just scalars, can be first-class (see the small Python sketch below).
  • Systems as widely disparate as Python, R, Groovy, VBA, and Mathematica share a common idea – using the interpreter interactively as an exploratory tool. This started with Lisp's REPL.
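
As a tiny Python illustration of the first-class-values point (the names are mine, purely for illustration):

# Functions are first-class values: they can be stored in data
# structures, passed as arguments, and returned as results.
def compose(f, g):
    return lambda x: f(g(x))

inc = lambda x: x + 1
double = lambda x: 2 * x

ops = [inc, double, compose(inc, double)]  # functions living in a list
print([op(10) for op in ops])              # prints [11, 20, 21]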