How long should program lines be?
But wait! Is this question even meaningful without specifying which programming language?
There was a discussion on the python list regarding the maximum length for code lines. It started with the fact that Python's style guide – Pep 8 – recommends that code lines should never exceed 79 characters.
We need to deal with old 80 column devices argument is analogous to the: Dont move to unicode because of 'old ASCII devices' argument. Pragmatics demands carrying people along but also moving along.1
Then there is the other…
The proverbial terseness of functional programs comes at the cost of longer lines. Functional programs are not shorter; rather they are squatter!
Thinking further about it I realized its a more subtle issue than just long vs short; its more a question of multidimensionality. To understand this we need to see that much of the practice of software engineering consists in discovering and leveraging3
Why is all this a necessary evil? Because in the long run, its a complexity cutter: instead of thinking of a problem all at once, I can think of a 'summary' (interface) and the details (implementation) separately.
Other similar complexity cutters could be thought of:
At the time of Descartes, it may not have been quite clear that such a superimposition was worth the headache. Today it has become a far more wide-reaching benchmark than just math or science – orthogonality has become synonymous with being analytical in all sorts of fields, with 'Cartesian' being almost synonymous with the principle:
If the n tokens of a program are arranged in sequence, Ive to read a sequence that is n-long. If they are arranged in a square, the square has side √n, if in a cube then its side reduces to ∛n and so on.
So if the energy-output of program reading is our comprehension, and the work-done in that reading is the energy input, then the minimum energy would be wasted if the program were as block-structured8 as possible.
Can this be applied in toto? IOW can we try to envisage going away from the dictum:
to (something like)
Are our natural visceral reactions to this out of limitations that are intrinsic or out of mere limitations of habit? For example our books can be thousands of pages in length, but each line only a few words long.
Because we dont know how to physically imagine a book that is say one-page long and thousands of characters wide, we transpose that same limitation onto our windowing software, with vertical scrolling being better supported than horizontal. And then that s/w limitation becomes a h/w limitation: Why does 'scroll-mouse' exclusively mean vertical scroll-mouse?
Well in all fairness Pep 8 also says, Don't follow these rules slavishly.4 Putting that aside and assuming that the 79 column rule is a 'strong suggestion,' here are some points for reconsidering it.
A calculus result is that the rectangle with the maximum area (for a given fixed perimeter) is the square. IOW if we have a certain length of string to enclose an area, the maximum will be enclosed if we make the area square5. Perimeter to area maximization implies more square is better, more long is worse. And my conjecture is that area correlates with comprehension; perimeter with physical therefore cognitive effort.
In fact, if we take the naive sense that a square 'packs-in' more than a line; a cube more than a square and so on, we see one of the hallmarks of the more expressive programming paradigms: higher-dimensional code:
At the least most so-called high-level languages are 'structured' as compared to assembly which is linear. Furthermore most modern languages have notions of separate compilation for source files, packages etc which are really just other dimensions of structuring.
Here it is cut-pasted verbatim from the standard prelude. Note particularly the lineup formatting. [Best read in gist in raw-mode]
And here is my preferred version – longest lines about 115 chars. Naturally its a few lines shorter but really thats not the point. The point is that the natural chunking that happens when we read a line is more optimal with this increased length. See the lexFracExp and lexExp at end.
So, the imperative, OO and assembly programmer specifies the series of actions, possibly with some data packaged into abstractions. The functional programmer makes a table:
scene pattern1 = result1
scene pattern2 = result2
etc
In fact maybe even a '3-D' table:
In short: functional programs are short! Or if you will, plump and sweet, others are thin and skinny with machine-code being the limiting 1-dimensional case – a sequence of bytes. To go the other direction ie from the 1-dimensional nature of assembly involves a reconsideration of hypertext.
Yeah I need better examples and more pictures
Also yeah fat is not fashionable, with women killing themselves with anorexia trying to get thin. Well we should then imagine ourselves in Leonardo's Italy wherein pretty = plump
Footnotes 1. IIRC the first linux I used ran with 4 MB ram. Should a modern distro take pains to make sure it will install with 4 MB ram?
2. Paul Graham uses unusually narrow columns and says that its easier to read – cant find that justification right now.  And there are some general directives. However I know of no link between column width for prose and for programs
3. Funny word that – 'Leverage' not sure whether to laugh at it or be annoyed
4. But most importantly: know when to be inconsistent – sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!
5. Actually circle is even better than square but lets leave that for now!
6. Bird and Wadler used the term 'concrete type' for Haskell's types – a telling term that is somewhat forgotten nowadays.
7. See the pseudocode examples here and here. And imperative programming explained here
8. Where block-structured is to be understood more hi-dimensionally than in traditional block-structured languages like C.
But wait! Is this question even meaningful without specifying which programming language?
There was a discussion on the python list regarding the maximum length for code lines. It started with the fact that Python's style guide – Pep 8 – recommends that code lines should never exceed 79 characters.
The History
The 80 column restriction comes from a time when computer monitors were universally 80x24. Are we suggesting that the two 80s are only coincidentally related? And if not why stick to a long past legacy? When modern monitors are- bigger
- wider form factor
- higher resolution
We need to deal with old 80 column devices argument is analogous to the: Dont move to unicode because of 'old ASCII devices' argument. Pragmatics demands carrying people along but also moving along.1
Then there is the other…
Human-Cognitive argument
This usually runs along the lines:Moore's law applies to electronics and hardware, not the wetware in our heads and in this case our eyes. When the former has grown exponentially the latter is the same.Yes, there seems to be some research that the best cognition is at narrow columns2; this research is for reading text not code. Its naive to believe text-cognition carries over unchanged to programming languages, given how different Haskell/Apl are from C++/Assembly. IOW Haskell, Java and assembly are almost certain to have differing optimal line-lengths. And the question of what it should be for python – where these reflections started – in all probability devolves onto the question of how we view python: OO? Functional? Scripting? Something-else? Those who are in the 80-col-camp are probably those who are in the 'executable-pseudocode' camp – which sounds really nice until we realize that 'pseudocode' and 'imperative paradigm' are defined mutually recursively with no base case7!
(Non) Co-relation with Ordinary Text
The reason why rules and guidelines for text cannot carry over to programs is that programs are read very differently from normal text – ie the to-n-fro scanning needed to read a program ranges from 3 times (if you are some kind of genius) to way over 10 (if you are like all of us).The proverbial terseness of functional programs comes at the cost of longer lines. Functional programs are not shorter; rather they are squatter!
Thinking further about it I realized its a more subtle issue than just long vs short; its more a question of multidimensionality. To understand this we need to see that much of the practice of software engineering consists in discovering and leveraging3
Complexity Cutters
to deal with the intrinsic complexity of the programming problem.The first and probably still most far-reaching such is the subroutine. If we remember when we first learnt it, we had to remember some arbitrary syntax, some arcane rules of parameter passing and types and so on and so forth.Why is all this a necessary evil? Because in the long run, its a complexity cutter: instead of thinking of a problem all at once, I can think of a 'summary' (interface) and the details (implementation) separately.
Other similar complexity cutters could be thought of:
- files and separate compilation
- modules and their linkage
Cartesian Geometry
Descartes' brilliant insight was that by superimposing a rectangular grid onto the space, geometry would be vastly simplified. Simplified because:- By numbering the corners in a certain way, geometry can be reduced to algebra
- By making the grid rectangular, moving along one dimension, does not disturb the others.
At the time of Descartes, it may not have been quite clear that such a superimposition was worth the headache. Today it has become a far more wide-reaching benchmark than just math or science – orthogonality has become synonymous with being analytical in all sorts of fields, with 'Cartesian' being almost synonymous with the principle:
High(er)-dimensionality is a complexity-cutter
Coming back the the question of the optimal line-length in programs we find that Descartes applies surprisingly literally.If the n tokens of a program are arranged in sequence, Ive to read a sequence that is n-long. If they are arranged in a square, the square has side √n, if in a cube then its side reduces to ∛n and so on.
So if the energy-output of program reading is our comprehension, and the work-done in that reading is the energy input, then the minimum energy would be wasted if the program were as block-structured8 as possible.
Can this be applied in toto? IOW can we try to envisage going away from the dictum:
Programs can be arbitrarily long but only 80 characters wide
to (something like)
Program comprehensibility increases
as the program maintains a square/cubical form
??
??
Are our natural visceral reactions to this out of limitations that are intrinsic or out of mere limitations of habit? For example our books can be thousands of pages in length, but each line only a few words long.
Because we dont know how to physically imagine a book that is say one-page long and thousands of characters wide, we transpose that same limitation onto our windowing software, with vertical scrolling being better supported than horizontal. And then that s/w limitation becomes a h/w limitation: Why does 'scroll-mouse' exclusively mean vertical scroll-mouse?
No I am not suggesting some new windowing hardware/software.  Just that we take a second look at restrictions like python's line length restriction.
A calculus result is that the rectangle with the maximum area (for a given fixed perimeter) is the square. IOW if we have a certain length of string to enclose an area, the maximum will be enclosed if we make the area square5. Perimeter to area maximization implies more square is better, more long is worse. And my conjecture is that area correlates with comprehension; perimeter with physical therefore cognitive effort.
In fact, if we take the naive sense that a square 'packs-in' more than a line; a cube more than a square and so on, we see one of the hallmarks of the more expressive programming paradigms: higher-dimensional code:
At the least most so-called high-level languages are 'structured' as compared to assembly which is linear. Furthermore most modern languages have notions of separate compilation for source files, packages etc which are really just other dimensions of structuring.
Instructions or Scenarios?
And that probably shows why for a functional programmer, other paradigms like imperative, OO and assembly are almost in the same bracket: all these ultimately write sequences of instructions whereas the FPer thinks not in terms of instructions but of scenario-tablesSmall example
From Don Sannela's intro to programming lectures:filter :: (a -> Bool) -> [a] -> [a]
filter p []                   = []
filter p (x:xs)  | p x        = x : filter p xs
                 | otherwise  = filter p xs 
good to see it in the original (pgs 20-25) and note how much more important the triple-lining-up is:- the filter
- the '|'
- the =
A Larger Example
is the lexer for Haskell in Haskell – lex – from the haskell standard prelude
Here it is cut-pasted verbatim from the standard prelude. Note particularly the lineup formatting. [Best read in gist in raw-mode]
And here is my preferred version – longest lines about 115 chars. Naturally its a few lines shorter but really thats not the point. The point is that the natural chunking that happens when we read a line is more optimal with this increased length. See the lexFracExp and lexExp at end.
So, the imperative, OO and assembly programmer specifies the series of actions, possibly with some data packaged into abstractions. The functional programmer makes a table:
scene pattern1 = result1
scene pattern2 = result2
etc
In fact maybe even a '3-D' table:
scene pattern1   |guard11   = resulting scenario11
                 |guard12   = resulting scenario12
scene pattern2   |guard21   = resulting scenario21
                 |guard22   = resulting scenario22
In short: functional programs are short! Or if you will, plump and sweet, others are thin and skinny with machine-code being the limiting 1-dimensional case – a sequence of bytes. To go the other direction ie from the 1-dimensional nature of assembly involves a reconsideration of hypertext.
Syntax? Semantics? Or Structure?
This whole argument is seemingly about syntax. In actual fact its a question of effectively mapping syntax to semantics. Pattern-matching is as close as we can get today to visual programming because patterns are almost pictures of the data.6Whole Brained Programming
FPs tabular nature of patterns+guards appeals to right-brain's need for pictures. FP's rewriting semantics is more comforting to the left brain than the value-in-a-box model of imperative programing because arguments about program-correctness can be much more syntactic. FP addresses both brains. Restricting line-length is not a mere syntax question. Its a question of constricting the pictorial right-brain to subscribe to the linear, logical left. Allowing a programmer to use a long line (when necessary) allows him the freedom to sketch a picture rather than only using sequenced instructions. Allowing a bunch of long lines (as in classic pattern-matching + guards code) is an encouragement to paint a picture rather than just spewing techno-verbiage.Yeah I need better examples and more pictures
Also yeah fat is not fashionable, with women killing themselves with anorexia trying to get thin. Well we should then imagine ourselves in Leonardo's Italy wherein pretty = plump
Footnotes 1. IIRC the first linux I used ran with 4 MB ram. Should a modern distro take pains to make sure it will install with 4 MB ram?
3. Funny word that – 'Leverage' not sure whether to laugh at it or be annoyed
4. But most importantly: know when to be inconsistent – sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!
5. Actually circle is even better than square but lets leave that for now!
6. Bird and Wadler used the term 'concrete type' for Haskell's types – a telling term that is somewhat forgotten nowadays.
7. See the pseudocode examples here and here. And imperative programming explained here
8. Where block-structured is to be understood more hi-dimensionally than in traditional block-structured languages like C.
 
It is kind of ironic that the blog format wraps your wide code examples at 65 chars.
ReplyDeleteHa Ha
DeleteWell that goes to underscore a couple of points:
1. The fixed 80 char width that was inviolable decades ago breaks today on both sides: it may be too low or too high!
2. I guess I dont know how to use blogger very well :-)
3. Are you viewing on a narrow device?
Just a suggestion, use a monospace font for the code blocks! Some things aren't aligning the way you say they are in the article because of the font.
ReplyDeleteOk Mike. The filter code has been monospaced. [The rest was moved to gist earlier. Hope thats it!!]
DeleteOn the larger scale however I wonder if this goes against my point, viz that functional programming is at its most beautiful when its most visual.
Monospaced font is not so pleasant or easy to read.