The End of Literate Programming

The end of the affair

At our final meeting for 2024, we completed reading Knuth’s Literate Programming. In the final section, Knuth considers “Retrospects and Prospects” for literate programming. In this section, Knuth is explicit about who literate programming is for: computer scientists and systems programmers, rather than hobbyists. This context justifies many of Knuth’s arguments throughout the essay, about the kinds of literacy assumed by the WEB system. But it also widens the main gap in his philosophy—the gap between the programmer and the reader. He anticipates that programs will become works of literature, which implies a wide readership, but he restricts WEB to a small subset of people, resulting in a restricted writership. In this way Knuth sharpens the literary aspect of his enterprise, for indeed literature too is written by the few for the many to consume.

With that, our first major reading for the group came to an end.

After the end

We spent some time looking at the upshot of Knuth’s new programming system. On his website, he has published many literate programs, in addition to publishing three books written in either WEB or CWEB, which contain the programs for TeX, Metafont and the MMIX virtual machine.

Literate programming has inspired many programming systems, but not in the manner Knuth proposed. He saw WEB as a system for highly skilled programmers to write complex software systems. But WEB (and its descendant CWEB) have not found much use in this domain. Instead, programming language designers have designed ever more capable documentation-generation systems, which allow software to be composed in a more conventional format, but with excellent computer-generated documentation. Python includes pydoc as part of its standard library, for example, while Rust ships with rustdoc. Such tools allow a developer to include documentation in their code, and generate attractive websites for their software. They do not support the creation of elegant books of the kind that Knuth prefers, and in particular, do not free the programmer from the syntax of their chosen programming language.

The Knuthian ideal of literate programming has caught on in a different community: data science. Statisticians, digital humanists, data analysts, lab scientists and others frequently use tools such as Emacs Org mode, R Markdown and Jupyter Notebooks to write their software. This form of literate programming is nonetheless distinct from Knuth’s. Knuth foresaw systems programmers building complex reusable systems using literate tools. Data scientists tend to write more ephemeral, simple programs, which analyse a particular dataset or form the basis for a particular article or report. When a data scientist does take the time to develop a more complex and reusable piece of software, they are more likely to do so in the form of a simple R, Python or Julia package. While it is possible to write such software in a more literate style using tools such as nbdev, this is not a common practice.

It is a pity that Knuth’s vision of programming-as-literature has not gone mainstream. Source code is the primary medium of communication for millions of people who work every day as programmers. They write software that affects all of us, and if this software were readable by the general public, then the systems that govern our lives could in principle be more open and democratic. Even an experienced programmer can find it difficult to find a reading path through a complex program. If essential pieces of software such as MediaWiki (i.e. Wikipedia), TensorFlow or Bluesky were written and published in a Knuthian style with a linear narrative, then more people might be inclined to read and debate the code.

What next?

We will reconvene in February 2025, as the summer recess draws to a close here in Australia. Stay tuned for our next reading, which will involve the source code of a Large Language Model. Till then, Happy Holidays!

Faulkner's Typewriter

In this session, we read sections L and M of “Literate Programming,” which consider the economics of WEB and its relationship to prior work.

Economics of a text editor

Where his earlier writing was produced by hand, the result of a careful and meticulous craftsmanship, in the 1950s Faulkner developed a messier, faster method that relied solely on a typewriter. (DiLeonardi 2024, p. 177)

How does a writing tool affect the writer? Knuth argues that WEB code is faster to write, because WEB imposes a salutary discipline on the programmer at the time of composition. It may take a little longer to write Version 0, but the code will be so much better that it will only be a short step to Version 1.

One member of the group drew a comparison with William Faulkner, whose practice changed significantly when he adopted the typewriter in the late 1950s. As DiLeonardi (2024) explains, Faulkner’s typescripts and manuscripts are quite different documents. His handwritten manuscripts are meticulously crafted. Faulkner claimed that he hadn’t changed a word of As I Lay Dying (1930) in the writing, but the flawlessness of the manuscript suggests a close process of real-time editing. His typescripts are different. He complained that it was “too easy” to put words on the page. They are full of typos, and the novels he wrote on a typewriter, such as The Town (1957), are looser and less crystalline than his handwritten books.

Knuth’s adoption of WEB had, in his own estimation, the opposite effect to Faulkner’s adoption of the typewriter. He claims that code he once wrote carelessly he now writes carefully. When he writes a WEB program, he is in “expository mode.” He imagines that he is explaining the code to someone else, and this mode of thought makes for better programming.

Knuth is trying to advocate WEB as a practical programming tool, and couches his arguments in terms an IBM executive might understand: will this tool reduce labour costs? But in our group we observed a different side to the argument. In his discussion of the “expository mode,” Knuth once again undermines his idea about WEB programming as a “stream of consciousness.” The expositor does not simply let their consciousness stream. They shape that stream for the benefit of a student or a reader. Although he doesn’t clearly theorise their role, Knuth is aware of the importance of the reader, and of the rhetorical situation of the programmer.

He does want readers. The section ends with a delightful expression of Knuth’s burning desire to promulgate both his code and his coding tool.

The rhetorical function of acknowledgements

Section M is essentially an “acknowledgements” section, or an “awards acceptance speech,” as two wits of the group put it. What is it doing in the article?

Partly the acknowledgement of prior work justifies his own undertaking. He makes the radical claim that “it is worthwhile to consider every program as a work of literature,” and wants to establish precedents.

Every program? We discussed this with some astonishment. In our own fields—literature and art history—there is habitually some distinction between what is literary or artistic and what is not. A bureaucratic form is certainly writing, but doubtfully literature. Painting a door is certainly painting, but probably not art. Should every program be seen as a piece of literature?

There are two problems with Knuth’s claim, one obvious, the other more subtle.

The obvious problem is that many programs are boring, technical or uninteresting. There are programs that resemble shopping lists and bureaucratic forms, as well as programs that resemble haikus or essays. Perhaps Knuth really is advocating a Dadaist aesthetic, and wants to claim that even the humblest little utility program can be dignified as literature, if only it is written and documented properly in WEB.

The more subtle problem is, again, readership. Another class of things we usually exclude from “art” or “literature” are creative works that are intended for coterie audiences. Patents, blueprints, scientific articles—these things are usually excluded from the art-world or the literary sphere. Art is for the “public.” Literature is for the “general reader.” Can literate programs be so? Perhaps Knuth is advocating an esoteric literature. It may appeal to a small audience, but it has the same aesthetic values and careful creative purpose as literature intended for the public.

It seems more likely that “every” is hyperbole, or simply not a claim that Knuth carefully considered before he made it. But it is tempting to view Knuth as a kind of William Blake of programming, simultaneously furious in his esotericism, unorthodox in his standards of beauty, and utterly committed to his crazy craft as a form of expression that reaches out widely into society and lengthily into the future.

Next week, we conclude our reading of “Literate Programming.” And move on to something new …

References

DiLeonardi, Sean. “Mediation, Stream of Consciousness, and the Faulknerian Voice: As I Lay Dying to The Town.” Twentieth-Century Literature 70, no. 2 (June 1, 2024): 173–98. doi:10.1215/0041462X-11205357.

Psychological correctness

In this meeting, we read the crucial section of “Literate Programming,” J, in which Knuth lays out his theory of ‘programs as webs’. This section justifies WEB in the most general possible terms: all programs are already webs, Knuth argues. All WEB does is allow the programmer to clearly express the structure that is already there.

Knuth uses the ‘web’ metaphor in an interesting way. Those of us who are used to the Internet and the World Wide Web may think of a ‘web’ as an inherently dynamic or unstable system. We browse or surf the web. We hop from page to page, from app to app, using hyperlinks. We summon up fragments of the web using search engines, or allow recommendation algorithms to summon up fragments for us as we scroll the feeds of our favourite platforms.

This is not what Knuth means by web. For Knuth, a web is a static, well-ordered structure, like the delicately woven web of a spider, or a narrative tapestry whose threads are chronological. A web is for reading from start to finish. A web has a finite set of components, which have been joined carefully by the weaver of the web. Of course, you may use an index to jump to particular joins on the web. You may use cross references to travel along particular strands. But the web itself is single and entire, with a beginning, middle and end.

Structures and structures

A hierarchical structure is present, but the most important thing about a program is its structural relationships. (p. 107)

Knuth argues WEB accommodates both ‘top-down’ and ‘bottom-up’ programming, or rather, it transcends these two approaches. The WEB programmer can start with a top-level description of a program, or they can start by defining subroutines, or they can mix both freely. This freedom to decide between top and bottom at will frees the programmer from the ‘hierarchy’ of the program. Of course, in the end, ‘[a] hierarchical structure is present’: a program must be a single object the computer can execute, comprising smaller parts that lie within it. But the WEB approach allows the programmer to reveal the ‘structural relationships’ of the program: the logical and intellectual links between different parts of the program.

For example, perhaps some global variables are manipulated by subroutine X, and others are manipulated by subroutine Y. From a hierarchical perspective, each global variable and each subroutine is a separate part of the program, on the same level, while the code inside each subroutine is at the next level down, nested within the subroutine. Using WEB, however, the programmer can explicitly reveal the relationships between the variables and the subroutines, for example by declaring the variables next to the subroutines that matter to them, or by building up the subroutines in parts that are clearly related to other global aspects of the program.

There is an interesting slippage in Knuth’s argument. There is the ‘hierarchical structure’ on the one hand, and the ‘structural relationships’ on the other. Both of these are structur(e|al). What makes them different? How are they related?

Knuth implies that there is no single description of a program that is the right one. Programs have many parts, which combine to form the entire program. These parts have many possible relationships: the orderly hierarchy of their execution by the machine is only one set of relationships. The human reader of a program may observe many other sets of relationships in the program that matter to them.

We could think of this in practical terms. A human might use a profiler, observing how and when different parts of the program are called in practice. They might use a flowchart tool to visualise the control flow. They might write out mathematical theorems that characterise the invariants of parts of the program. They might observe the way that the program models the problem domain, the user, the machine itself. There are (possibly) infinitely many ‘structures’ in a program. Knuth’s aim with WEB is to let the programmer structure their program in whatever way will maximise human comprehension of the code.

Psychological correctness vs. [personal] style

Knuth’s theory of coding style is simultaneously aesthetic, cognitive and functional. Code written in WEB should be aesthetically pleasing, according to literary criteria; it should be easily comprehended (or cognised); and it should function correctly.

These three aims don’t always go together, according to Knuth. He gives an example on page 108. Imagine a programmer is writing a function that does a simple data update, but it needs to check the user input for errors. If the programming language obliges the programmer to put the error-checking code first, then they may feel the urge to shrink the error-checking. The error-checking code is tangential to the function: what really matters is the code at the end, which actually performs the data update for which the function is being written. If there are dozens of lines of error-checking code, which make up virtually the whole function, the programmer may find the function aesthetically repulsive. It would be like designing a pencil with a grip so enormous and contorted that you can no longer clearly see the barrel and tip of the pencil itself. In this case, aesthetics pulls against both cognition and functionality. To make the function seem less ugly, the programmer will try to write the error-checking code as concisely as possible, which may mean it is terse and difficult to understand. They will also be tempted to omit error checks, potentially impairing the functionality of the code.

Knuth demonstrates how WEB resolves the contradiction between aesthetics, cognition and functionality. By giving the programmer complete control over the presentation of the code, and the ability to add labels or commentaries to any part of it freely, WEB allows the programmer to achieve any functionality they like without compromising on either aesthetics or cognition.
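For readers who would like to see the mechanism rather than take it on trust, here is a minimal sketch in Python of the kind of reordering described above. It is not WEB and it is not Knuth's TANGLE: the `<<name>>` reference notation, the `CHUNKS` dictionary, the `tangle` function and the little `update` routine are all hypothetical, invented for illustration. The point is only that the author can present the "real" work of the function first and the error checks later, under their own label, and still hand the machine a conventionally ordered program.

```python
import re

# Hypothetical named chunks, in the order an author might prefer to explain them.
# The main chunk shows the data update; the checks live in a chunk of their own.
CHUNKS = {
    "update the record": (
        "def update(table, key, value):\n"
        "    <<check the input>>\n"
        "    table[key] = value\n"
    ),
    "check the input": (
        "if not isinstance(key, str):\n"
        "    raise TypeError('key must be a string')\n"
        "if key not in table:\n"
        "    raise KeyError(key)\n"
    ),
}

# A line consisting only of a chunk reference, possibly indented.
REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def tangle(name, chunks, indent=""):
    """Expand the chunk `name` into a list of source lines, in machine order."""
    lines = []
    for line in chunks[name].splitlines():
        match = REF.match(line)
        if match:
            # Splice in the referenced chunk, keeping the reference's indentation.
            lines.extend(tangle(match.group(2), chunks, indent + match.group(1)))
        else:
            lines.append(indent + line)
    return lines


source = "\n".join(tangle("update the record", CHUNKS)) + "\n"
print(source)
```

In the printed result the checks sit at the top of the function body, where the language wants them, even though the author wrote them second. However long the checking chunk grows, the main chunk stays three lines long, so the "barrel and tip of the pencil" remain visible.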

There is a tight link, and nonetheless a tension between aesthetics and cognition in Knuth’s theory. Knuth argues that the best way to present a program is in the “psychologically correct” order. But he also argues that programmers can and should develop a personal “style” of programming. If there is a “correct” way to present the program, how is there room for individual “style”?

Knuth’s theory of “psychological correctness” is highly individualised. He argues that a program should represent the programmer’s “stream of consciousness” (p. 107)—that is, the program should be written in the order that the programmer conceived of it. He insists throughout the essay that in his own experience, he only ever envisages a program in one order. There is an order in which the program occurred to him, and this is the order in which it must be written. He argues that when he reads another programmer’s code, he can understand their stream of consciousness easily: the program he presents on pages 98-102 of the article is actually not Knuth’s own stream of consciousness, but Edsger Dijkstra’s.

There is a commonsense aspect to this. If the programmer builds up the program logically, then they can communicate this logical process to the reader, who will hopefully find it easier to comprehend what is going on. Knuth does occasionally modify his theory, admitting that the programmer should not simply regurgitate their actual “stream of consciousness,” but shape the program text with the reader in mind.

But Knuth nonetheless presents the idea of “psychological correctness” in such a stark way that its implications are thrilling and extreme. Is it true that every program Knuth writes appears to him in exactly one way? Is this a universal experience of programming? We felt in the group that perhaps Knuth is not accounting for his own extreme level of skill and learning—most programmers probably fumble around, and need to experiment, much more than this most famous computer programmer needs to when he writes software. Is it true that we can understand one another’s thought-processes so easily? Many in our group found the presentation of the primes program on pages 98-102 extremely difficult to follow. The program makes many assumptions about the prior knowledge and discursive competence of its readers. Does Knuth believe that there is a single programming literacy that all programmers share, such that the reader of any program can be assumed to be the same kind of person with the same kind of consciousness?

There is something deeply Kantian about Knuth’s views. He seems to believe in a universal rationality, which extends to the task of aesthetic judgment, and which links cognition to the feeling of beauty. As an unreconstructed Romantic, I find this point of view to be very attractive, even if our experience in this very group demonstrates (for the millionth time) that rationality is more contingent and culturally determined than Kantians may like to admit.

We recommence next time partway through section K, on page 108.

Simplicity and Neglect

There have been some interruptions to our group—and the blog—but we are back today. We read slowly through sections H and I of Knuth’s “Literate Programming,” which contained some suggestive clues to Knuth’s aesthetic philosophy.

Simplicity: for whom?

Simplicity is Knuth’s justification for one-parameter macros:

Again, I did this in the interests of simplicity, because I noticed that most applications of multiple parameters could in fact be reduced to the one-parameter case. (p. 104)

We discussed this argument for some time, because there is a fascinating lacuna in Knuth’s argument: simplicity for whom? Knuth could mean, simplicity for the implementor of WEB (namely, for Donald Ervin Knuth), or he could mean, simplicity for the user of WEB. Knuth’s example doesn’t clarify the situation. He presents this example of a two-parameter macro:

mac(#1, #2) == m[#1*r+#2]

and shows how it could be rewritten using only one-parameter macros like so:

mac_tail(#) == #]
mac(#) == m[#*r+mac_tail

At first blush, the rewritten version looks considerably more complicated. It takes more lines of code. It reverses the order of the [] symbols, requiring the reader to put them back in the correct order by mentally replacing mac_tail with #]. Most of us in the group recoiled at first from Knuth’s example. Why not allow two-parameter macros, and the apparently more elegant initial example?

On close inspection, some points in favour of Knuth’s decision emerged. One of the key simplifications of the rewritten version is that the parameters (#) no longer need to be numbered. This is presumably much easier for TANGLE to handle, and therefore easier for Knuth to implement. It also eliminates certain possible errors, such as inconsistent labelling, or too many parameters being passed to the macro, or too few.

The main criterion of ‘simplicity’ for Knuth seems to be: paucity of primitive elements and means of combination. The one-parameter macro comprises one macro label (e.g. mac_tail) and one parameter (#). There is a simple substitution: replace every # on the right-hand side with the passed value of #. The two-parameter macro introduces an additional primitive element: the number of the parameter (e.g. the 2 in #2). It also introduces an additional substitution rule: first match the parameters on the left and right hand sides using their numbers, then proceed with the substitution.
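The single substitution rule is simple enough to sketch in a few lines of Python. What follows is a toy, not a description of how TANGLE actually works: the `MACROS` table and the `expand` function are hypothetical, and the call form `mac(i)(j)` is my assumption about how the two macros quoted above would be chained together. The only rule implemented is the one just described: replace every # in the macro body with the argument passed.

```python
import re

# The two one-parameter macros quoted above: name -> body, '#' marks the parameter.
MACROS = {
    "mac_tail": "#]",
    "mac": "m[#*r+mac_tail",
}


def expand(text, macros):
    """Repeatedly apply one-parameter macro substitution until nothing changes."""
    # Try longer names first so that 'mac_tail' is never mistaken for 'mac'.
    names = "|".join(re.escape(n) for n in sorted(macros, key=len, reverse=True))
    call = re.compile(r"\b(" + names + r")\(([^()]*)\)")

    def substitute(match):
        # The whole rule: every '#' in the body becomes the argument.
        return macros[match.group(1)].replace("#", match.group(2))

    while True:
        expanded = call.sub(substitute, text, count=1)
        if expanded == text:
            return text
        text = expanded


print(expand("mac(i)(j)", MACROS))  # prints m[i*r+j]
```

The expansion takes two steps. First `mac(i)` becomes `m[i*r+mac_tail`, which leaves `mac_tail(j)` sitting in the text; then that becomes `j]`, and the brackets close. There are no parameter numbers to match up and no argument counts to check, which is the paucity of primitive elements that Knuth seems to be after.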

Programmers often want to ‘reason’ about their programs. This is probably the activity Knuth hopes to support by allowing only macros of one parameter. It is easier for me to see what a macro will do, because it is so constrained in what it can do. I can construct a more complex macro by combining several macros together. While this may seem complex at first, the advantage is that each individual piece is extremely simple, and I can understand it perfectly with little effort.

In this way, there is a happy marriage between the implementor and the user, both of whom benefit from the same kind of simplicity. How often do programmers and computer scientists aim for this kind of happy marriage? Is the user’s simplicity always reducible to the implementor’s?

Of course in Knuth’s case, the users and implementors were often the same people. As he explains in section I, on “Portability,” installing WEB was no easy matter in the 1980s. This importability¹ extended to programs written in WEB, such as TeX. In the days when software was distributed as source code, the user often had to modify the code in order to get the software working. Ease of implementation and ease of use are hard to distinguish in such a context.

We spoke for some time about the culture of programming in the 1980s. All this fuss about macros seems otiose to digital humanists raised on dynamically typed interpreted languages like R, Python and JavaScript. There is simply no need for all this mucking about! There are two tracks through Knuth’s arguments that are difficult to disentangle:

  1. Knuth’s attempt to overcome the particular limitations of PASCAL, which requires the programmer to write their program in a certain order using certain sections.
  2. Knuth’s attempt to devise a new form of writing, of general application, which will make software more enjoyable to read and write.

In the context of (1), the discussion of macros is necessary. As Knuth himself demonstrates, some programming tasks are basically just torture in PASCAL without WEB’s macros. In the context of (2), however, the discussion seems to wander down the garden path. Is this detail necessary for me to understand the acts of WEAVING or TANGLING code?

Neglect: bootstrapping the authorial persona

The WEB system caters to system-dependent changes in a simple but surprisingly effective way that I neglected to mention when I listed its other features. (p. 106)

The word ‘neglected’ evoked a range of responses in the group. There is an obvious fictionality to it. Knuth could of course edit his essay, and introduce this feature earlier on. His essay is obviously artful, and to ascribe the ordering of its contents to ‘neglect’ is misleading—on the literal level.

But of course Knuth and his readers are well aware of the fictionality of essayistic neglect. The word indicates two aspects of Knuth’s writing:

  1. His cultivation of Socratic humility.
  2. His belief in the linearity of text.

Knuth’s humility is an expression of mastery. He can afford to ‘neglect’ a topic, because he knows when is the right time to introduce it. His writing is replete with such self-effacement. It is easy to imagine his smiling presence in the classroom, as he gently introduces students to theorems that he ‘neglected’ to tell them earlier on. I personally like this authorial persona—as I like its obvious antecedent, the Socrates of Plato’s dialogues. But perhaps his ingratiating humility and pedantic specificity are not to all readers’ tastes.

Knuth’s belief in linearity expresses his deeper ideas about literacy. For Knuth, the model reader is one who reads a text from the start to the end. The whole WEB system is designed to make this possible for software generally. And indeed, how nice it would be if all programs did have a single reading path, so that a new programmer could take a guided tour of the software before they start hacking on it. Knuth is essentially a teacher, and he views the writer’s task as pedagogical. Take the reader through the content in the order that it makes sense to human cognition. Avoid complexity and digressions. ‘Neglect’ what is not necessary to explain until the opportune moment to introduce it.

In this way, Knuth’s humility and linearity converge in a common ideal. Both his humility and linearity serve to make the text transparent and open. There is nothing hidden from the reader, at least not intentionally.

There may also be a sly, and thoroughly computational, humour behind Knuth’s avuncular language. As one member of the group observed, Knuth himself knows that he hasn’t really ‘neglected’ the topic of change files. Of course he had to neglect it! The contents of the essay have to be in some order! It’s not possible to write the entire essay in the first sentence! This ‘neglect’ represents the bootstrapping of the authorial persona. The author writes himself into existence. Knuth didn’t ‘neglect’ the topic of change files until he introduced the topic of change files on page 106. Like TANGLE.WEB, Knuth’s authorial persona is self-hosted.

We resume next time at section J. Programs as Webs, where Knuth’s implicit poststructuralism becomes explicit.

  1. I wrote this word as a nominalisation of ‘importable’, by analogy with ‘portability’. Then I had second thoughts. The OED confirms that ‘importability’ is not attested in the sense of ‘the quality of lacking portability’. Treat this as a catachresis, solecism or evidence of usage as you please.

The loop is broken

The inner loop

In this session we concluded the central section of the central section of “Literate Programming”. In sections 22-26 of the “woven output,” Knuth presents the “inner loop” of his program to print the first 1000 primes. The inner loop checks each candidate number $j$ to see if it is prime. It is the kernel of the program, the part that consumes the most computation time, and which performs the function closest to the program’s ultimate goal. And it does it all without performing a single multiplication or division…

We noticed again some common tics in Knuth’s rhetoric. Once again we encounter “the remaining task” (which we had already met in section 11). Once again the program is “quite simple” and “straightforward” (as have been most parts of the program). Knuth’s program text unrolls like a function being optimised. It relentlessly converges on its solution, following a chain of logic whose links are joined each to each in an intuitive way. By this point in the program, the whole group were baffled by the mathematics. For this group of humanists, the refrain of straightforwardness had become rather humorous.

For my part, I find Knuth’s authorial persona amusing, his program elegant, and his presentation of it poetic—to the extent I was able to understand any of them. But others in the group found his authorial persona “judgey,” and his presentation of the program intolerably taxing on the powers of memory. What do all these auxiliary variables mean again? What exactly is the structure of that table? How are those different numbers combined? Who exactly was Eratosthenes, and what does his “sieve” have to do with any of this?

Way back in section 3, Knuth informed us that the program should be “reliable, well motivated, and reasonably fast.” We were all amused by the program’s final motivation:

Let’s suppose that division is very slow or nonexistent on our machine. We want to detect nonprime odd numbers, which are odd multiples of the set of primes $\set{p_2, …, p_{ord}}$.

We have discussed at many points the exemplary nature of the program. Knuth does not intend to provide either a useful program listing (e.g. a prime number solver for use in production), or an exercise in programming style (e.g. an example of structured programming). He intends to provide a well-documented program using WEB, and one gets the impression at many points that the program has been made needlessly complicated simply to justify the need for WEB’s features. In this case, the computer’s lack of division requires explanation, which requires documentation, which requires—WEB.
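For those of us who lost the thread of the mathematics, the idea in the quoted passage can be sketched compactly in Python. This is my own reconstruction of the technique, not a transcription of Knuth's Pascal, and the names (`first_primes`, `mult`, `ord_`, `square`, `j_prime`) are only loosely borrowed from the vocabulary of the woven output. Instead of dividing a candidate $j$ by each prime, the sketch keeps a running odd multiple of each prime and nudges it upward by addition until it reaches or passes $j$. One assumption to flag: this sketch squares a prime with a single multiplication each time it brings a new prime into use, while the candidate test itself uses only addition and comparison.

```python
def first_primes(count):
    """Return the first `count` primes; candidates are tested without division."""
    primes = [2]
    mult = [0]    # mult[n] holds an odd multiple of primes[n]; slot 0 (for 2) is unused
    ord_ = 1      # primes[1:ord_] are the odd primes currently used for sieving
    square = 9    # square of primes[ord_]; smaller candidates do not need that prime
    j = 1         # current odd candidate
    while len(primes) < count:
        j += 2
        if j == square:
            # j is the square of primes[ord_]: start tracking that prime's multiples.
            mult[ord_] = j
            ord_ += 1
            square = primes[ord_] * primes[ord_]
        j_prime = True
        for n in range(1, ord_):
            # Advance the odd multiple of primes[n] by 2 * primes[n], using addition.
            while mult[n] < j:
                mult[n] += primes[n] + primes[n]
            if mult[n] == j:
                j_prime = False
                break
        if j_prime:
            primes.append(j)
            mult.append(0)
    return primes[:count]


print(first_primes(10))  # prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Seen this way, the "auxiliary" quantities are bookkeeping: each entry of `mult` is a bookmark in one prime's times table, and a candidate is composite exactly when it lands on one of the bookmarks.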

A maze of strands

We took a moment, upon completing the program itself, to reflect on it as a literary work. Most in the group agreed that the WEB program has a remarkably “tangled” structure—to misuse Knuth’s own metaphor. You could say that the program is deeply intratextual. It constantly refers to itself. No part of the program can very easily be detached, and viewed independently on its own. To understand any part, it is necessary to recall the whole structure of variables and constants that govern its behaviour. The text of the explanation typically refers to code that has not been presented yet. To understand the “motivation” for each programming decision, you need to already understand the program, or at least programs like it. The text is topsy-turvy, round-about, splayed-across and lickety-split. I personally enjoyed the ride, but well-oiled labyrinths are not for everyone!

Upon reading section 27, the index, some of the group were dismayed. (Not really.) If only we had kept the index open the entire time, it might have been possible to keep track of all the quantities and arrays that comprise the program code! But as one member of the group observed, Knuth’s preference for mathematical symbolism makes the index unreadable on its own. What do $c$ and $cc$ mean? You already need an intimate knowledge of the program to understand that they mean “current column” and “columns per page.”

Another in the group observed the difficult role that time plays in “the plot of the code.” The program text has its own narrative time, marked out sequentially by the section numbers. It also refers to the serial time of the computation, and to the “parallel time” of certain “auxiliary variables” which evolve in lockstep with the main variables of the program. On top of this, the reader is constantly aware of the way the program loops back on itself, providing a structure for WEB to reorder all the code as the PASCAL compiler demands. And then there is the overall evolution of the algorithm as a process when the program is run. Knuth’s program has the complex temporality of a Virginia Woolf novel, and a Woolfian quality of internal reflection. Perhaps for someone whose daily life is number theory, the program would also have those Woolfian qualities of sacredness and care that we were, for the most part, unable to draw from the text.

We also discussed the personality of the text. Knuth argues from the start that “[p]rogramming is a very personal activity,” and suggests that the programmer’s personality should shine through the code. If you can see the person behind the code, you can understand the reasoning behind the code, and therefore can understand the code itself. One member of the group suggested that this contradicts the common ideal of “egoless programming,” which is important for engineers working on complex projects in large teams over many years. How “egotistical” is Knuth’s idea of “personality”? Knuth himself is a master-craftsman. Perhaps for him there is no link between personality and ego, for the master-craftsman loses herself in the craft, and can bear any criticism of her work so long as it is in the cause of programming elegance. But perhaps ego and personality cannot so easily be separated.

We recommence next week in the fourth paragraph of D. How the example was specified.