Right from the beginning, it was about power. He was a young mathematician at Harvard, helping the economist Leontief work on the input-output matrices for which he later won the Nobel Prize. The young mathematician needed a notation. Albertans are used to looking out for themselves. He devised his own. Later, still at Harvard, he taught with Fred Brooks the world’s first computer-science course. They wrote their own textbooks: Automatic Data Processing, the title of their course, and a book on the notation, A Programming Language.
This was computing the way mathematicians thought about it. Not engineers. The notation wasn’t about poking bytes into registers and incrementing counters. It was about flipping and mashing big matrices, fitting curves. A Programming Language was a small book, and he worked hard to shrink it, cutting everything that could be left out.
When he applied for tenure, Harvard said he hadn’t published enough.
He left the university to become an IBM Fellow. There his notation was used for the first-ever formal description of a computer operating system. And there he met implementors, who wanted to make his notation an actual programming language. While IBM’s first multi-user operating system wobbled slowly to its feet, Iverson Notation jumped into life and started providing personal computing to growing numbers of users – two decades before the personal computer appeared.
The young mathematician was Dr Kenneth E. Iverson. His team at IBM, and later at I.P. Sharp Associates in Toronto, made APL (as his notation became known) one of the most exciting languages of the 60s and 70s, a forerunner of functional programming.
It was all about power the second time around too. A protégé of Iverson, a former implementor at I.P. Sharp, was at Morgan Stanley in the 1980s. The investment bank’s computers struggled to digest the swelling streams of quotes and trades from financial exchanges. Arthur Whitney devised an APL subset stripped for speed and optimized for time series. Characteristically laconic (another Albertan), he called it A. Kdb+ and q are its descendants.
Iverson was a mathematician, not an engineer. The Iversonian languages (APL, J, A, k, q) embody a mathematician’s view of computation. Here is how Iverson spent his time.
- Writing and talking about and teaching the notation: 100%.
- Implementing: 0%.
With Whitney it’s the other way round. That has worked. His colleagues are steeped in the Iversonian languages. Whitney has not needed to promote, teach or champion. And the unequalled performance of his implementations has had customers supplying developers smart enough to catch on and follow along. Documentation? Alongside his famously terse notes on the language, senior colleagues began documenting q and kdb+ in a wiki.
Big Data has flooded out of financial technology and is spreading rapidly through commerce and technology. It presents Kx with a challenge. The wiki would not serve a new and larger generation of programmers needing to learn the language. We needed a new documentation site.
Working past the wiki
Phase One of the project was to edit and organise the wiki content. We chose MkDocs, a static-site generator, as a platform. That meant the source files could be Markdown files in plain view on GitHub. It was no longer a wiki, but users continue to contribute.
The move to MkDocs supported – required – tables of contents. That alone was a huge improvement. The wiki (frozen since January 2017) was well organised for reading q code. You could quickly find on it all the possible semantics for, say, the dollar sign. But the only way to know what tools q has for, say, handling strings was to read the entire wiki.
Loads of overloads
Much of the terseness of q comes from heavily overloaded symbols. Their semantics are fixed largely by the number of arguments to which they are applied. Reading q code requires a clear grasp of the syntax. Yet the documentation site had no account of it.
Delving into that with the Kx Tech team revealed a further obstacle. The terms used for describing the language were somewhat… fluid. Terms such as verb and ambivalent meant different things to different people. Repeatedly, conversations with implementors led to “This is hard to talk about…”
Raiding the inarticulate
SWEENEY: Well here again that don’t apply
But I’ve gotta use words when I talk to you.
But here’s what I was going to say.
— T.S. Eliot, “Fragment of an Agon”
The language had inherited terms from its ancestor languages. Not all were helpful for talking about q.
APL functions can have zero, one, or two arguments. Decades before Haskell adopted the term monad, APL functions were niladic, monadic, or dyadic.
Iverson followed Heaviside’s usage in calling higher-order functions operators, but by now usage has established operator for primitives such as + and *. In his second language, J, Iverson redubbed them adverbs, using English grammar – noun, verb, adverb – as a metaphor. That worked for a generation of Canadians well schooled in English grammar, but conveys nothing to many programmers now.
The terms in which q was described were making the language seem more difficult to learn than it really is. In Phase Two we set out to fix that.
An early principle was that common terms will bear their usual meaning. So + and * are operators. Functions take arguments. Functions that take one, two, or three arguments are respectively unary, binary, and ternary. So operators such as + are binary functions with infix syntax. Keywords such as “like” and “and” are functions; some have infix syntax. Although lambda originally meant an anonymous function, we relaxed that to mean any function, named or not, defined with function notation.
Shorn of the noun/verb/adverb metaphor, adverb had become particularly unhelpful. Candidate replacement terms were debated last year at Iverson College. A plenary meeting of the Kx Tech Team this year settled it: iterator. Iterators come in two kinds, now distinguished as maps and accumulators.
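For readers who do not yet know q, the difference between the two kinds of iterator can be sketched in Python. This is only an analogy using Python’s standard library, not q code and not how q implements its iterators; the list prices is made up for illustration.

```python
from functools import reduce
from itertools import accumulate
from operator import add

prices = [3, 1, 4, 1, 5]  # hypothetical data, for illustration only

# A map applies a function to each item independently of the others.
doubled = list(map(lambda x: 2 * x, prices))

# An accumulator feeds the result of each application into the next one.
running_totals = list(accumulate(prices, add))  # every intermediate result
total = reduce(add, prices)                     # only the final result

print(doubled)
print(running_totals)
print(total)
```

In a map the order of evaluation does not matter to the result; in an accumulator each step depends on the one before it.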
Certain usages reflect special aspects of q. Iterators are most commonly applied to functions. But they can also be applied to lists and dictionaries. We needed a term for an object – function, list or dictionary – that takes an argument or an index. We use value.
A function can be applied to a certain number of arguments. That number is sometimes known as its arity. A list or dictionary can be indexed by a certain number of, er, indexes. That number is sometimes known as its dimension. We take from J the term rank to apply to both concepts, arity and dimension.
The two terms value and rank enable short, clear definitions of the iterators that work for functions, lists, and dictionaries. Result!
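A rough Python sketch may help show why one pair of terms is enough. It is an analogy under stated assumptions, not q: the helpers apply_value, rank, and each are hypothetical names invented here, and treating a list’s dimension as its nesting depth is a simplification.

```python
import inspect

def apply_value(value, arg):
    """Apply a function to an argument, or index a list or dict with it."""
    if callable(value):
        return value(arg)
    return value[arg]

def rank(value):
    """Arity of a function, or nesting depth of a list or dict (simplified)."""
    if callable(value):
        return len(inspect.signature(value).parameters)
    depth = 0
    while isinstance(value, (list, dict)) and value:
        depth += 1
        # Descend into the first item to measure the next level of nesting.
        value = next(iter(value.values())) if isinstance(value, dict) else value[0]
    return depth

def each(value, args):
    """A map-style iterator: works on any value, not only on functions."""
    return [apply_value(value, a) for a in args]

square = lambda x: x * x              # a function
squares = [0, 1, 4, 9]                # a list holding the same mapping
lookup = {0: 0, 1: 1, 2: 4, 3: 9}     # a dictionary holding the same mapping

for v in (square, squares, lookup):
    print(rank(v), each(v, [1, 3]))
```

All three values have rank 1 and give the same result under each, which is the point of having one definition that covers functions, lists, and dictionaries.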
Now we can talk
With the vocabulary settled it was at last possible to finish the Reference. The revised Reference appeared in March 2019 as V2 of the documentation site. Every operator and keyword has its syntax and possible arguments described on a page with a short, predictable URL, such as https://code.kx.com/q/ref/aj.
The White Papers have been migrated to HTML, revised to be consistent with the vocabulary, and linked to the Reference.
The revised Reference should be a resource for educational material and help train a new generation of q programmers. It is better supported and more up to date and reliable than the wiki ever was.
Search
Dislodging experienced q programmers from familiar bookmarks on the wiki makes Search more important than ever. The new documentation site has better semantic markup than the wiki, and Google Search (GS) delivers better results for it. However, GS has been slow to forget the wiki pages it listed for so many years and this summer still served links that are now dead. We’ve worked to get these links out of Google Search results.
We’ve also been developing the documentation site’s own search tool, customized for q. For example, a search for the dollar symbol lists the table of its overloads, followed by the operator Reference pages for Tok, Cond, Cast, and so on.
The future
With the Reference finished, it’s also been possible – at last – to port the Q Idioms. This list of handy expressions has become the Q Phrasebook, the subject of a forthcoming blog post.
Expect to see more articles in the Knowledge Base that lean on the Reference.
If you need help linking internal documentation to code.kx.com, please ask the Librarian.
And we look forward to welcoming more programmers to the ranks of kdb+ users, confident that they have a documentation site they can rely on.
Stephen Taylor is the Kx Librarian; the views expressed here are his alone.