Why do Programming Languages Succeed?
December 20, 2014
There have been thousands of programming languages of various scopes created since the advent of computers. Of these, two kinds are worthy of special attention: those that are very popular and those that are very original. Perhaps not remarkably, both groups are quite small. A few languages probably belong in both categories.
Measuring originality, on the other hand, is complicated by the fact that novel ideas are often only recognized as significant in hindsight. For example, while Simula predates Smalltalk, which in turn predates Self, it's debatable which of these languages best embodies the unique constellation of ideas surrounding object-oriented programming. Unsurprisingly, as the low-hanging fruit has mostly been picked, many of the most novel languages are also relatively old.
Granting some subjectivity on originality, and some guesswork on popularity due to the lack of historical statistics, below are my lists. Those in both categories are starred. The original languages are tagged with a description of what makes them special. Some of the original languages share the title when there isn't a clear exemplar. Also, I don't mean to suggest that un-starred unique languages haven't achieved some degree of popularity. Often they have loyal followings, are taught in advanced institutions, or have been used for something big, but generally they aren't used widely for day-to-day development. The cutoff for being popular enough is similarly squishy. If I've left off your favorite, I apologise.
- BASIC / Visual Basic / VBscript / ASP
- C *
- Fortran *
- Hypercard *
- Matlab / R
- Objective C
- Shell / Bash
- SQL *
- Algol - Structured Programming
- APL - Everything is a Tensor
- Fortran * - High Level Languages
- Forth - RPN, Compile Time Execution, Everything a Word
- Haskell - Pattern Matching, Monads
- HyperCard * - Code bound to widgets
- Inform7 - Strong Literate Programming
- Lisp - Data as Code, Functional Programming
- PL/I / C * - Pointers and Pointer Arithmetic
- Prolog - Logic Programming
- Simula / Smalltalk / Self - Object Oriented Programming
- SQL * - Relational Databases
Popular languages share some common themes surrounding why they became popular. Broadly these themes are: Easy to Embed, Easy to Implement, Scoped for Teaching, Powerful or Unique Libraries, Powerful Built-In Types / Operators, Endorsement of a Powerful Entity / Gateway to a Platform, Intertwined with the Success of UNIX.
Easy to Embed
Ease of embedding seems to help spread languages by creating a bunch of niche contexts in which learning the embedded language becomes the only way to use a tool. Lua and Tcl, as well as Python in its early days, found their way into game engines, graphics tools, web servers, and window managers as configuration / extension languages. BASIC, or a subset of it, is a regular feature in word processor and spreadsheet expression and scripting languages.
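To make the embedding idea concrete, here is a minimal sketch of a host tool that hands a tiny API to a user-supplied script. Everything in it is hypothetical: `HostApp`, `set_title`, and `bind_key` are names made up for illustration, and Python's own `exec` stands in for what would really be an embedded Lua, Tcl, or BASIC interpreter.

```python
# Hypothetical host application that embeds a scripting language for
# configuration. HostApp, set_title and bind_key are invented names.

class HostApp:
    def __init__(self):
        self.title = "untitled"
        self.bindings = {}

    def set_title(self, text):
        self.title = text

    def bind_key(self, key, action):
        self.bindings[key] = action

    def run_config(self, source):
        # Expose only a small host API to the embedded script.
        # Note: exec() is NOT a security sandbox; illustration only.
        api = {"set_title": self.set_title, "bind_key": self.bind_key}
        exec(source, api)


# The "embedded language" as the tool's user sees it.
CONFIG = """
set_title("My Editor")
for key, action in [("F1", "help"), ("F5", "reload")]:
    bind_key(key, action)
"""

app = HostApp()
app.run_config(CONFIG)
print(app.title)
print(app.bindings)
```

The point is that the user of the tool only ever sees the handful of functions the host exposes, so learning the embedded language is the price of admission for customizing the tool.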
Easy to Implement
Ease of implementation played a significant role in language popularity, particularly during the early microcomputer era. A small memory footprint, with implementations often small enough to be printed in a magazine, allowed BASIC to spread to most early systems. Similarly, Pascal's small grammar allowed compact implementations for very early systems, as well as highly performant implementations like Turbo Pascal.
Scoped for Teaching
Being scoped for teaching helps spread languages by shaping the preferences of newly minted programmers. Teaching material for BASIC and Logo was pervasive during the early 1980s push to teach kids to code. Pascal took over once unstructured programming had fallen out of fashion. Java and later Python made fast inroads in education by providing syntactically small but contemporary object-oriented languages that include key CS teaching features like built-in concurrency. Each has a relatively definitive dialect or sub-dialect teachable in the span of a single course. Arguably, standardization plays a central role, as very popular languages like C++ often find use in education only when coupled with an extensive cross-course library. Also, it's notable that teachability often popularizes only a subset of a language. Logo's capabilities beyond turtle graphics were rarely taught, and only a small portion of the Python and Java libraries is commonly explored in an educational setting.
Powerful or Unique Libraries
- Java - Large standard library
- Python - Large standard library
- Ruby - Rails
- Tcl/Tk - Tk!
A useful library can make a language. The brief popularity of Tcl was almost entirely fueled by the Tk graphics library. Rails made Ruby. Python and Java benefit from huge standard libraries as well as a cornucopia of freely available additional libraries. Libraries that are cross-platform seem to play an oversized role, as people will often tolerate a new way of programming if it gives them reach. Being the only game in town seems to matter. Tcl's popularity rapidly deflated once Tk and equivalents became available for other languages. Many relatively unpopular languages now have a range of libraries that once would have been considered large, but which come up short in light of the number available for currently popular languages.
Powerful Built-in Types / Operators
- BASIC - Strings
- COBOL - Structures / Records
- Fortran - Complex Numbers + Arrays
- Hypercard + Visual BASIC - GUI built-in
- Java - Concurrency
- Logo - Turtle Graphics
- Matlab / R - Matrix Operations
- Objective C + C++ - Objects
- Perl - Regular Expressions
- PHP - CGI out of the box
- Shell / Bash - Pipes + Redirection
- SQL - Relational Algebra
Being able to express a particular domain succinctly seems to matter. Regular expressions were the killer feature for Perl, so much so that Perl-style regexes pervade other languages and tools. Decent string operations helped early BASIC. Standardized graphics primitives made Logo. Fortran was made by arrays and complex numbers. Standard memory and disk structures were why people wanted to use COBOL. Shell scripts gave users access to the power of pipes and redirection. HTML inline with code in PHP helped web designers step incrementally from static pages to dynamic ones. Hypercard let non-programmers try their hand at UI design. The success of relational databases and SQL was intertwined. Concurrency out of the box helped Java.
In some cases, popular languages spread the features of the unique languages outside their narrow communities:
- Hypercard → Visual BASIC - GUI built-in
- APL → Matlab / R - Matrices / Tensors
- Simula / Smalltalk / Self → Objective C + C++ - Objects
Building things into the core syntax of a language seems to matter. This is likely both because it allows for syntactic sugar (reducing noise), and because it means there is a definitive way to express something. Reducing an idea to the point it has a good notation is usually an indication the idea is ready for dissemination. Perl's regexes, Matlab's matrices, Python's list comprehensions, BASIC's strings, Fortran's complex numbers, and COBOL's structures all have in common that they represent the point when an idea found a notation good enough that you'd write it on a blackboard.
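As a small illustration of what built-in notation buys you, the sketch below writes the same two computations twice in Python: once with the language's complex literals and list comprehensions, and once spelled out by hand. The helper `complex_mul` and the variable names are mine, chosen only for the example.

```python
# With built-in notation: complex literals and a list comprehension.
z = (1 + 2j) * (3 - 1j)
evens_squared = [n * n for n in range(10) if n % 2 == 0]

# The same ideas without dedicated notation.
# complex_mul is a hypothetical helper working on (real, imag) pairs.
def complex_mul(a, b):
    (ar, ai), (br, bi) = a, b
    return (ar * br - ai * bi, ar * bi + ai * br)

z_pair = complex_mul((1, 2), (3, -1))

evens_squared_loop = []
for n in range(10):
    if n % 2 == 0:
        evens_squared_loop.append(n * n)

print(z, z_pair)
print(evens_squared == evens_squared_loop)
```

Both halves compute the same values, but only the first reads like something you would write on a blackboard.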
Languages that don't build in these constructs often imitate them with an almost sad deference to their original source. Much of the evolution of C++ can be seen as a quest to allow Fortran style complex numbers, BASIC style strings, COBOL style serialization, and Matlab style matrices to be implemented as libraries. A host of languages embed a clone of Perl's regex syntax, but stubbornly embed it in a string, often at higher runtime cost, and usually with vexing escaping quirks.
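The escaping quirks are easy to demonstrate. The sketch below uses Python's standard `re` module to match a literal backslash followed by word characters; the sample path is made up for the example. In an ordinary string literal every regex backslash has to be escaped a second time for the string syntax, and Python's raw strings only partly soften this.

```python
import re

# Goal: match a literal backslash followed by word characters.

# Ordinary string literal: each regex backslash is doubled again for
# the string syntax, so six backslashes yield the regex \\\w+
plain = re.compile("\\\\\\w+")

# Raw string literal: only the regex-level escaping remains.
raw = re.compile(r"\\\w+")

# Hypothetical sample input, written raw so "\t" is not read as a tab.
sample = r"C:\temp\logs"

print(plain.pattern == raw.pattern)
print(raw.findall(sample))
```

In a language with regex literals the pattern would be written once, in regex syntax alone, with no second layer of string escaping to reason about.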
Endorsement of Powerful Entity / Only Way to Program a Platform
- ActionScript - Flash
- Ada - Department of Defense
- BASIC / Visual Basic / VBscript / ASP - Microsoft
- C++ - Microsoft
- C# - Microsoft
- Fortran - IBM
- Objective C - Apple
Some languages benefit from a powerful sponsor. These languages are, or were at some point, the only way to program for a popular platform or audience, or the only "recommended" way. Though their users are captive, the languages aren't always terrible. Often languages in this category improve over time, if they started out bad, to prevent alternate contenders from stealing the reins. The effect of the sponsor is often amplified when a sizable user community develops around a language.
Intertwined with the Success of UNIX
- Shell / Bash
The success of UNIX is fairly unique in computing. Prior to UNIX, operating systems failed to definitively standardize I/O, process control, and interprocess communication. Often language-specific standards for I/O trumped OS ones. After UNIX, other operating systems have frequently mimicked its facilities, sometimes fixing their rough edges, but more often watering them down or making them more complicated. C is a fairly ideal language in which to implement the core parts of a UNIX system. Shell scripts combined with a collection of micro-applications form a domain-specific process control language that's just as much a part of UNIX as its standard syscalls. Thus C and shell scripts are intertwined with the success of UNIX and the UNIX model of an OS, making their success mutual.
What makes the "original" languages original is that they each introduced a new programming paradigm.
Conventional wisdom has played an oversized role in which of these languages' ideas have spread. A sizeable portion of developers hold the belief that we've progressed from low level, to high level (Fortran), to structured (Algol), to object oriented (Simula / Smalltalk / Self) programming. Many have also seen the wisdom of mixing in functional programming (Lisp) constructs when possible. Interestingly, with the exception of Fortran, none of the languages responsible for introducing the most widely accepted paradigms actually benefited from being first. Fortran's novelty in this regard is likely the result of its strong backing from IBM, coupled with it arguably evolving so radically in its early days that it became its own successor.
All of the starred languages (both popular and original) intriguingly appear in the "Powerful Built-in Types / Operators" category above. This suggests that devising the "right" notation for something can be so compelling that it makes your language. These languages, Fortran, C, Hypercard, and SQL, also could be said to have benefitted from being part of a larger movement: Fortran (scientific computing), C (UNIX), Hypercard (GUIs), SQL (standardization of relational databases).
By contrast, the languages that introduced a successful paradigm, but failed to capitalize on it, seem to suffer from a variety of weaknesses. APL's succinct notation, while appealing to strong mathematicians, was hampered by its steep learning curve in the hands of weak ones. The burden of a quirky character set probably also kept it out of enough hands that it could not hold on to its killer feature. Matlab succeeded in the same domain with a friendlier learning curve for novices. Algol, while the Ur-structured programming language, was notoriously difficult to implement, leaving the door open for languages like Pascal and C. Lisp's functional programming was conflated with the orthogonal idea of code as data, thus hampering its dissemination with a tedious syntax. Simula, Smalltalk, and Self suffered from being invented at moments in time when the runtime cost of dynamic dispatch was prohibitive on mainframes and microcomputers respectively.
Some paradigms exemplified by the special languages are arguably unrealized or aspirational. Lisp's notion of code as data seems profound, but is hard to apply in practice. Also, Lisp programs are often no more strictly functional than a typical Python or Perl program. Prolog's logic programming is really an attempt to apply a theorem solver to general purpose programming, but actual Prolog programs are written with implicit assumptions about evaluation, making them Lisp in disguise. Haskell's pattern matching and monads end up being used mostly as a vehicle to embed imperative programs. Inform7's natural language use is compelling in a subset of the adventure game domain, but falls short in COBOL-like ways outside it. Forth's compile time execution, which is also being experimented with in C++ via template meta-programming, has generally failed to be made convenient enough for everyday use by non-experts. Forth's potential for extreme refactoring (particularly into one-line definitions) is in practice used by a limited subset of very skilled users. The majority of Forth programs are not that different from C programs written with heavy use of statics / globals and limited use of types, locals, and parameter passing. In each case, the paradigm may yet lead somewhere, but arguably we haven't yet found the right notation or figured out how to actually program in the paradigm.
This implies an interesting test if you think you've got a cool new programming language paradigm:
Flagxor's Rockstar Test
A programming language paradigm is real only if code written by an average user of the language looks pretty close to code written by experts.
What didn't matter
It's also interesting to examine what didn't matter.
The ability to embed domain specific languages in Lisp and Forth seems to have been overwhelmed by the fragmentation resulting from everyone doing it differently. Forth and Lisp have multiple libraries for implementing things like regexes, numeric data, or object oriented behavior. While Forth and Lisp can often borrow notational styles from other languages with high fidelity, subtle incompatibilities and slow or inconclusive standardization undermine this flexibility in practice.
How to succeed in programming languages...
So what would it take today to popularize a language? You'll get a temporary boost if you make it easy to embed in other tools or come up with some unique or elegant libraries for it. Making it easy to implement might give you the benefit of competition, but as long as your implementation is portable and of good quality, it might not matter in this day and age. You could pitch it as a learning language or get a big corporate sponsor, but be careful not to make it too novel or you'll weed yourself out. Alternatively, if you want your language to actually be special, your best bet looks to be coming up with a really good notation to express some problem domain, and then baking it in. In all cases, be sure to have a standard library up to today's standards, and be sure to support structured, object oriented, and a dash of functional programming.
Plan A: (Ruby / Groovy / Dart / Java / Swift / Python / C++...)
Make your language a rationalization of existing practice, and knock off all the other cool kids. Beg, borrow, and steal libraries, and lock in corporate or scholastic backers.
Plan B: (Road less travelled...)
Come up with the perfect syntax for some ill-served domain. This is pretty much as hard as devising an "original" language. Next, bake the syntax into a language that keeps all the imperative programming features the world loves, with a syntax as close as possible to the mainstream. Then, do most of the other stuff from A.
Plan A has the disheartening drawback that you're not really inventing anything so much as making a mash-up. Plan B suffers from all the same hardships, with the added problem of coming up with a notation for something new. You also, of course, have the option of inventing a weird new trick language and letting someone else bring your secret sauce to the masses.
I confess I lack the even temperament and soul-crushing conformity required for Plan A. While Plan B sounds daunting, I actually find it somewhat hopeful. It usefully clarifies what you need to do to get there. Aside from possibly getting the backing of a Fortune 500 company or nation state, and all the work of actually implementing your language and the mountain of libraries you need to get the ball rolling, you're really just in the business of inventing an algebra for something that's not currently covered well. How hard could that be?