Choosing a Programming Language :: Jon Gjengset

source link: https://thesquareplanet.com/blog/choosing-a-programming-language/

Choosing a Programming Language (16 min. read)

Posted on Jun 6, 2016

One of the first de­ci­sions one has to make when learn­ing to pro­gram is which pro­gram­ming lan­guage to learn. In some cases, the choice is made for you, dic­tated ei­ther by the lan­guage used in a class, or by a frame­work you need to use, but often you will have at least a few op­tions.

Picking your first (and second) language can be daunting, as there are so many candidates out there. It’s not always clear how they differ, much less which are “better” for what you want to do. This post will try to give a high-level overview of the choices that are available, and what differentiates them, to aid you in making an informed decision.

Note that this post is not a re­view of any of these lan­guages, nor does it aim to find the “best” pro­gram­ming lan­guage out there. I will de­scribe dif­fer­ent types of lan­guages, and give ex­am­ples for each one, but the choice of cat­e­gory, and lan­guage within that cat­e­gory, is yours. My hope is that this post will pro­vide you with enough con­text and ter­mi­nol­ogy to let you con­tinue a more in­formed and re­stricted search on your own.

Choos­ing a lan­guage

Programming languages generally belong to a number of different categories. These are often referred to as “paradigms” in the literature. As we’ll see shortly, the lines between different paradigms are often blurred, especially when it comes to more modern languages. Many languages are so similar that they differ mostly in syntax, ecosystem, and conventions, not in features. Nevertheless, these categories can be useful for limiting the number of languages to look at in more detail.

Through all of this, it is im­por­tant to keep in mind that these lan­guages can all do the same things — they all let you print to the screen, do math, read files, con­nect to the in­ter­net, etc. They dif­fer in how they en­able you to do that, and in many cases, how con­ve­nient it is to do those things. Some lan­guages are more suit­able for doing sta­tis­ti­cal com­pu­ta­tion, some are bet­ter for build­ing web­sites, and oth­ers are bet­ter for in­ter­act­ing with hard­ware. This is part of the rea­son why it’s im­por­tant to ask your­self what kind of thing you want to do first, and only then start look­ing for a lan­guage.

Level of ab­strac­tion

Let’s start with one of the most distinguishing features of programming languages: the level of abstraction from the hardware. Languages that hide more of the inner workings of the computer from the programmer are usually referred to as “high-level”, whereas those that expose these low-level details are called “low-level”. This distinction is more of a scale than a categorization; plenty of languages can be considered “mid-level” in terms of abstraction.

In general, abstractions come at a price: the higher-level the language, the lower its performance tends to be. That said, even high-level languages have performance that is suitable for most applications. You should find a language in the “highest” category whose performance your application can tolerate.

In­creased ab­strac­tion also hides more of the inner work­ings of your pro­gram from you — there’s more “magic” going on be­hind the scenes. This is often what you want, as it can save you a lot of typ­ing, but it may cause frus­tra­tion if things break or un­der­per­form, and you have to fig­ure out why. How much magic is right for you is a mat­ter of per­sonal taste, and you may have to play around with mul­ti­ple lan­guages be­fore you find what’s right for you.

With­out fur­ther ado, let’s look at a few dif­fer­ent “tiers” on this scale, and ex­am­ples of what lan­guages fit in each tier.

  • Lit­tle to no ab­strac­tion. At the bot­tom of the scale, we have a set of lan­guages that are com­monly re­ferred to as as­sem­bly code. As­sem­bly is an al­most di­rect map­ping of the ma­chine in­struc­tions un­der­stood by your CPU, and thus there exist many dif­fer­ent vari­ants of as­sem­bly, each one map­ping to a dif­fer­ent type of CPU. Every state­ment in your code is a sin­gle ma­chine op­er­a­tion, such as “store this value to this piece of mem­ory” or “add the val­ues from these two pieces of mem­ory”.

    This kind of code is usually only used for code that needs to interact with devices at a very low level (e.g. the operating system kernel), or that needs to squeeze every last bit of performance out of the hardware at hand (e.g. video decoding). It is unlikely that you will want to choose an assembly language as your first language.

  • Machine-independence. A bit further up, we find languages like C (and notably not its cousin C++). These languages are still fairly low-level; their code maps very closely to the operations the hardware can perform, and they require you to manage your own memory (“I’d like 2 MB of memory, please”; “I’m done with it now, thanks”). However, they also provide abstractions such as functions (pieces of code that can be reused), loops (code that is executed several times), and types (annotations on memory saying that it contains, say, a number instead of a letter).

    Languages in this category are usually compiled, meaning that the code you write must be passed through a “compiler” (a program that translates your program to code the hardware can understand) before you can run it. This allows C to be machine-independent: you write the code once, and then compile it for different types of hardware.

    C-like languages are popular because they generally perform well (because the code maps so closely to the hardware), and because they are conceptually quite simple (there’s very little magic: the code you write is what’s run). The downside to these languages is that you often need to write a lot of code, precisely because you have to spell out every step of the program. You should consider these languages if performance is your primary concern.

  • Mid-level languages. In this next category, we find the languages that still target high performance, but that also aim to provide some additional abstraction from the low-level workings of the hardware. These abstractions can take a variety of forms, but common examples are closures (roughly speaking, a function that is constructed when the program is run), semi-automatic memory management, and generics (functions that can operate on data of different types; for example, a dictionary that can use either strings or numbers as keys). These often allow you to express your code in a more concise way, and offload some of the tedious step-by-step enumeration to the compiler. Examples of well-known languages in this category are C++, Swift, and Rust.

  • Languages with runtimes. Languages in this tier also include a runtime: when your program is running, some other code that’s part of the language is running alongside it, providing features such as garbage collection (automatically figuring out when memory is no longer needed) and green threads (more efficiently running more concurrent computations than there are processors on the machine). These features usually come with some performance penalty (though you probably won’t notice unless you’re writing very performance-sensitive applications), but can make programming both simpler and safer.

    The run­time does, in many cases, make it harder to write pro­grams that in­ter­act with other lan­guages. For ex­am­ple, high-per­for­mance im­ple­men­ta­tions of com­plex math­e­mat­i­cal op­er­a­tions such as large ma­trix mul­ti­pli­ca­tions are often im­ple­mented in FOR­TRAN or C, and it can be dif­fi­cult to take ad­van­tage of those kinds of li­braries when you are in a lan­guage with a run­time. Pop­u­lar lan­guages in this cat­e­gory are Java, Scala, Go, and C# (along with all of .NET).

  • In­ter­preted lan­guages. Pro­grams writ­ten using lan­guages in this tier are gen­er­ally much slower than those in the cat­e­gories above, but are often much eas­ier to write. The biggest ad­van­tage of code writ­ten in in­ter­preted lan­guages is that it can be par­tially ex­e­cuted. You can write a piece of code, run it, write some more code, and then run that new code as if it fol­lowed the code that you ran pre­vi­ously. This is use­ful for quickly con­struct­ing one-off com­pu­ta­tions piece-by-piece, as well as for fig­ur­ing out where some­thing goes wrong in your pro­gram (you can in­spect the state of your pro­gram as it’s run­ning).

    Interpreted languages are also often much more lenient about what your code can do; they often allow monkey-patching (changing the behavior of a running program), code evaluation (executing code that you read from a file or over the network as part of running your program), and type conversion (a string containing a number can simply be used as a number directly). However, this lenience introduces new classes of bugs that can be hard to find and fix, since what code actually ended up running is not immediately obvious. For this reason, complex, long-running software is usually written in a compiled language, whereas interpreted languages are used for writing management tools, data analytics, one-off scripts, and websites. Website development is an example where the ability to rapidly iterate on the code is particularly important, which the run-as-you-go approach of interpreted languages fits nicely.

    There are a lot of in­ter­preted lan­guages out there. The most pop­u­lar ones are Python, PHP, JavaScript, Perl, Ruby, and Lua.

  • Spe­cial­ized lan­guages. These are lan­guages that have been built to cater for par­tic­u­lar use-cases. It is often dif­fi­cult to say what level of ab­strac­tion they pro­vide, be­cause they are usu­ally very high-level for the tar­get use, but pro­vide very low-level prim­i­tives if you want to do some­thing non-stan­dard. In gen­eral, you will only want to use these lan­guages if you are try­ing to do ex­actly what they are built for. Com­mon ex­am­ples here are R (for sta­tis­ti­cal com­pu­ta­tions), MAT­LAB (for math-heavy com­pu­ta­tions), SQL (for data­base query­ing), Coq (for writ­ing proofs about pro­grams), and Pro­log (for logic-based in­fer­ence). We will not be talk­ing a lot about spe­cial­ized lan­guages, since you gen­er­ally know if you should be using them.

  • High-level languages. These languages often depart significantly from the computational model used by the languages we have discussed thus far. They make little or no effort to conform to the way the hardware executes code (one small computation or memory operation at a time), and instead let you focus on the high-level properties of your algorithm. In many ways, these languages are more like executable math formulas than they are machine code. As a result, programs written in these languages often have fewer bugs, and are more likely to work correctly if the compiler accepts the code.

    Lan­guages at this level of ab­strac­tion can be, and have been, used to build “tra­di­tional” pro­grams. How­ever, where they re­ally shine is when they are used to parse and rea­son about the be­hav­ior of other pro­grams, or for­mally ver­ify prop­er­ties and in­vari­ants of the code it­self. For ex­am­ple, in these lan­guages it is often pos­si­ble to prove that the pro­gram will never fail in a par­tic­u­lar way, or that a per­for­mance op­ti­miza­tion al­ways re­turns the same an­swer as the slower, naïve im­ple­men­ta­tion.

    You’ll want to use these languages if performance is not of critical importance to you, or if you want strong guarantees about the correctness of your code. Be aware that they can be somewhat tedious to get started with, since it can be hard to convince the compiler that your code is in fact correct. Well-known languages in this tier are Haskell, F#, Coq, LISP, and OCaml.
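
Two of the abstractions named in the mid-level tier above, closures and generics, can be sketched in a few lines. Python is used here only because it is compact; it belongs to the interpreted tier, not the mid-level one, and the names make_counter and tally are made up for this illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, TypeVar

# K stands for "any hashable key type" (an illustrative name, not from the post).
K = TypeVar("K", bound=Hashable)


def make_counter(step: int) -> Callable[[], int]:
    """Closure: the returned function is built at run time and remembers
    both `step` and its own running `total`."""
    total = 0

    def counter() -> int:
        nonlocal total
        total += step
        return total

    return counter


def tally(items: Iterable[K]) -> Dict[K, int]:
    """Generic: the same code counts strings, numbers, or any hashable key."""
    counts: Dict[K, int] = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts


by_twos = make_counter(2)
first, second = by_twos(), by_twos()   # each call continues from the last one
word_counts = tally(["a", "b", "a"])   # string keys
number_counts = tally([1, 1, 2])       # number keys, same function
```

In a language like C you would have to carry the counter's state around by hand and write one tally function per key type; the abstraction saves that typing at the cost of a little more going on behind the scenes.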


Find­ing the right level of ab­strac­tion usu­ally takes you a long way to­wards pick­ing a lan­guage. This is par­tic­u­larly true as many of the lan­guages within each tier are fairly sim­i­lar in terms of fea­tures, and mostly vary in syn­tax. Nonethe­less, it can be use­ful to have a sec­ond scale on which to eval­u­ate dif­fer­ent lan­guages within a tier. I have often found it use­ful to com­pare lan­guages in terms of their “strict­ness”. Stricter lan­guages are harder to write code for, as they re­quire you to con­vince the com­piler that your code ad­heres to some no­tion of “cor­rect”, but once your code com­piles, you can be more cer­tain that it does the right thing. Con­versely, less strict lan­guages place fewer re­stric­tions on your code, but your pro­grams are more likely to break when you run them.

So, in order from less to more strict:

  • Do what­ever you want. These lan­guages let you get away with pretty much any­thing. Want to add the let­ter S to the value true? Sure, go ahead! Want to make + ig­nore its ar­gu­ments and al­ways re­turn “One ring to rule them all” in­stead? That’s fine. This flex­i­bil­ity al­lows you to do re­ally neat things, like mod­i­fy­ing and eval­u­at­ing your pro­gram as it’s run­ning. It also means your code will do some­thing the first time you run it. If you know what you’re doing, or if you’re doing some­thing sim­ple where retry­ing if it’s wrong isn’t too costly, this is great. If this run-crash-fix-re-run loop sounds frus­trat­ing though, you may want to look for a dif­fer­ent kind of lan­guage.

    Lan­guages in this cat­e­gory are JavaScript, PHP, Perl, Ruby, and ar­guably LISP. There are also lan­guages that are slightly more strict, and will check that you haven’t done some­thing com­pletely crazy, but that still be­long to this gen­eral cat­e­gory. Python and Type­Script are ex­am­ples of such lan­guages. These lan­guages are some­times re­ferred to as dy­nam­i­cally typed.

  • Try to make sense. These languages require that your code behave rationally. If + is a function that takes two numbers and returns a number, you can’t just go ahead and return true instead. This gets rid of a lot of bugs related to incorrect types at runtime, but also means that it’s harder to, for example, convert user input from a string to a number. These languages still have shortcuts you can take to circumvent many of the checks (see interface{} in Go, void* in C, and Object in Java), but in general force you to write sensible code. Examples of these languages are Go, C, C++, and Java. These are often referred to as statically and strongly typed. There is a difference between statically typed and strongly typed, which you can read up on elsewhere, but they both belong in this tier. The higher tiers require both.

  • No cheating. Now we’re getting into the land of “no shenanigans”. Not only do you have to convince the compiler that your program doesn’t do something silly like mixing numbers and strings, you also have to ensure that it doesn’t do anything dangerous. For example, your program may not be allowed to modify immutable data, to modify the same data concurrently, or to read data after it has been wiped. There are many different approaches to this, such as disallowing mutable data altogether (Haskell), or checking these properties at compile time (Rust). Other languages in this category are Scala, Swift, and F#.
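
To make the least strict tier concrete, here is a minimal sketch in Python, which the list above places at the slightly stricter end of that tier. The classes Ring and Greeter are hypothetical and exist only for this illustration. The sketch redefines +, patches a method while the program is running, and shows that adding the letter S to True is refused, but only once that line actually executes.

```python
class Ring:
    """Hypothetical class whose + ignores its argument entirely."""

    def __add__(self, other: object) -> str:
        return "One ring to rule them all"


class Greeter:
    """Hypothetical class used to demonstrate monkey-patching."""

    def greet(self) -> str:
        return "hello"


def add_letter_to_bool() -> str:
    """Try to add the letter S to True. Nothing checks this ahead of time;
    Python raises TypeError only when the addition is executed."""
    try:
        return "S" + True
    except TypeError as err:
        return f"rejected at run time: {type(err).__name__}"


ring_result = Ring() + 42            # + does whatever Ring says it does

greeter = Greeter()
before = greeter.greet()
# Monkey-patching: swap out the method on the class mid-program.
Greeter.greet = lambda self: "patched"
after = greeter.greet()              # the existing object sees the new behavior

bool_result = add_letter_to_bool()
```

A language in the “try to make sense” tier would typically reject the "S" + True line before the program ever runs, which is the trade-off the tiers describe: less freedom in exchange for earlier errors.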

Does it mat­ter where I start?

If you just want to learn how to program, with few or no use-cases in mind, it can be hard to pick any of these categories. If you’re in this position, you should ask yourself what you’re more interested in learning. Starting with a lower-level language will force you to learn more about how the computer’s CPU and memory work, which provides a solid foundation on which to build if you want to make things go fast, or if you want to build something like an operating system, a device driver, or a resource-intensive game. On the other hand, starting with a higher-level language will let you dive right into algorithms and data structures, without worrying too much about the low-level details. This might be what you want if your background is more math-y, or if you have some problem you’d like to solve quickly (e.g., data analytics, small convenience tools, or a website).

Switch­ing lan­guages

The de­scrip­tions I’ve given above will hope­fully help you make a good de­ci­sion about what lan­guage to dive into. In­evitably, how­ever, you will find that the lan­guage you picked doesn’t work very well for some par­tic­u­lar task, or that there’s some­thing that irks you about it. When this hap­pens, don’t be afraid to try out an­other lan­guage! In many cases, you will find that the new lan­guage dif­fers from your cur­rent one mostly in terms of syn­tax, es­pe­cially if you are switch­ing be­tween lan­guages of a sim­i­lar level of strict­ness and ab­strac­tion.

Switching to languages that are “farther apart” is harder, as there are new concepts you need to learn. Luckily, much of your existing experience will translate easily: notions like variables, strings, functions, and modules are found in almost all languages. Furthermore, the more languages you know, the easier it is to learn new ones. This is one of the reasons experienced developers often claim that they know dozens of languages; each additional language becomes easier to learn. In many cases, learning a new language can even change the way you program in the languages you already know! Picking up another language, or sampling a bunch of them, is a natural part of developing yourself as a programmer, so don’t be afraid to try!

Good luck, and don’t be too stressed about the de­ci­sion. Es­pe­cially in the be­gin­ning, every­thing you learn will come to good use, even if you later change your mind.

