Choosing a Programming Language :: Jon Gjengset

source link: https://thesquareplanet.com/blog/choosing-a-programming-language/

Choosing a Programming Language (16 min. read)

Posted on Jun 6, 2016

One of the first de­ci­sions one has to make when learn­ing to pro­gram is which pro­gram­ming lan­guage to learn. In some cases, the choice is made for you, dic­tated ei­ther by the lan­guage used in a class, or by a frame­work you need to use, but often you will have at least a few op­tions.

Picking your first (and second) language can be daunting, as there are so many candidates out there. It’s not always clear how they differ, much less which are “better” for what you want to do. This post will try to give a high-level overview of the choices that are available, and what differentiates them, to aid you in making an informed decision.

Note that this post is not a re­view of any of these lan­guages, nor does it aim to find the “best” pro­gram­ming lan­guage out there. I will de­scribe dif­fer­ent types of lan­guages, and give ex­am­ples for each one, but the choice of cat­e­gory, and lan­guage within that cat­e­gory, is yours. My hope is that this post will pro­vide you with enough con­text and ter­mi­nol­ogy to let you con­tinue a more in­formed and re­stricted search on your own.

Choos­ing a lan­guage

Programming languages generally belong to a number of different categories. These are often referred to as “paradigms” in the literature. As we’ll see shortly, the lines between different paradigms are often blurred, especially when it comes to more modern languages. Many languages are so similar that they differ mostly in syntax, ecosystem, and conventions, not in features. Nevertheless, the categories can be useful for limiting the number of languages to look at in more detail.

Through all of this, it is im­por­tant to keep in mind that these lan­guages can all do the same things — they all let you print to the screen, do math, read files, con­nect to the in­ter­net, etc. They dif­fer in how they en­able you to do that, and in many cases, how con­ve­nient it is to do those things. Some lan­guages are more suit­able for doing sta­tis­ti­cal com­pu­ta­tion, some are bet­ter for build­ing web­sites, and oth­ers are bet­ter for in­ter­act­ing with hard­ware. This is part of the rea­son why it’s im­por­tant to ask your­self what kind of thing you want to do first, and only then start look­ing for a lan­guage.

Level of ab­strac­tion

Let’s start with one of the most distinguishing features of programming languages: the level of abstraction from the hardware. Languages that hide more of the inner workings of the computer from the programmer are usually referred to as “high-level”, whereas those that expose these low-level details are called “low-level”. This distinction is more of a scale than it is a categorization; plenty of languages can be considered “mid-level” in terms of abstraction.

In general, abstractions come at a price: the higher-level the language, the lower its performance tends to be. That said, even high-level languages have performance that is suitable for most applications. You should find a language in the “highest” category whose performance your application can tolerate.

In­creased ab­strac­tion also hides more of the inner work­ings of your pro­gram from you — there’s more “magic” going on be­hind the scenes. This is often what you want, as it can save you a lot of typ­ing, but it may cause frus­tra­tion if things break or un­der­per­form, and you have to fig­ure out why. How much magic is right for you is a mat­ter of per­sonal taste, and you may have to play around with mul­ti­ple lan­guages be­fore you find what’s right for you.

With­out fur­ther ado, let’s look at a few dif­fer­ent “tiers” on this scale, and ex­am­ples of what lan­guages fit in each tier.

  • Lit­tle to no ab­strac­tion. At the bot­tom of the scale, we have a set of lan­guages that are com­monly re­ferred to as as­sem­bly code. As­sem­bly is an al­most di­rect map­ping of the ma­chine in­struc­tions un­der­stood by your CPU, and thus there exist many dif­fer­ent vari­ants of as­sem­bly, each one map­ping to a dif­fer­ent type of CPU. Every state­ment in your code is a sin­gle ma­chine op­er­a­tion, such as “store this value to this piece of mem­ory” or “add the val­ues from these two pieces of mem­ory”.

    This kind of code is usually only used for code that needs to interact with devices at a very low level (e.g. the operating system kernel), or that needs to squeeze every last bit of performance out of the hardware at hand (e.g. video decoding). It is unlikely that you will want to choose an assembly language as your first language.

  • Machine-independence. A bit further up, we find C (and notably not its cousin C++). Languages at this level are still fairly low-level; their code maps very closely to the operations the hardware can perform, and they require you to manage your own memory (“I’d like 2 MB of memory, please”; “I’m done with it now, thanks”). However, they also provide abstractions such as functions (pieces of code that can be reused), loops (code that is executed several times), and types (annotations on memory saying that it contains, say, a number instead of a letter).

    Lan­guages in this cat­e­gory are usu­ally com­piled, mean­ing that the code you write must be passed through a “com­piler” — a pro­gram that trans­lates your pro­gram to code the hard­ware can un­der­stand — be­fore you can run it. This al­lows C to be ma­chine-in­de­pen­dent. The code you write can be writ­ten once, and then com­piled to dif­fer­ent types of hard­ware.

    C-like languages are popular because they generally perform well (because the code maps so closely to the hardware), and because they are conceptually quite simple (there’s very little magic: the code you write is what’s run). The downside to these languages is that you often need to write a lot of code, precisely because you have to spell out every step of the program. You should consider these languages if performance is your primary concern.

  • Mid-level languages. In this next category, we find the languages that still target high performance, but that aim to also provide some additional abstraction from the low-level workings of the hardware. These abstractions can take a variety of forms, but common examples are closures (roughly speaking, a function that is constructed when the program is run), semi-automatic memory management, and generics (functions that can operate on data of different types; for example, a dictionary that can use either strings or numbers as keys). These often allow you to express your code in a more concise way, and offload some of the tedious step-by-step enumeration to the compiler. Examples of well-known languages in this category are C++, Swift, and Rust.

  • Languages with runtimes. Languages in this tier also include a runtime: when your program is running, some other code that’s part of the language is running alongside it, providing features such as garbage collection (automatically figuring out when memory is no longer needed) and green threading (more efficiently running more concurrent computations than there are processors on the machine). These features usually come with some performance penalty (though you probably won’t notice unless you’re writing very performance-sensitive applications), but can make programming both simpler and safer.

    The run­time does, in many cases, make it harder to write pro­grams that in­ter­act with other lan­guages. For ex­am­ple, high-per­for­mance im­ple­men­ta­tions of com­plex math­e­mat­i­cal op­er­a­tions such as large ma­trix mul­ti­pli­ca­tions are often im­ple­mented in FOR­TRAN or C, and it can be dif­fi­cult to take ad­van­tage of those kinds of li­braries when you are in a lan­guage with a run­time. Pop­u­lar lan­guages in this cat­e­gory are Java, Scala, Go, and C# (along with all of .NET).

  • In­ter­preted lan­guages. Pro­grams writ­ten using lan­guages in this tier are gen­er­ally much slower than those in the cat­e­gories above, but are often much eas­ier to write. The biggest ad­van­tage of code writ­ten in in­ter­preted lan­guages is that it can be par­tially ex­e­cuted. You can write a piece of code, run it, write some more code, and then run that new code as if it fol­lowed the code that you ran pre­vi­ously. This is use­ful for quickly con­struct­ing one-off com­pu­ta­tions piece-by-piece, as well as for fig­ur­ing out where some­thing goes wrong in your pro­gram (you can in­spect the state of your pro­gram as it’s run­ning).

    Interpreted languages are also often much more lenient about what your code can do; they often allow monkey-patching (changing the behavior of a running program), code evaluation (executing code that you read from a file or over the network as part of running your program), and implicit type conversion (a string containing a number can simply be used as a number directly). However, this lenience introduces new classes of bugs that can be hard to find and fix, since what code actually ended up running is not immediately obvious. For this reason, complex, long-running software is usually written in a compiled language, whereas interpreted languages are used for writing management tools, data analytics, one-off scripts, and websites. Website development is an example where the ability to rapidly iterate on the code is particularly important, which the run-as-you-go approach of interpreted languages fits nicely.

    There are a lot of in­ter­preted lan­guages out there. The most pop­u­lar ones are Python, PHP, JavaScript, Perl, Ruby, and Lua.

  • Spe­cial­ized lan­guages. These are lan­guages that have been built to cater for par­tic­u­lar use-cases. It is often dif­fi­cult to say what level of ab­strac­tion they pro­vide, be­cause they are usu­ally very high-level for the tar­get use, but pro­vide very low-level prim­i­tives if you want to do some­thing non-stan­dard. In gen­eral, you will only want to use these lan­guages if you are try­ing to do ex­actly what they are built for. Com­mon ex­am­ples here are R (for sta­tis­ti­cal com­pu­ta­tions), MAT­LAB (for math-heavy com­pu­ta­tions), SQL (for data­base query­ing), Coq (for writ­ing proofs about pro­grams), and Pro­log (for logic-based in­fer­ence). We will not be talk­ing a lot about spe­cial­ized lan­guages, since you gen­er­ally know if you should be using them.

  • High-level languages. These languages often depart significantly from the computational model used by the languages we have discussed thus far. They make little or no effort to conform to the way the hardware executes code (one small computation or memory operation at a time), and instead let you focus on the high-level properties of your algorithm. In many ways, these languages are more like executable math formulas than they are machine code. As a result, programs written in these languages often have fewer bugs, and are more likely to work correctly if the compiler accepts the code.

    Lan­guages at this level of ab­strac­tion can be, and have been, used to build “tra­di­tional” pro­grams. How­ever, where they re­ally shine is when they are used to parse and rea­son about the be­hav­ior of other pro­grams, or for­mally ver­ify prop­er­ties and in­vari­ants of the code it­self. For ex­am­ple, in these lan­guages it is often pos­si­ble to prove that the pro­gram will never fail in a par­tic­u­lar way, or that a per­for­mance op­ti­miza­tion al­ways re­turns the same an­swer as the slower, naïve im­ple­men­ta­tion.

    You’ll want to use these languages if performance is not of critical importance to you, or if you want strong guarantees about the correctness of your code. Be aware that they can be somewhat tedious to get started with, since it can be hard to convince the compiler that your code is in fact correct. Well-known languages in this tier are Haskell, F#, Coq, LISP, and OCaml.
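
To make a few of the terms from the tiers above more concrete, here is a minimal Python sketch of a closure, a generic function, monkey-patching, and code evaluation. All of the names in it (`make_counter`, `first_or`, `Greeter`, `shout`) are made up for this illustration, and other languages will express the same ideas quite differently.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def make_counter(start: int) -> Callable[[], int]:
    """Closure: the returned function remembers `count` between calls."""
    count = start

    def next_value() -> int:
        nonlocal count
        count += 1
        return count

    return next_value


def first_or(items: Sequence[T], default: T) -> T:
    """Generic: works for sequences of numbers, strings, or any other type."""
    return items[0] if items else default


class Greeter:
    def greet(self) -> str:
        return "hello"


def shout(self: Greeter) -> str:
    return "HELLO"


counter = make_counter(10)
counter_values = [counter(), counter()]

first_number = first_or([3, 4], 0)
first_word = first_or([], "none")

greeter = Greeter()
before_patch = greeter.greet()
# Monkey-patching: replace the method on the class while the program runs.
Greeter.greet = shout
after_patch = greeter.greet()

# Code evaluation: run source text that is only known at run time.
evaluated = eval("2 + 3 * 4")
```

Note how the last two features change what the program does without that being visible where `greet` is defined or called, which is exactly the kind of lenience that can make bugs hard to track down.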

Strict­ness

Find­ing the right level of ab­strac­tion usu­ally takes you a long way to­wards pick­ing a lan­guage. This is par­tic­u­larly true as many of the lan­guages within each tier are fairly sim­i­lar in terms of fea­tures, and mostly vary in syn­tax. Nonethe­less, it can be use­ful to have a sec­ond scale on which to eval­u­ate dif­fer­ent lan­guages within a tier. I have often found it use­ful to com­pare lan­guages in terms of their “strict­ness”. Stricter lan­guages are harder to write code for, as they re­quire you to con­vince the com­piler that your code ad­heres to some no­tion of “cor­rect”, but once your code com­piles, you can be more cer­tain that it does the right thing. Con­versely, less strict lan­guages place fewer re­stric­tions on your code, but your pro­grams are more likely to break when you run them.

So, in order from less to more strict:

  • Do what­ever you want. These lan­guages let you get away with pretty much any­thing. Want to add the let­ter S to the value true? Sure, go ahead! Want to make + ig­nore its ar­gu­ments and al­ways re­turn “One ring to rule them all” in­stead? That’s fine. This flex­i­bil­ity al­lows you to do re­ally neat things, like mod­i­fy­ing and eval­u­at­ing your pro­gram as it’s run­ning. It also means your code will do some­thing the first time you run it. If you know what you’re doing, or if you’re doing some­thing sim­ple where retry­ing if it’s wrong isn’t too costly, this is great. If this run-crash-fix-re-run loop sounds frus­trat­ing though, you may want to look for a dif­fer­ent kind of lan­guage.

    Lan­guages in this cat­e­gory are JavaScript, PHP, Perl, Ruby, and ar­guably LISP. There are also lan­guages that are slightly more strict, and will check that you haven’t done some­thing com­pletely crazy, but that still be­long to this gen­eral cat­e­gory. Python and Type­Script are ex­am­ples of such lan­guages. These lan­guages are some­times re­ferred to as dy­nam­i­cally typed.

  • Try to make sense. These languages require that your code behaves rationally. If + is a function that takes two numbers and returns a number, you can’t just go ahead and return true instead. This gets rid of a lot of bugs related to incorrect types at runtime, but also means that it’s harder to, for example, convert user input from a string to a number. These languages still have shortcuts you can take to circumvent many of the checks (see interface{} in Go, void* in C, and Object in Java), but in general force you to write sensible code. Examples of these languages are Go, C, C++, and Java. These are often referred to as statically and strongly typed. There is a difference between statically typed and strongly typed, which you can read up on elsewhere, but they both belong in this tier. The higher tiers require both.

  • No cheating. Now we’re getting into the land of “no shenanigans”. Not only do you have to convince the compiler that your program doesn’t do something silly like mixing numbers and strings, you also have to ensure that it doesn’t do anything dangerous. For example, your program may not be allowed to modify immutable data, to have data that is concurrently modified, or to read data after it has been freed. There are many different approaches to this, such as disallowing mutable data altogether (Haskell), or checking these properties at compile time (Rust). Other languages in this category are Scala, Swift, and F#.
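
As a rough illustration of the lenient end of this scale, the Python sketch below redefines + so that it ignores its arguments, and then tries to add the letter S to the value true. The names `Ring`, `add_letter_to_bool`, and `parse_quantity` are invented for this example. Python happily accepts the first, but rejects the second when the code runs, which fits its description above as slightly more strict than the other languages in its tier.

```python
class Ring(int):
    """An integer whose `+` ignores its arguments entirely."""

    def __add__(self, other: object) -> str:
        return "One ring to rule them all"


def add_letter_to_bool() -> str:
    """Try to add the letter S to the value True and report what happens."""
    try:
        return str(True + "S")
    except TypeError as error:
        # Python only notices the problem once this line actually executes.
        return f"rejected at run time: {type(error).__name__}"


def parse_quantity(text: str) -> int:
    """Explicit conversion: a string must be turned into a number on purpose."""
    return int(text)


ring_sum = Ring(1) + Ring(2)
bool_result = add_letter_to_bool()
quantity = parse_quantity("42") + 1
```

A language in one of the stricter tiers would typically refuse to compile the equivalent of `True + "S"`, so the mistake would surface before the program ever ran.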

Does it mat­ter where I start?

If you just want to learn how to program, with few or no use-cases in mind, it can be hard to pick any of these categories. If you’re in this position, you should ask yourself what you’re more interested in learning. Starting with a lower-level language will force you to learn more about how the computer’s CPU and memory work, which provides a solid foundation on which to build if you want to make things go fast, or if you want to build something like an operating system, a device driver, or a resource-intensive game. On the other hand, starting with a higher-level language will let you dive right into algorithms and data structures, without worrying too much about the low-level details. This might be what you want if your background is more math-y, or if you have some problem you’d like to solve quickly (e.g., data analytics, small convenience tools, or a website).

Switch­ing lan­guages

The de­scrip­tions I’ve given above will hope­fully help you make a good de­ci­sion about what lan­guage to dive into. In­evitably, how­ever, you will find that the lan­guage you picked doesn’t work very well for some par­tic­u­lar task, or that there’s some­thing that irks you about it. When this hap­pens, don’t be afraid to try out an­other lan­guage! In many cases, you will find that the new lan­guage dif­fers from your cur­rent one mostly in terms of syn­tax, es­pe­cially if you are switch­ing be­tween lan­guages of a sim­i­lar level of strict­ness and ab­strac­tion.

Switching to languages that are “farther apart” is harder, as there are new concepts you need to learn. Luckily, much of your existing experience will translate easily: notions like variables, strings, functions, and modules are found in almost all languages. Furthermore, the more languages you know, the easier it is to learn new ones. This is one of the reasons experienced developers often claim that they know dozens of languages; each additional language becomes easier to learn. In many cases, learning a new language can even change the way you program in the languages you already know! Picking up another language, or sampling a bunch of them, is a natural part of developing yourself as a programmer, so don’t be afraid to try!

Good luck, and don’t be too stressed about the de­ci­sion. Es­pe­cially in the be­gin­ning, every­thing you learn will come to good use, even if you later change your mind.

