Demystifying Programming

We talk quite a bit about code here at Tech Connect and it’s not unusual to see snippets of it pasted into a post. But most of us, indeed most librarians, aren’t professional programmers or full-time developers; we had to learn like everyone else. Depending on your background, some parts of coding will be easy to pick up while others won’t make sense for years. Here’s an attempt to explain the fundamental building blocks of programming languages.

The Languages

There are a number of popular programming languages: C, C#, C++, Java, JavaScript, Objective-C, Perl, PHP, Python, and Ruby. There are numerous others, but this semi-arbitrary selection covers the ones most commonly in use. It’s important to know that each programming language requires its own software to run. You can write Python code into a text file on a machine that doesn’t have the Python interpreter installed, but you can’t execute it and see the results.

A lot of learners stress unnecessarily over which language to learn first. Once you’ve picked up one language, you’ll understand all of the foundational pieces listed below. Then you’ll be able to transition quickly to another language by understanding a few syntax changes: Oh, in JavaScript I write function myFunction(x) to define a function, while in Python I write def myFunction(x). Programming languages differ in other ways too, but knowing the basics of one provides a huge head start on learning the basics of any other.

Finally, it’s worth briefly distinguishing compiled versus interpreted languages. Code written in a compiled language, such as all the capital C languages and Java, must first be passed to a compiler program, which then spits out an executable (think of a file ending in .exe if you’re on Windows) that will run the code. Interpreted languages, like Perl, PHP, Python, and Ruby, are quicker to program in because you just pass your code along to an interpreter program, which immediately executes it. There’s one fewer step: for a compiled language you need to write code, generate an executable, and then run the executable, while interpreted languages sort of skip that middle step.

Compiled languages tend to run faster (i.e. perform more actions or computations in a given amount of time) than interpreted ones, while interpreted ones tend to be easier to learn and more lenient towards the programmer. Again, it doesn’t matter too much which you start out with.

Variables

Variables are just like variables in algebra; they’re names which stand in for some value. In algebra, you might write:

x = 10 + 3

which is also valid code in many programming languages. Later on, if you used the value of x, it would be 13.

The biggest difference between variables in math and in programming is that programming variables can be all sorts of things, not just numbers. They can be strings of text, for instance. Below, we combine two pieces of text which were stored in variables:

name = 'cat'
mood = ' is laughing'
both = name + mood

In the above code, both would have a value of ‘cat is laughing’. Note that text strings have to be wrapped in quotes (often either double or single quotes are acceptable) in order to distinguish them from the rest of the code. We also see above that variables can be built from other variables.
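Here is a short Python sketch of the same ideas. The variable names and values are made up for illustration; the point is only that a variable can hold a number or text, can be built from other variables, and can be given a new value later.

```python
# Hypothetical names and values, chosen only for illustration.
copies = 10 + 3                # a number, as in the algebra example
title = 'To the Lighthouse'    # a string of text

# str() turns the number into text so the pieces can be joined with +.
label = title + ' (' + str(copies) + ' copies)'

copies = copies - 1            # a variable can be given a new value later

print(label)    # To the Lighthouse (13 copies)
print(copies)   # 12
```

Note that label keeps the text it was built with; changing copies afterwards does not rewrite it.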

Comments

Comments are pieces of text inside a program which are not interpreted as code. Why would you want to do that? Well, comments are very useful for documenting what’s going on in your code. Even if your code is never going to be seen by anyone else, writing comments helps you understand what’s going on if you return to a project after not thinking about it for a while.

// This is a comment in JavaScript; code is below.
number = 5;
// And a second comment!

As seen above, comments typically work by having some special character(s) at the beginning of the line which tells the programming language that the rest of the line can be ignored. Common characters that indicate a line is a comment are # (Python, Ruby), // (C languages, Java, JavaScript, PHP), and /* (CSS, multi-line blocks of comments in many other languages).
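For comparison, here is roughly how the # style looks in Python. This is a small sketch, not an exhaustive tour of comment syntax.

```python
# This whole line is a comment; Python ignores it.
number = 5  # a comment can also follow code on the same line

# The next line is "commented out", a common way to switch code off temporarily:
# number = 99

print(number)  # prints 5, because the line above never ran
```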

Functions

As with variables, functions are akin to those in math: they take an input, perform some calculations with it, and return an output. In math, we might see:

f(x) = (x * 3)/4

f(8) = 6

Here, the first line is a function definition. It defines how many parameters can be passed to the function and what it will do with them. The second line is more akin to a function execution. It shows that the function returns the value 6 when passed the parameter 8. This is really, really close to programming already. Here’s the math above written in Python:

def f(x):
  return (x * 3)/4

f(8)
# which returns the number 6 (shown as 6.0 in Python 3, where / always gives a decimal result)

Programming functions differ from mathematical ones in much the same way variables do: they’re not limited to accepting and producing numbers. They can take all sorts of data—including text—process it, and then return another sort of data. For instance, virtually all programming languages allow you to find the length of a text string using a function. This function takes text input and outputs a number. The combinations are endless! Here’s how that looks in Python:

len('how long?')
# returns the number 9

Python abbreviates the word “length” to simply “len” here, and we pass the text “how long?” to the function instead of a number.

Combining variables and functions, we might store the result of running a function in a variable, e.g. y = f(8) would store the value 6 in the variable y if f(x) is the same as above. This may seem silly—why don’t you just write y = 6 if that’s what you want!—but functions help by abstracting out blocks of code so you can reuse them over and over again.

Consider a program you’re writing to manage the e-resource URLs in your catalog, which are stored in MARC field 856 subfield u. You might have a variable named num_URLs (variable names can’t have spaces, thus the underscore) which represents the number of 856 $u subfields a record has. But as you work on records, that value is going to change; rather than manually calculating it each time and setting num_URLs = 3 or num_URLs = 2, you can write a function to do this for you. Each time you pass the function a bibliographic record, it will return the number of 856 $u subfields, substantially reducing how much repetitive code you have to write.
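A minimal sketch of such a function in Python might look like the following. The function name count_urls and the way the record is represented (a plain list of fields, each with a tag and some subfields) are assumptions made for this example; a real MARC library would hand you its own kind of record object. The function also leans on loops and conditionals, which are covered later in this post.

```python
# Assumption: a record is a plain list of fields, and each field is a dict
# with a "tag" and a list of (code, value) subfield pairs. This structure is
# invented for illustration and is not any particular library's format.
def count_urls(record):
    """Return the number of 856 $u subfields in a record."""
    num_URLs = 0
    for field in record:
        if field["tag"] == "856":
            for code, value in field["subfields"]:
                if code == "u":
                    num_URLs += 1
    return num_URLs


# A made-up record with one title field and two 856 fields.
record = [
    {"tag": "245", "subfields": [("a", "Mrs. Dalloway")]},
    {"tag": "856", "subfields": [("u", "http://example.com/one"),
                                 ("z", "Full text")]},
    {"tag": "856", "subfields": [("u", "http://example.com/two"),
                                 ("u", "http://example.com/three")]},
]

print(count_urls(record))  # prints 3
```

The counting logic lives in one place, so every record can be passed through the same function rather than having the count worked out by hand.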

Conditionals

Many readers are probably familiar with IFTTT, the “IF This Then That” web service which can glue together various accounts, for instance “If I post a new photo to Instagram, then save it to my Dropbox backup folder.” These sorts of logical connections are essential to programming, because often whether or not you perform a particular action varies depending on some other condition.

Consider a program which counts the number of books by Virginia Woolf in your catalog. You want to count a book only if the author is Virginia Woolf. You can use Ruby code like this:

if author == 'Virginia Woolf'
  total = total + 1
end

There are three parts here: first we specify a condition, then there’s some code which runs only if the condition is true, and then we end the condition. Without some kind of indication that the block of code inside the condition has ended, the entire rest of our program would run only if the variable author was set to the right string of text.

The == is definitely weird to see for the first time. Why two equals? Many programming languages use a variety of double-character comparisons because the single equals already has a meaning: single equals assigns a value to a variable (see the second line of the example above) while double-equals compares two values. There are other common comparisons:

  • != often means “is not equal to”
  • > and < are the typical “greater than” and “less than”
  • >= and <= often mean “greater than or equal to” and “less than or equal to”

Those can look weird at first, and indeed one of the more common mistakes (made by professionals and newbies alike!) is accidentally putting a single equals instead of a double.[1] While we’re on the topic of strange double-character equals signs, it’s worth pointing out that += and -= are also commonly seen in programming languages. These pairs of symbols respectively add a given number to a variable or subtract it, so they do assign a value but they alter it slightly. For instance, above I could have written total += 1, which is identical in outcome to total = total + 1.
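The same operators exist in Python, so a quick sketch there shows the difference between assigning and comparing. The variable name total is reused from the Ruby example.

```python
total = 0
total = total + 1   # single equals assigns: total is now 1
total += 1          # shorthand for total = total + 1: total is now 2
total -= 1          # shorthand for total = total - 1: total is now 1

# Comparisons ask a question and leave total unchanged.
print(total == 1)   # True
print(total != 1)   # False
print(total >= 1)   # True
print(total < 1)    # False
```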

Lastly, conditional statements can be far more sophisticated than a mere “if this do that.” You can write code that says “if blah do this, but if bleh do that, and if neither do something else.” Here’s a Ruby script that would count books by Virginia Woolf, books by Ralph Ellison, and books by someone other than those two.

total_vw = 0
total_re = 0
total_others = 0
if author == 'Virginia Woolf'
  total_vw += 1
elsif author == 'Ralph Ellison'
  total_re += 1
else
  total_others += 1
end

Here, we set all three of our totals to zero first, then check to see what the current value of author is, adding one to the appropriate total using a three-part conditional statement. The elsif is short for “else if” and that condition is only tested if the first if wasn’t true. If neither of the first two conditions is true, our else section serves as a kind of fallback.
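For comparison, here is a rough Python counterpart of the three-part conditional. It is wrapped in a hypothetical function, which_total, that just reports which total an author would be counted under; Python spells “else if” as elif and uses indentation instead of an end keyword.

```python
# Hypothetical helper for illustration: given an author's name, say which
# of the three totals from the Ruby example should be increased.
def which_total(author):
    if author == 'Virginia Woolf':
        return 'total_vw'
    elif author == 'Ralph Ellison':
        return 'total_re'
    else:
        return 'total_others'


print(which_total('Virginia Woolf'))      # total_vw
print(which_total('Ralph Ellison'))       # total_re
print(which_total('Zora Neale Hurston'))  # total_others
```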

Arrays

An array is simply a list of values; in fact, the Python language has an array-like data type named “list.” Arrays are commonly denoted with square brackets, e.g. in Python a list looks like:

stuff = ["dog", "cat", "tree"]

Later, if I want to retrieve a single piece of the array, I just access it using its index wrapped in square brackets, starting from the number zero. Extending the Python example above:

stuff[0]
# returns "dog"
stuff[2]
# returns "tree"

Many programming languages also support associative arrays, in which the index values are strings instead of numbers. For instance, here’s an associative array in PHP:

$stuff = array(
  "awesome" => "sauce",
  "moderate" => "spice",
  "mediocre" => "condiment",
);
echo $stuff["mediocre"];
// prints out "condiment"

Arrays are useful for storing large groups of like items: instead of having three variables, which requires more typing and remembering names, we just have one array containing everything. While our three items aren’t a lot to keep track of, imagine a program which deals with all the records in a library catalog, or all the search results returned from a query: having an array to store that large list of items suddenly becomes essential.
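To round out the picture in Python: lists know their own length and can grow, and the associative array from the PHP example corresponds to what Python calls a dictionary. The variable name condiments is made up for this sketch.

```python
stuff = ["dog", "cat", "tree"]
print(len(stuff))      # 3
print(stuff[1])        # cat (indexes start at zero, so 1 is the second item)

stuff.append("bird")   # lists can grow after they are created
print(stuff[3])        # bird

# A dictionary is Python's associative array: the indexes are strings.
condiments = {
    "awesome": "sauce",
    "moderate": "spice",
    "mediocre": "condiment",
}
print(condiments["mediocre"])  # condiment
```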

Loops

Loops repeat an action a set number of times or until a condition is met. Arrays are commonly combined with loops, since loops make it easy to repeat the same operation on each item in an array. Here’s a concise example in Python which prints every entry in the “names” array to the screen:

names = ['Joebob', 'Suebob', 'Bobob']
for name in names:
  print(name)

Without arrays and loops, we’d have to write:

name1 = 'Joebob'
name2 = 'Suebob'
name3 = 'Bobob'
print(name1)
print(name2)
print(name3)

You see how useful arrays are? As we’ve seen with both functions and arrays, programming languages like to expose tools that help you repeat lots of operations without typing too much text.

There are a few types of loops, including “for” loops and “while” loops. Our “for” loop earlier went through a whole array, printing each item out, but a “while” loop only keeps repeating while some condition is true. Here is a bit of PHP that prints out the first four natural numbers:

$counter = 1;
while ( $counter < 5 ) {
  echo $counter;
  $counter = $counter + 1;
}

Each time we go through the loop, the counter is increased by one. When it hits five, the loop stops. But be careful! If we left off the $counter = $counter + 1 line then the loop would never finish because the while condition would never be false. Infinite loops are another potential bug in a program.
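The same loop can be sketched in Python. The printed list is an addition for this example so the result can be inspected afterwards; it is not part of the PHP version.

```python
counter = 1
printed = []   # collects the numbers so we can look at them afterwards
while counter < 5:
    printed.append(counter)
    counter = counter + 1   # without this line the loop would never end

print(printed)   # [1, 2, 3, 4]
print(counter)   # 5, the first value for which counter < 5 is false

# When the number of repetitions is known in advance, a "for" loop over
# range() avoids the risk of forgetting to update the counter.
for number in range(1, 5):
    print(number)
```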

Objects & Object-Oriented Programming

Object-oriented programming (oft-abbreviated OOP) is probably the toughest item in this post to explain, which is why I’d rather people see it in action by trying out Codecademy than read about it. Unfortunately, it’s not until the end of the JavaScript track that you really get to work with OOP, but it gives you a good sense of what it looks like in practice.

In general, objects are simply a means of organizing code. You can group related variables and functions under an object. You can make an object inherit properties from another one, if it needs to use all the same variables and functions but also add some of its own.

For example, let’s say we have a program that deals with a series of people, each of whom has a few properties like their name and age but also the ability to say hi. We can create a Person class, which is kind of like a template; it helps us stamp out new copies of objects without rewriting the same code over and over. Here’s an example in JavaScript:

function Person(name, age) {
  this.name = name;
  this.age = age;
  this.sayHi = function() {
    console.log("Hi, I'm " + this.name + ".");
  };
}

const Joebob = new Person('Joebob', 39);
const Suebob = new Person('Suebob', 40);
const Bobob = new Person('Bobob', 3);
Bobob.sayHi();
// prints "Hi, I'm Bobob."
Suebob.sayHi();
// prints "Hi, I'm Suebob."

Our Person function is essentially a class here; it allows us to quickly create three people who are all objects with the same structure, yet they have unique values for their name and age.[2] The code is a bit complicated and JavaScript isn’t a great example, but basically think of this: if we wanted to do this without objects, we’d end up repeating the content of the Person block of code three times over.

The efficiency gained with objects is similar to how functions save us from writing lots of redundant code; identifying common structures and grouping them together under an object makes our code more concise and easier to maintain as we add new features. For instance, if we wanted to add a myAgeIs function that prints out the person’s age, we could just add it to the Person class and then all our people objects would be able to use it.
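Here is a sketch of the same idea in Python, which has a dedicated class keyword. The method names say_hi and my_age_is follow Python naming habits, and the Librarian subclass is a hypothetical addition to illustrate inheritance; none of this comes from the JavaScript example beyond the overall shape.

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def say_hi(self):
        return "Hi, I'm " + self.name + "."

    # The extra method discussed above, written once for every Person.
    def my_age_is(self):
        return "I am " + str(self.age) + " years old."


# Hypothetical subclass: it inherits everything from Person and adds
# one property of its own.
class Librarian(Person):
    def __init__(self, name, age, library):
        super().__init__(name, age)
        self.library = library


bobob = Person('Bobob', 3)
suebob = Librarian('Suebob', 40, 'Main Library')

print(bobob.say_hi())       # Hi, I'm Bobob.
print(bobob.my_age_is())    # I am 3 years old.
print(suebob.my_age_is())   # I am 40 years old.
print(suebob.library)       # Main Library
```

Because my_age_is is defined on Person, the Librarian object gets it for free, which is the kind of reuse inheritance is meant to provide.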

Modules & Libraries

Lest you worry that every little detail in your programs must be written from scratch, I should mention that all popular programming languages have mechanisms which allow you to reuse others’ code. Practically, this means that most projects start out by identifying a few fundamental building blocks which already exist. For instance, parsing MARC data is a non-trivial task which takes some serious knowledge both of the data structure and the programming language you’re using. Luckily, we don’t need to write a MARC parsing program on our own, because several exist already:

The Code4Lib wiki has an even more extensive list of options.

In general, it’s best to reuse as much prior work as possible rather than spend time working on problems that have already been solved. Complicated tasks like writing a full-fledged web application take a lot of time and expertise, but code libraries already exist for this. Particularly when you’re learning, it can be rewarding to use a major, well-developed project at first to get a sense of what’s possible with programming.
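As a small taste of reuse, here is a sketch that borrows Counter from Python’s standard library rather than writing the counting logic by hand. The author list is made up and stands in for data pulled from a catalog.

```python
# Counter ships with Python in the standard "collections" module.
from collections import Counter

# Made-up author list for illustration.
authors = ['Virginia Woolf', 'Ralph Ellison', 'Virginia Woolf', 'Toni Morrison']

totals = Counter(authors)
print(totals['Virginia Woolf'])   # 2
print(totals['Ralph Ellison'])    # 1
print(totals['Octavia Butler'])   # 0 (Counter returns zero for missing items)
```

One import line replaces the if/else bookkeeping from the Conditionals section, and it handles any number of authors without further changes.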

Attention to Detail

The biggest hangup for new programmers often isn’t conceptual: variables, functions, and these other constructs are all rather intuitive, especially once you’ve tried them a few times. Instead, many newcomers find out that programming languages are very literal and unyielding. They can’t read your mind and are happy to simply give up and spit out errors if they can’t understand what you’re trying to do.

For instance, earlier I mentioned that text variables are usually wrapped in quotes. What happens if I forget an end quote? Depending on the language, the program may either just tell you there’s an error or it might badly misinterpret your code, treating everything from your open quote down to the next instance of a quote mark as one big chunk of variable text. Similarly, accidentally misusing double equals or single equals or any of the other arcane combinations of mathematical symbols can have disastrous results.
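To see how literal a language can be, this sketch asks Python to parse a few one-line programs without running them. The helper name check_syntax is made up for the example, and other languages may react to the same mistakes differently.

```python
def check_syntax(source):
    """Return 'ok' if Python can parse the code, otherwise 'SyntaxError'."""
    try:
        compile(source, '<example>', 'exec')  # parses the code; nothing is run
        return 'ok'
    except SyntaxError:
        return 'SyntaxError'


print(check_syntax("name = 'cat'"))              # ok
print(check_syntax("name = 'cat"))               # SyntaxError: end quote missing
print(check_syntax("if total = 1:\n    pass"))   # SyntaxError: = where == was meant
```

Python happens to refuse a single equals inside an if, but several other languages accept it and quietly do the wrong thing, which is what makes that particular slip so troublesome.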

Once you’ve worked with code a little, you’ll start to pick up tools that ease a lot of minor issues. Most code editors use syntax highlighting to distinguish different constructs, which aids in error recognition. This very post uses a syntax highlighter for WordPress to color keywords like “function” and distinguish variable names. Other tools can “lint” your code for mistakes or code which, while technically valid, can easily lead to trouble. The text editor I commonly use does wonderful little things like provide closing quotes and parens, highlight lines which don’t pass linting tests, and enable me to test-run selected snippets of code.

There’s lots more…

Code isn’t magic; coders aren’t wizards. Yes, there’s a lot to programming and one can devote a lifetime to its study and practice. There are also thousands of resources available for learning, from MOOCs to books to workshops for beginners. With just a few building blocks like the ones described in this post, you can write useful code which helps you in your work.

Footnotes

[1] True story: while writing the very next example, I made this mistake.

[2] Functions which create objects are called constructor functions, which is another bit of jargon you probably don’t need to know if you’re just getting started.


6 Comments on “Demystifying Programming”

  1. I really enjoyed this post as a beginning/hobbyist coder. Given my low level, can you tell me why in the first example, to get the output, it’s not both = name + mood?

    • Eric Phetteplace says:

      ah-ha! It is supposed to be name + mood, you caught a bug. I’ve fixed the post.
      In that situation, name + cat would probably lead to an error, because “cat” hasn’t been defined as a variable yet.
