Streamlining Code (Without Fucking Things Up)

When people talk about streamlining code, they are usually talking about writing code that runs as fast as possible under heavy load. Google’s search functionality, for example, handles billions of requests daily, so it’s to Google’s advantage to write the leanest, fastest code possible. Even a small bit of code, if written badly enough, could measurably slow down a production system that large.

Of course, most of the code we write doesn’t handle data at the volume Google does. But even if you’re just looping through a large file, it is useful to streamline your code. Streamlining also tends to follow best practices and makes your code easier to read, so doing so builds your skills as a programmer.

Take a simple task like opening a file and reading through its lines, one by one. Let’s say we want to streamline this. We might first look at something like this:
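A sketch of the kind of five-line starting point described here. The file name example.txt, its contents, and the variable names are assumptions for illustration, and the first few lines are setup only, so the sketch runs on its own:

```python
import os
import tempfile
from pathlib import Path

# Setup only: work in a scratch directory and create a hypothetical
# example.txt so the snippet below has something to read.
os.chdir(tempfile.mkdtemp())
Path("example.txt").write_text("first line\nsecond line\n")

# The five-line starting point: a file name, a file object, and a loop.
file = "example.txt"
f = open(file)
with f as lines:
    for line in lines:
        print(line, end="")
```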

Seems pretty succinct. But without much effort, we can reduce this five line snippet to four, like so:
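One way the four-line version might look, as a sketch. The file name and contents are assumptions, and the setup lines exist only to make the example self-contained:

```python
import os
import tempfile
from pathlib import Path

# Setup only: create a hypothetical example.txt in a scratch directory.
os.chdir(tempfile.mkdtemp())
Path("example.txt").write_text("first line\nsecond line\n")

# Four lines: the file name now goes straight into open().
f = open("example.txt")
with f as lines:
    for line in lines:
        print(line, end="")
```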

All we did here was collapse lines 1 and 2, moving the string stored in the file variable into the open() function. Now the machine running it won’t have to store an extra variable in memory, only to retrieve it one line later. The code is also arguably a little easier to read. We could take things a step further, like this:
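A sketch of that next step, with open() passed directly to the with statement. As before, the file name and contents are assumptions and the first lines are setup only:

```python
import os
import tempfile
from pathlib import Path

# Setup only: create a hypothetical example.txt in a scratch directory.
os.chdir(tempfile.mkdtemp())
Path("example.txt").write_text("first line\nsecond line\n")

# Three lines: open() feeds its file object straight into with.
with open("example.txt") as f:
    for line in f:
        print(line, end="")
```

After the with block ends, f.closed is True, which is the guarantee the with statement provides.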

Now we’ve collapsed lines 1 and 2 again. That open() function returns a file object, which now, instead of being stored in a variable, is fed directly into the with statement. Okay, so our code is still legible, and we’ve eliminated another unnecessary variable. Can we take it one step further? Well, yeah. This is what we could do:
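Dropping the with statement entirely might look like the following sketch (file name and contents are assumptions, and the setup lines only make it runnable). Note that nothing here explicitly closes the file:

```python
import os
import tempfile
from pathlib import Path

# Setup only: create a hypothetical example.txt in a scratch directory.
os.chdir(tempfile.mkdtemp())
Path("example.txt").write_text("first line\nsecond line\n")

# Two lines: the loop reads directly from the object open() returns.
# No with statement, so closing the file is left to the interpreter.
for line in open("example.txt"):
    print(line, end="")
```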

This works, and produces the same output as the previous snippets. But is it better? No. We’ve lost something here. The with statement has been taken out, and we’re jumping straight to the for loop. The with statement makes sure that the file object created by the open() function gets handled correctly if something goes wrong. It’s essentially like putting the code in a try/finally statement, where finally calls close() on the file object, no matter what happens. It’s the same as writing this:
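Spelled out by hand, the cleanup that with provides is roughly what this sketch does. The file name and contents are assumptions, and the setup lines only make the example self-contained:

```python
import os
import tempfile
from pathlib import Path

# Setup only: create a hypothetical example.txt in a scratch directory.
os.chdir(tempfile.mkdtemp())
Path("example.txt").write_text("first line\nsecond line\n")

f = open("example.txt")
try:
    for line in f:
        print(line, end="")
finally:
    # Runs whether or not the loop raised an exception.
    f.close()
```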

To put it another way, we reached peak efficiency with our third iteration of streamlining, the three-line version in which open() is fed directly into the with statement.

It’s at this point that we’d made the code as small and resource-light as possible, without losing any of its original functionality.

Now, this is not to say that this snippet is necessarily the ideal for any situation. Maybe you want to leave in some of those variables for a reason. Say you’re processing several files with the same snippet, and want to pass in the file names to the same file variable each time. Or maybe you just personally find the use of some extra variables makes your code more legible. That’s fine, and I wouldn’t argue with you at all. The point is to decide what works best for you, while taking into account the load the machine running the code has to deal with, and as a result, how quickly it can complete its job. If you are passing it a file that is only a thousand lines long, maybe reducing the code by a line or two won’t matter. If you’re passing in a thousand files that are a thousand lines long, however, it might be worth your while!

Taking Risks in Code to Build Your Confidence

If you’re a programmer, you probably fall into the bad habit, at least sometimes, of comparing yourself unfavorably to other programmers: that guy or gal who sits across the office from you and seems to be able to tackle large and complex problems with ease. “Will I ever be as smart or as skilled as them?” you wonder. “What separates them from me?” While experience and talent no doubt play a role in programming, one thing that programmers often overlook is the power of confidence gained from risk-taking. Since computers can only do exactly what we tell them to, and it’s easy to make small mistakes with large consequences, changing even a small part of your code can often be an anxiety-inducing experience. While the best code is loosely coupled and appropriately abstract, all code is susceptible to being broken by change. So we’re conditioned to make as small a change as possible, sometimes manually altering the input we feed into a program instead of altering existing code, in order to mitigate the risk. The result is that we miss out on the opportunity to build confidence as programmers.

Take the following example: Your boss gives you a spreadsheet with 1,100 rows of data, each one representing a car that the company has recently purchased. Your boss wants you to enter each car into the database. Obviously, this is a job for a small script. No problem. You look at the first five rows and see this (we’re keeping things simple for the sake of example):

Row  Make        Model     Year
1    Toyota      Corolla   2001
2    Mitsubishi  Lancer    2010
3    Lexus       ES 300    2014
4    Volvo       XC90      2018
5    Kia         Optima    2016

Seems easy. The script needs to import the file, and then read it line by line. For each line it needs to create a record in the database, with the make, model, and year of the car. So this is your solution:
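The script this post describes is PHP (it calls insertIntoDB() from a while loop). As a rough Python equivalent, here is a sketch of that first version. It assumes cars.csv has a Make, Model, Year header row, uses an in-memory string in place of the real file, and uses a hypothetical insert_into_db() that only appends to a list:

```python
import csv
import io

# Stand-in for cars.csv; the header row and column layout are assumptions.
CARS_CSV = """Make,Model,Year
Toyota,Corolla,2001
Mitsubishi,Lancer,2010
Lexus,ES 300,2014
Volvo,XC90,2018
Kia,Optima,2016
"""

database = []  # hypothetical stand-in for the real database


def insert_into_db(row):
    """Pretend to insert one car record (make, model, year)."""
    database.append(tuple(row))


reader = csv.reader(io.StringIO(CARS_CSV))
next(reader)  # skip the header row
for row in reader:
    insert_into_db(row)
```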

This script reads the file, cars.csv, into a while loop, and line by line, it calls the function “insertIntoDB()” and passes in an array representing the row it is currently looking at. You run the script on the test database and leave to grab a cup of coffee, confident that you have tackled the job quickly and painlessly. When you return to the office, however, your boss comes to you and says, “Actually, we need to make sure that only cars made on or after the year 2005 get entered into the database.” Okay, this is a little more complex, but an if statement should do the trick. Just wrap it around the call to insertIntoDB, so only those cars made on or after 2005 get through. Something like this:
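A Python sketch of that first attempt at filtering (the original script is PHP). The rows are written out as tuples for brevity, and insert_into_db() is a hypothetical stand-in that appends to a list:

```python
# First five spreadsheet rows as (make, model, year) strings.
rows = [
    ("Toyota", "Corolla", "2001"),
    ("Mitsubishi", "Lancer", "2010"),
    ("Lexus", "ES 300", "2014"),
    ("Volvo", "XC90", "2018"),
    ("Kia", "Optima", "2016"),
]

database = []  # hypothetical stand-in for the real database


def insert_into_db(row):
    """Pretend to insert one car record."""
    database.append(row)


for row in rows:
    # Naive check: assumes every year is a clean four-digit number.
    if int(row[2]) >= 2005:
        insert_into_db(row)
```

With these five rows, only the 2001 Corolla is filtered out.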

Is it really going to be this easy? Your programming instincts are telling you no. You take a closer look at the spreadsheet, and just as you suspected, there are some complications further down the line:

Row  Make        Model     Year
6    Honda       Accord    15
7    Volkswagen  Tiguan    08
8    Tesla       Model S   2018
9    Ford        Focus     2000
10   Nissan      Frontier  "2014"

Lines one through five were fine, but lines six and seven have a year format of XX, and line ten has a year with quotes around it, like this: “XXXX”. Your new code will not work in any of these instances. You scan through the next hundred records and see that every fifth record or so uses the XX format for the year, and every tenth record uses quotation marks.
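To see concretely how a naive comparison goes wrong, here is a small Python sketch (the original script is PHP, where the failure modes differ in detail). The helper name is hypothetical. Two-digit years are silently treated as the years 15 and 8, and the quoted year cannot be converted at all:

```python
def naive_is_2005_or_later(year):
    """Naive check that assumes a clean four-digit year string."""
    return int(year) >= 2005


results = {}
for year in ["2018", "15", "08", '"2014"']:
    try:
        results[year] = naive_is_2005_or_later(year)
    except ValueError:
        # int() refuses strings that contain quotation marks.
        results[year] = "error"
```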

Well, now we have a decision to make: we can solve the problem in a non-programmatic, less risky, and less confidence-building way, or we can do it the programmatic, riskier, more confidence-building way. Let’s take a look at both.

The Less Risky, Initially Easier Way

Here are the first ten lines of the spreadsheet, all together this time:

Row  Make        Model     Year
1    Toyota      Corolla   2001
2    Mitsubishi  Lancer    2010
3    Lexus       ES 300    2014
4    Volvo       XC90      2018
5    Kia         Optima    2016
6    Honda       Accord    15
7    Volkswagen  Tiguan    08
8    Tesla       Model S   2018
9    Ford        Focus     2000
10   Nissan      Frontier  "2014"

We’ve already identified the problem: we have differently formatted years. 1,100 rows isn’t nothing, but we could go through the spreadsheet ourselves (as in, opening up a spreadsheet app and tabbing through each cell one by one) and manually change the oddly formatted years to be four-digit numbers with no quotes around them. Sure, it would take a little bit of time, but you wanted to take a break from thinking too hard today anyway. Just put your headphones on and mindlessly plug through it. It will be like taking a mini break from your job, while still appearing productive, you tell yourself. Sounds tempting, and as an added bonus, you will be able to see to it yourself that all of the years are formatted properly, reducing the risk that the output in the database will not be as expected. But let me argue here that it is the very thing that makes this appealing, the idea of kicking back and wasting an hour doing something repetitive but easy, that makes it a bad idea.

Ask yourself, “What do computers do well that humans don’t?” Not a lot. In addition to being able to make calculations and logic-based comparisons (just another kind of calculation), we humans can write symphonies, books on philosophy, and yes, even tech blogs (cue applause). But there are some things we’re not as good at. While a math whiz can bang out a hundred calculations in a minute, that’s nowhere near as fast as even an older-model computer can. Plus, a computer can do these calculations for weeks on end without rest, while humans get tired after the first couple of hours. Think of everyday things computers do for us, like compressing video, sending it halfway around the world, and decompressing it on our friend’s computer screen so fast that it appears we are speaking to them in real time. At its heart, this too is just a computer doing a large number of simple calculations very, very fast.

I know this might be pretty obvious to a lot of people reading this, but the point I want to drive home is that to be a good programmer, you have to always keep this in mind, and let computers do this one thing they know how to do very well. In this particular instance, that means asking the computer to loop through a list of 1,100 records and format the years itself. You could do the formatting yourself, and there are obvious advantages to it, but you are missing the opportunity to build skills that will let you tackle the next data set that is, say, 10,000 records long, much too big for you to complete efficiently by hand. And more importantly, the risk you take by doing things the hard way will build your confidence as a programmer, which is a much greater gain than saving yourself a little time and stress in the short run.

The Riskier, Initially Harder Way

So we’ve decided to do things the slightly riskier way, and we’re back to our original problem: we’ve got three different year formats and we need to compare each one to an integer, 2005, and filter out the ones that are lower in number (or earlier in year, whatever you prefer). Here’s a snippet of our data set again:

Row  Make        Model     Year
1    Toyota      Corolla   2001
2    Mitsubishi  Lancer    2010
3    Lexus       ES 300    2014
4    Volvo       XC90      2018
5    Kia         Optima    2016
6    Honda       Accord    15
7    Volkswagen  Tiguan    08
8    Tesla       Model S   2018
9    Ford        Focus     2000
10   Nissan      Frontier  "2014"

The question before us is, what are the (potentially code breaking) features we need to add to our script? Well, we need a function that does essentially what we were thinking of doing by hand a minute ago. It needs to trim the potential quotation marks off the year, and, since some year values will be two digits, it needs to grab only the last two digits of the year for the comparison. A few lines of code should get us set up:

rtrim() and ltrim() remove the quotation marks. Then substr() gives us just the last two digits of the year. Finally, a ternary operator returns true if the last two digits are at least 05 (for 2005) and at most 18 (for 2018). It’s unlikely that any cars on the spreadsheet were made between 1905 and 1918, so this should be fine.
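In Python, the same three steps might look like the sketch below. Here str.strip() plays the role of rtrim() and ltrim(), slicing plays the role of substr(), and a chained comparison replaces the ternary. The function name is hypothetical, and the bounds are inclusive so that 2005 and 2018 themselves pass, matching the “on or after 2005” requirement. It assumes every value still contains at least one digit once the quotes are removed:

```python
def is_2005_or_later(year):
    """Return True if a messy year string falls between 2005 and 2018."""
    cleaned = year.strip('"')      # drop leading/trailing quotation marks
    last_two = int(cleaned[-2:])   # keep only the last two digits
    return 5 <= last_two <= 18     # 05..18 stands for 2005..2018
```

For example, is_2005_or_later("15") and is_2005_or_later('"2014"') both return True, while is_2005_or_later("2001") returns False.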

Note that while in this example, adding an extra function is not likely to break your code in any way that is difficult to repair, when you have an existing script that is, say, 5,000 lines long, adding a new function and calling it from within the code could mess things up in unexpected ways. But it’s still worth doing!

So what did we do here that a computer couldn’t do just as easily? We asked ourselves a philosophical question. “What is a year, in this context?” We decided, “A year is at minimum a two digit number, with no quotation marks.” But what if there are some values in the spreadsheet like this:

$05

Or this:

(2015)

After a little thought, we decide that we want to strip off all non-numerical characters from our string, not just quotation marks. Here is a function that will do that:

Now we’ve accounted for not just the unusual formats we’ve come across already, but also a large set of potential formats that we’ve anticipated might exist, based on what we’ve seen so far. This sort of intuitive problem solving is what we are still much better at doing than computers, so it’s beneficial that we’ve taken on this task ourselves, and asked the computer to do the repetitive stuff. Humans for the win! The preg_match() function just compares a string, in this case $year, to a regular expression, “/\d{2}$/”, and puts the results into an array called $matches. For more on preg_match() and regular expressions, see the PHP manual.
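A Python sketch of the same idea, using the re module in place of PHP’s preg functions. The function name is hypothetical. It first removes every non-digit character, then applies the same \d{2}$ pattern to pick out the last two digits:

```python
import re


def last_two_digits(year):
    """Strip non-digits, then return the final two digits (or '')."""
    digits = re.sub(r"\D", "", year)       # "(2015)" -> "2015"
    match = re.search(r"\d{2}$", digits)   # last two digits, if present
    return match.group(0) if match else ""
```

With this sketch, "$05" gives "05", "(2015)" gives "15", and a value with no digits at all gives an empty string.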

Now, there could be some nonsense year values in the spreadsheet that our function hasn’t accounted for: “2022956,” or “unicorn,” for example. But who knows what these are supposed to mean in the first place? The above function will take the number “2022956” and return “56,” which is as good a guess as any, and it will return a blank string for “unicorn,” which is also as good a guess as any, because there’s no such thing as “unicorn” year (at least as far as I know). What we have written will probably work for 95% of our records, and possibly for 100% of them. So that’s a pretty good spot to be in. Here is our final code:
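Putting the pieces together, a Python sketch of the finished script might look like this (the original is PHP). The rows are the ten shown earlier, written out as tuples. insert_into_db() is a hypothetical stand-in that appends to a list, and values with no usable digits are simply skipped, which is one possible policy rather than the only one:

```python
import re

ROWS = [
    ("Toyota", "Corolla", "2001"),
    ("Mitsubishi", "Lancer", "2010"),
    ("Lexus", "ES 300", "2014"),
    ("Volvo", "XC90", "2018"),
    ("Kia", "Optima", "2016"),
    ("Honda", "Accord", "15"),
    ("Volkswagen", "Tiguan", "08"),
    ("Tesla", "Model S", "2018"),
    ("Ford", "Focus", "2000"),
    ("Nissan", "Frontier", '"2014"'),
]

database = []  # hypothetical stand-in for the real database


def insert_into_db(row):
    """Pretend to insert one car record."""
    database.append(row)


def last_two_digits(year):
    """Strip non-digits, then return the final two digits (or '')."""
    digits = re.sub(r"\D", "", year)
    match = re.search(r"\d{2}$", digits)
    return match.group(0) if match else ""


def is_2005_or_later(year):
    """True when the last two digits fall in 05..18 (2005..2018)."""
    last_two = last_two_digits(year)
    return last_two != "" and 5 <= int(last_two) <= 18


for row in ROWS:
    if is_2005_or_later(row[2]):
        insert_into_db(row)
```

Of the ten sample rows, eight pass the filter; the 2001 Corolla and the 2000 Focus are left out.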

What’s The Point, Even?

So you might reasonably ask, “What was the point, really, of doing things the harder way? 1,100 records isn’t that many. In the time it took to conceptualize what a year is, and write a function that covered as many potential formats for it as possible, we could have just changed the values in the spreadsheet ourselves, and with less expended brain power.”

The answer is, while I understand the urge to do it this way, that won’t ever build your confidence as a programmer. Anyone can go through a spreadsheet and manually change numbers. By doing things programmatically, and enhancing your code as you go, instead of mitigating risk by avoiding making changes to existing code, you are taking on some manageable risk and building your confidence as a programmer, and confidence is a valuable thing to have. Today you’ve tweaked some code that affects 1,100 records. Tomorrow you will be comfortable adjusting code that accepts 2,000 records, and after that 5,000 and 10,000. You will be better at reading large blocks of source code, and seeing what the effects are of changing a specific part of the code to make it more robust. And if you mess up, you will be better in the future at finding where the problem occurred in the code, and how to fix it.

The more you are willing to risk making mistakes, the better you’ll become at programming, and the stronger and more error-free your code will be. That knowledge and confidence pays off in the end, because you’re really not as different from that person across the office from you as you might think.

Why use Lambdas in Python?

There is no doubt that lambdas in Python are fun to write. They’re like regular expressions lite: they give us a certain endorphin rush of accomplishment when we complete one. If you need a refresher, here is a simple example of a function being rewritten into a lambda:
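A minimal sketch of that rewrite. The names add2 and c come from the discussion here; the body (adding 2 to the argument) is an assumption based on the function’s name:

```python
# A regular function declaration.
def add2(x):
    return x + 2


# The same behavior as a lambda, assigned to the variable c.
c = lambda x: x + 2
```

Both add2(3) and c(3) return 5.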

 

Doesn’t look that special, right? I mean, all we’re doing is substituting a function literal, assigned to the variable c, for a function declaration called add2. We haven’t made the code lighter by any significant degree, nor have we made it more readable. So the question is, even though they’re cool, why use lambdas in the first place?

The answer comes when we want to call a function only once, especially inside another function. Take a function that takes another function as a parameter, the map function:
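As a sketch, passing a declared function to map might look like this. The list of numbers and the variable names are assumptions for illustration:

```python
def add2(x):
    return x + 2


numbers = [1, 2, 3]

# map applies add2 to every item; list() collects the results.
result = list(map(add2, numbers))
```

Here result ends up as [3, 4, 5].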

 

This can be written much more easily with a lambda:
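A sketch of the lambda version, using an assumed list of numbers. The function is defined right where it is used, so there is no separate name to look up:

```python
numbers = [1, 2, 3]

# The lambda is defined inline, right where map needs it.
result = list(map(lambda x: x + 2, numbers))
```

The value of result is [3, 4, 5], the same as with a separately declared function.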

 

I mean, that’s simpler, right? We only need to use the function once, so why not pass a lambda to the map function, as opposed to writing it on one line and referencing it in another? By writing a one line lambda function, we are:

  • making our code leaner
  • avoiding clogging up the namespace with functions we only use once
  • making our code more readable

The importance of this last point can’t be overstated, even for small scripts. Imagine we didn’t use a lambda for this map function, that we declared a function instead and passed it into map:

Then we came back later, and not thinking about it (an easy thing to do when you’re returning to code you previously wrote) added some lines of code in-between the function declaration and the map call:

Suddenly we’re left with a function call to add2 on line 302, and we have no idea what it does. We have to scan all the way up to line 2 to find out.

So the short answer to the question, “Why even use lambdas?” is “it’s just fun.” The longer answer is “it cuts down on line real estate, and makes our code more readable and less prone to bugs in the future.”

Leave a comment below letting me know what you think, and check out my Twitter feed here.

Absolute Relative Positioning

Absolute relative positioning is one of my favorite CSS tricks, and by far my favorite trick involving the CSS position style rule. The concept is very simple. Take two divs, a rectangle and a circle:

See the Pen wmLqzv by Float Nine (@floatnine) on CodePen.

There is some CSS at the bottom of the window (hidden) that determines the size and shape of the two divs. Ignore that. What we are focusing on here is the positioning style rules, shown in .circle and .rectangle.

Take another look:

See the Pen wmLqzv by Float Nine (@floatnine) on CodePen.

In the HTML (seen if you click the “HTML” button) you can see that the circle is clearly inside the rectangle. But because of the circle’s absolute position, with a bottom and right value of 0px, it is stuck to the bottom-right corner of the page. Now, let’s add one style rule to the .rectangle class:

See the Pen geNxwJ by Float Nine (@floatnine) on CodePen.

Pretty cool, huh? By adding the position of relative to the rectangle, we’ve completely changed the design layout of the page. Now the circle sits in the bottom-right of the rectangle, not the bottom-right of the page itself. Think of it like this: the circle is now positioned absolute bottom-right, relative to the rectangle.

And it’s really that simple. You can use this trick to position almost any element relative to its parent container.


Align Elements Vertically with Vertical Align

Align Elements Vertically: Simple, Right?

In CSS, developers spend a lot of time lining up elements next to each other. Lining up elements horizontally is not much of a problem. Aligning elements vertically, however, can be much trickier. Vertical-align seems like the property that should do the trick. But many CSS developers can’t answer the question: When is it appropriate to use this rule? One of the most common sources of confusion comes from overestimating what vertical-align can do.

Getting a Few Things Out of the Way

So let’s get this out of the way right now: vertical-align does not align elements vertically unless those elements are inline (inline-block elements and table cells count too). I’ll say that again: vertical-align does not align elements vertically unless those elements are inline. Block elements simply don’t cooperate with it!


That’s another way of saying that vertical-align is a very limited tool, but an important one. Knowing how to use it (and when not to) will make you a better developer.

How to Align Elements Vertically

Let’s take a look at an example to get a concrete understanding of what vertical-align does right. Here we have a simple Codepen with an image and a line of text next to each other on a page:

See the Pen wmbGqm by Float Nine (@floatnine) on CodePen.

Notice how the text vertically aligns with the bottom of the image? That’s vertical-align’s default value: baseline. Let’s change that:

See the Pen mxYeQo by Float Nine (@floatnine) on CodePen.

That’s it. We changed vertical-align from baseline to middle, and now we see it align elements vertically. In this case, the text vertically aligns with the middle of the picture. Vertical-align isn’t any more complex than that. It’s just a matter of knowing its limitations.

Convert Inline Styles to an External Style Sheet

Unless you’re writing HTML Email, there’s really no reason to add inline CSS to your code. Follow the link below, and paste your HTML with inline styles into the form. After clicking convert, the inline styles will be separated from the HTML in a separate form field:

Click Here for An Inline Styles to External Style Sheet Converter