The writings of Peter Stuifzand

Archive for July 2008

The ternary operator can be found in many programming languages. In some popular ones it is written as bool ? expr1 : expr2, which will evaluate to expr1 if bool is true or expr2 if bool is false.

The nice thing about the ternary operator is that one can write an otherwise big if block in one line. For example:

$var = null;
if ($x > 10) {
    $var = 102;
}
else {
    $var = 1044;
}

becomes

$var = $x > 10 ? 102 : 1044;

I hope the pros of this approach are obvious.

But then I read an article which tries to explain the usage of the ternary operator in Java. At first I liked the simple design of the page. But the code examples are probably some of the worst that I have seen.

boolean giveTicket;
giveTicket = speed > speedLimit ? true : false;
if (giveTicket)
    pullEmOver();   // nab the offender!

The biggest problem with this example is that the second and third parts of the expression are true and false. The first part of the expression already evaluates to a boolean, so the ternary is redundant. This example can be written without the ternary operator.

if (speed > speedLimit) {
    pullEmOver();
}

This is cleaner and contains fewer words. To keep the intent as obvious as the giveTicket variable made it, you could create a method for that comparison.

public boolean giveTicket(int speed) {
    return speed > speedLimit;
}

...

if (giveTicket(speed)) {
    pullEmOver();
}

It won't get better than this. The nice thing about this is that you can change the code in giveTicket if there are changes to the ideas about what is illegal.
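As a rough sketch of both points, the class below puts the redundant ternary next to the plain comparison, and then shows how a changed rule stays inside giveTicket. The class name SpeedCheck and the tolerance field are my own assumptions for the illustration; they are not from the article I criticised.

```java
final class SpeedCheck {
    private final int speedLimit;
    // Hypothetical rule change: how far above the limit is still tolerated.
    private final int tolerance;

    SpeedCheck(int speedLimit, int tolerance) {
        this.speedLimit = speedLimit;
        this.tolerance = tolerance;
    }

    /** The redundant form: the comparison is already a boolean. */
    boolean overLimitRedundant(int speed) {
        return speed > speedLimit ? true : false;
    }

    /** The same result without the ternary operator. */
    boolean overLimit(int speed) {
        return speed > speedLimit;
    }

    /** The rule can change here without touching any caller. */
    boolean giveTicket(int speed) {
        return speed > speedLimit + tolerance;
    }
}
```

Code that calls giveTicket(speed) does not have to change when the tolerance is introduced or removed.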

Over the last few days I saw a few things related to Batman. This is, of course, because the new Batman film came out a few days ago (in the US, that is; in the Netherlands we still have to wait a few days).

A review of Dark Knight by the guys of The Totally Rad Show.

A review of some Batman comics by the guys of iFanboy.

An amazing animated movie that is about stories happening between the two movies.

On Monday my girlfriend and I went to the NEMO science center in Amsterdam. It is a place where you can discover technical and scientific things, from DNA to love and from electricity to water.

A view of the NEMO science center

NEMO is a five-story building that looks like a huge boat. Because there is a lot of construction going on, you currently have to go in through a back entrance.

On the first floor there is a chain reaction in the spirit of the Pythagoras Switch Japanese TV show. You can watch this exhibit a few times a day.

There is a lot of difference between the techniques that are shown. On the one side you have low tech: you pull yourself up while sitting on a chair, to show the effect of pulleys. With one pulley it's heavy to pull yourself up, but with three or even five it's much easier. On the other side there is high tech: a face recognition Pong, where you move the paddle by showing different emotions (sad, happy, angry, neutral, surprised).

On one of the floors there is a huge ball factory. A part of this machine is automated and part of it is manual labor. It is a game you can play. You need to put balls in order in one place. A touch screen shows that order. In another place you have to use a bar code scanner to check the contents of crates that pass by. This is done with a checksum that you can calculate by adding and multiplying. The machine is cool to look at. It is huge and mostly automatic with cables, switches and controlling hardware showing.

We did chemistry experiments in the lab on the fourth floor. There are twelve experiments to choose from. We only did two, because there were not enough lab coats for the group that came after us.

On the sun roof of the building, you can sit and recharge for a bit after everything you have seen. It is named after the actor who does the voice of Ernie on the Dutch Sesame Street. He also presented the National Science Quiz every year (which is probably why the roof is named after him).

The whole place is explained in Dutch and English. It is a nice place to spend a day. I can recommend it to everyone.

This is an article that I wrote about two years ago. I'd like to share it with you because I think that it contains a few nice rules for writing better software.

When creating websites, there comes a time when you want to send some dynamically generated output to the web browser. With a CGI script this is done by printing to stdout. If you don't take some simple precautions, it will be easy for crackers to take over your website.

The simple rule is: encode all output. This means that if data is going from your program to another place, you should encode it. The way in which the data will be encoded is dependent on where it's going. I will show this with three examples.

Rule #1: Encode all dynamic output

HTML

On a webpage with HTML it's possible for an evil person to change parts of the website. Sometimes this isn't a very big problem, like when he can only get the HTML onto his own page. But when it's possible to get user input onto another user's page, there is a possibility for cross-site scripting (XSS), which is something you don't want.

Next I will show a simple PHP program with a problem.

<html>
<body>
<form action="badscript.php" method="post" >
Email: <input type="text" name="email" />
<input type="submit" value="Mail me!" />
</form>
</body></html>

This will look like the following webpage.

Email: <input type="text" name="email"/>
<input type="submit" value="Mail me!"/>

OK, so this is really simple. The next part of this needs a badly written PHP script.

<?php
   // Vulnerable on purpose: the user input is echoed without any encoding.
   echo "The text you wrote is: " . $_POST["email"] . "<br />";
   echo 'Try again: <form action="badscript.php" method="post" >
Email: <input type="text" name="email" value="' . $_POST['email'] . '"/>
<input type="submit" value="Mail me!" />
</form>';
?>

So now let's try the script. This script lets you write some text, and then that text gets printed on the webpage when you click the button. Now you can try some things. Nice examples are: normal text, some text with a bit of HTML, like bold tags, or things with quotes.

To save you from some embarrassment, it's better to encode the output of the text in the script. The parts of the text that can destroy your page are the characters that are interpreted differently in HTML than in plain text.

The characters that you should encode are:

character   encoded
&           &amp;
<           &lt;
>           &gt;
"           &quot;
'           &apos;

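The table above shows the order in which the characters should be encoded. The & comes first because it is part of the other entities: if you encoded it later, you would encode the ampersands of the entities you just wrote a second time. The order of the other characters doesn't matter that much.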

<?php
   // ENT_QUOTES makes htmlentities() encode single quotes as well as double quotes.
   echo "The text you wrote is: " . htmlentities($_POST["email"], ENT_QUOTES) . "<br />";
   echo 'Try again: <form action="goodscript.php" method="post" >
Email: <input type="text" name="email" value="' . htmlentities($_POST['email'], ENT_QUOTES) . '"/>
<input type="submit" value="Mail me!" />
</form>';
?>

The driver form:

Email: <input type="text" name="email"/>
<input type="submit" value="Mail me!"/>
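To make the ordering rule from the table concrete, here is a minimal hand-written encoder sketched in Java. The class and method names are my own assumptions, and in a real program you would normally use the encoding function of your language or framework (like htmlentities in PHP) instead of writing your own. The second method encodes the ampersand last on purpose, only to show what goes wrong. Note that &apos; follows the table above; older HTML 4 browsers may not know it, and &#39; is the more portable spelling.

```java
final class HtmlEncoder {
    private HtmlEncoder() {}

    /** Encodes the five characters from the table, with '&' first. */
    static String encode(String text) {
        return text
            .replace("&", "&amp;")   // first, so the entities added below are left alone
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\"", "&quot;")
            .replace("'", "&apos;");
    }

    /** Deliberately wrong order: the '&' of "&lt;" gets encoded again. */
    static String encodeAmpersandLast(String text) {
        return text.replace("<", "&lt;").replace("&", "&amp;");
    }

    public static void main(String[] args) {
        // prints &lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;
        System.out.println(encode("<b>Tom & \"Jerry\"</b>"));
        // prints &amp;lt;b>
        System.out.println(encodeAmpersandLast("<b>"));
    }
}
```

The second line of output shows the double encoding: the browser would display the literal text &lt;b> instead of <b>.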

When you program PHP for a living, it's good to know that magic quotes is a hack that will break your page. The nice thing, however, is that you don't need magic quotes to survive injection attacks. If you know what, when and how to encode, all problems with injection attacks can be thwarted.

MySQL

A MySQL query is output just like HTML is, but instead of going to the web browser it goes to a MySQL server. With MySQL the rules are even simpler than with HTML.

Rule #2: Every ' that should be inserted into the database, should be replaced with \'.

Note that this is a specialization of rule #1: only the rules for encoding MySQL are different from the rules for encoding HTML. It's all about the special characters. The ' is a special character in MySQL; other databases may have other special characters.

The easiest way to get rid of special characters in SQL queries is by using prepared statements. A prepared statement is an SQL query that is 'prepared' before it is used. In a prepared statement you can use the question mark to specify a place where a variable will be inserted.

In PHP, prepared statements can be found in the mysqli module, by means of the mysqli_prepare function. In Perl the DBI module is all you need, and in Java you should take a look at the PreparedStatement class. All classes, modules and functions work in a similar way.

  1. Prepare a MySQL query; use ? where variables should go.
  2. Execute the statement as many times as you like, passing the values each time.

As you can see it's really simple. In Perl prepared statements work with the prepare method.

use DBI;

my $conn = DBI->connect(...);
# Find all people in Berlin.
my $stmt = $conn->prepare(<<"SQL");
SELECT name, city
FROM people
WHERE city = ?
SQL

$stmt->execute('Berlin');
while (my $row = $stmt->fetchrow_arrayref) {
    # do something with $row->[0] and $row->[1]
}

By using prepare and execute, DBI will make sure that the value 'Berlin' gets quoted. Another nice side effect is that the resulting code will run faster, and $stmt can be used multiple times with other arguments. This is especially useful for INSERT queries, where the query stays the same but the arguments change. The prepared statement is faster because the query doesn't have to be parsed every time the statement is executed.
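The INSERT case could look roughly like this in Java with the PreparedStatement class. This is only a sketch under assumptions: the class name PeopleInserts is made up, the people table with its name and city columns is taken from the Perl query above, and the Connection is expected to come from whatever JDBC driver you use.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class PeopleInserts {
    static final String INSERT_PERSON =
        "INSERT INTO people (name, city) VALUES (?, ?)";

    private PeopleInserts() {}

    /**
     * Inserts every {name, city} pair with one prepared statement
     * and returns the number of inserted rows.
     */
    static int insertAll(Connection conn, List<String[]> people) throws SQLException {
        int inserted = 0;
        // Prepared once; only the arguments change per row.
        try (PreparedStatement stmt = conn.prepareStatement(INSERT_PERSON)) {
            for (String[] person : people) {
                stmt.setString(1, person[0]); // the driver takes care of quoting
                stmt.setString(2, person[1]);
                inserted += stmt.executeUpdate();
            }
        }
        return inserted;
    }
}
```

A name like O'Brien needs no special treatment here, because the value never becomes part of the query text.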

Input

This section probably shouldn't be in this article, but as long as there isn't a better place, it will stay here.

Rule #3: Keep the input as close to the original as possible

This means that you shouldn't encode the input from the user in any way if it isn't necessary. For example: a piece of text typed by a user shouldn't be HTML-encoded when it's inserted into the database. If you do encode the text at that time, it's a lot harder in the future to use the text in another medium, like plain text or PDF, because they expect another format.

Rule #4: Always check user input.

Don't check the input for possible encoding problems like quotes or angle brackets. But do check for the empty string (no value), or for digits (when expecting a number). The simplest way to check these kinds of things is by using regular expressions.

The number check in Perl:

my $number = $cgi->param('id');
if ($number !~ /^\s*(\d+)\s*$/) {
    $error = "id should be a number";
    return;
}
$number = $1;

This check finds out if there is a number. It can contain optional whitespace at the front and at the end. Also notice the caret (^) and the dollar-sign ($), which make sure it's only a number. Otherwise it could match a number in a string like "ad3df" (3), which would probably be the wrong thing.
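The same check can be sketched in Java with java.util.regex. The class name IdParameter and the choice to return an Optional are assumptions for the example; the pattern itself is the one from the Perl code above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IdParameter {
    // Same pattern as the Perl check: optional whitespace, digits, optional whitespace.
    private static final Pattern NUMBER = Pattern.compile("^\\s*(\\d+)\\s*$");

    private IdParameter() {}

    /** Returns the digits without surrounding whitespace, or empty if raw is not a number. */
    static Optional<String> parse(String raw) {
        if (raw == null) {
            return Optional.empty(); // parameter was not sent at all
        }
        Matcher matcher = NUMBER.matcher(raw);
        // find() plus the ^ and $ anchors only succeeds when the whole value is a number.
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

With this sketch, IdParameter.parse(" 42 ") gives the digits 42, while "ad3df" and the empty string both give an empty result, so the caller can report "id should be a number".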

Tonight I implemented comments on my weblog. I have had this blog for four years already, and now at last you can post a comment.

The biggest problem with implementing comments is that my blog is static. Otherwise it wouldn't be that hard to implement the comments.

At the moment you shouldn't pay much attention to the esthetics.
