Implementing the between operation

July 30, 2009

What code do developers write to check whether a value lies between two bounds (i.e., a between operation)?  I would write (where MIN and MAX might be symbolic names or numeric literals):

   if ( x >= MIN && x <= MAX )

that is, I would check the lowest value first. Performing the test in this order just seems the natural thing to do, perhaps because I live in a culture that writes left to right and a written sequence of increasing numbers usually has the lowest number on the left.

I am currently measuring various forms of if-statement conditional expressions that occur in visible source as part of some research on if/switch usage by developers and the between operation falls within the set of expressions of interest. I was not expecting to see any usage of the form:

   if ( x <= MAX && x >= MIN )

that is, with the maximum value appearing first. The first program measured threw up seven instances of this usage, all with the minimum value being negative and in five cases the maximum value being zero. Perhaps left to right ordering still applied, but to the absolute value of the bounds.

Measurements of the second and subsequent programs threw up instances that did not follow any of the patterns I had dreamt up. Of the 326 between operations appearing in the measured source, 24 had what I consider to be the unnatural order. Presumably the developers using this form of between consider it to be natural, so what is their line of thinking? Are they thinking in terms of the semantics behind the numbers (in about a third of cases symbolic constants appear in the source rather than literals), and do these semantics have an implied left to right order? Perhaps the authors come from a culture where the maximum value often appears on the left.

Suggestions welcome.

  1. Seven
    July 30, 2009 07:22 | #1

    My personal preference is to test the minimum first, but with the x on the right so it is really *between* the min and max:

    if ( MIN <= x && x <= MAX)

    I find that with the X in the middle and the same operator used twice, it’s far easier to recognize the “between” operation.

    I hope one day there will be an IDE that allows you to format a short piece of code to your liking, and then automatically applies your preferred style to all the code it shows…

  2. July 30, 2009 09:14 | #2

    Interesting idea, Seven, and at least three other people use it (or at least there are three instances in the Netscape sources, six in OpenMotif and six in Linux, none in gcc; around 4.5% of “between” operator usage).

    Having a constant appear on the left of a variable is not a pattern I am comfortable with, probably because I don’t get much practice using it.

    IDEs do have a long way to go in their support for personal preferences. Identifier naming conventions is another usage that cries out for automatic user adoption.

  3. August 5, 2009 16:31 | #3

    While I would instinctively write it this way:
    if ( MIN <= x && x <= MAX)

    I do find myself, when I’m programming, trying to minimize the amount of work the computer goes through. So, for example, if I came across this if statement I wouldn’t be surprised at all to find myself writing it the other way around, simply because I would expect, in that situation, that the MAX condition would fail more often than the MIN condition. Thus, in my head and perhaps not in real life, I would be saving the CPU some cycles.

    All of that assumes though that during compilation the compiler sets it up so that the computer actually evaluates the conditions from left to right like my head does.
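
On the evaluation-order assumption in comment #3: in C, the && operator is specified to evaluate its left operand first and to skip the right operand when the left one is false, so the order in the source does decide which test runs first. A minimal sketch of this, where the helper functions and counters are hypothetical names introduced only to make the evaluation order observable:

```c
#include <stdbool.h>

/* Hypothetical counters, used only to record which tests were evaluated. */
static int min_tests = 0;
static int max_tests = 0;

/* Each helper counts its own calls, then performs one half of the range test. */
static bool at_least(int x, int min) { ++min_tests; return x >= min; }
static bool at_most(int x, int max)  { ++max_tests; return x <= max; }

/* Maximum tested first: when x > max, && stops without testing the minimum. */
static bool between_max_first(int x, int min, int max)
{
    return at_most(x, max) && at_least(x, min);
}
```

Calling between_max_first(11, 1, 10) increments max_tests only, because the minimum test is never reached. Whether the ordering saves anything measurable is a separate question: when both tests are free of side effects, an optimizing compiler may combine or reorder them.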

  4. Martin
    August 14, 2009 20:11 | #4

    @Derek-Jones
    Having the constant on the left can prevent errors such as:
    if ( MIN = x )

    when you meant to use a comparison operator. Therefore I usually write the constants at the left hand side.

  5. Alan Stokes
    September 7, 2009 08:19 | #5

    I would also probably write MIN <= x && x <= MAX, probably because it’s closer to the mathematical formulation.

    BCPL allows/allowed you to write what you really meant – there
    MIN <= x <= MAX
    would do the right thing.
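
For contrast with the BCPL behaviour described in comment #5, the same chained expression is legal C but means something else: MIN <= x <= MAX is parsed as (MIN <= x) <= MAX, so the 0 or 1 produced by the first comparison is what gets compared against MAX. A small sketch, with function names that are illustrative only:

```c
#include <stdbool.h>

/* The chained form as C parses it; the parentheses are written out
   explicitly here to avoid compiler warnings about the chained spelling. */
static bool chained_as_parsed(int x, int min, int max)
{
    return (min <= x) <= max;   /* compares 0 or 1 against max */
}

/* The intended between test, written with two comparisons. */
static bool explicit_between(int x, int min, int max)
{
    return min <= x && x <= max;
}
```

With bounds 1 and 10, chained_as_parsed(50, 1, 10) is true even though 50 is out of range, while explicit_between(50, 1, 10) is false.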

    I’d also consider writing a generic is_between function.
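
One possible shape for the is_between function suggested in comment #5, as a sketch only. This version is fixed to int and treats both bounds as inclusive, which are assumptions made here; a genuinely generic version in C would need a macro or C11 _Generic dispatch instead:

```c
#include <stdbool.h>

/* Inclusive range test with the value in the middle, mirroring the
   mathematical form min <= x <= max. Assumes min <= max. */
static inline bool is_between(int x, int min, int max)
{
    return min <= x && x <= max;
}
```

A call site then reads if ( is_between(x, MIN, MAX) ), which removes the question of which bound to write first, at the cost of having to remember the argument order.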
