(6) Testing

Due: Monday, January 26, 11:59 PM Eastern

Test-Driven Development

If you don’t know what a function is supposed to do, you can’t write it.

That may seem like an obvious statement. I can tell you, though, that I’ve seen many students try to write functions without fully understanding the expected behavior. They end up wasting a lot of time, and I don’t want that to happen to you!

Test-driven development helps us work out the expected behavior of a function, and gives us a nice way to test the function once we’ve written it!

Writing a Test Function

Let’s consider a small example. Suppose you need to write a function to return the price of attending an event for a given number of people. The price is $10 per person, but if there are more than 5 people in the group, there is a group discount and the price is $8 per person. First, you figure out the function stub and RME:

def calculate_price(num_people: int) -> int:
    """
    Requires: num_people is non-negative
    Modifies: Nothing
    Effects: Returns the total price for num_people attending the event.
             The price is $10 per person for 5 or fewer people, and $8
             per person for more than 5 people.
    """
    return -1

The return -1 is just a temporary placeholder so that the code can run. Now, before writing the function body itself, let’s start writing the test function.

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(calculate_price(2))

Note some good things we have already: the test function has a descriptive name that matches the function it tests, it has a docstring, and it begins by printing which function is being tested.

Label Your Output

We know that once we implement calculate_price, we would expect test_calculate_price as written above to print Testing calculate_price... and then 20. It might seem easy enough to just remember this, but note that eventually we’ll write lots of tests for lots of functions, so it will be helpful to label our output. Here’s an example:

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")

The label should include the input, expected output, and actual output of the function. Otherwise there are no definitive rules for how you print this out. You could write print(f"The price for 2 people is: {calculate_price(2)} and is expected to be 20.") or something else. The important thing is that when you see the output, you can easily tell what input produced what output, and what it was supposed to be.
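If you end up writing many cases in this style, one optional way to cut the repetition is a small helper function. This is just a sketch, not a course requirement, and `show_case` is a made-up name; a completed `calculate_price` is included only so the example runs:

```python
def calculate_price(num_people: int) -> int:
    # Completed implementation, included only so this sketch runs.
    if num_people <= 5:
        return num_people * 10
    return num_people * 8

def show_case(num_people: int, expected: int) -> None:
    """Prints one labeled test case for calculate_price (hypothetical helper)."""
    print(f"Input {num_people}, Expected {expected}, Actual {calculate_price(num_people)}")

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    show_case(2, 20)
```

Each call to `show_case` prints one labeled line, so the output is the same as before; only the repetition is reduced.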

Test Every Case

Suppose we then wrote calculate_price, and test_calculate_price indicates that the function worked for the input of 2 people. Great! But we haven’t tested the function thoroughly yet. It worked for 2 people, but might not work for other amounts. So we think some more about what the function is supposed to do. We remember the “Effects” clause of the RME: the price is $10 per person for 5 or fewer people, and $8 per person for more than 5 people. So we should certainly have a test case for more than 5 people. Let’s add that:

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")
    print(f"Input 6, Expected 48, Actual {calculate_price(6)}")

Test Boundary Cases

This is looking good! But we shouldn’t stop here. Notice that the pricing changes at 5 people. This is called a boundary case because the behavior of the function changes at that point. Boundary cases are very important to test, because they are often where mistakes happen. For example, someone might use a < operator when they should use a <= operator, or vice versa. Let’s remind ourselves of the Effects clause again: the price is $10 per person for 5 or fewer people, so 5 people should have a price of $50. Let’s add that boundary case:

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")
    print(f"Input 6, Expected 48, Actual {calculate_price(6)}")
    print(f"Input 5, Expected 50, Actual {calculate_price(5)}")

Are there any other boundary cases? We should probably test 0 people as well. That’s the lowest allowed value for num_people, according to the Requires clause. So:

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")
    print(f"Input 6, Expected 48, Actual {calculate_price(6)}")
    print(f"Input 5, Expected 50, Actual {calculate_price(5)}")
    print(f"Input 0, Expected  0, Actual {calculate_price(0)}")

Notice that I added an additional space between Expected and 0 in the last line. This is just to help line up the output a little better when we print it out. Things like this are not required in testing, but can be nice.
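As an aside, Python’s f-strings can do this alignment automatically with a format specification: `{value:>3}` right-aligns the value in a field three characters wide. This is standard Python, not something the course requires; a completed `calculate_price` is included only so the sketch runs:

```python
def calculate_price(num_people: int) -> int:
    # Completed implementation, included only so this sketch runs.
    if num_people <= 5:
        return num_people * 10
    return num_people * 8

# :>3 pads each number on the left so the columns line up.
print(f"Input 6, Expected {48:>3}, Actual {calculate_price(6):>3}")
print(f"Input 0, Expected {0:>3}, Actual {calculate_price(0):>3}")
```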

Don’t Test Disallowed Inputs

Let’s consider the Requires clause again: num_people is non-negative. So you might think we should test negative inputs as well. Generally we don’t, though. The Requires clause indicates what inputs the function is expected to handle correctly. Inputs that violate the Requires clause are disallowed inputs; the function is not required to handle them in any particular way. Don’t test disallowed inputs in EECS 183.

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")
    print(f"Input 6, Expected 48, Actual {calculate_price(6)}")
    print(f"Input 5, Expected 50, Actual {calculate_price(5)}")
    print(f"Input 0, Expected  0, Actual {calculate_price(0)}")
    print(f"Input -1, Expected -10, Actual {calculate_price(-1)}")  # Don't do this!

The function is not required to handle -1 correctly, because -1 violates the Requires clause. How could we say what the correct output is for that input? Someone might say -10, as we’ve written here, but someone else might say that the result should be 0, or an error should occur. The Requires clause disallows this input, so the function isn’t required to handle it, so it’s not “fair” for us to test it.
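To see why, here are two hypothetical implementations. Both satisfy the RME exactly (they agree on every non-negative input), yet they disagree on the disallowed input -1, so there is no single “expected” value we could write for it:

```python
def calculate_price_a(num_people: int) -> int:
    # Satisfies the RME: correct for every non-negative input.
    if num_people <= 5:
        return num_people * 10
    return num_people * 8

def calculate_price_b(num_people: int) -> int:
    # Also satisfies the RME, but happens to return 0 for negative inputs.
    if num_people < 0:
        return 0
    if num_people <= 5:
        return num_people * 10
    return num_people * 8

print(calculate_price_a(-1))  # -10
print(calculate_price_b(-1))  # 0
```

Since both versions are “correct” as far as the RME is concerned, a test on input -1 would unfairly fail one of them.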

Tests Prepare Us to Write the Function

Notice how, by thinking about test cases, we’ve reached a firmer understanding of what calculate_price is supposed to do. Maybe calculate_price is simple enough that this feels unnecessary, but for more complex functions, this process is very helpful. By writing test cases, we force ourselves to think about what the function should do in various situations, thus clarifying the requirements.

Furthermore, by writing the test function first, we have a ready-made way to check whether our implementation of calculate_price is correct. Once we’ve written calculate_price, we can simply run test_calculate_price to see if everything works as expected.
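Putting the pieces together, a file in mid-development might look like the sketch below. The `if __name__ == "__main__":` guard is a common Python idiom for running tests from the same file; check your course’s starter code for the exact convention it uses:

```python
def calculate_price(num_people: int) -> int:
    """
    Requires: num_people is non-negative
    Modifies: Nothing
    Effects:  Returns the total price for num_people attending the event.
    """
    return -1  # placeholder until we write the body

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")

if __name__ == "__main__":
    test_calculate_price()  # shows "Actual -1" until the body is written
```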

Don’t Just Blindly Add Tests

Sometimes you might be tempted to add tons of tests, just in the hopes of catching another bug. For example:

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")
    print(f"Input 6, Expected 48, Actual {calculate_price(6)}")
    print(f"Input 5, Expected 50, Actual {calculate_price(5)}")
    print(f"Input 0, Expected  0, Actual {calculate_price(0)}")
    print(f"Input 3, Expected 30, Actual {calculate_price(3)}")
    print(f"Input 4, Expected 40, Actual {calculate_price(4)}")
    print(f"Input 7, Expected 56, Actual {calculate_price(7)}")
    print(f"Input 25, Expected 200, Actual {calculate_price(25)}")

The four tests just added aren’t really useful: inputs 3 and 4 exercise the same branch as input 2, and inputs 7 and 25 exercise the same branch as input 6. They don’t add anything to the test function. Each test case should be purposeful. Think about what the function is supposed to do and which inputs are important to test; boundary cases are especially important. But don’t just add random tests without thinking about them.

How the Autograder Checks Your Tests

In some of your course work (e.g., project 2), we require you to write test functions for your code. This will be very helpful for you for the reasons described above. We also have the autograder set up to check your test cases. How do we do that? We take your test code and run it with two or more “secret” versions of the functions being tested. One of the secret versions is a perfect, completely correct implementation. Other secret versions have intentional mistakes in the implementation. The autograder makes sure that your test cases can tell the difference.
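The idea behind this check can be sketched in plain Python. The toy harness below is not the real autograder, just an illustration: it runs the same test inputs against a correct version and a buggy version, and a good set of inputs should produce different results for the two:

```python
def correct_version(num_people: int) -> int:
    if num_people <= 5:
        return num_people * 10
    return num_people * 8

def buggy_version(num_people: int) -> int:
    if num_people < 5:  # intentional boundary mistake
        return num_people * 10
    return num_people * 8

def run_tests(calculate_price) -> list:
    # Returns the outputs a weak test suite would observe.
    return [calculate_price(n) for n in (2, 6)]

def run_better_tests(calculate_price) -> list:
    # Adds the boundary cases 5 and 0.
    return [calculate_price(n) for n in (2, 6, 5, 0)]

# Inputs 2 and 6 alone cannot tell the two versions apart...
print(run_tests(correct_version) == run_tests(buggy_version))  # True

# ...but including the boundary case 5 exposes the bug.
print(run_better_tests(correct_version) == run_better_tests(buggy_version))  # False
```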

For example, suppose someone’s test_calculate_price function does not have enough test cases. Maybe it’s like this:

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")
    print(f"Input 6, Expected 48, Actual {calculate_price(6)}")

The autograder checks this test function against several versions of calculate_price. One version is correct:

def calculate_price(num_people: int) -> int:
    """
    Correct version
    """
    if num_people <= 5:
        return num_people * 10
    else:
        return num_people * 8

Other versions may be incorrect in various ways. For example:

def calculate_price(num_people: int) -> int:
    """
    Incorrect version: Mistake in the boundary condition
    """
    if num_people < 5:  # WRONG!
        return num_people * 10
    else:
        return num_people * 8

So the autograder runs your test_calculate_price function with the correct version and gets this output:

Testing calculate_price...
Input 2, Expected 20, Actual 20
Input 6, Expected 48, Actual 48

Then, the autograder runs your test_calculate_price function with the incorrect version and gets this output:

Testing calculate_price...
Input 2, Expected 20, Actual 20
Input 6, Expected 48, Actual 48

As we can see, the test_calculate_price function produces the same output for both the correct and incorrect versions. This means that the test cases are insufficient.

In contrast, this test function is sufficient:

def test_calculate_price() -> None:
    """Tests the calculate_price function."""
    print("Testing calculate_price...")
    print(f"Input 2, Expected 20, Actual {calculate_price(2)}")
    print(f"Input 6, Expected 48, Actual {calculate_price(6)}")
    print(f"Input 5, Expected 50, Actual {calculate_price(5)}")
    print(f"Input 0, Expected  0, Actual {calculate_price(0)}")

On the correct version, that test function outputs:

Testing calculate_price...
Input 2, Expected 20, Actual 20
Input 6, Expected 48, Actual 48
Input 5, Expected 50, Actual 50
Input 0, Expected  0, Actual 0

On the incorrect version, it outputs:

Testing calculate_price...
Input 2, Expected 20, Actual 20
Input 6, Expected 48, Actual 48
Input 5, Expected 50, Actual 40
Input 0, Expected  0, Actual 0

This test function correctly distinguishes between the two versions of calculate_price, and thus catches the boundary condition bug.
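Looking ahead (beyond what this course asks for), the same cases can also be written with Python’s built-in assert statement, which stops the program with an error at the first failed check instead of relying on you to read printed output:

```python
def calculate_price(num_people: int) -> int:
    # Completed implementation, included only so this sketch runs.
    if num_people <= 5:
        return num_people * 10
    return num_people * 8

def test_calculate_price() -> None:
    """Tests calculate_price using assert instead of print."""
    assert calculate_price(2) == 20
    assert calculate_price(6) == 48
    assert calculate_price(5) == 50
    assert calculate_price(0) == 0
    print("All calculate_price tests passed.")

test_calculate_price()
```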


© 2026 Steven Bogaerts.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


All materials provided for this course, including but not limited to labs, projects, notes, and starter code, are the copyrighted intellectual property of the author(s) listed in the copyright notice above. While these materials are licensed for public non-commercial use, this license does not grant you permission to post or republish your solutions to these assignments.

It is strictly prohibited to post, share, or otherwise distribute solution code (in part or in full) in any manner or on any platform, public or private, where it may be accessed by anyone other than the course staff.

To do so is a violation of the university’s academic integrity policy and will be treated as such.

Asking questions by posting small code snippets to our private course discussion forum is not a violation of this policy.