Programming Standards for Ma128a

Part of the intention of this course is to teach some programming techniques and provide programming experience. Since we also have a lot of other topics to teach, we can't do as complete a job as courses like E77 or CS61A/B/C devoted to programming, but we still hope to teach you the difference between good code and bad code, and a bit about testing.

Fortunately, our CS61B colleagues have written some detailed material on coding, testing and debugging styles. Though it is oriented around Java and C, most of it applies to good code written in any language, including Matlab and Fortran. I recommend reading General Guidelines for Programming Projects (in postscript) (or the pdf version) for details. But here are some general guidelines.

Programming Style

Write your code with the reader in mind. In other words, pretend that you (or someone else) will want to look at it again in 6 months and want to use it and possibly modify it. So you should make it as easy to read and understand as posssible. So document it well. Here are standards to do that:

Break your code into functions that perform one or a closely related set of operations.

Use (long) names for your functions and variables that make it clear what they do. For example, "iterations" not "i", and "bisect()" not "b()".

A function should be short enough to be visible all at once in the editor window. If it is too long, break it up.

Each function should begin with comments (that will appear if you type "help func" in Matlab) that explain how it is used. These comments should include a short description of each input variable, each output variable, the algorithm, what assumptions are made about the input to work correctly, and what global variables (avoid them) are used, if any.

Indent your programs so it is easy to identify loops, nested if statements, comments, etc. Use the same style consistently, so it is easy to read.

Testing

Test your code thoroughly. Though testing can only show the presence of bugs, not their absence, it is a major way to gain confidence. Depending on the language you use, it may have a built-in debugger that lets you "single-step" through the program, stop at a certain line in the code, examine and change values of variables, and continue executing, etc. Use such a debugger if it is available, otherwise use print statements in the form "if (debug>0), `print some data', end", where debug is a flag you can set. For large programs with many parts, you should both test the parts to see if they behave as expected, and test the whole thing ("black-box testing"). The trouble with just doing black-box testing is that you may not even execute all the lines of code you have written.

Picking test cases is hard work. You have to get into a nasty mood, pretend that the program is your adversary, and try to think of things that will break it. In the real world, testing and coding is sometimes done by different people or teams to make this adversarial approach work better. The idea is to think of inputs that are in some sense as hard as possible, right at the boundary of the legal set of inputs that the program claims it will work on.

In numerical programs, it is often challenging to say exactly what this boundary is. For example, it is often possible to break a code by putting in data all near the overflow threshold or underflow threshold, so that the intermediate computations quickly over/underflow. Or the user may want the zeros of a polynomial that you can't evaluate with any accuracy anywhere in the search interval. A really robust code might try to detect and deal with all such possible nasty cases, but such code is hard (and expensive) to write. As one example, a state-of-the-art library for solving linear algebra problems called LAPACK (for Linear Algebra PACKage) includes all its test code as part of the publically available software, and there is more test code than actual library code. So while we will study the limits of numerical algorithms in this class, we will not expect you to write really "bullet-proof" code that deals with all possible nasty inputs. We will expect you to make and describe a good-faith effort to try a lot of examples that you believe to be hard examples. Your ability to think up hard examples is a measure of your understanding of the algorithm.

What you should turn in

For your programming assignments, you should turn in the following items. Each one will count 1/3 of the grade.

A program listing, well-written and documented according to the guidelines above.

A separate description of the mathematical derivation of non-obvious parts of your algorithm, and a description of the complexity of your algorithm. The complexity analysis will depend on how hard the algorithm is to analyze, and your judgement as to the most important costs. So for bisection, the answer would be the number of function evaluations, as a function of the initial input interval width and the tolerance, plus a statement that the other operations in the algorithm are proportional to the number of function evaluations.

A list of test cases, what you expect the answer to be, what it really was (an error bound), whether the error bound meets your theoretical expectations, and some description of why you chose the test cases you did. The test cases given in the assignment are meant to be a minimal set; you should try more to explore the limits of the algorithm, and explain what you see. Ideally, you should turn in pictures and graphs, not tables of numbers.

Doing your own work

In the Course Outline we said that you could work in groups, but each person had to turn in his or her own work. For the last assignment, some people turned in joint assignments, which is not what we intended. So to make it perfectly clear, you are not to turn in joint assignments, and in the future only 1/n-th credit will be given to each member of an n-person group turning in one assignment. In your assignment, you need to identify the other group members, explain briefly how you split the work. The purpose of this policy is to encourage you to learn from one another, but to insist that each person learn all the aspects of the assignments, including programming, documenting, analyzing and testing.