Floating Point Number Representation
How to Submit Your Lab Writeup
Review the general guidelines for submitting work, especially Laboratory Exercises. You must include an Academic Honesty Statement on the top page of your lab report, and this statement must be signed by all collaborators before submission. Please number your responses and put them in numerical order so that graders don't have to search through pages trying to figure out what sections answer each question.
Hand in only the paper copy and printouts of programs created or modified. You do not need to submit electronic copies to the grader email account. Make sure that the names of both lab partners are clearly listed and be sure to label the problem number for each question. Hand in answers to the following question numbers: 1, 4, 5, 7, and 8. Some questions will require printouts of code you have written or modified.
Lab submissions handed in after 2:45 on the due date will suffer a 10% late penalty!!
Goals
This lab provides experience viewing the representation of floatingpoint real numbers on PC/Linux machines, and explores an application for which numerical roundoff error has visible consequences.
Binary Representation of FloatingPoint Numbers:
The first part of this lab asks you to review the bitlevel storage of floating point numbers on PC/Linux computers.
 Write the real numbers ± 1.0, ± 2.0, ± 3.0, ± 6.0, and ± 9.0 using the IEEE Standard for 32bit Floating Point Numbers.

Copy the program
datarep.c
to your account and compile it. Then enter real numbers, and conduct experiments to determine: which bit is the sign bit,
 which bits are used for the mantissa,
 which bits are used for the exponent, and
 what bias or excess is used in the storage of exponents.
Write down the pattern that you observe.

When the decimal number 0.1 (one tenth) is converted to binary, the
resulting floatingpoint number is the repeating sequence
0.1001100110011 (just as the decimal
representation of one third is the repeating sequence
0.333).
Use
datarep.c
to determine the floatingpoint number that is actually stored for the decimal number 0.1 (one tenth). Write down how this differs from the actual binary number. 
Use your knowledge of the storage of real numbers to determine what real number comes "immediately after" 3.0 and 10.0 on this system. That is, look at the mantissa to determine what change would yield the smallest number above 3.0 and 10.0. Record that number.
Hint: When running the
datarep
program, toggle (i.e. switch/slip the value from 0 to 1 or from 1 to 0) an appropriate bit and look at what results.
Floatingpoint Numbers and Loops
Inaccuracies in representing floatingpoint numbers with a limited number of digits of accuracy have an impact on how programs are written and how they run. This section of the lab explores some of these consequences.

Copy
floatloop.c
to your account. The idea of this program is to work through a loop, starting at 0.0, incrementing by 0.1 each time through the loop, and continuing while the number is not 1.0.
Read through this program. Write out what should be printed
(including the expected value of
sum
to be printed each time). 
Run this program, and describe what happens.
Note: You can stop any running program by typing ctrl+c.  Review the first part of the output printed to determine why
the program ran the way it did.
Hint: After compiling the program, you might use the following line to run the program and look at the first several lines of output:
./floatloop  head n 20

Change the loop condition to
(val <= end)
. Again, explain what happens. What is the last value printed within the loop? What sum is actually computed?

Read through this program. Write out what should be printed
(including the expected value of

Change
floatloop.c
so that the variables are declared asdouble
rather thanfloat
, and repeat Steps 4ad. What happens this time?
 Why?

The program
floatloop.c
illustrates that loops may or may not repeat the number of times expected, when the variables within the loop condition are floatingpoint numbers. One common way to resolve this problem is to change the loop control variables to an integer. For example, forfloatloop.c
, we could use anint
variablei
to control the loop. Effectively,i
has the value of 10 times the value we intend forval
. The main loop might beint i; for (i = 0; i <=10; i++) { val = i / 10.0; ... )
Here, the
int i
is always computed exactly, so the loop always runs exactly the desired number of times, and the value ofval
is recomputed from the exact numberi
each time so inaccuracies in the storage of 0.1 do not compound.Rewrite
floatloop.c
to replace thewhile (val < end)
loop with afor
construction using an integer as the loop control variable. Then run the program to confirm it produces the desired output. Include your modifed program with your lab write up.
More Floatingpoint Numbers and Loops
Your experience with floatloop.c
illustrates that some
issues arising with float
numbers may be resolved
with double
numbers. The extra digits of accuracy
sometimes can make a substantial difference. This section explores
this observation further.

Program
doubleloop.c
is similar tofloatloop.c
, except that its variables aredouble
, the range of numbers for the loop is 1000 to 1001, and the condition isval <= end
as in Step 4c. Copy this program to your account.
Read through this program. Write what should be printed
(including the expected value of
sum
to be printed each time).  Run this program and describe what happens.

What, if anything, happens if the variables are changed to
float
type?

Read through this program. Write what should be printed
(including the expected value of
Computing Area Under y = x^{2}:
[The following is an edited version of Section 5.5 from Introduction to Computing and Computer Science with Pascal by Henry M. Walker, Little, Brown, and Company, 1986 and is used with permission of the copyright holder.]
Suppose we are given a function y = f(x), and we want to find the
area under the graph between x = a and x = b.
(The following figure illustrates the area under the curve between x =
1 and x = 3 when f(x) = x^{2}.)
Using calculus, the exact size of this area is 8 2/3 or 8.666.
Discussion
In what follows, we will not try to compute the desired area exactly. Rather, we will consider a fairly simple approach, called the trapezoidal rule, which can give good approximations to the area. In this approach, we break down a large area into small pieces and approximate each of the small pieces by a trapezoid (as shown below).
From geometry, we we can compute the area of a trapezoid:
Then we can approximate the entire area under the curve by adding up the areas of the trapezoids.
More precisely, we first divide the interval [a, b] into n equal pieces a=x_{0}, x_{1}, x_{2}, . . ., x_{n}=b. Then we use the pieces to divide the overall areas into trapezoids. After we compute the area of each trapezoids, we add up these small areas. The final formula is
Approximate Area = h[f(x_{0})/2 + f(x_{1}) + f(x_{2}) + . . . + f(x_{n1}) + f(x_{n})/2)]
where h = (b  a) / n and x_{j} = a + jh for j = 0, 1, 2, . ., n. This is the formula trapezoidal rule. (The interested reader should consult books in calculus or numerical methods for the details of this and other methods.)
To make this formula more concrete, we apply it to f(x) = x^{2} between x = 1 and x = 3 (as shown in an earlier figure), and we divide the interval ]1, 3] into five pieces. This gives: n = 5; a = 1; b = 3. The overall interval [1, 3] has length 2; we divide it into five subintervals of length h = 2/5 = 0.4. The x values are x_{0} = 1, x_{1} = 1.4, x_{2} = 1.8, x_{3} = 2.2, x_{4} = 2.6, x_{5} = 3. The trapezoidal rule gives:
Approx. Area  = h[f(x_{0})/2 + f(x_{1}) + f(x_{2})+ f(x_{3})+ f(x_{4})+ f(x_{5})/2)] 
= 0.4[f(1)/2 + f(1.4) + f(1.8) + f(2.2) + f(2.6) + f(3)/2]  
= 0.4]1^{2}/2 + (1.4)^{2} + (1.8)^{2} + (2.2)^{2} + (2.6)^{2} + 3^{2}/2]  
= 8.72 
Theoretical Accuracy of the Trapezoidal Rule
While it is hard to predict the accuracy of approximations with the trapezoidal rule, we can make several useful observations.
 The trapezoidal rule relies upon the actual area under the graph being close to the area under the trapezoid.
 If the graph of the function is a straight line, then the trapezoids should give exact results. Otherwise the trapezoidal rule cannot be expected to be exactly correct.
 If we divide the interval [a, b] into a large number of pieces, we can expect each trapezoid to be close to the actual area under the graph.
 As n gets bigger, the approximation of area using the trapezoidal rule should get better.
Practical Implications of Floating Point Error
Since floatingpoint numbers are not stored exactly, work with any individual floating point number may involve a small amount of error. If these numbers are combined in many arithmetic operations, such small numerical errors sometimes can come together to significantly affect results.
Programming
This part of the lab asks you to run and expand
program traprule.c
that computes area using the
trapezoidal rule. You then will experiment with this program to investigate
the effect of numerical errors.

Copy
traprule.c
to your account, and then compile and run it.
Review the program and describe how it works. For example, how the
table is produced? Why does the function
area_l_to_r
use the variablei
? Why does the computation forxvalue
give appropriate values for x values in the trapezoidal rule?  As noted above, the correct value of this area is 8 2/3 or 8.666 as determined with calculus. Write down how the computed approximations compare to this exact value as the number of intervals increases.

Review the program and describe how it works. For example, how the
table is produced? Why does the function
The function area_l_to_r
adds terms in the Trapezoidal Rule
from first to last. For the function given, the terms get steadily
larger as the function is increasing from left to right. A natural
question arises regarding what might happen if the terms were added in
the opposite order.

Modify the program to include another
function
area_r_to_l
, which adds the terms in the Trapezoidal Rule from last to first (i.e., from the nth term toward the initial term). Then, in the main loop, add another column to the table, for "Computing from R to L". Run the revised program, showing the results of both lefttoright and righttoleft computations.
 Compare the results of the lefttoright and righttoleft computations. What patterns do you observe? What, if any, differences do you identify? Briefly explain what you see.
 If this lab is to be turned in, include your program for this step as well as your explanations and other work.