's is found to be true, the value of the cond is undefined. The word predicate is used for procedures that return true or false, as well as for expressions that evaluate to true or false. The absolute-value procedure abs makes use of the primitive predicates >, <, and =.18 These take two numbers as arguments and test whether the first number is, respectively, greater than, less than, or equal to the second number, returning true or false accordingly. Another way to write the absolute-value procedure is (define (abs x) (cond ((< x 0) (- x)) (else x))) which could be expressed in English as “If x is less than zero return - x; otherwise return x.” Else is a special symbol that can be used in place of the

in the final clause of a cond. This causes the cond to return as its value the value
of the corresponding

here.
Here is yet another way to write the absolute-value procedure:
(define (abs x)
(if (< x 0)
(- x)
x))
This uses the special form if, a restricted type of conditional that can be used
when there are precisely two cases in the case analysis. The general form of an
if expression is
(if

is found to be true, the expressions *painter segment-list)
(lambda (frame)
(for-each
(lambda (segment)
(draw-line
((frame-coord-map frame) (start-segment segment))
((frame-coord-map frame) (end-segment segment))))
segment-list)))
The segments are given using coordinates with respect to the unit square. For
each segment in the list, the painter transforms the segment endpoints with the
frame coordinate map and draws a line between the transformed points.
Representing painters as procedures erects a powerful abstraction barrier in the
picture language. We can create and intermix all sorts of primitive painters,
based on a variety of graphics capabilities. The details of their implementation
do not matter. Any procedure can serve as a painter, provided that it takes a
frame as argument and draws something scaled to fit the frame.28
Exercise 2.48. A directed line segment in the plane can be represented as a pair
of vectors – the vector running from the origin to the start-point of the segment,
and the vector running from the origin to the end-point of the segment. Use your
vector representation from exercise 2.46 to define a representation for segments
with a constructor make-segment and selectors start-segment and endsegment.
Exercise 2.49. Use segments->painter to define the following primitive
painters:
a. The painter that draws the outline of the designated frame.
b. The painter that draws an “X” by connecting opposite corners of the frame.
c. The painter that draws a diamond shape by connecting the midpoints of the
sides of the frame.
d. The wave painter.
Transforming and combining painters
An operation on painters (such as flip-vert or beside) works by creating a
painter that invokes the original painters with respect to frames derived from the
argument frame. Thus, for example, flip-vert doesn't have to know how a
painter works in order to flip it – it just has to know how to turn a frame upside
down: The flipped painter just uses the original painter, but in the inverted
frame.
Painter operations are based on the procedure transform-painter, which takes
as arguments a painter and information on how to transform a frame and
produces a new painter. The transformed painter, when called on a frame,
transforms the frame and calls the original painter on the transformed frame. The
arguments to transform-painter are points (represented as vectors) that specify
the corners of the new frame: When mapped into the frame, the first point
specifies the new frame's origin and the other two specify the ends of its edge
vectors. Thus, arguments within the unit square specify a frame contained within
the original frame.
(define (transform-painter painter origin corner1 corner2)
(lambda (frame)
(let ((m (frame-coord-map frame)))
(let ((new-origin (m origin)))
(painter
(make-frame new-origin
(sub-vect (m corner1) new-origin)
(sub-vect (m corner2) new-origin)))))))
Here's how to flip painter images vertically:
(define (flip-vert painter)
(transform-painter painter
(make-vect 0.0 1.0)
; new origin
(make-vect 1.0 1.0)
; new end of edge1
(make-vect 0.0 0.0))) ; new end of edge2
Using transform-painter, we can easily define new transformations. For
example, we can define a painter that shrinks its image to the upper-right quarter
of the frame it is given:
(define (shrink-to-upper-right painter)
(transform-painter painter
(make-vect 0.5 0.5)
(make-vect 1.0 0.5)
(make-vect 0.5 1.0)))
Other transformations rotate images counterclockwise by 90 degrees29
(define (rotate90 painter)
(transform-painter painter
(make-vect 1.0 0.0)
(make-vect 1.0 1.0)
(make-vect 0.0 0.0)))
or squash images towards the center of the frame:30
(define (squash-inwards painter)
(transform-painter painter
(make-vect 0.0 0.0)
(make-vect 0.65 0.35)
(make-vect 0.35 0.65)))
Frame transformation is also the key to defining means of combining two or
more painters. The beside procedure, for example, takes two painters,
transforms them to paint in the left and right halves of an argument frame
respectively, and produces a new, compound painter. When the compound
painter is given a frame, it calls the first transformed painter to paint in the left
half of the frame and calls the second transformed painter to paint in the right
half of the frame:
(define (beside painter1 painter2)
(let ((split-point (make-vect 0.5 0.0)))
(let ((paint-left
(transform-painter painter1
(make-vect 0.0
split-point
(make-vect 0.0
(paint-right
(transform-painter painter2
split-point
(make-vect 1.0
(make-vect 0.5
(lambda (frame)
(paint-left frame)
(paint-right frame)))))
0.0)
1.0)))
0.0)
1.0))))
Observe how the painter data abstraction, and in particular the representation of
painters as procedures, makes beside easy to implement. The beside procedure
need not know anything about the details of the component painters other than
that each painter will draw something in its designated frame.
Exercise 2.50. Define the transformation flip-horiz, which flips painters
horizontally, and transformations that rotate painters counterclockwise by 180
degrees and 270 degrees.
Exercise 2.51. Define the below operation for painters. Below takes two painters
as arguments. The resulting painter, given a frame, draws with the first painter in
the bottom of the frame and with the second painter in the top. Define below in
two different ways – first by writing a procedure that is analogous to the beside
procedure given above, and again in terms of beside and suitable rotation
operations (from exercise 2.50).
Levels of language for robust design
The picture language exercises some of the critical ideas we've introduced about
abstraction with procedures and data. The fundamental data abstractions,
painters, are implemented using procedural representations, which enables the
language to handle different basic drawing capabilities in a uniform way. The
means of combination satisfy the closure property, which permits us to easily
build up complex designs. Finally, all the tools for abstracting procedures are
build up complex designs. Finally, all the tools for abstracting procedures are
available to us for abstracting means of combination for painters.
We have also obtained a glimpse of another crucial idea about languages and
program design. This is the approach of stratified design, the notion that a
complex system should be structured as a sequence of levels that are described
using a sequence of languages. Each level is constructed by combining parts that
are regarded as primitive at that level, and the parts constructed at each level are
used as primitives at the next level. The language used at each level of a
stratified design has primitives, means of combination, and means of abstraction
appropriate to that level of detail.
Stratified design pervades the engineering of complex systems. For example, in
computer engineering, resistors and transistors are combined (and described
using a language of analog circuits) to produce parts such as and-gates and orgates, which form the primitives of a language for digital-circuit design.31 These
parts are combined to build processors, bus structures, and memory systems,
which are in turn combined to form computers, using languages appropriate to
computer architecture. Computers are combined to form distributed systems,
using languages appropriate for describing network interconnections, and so on.
As a tiny example of stratification, our picture language uses primitive elements
(primitive painters) that are created using a language that specifies points and
lines to provide the lists of line segments for segments->painter, or the shading
details for a painter like rogers. The bulk of our description of the picture
language focused on combining these primitives, using geometric combiners
such as beside and below. We also worked at a higher level, regarding beside
and below as primitives to be manipulated in a language whose operations, such
as square-of-four, capture common patterns of combining geometric
combiners.
Stratified design helps make programs robust, that is, it makes it likely that small
changes in a specification will require correspondingly small changes in the
program. For instance, suppose we wanted to change the image based on wave
shown in figure 2.9. We could work at the lowest level to change the detailed
appearance of the wave element; we could work at the middle level to change the
way corner-split replicates the wave; we could work at the highest level to
change how square-limit arranges the four copies of the corner. In general,
each level of a stratified design provides a different vocabulary for expressing
the characteristics of the system, and a different kind of ability to change it.
Exercise 2.52. Make changes to the square limit of wave shown in figure 2.9 by
working at each of the levels described above. In particular:
a. Add some segments to the primitive wave painter of exercise 2.49 (to add a
smile, for example).
b. Change the pattern constructed by corner-split (for example, by using only
one copy of the up-split and right-split images instead of two).
c. Modify the version of square-limit that uses square-of-four so as to
assemble the corners in a different pattern. (For example, you might make the
big Mr. Rogers look outward from each corner of the square.)
6
The use of the word “closure” here comes from abstract algebra, where a set of elements is said to be
closed under an operation if applying the operation to elements in the set produces an element that is again
an element of the set. The Lisp community also (unfortunately) uses the word “closure” to describe a totally
unrelated concept: A closure is an implementation technique for representing procedures with free
variables. We do not use the word “closure” in this second sense in this book.
7
The notion that a means of combination should satisfy closure is a straightforward idea. Unfortunately, the
data combiners provided in many popular programming languages do not satisfy closure, or make closure
cumbersome to exploit. In Fortran or Basic, one typically combines data elements by assembling them into
arrays – but one cannot form arrays whose elements are themselves arrays. Pascal and C admit structures
whose elements are structures. However, this requires that the programmer manipulate pointers explicitly,
and adhere to the restriction that each field of a structure can contain only elements of a prespecified form.
Unlike Lisp with its pairs, these languages have no built-in general-purpose glue that makes it easy to
manipulate compound data in a uniform way. This limitation lies behind Alan Perlis's comment in his
foreword to this book: “In Pascal the plethora of declarable data structures induces a specialization within
functions that inhibits and penalizes casual cooperation. It is better to have 100 functions operate on one
data structure than to have 10 functions operate on 10 data structures.”
8
In this book, we use list to mean a chain of pairs terminated by the end-of-list marker. In contrast, the term
list structure refers to any data structure made out of pairs, not just to lists.
9
Since nested applications of car and cdr are cumbersome to write, Lisp dialects provide abbreviations for
them – for instance,
The names of all such procedures start with c and end with r. Each a between them stands for a car
operation and each d for a cdr operation, to be applied in the same order in which they appear in the name.
The names car and cdr persist because simple combinations like cadr are pronounceable.
10
It's remarkable how much energy in the standardization of Lisp dialects has been dissipated in arguments
that are literally over nothing: Should nil be an ordinary name? Should the value of nil be a symbol?
Should it be a list? Should it be a pair? In Scheme, nil is an ordinary name, which we use in this section as
a variable whose value is the end-of-list marker (just as true is an ordinary variable that has a true value).
Other dialects of Lisp, including Common Lisp, treat nil as a special symbol. The authors of this book,
who have endured too many language standardization brawls, would like to avoid the entire issue. Once we
have introduced quotation in section 2.3, we will denote the empty list as '() and dispense with the variable
nil entirely.
11
To define f and g using lambda we would write
(define f (lambda (x y . z) ))
(define g (lambda w ))
12
Scheme standardly provides a map procedure that is more general than the one described here. This more
general map takes a procedure of n arguments, together with n lists, and applies the procedure to all the first
elements of the lists, all the second elements of the lists, and so on, returning a list of the results. For
example:
(map + (list 1 2 3) (list 40 50 60) (list 700 800 900))
(741 852 963)
(map (lambda (x y) (+ x (* 2 y)))
(list 1 2 3)
(list 4 5 6))
(9 12 15)
13
The order of the first two clauses in the cond matters, since the empty list satisfies null? and also is not a
pair.
14
This is, in fact, precisely the fringe procedure from exercise 2.28. Here we've renamed it to emphasize
that it is part of a family of general sequence-manipulation procedures.
15
Richard Waters (1979) developed a program that automatically analyzes traditional Fortran programs,
viewing them in terms of maps, filters, and accumulations. He found that fully 90 percent of the code in the
Fortran Scientific Subroutine Package fits neatly into this paradigm. One of the reasons for the success of
Lisp as a programming language is that lists provide a standard medium for expressing ordered collections
so that they can be manipulated using higher-order operations. The programming language APL owes much
of its power and appeal to a similar choice. In APL all data are represented as arrays, and there is a
universal and convenient set of generic operators for all sorts of array operations.
16
According to Knuth (1981), this rule was formulated by W. G. Horner early in the nineteenth century, but
the method was actually used by Newton over a hundred years earlier. Horner's rule evaluates the
polynomial using fewer additions and multiplications than does the straightforward method of first
computing anxn, then adding an-1xn-1, and so on. In fact, it is possible to prove that any algorithm for
evaluating arbitrary polynomials must use at least as many additions and multiplications as does Horner's
rule, and thus Horner's rule is an optimal algorithm for polynomial evaluation. This was proved (for the
number of additions) by A. M. Ostrowski in a 1954 paper that essentially founded the modern study of
optimal algorithms. The analogous statement for multiplications was proved by V. Y. Pan in 1966. The
book by Borodin and Munro (1975) provides an overview of these and other results about optimal
algorithms.
17
This definition uses the extended version of map described in footnote 12.
18
This approach to nested mappings was shown to us by David Turner, whose languages KRC and Miranda
provide elegant formalisms for dealing with these constructs. The examples in this section (see also
exercise 2.42) are adapted from Turner 1981. In section 3.5.3, we'll see how this approach generalizes to
infinite sequences.
19
We're representing a pair here as a list of two elements rather than as a Lisp pair. Thus, the “pair” (i,j) is
represented as (list i j), not (cons i j).
20
The set S - x is the set of all elements of S, excluding x.
21
Semicolons in Scheme code are used to introduce comments. Everything from the semicolon to the end of
the line is ignored by the interpreter. In this book we don't use many comments; we try to make our
programs self-documenting by using descriptive names.
22
The picture language is based on the language Peter Henderson created to construct images like M.C.
Escher's “Square Limit” woodcut (see Henderson 1982). The woodcut incorporates a repeated scaled
pattern, similar to the arrangements drawn using the square-limit procedure in this section.
23
William Barton Rogers (1804-1882) was the founder and first president of MIT. A geologist and talented
teacher, he taught at William and Mary College and at the University of Virginia. In 1859 he moved to
Boston, where he had more time for research, worked on a plan for establishing a “polytechnic institute,”
and served as Massachusetts's first State Inspector of Gas Meters.
When MIT was established in 1861, Rogers was elected its first president. Rogers espoused an ideal of
“useful learning” that was different from the university education of the time, with its overemphasis on the
classics, which, as he wrote, “stand in the way of the broader, higher and more practical instruction and
discipline of the natural and social sciences.” This education was likewise to be different from narrow tradeschool education. In Rogers's words:
The world-enforced distinction between the practical and the scientific worker is utterly futile, and the
whole experience of modern times has demonstrated its utter worthlessness.
Rogers served as president of MIT until 1870, when he resigned due to ill health. In 1878 the second
president of MIT, John Runkle, resigned under the pressure of a financial crisis brought on by the Panic of
1873 and strain of fighting off attempts by Harvard to take over MIT. Rogers returned to hold the office of
president until 1881.
Rogers collapsed and died while addressing MIT's graduating class at the commencement exercises of
1882. Runkle quoted Rogers's last words in a memorial address delivered that same year:
“As I stand here today and see what the Institute is, ... I call to mind the beginnings of science. I
remember one hundred and fifty years ago Stephen Hales published a pamphlet on the subject of
illuminating gas, in which he stated that his researches had demonstrated that 128 grains of bituminous
coal – ”
“Bituminous coal,” these were his last words on earth. Here he bent forward, as if consulting some
notes on the table before him, then slowly regaining an erect position, threw up his hands, and was
translated from the scene of his earthly labors and triumphs to “the tomorrow of death,” where the
mysteries of life are solved, and the disembodied spirit finds unending satisfaction in contemplating
the new and still unfathomable mysteries of the infinite future.
In the words of Francis A. Walker (MIT's third president):
All his life he had borne himself most faithfully and heroically, and he died as so good a knight would
surely have wished, in harness, at his post, and in the very part and act of public duty.
24
Equivalently, we could write
(define flipped-pairs
(square-of-four identity flip-vert identity flip-vert))
25
Rotate180 rotates a painter by 180 degrees (see exercise 2.50). Instead of rotate180
(compose flip-vert flip-horiz), using the compose procedure from exercise 1.42.
we could say
26
Frame-coord-map uses the vector operations described in exercise 2.46 below, which we assume have
been implemented using some representation for vectors. Because of data abstraction, it doesn't matter what
this vector representation is, so long as the vector operations behave correctly.
27
Segments->painter uses the representation for line segments described in exercise 2.48 below. It also
uses the for-each procedure described in exercise 2.23.
28
For example, the rogers painter of figure 2.11 was constructed from a gray-level image. For each point in
a given frame, the rogers painter determines the point in the image that is mapped to it under the frame
coordinate map, and shades it accordingly. By allowing different types of painters, we are capitalizing on
the abstract data idea discussed in section 2.1.3, where we argued that a rational-number representation
could be anything at all that satisfies an appropriate condition. Here we're using the fact that a painter can
be implemented in any way at all, so long as it draws something in the designated frame. Section 2.1.3 also
showed how pairs could be implemented as procedures. Painters are our second example of a procedural
representation for data.
29
Rotate90 is a pure rotation only for square frames, because it also stretches and shrinks the image to fit
into the rotated frame.
30
The diamond-shaped images in figures 2.10 and 2.11 were created with squash-inwards applied to wave
and rogers.
31
Section 3.3.4 describes one such language.
2.3 Symbolic Data
All the compound data objects we have used so far were constructed ultimately
from numbers. In this section we extend the representational capability of our
language by introducing the ability to work with arbitrary symbols as data.
2.3.1 Quotation
If we can form compound data using symbols, we can have lists such as
(a b c d)
(23 45 17)
((Norah 12) (Molly 9) (Anna 7) (Lauren 6) (Charlotte 4))
Lists containing symbols can look just like the expressions of our language:
(* (+ 23 45) (+ x 9))
(define (fact n) (if (= n 1) 1 (* n (fact (- n 1)))))
In order to manipulate symbols we need a new element in our language: the
ability to quote a data object. Suppose we want to construct the list (a b). We
can't accomplish this with (list a b), because this expression constructs a list
of the values of a and b rather than the symbols themselves. This issue is well
known in the context of natural languages, where words and sentences may be
regarded either as semantic entities or as character strings (syntactic entities).
The common practice in natural languages is to use quotation marks to indicate
that a word or a sentence is to be treated literally as a string of characters. For
instance, the first letter of “John” is clearly “J.” If we tell somebody “say your
name aloud,” we expect to hear that person's name. However, if we tell
somebody “say `your name' aloud,” we expect to hear the words “your name.”
Note that we are forced to nest quotation marks to describe what somebody else
might say.32
We can follow this same practice to identify lists and symbols that are to be
treated as data objects rather than as expressions to be evaluated. However, our
format for quoting differs from that of natural languages in that we place a
quotation mark (traditionally, the single quote symbol ') only at the beginning of
the object to be quoted. We can get away with this in Scheme syntax because we
rely on blanks and parentheses to delimit objects. Thus, the meaning of the
single quote character is to quote the next object.33
Now we can distinguish between symbols and their values:
(define a 1)
(define b 2)
(list a b)
(1 2)
(list 'a 'b)
(a b)
(list 'a b)
(a 2)
Quotation also allows us to type in compound objects, using the conventional
printed representation for lists:34
(car '(a b c))
a
(cdr '(a b c))
(b c)
In keeping with this, we can obtain the empty list by evaluating '(), and thus
dispense with the variable nil.
One additional primitive used in manipulating symbols is eq?, which takes two
symbols as arguments and tests whether they are the same.35 Using eq?, we can
implement a useful procedure called memq. This takes two arguments, a symbol
and a list. If the symbol is not contained in the list (i.e., is not eq? to any item in
the list), then memq returns false. Otherwise, it returns the sublist of the list
beginning with the first occurrence of the symbol:
(define (memq item x)
(cond ((null? x) false)
((eq? item (car x)) x)
(else (memq item (cdr x)))))
For example, the value of
(memq 'apple '(pear banana prune))
is false, whereas the value of
(memq 'apple '(x (apple sauce) y apple pear))
is (apple
pear).
Exercise 2.53. What would the interpreter print in response to evaluating each
of the following expressions?
(list 'a 'b 'c)
(list (list 'george))
(cdr '((x1 x2) (y1 y2)))
(cadr '((x1 x2) (y1 y2)))
(pair? (car '(a short list)))
(memq 'red '((red shoes) (blue socks)))
(memq 'red '(red shoes blue socks))
Exercise 2.54. Two lists are said to be equal? if they contain equal elements
arranged in the same order. For example,
(equal? '(this is a list) '(this is a list))
is true, but
(equal? '(this is a list) '(this (is a) list))
is false. To be more precise, we can define equal? recursively in terms of the
basic eq? equality of symbols by saying that a and b are equal? if they are both
symbols and the symbols are eq?, or if they are both lists such that (car a) is
equal? to (car b) and (cdr a) is equal? to (cdr b). Using this idea,
implement equal? as a procedure.36
Exercise 2.55. Eva Lu Ator types to the interpreter the expression
(car ”abracadabra)
To her surprise, the interpreter prints back quote. Explain.
2.3.2 Example: Symbolic Differentiation
As an illustration of symbol manipulation and a further illustration of data
abstraction, consider the design of a procedure that performs symbolic
differentiation of algebraic expressions. We would like the procedure to take as
arguments an algebraic expression and a variable and to return the derivative of
the expression with respect to the variable. For example, if the arguments to the
procedure are ax2 + bx + c and x, the procedure should return 2ax + b. Symbolic
differentiation is of special historical significance in Lisp. It was one of the
motivating examples behind the development of a computer language for symbol
manipulation. Furthermore, it marked the beginning of the line of research that
led to the development of powerful systems for symbolic mathematical work,
which are currently being used by a growing number of applied mathematicians
and physicists.
In developing the symbolic-differentiation program, we will follow the same
strategy of data abstraction that we followed in developing the rational-number
system of section 2.1.1. That is, we will first define a differentiation algorithm
that operates on abstract objects such as “sums,” “products,” and “variables”
without worrying about how these are to be represented. Only afterward will we
address the representation problem.
The differentiation program with abstract data
In order to keep things simple, we will consider a very simple symbolicdifferentiation program that handles expressions that are built up using only the
operations of addition and multiplication with two arguments. Differentiation of
any such expression can be carried out by applying the following reduction
rules:
Observe that the latter two rules are recursive in nature. That is, to obtain the
Observe that the latter two rules are recursive in nature. That is, to obtain the
derivative of a sum we first find the derivatives of the terms and add them. Each
of the terms may in turn be an expression that needs to be decomposed.
Decomposing into smaller and smaller pieces will eventually produce pieces that
are either constants or variables, whose derivatives will be either 0 or 1.
To embody these rules in a procedure we indulge in a little wishful thinking, as
we did in designing the rational-number implementation. If we had a means for
representing algebraic expressions, we should be able to tell whether an
expression is a sum, a product, a constant, or a variable. We should be able to
extract the parts of an expression. For a sum, for example we want to be able to
extract the addend (first term) and the augend (second term). We should also be
able to construct expressions from parts. Let us assume that we already have
procedures to implement the following selectors, constructors, and predicates:
(variable? e)
(same-variable? v1
Is e a variable?
v2) Are v1 and v2 the same variable?
Is e a sum?
(sum? e)
(addend e)
(augend e)
(make-sum a1 a2)
Addend of the sum e.
Augend of the sum e.
Construct the sum of a1 and a2.
Is e a product?
(product? e)
Multiplier of the product e.
(multiplicand e)
Multiplicand of the product e.
(make-product m1 m2) Construct the product of m1 and m2.
Using these, and the primitive predicate number?, which identifies numbers, we
can express the differentiation rules as the following procedure:
(multiplier e)
(define (deriv exp var)
(cond ((number? exp) 0)
((variable? exp)
(if (same-variable? exp var) 1 0))
((sum? exp)
(make-sum (deriv (addend exp) var)
(deriv (augend exp) var)))
((product? exp)
(make-sum
(make-product (multiplier exp)
(deriv (multiplicand exp) var))
(make-product (deriv (multiplier exp) var)
(multiplicand exp))))
(else
(error "unknown expression type -- DERIV" exp))))
This deriv procedure incorporates the complete differentiation algorithm. Since
it is expressed in terms of abstract data, it will work no matter how we choose to
represent algebraic expressions, as long as we design a proper set of selectors
and constructors. This is the issue we must address next.
Representing algebraic expressions
We can imagine many ways to use list structure to represent algebraic
expressions. For example, we could use lists of symbols that mirror the usual
algebraic notation, representing ax + b as the list (a * x + b). However, one
especially straightforward choice is to use the same parenthesized prefix notation
that Lisp uses for combinations; that is, to represent ax + b as (+ (* a x) b).
Then our data representation for the differentiation problem is as follows:
The variables are symbols. They are identified by the primitive predicate
symbol?:
(define (variable? x) (symbol? x))
Two variables are the same if the symbols representing them are eq?:
(define (same-variable? v1 v2)
(and (variable? v1) (variable? v2) (eq? v1 v2)))
Sums and products are constructed as lists:
(define (make-sum a1 a2) (list '+ a1 a2))
(define (make-product m1 m2) (list '* m1 m2))
A sum is a list whose first element is the symbol +:
(define (sum? x)
(and (pair? x) (eq? (car x) '+)))
The addend is the second item of the sum list:
(define (addend s) (cadr s))
The augend is the third item of the sum list:
(define (augend s) (caddr s))
A product is a list whose first element is the symbol *:
(define (product? x)
(and (pair? x) (eq? (car x) '*)))
The multiplier is the second item of the product list:
(define (multiplier p) (cadr p))
The multiplicand is the third item of the product list:
(define (multiplicand p) (caddr p))
Thus, we need only combine these with the algorithm as embodied by deriv in
order to have a working symbolic-differentiation program. Let us look at some
examples of its behavior:
(deriv '(+ x 3) 'x)
(+ 1 0)
(deriv '(* x y) 'x)
(+ (* x 0) (* 1 y))
(deriv '(* (* x y) (+ x 3)) 'x)
(+ (* (* x y) (+ 1 0))
(* (+ (* x 0) (* 1 y))
(+ x 3)))
The program produces answers that are correct; however, they are unsimplified.
It is true that
but we would like the program to know that x · 0 = 0, 1 · y = y, and 0 + y = y.
The answer for the second example should have been simply y. As the third
example shows, this becomes a serious issue when the expressions are complex.
Our difficulty is much like the one we encountered with the rational-number
implementation: we haven't reduced answers to simplest form. To accomplish
the rational-number reduction, we needed to change only the constructors and
the selectors of the implementation. We can adopt a similar strategy here. We
won't change deriv at all. Instead, we will change make-sum so that if both
summands are numbers, make-sum will add them and return their sum. Also, if
one of the summands is 0, then make-sum will return the other summand.
(define (make-sum a1 a2)
(cond ((=number? a1 0) a2)
((=number? a2 0) a1)
((and (number? a1) (number? a2)) (+ a1 a2))
(else (list '+ a1 a2))))
This uses the procedure =number?, which checks whether an expression is equal
to a given number:
(define (=number? exp num)
(and (number? exp) (= exp num)))
Similarly, we will change make-product to build in the rules that 0 times
anything is 0 and 1 times anything is the thing itself:
(define (make-product m1 m2)
(cond ((or (=number? m1 0) (=number? m2 0)) 0)
((=number? m1 1) m2)
((=number? m2 1) m1)
((and (number? m1) (number? m2)) (* m1 m2))
(else (list '* m1 m2))))
Here is how this version works on our three examples:
(deriv '(+
1
(deriv '(*
y
(deriv '(*
(+ (* x y)
x 3) 'x)
x y) 'x)
(* x y) (+ x 3)) 'x)
(* y (+ x 3)))
Although this is quite an improvement, the third example shows that there is still
a long way to go before we get a program that puts expressions into a form that
we might agree is “simplest.” The problem of algebraic simplification is
complex because, among other reasons, a form that may be simplest for one
purpose may not be for another.
purpose may not be for another.
Exercise 2.56. Show how to extend the basic differentiator to handle more
kinds of expressions. For instance, implement the differentiation rule
by adding a new clause to the deriv program and defining appropriate
procedures exponentiation?, base, exponent, and make-exponentiation. (You
may use the symbol ** to denote exponentiation.) Build in the rules that
anything raised to the power 0 is 1 and anything raised to the power 1 is the
thing itself.
Exercise 2.57. Extend the differentiation program to handle sums and products
of arbitrary numbers of (two or more) terms. Then the last example above could
be expressed as
(deriv '(* x y (+ x 3)) 'x)
Try to do this by changing only the representation for sums and products,
without changing the deriv procedure at all. For example, the addend of a sum
would be the first term, and the augend would be the sum of the rest of the terms.
Exercise 2.58. Suppose we want to modify the differentiation program so that it
works with ordinary mathematical notation, in which + and * are infix rather
than prefix operators. Since the differentiation program is defined in terms of
abstract data, we can modify it to work with different representations of
expressions solely by changing the predicates, selectors, and constructors that
define the representation of the algebraic expressions on which the differentiator
is to operate.
a. Show how to do this in order to differentiate algebraic expressions presented
in infix form, such as (x + (3 * (x + (y + 2)))). To simplify the task,
assume that + and * always take two arguments and that expressions are fully
parenthesized.
b. The problem becomes substantially harder if we allow standard algebraic
notation, such as (x + 3 * (x + y + 2)), which drops unnecessary parentheses
and assumes that multiplication is done before addition. Can you design
appropriate predicates, selectors, and constructors for this notation such that our
derivative program still works?
2.3.3 Example: Representing Sets
In the previous examples we built representations for two kinds of compound
data objects: rational numbers and algebraic expressions. In one of these
examples we had the choice of simplifying (reducing) the expressions at either
construction time or selection time, but other than that the choice of a
representation for these structures in terms of lists was straightforward. When we
turn to the representation of sets, the choice of a representation is not so obvious.
Indeed, there are a number of possible representations, and they differ
significantly from one another in several ways.
Informally, a set is simply a collection of distinct objects. To give a more precise
definition we can employ the method of data abstraction. That is, we define “set”
by specifying the operations that are to be used on sets. These are union-set,
intersection-set, element-of-set?, and adjoin-set. Element-of-set? is a
predicate that determines whether a given element is a member of a set. Adjoinset takes an object and a set as arguments and returns a set that contains the
elements of the original set and also the adjoined element. Union-set computes
the union of two sets, which is the set containing each element that appears in
either argument. Intersection-set computes the intersection of two sets, which
is the set containing only elements that appear in both arguments. From the
viewpoint of data abstraction, we are free to design any representation that
implements these operations in a way consistent with the interpretations given
above.37
Sets as unordered lists
One way to represent a set is as a list of its elements in which no element
appears more than once. The empty set is represented by the empty list. In this
representation, element-of-set? is similar to the procedure memq of
section 2.3.1. It uses equal? instead of eq? so that the set elements need not be
symbols:
(define (element-of-set? x set)
(cond ((null? set) false)
((equal? x (car set)) true)
(else (element-of-set? x (cdr set)))))
Using this, we can write adjoin-set. If the object to be adjoined is already in
the set, we just return the set. Otherwise, we use cons to add the object to the list
that represents the set:
(define (adjoin-set x set)
(if (element-of-set? x set)
set
(cons x set)))
For intersection-set we can use a recursive strategy. If we know how to form
the intersection of set2 and the cdr of set1, we only need to decide whether to
include the car of set1 in this. But this depends on whether (car set1) is also
in set2. Here is the resulting procedure:
(define (intersection-set set1 set2)
(cond ((or (null? set1) (null? set2)) '())
((element-of-set? (car set1) set2)
(cons (car set1)
(intersection-set (cdr set1) set2)))
(else (intersection-set (cdr set1) set2))))
In designing a representation, one of the issues we should be concerned with is
efficiency. Consider the number of steps required by our set operations. Since
they all use element-of-set?, the speed of this operation has a major impact on
the efficiency of the set implementation as a whole. Now, in order to check
whether an object is a member of a set, element-of-set? may have to scan the
entire set. (In the worst case, the object turns out not to be in the set.) Hence, if
the set has n elements, element-of-set? might take up to n steps. Thus, the
number of steps required grows as θ(n). The number of steps required by
adjoin-set, which uses this operation, also grows as θ(n). For intersectionset, which does an element-of-set? check for each element of set1, the
number of steps required grows as the product of the sizes of the sets involved,
or θ(n2) for two sets of size n. The same will be true of union-set.
Exercise 2.59. Implement the union-set operation for the unordered-list
representation of sets.
Exercise 2.60. We specified that a set would be represented as a list with no
duplicates. Now suppose we allow duplicates. For instance, the set {1,2,3} could
be represented as the list (2 3 2 1 3 2 2). Design procedures element-ofset?, adjoin-set, union-set, and intersection-set that operate on this
representation. How does the efficiency of each compare with the corresponding
procedure for the non-duplicate representation? Are there applications for which
you would use this representation in preference to the non-duplicate one?
Sets as ordered lists
One way to speed up our set operations is to change the representation so that
the set elements are listed in increasing order. To do this, we need some way to
compare two objects so that we can say which is bigger. For example, we could
compare symbols lexicographically, or we could agree on some method for
assigning a unique number to an object and then compare the elements by
comparing the corresponding numbers. To keep our discussion simple, we will
consider only the case where the set elements are numbers, so that we can
compare elements using > and <. We will represent a set of numbers by listing its
elements in increasing order. Whereas our first representation above allowed us
to represent the set {1,3,6,10} by listing the elements in any order, our new
representation allows only the list (1 3 6 10).
One advantage of ordering shows up in element-of-set?: In checking for the
presence of an item, we no longer have to scan the entire set. If we reach a set
element that is larger than the item we are looking for, then we know that the
item is not in the set:
(define (element-of-set? x set)
(cond ((null? set) false)
((= x (car set)) true)
((< x (car set)) false)
(else (element-of-set? x (cdr set)))))
How many steps does this save? In the worst case, the item we are looking for
may be the largest one in the set, so the number of steps is the same as for the
unordered representation. On the other hand, if we search for items of many
different sizes we can expect that sometimes we will be able to stop searching at
a point near the beginning of the list and that other times we will still need to
examine most of the list. On the average we should expect to have to examine
about half of the items in the set. Thus, the average number of steps required will
be about n/2. This is still θ(n) growth, but it does save us, on the average, a
factor of 2 in number of steps over the previous implementation.
We obtain a more impressive speedup with intersection-set. In the unordered
representation this operation required θ(n2) steps, because we performed a
complete scan of set2 for each element of set1. But with the ordered
representation, we can use a more clever method. Begin by comparing the initial
elements, x1 and x2, of the two sets. If x1 equals x2, then that gives an element
of the intersection, and the rest of the intersection is the intersection of the cdrs
of the two sets. Suppose, however, that x1 is less than x2. Since x2 is the
smallest element in set2, we can immediately conclude that x1 cannot appear
anywhere in set2 and hence is not in the intersection. Hence, the intersection is
equal to the intersection of set2 with the cdr of set1. Similarly, if x2 is less than
x1, then the intersection is given by the intersection of set1 with the cdr of
set2. Here is the procedure:
(define (intersection-set set1 set2)
(if (or (null? set1) (null? set2))
'()
(let ((x1 (car set1)) (x2 (car set2)))
(cond ((= x1 x2)
(cons x1
(intersection-set (cdr set1)
(cdr set2))))
((< x1 x2)
(intersection-set (cdr set1) set2))
((< x2 x1)
(intersection-set set1 (cdr set2)))))))
To estimate the number of steps required by this process, observe that at each
step we reduce the intersection problem to computing intersections of smaller
sets – removing the first element from set1 or set2 or both. Thus, the number of
steps required is at most the sum of the sizes of set1 and set2, rather than the
product of the sizes as with the unordered representation. This is θ(n) growth
rather than θ(n2) – a considerable speedup, even for sets of moderate size.
Exercise 2.61. Give an implementation of adjoin-set using the ordered
representation. By analogy with element-of-set? show how to take advantage
of the ordering to produce a procedure that requires on the average about half as
many steps as with the unordered representation.
Exercise 2.62. Give a θ(n) implementation of union-set for sets represented as
ordered lists.
Sets as binary trees
We can do better than the ordered-list representation by arranging the set
elements in the form of a tree. Each node of the tree holds one element of the set,
called the “entry” at that node, and a link to each of two other (possibly empty)
nodes. The “left” link points to elements smaller than the one at the node, and
the “right” link to elements greater than the one at the node. Figure 2.16 shows
some trees that represent the set {1,3,5,7,9,11}. The same set may be represented
by a tree in a number of different ways. The only thing we require for a valid
representation is that all elements in the left subtree be smaller than the node
entry and that all elements in the right subtree be larger.
Figure 2.16: Various binary trees that represent the set { 1,3,5,7,9,11 }.
The advantage of the tree representation is this: Suppose we want to check
whether a number x is contained in a set. We begin by comparing x with the
entry in the top node. If x is less than this, we know that we need only search the
left subtree; if x is greater, we need only search the right subtree. Now, if the tree
is “balanced,” each of these subtrees will be about half the size of the original.
Thus, in one step we have reduced the problem of searching a tree of size n to
searching a tree of size n/2. Since the size of the tree is halved at each step, we
should expect that the number of steps needed to search a tree of size n grows as
θ(log n).38 For large sets, this will be a significant speedup over the previous
representations.
We can represent trees by using lists. Each node will be a list of three items: the
entry at the node, the left subtree, and the right subtree. A left or a right subtree
of the empty list will indicate that there is no subtree connected there. We can
describe this representation by the following procedures:39
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))
(define (make-tree entry left right)
(list entry left right))
Now we can write the element-of-set? procedure using the strategy described
above:
(define (element-of-set? x set)
(cond ((null? set) false)
((= x (entry set)) true)
((< x (entry set))
(element-of-set? x (left-branch set)))
((> x (entry set))
(element-of-set? x (right-branch set)))))
Adjoining an item to a set is implemented similarly and also requires θ(log n)
steps. To adjoin an item x, we compare x with the node entry to determine
whether x should be added to the right or to the left branch, and having adjoined
x to the appropriate branch we piece this newly constructed branch together with
the original entry and the other branch. If x is equal to the entry, we just return
the node. If we are asked to adjoin x to an empty tree, we generate a tree that has
x as the entry and empty right and left branches. Here is the procedure:
(define (adjoin-set x set)
(cond ((null? set) (make-tree x '() '()))
((= x (entry set)) set)
((< x (entry set))
(make-tree (entry set)
(adjoin-set x (left-branch set))
(right-branch set)))
((> x (entry set))
(make-tree (entry set)
(left-branch set)
(adjoin-set x (right-branch set))))))
The above claim that searching the tree can be performed in a logarithmic
number of steps rests on the assumption that the tree is “balanced,” i.e., that the
left and the right subtree of every tree have approximately the same number of
elements, so that each subtree contains about half the elements of its parent. But
how can we be certain that the trees we construct will be balanced? Even if we
start with a balanced tree, adding elements with adjoin-set may produce an
unbalanced result. Since the position of a newly adjoined element depends on
how the element compares with the items already in the set, we can expect that if
we add elements “randomly” the tree will tend to be balanced on the average.
But this is not a guarantee. For example, if we start with an empty set and adjoin
the numbers 1 through 7 in sequence we end up with the highly unbalanced tree
shown in figure 2.17. In this tree all the left subtrees are empty, so it has no
advantage over a simple ordered list. One way to solve this problem is to define
an operation that transforms an arbitrary tree into a balanced tree with the same
elements. Then we can perform this transformation after every few adjoin-set
operations to keep our set in balance. There are also other ways to solve this
problem, most of which involve designing new data structures for which
searching and insertion both can be done in θ(log n) steps.40
Figure 2.17: Unbalanced tree produced by adjoining 1 through 7 in sequence.
Exercise 2.63. Each of the following two procedures converts a binary tree to a
list.
(define (tree->list-1 tree)
(if (null? tree)
'()
(append (tree->list-1 (left-branch tree))
(cons (entry tree)
(tree->list-1 (right-branch tree))))))
(define (tree->list-2 tree)
(define (copy-to-list tree result-list)
(if (null? tree)
result-list
(copy-to-list (left-branch tree)
(cons (entry tree)
(copy-to-list (right-branch tree)
result-list)))))
(copy-to-list tree '()))
a. Do the two procedures produce the same result for every tree? If not, how do
the results differ? What lists do the two procedures produce for the trees in
figure 2.16?
b. Do the two procedures have the same order of growth in the number of steps
required to convert a balanced tree with n elements to a list? If not, which one
grows more slowly?
Exercise 2.64. The following procedure list->tree converts an ordered list to
a balanced binary tree. The helper procedure partial-tree takes as arguments
an integer n and list of at least n elements and constructs a balanced tree
containing the first n elements of the list. The result returned by partial-tree is
a pair (formed with cons) whose car is the constructed tree and whose cdr is the
list of elements not included in the tree.
(define (list->tree elements)
(car (partial-tree elements (length elements))))
(define (partial-tree elts n)
(if (= n 0)
(cons '() elts)
(let ((left-size (quotient (- n 1) 2)))
(let ((left-result (partial-tree elts left-size)))
(let ((left-tree (car left-result))
(non-left-elts (cdr left-result))
(right-size (- n (+ left-size 1))))
(let ((this-entry (car non-left-elts))
(right-result (partial-tree (cdr non-left-elts)
right-size)))
(let ((right-tree (car right-result))
(remaining-elts (cdr right-result)))
(cons (make-tree this-entry left-tree right-tree)
remaining-elts))))))))
a. Write a short paragraph explaining as clearly as you can how partial-tree
works. Draw the tree produced by list->tree for the list (1 3 5 7 9 11).
b. What is the order of growth in the number of steps required by list->tree to
convert a list of n elements?
Exercise 2.65. Use the results of exercises 2.63 and 2.64 to give θ(n)
implementations of union-set and intersection-set for sets implemented as
(balanced) binary trees.41
Sets and information retrieval
We have examined options for using lists to represent sets and have seen how
the choice of representation for a data object can have a large impact on the
performance of the programs that use the data. Another reason for concentrating
on sets is that the techniques discussed here appear again and again in
applications involving information retrieval.
Consider a data base containing a large number of individual records, such as the
personnel files for a company or the transactions in an accounting system. A
typical data-management system spends a large amount of time accessing or
modifying the data in the records and therefore requires an efficient method for
accessing records. This is done by identifying a part of each record to serve as an
identifying key. A key can be anything that uniquely identifies the record. For a
personnel file, it might be an employee's ID number. For an accounting system,
it might be a transaction number. Whatever the key is, when we define the
record as a data structure we should include a key selector procedure that
retrieves the key associated with a given record.
Now we represent the data base as a set of records. To locate the record with a
given key we use a procedure lookup, which takes as arguments a key and a data
base and which returns the record that has that key, or false if there is no such
record. Lookup is implemented in almost the same way as element-of-set?. For
example, if the set of records is implemented as an unordered list, we could use
(define (lookup given-key set-of-records)
(cond ((null? set-of-records) false)
((equal? given-key (key (car set-of-records)))
(car set-of-records))
(else (lookup given-key (cdr set-of-records)))))
Of course, there are better ways to represent large sets than as unordered lists.
Information-retrieval systems in which records have to be “randomly accessed”
are typically implemented by a tree-based method, such as the binary-tree
representation discussed previously. In designing such a system the methodology
of data abstraction can be a great help. The designer can create an initial
implementation using a simple, straightforward representation such as unordered
lists. This will be unsuitable for the eventual system, but it can be useful in
providing a “quick and dirty” data base with which to test the rest of the system.
Later on, the data representation can be modified to be more sophisticated. If the
data base is accessed in terms of abstract selectors and constructors, this change
data base is accessed in terms of abstract selectors and constructors, this change
in representation will not require any changes to the rest of the system.
Exercise 2.66. Implement the lookup procedure for the case where the set of
records is structured as a binary tree, ordered by the numerical values of the
keys.
2.3.4 Example: Huffman Encoding Trees
This section provides practice in the use of list structure and data abstraction to
manipulate sets and trees. The application is to methods for representing data as
sequences of ones and zeros (bits). For example, the ASCII standard code used
to represent text in computers encodes each character as a sequence of seven
bits. Using seven bits allows us to distinguish 27, or 128, possible different
characters. In general, if we want to distinguish n different symbols, we will
need to use log2n bits per symbol. If all our messages are made up of the eight
symbols A, B, C, D, E, F, G, and H, we can choose a code with three bits per
character, for example
A 000 C 010 E 100 G 110
B 001 D 011 F 101 H 111
With this code, the message
BACADAEAFABBAAAGAH
is encoded as the string of 54 bits
001000010000011000100000101000001001000000000110000111
Codes such as ASCII and the A-through-H code above are known as fixed-length
codes, because they represent each symbol in the message with the same number
of bits. It is sometimes advantageous to use variable-length codes, in which
different symbols may be represented by different numbers of bits. For example,
Morse code does not use the same number of dots and dashes for each letter of
the alphabet. In particular, E, the most frequent letter, is represented by a single
dot. In general, if our messages are such that some symbols appear very
frequently and some very rarely, we can encode data more efficiently (i.e., using
fewer bits per message) if we assign shorter codes to the frequent symbols.
Consider the following alternative code for the letters A through H:
A 0 C 1010 E 1100 G 1110
B 100 D 1011 F 1101 H 1111
With this code, the same message as above is encoded as the string
100010100101101100011010100100000111001111
This string contains 42 bits, so it saves more than 20% in space in comparison
with the fixed-length code shown above.
One of the difficulties of using a variable-length code is knowing when you have
reached the end of a symbol in reading a sequence of zeros and ones. Morse
code solves this problem by using a special separator code (in this case, a pause)
after the sequence of dots and dashes for each letter. Another solution is to
design the code in such a way that no complete code for any symbol is the
beginning (or prefix) of the code for another symbol. Such a code is called a
prefix code. In the example above, A is encoded by 0 and B is encoded by 100,
so no other symbol can have a code that begins with 0 or with 100.
In general, we can attain significant savings if we use variable-length prefix
codes that take advantage of the relative frequencies of the symbols in the
messages to be encoded. One particular scheme for doing this is called the
Huffman encoding method, after its discoverer, David Huffman. A Huffman
code can be represented as a binary tree whose leaves are the symbols that are
encoded. At each non-leaf node of the tree there is a set containing all the
symbols in the leaves that lie below the node. In addition, each symbol at a leaf
is assigned a weight (which is its relative frequency), and each non-leaf node
contains a weight that is the sum of all the weights of the leaves lying below it.
The weights are not used in the encoding or the decoding process. We will see
below how they are used to help construct the tree.
Figure 2.18: A Huffman encoding tree.
Figure 2.18 shows the Huffman tree for the A-through-H code given above. The
weights at the leaves indicate that the tree was designed for messages in which A
appears with relative frequency 8, B with relative frequency 3, and the other
letters each with relative frequency 1.
Given a Huffman tree, we can find the encoding of any symbol by starting at the
root and moving down until we reach the leaf that holds the symbol. Each time
we move down a left branch we add a 0 to the code, and each time we move
down a right branch we add a 1. (We decide which branch to follow by testing to
see which branch either is the leaf node for the symbol or contains the symbol in
its set.) For example, starting from the root of the tree in figure 2.18, we arrive at
the leaf for D by following a right branch, then a left branch, then a right branch,
then a right branch; hence, the code for D is 1011.
To decode a bit sequence using a Huffman tree, we begin at the root and use the
successive zeros and ones of the bit sequence to determine whether to move
down the left or the right branch. Each time we come to a leaf, we have
generated a new symbol in the message, at which point we start over from the
root of the tree to find the next symbol. For example, suppose we are given the
tree above and the sequence 10001010. Starting at the root, we move down the
right branch, (since the first bit of the string is 1), then down the left branch
(since the second bit is 0), then down the left branch (since the third bit is
also 0). This brings us to the leaf for B, so the first symbol of the decoded
message is B. Now we start again at the root, and we make a left move because
the next bit in the string is 0. This brings us to the leaf for A. Then we start again
at the root with the rest of the string 1010, so we move right, left, right, left and
reach C. Thus, the entire message is BAC.
Generating Huffman trees
Given an “alphabet” of symbols and their relative frequencies, how do we
construct the “best” code? (In other words, which tree will encode messages
with the fewest bits?) Huffman gave an algorithm for doing this and showed that
the resulting code is indeed the best variable-length code for messages where the
relative frequency of the symbols matches the frequencies with which the code
was constructed. We will not prove this optimality of Huffman codes here, but
we will show how Huffman trees are constructed.42
The algorithm for generating a Huffman tree is very simple. The idea is to
arrange the tree so that the symbols with the lowest frequency appear farthest
away from the root. Begin with the set of leaf nodes, containing symbols and
their frequencies, as determined by the initial data from which the code is to be
constructed. Now find two leaves with the lowest weights and merge them to
produce a node that has these two nodes as its left and right branches. The
weight of the new node is the sum of the two weights. Remove the two leaves
from the original set and replace them by this new node. Now continue this
process. At each step, merge two nodes with the smallest weights, removing
them from the set and replacing them with a node that has these two as its left
and right branches. The process stops when there is only one node left, which is
the root of the entire tree. Here is how the Huffman tree of figure 2.18 was
generated:
Initial leaves {(A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (G 1) (H 1)}
{(A 8) (B 3) ({C D} 2) (E 1) (F 1) (G 1) (H 1)}
Merge
{(A 8) (B 3) ({C D} 2) ({E F} 2) (G 1) (H 1)}
Merge
{(A 8) (B 3) ({C D} 2) ({E F} 2) ({G H} 2)}
Merge
{(A 8) (B 3) ({C D} 2) ({E F G H} 4)}
Merge
{(A 8) ({B C D} 5) ({E F G H} 4)}
Merge
{(A 8) ({B C D E F G H} 9)}
Merge
{({A B C D E F G H} 17)}
Final merge
The algorithm does not always specify a unique tree, because there may not be
unique smallest-weight nodes at each step. Also, the choice of the order in which
the two nodes are merged (i.e., which will be the right branch and which will be
the left branch) is arbitrary.
Representing Huffman trees
In the exercises below we will work with a system that uses Huffman trees to
encode and decode messages and generates Huffman trees according to the
algorithm outlined above. We will begin by discussing how trees are
represented.
Leaves of the tree are represented by a list consisting of the symbol leaf, the
symbol at the leaf, and the weight:
(define (make-leaf symbol weight)
(list 'leaf symbol weight))
(define (leaf? object)
(eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
A general tree will be a list of a left branch, a right branch, a set of symbols, and
a weight. The set of symbols will be simply a list of the symbols, rather than
some more sophisticated set representation. When we make a tree by merging
two nodes, we obtain the weight of the tree as the sum of the weights of the
nodes, and the set of symbols as the union of the sets of symbols for the nodes.
Since our symbol sets are represented as lists, we can form the union by using
the append procedure we defined in section 2.2.1:
(define (make-code-tree left right)
(list left
right
(append (symbols left) (symbols right))
(+ (weight left) (weight right))))
If we make a tree in this way, we have the following selectors:
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
(if (leaf? tree)
(list (symbol-leaf tree))
(caddr tree)))
(define (weight tree)
(if (leaf? tree)
(weight-leaf tree)
(cadddr tree)))
The procedures symbols and weight must do something slightly different
depending on whether they are called with a leaf or a general tree. These are
simple examples of generic procedures (procedures that can handle more than
one kind of data), which we will have much more to say about in sections 2.4
and 2.5.
The decoding procedure
The following procedure implements the decoding algorithm. It takes as
arguments a list of zeros and ones, together with a Huffman tree.
(define (decode bits tree)
(define (decode-1 bits current-branch)
(if (null? bits)
'()
(let ((next-branch
(choose-branch (car bits) current-branch)))
(if (leaf? next-branch)
(cons (symbol-leaf next-branch)
(decode-1 (cdr bits) tree))
(decode-1 (cdr bits) next-branch)))))
(decode-1 bits tree))
(define (choose-branch bit branch)
(cond ((= bit 0) (left-branch branch))
((= bit 1) (right-branch branch))
(else (error "bad bit -- CHOOSE-BRANCH" bit))))
The procedure decode-1 takes two arguments: the list of remaining bits and the
current position in the tree. It keeps moving “down” the tree, choosing a left or a
right branch according to whether the next bit in the list is a zero or a one. (This
is done with the procedure choose-branch.) When it reaches a leaf, it returns the
symbol at that leaf as the next symbol in the message by consing it onto the
result of decoding the rest of the message, starting at the root of the tree. Note
the error check in the final clause of choose-branch, which complains if the
procedure finds something other than a zero or a one in the input data.
Sets of weighted elements
In our representation of trees, each non-leaf node contains a set of symbols,
which we have represented as a simple list. However, the tree-generating
algorithm discussed above requires that we also work with sets of leaves and
trees, successively merging the two smallest items. Since we will be required to
repeatedly find the smallest item in a set, it is convenient to use an ordered
representation for this kind of set.
We will represent a set of leaves and trees as a list of elements, arranged in
increasing order of weight. The following adjoin-set procedure for
constructing sets is similar to the one described in exercise 2.61; however, items
are compared by their weights, and the element being added to the set is never
already in it.
(define (adjoin-set x set)
(cond ((null? set) (list x))
((< (weight x) (weight (car set))) (cons x set))
(else (cons (car set)
(adjoin-set x (cdr set))))))
The following procedure takes a list of symbol-frequency pairs such as ((A 4)
(B 2) (C 1) (D 1)) and constructs an initial ordered set of leaves, ready to be
merged according to the Huffman algorithm:
(define (make-leaf-set pairs)
(if (null? pairs)
'()
(let ((pair (car pairs)))
(adjoin-set (make-leaf (car pair)
; symbol
(cadr pair)) ; frequency
(make-leaf-set (cdr pairs))))))
Exercise 2.67. Define an encoding tree and a sample message:
(define sample-tree
(make-code-tree (make-leaf 'A 4)
(make-code-tree
(make-leaf 'B 2)
(make-code-tree (make-leaf 'D 1)
(make-leaf 'C 1)))))
(define sample-message '(0 1 1 0 0 1 0 1 0 1 1 1 0))
Use the decode procedure to decode the message, and give the result.
Exercise 2.68. The encode procedure takes as arguments a message and a tree
and produces the list of bits that gives the encoded message.
(define (encode message tree)
(if (null? message)
'()
(append (encode-symbol (car message) tree)
(encode (cdr message) tree))))
is a procedure, which you must write, that returns the list of bits
that encodes a given symbol according to a given tree. You should design
encode-symbol so that it signals an error if the symbol is not in the tree at all.
Test your procedure by encoding the result you obtained in exercise 2.67 with
the sample tree and seeing whether it is the same as the original sample message.
Encode-symbol
Exercise 2.69. The following procedure takes as its argument a list of symbolfrequency pairs (where no symbol appears in more than one pair) and generates
a Huffman encoding tree according to the Huffman algorithm.
(define (generate-huffman-tree pairs)
(successive-merge (make-leaf-set pairs)))
is the procedure given above that transforms the list of pairs into
an ordered set of leaves. Successive-merge is the procedure you must write,
using make-code-tree to successively merge the smallest-weight elements of
the set until there is only one element left, which is the desired Huffman tree.
(This procedure is slightly tricky, but not really complicated. If you find yourself
Make-leaf-set
designing a complex procedure, then you are almost certainly doing something
wrong. You can take significant advantage of the fact that we are using an
ordered set representation.)
Exercise 2.70. The following eight-symbol alphabet with associated relative
frequencies was designed to efficiently encode the lyrics of 1950s rock songs.
(Note that the “symbols” of an “alphabet” need not be individual letters.)
A
2 NA 16
BOOM 1 SHA 3
GET 2 YIP 9
JOB 2 WAH 1
Use generate-huffman-tree (exercise 2.69) to generate a corresponding
Huffman tree, and use encode (exercise 2.68) to encode the following message:
Get a job
Sha na na na na na na na na
Get a job
Sha na na na na na na na na
Wah yip yip yip yip yip yip yip yip yip
Sha boom
How many bits are required for the encoding? What is the smallest number of
bits that would be needed to encode this song if we used a fixed-length code for
the eight-symbol alphabet?
Exercise 2.71. Suppose we have a Huffman tree for an alphabet of n symbols,
and that the relative frequencies of the symbols are 1, 2, 4, ..., 2n-1. Sketch the
tree for n=5; for n=10. In such a tree (for general n) how may bits are required to
encode the most frequent symbol? the least frequent symbol?
Exercise 2.72. Consider the encoding procedure that you designed in
exercise 2.68. What is the order of growth in the number of steps needed to
encode a symbol? Be sure to include the number of steps needed to search the
symbol list at each node encountered. To answer this question in general is
difficult. Consider the special case where the relative frequencies of the n
symbols are as described in exercise 2.71, and give the order of growth (as a
function of n) of the number of steps needed to encode the most frequent and
least frequent symbols in the alphabet.
32
Allowing quotation in a language wreaks havoc with the ability to reason about the language in simple
terms, because it destroys the notion that equals can be substituted for equals. For example, three is one plus
two, but the word “three” is not the phrase “one plus two.” Quotation is powerful because it gives us a way
to build expressions that manipulate other expressions (as we will see when we write an interpreter in
chapter 4). But allowing statements in a language that talk about other statements in that language makes it
very difficult to maintain any coherent principle of what “equals can be substituted for equals” should
mean. For example, if we know that the evening star is the morning star, then from the statement “the
evening star is Venus” we can deduce “the morning star is Venus.” However, given that “John knows that
the evening star is Venus” we cannot infer that “John knows that the morning star is Venus.”
33
The single quote is different from the double quote we have been using to enclose character strings to be
printed. Whereas the single quote can be used to denote lists or symbols, the double quote is used only with
character strings. In this book, the only use for character strings is as items to be printed.
34
Strictly, our use of the quotation mark violates the general rule that all compound expressions in our
language should be delimited by parentheses and look like lists. We can recover this consistency by
introducing a special form quote, which serves the same purpose as the quotation mark. Thus, we would
type (quote a) instead of 'a, and we would type (quote (a b c)) instead of '(a b c). This is precisely
how the interpreter works. The quotation mark is just a single-character abbreviation for wrapping the next
complete expression with quote to form (quote ). This is important because it maintains the
principle that any expression seen by the interpreter can be manipulated as a data object. For instance, we
could construct the expression (car '(a b c)), which is the same as (car (quote (a b c))), by
evaluating (list 'car (list 'quote '(a b c))).
35
We can consider two symbols to be “the same” if they consist of the same characters in the same order.
Such a definition skirts a deep issue that we are not yet ready to address: the meaning of “sameness” in a
programming language. We will return to this in chapter 3 (section 3.1.3).
36
In practice, programmers use equal? to compare lists that contain numbers as well as symbols. Numbers
are not considered to be symbols. The question of whether two numerically equal numbers (as tested by =)
are also eq? is highly implementation-dependent. A better definition of equal? (such as the one that comes
as a primitive in Scheme) would also stipulate that if a and b are both numbers, then a and b are equal? if
they are numerically equal.
37
If we want to be more formal, we can specify “consistent with the interpretations given above” to mean
that the operations satisfy a collection of rules such as these:
For any set S and any object x, (element-of-set? x (adjoin-set x
“Adjoining an object to a set produces a set that contains the object”).
S))
is true (informally:
For any sets S and T and any object x, (element-of-set? x (union-set S T)) is equal to (or
(element-of-set? x S) (element-of-set? x T)) (informally: “The elements of (union S T) are
the elements that are in S or in T”).
For any object x, (element-of-set?
empty set”).
x '())
is false (informally: “No object is an element of the
38
Halving the size of the problem at each step is the distinguishing characteristic of logarithmic growth, as
we saw with the fast-exponentiation algorithm of section 1.2.4 and the half-interval search method of
section 1.3.3.
39
We are representing sets in terms of trees, and trees in terms of lists – in effect, a data abstraction built
upon a data abstraction. We can regard the procedures entry, left-branch, right-branch, and make-tree
as a way of isolating the abstraction of a “binary tree” from the particular way we might wish to represent
such a tree in terms of list structure.
40
Examples of such structures include B-trees and red-black trees. There is a large literature on data
structures devoted to this problem. See Cormen, Leiserson, and Rivest 1990.
41
Exercises 2.63-2.65 are due to Paul Hilfinger.
42
See Hamming 1980 for a discussion of the mathematical properties of Huffman codes.
2.4 Multiple Representations for Abstract Data
We have introduced data abstraction, a methodology for structuring systems in
such a way that much of a program can be specified independent of the choices
involved in implementing the data objects that the program manipulates. For
example, we saw in section 2.1.1 how to separate the task of designing a
program that uses rational numbers from the task of implementing rational
numbers in terms of the computer language's primitive mechanisms for
constructing compound data. The key idea was to erect an abstraction barrier –
in this case, the selectors and constructors for rational numbers (make-rat,
numer, denom) – that isolates the way rational numbers are used from their
underlying representation in terms of list structure. A similar abstraction barrier
isolates the details of the procedures that perform rational arithmetic (add-rat,
sub-rat, mul-rat, and div-rat) from the “higher-level” procedures that use
rational numbers. The resulting program has the structure shown in figure 2.1.
These data-abstraction barriers are powerful tools for controlling complexity. By
isolating the underlying representations of data objects, we can divide the task of
designing a large program into smaller tasks that can be performed separately.
But this kind of data abstraction is not yet powerful enough, because it may not
always make sense to speak of “the underlying representation” for a data object.
For one thing, there might be more than one useful representation for a data
object, and we might like to design systems that can deal with multiple
representations. To take a simple example, complex numbers may be represented
in two almost equivalent ways: in rectangular form (real and imaginary parts)
and in polar form (magnitude and angle). Sometimes rectangular form is more
appropriate and sometimes polar form is more appropriate. Indeed, it is perfectly
plausible to imagine a system in which complex numbers are represented in both
ways, and in which the procedures for manipulating complex numbers work with
either representation.
More importantly, programming systems are often designed by many people
working over extended periods of time, subject to requirements that change over
time. In such an environment, it is simply not possible for everyone to agree in
advance on choices of data representation. So in addition to the data-abstraction
barriers that isolate representation from use, we need abstraction barriers that
isolate different design choices from each other and permit different choices to
coexist in a single program. Furthermore, since large programs are often created
by combining pre-existing modules that were designed in isolation, we need
conventions that permit programmers to incorporate modules into larger systems
additively, that is, without having to redesign or reimplement these modules.
In this section, we will learn how to cope with data that may be represented in
different ways by different parts of a program. This requires constructing generic
procedures – procedures that can operate on data that may be represented in
more than one way. Our main technique for building generic procedures will be
to work in terms of data objects that have type tags, that is, data objects that
include explicit information about how they are to be processed. We will also
discuss data-directed programming, a powerful and convenient implementation
strategy for additively assembling systems with generic operations.
We begin with the simple complex-number example. We will see how type tags
and data-directed style enable us to design separate rectangular and polar
representations for complex numbers while maintaining the notion of an abstract
“complex-number” data object. We will accomplish this by defining arithmetic
procedures for complex numbers (add-complex, sub-complex, mul-complex,
and div-complex) in terms of generic selectors that access parts of a complex
number independent of how the number is represented. The resulting complexnumber system, as shown in figure 2.19, contains two different kinds of
abstraction barriers. The “horizontal” abstraction barriers play the same role as
the ones in figure 2.1. They isolate “higher-level” operations from “lower-level”
representations. In addition, there is a “vertical” barrier that gives us the ability
to separately design and install alternative representations.
Figure 2.19: Data-abstraction barriers in the complex-number system.
In section 2.5 we will show how to use type tags and data-directed style to
develop a generic arithmetic package. This provides procedures (add, mul, and
so on) that can be used to manipulate all sorts of “numbers” and can be easily
extended when a new kind of number is needed. In section 2.5.3, we'll show how
to use generic arithmetic in a system that performs symbolic algebra.
2.4.1 Representations for Complex Numbers
We will develop a system that performs arithmetic operations on complex
numbers as a simple but unrealistic example of a program that uses generic
operations. We begin by discussing two plausible representations for complex
numbers as ordered pairs: rectangular form (real part and imaginary part) and
polar form (magnitude and angle).43 Section 2.4.2 will show how both
representations can be made to coexist in a single system through the use of type
tags and generic operations.
Like rational numbers, complex numbers are naturally represented as ordered
pairs. The set of complex numbers can be thought of as a two-dimensional space
with two orthogonal axes, the “real” axis and the “imaginary” axis. (See
figure 2.20.) From this point of view, the complex number z = x + iy (where i2 =
- 1) can be thought of as the point in the plane whose real coordinate is x and
whose imaginary coordinate is y. Addition of complex numbers reduces in this
representation to addition of coordinates:
When multiplying complex numbers, it is more natural to think in terms of
representing a complex number in polar form, as a magnitude and an angle (r
and A in figure 2.20). The product of two complex numbers is the vector
obtained by stretching one complex number by the length of the other and then
rotating it through the angle of the other:
Thus, there are two different representations for complex numbers, which are
appropriate for different operations. Yet, from the viewpoint of someone writing
a program that uses complex numbers, the principle of data abstraction suggests
that all the operations for manipulating complex numbers should be available
regardless of which representation is used by the computer. For example, it is
often useful to be able to find the magnitude of a complex number that is
specified by rectangular coordinates. Similarly, it is often useful to be able to
specified by rectangular coordinates. Similarly, it is often useful to be able to
determine the real part of a complex number that is specified by polar
coordinates.
Figure 2.20: Complex numbers as points in the plane.
To design such a system, we can follow the same data-abstraction strategy we
followed in designing the rational-number package in section 2.1.1. Assume that
the operations on complex numbers are implemented in terms of four selectors:
real-part, imag-part, magnitude, and angle. Also assume that we have two
procedures for constructing complex numbers: make-from-real-imag returns a
complex number with specified real and imaginary parts, and make-from-magang returns a complex number with specified magnitude and angle. These
procedures have the property that, for any complex number z, both
(make-from-real-imag (real-part z) (imag-part z))
and
(make-from-mag-ang (magnitude z) (angle z))
produce complex numbers that are equal to z.
Using these constructors and selectors, we can implement arithmetic on complex
numbers using the “abstract data” specified by the constructors and selectors,
just as we did for rational numbers in section 2.1.1. As shown in the formulas
above, we can add and subtract complex numbers in terms of real and imaginary
parts while multiplying and dividing complex numbers in terms of magnitudes
and angles:
(define (add-complex z1 z2)
(make-from-real-imag (+ (real-part z1) (real-part z2))
(+ (imag-part z1) (imag-part z2))))
(define (sub-complex z1 z2)
(make-from-real-imag (- (real-part z1) (real-part z2))
(- (imag-part z1) (imag-part z2))))
(define (mul-complex z1 z2)
(make-from-mag-ang (* (magnitude z1) (magnitude z2))
(+ (angle z1) (angle z2))))
(define (div-complex z1 z2)
(make-from-mag-ang (/ (magnitude z1) (magnitude z2))
(- (angle z1) (angle z2))))
To complete the complex-number package, we must choose a representation and
we must implement the constructors and selectors in terms of primitive numbers
and primitive list structure. There are two obvious ways to do this: We can
represent a complex number in “rectangular form” as a pair (real part, imaginary
part) or in “polar form” as a pair (magnitude, angle). Which shall we choose?
In order to make the different choices concrete, imagine that there are two
programmers, Ben Bitdiddle and Alyssa P. Hacker, who are independently
designing representations for the complex-number system. Ben chooses to
represent complex numbers in rectangular form. With this choice, selecting the
real and imaginary parts of a complex number is straightforward, as is
constructing a complex number with given real and imaginary parts. To find the
magnitude and the angle, or to construct a complex number with a given
magnitude and angle, he uses the trigonometric relations
which relate the real and imaginary parts (x, y) to the magnitude and the angle (r,
A).44 Ben's representation is therefore given by the following selectors and
constructors:
(define
(define
(define
(sqrt
(define
(atan
(define
(define
(cons
(real-part z) (car z))
(imag-part z) (cdr z))
(magnitude z)
(+ (square (real-part z)) (square (imag-part z)))))
(angle z)
(imag-part z) (real-part z)))
(make-from-real-imag x y) (cons x y))
(make-from-mag-ang r a)
(* r (cos a)) (* r (sin a))))
Alyssa, in contrast, chooses to represent complex numbers in polar form. For
her, selecting the magnitude and angle is straightforward, but she has to use the
trigonometric relations to obtain the real and imaginary parts. Alyssa's
representation is:
(define (real-part z)
(* (magnitude z) (cos (angle z))))
(define (imag-part z)
(* (magnitude z) (sin (angle z))))
(define (magnitude z) (car z))
(define (angle z) (cdr z))
(define (make-from-real-imag x y)
(cons (sqrt (+ (square x) (square y)))
(atan y x)))
(define (make-from-mag-ang r a) (cons r a))
The discipline of data abstraction ensures that the same implementation of addcomplex, sub-complex, mul-complex, and div-complex will work with either
Ben's representation or Alyssa's representation.
2.4.2 Tagged data
One way to view data abstraction is as an application of the “principle of least
commitment.” In implementing the complex-number system in section 2.4.1, we
can use either Ben's rectangular representation or Alyssa's polar representation.
The abstraction barrier formed by the selectors and constructors permits us to
defer to the last possible moment the choice of a concrete representation for our
data objects and thus retain maximum flexibility in our system design.
The principle of least commitment can be carried to even further extremes. If we
desire, we can maintain the ambiguity of representation even after we have
designed the selectors and constructors, and elect to use both Ben's
representation and Alyssa's representation. If both representations are included
in a single system, however, we will need some way to distinguish data in polar
form from data in rectangular form. Otherwise, if we were asked, for instance, to
find the magnitude of the pair (3,4), we wouldn't know whether to answer 5
(interpreting the number in rectangular form) or 3 (interpreting the number in
polar form). A straightforward way to accomplish this distinction is to include a
type tag – the symbol rectangular or polar – as part of each complex number.
Then when we need to manipulate a complex number we can use the tag to
decide which selector to apply.
In order to manipulate tagged data, we will assume that we have procedures
type-tag and contents that extract from a data object the tag and the actual
contents (the polar or rectangular coordinates, in the case of a complex number).
We will also postulate a procedure attach-tag that takes a tag and contents and
produces a tagged data object. A straightforward way to implement this is to use
ordinary list structure:
(define (attach-tag type-tag contents)
(cons type-tag contents))
(define (type-tag datum)
(if (pair? datum)
(car datum)
(error "Bad tagged datum -- TYPE-TAG" datum)))
(define (contents datum)
(if (pair? datum)
(cdr datum)
(error "Bad tagged datum -- CONTENTS" datum)))
Using these procedures, we can define predicates rectangular? and polar?,
which recognize polar and rectangular numbers, respectively:
(define (rectangular? z)
(eq? (type-tag z) 'rectangular))
(define (polar? z)
(eq? (type-tag z) 'polar))
With type tags, Ben and Alyssa can now modify their code so that their two
different representations can coexist in the same system. Whenever Ben
constructs a complex number, he tags it as rectangular. Whenever Alyssa
constructs a complex number, she tags it as polar. In addition, Ben and Alyssa
must make sure that the names of their procedures do not conflict. One way to
do this is for Ben to append the suffix rectangular to the name of each of his
representation procedures and for Alyssa to append polar to the names of hers.
Here is Ben's revised rectangular representation from section 2.4.1:
(define
(define
(define
(sqrt
(real-part-rectangular z) (car z))
(imag-part-rectangular z) (cdr z))
(magnitude-rectangular z)
(+ (square (real-part-rectangular z))
(square (imag-part-rectangular z)))))
(define (angle-rectangular z)
(atan (imag-part-rectangular z)
(real-part-rectangular z)))
(define (make-from-real-imag-rectangular x y)
(attach-tag 'rectangular (cons x y)))
(define (make-from-mag-ang-rectangular r a)
(attach-tag 'rectangular
(cons (* r (cos a)) (* r (sin a)))))
and here is Alyssa's revised polar representation:
(define (real-part-polar z)
(* (magnitude-polar z) (cos (angle-polar z))))
(define (imag-part-polar z)
(* (magnitude-polar z) (sin (angle-polar z))))
(define (magnitude-polar z) (car z))
(define (angle-polar z) (cdr z))
(define (make-from-real-imag-polar x y)
(attach-tag 'polar
(cons (sqrt (+ (square x) (square y)))
(atan y x))))
(define (make-from-mag-ang-polar r a)
(attach-tag 'polar (cons r a)))
Each generic selector is implemented as a procedure that checks the tag of its
argument and calls the appropriate procedure for handling data of that type. For
example, to obtain the real part of a complex number, real-part examines the
tag to determine whether to use Ben's real-part-rectangular or Alyssa's
real-part-polar. In either case, we use contents to extract the bare, untagged
datum and send this to the rectangular or polar procedure as required:
(define (real-part z)
(cond ((rectangular? z)
(real-part-rectangular (contents z)))
((polar? z)
(real-part-polar (contents z)))
(else (error "Unknown type -- REAL-PART" z))))
(define (imag-part z)
(cond ((rectangular? z)
(imag-part-rectangular (contents z)))
((polar? z)
(imag-part-polar (contents z)))
(else (error "Unknown type -- IMAG-PART" z))))
(define (magnitude z)
(cond ((rectangular? z)
(magnitude-rectangular (contents z)))
((polar? z)
(magnitude-polar (contents z)))
(else (error "Unknown type -- MAGNITUDE" z))))
(define (angle z)
(cond ((rectangular? z)
(angle-rectangular (contents z)))
((polar? z)
(angle-polar (contents z)))
(else (error "Unknown type -- ANGLE" z))))
To implement the complex-number arithmetic operations, we can use the same
procedures add-complex, sub-complex, mul-complex, and div-complex from
section 2.4.1, because the selectors they call are generic, and so will work with
either representation. For example, the procedure add-complex is still
(define (add-complex z1 z2)
(make-from-real-imag (+ (real-part z1) (real-part z2))
(+ (imag-part z1) (imag-part z2))))
Finally, we must choose whether to construct complex numbers using Ben's
representation or Alyssa's representation. One reasonable choice is to construct
rectangular numbers whenever we have real and imaginary parts and to construct
polar numbers whenever we have magnitudes and angles:
(define (make-from-real-imag x y)
(make-from-real-imag-rectangular x y))
(define (make-from-mag-ang r a)
(make-from-mag-ang-polar r a))
Figure 2.21: Structure of the generic complex-arithmetic system.
The resulting complex-number system has the structure shown in figure 2.21.
The system has been decomposed into three relatively independent parts: the
complex-number-arithmetic operations, Alyssa's polar implementation, and
Ben's rectangular implementation. The polar and rectangular implementations
could have been written by Ben and Alyssa working separately, and both of
these can be used as underlying representations by a third programmer
implementing the complex-arithmetic procedures in terms of the abstract
constructor/selector interface.
Since each data object is tagged with its type, the selectors operate on the data in
a generic manner. That is, each selector is defined to have a behavior that
depends upon the particular type of data it is applied to. Notice the general
mechanism for interfacing the separate representations: Within a given
representation implementation (say, Alyssa's polar package) a complex number
is an untyped pair (magnitude, angle). When a generic selector operates on a
number of polar type, it strips off the tag and passes the contents on to Alyssa's
code. Conversely, when Alyssa constructs a number for general use, she tags it
with a type so that it can be appropriately recognized by the higher-level
procedures. This discipline of stripping off and attaching tags as data objects are
passed from level to level can be an important organizational strategy, as we
shall see in section 2.5.
2.4.3 Data-Directed Programming and Additivity
The general strategy of checking the type of a datum and calling an appropriate
procedure is called dispatching on type. This is a powerful strategy for obtaining
modularity in system design. On the other hand, implementing the dispatch as in
section 2.4.2 has two significant weaknesses. One weakness is that the generic
interface procedures (real-part, imag-part, magnitude, and angle) must know
about all the different representations. For instance, suppose we wanted to
incorporate a new representation for complex numbers into our complex-number
system. We would need to identify this new representation with a type, and then
add a clause to each of the generic interface procedures to check for the new
type and apply the appropriate selector for that representation.
Another weakness of the technique is that even though the individual
representations can be designed separately, we must guarantee that no two
procedures in the entire system have the same name. This is why Ben and Alyssa
had to change the names of their original procedures from section 2.4.1.
The issue underlying both of these weaknesses is that the technique for
implementing generic interfaces is not additive. The person implementing the
generic selector procedures must modify those procedures each time a new
representation is installed, and the people interfacing the individual
representations must modify their code to avoid name conflicts. In each of these
cases, the changes that must be made to the code are straightforward, but they
must be made nonetheless, and this is a source of inconvenience and error. This
is not much of a problem for the complex-number system as it stands, but
suppose there were not two but hundreds of different representations for
complex numbers. And suppose that there were many generic selectors to be
maintained in the abstract-data interface. Suppose, in fact, that no one
programmer knew all the interface procedures or all the representations. The
problem is real and must be addressed in such programs as large-scale data-basemanagement systems.
What we need is a means for modularizing the system design even further. This
is provided by the programming technique known as data-directed
programming. To understand how data-directed programming works, begin with
the observation that whenever we deal with a set of generic operations that are
common to a set of different types we are, in effect, dealing with a twodimensional table that contains the possible operations on one axis and the
possible types on the other axis. The entries in the table are the procedures that
implement each operation for each type of argument presented. In the complexnumber system developed in the previous section, the correspondence between
operation name, data type, and actual procedure was spread out among the
various conditional clauses in the generic interface procedures. But the same
information could have been organized in a table, as shown in figure 2.22.
Data-directed programming is the technique of designing programs to work with
such a table directly. Previously, we implemented the mechanism that interfaces
the complex-arithmetic code with the two representation packages as a set of
procedures that each perform an explicit dispatch on type. Here we will
implement the interface as a single procedure that looks up the combination of
the operation name and argument type in the table to find the correct procedure
to apply, and then applies it to the contents of the argument. If we do this, then to
add a new representation package to the system we need not change any existing
procedures; we need only add new entries to the table.
Figure 2.22: Table of operations for the complex-number system.
To implement this plan, assume that we have two procedures, put and get, for
manipulating the operation-and-type table:
(put *