Comprehending List Comprehensions
Comprehensions are a compact way to write down a List, or a Set, or some other form of collection, that contains related elements. This page walks through examples of using list comprehensions. UseAGenerator is a more complete treatment of comprehensions in general.
Preliminaries
The list of integers from 1 to 3 is written as follows:
The range 1:3 is known as a generator. (*) indicates a comment until the end of the line.
Here are more compact ways to generate other interesting sequences of integers:
We can also use a filter (also known as a guard or a predicate) to select a subset of the values generated. The following two expressions generate the same lists, the squares of the first 5 even integers.
The boolean expression after the comma, 2 DIVIDES x, is the filter: it selects those values of x that make it true. Notice that x is in scope for the filter.
Generators can be nested, a bit like nested loops. We can also nest filters. Now lets see how generators and filters are used in a more complex example: Pythagorean Triples.
Pythagorean Triples: Our Primary Example
A Pythagorean triple is three integers a, b, and c representing the sides of a right triangle with hypotenuse c. This means that a2 + b2 = c2, which we'll refer to as the Pythagorean equality. We will require that all three sides are less than some specified maximum value, and for uniqueness we'll also require that a < b < c. Also, triples should be primitive — not a multiple of another triple.
This exercise is based on the trips.fss demo program. Much more information on Pythagorean Triples can be found in this Wikipedia article.
First we define two helper functions, coprime and even, and a trip object.
We are looking to generate lists like this one, from Wikipedia:
Here's a direct enumerate-and-filter specification, parametrized by m, the maximum side length.
The concise expression: a<-1:m, b<-1:m, c<-1:m is like a triply nested loop. In this case, the order of the generators is irrelevant (and the compiler might leverage this), but generally the order matters as we'll see. We also specified three filters: an ordering of a, b, and c, the Pythagorean equality, and coprime (a,b), which ensures that the triple is primitive.
We can fold the first filter, a < b < c, into the lower and upper bounds of the generators.
This sort of transformation is known as filter promotion. Rather than generate and then filter, we can avoid generating some values in the first place. We can imagine that that a compiler, a run time system, or in Fortress' case, a clever library, could accomplish simple filter promotion. Now c generation must come after b generation which in turn must come after a generation as a is used in the bounds of b generation which in turn is used in the bounds of c generation --- the generators can no longer be freely interchanged --- c generation is "more inner" than b generation which in turn is "more inner" than a generation.
Shouldn't this sort of transformation be the programmer's responsibility? Perhaps, but sometime in mathematics we specify sets this way, and it can be very concise and easy to understand.
What about the next filter, the Pythagorean equality? Once a and b are selected, this filter tells us there is at most one one possible value to consider for c, we have a candidate if c is an integer. First rewrite the filter c = SQRT(a^2 + b^2). If that produces an integer we have a candidate; otherwise, we're not interested. We can define the perfectSqrt function to return the exact integer square-root (for perfect squares) or nothing at all.
Why not just truncate the real square-root and check that? That would make a less interesting example, and it's not quite right for large numbers.
The just function generates a single element, and Nothing lets us return nothing at all. narrow is used to go from ZZ64 to ZZ32, and will not be needed after the typechecker is online (real Soon Now!). The typechecker will also obviate the need to write Nothing[\ZZ32\] instead of simply Nothing.
There's also a bit of housekeeping, as c must satisfy the generator bounds, ie, b+1 <= c <= m. The lower bound comes for free, but we have to check the upper bound explicitly with a filter: c <= m.
Can we promote the filter c <= m? It can be absorbed into b generation. Since c2 = a2 + b2 and c <= m, a2 + b2 <= m2, giving us b <= SQRT(m2 - a2). This only produces values at most m-1.
It would be pretty impressive if a library or compiler could automatically do the filter promotion we've done by hand:
- Fold a < b < c into b and c generation. This is the easiest one.
- Calculate c rather than generate and filter. This involves quadratic algebraic manipulation.
- Fold c < m into b generation (upper bound): again, algebraic manipulation.
It's tricky, but all of the preceding and the following could conceivably be produced automatically by a clever enough system, perhaps an MS project, or an undergrad capstone project for a very clever undergrad.
Just for fun, let's pull common divisibility by two out of the coprime filter.
Then fold that into b generation using striding.
These last two steps could conceivably be done automagically, but that would take a really clever researcher.
Is there more fun to be had? It turns out that exactly one of a and b are even, but proving that is reasonably tricky. However, given that fact, we can write the stronger guard (even a) XOR (even b), or, we could fold that clever fact into simplified striding in b generation.
And, it also turns out that the smallest leg in a triple is 3.
Performing these last three transformations automatically would also be impressive research.
1 & 2. Turn common divisibility by 2 into b striding (1: complex striding, and 2: simple striding)
3. Start a with 3. Good luck with this one. :-)
Checking Results
Suppose we wanted to compare the result to Wikipedia100. Since the order of generated triples is not necessarily the same order as in Wikipedia100, we have to be careful. We could sort the triples at the end, or we could simply dump the results into sets. Sets are generated by using { and } instead of the list enclosers <| and |> . There is currently an implementation restriction in Fortress: if we want to put elements into a set, they have to have an ordering (counter intuitive, no?). So, first we have to define an order on trip.
We define the comparison operator, CMP, by lexicographically comparing a then b then c in turn. LEXICO is shorthand for: if the thing on the left didn't determine the order, check the thing on the right. Now we can write:
When we put the list Wikipedia100 in the position of a generator, it generates it's elements. How cool is that? assert compares it's first two arguments, and if they are not equal, it prints the third argument.

