Additional Interpreter Environment Issues
Current Environments
Each component receives its "own" self-contained environment, initialized by copying bindings through imported APIs from other components. This requires iterated copying until no more new bindings are added.
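The iterated copy can be pictured as a fixpoint loop. The following is only a minimal sketch, not the interpreter's code: the dictionary layout, the names link_by_copying and exporters, and the assumption that a component may export a name it has itself imported are all hypothetical, chosen to show why a single pass is not always enough.

```python
def link_by_copying(components, exporters):
    """Build one self-contained environment per component.

    components: component name -> {"defs": {name: value},
                                   "imports": [api name, ...],
                                   "exports": {api name: [member name, ...]}}
    exporters:  api name -> name of the component implementing that API.
    (Both shapes are assumptions made for this sketch.)
    """
    # Start each environment from the component's own definitions.
    envs = {c: dict(spec["defs"]) for c, spec in components.items()}

    changed = True
    while changed:                      # repeat until no new binding is added
        changed = False
        for c, spec in components.items():
            for api in spec["imports"]:
                source = exporters[api]
                for name in components[source]["exports"][api]:
                    # The exporter may not have received the binding yet;
                    # a later pass will pick it up.
                    if name in envs[source] and name not in envs[c]:
                        envs[c][name] = envs[source][name]
                        changed = True
    return envs
```

With a chain such as C importing from B, which in turn imports from A, the first pass can visit C before B has received its bindings, so C is only completed on the second pass.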
Overloaded functions and functional methods make this a little delicate.
In this implementation (which we think mirrors the semantics), an overloaded function named "f" is the set of functions either defined or imported under the name "f".
Overloaded functions are detected as they are added to the environment; if a top level definition occurs for an "f" that already has a top-level definition, then the two definitions are combined into a single overloaded function.
This works well with the copy-values-in environment plan: when traits and objects are copied in, they can be scanned for functional methods; when functions are copied in, they can be added to the overload.
The disadvantage of this way of structuring environments is that each component's environment can become very large.
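A rough sketch of detecting overloads at insertion time follows. The classes Overload, Trait, and Environment and their methods are hypothetical names invented for illustration; the real interpreter's data structures may look quite different.

```python
class Overload:
    """A set of functions sharing one name (hypothetical representation)."""
    def __init__(self, name):
        self.name = name
        self.members = []

    def add(self, fn):
        if fn not in self.members:      # copying the same function twice is harmless
            self.members.append(fn)


class Trait:
    """Stand-in for a trait or object; only its functional methods matter here."""
    def __init__(self, functional_methods):
        self.functional_methods = functional_methods   # method name -> function


class Environment:
    def __init__(self):
        self.bindings = {}

    def add_function(self, name, fn):
        existing = self.bindings.get(name)
        if existing is None:
            self.bindings[name] = fn            # first definition: plain binding
        elif isinstance(existing, Overload):
            existing.add(fn)                    # already overloaded: extend the set
        elif existing is not fn:
            overload = Overload(name)           # second definition: combine the two
            overload.add(existing)
            overload.add(fn)
            self.bindings[name] = overload

    def add_trait(self, name, trait):
        self.bindings[name] = trait
        # Functional methods behave like top-level functions, so they are
        # scanned out of the trait and merged into the overloads as well.
        for method_name, fn in trait.functional_methods.items():
            self.add_function(method_name, fn)
```

The same add_function path serves both local definitions and copied-in imports, which is what makes the copy-in plan convenient for overloading.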
Proposed changes
The proposed alternative plan for treating environments is that bindings are NOT copied in. Instead, any imports are referenced through an additional level of indirection: first the name of the API from which they are imported, then the name of the API member that is referenced. The APIs, at "link" time, will copy bindings from their implementing (exporting) components.
However, this causes problems with overloading. If we intend to retain the current behavior, doing so requires iterated passes over imported APIs that will be the footprint-equivalent of our current copy-in scheme.
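As a minimal sketch of this indirection (the class names Api and ComponentEnv and their methods are assumptions made for the example, not names from the implementation), the component's environment holds only its own definitions plus references to API objects, and each API is filled in once at link time:

```python
class Api:
    """An API: a list of member names, bound at link time (hypothetical model)."""
    def __init__(self, name, member_names):
        self.name = name
        self.member_names = list(member_names)
        self.members = {}

    def link(self, component_defs):
        # The API, not the importer, copies bindings from the exporting component.
        for member in self.member_names:
            self.members[member] = component_defs[member]


class ComponentEnv:
    def __init__(self, defs, imported_apis):
        self.defs = defs                                   # local bindings only
        self.apis = {api.name: api for api in imported_apis}

    def lookup_local(self, name):
        return self.defs[name]

    def lookup_imported(self, api_name, member):
        # Two-step lookup: API name first, then the member within that API.
        return self.apis[api_name].members[member]
```

Under this scheme each binding is stored once per API rather than once per importing component, which is where the space saving would come from.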
... names are pre-resolved (by the disambiguator) to note which API they are defined in. It seems like we need a special "overloaded function" definition that can be inserted at the top level by the disambiguator.
A "new" idea is to associate a "canonical name" with each function declaration, in addition to its existing "name". A canonical name contains all the information necessary to distinguish two overloaded functions. This might, or might not, include things like where clauses. It definitely includes parameters and static parameters, keyword arguments, and at least the presence/absence of a default value. It is probably not necessary to preserve distinctions that could never occur in a valid overloading set.
When combined with the name of the API in which a function is defined, this gives an exact name for a member of an overloading, which can be used (in references, in ASTs) to "talk about" overloaded functions. (Function references will include a list of the functions that might be actual referents at run time.)
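To make the idea concrete, here is a sketch of how a canonical name might be derived from a declaration. The exact string format is an assumption made for illustration (in particular the trailing "=" marking a keyword that has a default, and the "Api.name" join); only the ingredients (static parameters, parameters, keywords, presence of a default) come from the discussion above. Where clauses are left out because it is undecided whether they belong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    name: str
    type: str
    has_default: bool        # only presence/absence is recorded, not the value


@dataclass(frozen=True)
class FunctionDecl:
    name: str
    static_params: tuple = ()
    params: tuple = ()       # (parameter name, type) pairs
    keywords: tuple = ()     # Keyword instances

    def canonical_name(self):
        # Static parameters go in Oxford brackets, as in the source text.
        sp = "⟦" + ",".join(self.static_params) + "⟧" if self.static_params else ""
        parts = [f"{n}:{t}" for n, t in self.params]
        # Hypothetical convention: a trailing "=" marks "has a default value".
        parts += [f"{k.name}:{k.type}" + ("=" if k.has_default else "")
                  for k in self.keywords]
        return f"{self.name}{sp}(" + ",".join(parts) + ")"


def exact_name(api, decl):
    """API name plus canonical name identifies one member of an overloading."""
    return f"{api}.{decl.canonical_name()}"
```

Two declarations of "f" that differ in any of these ingredients get different canonical names, so a reference in an AST can list exactly the members it might resolve to.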
A detail: what about functional methods?
Canonical naming rules
An unrelated problem in the creation of file names has clarified things somewhat. There are several important rules to keep in mind:
- The canonical names should not be mangled. As much as possible, they should contain the Unicode for what they are (for example, Oxford brackets should appear as ⟦ (U+27E6) and ⟧ (U+27E7), not [\ and \]).
- It is important to agree on a string name. These canonical names are going to be the names of fields in bytecode-encoded data structures that result from compilation. That means strings. The strings should look as much like the names that are seen in the (idealized) source text as possible.
- In some cases, we need to improvise, and we need to be careful of ambiguity in the presence of preceding package names. Functional methods come to mind -- the obvious way to encode these is
Traitname⟦TP1,TP2⟧.Methodname(a1:t1,a2:t2,self)
That is, the functional method Methodname of generic trait Traitname. Note that this distinguishes "generic functional methods" from "functional generic methods" in an obvious way.
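A small sketch of this encoding follows. The helper functional_method_name is hypothetical, and the placement of a method's own static parameters after the method name is an assumption about what "the obvious way" means; only the first form, with the static parameters on the trait, is taken from the example above.

```python
def functional_method_name(trait, trait_static_params, method,
                           method_static_params, params):
    """Encode a functional method's canonical name (illustrative only).

    params is a sequence whose items are either the string "self" or a
    (parameter name, type) pair.
    """
    def brackets(static_params):
        # Oxford brackets, written as real Unicode rather than [\ and \].
        return "⟦" + ",".join(static_params) + "⟧" if static_params else ""

    args = ",".join(p if p == "self" else f"{p[0]}:{p[1]}" for p in params)
    return (f"{trait}{brackets(trait_static_params)}"
            f".{method}{brackets(method_static_params)}({args})")


# Static parameters on the trait: a functional method of a generic trait.
on_trait = functional_method_name(
    "Traitname", ["TP1", "TP2"], "Methodname", [],
    [("a1", "t1"), ("a2", "t2"), "self"])

# Static parameters on the method (assumed placement): a generic functional
# method of a non-generic trait.
on_method = functional_method_name(
    "Traitname", [], "Methodname", ["TP1", "TP2"],
    [("a1", "t1"), ("a2", "t2"), "self"])
```

The two strings differ only in where the bracketed list sits. Because the trait name is followed by a dot, a preceding package name would need a separator or convention that keeps "Package.Trait.method" from being misread, which is the ambiguity the rule above warns about.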
