> module Modules(computeInsOuts,inscope) where
>
> import Relations
> import NamesEntities
> import ModSysAST
> import Maybe (isNothing)
> import Sets

The semantics of imports and exports
====================================
\label{Sem}

Now we are ready to present a method for computing the in-scope
and export relations of the modules in a program.  The process
proceeds in two stages.  In this section we compute the relations, ignoring
any errors that might occur (e.g. ambiguous or undefined exports).
In Section~\ref{Err}, we check that each computed relation satisfies a number
of additional requirements and produce appropriate error messages if
any problems are detected.  This approach simplifies the specification
as we can concentrate on a single issue at a time.  It also results in
better error messages being reported to the users of our prototype system,
because the module system analysis can continue, even in the presence
of ambiguities. In Section~\ref{SemProg} we glue everything together.

The computation phase, described in the remainder of this section,
is itself split in three parts: first we describe 
how to export/import a single entity, next we present how to combine the 
meaning of entity specifications to obtain the meaning of complete import
and export specifications. Finally, we describe how 
to handle mutually recursive modules.


Importing or exporting an entity
--------------------------------

We already noted the similarity in the abstract syntax for exporting
and importing an entity---they both made use of the #EntSpec# data structure.
It is therefore not surprising that those two cases are handled in
essentially the same manner.  Our goal is to determine which
name-entity pairs in an in-scope or export relation satisfy
a certain #EntSpec# specification.  The function #mEntSpec# formalizes 
this process.

> mEntSpec :: (Ord j,ToSimple j) => 
>   Bool ->           -- is it a hiding import?
>   Rel j Entity ->   -- the original relation
>   EntSpec j         -- the specification
>     -> Rel j Entity -- the subset satisfying the specification
>
> mEntSpec isHiding rel (Ent x subspec) =
>   unionRels [mSpec, mSub]
>   where
>   mSpec   = restrictRng consider (restrictDom (== x) rel)
>   allSubs = owns `unionMapSet` rng mSpec
>   subs    = restrictRng (`elementOf` allSubs) rel
>   mSub    = 
>     case subspec of
>       Nothing        -> emptyRel
>       Just AllSubs   -> subs
>       Just (Subs xs) -> 
>         restrictDom ((`elem` xs) . toSimple) subs
>
>   consider
>     | isHiding && isNothing subspec = const True
>     | otherwise                     = not . isCon

\begin{ex} Before we describe #mEntSpec# in detail,
we present an example of how it is going to be used.  Consider the module:

|  module M (f, M.List(..), Show(showList)) where
|  import A(g)
|  ...

The function #mEntSpec# is applied to each of the three entries 
in the export list, to determine the subset of the in-scope relation each of 
them matches.  Similarly, it is applied to the entry #g# of the import
list, to determine which of #A#'s exports come in scope.
\end{ex}

The #EntSpec# structure consists of two components:
the ``main'' specification, and possibly a subordinate specification.
The meaning of the entire specification is the union of its components.
The subset of #rel# which matches the ``main'' specification is #mSpec#.
It is computed by restricting the domain of the relation to contain only
names matching the specification.  It turns out we might need to restrict
the range of the relation as well, the reasons for this are discussed shortly.
Typically (but not always!) #mSpec# will be a relation only relating a single
name to an entity, representing the fact that the specification is unambiguous.

Continuing with the example above, #Show# is the ``main'' specification,
and #mSpec# would be:
\begin{align*}
Show              &\mapsto \ent{Show}{Prelude}    
\end{align*}

The meaning of the subordinate specification depends on the meaning
of the ``main'' specification.  The set #allSubs# contains
all subordinate entities of all possible interpretations in #mSpec#.
To compute the subordinates that are in-scope/exported, we restrict #rel#
so that names may only refer to entities in #allSubs# --- the result is #subs#.
In our example #allSubs# would contain the entities of the methods in the
 #Show# class:  
$$
\{ \ent{show}{Prelude}, \ent{showsPrec}{Prelude}, \ent{showList}{Prelude}\}
$$
Now suppose that \ent{showsPrec}{Prelude} is not in scope, perhaps because
the programmer explicitly hid it, when importing the #Prelude#.  
Then #subs# would be:
\begin{align*}
show              &\mapsto \ent{show}{Prelude}     \\
Prelude.show      &\mapsto \ent{show}{Prelude}     \\
showList          &\mapsto \ent{showList}{Prelude} \\
Prelude.showList  &\mapsto \ent{showList}{Prelude}
\end{align*}

Finally, we need to compute what part of #subs# matches the subordinate
specification.  If there was no specification, the result is the empty
relation, if the specification was of the form #AllSubs# we return #subs#,
as by construction it contains all subordinates which are in-scope or exported.
Finally, if a programmer specified an explicit list of subordinates, we need
to restrict #subs# so that it only contains names in the explicitly
provided list.  Since subordinate specifications contain only simple
names (i.e. of type #Name#), and in-scope relations contain possibly
qualified names (i.e. of type #QName#) we first strip any qualifiers
using the method #toSimple#.  This has the effect that a 
subordinate will be exported if it is available in scope with either 
qualified _or_ unqualified name \cite[Section 5.2]{Haskell98}.
In the ongoing example, we had a subordinate name list, containing one name:
 #``showList''#.  So the restricted #subs# relation will be:
\begin{align*}
showList          &\mapsto \ent{showList}{Prelude} \\
Prelude.showList  &\mapsto \ent{showList}{Prelude}
\end{align*}

Next we describe a quirk in the Haskell module system, which
gives rise to the #isHiding# parameter and the auxiliary predicate
 #consider#.  The entities in Haskell may be grouped in two 
non-interchangeable groups: classes and type constructors on the one side,
all other entities on the other.
In the body of a module, it is always possible
to determine which type of entity a name refers to.
A name may refer to two different entities without the risk
of an ambiguity.  A problem occurs with import and export lists, 
as they may not provide enough context.

\begin{ex} It is quite common to use the same name for both a type
and a value constructor, as types and values do not mix:

|  module A (Env) where
|  newtype Env a = Env [(String,a)]

There is no context in the export list to indicate if the programmer intended
to export the type constructor #Env#, the value constructor #Env#, 
or perhaps both. 
\end{ex}

To avoid the potential ambiguity illustrated in the example, Haskell 98
uses the following strategy to decide what is to be imported/exported.

Because only classes and type constructors may ``own'' other
entities, the presence of a subordinate name list indicates that
the programmer is referring to the type or class in scope.
The situation is more complicated if the name does not have a subordinate
list, as then there are two different policies, depending on where
the name occurs.  

The first policy is that names starting with a capital
letter always refer to types or classes.  It is used for names occurring
in the export list of a module, and in non-hiding imports.  So in particular,
in the above example, only the type constructor #Env# will be exported,
and not the value constructor.  This means that in order to export/import
a value constructor, a programmer has to include it in the subordinate
list of the relevant type constructor.  A consequence of this is that
it is not possible to export just a value constructor without
its type.  In practice this is not a very big problem.

The second policy is used for ``hiding'' imports, and it says that 
capital names may refer to _both_ types/classes and value constructors.
One reason for this difference is that it is sometimes useful to _hide_
just a value constructor without hiding its type.  If the first
policy was used for hiding imports this would not be possible.  We note
that if a name refers to both a type and a value constructor, both of
them are hidden.

A (non Haskell 98) alternative to having two separate policies is to 
have a single more flexible policy that applies in both cases.  
Later in the paper (Section \ref{conclusions}) we will describe a simple
modification to the first policy to achieve this.

Here is an example illustrating what the policies do.

\begin{ex}
| import A (Env)          -- import only the type 
| import A hiding (Env)   -- hide the type and the value
| import A hiding (Env()) -- hide only the type 
\end{ex}

To implement this rule we defined the auxiliary predicate #consider#.  The 
parameter #isHiding# tells us if we are in a hiding import (the special case).
If we are, then we consider all entities as valid interpretations 
for the name in the ``main'' specification.   However if we are using
 #mEntSpec# in an export list or a normal import, then we do not consider value
constructors.

\vspace{6mm}
Export relations
----------------

Now that we know how to handle a single entry in the export list,
we are ready to compute the export relation of a module.   This
is the task of the function #exports#.

> exports :: Module -> Rel QName Entity -> Rel Name Entity
> exports mod inscp = 
>   case modExpList mod of
>     Nothing -> modDefines mod
>     Just es -> getQualified `mapDom` unionRels exps
>       where
>       exps = mExpListEntry inscp `map` es

The parameter #inscp# models the in-scope relation of the module.
It is necessary, as the interface of a module is essentially a subset
of #inscp#.  If the programmer omitted the export specification of
a module, we just export the locally defined entities by using the #modDefines#
field of the module under consideration.  Alternatively if an explicit
export list was present,  we take
the union of the meanings of all listed entries.
To obtain a proper export relation we remove all qualifiers.
Note that after removing the qualifiers (or even before that)
we might end up with a relation containing ambiguous names.
This is not valid in Haskell, but we will delay detection of such invalid 
solutions until a later pass.

We have already seen that there are two forms of specification
that may appear in an export list.  If we see an entity specification,
we just use #mEntSpec# to determine the subset of #inscp# which matches it.
The other possibility is that we see an export of the form
 #module M#.  In this case, the Haskell 98 report 
\cite[Section 5.2, item 5]{Haskell98} states that the result is the subset 
of #inscp# containing precisely those entities, 
which may be named with _both_ some simple name #x# _and_ 
a qualified name #M.x# 

> mExpListEntry :: 
>   Rel QName Entity -> ExpListEntry -> Rel QName Entity
> mExpListEntry inscp (EntExp it) = mEntSpec False inscp it
> mExpListEntry inscp (ModuleExp m) = 
>   (qual m `mapDom` unqs) `intersectRel` qs
>   where
>   (qs,unqs) = partitionDom isQual inscp

 
In-scope relations
------------------

In this section we specify how to compute the in-scope relation of a module.
This is done by the function #inscope#:  

> inscope :: Module -> (ModName -> Rel Name Entity) 
>                         -> Rel QName Entity
> inscope m expsOf = unionRels [imports, locals]
>   where
>   defEnts = modDefines m
>   locals  = unionRels 
>               [mkUnqual `mapDom` defEnts,
>                mkQual (modName m) `mapDom` defEnts]
>
>   imports = 
>     unionRels $ map (mImp expsOf) (modImports m)

An entity is in scope if it is either locally defined, 
or if it is imported from another module. It is therefore necessary to 
know what the exports of other modules are.  The parameter #expsOf# 
is a function mapping module names to their exports.

Every locally defined entity may be referred to with at least two names:
the simple name used in its definition, and a qualified name, obtained
by prefixing with the name of the module.  For example if a module #A# 
defines an entity with simple name #f#, a programmer may refer to it as 
either #f# or #A.f# \cite[Section 5.5.1]{Haskell98}.  The part
of the in-scope relation containing local definitions is #locals#.

Import declarations are cumulative \cite[Section 5.3]{Haskell98}
and so the imported entities 
(#imports#) are the union of the entities imported by each declaration.

\begin{ex}
The declarations:
| import A hiding (f)
| import A (f) 
import everything from module #A#, as the first one imports everything
but #f#, while the second one imports just #f#.
\end{ex}

The function #mImp# is used to compute what entities come in scope through
a single import declaration.  
\newpage

> mImp :: (ModName -> Rel Name Entity) -> Import ->
>          Rel QName Entity 
> mImp expsOf imp
>   | impQualified imp  = qs
>   | otherwise         = unionRels [unqs, qs]
>   where
>   qs      = mkQual (impAs imp) `mapDom` incoming
>   unqs    = mkUnqual `mapDom` incoming
>
>   listed  = unionRels $
>               map (mEntSpec isHiding exps) 
>                   (impList imp)
>   incoming
>     | isHiding  = exps `minusRel` listed
>     | otherwise = listed
>
>   isHiding  = impHiding imp
>   exps      = expsOf (impSource imp)

First we define the relation #listed#, which contains exported entities 
matching _any_ of the entity specifications in the list of the import 
declaration.  Next, we compute the subset of the export relation of the 
source module, which matches the specification (#incoming#).  If we have
a normal import, #incoming# is exactly #listed#.  If however we are
dealing with a ``hiding'' import, we need to take all those entities
which are exported, but are _not_ in #listed#.

Having computed #incoming#, we now need to convert it to an
in-scope relation.  To do that, we adjust the names of the entities
according to the import declaration.  If we have a ``qualified'' import,
we introduce only qualified names for the imported entities (#qs#), otherwise
we also add the unqualified names.  The qualified names are computed by
simply qualifying all names in #incoming# with the #impAs# specification
of the declaration.

\vspace{6mm}
Recursive modules
-----------------
\label{rec-mods}

The reader might have noticed that to compute the exports of a module we
need to know what is in scope, and to compute what is in scope we need
to know certain exports.  If we allow for mutually recursive modules, then our
equations become mutually recursive.  In this section, we show how they are
solved.

The idea is that the export and in-scope relations for a
group of mutually recursive modules are computed at the same time (as they all
depend on each other).  This means that the compilation unit of a compiler
is not a single module, but a _strongly connected component_ of mutually
recursive modules.  In fact, one could process _all_ modules at once
and still get the same result, but this will not be very practical for
large systems.  For this reason, we will work with strongly connected 
components of modules, and assume that they are processed in dependency order.

The function #computeInsOuts# computes the in-scope and export
relations of the modules in a strongly connected component.
The argument #otherExps# is a function mapping module names to export relations.
It needs to be defined only for modules in earlier (dependency-wise)
strongly connected components.
\newpage

> computeInsOuts :: 
>   (ModName -> Rel Name Entity) -> [Module] ->
>    [(Rel QName Entity, Rel Name Entity)]
> computeInsOuts otherExps mods = inscps `zip` exps
>   where
>   inscps = computeIs exps
>   exps   = lfpAfter nextExps $ 
>             replicate (length mods) emptyRel
>
>   nextExps = computeEs . computeIs
>
>   computeEs is = zipWith exports mods is
>   computeIs es = map (`inscope` toFun es) mods
>
>   toFun es m   = maybe (otherExps m) (es !!) 
>                        (lookup m mod_ixs)
>   mod_ixs      = map modName mods `zip` [0..]
>
> lfpAfter f x = if fx == x then fx else lfpAfter f fx
>   where 
>   fx = f x

The function #computeInsOuts# starts by assuming that the modules in
the strongly connected component do not export anything.  It then
keeps applying the function #nextExps# to obtain successive approximations
to the exports of each of the modules until a fixed point is reached.
Thus, the export-relations in a strongly connected component are the
least fixed point of the function #nextExps#.  Similarly, the in-scope 
relations are the least fixed point of #computeIs . computeEs#, 
but using the well known fact that $fix (f . g) = f (fix (g . f))$, 
we obtain the definition of #inscps# used above.

The function #nextExps# computes the next approximation to the
exports of each module.  Given the current exports of each module,
it first determines what are the corresponding new in-scope relations,
and based on that computes the new exports.  It makes use of the
helper functions #computeEs# and #computeIs#, which are generalizations of 
 #exports# and #inscope# respectively, to work with strongly connected
components rather than just single modules.

The functions #computeEs# and #computeIs# work in a similar way: they
apply #exports# (or #inscope# respectively) to all modules in  the strongly
connected component.
The main difference between the two is that the export relation of 
a module depends only on the in-scope relation of the same module, 
while the in-scope relation depends on the export relations of many modules. 
As a result, before mapping #inscope# over the strongly
connected component, we need to compute a function that maps
module names to their respective export relations.  This is done by
 #toFun#.  If the module is in the current strongly connected component,
 the result of #toFun# just projects the appropriate export relation.
 Otherwise, we use the parameter #otherExps# to lookup the exports
of a previously processed module.

\begin{ex}
In this example we consider a module that imports itself,
and use the algorithm above to compute its in-scope and export
relations:
|  module A (B.f) where
|  import A as B
|  f = ...

\newpage
We start with an empty export relation, and 
then compute the in-scope relation:  
$$f \mapsto \ent{f}{A}, A.f \mapsto \ent{f}{A}$$ 
Note that this contains only the locally defined names, because #A#
imports only from itself, and we have assumed it does not export anything.
Because the in-scope relation does not relate #B.f# to anything, nothing
is exported again, and so we reach a fixed point.
Thus module #A# does not export anything, and it has one entity in-scope:
$\ent{f}{A}$.  This entity may be referred to by either #A.f#, or just #f#.
\end{ex}

\begin{ex}
This example is a slight variation on the previous one, two modules
are involved: #A# and #B#:
|  module A (B.f) where
|  import A as B
|  import qualified B
|  f = ...
| 
|  module B where
|  f = ...
The dependencies between those two are: #A# depends on #A# and #B#, 
and #B# does not depend on anything, so we first analyze #B#.  Since it
has no export specification, it exports all locally defined names and
nothing else, so we have the following in-scope and export relations:

\begin{tabular}{cc}
in-scope                   & exports     \\ 
$f \mapsto \ent{f}{B}, B.f \mapsto \ent{f}{B}$ & $f \mapsto \ent{f}{B}$
\end{tabular}

Next we analyze module #A#. The steps in the fixed point calculation are:

\begin{tabular}{rcc}
   & in-scope                   & exports     \\ 
1. &                            & $\emptyset$ \\
2. & $f \mapsto \ent{f}{A}, A.f \mapsto \ent{f}{A}, B.f \mapsto \ent{f}{B}$ 
   & $f \mapsto \ent{f}{B}$ \\
3. & $f \mapsto \{ \ent{f}{A}, \ent{f}{B} \}, A.f \mapsto \ent{f}{A}, 
    B.f \mapsto \ent{f}{B}$ 
   & $f \mapsto \ent{f}{B}$ 
\end{tabular}

We start with the empty export relation.  The in-scope relation we
compute contains the locally defined names and also the imports.  In this
case the imports are just $B.f \mapsto \ent{f}{B}$.  On the next iteration,
 $f \mapsto \ent{f}{B}$ gets exported.  So the next in-scope relation contains
more imports: $f \mapsto \ent{f}{B}$, and $B.f \mapsto \ent{f}{B}$, which
came in through the #import A# declaration.  These do not add anything new to
the exports of the module, so we have reached a fixed point.
\end{ex}

We defined the import and export relations of modules in terms
of the least fixed point of a certain function.  To ensure that the
algorithm above terminates, we need to ensure that this least
fixed point exists.  Now we examine this issue in more detail.  

We used lists to represent the in-scope and export relations of a group 
of modules. Since relations may be ordered by inclusion,
we may also order lists (of equal length) of relations by a point-wise
ordering.  In a similar fashion we obtain a lattice structure on the
lists of relations by using the lattice structure of relations point-wise.
The lattice obtained in this way is finite, as each module may only define 
finitely many entities and each entity may only have finitely many names.
Names get associated with entities in two ways: by local declaration,
in which case an entity receives two names (qualified and unqualified);
and by import declaration, in which case an entity receives either one or 
two names.
Since there are only finitely many local and import declarations,
an entity may only have finitely many names. 

It then follows from the Knaster-Tarski theorem \cite{Alg90}, 
that #nextExps# has a least fixed point if it is monotonic with respect to 
the above ordering.  The #inscope# and #exports# functions are monotonic
with respect to the relation ordering, as they are essentially filters that
produce larger outputs when given larger inputs.  
The same holds for #computeEs#, #computeIs#, and their composition #nextExps#.