explaining the problem, and why is not a monad

This commit is contained in:
Gabriel Scherer 2020-05-29 18:34:03 +02:00
parent eb505000ee
commit 8ec42d2b38
2 changed files with 93 additions and 13 deletions

View File

@ -12,4 +12,12 @@
author = {François Pottier},
title = {the {Inferno} library: \url{https://gitlab.inria.fr/fpottier/inferno}},
year = 2014,
}
}
@techreport{composing-monads,
title = {Composing Monads},
author = {Mark P. Jones and Luc Duponcheel},
institution ={Yale University},
year = {1993},
number = {YALEU/DCS/RR-1004},
}

View File

@ -133,25 +133,33 @@ concluded by a final ``map'' that takes all the witnesses of the
constraint-solving pieces, and combines them into a final witness for
the inferred term.
\subsection{Our sufferings} Now consider extending pair to handle not
\subsection{The problem} Consider extending this code to handle not
just binary pairs, but n-ary tuples \lstinline{Tuple us}. Instead of
a statically-known number of inference variables (2), we want to
generate one for each subterm of the list. How would you program this
with the proposed API?
generate one for each subterm of the list, to build the result
type. How would you program this with the proposed API?
If \lstinline{'a co} was an inference/elaboration \emph{monad}, it
would be natural to decompose the problem with a function of type
\lstinline{'a list -> Infer.variable list co}, or possibly
\lstinline{mapM : ('a -> 'b co) -> 'a list -> 'b list co},
and then \lstinline{bind} the result to continue. For example:
There would be a standard solution if \lstinline{'a co} was an
inference/elaboration \emph{monad}, with a fresh-inference-variable
operation \lstinline{exist : (variable * F.ty) co}. We would write
something like, with a generic combinator %
\lstinline{traverse : 'a co list -> 'a list co}:
\begin{lstlisting}
| ML.Tuple us ->
traverse (List.map (fun _ -> exists) us) >>= fun vs ->
List.mapM2 (fun (v, ty) u -> hastype u w ^& return ty) vs us >>= fun utys ->
let on_each u =
exist (fun v -> v ^& hastype u v)
<$$> fun (ty, (v, u')) -> (v, (u', ty)) in
mapM on_each us >>= fun vutys ->
let vs, utys = List.split vutys in
w --- Infer.product vs ^^
return (ML.tuple utys)
\end{lstlisting}
Instead, we have to write code in continuation-passing style:
%
But \lstinline{'a co} is \emph{not} a monad for a good reason -- we
explain this in the next section. Instead, we have to write code in
continuation-passing style:
%
\begin{lstlisting}
let rec traverse (us : 'a list) (k : variable list -> 'r co) : ((F.term * F.ty) list * 'r) co =
match us with
@ -161,13 +169,77 @@ Instead, we have to write code in continuation-passing style:
traverse us (fun vs ->
hastype u v ^&
k (v :: vs)
)) <$$> fun (ty, u', (utys, rs)) -> ((u', ty) :: utys, r)
)) <$$> fun (ty, (u', (utys, rs))) -> ((u', ty) :: utys, r)
in
traverse us (fun vs ->
w --- Infer.product vs
) <$$> fun (utys, ()) -> F.Tuple (List.combine u's tys)
\end{lstlisting}
Note: \lstinline{traverse} is called with a single continuation; it is
possible to inline this continuation in the empty-list case, and get
a simpler, less generic function, that could not be reused
elsewhere. We wish to discuss the reusable presentation in this document;
the API design improvements also help in the non-reusable case.
There are two difficulties with this API and the code we had to write:
\begin{enumerate}
\item It is much harder to figure out how to write this code than in the monadic
version, which allows a natural problem decomposition. Programmers may not know
at all how to generate dynamically many inference variables, and/or when a CPS
is necessary and how to achieve it.
\item The code, in either versions, is hard to read. In particular,
values are \emph{named} far from where they are \emph{expressed} in
the program; notice for example how \lstinline{u'} binds the witness
of \lstinline{hastype u v}, but is placed farther away in the
program. When we explain this code, we say that the
\lstinline{Infer.ty} result of \lstinline{exists} or the
\lstinline{(F.term * F.ty) list} result of \lstinline{traverse} is
``pushed on the stack''; we would rather avoid this particular style
of point-free programming.
\end{enumerate}
Our contribution is to identify, explain this problem, and propose
a partial solution: a systematic approach to programming with
``binders'' that can be explained and documented, and a different API
that lets us name terms in a more natural, local way -- at the cost of
``unpacking'' operations later on, as we will explain.
While our work is specific to Inferno (and OCaml), we believe that
this API problem arises in the more general case of an applicative
functor with \emph{quantifiers}, as \lstinline{exist} in our example.
\subsection{Not a monad} At the cost of our page limit, let us explain
why the \lstinline{'a co} type of Inferno cannot be given the
structure of a monad.
The witness \lstinline{'a} is built with knowledge of the
\emph{global} solution to the inference constraints
generated. Consider the applicative combinator%
\lstinline{pair : 'a co -> 'b co -> ('a * 'b) co}; both arguments
generate a part of the constraint, but the witnesses of \lstinline{'a}
and \lstinline{'b} can only be computed once the constraints on
\emph{both} sides are solved. For example, when inferring the ML term
\lstinline{(x, x + 1)}, the type inferred for the first component is
determined by unifications coming from the constraint of the second.
Implementing \lstinline{bind : 'a co -> ('a -> 'b co) -> 'b co}
would require building the witness for \lstinline{'a} before the
constraint arising from \lstinline{'b co} is known; this cannot be
done if \lstinline{'a} requires the final solution.
If you are abstractly inclined: internally \lstinline{'a co} is
defined as \lstinline{raw_constraint * (solution -> 'a)} it is the
composition of a ``writer'' monad that generate constraints and
a ``reader'' monad consuming the solution. For the composition
$F \circ G$ of two monads to itself be a monad, a sufficient
condition\citep{composing-monads} is that they commute:
$(F \circ G) \to (G \circ F)$ -- this suffices to define
$\mathtt{join} : (F \circ G \circ F \circ G) \to F \circ G$ from the
$\mathtt{join}$ operation of $F$ and $G$. Commuting the reader with
the writer would require reading the solution before writing the
constraint; this is not possible if we want the final solution.
\bibliography{biblio}
\end{document}