2010-07-16

State of the Lambda by Brian Goetz

Borrowed from State of the Lambda by Brian Goetz, 6 July 2010

This is an updated proposal to add lambda expressions (informally,
"closures") to the Java programming language. This sketch is built on
the straw-man proposal made by Mark Reinhold in December
2009.



  1. Background; SAM classes

    The Java programming language already has a form of closures:
    anonymous inner classes. There are a number of reasons these are
    considered imperfect closures, primarily:



    • Bulky syntax

    • Inability to capture non-final local variables

    • Transparency issues surrounding the meaning of return, break,
      continue, and 'this'

    • No nonlocal control flow operators


    It is not a goal of Project Lambda to address all of these issues.


    The standard way for Java APIs to define callbacks is to use an
    interface representing the callback method, such as:


    public interface CallbackHandler
    {
        public void callback(Context c);
    }

    The CallbackHandler interface has a useful property: it has a single
    abstract method. Many common interfaces and abstract classes have
    this property, such as Runnable, Callable, EventHandler, or
    Comparator. We call these classes SAM classes.


    The biggest pain point for anonymous inner classes is bulkiness. To
    call a method taking a CallbackHandler, one typically creates an
    anonymous inner class:


    foo.doSomething(new CallbackHandler()
    {
        public void callback(Context c)
        {
            System.out.println("pippo");
        }
    });

    The anonymous inner class here is what some might call a "vertical
    problem": five lines of source code to encapsulate a single statement.



    Astute readers will notice that the syntax used for examples in this
    document differs from that expressed in the straw-man proposal. This
    does not reflect a final decision on syntax; we are still
    experimenting with various candidate syntax options.




  2. Lambda expressions

    Lambda expressions are anonymous functions, aimed at addressing the
    "vertical problem" by replacing the machinery of anonymous inner
    classes with a simpler mechanism. One way to do that would be to add
    function types to the language, but this has several disadvantages:
    - Mixing of structural and nominal types;
    - Divergence of library styles (some libraries would continue to use
    callback objects, while others would use function types).



    So, we have instead chosen to take the path of making it easier to
    create instances of callback objects.


    Here are some examples of lambda expressions:


    { -> 42 }

    { int x -> x + 1 }

    The first expression takes no arguments, and returns the integer 42;
    the second takes a single integer argument, named x, and returns x+1.


    Lambda expressions are distinguished from ordinary statement blocks by
    the presence of a (possibly empty) formal parameter list and the ->
    token. The lambda expressions shown so far are a simplified form
    containing a single expression; there is also a multi-statement form
    that can contain one or more statements.
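
    As a rough illustration of what these two expressions stand for, the
    sketch below writes each one out as the anonymous inner class it would
    replace. The interfaces IntSource and IntTransform are hypothetical SAM
    types invented for this example (the proposal does not name them), and
    the correspondence is an approximation of the intended meaning, not a
    statement of how a compiler would translate the expression.

```java
class LambdaShapes {
    // Hypothetical SAM types, introduced only for this sketch.
    interface IntSource { int get(); }
    interface IntTransform { int apply(int x); }

    // Roughly what { -> 42 } stands for when an IntSource is expected.
    static IntSource answer() {
        return new IntSource() {
            public int get() { return 42; }
        };
    }

    // Roughly what { int x -> x + 1 } stands for when an IntTransform is expected.
    static IntTransform increment() {
        return new IntTransform() {
            public int apply(int x) { return x + 1; }
        };
    }

    public static void main(String[] args) {
        System.out.println(answer().get());        // prints 42
        System.out.println(increment().apply(41)); // prints 42
    }
}
```

    Each anonymous class needs five lines to carry one expression, which
    is the "vertical problem" the lambda form is meant to remove.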



  3. SAM conversion

    One can describe a SAM type by its return type, parameter types, and
    checked exception types. Similarly, one can describe the type of a
    lambda expression by its return type, parameter types, and exception
    types.


    Informally, a lambda expression e is convertible-to a SAM type S if
    an anonymous inner class that is a subtype of S and that declares a
    method with the same name as S's abstract method and a signature and
    return type corresponding to the lambda expression's signature and
    return type would be considered assignment-compatible with S.


    The return type and exception types of a lambda expression are
    inferred by the compiler; the parameter types may be explicitly
    specified or they may be inferred from the assignment context (see
    Target Typing, below).


    When a lambda expression is converted to a SAM type, invoking the
    single abstract method of the SAM instance causes the body of the
    lambda expression to be invoked.


    For example, SAM conversion will happen in the context of assignment:


    CallbackHandler cb = { Context c -> System.out.println("pippo") };

    In this case, the lambda expression has a single Context parameter,
    has void return type, and throws no checked exceptions, and is
    therefore compatible with the SAM type CallbackHandler.
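
    Following the informal definition above, the assignment can be read
    as shorthand for an anonymous inner class that implements the single
    abstract method. The sketch below spells that reading out. Context is
    not defined in this document, so an empty stand-in class is assumed,
    and the wrapper class and method names are made up for the example.

```java
class SamConversionSketch {
    // Stand-in for the Context type, which the text does not define.
    static class Context { }

    interface CallbackHandler {
        void callback(Context c);
    }

    // Approximately what
    //     CallbackHandler cb = { Context c -> System.out.println("pippo") };
    // means: one Context parameter, void return, no checked exceptions.
    static CallbackHandler makeHandler() {
        return new CallbackHandler() {
            public void callback(Context c) {
                System.out.println("pippo");
            }
        };
    }

    public static void main(String[] args) {
        CallbackHandler cb = makeHandler();
        // Invoking the single abstract method runs the lambda body.
        cb.callback(new Context()); // prints pippo
    }
}
```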


  4. Target Typing

    A lambda expression can only appear in a context where it will be
    converted to a variable of SAM type; the type of 'this' inside the
    lambda expression is (a subtype of) the SAM type to which the lambda
    expression is being converted. So the following code will print
    "Yes":


    Runnable r = { ->
        if (this instanceof Runnable)
            System.out.println("Yes");
    };
    r.run();
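
    The same behaviour can be observed with an anonymous inner class,
    where 'this' likewise denotes the Runnable instance and not the
    enclosing object. This is only a sketch of the semantics described
    here; the class name ThisInLambdaSketch and the helper make() are
    assumptions of the example.

```java
class ThisInLambdaSketch {
    static Runnable make() {
        return new Runnable() {
            public void run() {
                // 'this' is the anonymous Runnable, a subtype of the target type.
                if (this instanceof Runnable)
                    System.out.println("Yes");
            }
        };
    }

    public static void main(String[] args) {
        Runnable r = make();
        r.run(); // prints Yes
    }
}
```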

    The following use of lambda expressions is forbidden because it does
    not appear in a SAM-convertible context:


    Object o = { -> 42 };

    In a method invocation context, the target type for a lambda
    expression used as a method parameter is inferred by examining the set
    of possible compatible method signatures for the method being invoked.
    This entails some additional complexity in method selection;
    ordinarily the types of all parameters are computed, and then the set
    of compatible methods is computed, and a most specific method is
    selected if possible. Inference of the target type for lambda-valued
    actual parameters happens after the types of the other parameters are
    computed but before method selection; method selection then happens
    using the inferred target types for the lambda-valued parameters.


    The type of the formal parameters to the lambda expression can also be
    inferred from the target type of the lambda expression. So we can
    abbreviate our callback handler as:


    CallbackHandler cb = { c -> System.out.println("pippo") };

    as the type of the parameter c can be inferred from the target type
    of the lambda expression.


    Allowing the formal parameter types to be inferred in this way
    furthers a desirable design goal: "Don't turn a vertical problem into
    a horizontal problem." We want the reader of the code to have to
    wade through as little code as possible before arriving at the "meat"
    of the lambda expression.


    The user can explicitly choose a target type by specifying a type
    name. This might be for clarity, or might be because there are
    multiple overloaded methods and the compiler cannot correctly choose
    the target type. For example:


    executor.submit(Callable<String> { -> "foo" });
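
    ExecutorService.submit is overloaded for both Runnable and
    Callable<T>, which is why naming the target type is useful here.
    Written with an anonymous inner class, the explicitly typed call
    corresponds roughly to the following sketch. The wrapper class, the
    single-thread executor, and the exception wrapping are assumptions
    made so the example runs on its own.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ExplicitTargetSketch {
    static String submitFoo() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Naming Callable<String> selects submit(Callable<T>), not submit(Runnable).
            Future<String> result = executor.submit(new Callable<String>() {
                public String call() {
                    return "foo";
                }
            });
            return result.get();
        } catch (Exception e) {
            // Wrapped only to keep the sketch's signature free of checked exceptions.
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(submitFoo()); // prints foo
    }
}
```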

    If the target type is an abstract class, it is an open question as to
    whether we want to permit an argument list so a constructor other than
    the no-arg constructor can be used.


  5. Lambda bodies

    In addition to the simplified expression form of a lambda body, a
    lambda body can also contain a list of statements, similar to a method
    body, with several differences: the break, return, and continue
    statements are not permitted, and a "yield" statement, whose form is
    similar to the return statement, is permitted instead of a return
    statement. The type of a multi-statement lambda expression is
    inferred by unifying the type of the values yielded by the set of
    yield statements. As with method bodies, every control path through a
    multi-statement lambda expression must either yield a value, yield no
    value, or throw an exception. Expressions after a yield statement are
    unreachable.
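
    To make the role of yield concrete, the sketch below shows a
    two-path body written as an anonymous inner class, with comments
    marking where each return would become a yield. The lambda spelling
    in the comment is one plausible reading of the description above and
    is not taken from the proposal; IntTransform is a hypothetical SAM
    type.

```java
class YieldSketch {
    // Hypothetical SAM type, introduced only for this sketch.
    interface IntTransform { int apply(int x); }

    // In the proposed syntax this might read something like:
    //     { int x -> if (x < 0) yield -x; yield x; }
    static IntTransform abs() {
        return new IntTransform() {
            public int apply(int x) {
                if (x < 0) {
                    return -x; // would be "yield -x"
                }
                return x;      // would be "yield x"
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(abs().apply(-3)); // prints 3
        System.out.println(abs().apply(4));  // prints 4
    }
}
```

    Both paths yield an int, so unifying the yielded types gives int as
    the type of the lambda's result.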


    The complete syntax is given by:


    lambda-exp := "{" arg-list "->" lambda-body "}"
    arg-list := "(" args ")" | args
    args := arg | arg "," args
    arg := [ type ] identifier
    lambda-body := expression | statement-list [ ";" ]
    statement-list := statement | statement ";" statement-list

  6. Instance capture

    Once the target type of a lambda expression is determined, the body of
    a lambda expression is treated largely the same way as an anonymous
    inner class whose parent is the target type. The 'this' variable
    refers to the SAM-converted lambda (whose type is a subtype of the
    target type). Variables of the form OuterClassName.this refer to the
    instances of lexically enclosing classes, just as with inner classes.
    Unqualified names may refer to members of the SAM class (if it is a
    class and not an interface), or to members of lexically enclosing
    classes, using the same rules as for inner classes.


    For members of lexically enclosing instances, member capture is
    treated as if the references were desugared to use the appropriate
    "Outer.this" qualifier and Outer.this is captured as if it were a
    local final variable.
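
    Because these are the rules inner classes already follow, the
    desugaring can be shown with an anonymous inner class. In the sketch
    below, the field access is written with the explicit qualifier that
    an unqualified reference would be treated as using. The class
    InstanceCaptureSketch and the SAM type Action are hypothetical names
    chosen for the example.

```java
class InstanceCaptureSketch {
    // Hypothetical SAM type, introduced only for this sketch.
    interface Action { void perform(); }

    private int count = 0;

    Action incrementer() {
        return new Action() {
            public void perform() {
                // An unqualified "count++" would resolve to the same field;
                // the qualifier makes the captured enclosing instance explicit.
                InstanceCaptureSketch.this.count++;
            }
        };
    }

    int count() { return count; }

    public static void main(String[] args) {
        InstanceCaptureSketch outer = new InstanceCaptureSketch();
        Action a = outer.incrementer();
        a.perform();
        a.perform();
        System.out.println(outer.count()); // prints 2
    }
}
```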


  7. Local variable capture

    The current rules for capturing local variables of enclosing contexts
    in inner classes are quite restrictive; only final variables may be
    captured. For lambda expressions (and for consistency, probably inner
    class instances as well), we relax these rules to also allow for
    capture of effectively final local variables. (Informally, a local
    variable is effectively final if making it final would not cause a
    compilation failure.)
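
    A short sketch of the distinction, using an anonymous inner class.
    The local variable greeting is assigned exactly once, so it is
    effectively final; the final keyword is still written out so that the
    code also compiles under the existing, stricter rule. Greeter and the
    other names are hypothetical.

```java
class EffectivelyFinalSketch {
    // Hypothetical SAM type, introduced only for this sketch.
    interface Greeter { String greet(String name); }

    static Greeter greeter(String prefix) {
        // Assigned once and never modified: adding 'final' compiles,
        // so under the relaxed rule the keyword could be omitted.
        final String greeting = prefix + ", ";

        // By contrast, a local such as
        //     int n = 0; n++;
        // is not effectively final, because declaring it final would not compile.
        return new Greeter() {
            public String greet(String name) {
                return greeting + name;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(greeter("Hello").greet("Duke")); // prints Hello, Duke
    }
}
```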


    It is likely that we will not permit capture of mutable local
    variables. The reason is that idioms like this:


    int sum = 0;
    list.forEach({ Element e -> sum += e.size(); });

    are fundamentally serial; it is quite difficult to write lambda bodies
    like this that do not have race conditions. Unless we are willing to
    enforce (preferably statically) that such lambdas not escape their
    capturing thread, such a feature is likely to cause more trouble than
    it solves.
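
    The hazard is visible in the workaround that inner classes already
    permit: capturing a final reference to a mutable holder. The sketch
    below is an illustration under stated assumptions. Block and the
    static forEach helper are hypothetical stand-ins for list.forEach,
    and String.length() stands in for e.size(). The result is correct
    only because this forEach runs every call on the calling thread.

```java
import java.util.Arrays;
import java.util.List;

class SerialSumSketch {
    // Hypothetical SAM type and helper standing in for list.forEach.
    interface Block<T> { void apply(T t); }

    // Sequential by construction: each call runs on the caller's thread.
    static <T> void forEach(List<T> list, Block<T> block) {
        for (T t : list) {
            block.apply(t);
        }
    }

    static int totalLength(List<String> words) {
        // The array reference is final; the slot it holds is mutable.
        final int[] sum = { 0 };
        forEach(words, new Block<String>() {
            public void apply(String w) {
                // Unsynchronized read-modify-write: safe only when serial.
                sum[0] += w.length();
            }
        });
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("a", "bb", "ccc"))); // prints 6
    }
}
```

    If forEach were free to invoke the block from several threads, the
    update to sum[0] would be a data race, which is the concern raised
    above about capturing mutable locals.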


  8. Exception transparency

    A separate document on exception transparency proposes our strategy
    for amending generics to allow abstraction over thrown checked
    exception types.


  9. Method references

    SAM conversion allows us to take an anonymous method body and treat it
    as if it were a SAM type. It is often desirable to do the same with
    an existing method (such as when a class has multiple methods that are
    signature-compatible with Comparable.compareTo()).


    Method references are expressions which have the same treatment as
    lambda expressions (i.e., they can only be SAM-converted), but instead
    of providing a method body they refer to a method of an existing class
    or object instance.


    For example, consider a Person class that can be sorted by name or by
    age:


    class Person
    {
        private final String name;
        private final int age;

        public static int compareByAge(Person a, Person b) { ... }

        public static int compareByName(Person a, Person b) { ... }
    }

    Person[] people = ...
    Arrays.sort(people, #Person.compareByAge);

    Here, the expression #Person.compareByAge is sugar for a lambda
    expression whose formal argument list is copied from the method
    Person.compareByAge, and whose body calls Person.compareByAge. This
    lambda expression will then get SAM-converted to a Comparator.
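
    Spelled out as an anonymous inner class, the sugar described here
    would expand to roughly the following. The text elides the body of
    compareByAge, so the numeric comparison below is an assumption, as
    are the constructor, the getName() accessor, and the wrapper class
    MethodRefSketch.

```java
import java.util.Arrays;
import java.util.Comparator;

class MethodRefSketch {
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        // Accessor added so the sorted result can be inspected.
        String getName() { return name; }

        // Body is elided in the text; comparing ages numerically is assumed.
        public static int compareByAge(Person a, Person b) {
            return Integer.compare(a.age, b.age);
        }
    }

    // Roughly what Arrays.sort(people, #Person.compareByAge) would expand to:
    // the parameter list is copied from compareByAge and the body calls it.
    static void sortByAge(Person[] people) {
        Arrays.sort(people, new Comparator<Person>() {
            public int compare(Person a, Person b) {
                return Person.compareByAge(a, b);
            }
        });
    }

    public static void main(String[] args) {
        Person[] people = { new Person("Bob", 40), new Person("Al", 25) };
        sortByAge(people);
        System.out.println(people[0].getName()); // prints Al
    }
}
```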


    If the method being referenced is overloaded, it can be disambiguated
    by providing a list of argument types:


    Arrays.sort(people, #Person.compareByAge(Person, Person));

    Instance methods can be referenced as well, by providing a receiver
    variable:


    Arrays.sort(people, #comparatorHolder.comparePersonByAge);

    In this case, the implicit lambda expression would capture a final
    copy of the "comparatorHolder" reference and the body would invoke
    comparePersonByAge using that as the receiver.
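
    A sketch of that capture, again written as an anonymous inner class.
    Only the names comparatorHolder and comparePersonByAge come from the
    text; the ComparatorHolder type, its method body, the simplified
    Person class, and the byAge helper are assumptions made for the
    example.

```java
import java.util.Arrays;
import java.util.Comparator;

class ReceiverCaptureSketch {
    static class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Hypothetical type for the comparatorHolder variable.
    static class ComparatorHolder {
        int comparePersonByAge(Person a, Person b) {
            return Integer.compare(a.age, b.age);
        }
    }

    // Roughly what #comparatorHolder.comparePersonByAge would expand to.
    static Comparator<Person> byAge(ComparatorHolder comparatorHolder) {
        // Final copy of the receiver reference, captured by the inner class.
        final ComparatorHolder receiver = comparatorHolder;
        return new Comparator<Person>() {
            public int compare(Person a, Person b) {
                return receiver.comparePersonByAge(a, b);
            }
        };
    }

    public static void main(String[] args) {
        Person[] people = { new Person("Bob", 40), new Person("Al", 25) };
        Arrays.sort(people, byAge(new ComparatorHolder()));
        System.out.println(people[0].name); // prints Al
    }
}
```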


    We may choose to restrict the forms that the receiver can take, rather
    than allowing arbitrary object-valued expressions like
    "#foo(bar).moo", when capturing instance method references.


  10. Extension methods

    A separate document on defender methods proposes our
    strategy for extending existing interfaces with virtual extension
    methods.

