RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

classic Classic list List threaded Threaded
42 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Patrick Reinhart
Am 02.11.2017 um 20:08 schrieb Roger Riggs:
> Hi,
>
> It would also support a more fluent use of the APIs.  Though if we
> start down that road
> it may become littered with convenience methods.  Something to ponder
> for a bit.
>
One one hand I see your obligation, on the other hand we would open the
chance to write
more optimal implementations for a library implementers though. That
seems to me also
an considerable point here. For instance an unmodifiable List could
basically return a fast
copy of itself here using an internal constructor passing the values
from itself or a Set
implementation has not to check for duplicate entries or similar.

-Patrick

>
>
> On 11/1/2017 4:29 PM, Patrick Reinhart wrote:
>> In this case I would prefer a non static copyOf() method on the list
>> to create a unmodifiable list/set/map, where the optimal factory
>> method can be called. This would also solve the problem of a
>> concurrent implementation.
>>
>> -Patrick
>>
>>
>>> Am 01.11.2017 um 21:05 schrieb Louis Wasserman <[hidden email]>:
>>>
>>> I disagree, actually.  Collections with size zero and one are
>>> significantly more common than bigger collections.
>>>
>>> In Guava's immutable collection factories (ImmutableList.of(...)
>>> etc.), we observed a roughly exponential decline in the number of
>>> users of factory methods of each size: if N people created empty
>>> lists, ~N/2 created singleton lists, ~N/4 created two-element lists,
>>> etc.  We got noticeable pushback from RAM-sensitive customers when
>>> we eliminated some specialized singleton collection implementations.
>>>
>>> Our experience has been that specializing for N=0 and N=1 does pay
>>> off.  Probably not N=2, though?
>>>
>>> On Wed, Nov 1, 2017 at 12:45 PM Patrick Reinhart <[hidden email]
>>> <mailto:[hidden email]>> wrote:
>>>
>>>> Am 01.11.2017 um 20:25 schrieb Stuart Marks
>>>> <[hidden email] <mailto:[hidden email]>>:
>>>>
>>>> On 10/31/17 5:52 PM, Bernd Eckenfels wrote:
>>>>> Having a List.of(List) copy constructor would save an additional
>>>>> array copy in the collector Which uses 
>>>>> (List<T>)List.of(list.toArray())
>>>> The quickest way to get all the elements from the source collection
>>>> is to call toArray() on it. Some copy constructors (like
>>>> ArrayList's) simply use the returned array for internal storage.
>>>> This saves a copy, but it relies on the source collection's
>>>> toArray() implementation to be correct. In particular, the returned
>>>> array must be freshly created, and that array must be of type
>>>> Object[]. If either of these is violated, it will break the ArrayList.
>>>>
>>>> The "immutable" collections behind List.of/copyOf/etc. are
>>>> fastidious about allocating their internal arrays in order to
>>>> ensure that they are referenced only from within the newly created
>>>> collection. This requires making an „extra" copy of the array
>>>> returned by toArray().
>>>>
>>>> An alternative is for the new collection to preallocate its
>>>> internal array using the source's size(), and then to copy the
>>>> elements out. But the source’s
>>>> size might change during the copy (e.g., if it’s a concurrent
>>>> collection) so this complicates things.
>>> I think the array „overhead“ would be only for the cases of zero,
>>> one and two value implementations. That seems to me not worth of
>>> optimizing…
>>>
>>>> I think the only safe way to avoid the extra copy is to create a
>>>> private interface that can be used by JDK implementations.
>>>>
>>>> s'marks
>>> -Patrick
>


Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Patrick Reinhart
In reply to this post by Stuart Marks
Hi Stuart,

After having thought over your arguments about copyOf versus unmodifiableCopyOf length discussion (also with my colleague) I was thinking about why not create additional X.of(X) methods instead of X.copyOf(X). It would seem to me a logical enhancement in the sense of the existing API though.

-Patrick

> Am 02.11.2017 um 23:04 schrieb Stuart Marks <[hidden email]>:
>
>
>> Why not using:
>>
>>     coll.stream().collect(Collectors.toImmutableSet())
>>
>> As Collectors.toImmutableSet() is currently implemented, with serial Stream it will create a single HashSet, add all the elements to it and call Set.of(HashSet.toArray()) with it. Pretty much the same as what Tagir proposes, but the Collector could be made more efficient in the future and with it, the optimization would automatically extend to Set.copyOf()...
> This is mainly about whether Set.copyOf() is implemented in terms of Collectors.toUnmodifiableSet(), or vice-versa, which then calls Set.of(T[]) to do the actual creation. Some future optimization will probably replace both of these implementations with calls to JDK internal methods that can bypass the extra copying, so it doesn't really matter which one of these calls the other right now.
>
> s'marks
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Peter Levart
Hi Patrick,

On 11/10/2017 09:49 AM, Patrick Reinhart wrote:
> Hi Stuart,
>
> After having thought over your arguments about copyOf versus unmodifiableCopyOf length discussion (also with my colleague) I was thinking about why not create additional X.of(X) methods instead of X.copyOf(X). It would seem to me a logical enhancement in the sense of the existing API though.
>
> -Patrick

I'm sure Stuart thought of that, but decided against it because of usual
ambiguity problems that occur when two candidate .of() methods apply at
the same time, for example:

List<String> strings = List.of("a", "b", "c");

List<?> foo = List.of(strings);

What did programmer want to end up in foo, List<List<String>> or
List<String>? What did javac choose? Answers may differ.

Regards, Peter

>
>> Am 02.11.2017 um 23:04 schrieb Stuart Marks <[hidden email]>:
>>
>>
>>> Why not using:
>>>
>>>      coll.stream().collect(Collectors.toImmutableSet())
>>>
>>> As Collectors.toImmutableSet() is currently implemented, with serial Stream it will create a single HashSet, add all the elements to it and call Set.of(HashSet.toArray()) with it. Pretty much the same as what Tagir proposes, but the Collector could be made more efficient in the future and with it, the optimization would automatically extend to Set.copyOf()...
>> This is mainly about whether Set.copyOf() is implemented in terms of Collectors.toUnmodifiableSet(), or vice-versa, which then calls Set.of(T[]) to do the actual creation. Some future optimization will probably replace both of these implementations with calls to JDK internal methods that can bypass the extra copying, so it doesn't really matter which one of these calls the other right now.
>>
>> s'marks
>>

Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Funda Secgin
In reply to this post by Stuart Marks
--

Fs
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

John Rose-3
In reply to this post by Stuart Marks
Late to the party, but these lines rub me the wrong way:

@return the new {@code List}
@return the new {@code Set}
@return the new {@code Map}

The word "new" is a loaded term, which usually means
(or can be easily mistaken to mean) that a new object
identity is guaranteed.  Thus, "new" shouldn't be used
to specify the behavior of value-based classes.

Given that that the underlying objects are of VBCs,
and that we are encouraging programmers to rely on
the efficiency of chained copies, it should say something
like this instead:

@return a {@code List} containing the same elements as the given collection
@return a {@code Set} containing the same elements as the given collection
@return a {@code Map} containing the same mappings as the given map

(Or even s/return a/return an unmodifiable/.)

— John
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

John Rose-3
In reply to this post by Patrick Reinhart
On Nov 3, 2017, at 1:26 AM, Patrick Reinhart <[hidden email]> wrote:
>
>>
>> On 11/1/2017 4:29 PM, Patrick Reinhart wrote:
>>> In this case I would prefer a non static copyOf() method on the list
>>> to create a unmodifiable list/set/map, where the optimal factory
>>> method can be called. This would also solve the problem of a
>>> concurrent implementation.

Such methods would, unfortunately, be a disastrous mistake.

We are encouraging users to rely on the unmodifiability of the
value-based lists, sets, and maps returned by copyOf, the
same way they rely on the unmodifiability of Strings.  But
if we made copyOf be non-static, then an attacker (or just
a buggy program) could introduce a broken implementation
of List that returned a time-varying List.  This would break
the contract of copyOf, but it would be impossible to detect.
The result would be a less reliable copyOf API, and
eventually some TOCTTOU security vulnerabilities.
This kind of surprise mutability is red meat for security
researchers.

So it's got to be a static method, although if List were a
class we could also choose to make copyOf be a final method.

I said "unfortunately" above because I too would prefer
fluent syntax like ls.unmodifiableCopy() or ls.frozen()
instead of List.copyOf(ls), since fluent syntax reads
and composes cleanly from left to right.

I think a good way to get what we want may be to
introduce a way for selected methods of an interface
to mark themselves (somehow) as "fluent statics",
meaning that they can be invoked with their first
argument in receiver position, before the dot.
In effect, they would act like "final default" methods
but without having to damage interfaces with final
methods.

I don't have a syntax in mind for defining a "fluent
static", but I have thought for some time that allowing
such sugar for interface statics is the best way to
adapt the notion of class final methods to interfaces.

This is not easy to do, since it is a language change,
and only applies to a small (though influential) number of
methods, those which should *not* be overridable but
*should* (for some reason of API design) support virtual
(fluent, postfix, left-to-right) notation.

For more details please see JDK-8191530 and if you
think of a good use case for these thingies, let me know
so I can add it to the write-up.

— John
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

forax
Hi John,
i strongly believe that static fluent methods have no place in a blue collar language, we can do better, in fact the have already done better by introducing default method instead of extension methods like in C# or Kotlin*.

The main issue with this kind of method is that it quacks like a duck but it's not a duck, i.e. you call those static methods as virtual methods but there are no virtual methods because you can not override them.

Adding extension methods as feature is like a black hole, extension methods allow you to write clunky DSLs so it's a feature that will force the language to move to add more new features to be able to write decent DSLs (think implicit object as an example), and so on. So it sucks your design budget like a black hole.

In term of performance, the code inside a static extension methods tends to become megamorphic, in C#, Data Link (Link for Collection API) is slow because of that, in Kotlin you can workaround that by using the keyword inline which will copy the code into the callsite avoiding profile pollution.

So in my opinion, the only possible option is to introduce final default methods, i fully understand that this is non trivial to introduce this kind of methods in the VM but this discussion is a good use case for their introduction.

regards,
Rémi

* i will not criticize Kotlin designers on that, they wanted to enhance Java APIs without owning them so they had little options.

----- Mail original -----
> De: "John Rose" <[hidden email]>
> À: "Patrick Reinhart" <[hidden email]>
> Cc: "core-libs-dev" <[hidden email]>
> Envoyé: Samedi 18 Novembre 2017 07:40:05
> Objet: Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

> On Nov 3, 2017, at 1:26 AM, Patrick Reinhart <[hidden email]> wrote:
>>
>>>
>>> On 11/1/2017 4:29 PM, Patrick Reinhart wrote:
>>>> In this case I would prefer a non static copyOf() method on the list
>>>> to create a unmodifiable list/set/map, where the optimal factory
>>>> method can be called. This would also solve the problem of a
>>>> concurrent implementation.
>
> Such methods would, unfortunately, be a disastrous mistake.
>
> We are encouraging users to rely on the unmodifiability of the
> value-based lists, sets, and maps returned by copyOf, the
> same way they rely on the unmodifiability of Strings.  But
> if we made copyOf be non-static, then an attacker (or just
> a buggy program) could introduce a broken implementation
> of List that returned a time-varying List.  This would break
> the contract of copyOf, but it would be impossible to detect.
> The result would be a less reliable copyOf API, and
> eventually some TOCTTOU security vulnerabilities.
> This kind of surprise mutability is red meat for security
> researchers.
>
> So it's got to be a static method, although if List were a
> class we could also choose to make copyOf be a final method.
>
> I said "unfortunately" above because I too would prefer
> fluent syntax like ls.unmodifiableCopy() or ls.frozen()
> instead of List.copyOf(ls), since fluent syntax reads
> and composes cleanly from left to right.
>
> I think a good way to get what we want may be to
> introduce a way for selected methods of an interface
> to mark themselves (somehow) as "fluent statics",
> meaning that they can be invoked with their first
> argument in receiver position, before the dot.
> In effect, they would act like "final default" methods
> but without having to damage interfaces with final
> methods.
>
> I don't have a syntax in mind for defining a "fluent
> static", but I have thought for some time that allowing
> such sugar for interface statics is the best way to
> adapt the notion of class final methods to interfaces.
>
> This is not easy to do, since it is a language change,
> and only applies to a small (though influential) number of
> methods, those which should *not* be overridable but
> *should* (for some reason of API design) support virtual
> (fluent, postfix, left-to-right) notation.
>
> For more details please see JDK-8191530 and if you
> think of a good use case for these thingies, let me know
> so I can add it to the write-up.
>
> — John
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Brian Goetz-2
I'm with Remi on this.  

Sent from my MacBook Wheel

> On Nov 18, 2017, at 5:41 AM, Remi Forax <[hidden email]> wrote:
>
> Hi John,
> i strongly believe that static fluent methods have no place in a blue collar language, we can do better, in fact the have already done better by introducing default method instead of extension methods like in C# or Kotlin*.
>
> The main issue with this kind of method is that it quacks like a duck but it's not a duck, i.e. you call those static methods as virtual methods but there are no virtual methods because you can not override them.
>
> Adding extension methods as feature is like a black hole, extension methods allow you to write clunky DSLs so it's a feature that will force the language to move to add more new features to be able to write decent DSLs (think implicit object as an example), and so on. So it sucks your design budget like a black hole.
>
> In term of performance, the code inside a static extension methods tends to become megamorphic, in C#, Data Link (Link for Collection API) is slow because of that, in Kotlin you can workaround that by using the keyword inline which will copy the code into the callsite avoiding profile pollution.
>
> So in my opinion, the only possible option is to introduce final default methods, i fully understand that this is non trivial to introduce this kind of methods in the VM but this discussion is a good use case for their introduction.
>
> regards,
> Rémi
>
> * i will not criticize Kotlin designers on that, they wanted to enhance Java APIs without owning them so they had little options.
>
> ----- Mail original -----
>> De: "John Rose" <[hidden email]>
>> À: "Patrick Reinhart" <[hidden email]>
>> Cc: "core-libs-dev" <[hidden email]>
>> Envoyé: Samedi 18 Novembre 2017 07:40:05
>> Objet: Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map
>
>>> On Nov 3, 2017, at 1:26 AM, Patrick Reinhart <[hidden email]> wrote:
>>>
>>>>
>>>>> On 11/1/2017 4:29 PM, Patrick Reinhart wrote:
>>>>> In this case I would prefer a non static copyOf() method on the list
>>>>> to create a unmodifiable list/set/map, where the optimal factory
>>>>> method can be called. This would also solve the problem of a
>>>>> concurrent implementation.
>>
>> Such methods would, unfortunately, be a disastrous mistake.
>>
>> We are encouraging users to rely on the unmodifiability of the
>> value-based lists, sets, and maps returned by copyOf, the
>> same way they rely on the unmodifiability of Strings.  But
>> if we made copyOf be non-static, then an attacker (or just
>> a buggy program) could introduce a broken implementation
>> of List that returned a time-varying List.  This would break
>> the contract of copyOf, but it would be impossible to detect.
>> The result would be a less reliable copyOf API, and
>> eventually some TOCTTOU security vulnerabilities.
>> This kind of surprise mutability is red meat for security
>> researchers.
>>
>> So it's got to be a static method, although if List were a
>> class we could also choose to make copyOf be a final method.
>>
>> I said "unfortunately" above because I too would prefer
>> fluent syntax like ls.unmodifiableCopy() or ls.frozen()
>> instead of List.copyOf(ls), since fluent syntax reads
>> and composes cleanly from left to right.
>>
>> I think a good way to get what we want may be to
>> introduce a way for selected methods of an interface
>> to mark themselves (somehow) as "fluent statics",
>> meaning that they can be invoked with their first
>> argument in receiver position, before the dot.
>> In effect, they would act like "final default" methods
>> but without having to damage interfaces with final
>> methods.
>>
>> I don't have a syntax in mind for defining a "fluent
>> static", but I have thought for some time that allowing
>> such sugar for interface statics is the best way to
>> adapt the notion of class final methods to interfaces.
>>
>> This is not easy to do, since it is a language change,
>> and only applies to a small (though influential) number of
>> methods, those which should *not* be overridable but
>> *should* (for some reason of API design) support virtual
>> (fluent, postfix, left-to-right) notation.
>>
>> For more details please see JDK-8191530 and if you
>> think of a good use case for these thingies, let me know
>> so I can add it to the write-up.
>>
>> — John

Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

John Rose-3
On Nov 18, 2017, at 11:10 AM, Brian Goetz <[hidden email]> wrote:
>
> I'm with Remi on this.  

Which suggestion of his two?  Blue-collar language with bad DSLs,
or final defaults?

— John
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Brian Goetz-2
Use-site static extension methods = bad.  

Sent from my MacBook Wheel

> On Nov 18, 2017, at 11:12 AM, John Rose <[hidden email]> wrote:
>
>> On Nov 18, 2017, at 11:10 AM, Brian Goetz <[hidden email]> wrote:
>>
>> I'm with Remi on this.  
>
> Which suggestion of his two?  Blue-collar language with bad DSLs,
> or final defaults?
>
> — John
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

John Rose-3
On Nov 18, 2017, at 11:16 AM, Brian Goetz <[hidden email]> wrote:
>
> Use-site static extension methods = bad.  

Right; that wasn't on the table from either Remy or me.
All def-site stuff, so the designer of the defining interface
has full control of the DSL.  JDK-8191530 is def-site extension
methods, a compromise that meets most or all of our requirements.

(Side note:  We could enable third-party DSL stuff with value types
and specialization by using the value object as a view, and the
real type of the extended object/value as a generic parameter.
The value type acts just like a C++ smart pointer and can warp
the original API around however it likes.)


> Sent from my MacBook Wheel
>
> On Nov 18, 2017, at 11:12 AM, John Rose <[hidden email] <mailto:[hidden email]>> wrote:
>
>> On Nov 18, 2017, at 11:10 AM, Brian Goetz <[hidden email] <mailto:[hidden email]>> wrote:
>>>
>>> I'm with Remi on this.  
>>
>> Which suggestion of his two?  Blue-collar language with bad DSLs,
>> or final defaults?
>>
>> — John

Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

John Rose-3
In reply to this post by Stuart Marks
On Oct 30, 2017, at 6:50 PM, Stuart Marks <[hidden email]> wrote:
>
> (also includes 8184690: add Collectors for collecting into unmodifiable List, Set, and Map)

Now I'm going to be picky about the names of the Collectors;
please bear with me a moment.  Consider `toUnmodifiableList`
and its two cousins (`toUSet`, `toUMap`).

The most natural names `toList` (etc.) are already taken.
They mean "do whatever is most useful for the collector,
perhaps returning unmodifiable or modifiable results".

So the problem here is giving the user the option to say,
"no, I really want a value-based-class-style list here; I know
you can do this for me".

`Collectors.toList` can't ever be upgraded in place, since
the programmer is getting a Collector's Choice result,
and programmers will be inadvertently relying on whatever
that (undocumented) choice happens to be.

I think this makes the name `toList` significantly less useful,
and I for one will never want to use `toList` again, once
`toUnmodifiableList` is available.

Meanwhile, `toUnmodifiableList` is so ugly to the ear,
eye, and fingers that it will be less used just because of
its long spelling.

Users should be allowed to code as if unmodifiability is a
common practice, not something that must be re-announced
every time I make a new temporary list.  That's why `copyOf`
and not `unmodifiableCopyOf` is the right move, and
`toUnmodifiableList` is not.

(See also Gafter's blog[1] on making NEW FEATURES look NEW.
The NEW UNMODIFIABILITY will soon cease to be new but in
its present form it will shout forever like an aging rocker.)

To get out of this bind, I suggest picking a shorter name that
emphasizes that the result is the _elements_ in the list, not the
list object per se.  This is a VBC[2] move, one which we need to
learn to make more often, especially as value types arrive.
So I suggest these names for the collectors:  `elements` (for
a list), `elementSet`, and `elementMap`.  Same signatures,
of course.

Another approach to this, which recycles the old names but
doesn't have the VBC vibe, is to create a second set of
factories for list/set/map collectors that look just like the
old ones, but behave more predictably.

For example, one might try to add methods to `Collector`
named `asModifiable` and `asUnmodifiable`, so people can
write `toList().asModifiable()`.  I don't think this works, nor
is it very pretty (looks too "NEW"), although it provides a
more principled fix to the Collector's Choice problem.

If we go forward with the change as written, I think people will
only want to remember the one `toList` and forget about his ugly
brother.  Only zealots like me will remember to add the extra
incantation ("NEW!  now Unmodifiable!").

Another, better way to recycle the name `toList` is to provide
`Collectors.Modifiable` and `Collectors.Unmodifiable`, and then
have users import `Collectors.Unmodifiable.toList`.  This at least
would move the shouting up to the static imports list and out of
real code.

But I think `toList` is broken and should be avoided.  So I
think it should be `elements`, `elementSet`, and `elementMap`.

Even if you think that unmodifiability should be mentioned every
time a collector collects some stuff, observe that `toUnmodifiableList`
is not a strong parallel with the older API point, `Collections.unmodifiableList`.
That guy creates a _view_ while `copyOf` and `elements` (my name)
create non-view unmodifiable (VBC, shallowly immutable) lists.
So `Collectors.toUnmodifiableList` is both ugly and ill-sorted with `copyOf`.

— John

[1]: http://gafter.blogspot.com/2017/06/making-new-language-features-stand-out.html
[2]: https://docs.oracle.com/javase/8/docs/api/java/lang/doc-files/ValueBased.html

Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

John Rose-3
On Nov 18, 2017, at 7:34 PM, John Rose <[hidden email]> wrote:

>
> On Oct 30, 2017, at 6:50 PM, Stuart Marks <[hidden email]> wrote:
>>
>> (also includes 8184690: add Collectors for collecting into unmodifiable List, Set, and Map)
>
> Now I'm going to be picky about the names of the Collectors;
> please bear with me a moment.  Consider `toUnmodifiableList`
> and its two cousins (`toUSet`, `toUMap`).
>
> The most natural names `toList` (etc.) are already taken.
> They mean "do whatever is most useful for the collector,
> perhaps returning unmodifiable or modifiable results".

Let me adjust my position here, FTR.  I am now aware (thanks
Brian) that `toList` is not broken but intentionally under-specified,
pending future changes.  It is my personal hope that the future
changes will specify that the result of `toList` is safely publishable
and an unmodifiable non-view.  This issue is tracked as JDK-8180352.

(Logically possible alternatives would seem to include an otherwise
unconstrained mutable list, an ArrayList, or a continuation of the
"Chef's Choice" policy in effect today.  I suspect we could find
advocates for all of those positions, as evidenced by comments
on JDK-8180352.  I just added mine for the record.)

If at some point in the future `toList` does produce the same
kind of safe list as `List.of`, then I won't have anything to be
picky about.  Other folks can use `toCollection(ArrayList::new)`
or some other explicit op-in for mutability.

For now, a security-conscious user like me can work with
`Collections.toUnmodifiableList`.  That API point will (I hope)
have a short career, ending when `toList` does something at
least as useful.

If at some point in the future `toList` switches to guarantee
the mutable ArrayList (as it seems to supply today), then the
hope for a simple and safe `toList` will be dashed, and we
will have to look for something else that is explicitly oriented
towards value based classes, such as `Collectors.valueList`
and/or `Stream.values`.

Value-based classes are an important "attractor" for API
design, because they can be simultaneously safe and
performant, compared to Java arrays and ArrayList.

(The performance comes from the elimination of copies
under the hood, as well as structure flattening, optimizations
which potential modification makes impossible.  The safety
comes from reduction of difficult-to-find TOCTTOU attack
surfaces, as beyond the usual claimed benefits of "FP style".)

When we get to Valhalla value types, there will be many
more value-based classes in the world, since a value
type cannot be anything other than a value-based class.
Whatever we do with `toList` in the future, it should take
into account value-based usages.  They are here and
more are coming.

One more point:  When programming with value-based
classes, the container is almost irrelevant, and the contents
are the whole story.  Thus, API points which return value-based
multiple values should probably not mention the container
type (List) unless there is a real danger of ambiguity.
If I have a tree node type and I want to use a value-based
List to encode its children, it should look like this:

   List<Node> children() { return this.children; }

or this:

   List<Node> children() { return List.copyOf(this.children); }

not these:

   List<Node> childrenList() ...
   List<Node> listOfChildren() ...
   List<Node> childrenUnmodifiableList() ...
   private List<Node> secretChildrenArray() ...

(etc.)

An API in the value-based style could say, in a class header or
package intro page, that collection values are value-based if not
otherwise specified, and then simply use unmodifiable building
blocks everywhere.

In any case, I think the de-emphasis of container identity enabled
by value-based types provides a subtle but strong push on API
design, towards plural nouns, and away from explicit discussion
of container details.

— John

P.S. Ten years ago I designed the MethodType API, which talks
about a single return type and multiple parameter types.  The
corresponding API points are returnType and (not parameterTypes
but) parameterList *and* parameterArray.  Because at the time
arrays and lists were both heavily used carriers for multiple values,
we had to focus on the box types (List, Array).  Also we didn't
have VBC rules yet.  In the future I hope similar API designs can
(a) ignore legacy arrays, and (b) just use VBC lists, and then
(c) be a little less noisy about how the data is carried around.

P.P.S. Fully retiring arrays will require dislodging them from
their favored position with varargs and similar spots in the stack.

Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

John Rose-3
In reply to this post by Brian Goetz-2
On Nov 18, 2017, at 8:10 AM, Brian Goetz <[hidden email]> wrote:

>
> I'm with Remi on this.  
>
> Sent from my MacBook Wheel
>
>> On Nov 18, 2017, at 5:41 AM, Remi Forax <[hidden email] <mailto:[hidden email]>> wrote:
>>
>> Hi John,
>> i strongly believe that static fluent methods have no place in a blue collar language
>> ...

>> So in my opinion, the only possible option is to introduce final default methods, i fully understand that this is non trivial to introduce this kind of methods in the VM but this discussion is a good use case for their introduction.

We have four choices on the table with respect to the occasional
need for securable API points in fluent APIs:

0. Blue collar language:  Pick only one of fluency or security.

1. Secure a default method by marking it final.

2. Secure a fluent API point by binding to a static in the same type.

4. Extension methods:  Anybody can "import" new statics into any type.

I think #0 hurts security, which is why I keep objecting to it.

Brian thinks #1 puts too many restrictions on implementors.

Although it would seem to allow everybody to do whatever they want,
#4 is not a fit for Java.  APIs in java.base are carefully curated,
and allowing third parties to add "nice" methods would interfere
with that curation.

Option #2 has the known good properties of C# extension methods
(decoupling from receiver dispatch, natural notation), without the known
bad properties of C# extension methods (lack of curation).

So #2 is the least ugly solution for an ugly problem, except possibly #1
Which I would accept also.  I would also be glad to see a #5 that would
please everybody.

— John

P.S. Security is a big concern for us developers of java.base.  And it is
surely a concern for everyone else, at least in part.  If you program
behind only a firewall, consider that program integrity and robustness
are corollaries of security.  Your past self and teammates are throwing
buggy code at your present self; you want to be secure from that
even if you don't fear nefarious attackers.  When we secure the
Java stack from nefarious attackers we are also preventing large
numbers of unintentional bugs.

Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

forax
> De: "John Rose" <[hidden email]>
> À: "Brian Goetz" <[hidden email]>, "Rémi Forax" <[hidden email]>
> Cc: "core-libs-dev" <[hidden email]>
> Envoyé: Mercredi 22 Novembre 2017 04:31:52
> Objet: Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set,
> Map

> On Nov 18, 2017, at 8:10 AM, Brian Goetz < [ mailto:[hidden email] |
> [hidden email] ] > wrote:

>> I'm with Remi on this.

>> Sent from my MacBook Wheel

>>> On Nov 18, 2017, at 5:41 AM, Remi Forax < [ mailto:[hidden email] |
>>> [hidden email] ] > wrote:

>>> Hi John,
>>> i strongly believe that static fluent methods have no place in a blue collar
>>> language
>>> ...
>>> So in my opinion, the only possible option is to introduce final default
>>> methods, i fully understand that this is non trivial to introduce this kind of
>>> methods in the VM but this discussion is a good use case for their
>>> introduction.

> We have four choices on the table with respect to the occasional
> need for securable API points in fluent APIs:

> 0. Blue collar language: Pick only one of fluency or security.

> 1. Secure a default method by marking it final.

> 2. Secure a fluent API point by binding to a static in the same type.

> 4. Extension methods: Anybody can "import" new statics into any type.

> I think #0 hurts security, which is why I keep objecting to it.

> Brian thinks #1 puts too many restrictions on implementors.

The whole point of final on methods is to add security and as a side effect, it bother the implementors. Security always bother people that doesn't have to handle security issues.
So you can have security even in a blue collar language.

I do not see the point to add a new semantics if you have a way to solve the current issue by applying a know semantics 'final' on an already existing feature 'default method'.

> Although it would seem to allow everybody to do whatever they want,
> #4 is not a fit for Java. APIs in java.base are carefully curated,
> and allowing third parties to add "nice" methods would interfere
> with that curation.

> Option #2 has the known good properties of C# extension methods
> (decoupling from receiver dispatch, natural notation), without the known
> bad properties of C# extension methods (lack of curation).

> So #2 is the least ugly solution for an ugly problem, except possibly #1
> Which I would accept also.

#2 also works by accident, the fact that by inheritance you can see static method is ugly.
The lambda EG has made clear that the fact that you can see a static method by inheritance was a mistake by making static methods in interface non visible by inheritance.

> I would also be glad to see a #5 that would please everybody.

or a #3, there is no #3 in your list.

> — John

Rémi

> P.S. Security is a big concern for us developers of java.base. And it is
> surely a concern for everyone else, at least in part. If you program
> behind only a firewall, consider that program integrity and robustness
> are corollaries of security. Your past self and teammates are throwing
> buggy code at your present self; you want to be secure from that
> even if you don't fear nefarious attackers. When we secure the
> Java stack from nefarious attackers we are also preventing large
> numbers of unintentional bugs.

I think everybody on this list agree about that point.
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Stephen Colebourne-2
In reply to this post by John Rose-3
On 19 November 2017 at 03:34, John Rose <[hidden email]> wrote:
> Meanwhile, `toUnmodifiableList` is so ugly to the ear,
> eye, and fingers that it will be less used just because of
> its long spelling.

Many developers are already using Google Guava which has added
toImmutableList() and friends. More widely, we use the toXxx() form
for all the collectors we've written (in OpenGamma Strata). As such, I
think the toXxx() naming is well established at this point, and
toUnmodifiableList() will be fine.

Stephen
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

forax
I think i prefer toImmutableList() than toUnmodifiableList() because the List is truly immutable and not an unmodifiable proxy in front of a mutable List (like Collections.unmodifiableList() does).

Rémi

----- Mail original -----
> De: "Stephen Colebourne" <[hidden email]>
> À: "core-libs-dev" <[hidden email]>
> Envoyé: Mercredi 22 Novembre 2017 16:42:56
> Objet: Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

> On 19 November 2017 at 03:34, John Rose <[hidden email]> wrote:
>> Meanwhile, `toUnmodifiableList` is so ugly to the ear,
>> eye, and fingers that it will be less used just because of
>> its long spelling.
>
> Many developers are already using Google Guava which has added
> toImmutableList() and friends. More widely, we use the toXxx() form
> for all the collectors we've written (in OpenGamma Strata). As such, I
> think the toXxx() naming is well established at this point, and
> toUnmodifiableList() will be fine.
>
> Stephen
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Stuart Marks
In reply to this post by John Rose-3
On 11/17/17 9:43 PM, John Rose wrote:

> Late to the party, but these lines rub me the wrong way:
>
> @return the new {@code List}
> @return the new {@code Set}
> @return the new {@code Map}
>
> The word "new" is a loaded term, which usually means
> (or can be easily mistaken to mean) that a new object
> identity is guaranteed.  Thus, "new" shouldn't be used
> to specify the behavior of value-based classes.
>
> Given that that the underlying objects are of VBCs,
> and that we are encouraging programmers to rely on
> the efficiency of chained copies, it should say something
> like this instead:
>
> @return a {@code List} containing the same elements as the given collection
> @return a {@code Set} containing the same elements as the given collection
> @return a {@code Map} containing the same mappings as the given map
>
> (Or even s/return a/return an unmodifiable/.)

OK, per your other message, we'll stick with
Collectors.toUnmodifiableList/Set/Map, in the hope that in the future we can
work on improving various incarnations of toList() to have more desirable
properties.

Meanwhile, I agree that "@return the new {@code List}" is wrong and I'll adjust
it (respectively, Set and Map).

Another point from our discussions is that the copyOf() methods should get an
@implNote that says they'll try not to make unnecessary copies.

Also included are some markup changes and a small Set.copyOf implementation
change suggested upthread.

Final(?) webrev:

     http://cr.openjdk.java.net/~smarks/reviews/8177290/webrev.3/

Specdiff:

 
http://cr.openjdk.java.net/~smarks/reviews/8177290/specdiff.3/overview-summary.html

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Stuart Marks
In reply to this post by forax
On 11/22/17 8:45 AM, Remi Forax wrote:
> I think i prefer toImmutableList() than toUnmodifiableList() because the List
> is truly immutable and not an unmodifiable proxy in front of a mutable List
> (like Collections.unmodifiableList() does).

Immutability is like wine.

If you put a spoonful of wine in a barrel full of sewage, you get sewage. If you
put a spoonful of sewage in a barrel full of wine, you get sewage.
(Schopenhauer's Law of Entropy)

Similarly, if you add a little immutability to something mutable, you get
mutability. And if you add a little mutability to something immutable, you get
mutability.

To get the desired properties of immutability (e.g., thread-safety, or the
ability to hand around references safely without making defensive copies), it's
insufficient for a collection to be "immutable"; you also have to make some
statements about the contents. It's thus more precise to say that a collection
is unmodifiable, and to provide a clear definition of an unmodifiable collection.

Although unmodifiable collections have been around since the collections
framework was introduced, oddly enough the concept has never had a good
definition. The current proposal includes definitions for unmodifiability,
unmodifiable collections, and unmodifiable views, and it clearly specifies which
APIs return unmodifiable views and which return unmodifiable collections.

s'marks


Reply | Threaded
Open this post in threaded view
|

Re: RFR(m): 8177290 add copy factory methods for unmodifiable List, Set, Map

Martin Buchholz-3
Guava's "immutable collections" are very popular
https://github.com/google/guava/wiki/ImmutableCollectionsExplained
and it's not a good idea to fight their advertised notion of "immutable".

No generic container class can promise s'marks-style immutability (until
valhalla perhaps?) so it's not that useful a concept in this domain.
We like to think of Optional as an immutable value based class but you
can't stop anyone who really wants to mutate the contained element from
creating an Optional<AtomicReference<Object>>

We could do a better job of clarifying consequences of element mutation.
E.g. do we ever say that hash based collections/maps are broken if elements
are ever mutated in such a way that their hash code changes while they are
in the collection/map?
123