RFR(s): 8060192: Add default method Collection.toArray(generator)

classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|

RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks
Hi all,

Please review this small enhancement to add a default method
Collection.toArray(generator) that takes a function that creates the destination
array. This is analogous to Stream.toArray(generator).

Bug:

     https://bugs.openjdk.java.net/browse/JDK-8060192

Webrev:

     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/

Thanks,

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Martin Buchholz-3
+    // no spec changes relative to supertype
+    public <T> T[] toArray(IntFunction<T[]> generator) {

You probably at least need @inheritDoc for the unchecked exception throws
(as I've forgotten many times)

On Mon, Dec 4, 2017 at 7:20 PM, Stuart Marks <[hidden email]>
wrote:

> Hi all,
>
> Please review this small enhancement to add a default method
> Collection.toArray(generator) that takes a function that creates the
> destination array. This is analogous to Stream.toArray(generator).
>
> Bug:
>
>     https://bugs.openjdk.java.net/browse/JDK-8060192
>
> Webrev:
>
>     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/
>
> Thanks,
>
> s'marks
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Martin Buchholz-3
The needToWorkAround6260652 changes ought to be in a separate changeset.

The biggest question is whether Collection.toArray(generator) pulls its
weight, especially in view of
https://shipilev.net/blog/2016/arrays-wisdom-ancients.

I rarely want to dump elements into a typed array.  Dumping into Object[]
with toArray() is just fine for me (but I'm a biased core library
developer).
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Martin Buchholz-3
On Mon, Dec 4, 2017 at 8:26 PM, Martin Buchholz <[hidden email]> wrote:

>
> The biggest question is whether Collection.toArray(generator) pulls its
> weight, especially in view of https://shipilev.net/blog/
> 2016/arrays-wisdom-ancients.
>

Oh wait - Aleksey actually links to this bug!  Sounds like he would be the
most qualified reviewer!
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Remi Forax
In reply to this post by Martin Buchholz-3
Hi Martin,

----- Mail original -----
> De: "Martin Buchholz" <[hidden email]>
> À: "Stuart Marks" <[hidden email]>
> Cc: "core-libs-dev" <[hidden email]>
> Envoyé: Mardi 5 Décembre 2017 05:26:02
> Objet: Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

> The needToWorkAround6260652 changes ought to be in a separate changeset.
>
> The biggest question is whether Collection.toArray(generator) pulls its
> weight, especially in view of
> https://shipilev.net/blog/2016/arrays-wisdom-ancients.
>
> I rarely want to dump elements into a typed array.  Dumping into Object[]
> with toArray() is just fine for me (but I'm a biased core library
> developer).

Dumping an ArrayList<String> into an array of String is fairly frequent, i think.

The main issue with the current API, <T> T[] toArray(T[] array), is that T can be unrelated to E (the type of the element in the collection) so one can write
  ArrayList<Integer> list = ...
  String[] array = list.toArray(new String[0]);
it may even works if the list is empty.

It's getting worst if E and T can be a primitive type (with valhalla), because toArray(T[]) as to support all combinations.

So in my opinion, introducing toArray(generator) is a step in the right direction.

cheers,
Rémi
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks
In reply to this post by Martin Buchholz-3
>  
> +    // no spec changes relative to supertype
> +    public <T> T[] toArray(IntFunction<T[]> generator) {
>
> You probably at least need @inheritDoc for the unchecked exception throws (as I've forgotten many times)
>

There's a new javadoc option "--override-methods summary" that was recently
added that we're using now. If an overriding method has no doc comment, it's
simply listed in the "Methods inherited from" section at the bottom of the
summary table. (It's now called "Methods declared in".)

See it in action here:

     http://download.java.net/java/jdk10/docs/api/java/util/Properties.html

Compare it to the JDK 9 version of Properties.

(This suggests a big cleanup in collections, and probably many other areas,
where docs were copied into subclasses for no good reason, or because they
wanted to disinherit stuff that properly belongs in @implSpec.)

> The needToWorkAround6260652 changes ought to be in a separate changeset.

OK, I can pull these out.

> The biggest question is whether Collection.toArray(generator) pulls its weight, especially in view of https://shipilev.net/blog/2016/arrays-wisdom-ancients.
>
> I rarely want to dump elements into a typed array.  Dumping into Object[] with toArray() is just fine for me (but I'm a biased core library developer).

Yeah, most collections want to use Object[]. But when you want a typed array,
this is really helpful.

> Oh wait - Aleksey actually links to this bug!  Sounds like he would be the most
> qualified reviewer!

:-)

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

David Lloyd-2
In reply to this post by Remi Forax
On Tue, Dec 5, 2017 at 1:03 AM, Remi Forax <[hidden email]> wrote:

> Dumping an ArrayList<String> into an array of String is fairly frequent, i think.
>
> The main issue with the current API, <T> T[] toArray(T[] array), is that T can be unrelated to E (the type of the element in the collection) so one can write
>   ArrayList<Integer> list = ...
>   String[] array = list.toArray(new String[0]);
> it may even works if the list is empty.
>
> It's getting worst if E and T can be a primitive type (with valhalla), because toArray(T[]) as to support all combinations.
>
> So in my opinion, introducing toArray(generator) is a step in the right direction.

The signature of the proposed generator is

> public <T> T[] toArray(IntFunction<T[]> generator)

So I don't think that anything has really changed in this regard; T
can still be unrelated to E.

That said, sending in a constant String[0] is probably just as good as
a generator, or better (in my naive estimation the tradeoff is a >0
comparison versus an interface method dispatch), than an i -> new
String[i] unless some kind of benchmark would appear which disproves
that hypothesis.
--
- DML
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Paul Sandoz
In reply to this post by Stuart Marks


> On 4 Dec 2017, at 19:20, Stuart Marks <[hidden email]> wrote:
>
> Hi all,
>
> Please review this small enhancement to add a default method Collection.toArray(generator) that takes a function that creates the destination array. This is analogous to Stream.toArray(generator).
>
> Bug:
>
>    https://bugs.openjdk.java.net/browse/JDK-8060192
>

 345      * <p>Suppose {@code x} is a collection known to contain only strings.
 346      * The following code can be used to dump the collection into a newly
 347      * allocated array of {@code String}:

Make it an API note? (same for toArray(T[])


 352      * @implSpec
 353      * The default implementation calls the generator function with zero
 354      * and then passes the resulting array to {@link #toArray(T[])}.

That’s reasonable. I pondered about being vague and specifying a value in the range if [0, size()), to allow wiggle room for using 0 or size().


 360      * @throws ArrayStoreException if the runtime type of any element in this
 361      *         collection is not assignable to the {@linkplain Class#getComponentType
 362      *         runtime component type} of array returned by the generator function

s/of array returned by the generator function/of the generated array/ ?

Paul.

> Webrev:
>
>    http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.0/
>
> Thanks,
>
> s'marks

Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks


On 12/5/17 10:47 AM, Paul Sandoz wrote:
>
>   345      * <p>Suppose {@code x} is a collection known to contain only strings.
>   346      * The following code can be used to dump the collection into a newly
>   347      * allocated array of {@code String}:
>
> Make it an API note? (same for toArray(T[])

Will do.

>   352      * @implSpec
>   353      * The default implementation calls the generator function with zero
>   354      * and then passes the resulting array to {@link #toArray(T[])}.
>
> That’s reasonable. I pondered about being vague and specifying a value in the range if [0, size()), to allow wiggle room for using 0 or size().

No, I really think it has to be exactly zero. If the default were to choose some
value K < size(), the collection size could change to be smaller than K after
allocation but before the call to toArray(T[]), resulting in an array that is
unexpectedly large.

>   360      * @throws ArrayStoreException if the runtime type of any element in this
>   361      *         collection is not assignable to the {@linkplain Class#getComponentType
>   362      *         runtime component type} of array returned by the generator function
>
> s/of array returned by the generator function/of the generated array/ ?

Will do.

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Paul Sandoz


> On 5 Dec 2017, at 11:19, Stuart Marks <[hidden email]> wrote:
>
>
>
> On 12/5/17 10:47 AM, Paul Sandoz wrote:
>>  345      * <p>Suppose {@code x} is a collection known to contain only strings.
>>  346      * The following code can be used to dump the collection into a newly
>>  347      * allocated array of {@code String}:
>> Make it an API note? (same for toArray(T[])
>
> Will do.
>
>>  352      * @implSpec
>>  353      * The default implementation calls the generator function with zero
>>  354      * and then passes the resulting array to {@link #toArray(T[])}.
>> That’s reasonable. I pondered about being vague and specifying a value in the range if [0, size()), to allow wiggle room for using 0 or size().
>
> No, I really think it has to be exactly zero. If the default were to choose some value K < size(), the collection size could change to be smaller than K after allocation but before the call to toArray(T[]), resulting in an array that is unexpectedly large.

Ah yes, concurrent collections.

Paul.

>
>>  360      * @throws ArrayStoreException if the runtime type of any element in this
>>  361      *         collection is not assignable to the {@linkplain Class#getComponentType
>>  362      *         runtime component type} of array returned by the generator function
>> s/of array returned by the generator function/of the generated array/ ?
>
> Will do.
>
> s'marks

Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks
Revised webrev based on comments from Martin Buchholz and Paul Sandoz:

     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/

I've done a bit of rewriting of the @apiNote sections to clarify the intended
use of the various toArray() methods.

s'marks

Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks
In reply to this post by David Lloyd-2


> The signature of the proposed generator is
>
>> public <T> T[] toArray(IntFunction<T[]> generator)
>
> So I don't think that anything has really changed in this regard; T
> can still be unrelated to E.

Right, the new method doesn't offer any safety compared to toArray(T[]). You can
still get ArrayStoreException if T is incompatible with E. That's in part what
led me to update the @throws ArrayStoreException text, since the previous text
was basically wrong. But fundamentally the problem is unchanged.

There's a related enhancement request

     https://bugs.openjdk.java.net/browse/JDK-7023484
     add typesafe static Collections.toArray(Collection<? extends T>,
IntFunction<T[]>)

that would provide static type safety for this operation. Unclear whether it's
worthwhile to add something like this, though.

> That said, sending in a constant String[0] is probably just as good as
> a generator, or better (in my naive estimation the tradeoff is a >0
> comparison versus an interface method dispatch), than an i -> new
> String[i] unless some kind of benchmark would appear which disproves
> that hypothesis.

I did some quick benchmarking, and it looks like String[]::new is a shade faster
(heh) than new String[size], but not quite as fast as new String[0]. But take
these results with a grain of salt. I was doing the benchmarking on my laptop
with my entire environment running, and I was also measuring my development
fastdebug build. I wasn't able to see the same difference between new
String[size] and new String[0] that Shipilëv observed.[1] It would be
interesting to get some rigorous benchmarks to compare these, but I haven't yet
taken the time to do this.

I don't think the main point of this is performance, though. I think the new
preferred way to create an array of a particular type should be
toArray(Foo[]::new), which is preferable to toArray(new Foo[0]) or
toArray(EMPTY_FOO_ARRAY). I suppose those of us "in the know" would recognize
toArray(new Foo[0]) to be idiomatic. But to understand it, you have to fully
understand the semantics of toArray(T[]), and you have to have read Shipilëv's
article to understand why creating an apparently wasted zero-sized array is
(usually) the right thing. That seems pretty obscure.

s'marks


[1] https://shipilev.net/blog/2016/arrays-wisdom-ancients/
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Tagir Valeev
In reply to this post by Stuart Marks
Hello!

I think, CopyOnWriteArrayList and a List returned by Arrays.asList
deserve specialized implementation as well.

With best regards,
Tagir Valeev.

On Wed, Dec 6, 2017 at 8:27 AM, Stuart Marks <[hidden email]> wrote:
> Revised webrev based on comments from Martin Buchholz and Paul Sandoz:
>
>     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/
>
> I've done a bit of rewriting of the @apiNote sections to clarify the
> intended use of the various toArray() methods.
>
> s'marks
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Bernd Eckenfels-4
In reply to this post by Stuart Marks
Should the test also check for the case the generator returns under- and oversized arrays?

The default impl looks less efficient than (new T[0]), should it really be removed as a major Javadoc example? (Or maybe add more specific implementations besides ArrayList?)

Gruss
Bernd
--
http://bernd.eckenfels.net
________________________________
From: core-libs-dev <[hidden email]> on behalf of Stuart Marks <[hidden email]>
Sent: Wednesday, December 6, 2017 2:27:44 AM
To: core-libs-dev
Subject: Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Revised webrev based on comments from Martin Buchholz and Paul Sandoz:

     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.1/

I've done a bit of rewriting of the @apiNote sections to clarify the intended
use of the various toArray() methods.

s'marks

Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks
In reply to this post by Tagir Valeev
On 12/5/17 6:54 PM, Tagir Valeev wrote:
> I think, CopyOnWriteArrayList and a List returned by Arrays.asList
> deserve specialized implementation as well.

Sure, I can add one to Array.asList right away (as part of this changeset). It's
even covered by tests already.

I'll work with Martin to update COWAL. Depending on how he wants to handle it,
this might be done separately.

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks
In reply to this post by Bernd Eckenfels-4


On 12/5/17 8:41 PM, Bernd Eckenfels wrote:
> Should the test also check for the case the generator returns under- and oversized arrays?

Did you mean that the default implementation (and various overriding
implementations) of toArray(generator) should do checks on the array that's
returned from the generator? If so, I'm skeptical of the utility of adding such
checks.

If the array is too short, ArrayIndexOutOfBoundsException will occur, so no
check is necessary. If the array is too long, it will work, but the application
might have trouble ascertaining the number of elements copied. Presumably it
wouldn't create a too-long array unless it can deal with this. In any case, it
doesn't necessarily seem like this should be an error.

> The default impl looks less efficient than (new T[0]), should it really be removed as a major Javadoc example? (Or maybe add more specific implementations besides ArrayList?)

Although my benchmarks were inconclusive, it doesn't seem like the extra call to
the generator with zero should be a performance problem. Even if this is a
problem, it will disappear as more overrides are added to the system.

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks
In reply to this post by Stuart Marks
[Martin: please review changes to the JSR 166 TCK.]

Another updated webrev for this changeset:

     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/

This includes an override of toArray(IntFunction) in the implementation of
Arrays.asList(), as requested by Tagir Valeev.

Overrides are also added for various wrapper classes in j.u.Collections. Turns
out there's a test test/jdk/java/util/Collections/Wrappers.java that checks to
ensure that the wrappers don't inherit any default methods. (This has been a
source of bugs in the past.)

Significantly, this webrev also includes changes to several tests in the JSR 166
TCK. The issue is that these tests have cases for null handling, where they call

     coll.toArray(null)

to ensure that NPE is thrown. Given that this method is now overloaded:

     toArray(T[])
     toArray(IntFunction)

passing null is now ambiguous and thus generates a compiler error. I've modified
the tests to call toArray((Object[])null) which is a bit of a stopgap. I can't
think of anything obviously better to do, though.

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Jonathan Bluett-Duncan
Hi Stuart,

Looking over
http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/src/java.base/share/classes/java/util/ArrayList.java.cdiff.html,
I'm wondering if the method `ArrayList.toArray(IntFunction)` should have an
`@Override` to make it clear that it's overriding the default method in
Collection. What do you think?

Cheers,
Jonathan

On 7 December 2017 at 22:58, Stuart Marks <[hidden email]> wrote:

> [Martin: please review changes to the JSR 166 TCK.]
>
> Another updated webrev for this changeset:
>
>     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/
>
> This includes an override of toArray(IntFunction) in the implementation of
> Arrays.asList(), as requested by Tagir Valeev.
>
> Overrides are also added for various wrapper classes in j.u.Collections.
> Turns out there's a test test/jdk/java/util/Collections/Wrappers.java
> that checks to ensure that the wrappers don't inherit any default methods.
> (This has been a source of bugs in the past.)
>
> Significantly, this webrev also includes changes to several tests in the
> JSR 166 TCK. The issue is that these tests have cases for null handling,
> where they call
>
>     coll.toArray(null)
>
> to ensure that NPE is thrown. Given that this method is now overloaded:
>
>     toArray(T[])
>     toArray(IntFunction)
>
> passing null is now ambiguous and thus generates a compiler error. I've
> modified the tests to call toArray((Object[])null) which is a bit of a
> stopgap. I can't think of anything obviously better to do, though.
>
> s'marks
>
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Stuart Marks


On 12/7/17 3:50 PM, Jonathan Bluett-Duncan wrote:

> Looking over
> http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/src/java.base/share/classes/java/util/ArrayList.java.cdiff.html,
> I'm wondering if the method `ArrayList.toArray(IntFunction)` should have an
> `@Override` to make it clear that it's overriding the default method in
> Collection. What do you think?

I guess it wouldn't hurt. I had thought about adding it, but most of the methods
in ArrayList that override implementations from supertypes *don't* have
@Override. Looking more closely, it appears that the ones that override
interface default methods *do* have @Override.

I don't think there's a rule that says that @Override should be used when
overriding interface default methods but not when overriding implementations
from a superclass or when implementing an abstract interface method. Frankly, I
think the JDK is just sloppy here. Most of the existing implementations didn't
use @Override -- possibly because they were introduced before annotations
existed -- and later on, whoever implemented the overrides of the Java 8 default
methods decided to add @Override.

s'marks
Reply | Threaded
Open this post in threaded view
|

Re: RFR(s): 8060192: Add default method Collection.toArray(generator)

Martin Buchholz-3
In reply to this post by Stuart Marks
(I'm still trying to love this new API)

The changes to the jsr166 tck are fine.

I'm not convinced that the new implementation for ArrayList is progress.
The current implementation of toArray(T[]) does

            // Make a new array of a's runtime type, but my contents:
            return (T[]) Arrays.copyOf(elementData, size, a.getClass());

which does not have to pay the cost of zeroing the array, so is potentially
faster.  (depends on whether the VM provides cheap pre-zeroed memory?! what
does jmh say?)

If we're not getting type safety and not getting significantly better
performance, all we have is a more natural API.  But toArray(IntFunction)
also seems not very natural to me, and we'll have to live with the
toArray(new String[0]) wart forever anyways.  (But is it really so bad?)
 (Maybe toArray(Class<componentType>) is actually more natural?)

On Thu, Dec 7, 2017 at 2:58 PM, Stuart Marks <[hidden email]>
wrote:

> [Martin: please review changes to the JSR 166 TCK.]
>
> Another updated webrev for this changeset:
>
>     http://cr.openjdk.java.net/~smarks/reviews/8060192/webrev.2/
>
> This includes an override of toArray(IntFunction) in the implementation of
> Arrays.asList(), as requested by Tagir Valeev.
>
> Overrides are also added for various wrapper classes in j.u.Collections.
> Turns out there's a test test/jdk/java/util/Collections/Wrappers.java
> that checks to ensure that the wrappers don't inherit any default methods.
> (This has been a source of bugs in the past.)
>
> Significantly, this webrev also includes changes to several tests in the
> JSR 166 TCK. The issue is that these tests have cases for null handling,
> where they call
>
>     coll.toArray(null)
>
> to ensure that NPE is thrown. Given that this method is now overloaded:
>
>     toArray(T[])
>     toArray(IntFunction)
>
> passing null is now ambiguous and thus generates a compiler error. I've
> modified the tests to call toArray((Object[])null) which is a bit of a
> stopgap. I can't think of anything obviously better to do, though.
>
> s'marks
>
12