[PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Chris Dennis
Hi Paul (et al)

Like all things API there are wrinkles here when it comes to implementing.

This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).

The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:

 * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
 * I chose to fail if count is negative.
 * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).

I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.

Would appreciate thoughts/feedback.  Thanks.

Chris


P.S. I presume tests would be required for the parameter checking (once it is finalized)?

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Jonathan Bluett-Duncan
Hi Chris,

Unfortunately the patch you sent (in what I presume was an attachment) is
missing. I believe the OpenJDK mailing list servers intentionally strip out
attachments in all emails, which seems to be at odds with the advice given
in http://openjdk.java.net/contribute/. (Either the contribution advice or
the servers should be changed, ideally!)

I understand that one can host patches somewhere official, but I've
forgotten the details of the process.

Can anyone help?

Jonathan


On 6 April 2017 at 15:08, Chris Dennis <[hidden email]> wrote:

> Hi Paul (et al)
>
> Like all things API there are wrinkles here when it comes to implementing.
>
> This patch isn’t final, there appears to be no existing test coverage for
> these classes beyond testing the compensating summation used in the double
> implementation, and I left off adding any until it was decided how much
> parameter validation we want (since that’s the only testing that can really
> occur here).
>
> The wrinkles relate to the issues around instances that have suffered
> overflow in count and/or sum which the existing implementation doesn’t
> defend against:
>
>  * I chose to ignore all parameters if 'count == 0’ and just leave the
> instance empty. I held off from validating min, max and count however. This
> obviously 'breaks things’ (beyond how broken they would already be) if
> count == 0 only due to overflow.
>  * I chose to fail if count is negative.
>  * I chose to enforce that max and min are logically consistent with count
> and sum, this can break the moment either one of the overflows as well (I’m
> also pretty sure that the FPU logic is going to suffer here too - there
> might be some magic I can do with a pessimistic bit of rounding on the FPU
> multiplication result).
>
> I intentionally went with the most aggressive parameter validation here to
> establish one end of what could be implemented.  There are arguments for
> this and also arguments for the other extreme (no validation at all).
> Personally I’m in favor of the current version where the creation will fail
> if the inputs are only possible through overflow (modulo fixing the FPU
> precision issues above) even though it’s at odds with approach of the
> existing implementation.
>
> Would appreciate thoughts/feedback.  Thanks.
>
> Chris
>
>
> P.S. I presume tests would be required for the parameter checking (once it
> is finalized)?
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Chris Dennis
# HG changeset patch
# User chris_dennis
# Date 1491485015 14400
#      Thu Apr 06 09:23:35 2017 -0400
# Node ID d789970b8393032457885e739d76919f714bbd50
# Parent  c0aecf84349c97f4241eab01f7bbfb7660d51be1
8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

diff --git a/src/java.base/share/classes/java/util/DoubleSummaryStatistics.java b/src/java.base/share/classes/java/util/DoubleSummaryStatistics.java
--- a/src/java.base/share/classes/java/util/DoubleSummaryStatistics.java
+++ b/src/java.base/share/classes/java/util/DoubleSummaryStatistics.java
@@ -76,6 +76,47 @@
     public DoubleSummaryStatistics() { }
 
     /**
+     * Construct a non-empty instance with the supplied min, max, count, and sum.
+     *
+     * <p>If {@code count} is zero then the remaining parameters are ignored
+     * and an empty instance is constructed.
+     * <p>If the arguments are inconsistent then an {@code IllegalArgumentException}
+     * is thrown.  The necessary consistent argument conditions are:
+     * <ul>
+     *   <li>{@code count} &ge 0</li>
+     *   <li>{@code min} &le {@code max}</li>
+     *   <li>({@code count} &times {@code max}) &ge {@code sum}</li>
+     *   <li>({@code count} &times {@code min}) &le {@code sum}</li>
+     * </ul>
+     * This enforcement of argument consistency means that the retrieved set of
+     * values from a {@code DoubleSummaryStatistics} instance may not be a legal
+     * set of arguments for this constructor due to arithmetic overflow in the
+     * original instance.
+     *
+     * @param min the minimum value
+     * @param max the maximum value
+     * @param count the count of values
+     * @param sum the sum of all values
+     * @throws IllegalArgumentException if the arguments are inconsistent
+     */
+    public DoubleSummaryStatistics(double min, double max, long count, double sum) throws IllegalArgumentException {
+        if (count < 0L) {
+            throw new IllegalArgumentException("Negative count value");
+        } else if (count > 0L) {
+            if (min > max) throw new IllegalArgumentException("Minimum greater than maximum");
+            if (count * max < sum) throw new IllegalArgumentException("Maximum inconsistent with sum and count");
+            if (count * min > sum) throw new IllegalArgumentException("Minimum inconsistent with sum and count");
+
+            this.count = count;
+            this.sum = sum;
+            this.simpleSum = sum;
+            this.sumCompensation = 0.0d;
+            this.min = min;
+            this.max = max;
+        }
+    }
+
+    /**
      * Records another value into the summary information.
      *
      * @param value the input value
diff --git a/src/java.base/share/classes/java/util/IntSummaryStatistics.java b/src/java.base/share/classes/java/util/IntSummaryStatistics.java
--- a/src/java.base/share/classes/java/util/IntSummaryStatistics.java
+++ b/src/java.base/share/classes/java/util/IntSummaryStatistics.java
@@ -27,6 +27,9 @@
 import java.util.function.IntConsumer;
 import java.util.stream.Collector;
 
+import static java.lang.Math.multiplyExact;
+import static java.lang.Math.multiplyFull;
+
 /**
  * A state object for collecting statistics such as count, min, max, sum, and
  * average.
@@ -76,6 +79,49 @@
     public IntSummaryStatistics() { }
 
     /**
+     * Construct a non-empty instance with the supplied min, max, count, and sum.
+     *
+     * <p>If {@code count} is zero then the remaining parameters are ignored
+     * and an empty instance is constructed.
+     * <p>If the arguments are inconsistent then an {@code IllegalArgumentException}
+     * is thrown.  The necessary consistent argument conditions are:
+     * <ul>
+     *   <li>{@code count} &ge 0</li>
+     *   <li>{@code min} &le {@code max}</li>
+     *   <li>({@code count} &times {@code max}) &ge {@code sum}</li>
+     *   <li>({@code count} &times {@code min}) &le {@code sum}</li>
+     * </ul>
+     * This enforcement of argument consistency means that the retrieved set of
+     * values from an {@code IntSummaryStatistics} instance may not be a legal
+     * set of arguments for this constructor due to arithmetic overflow in the
+     * original instance.
+     *
+     * @param min the minimum value
+     * @param max the maximum value
+     * @param count the count of values
+     * @param sum the sum of all values
+     * @throws IllegalArgumentException if the arguments are inconsistent
+     */
+    public IntSummaryStatistics(int min, int max, long count, long sum) throws IllegalArgumentException {
+        if (count < 0L) {
+            throw new IllegalArgumentException("Negative count value");
+        } else if (count > 0L) {
+            if (min > max) throw new IllegalArgumentException("Minimum greater than maximum");
+            try {
+                if (multiplyExact(count, max) < sum) throw new IllegalArgumentException("Maximum inconsistent with sum and count");
+            } catch (ArithmeticException e) {}
+            try {
+                if (multiplyExact(count, min) > sum) throw new IllegalArgumentException("Minimum inconsistent with sum and count");
+            } catch (ArithmeticException e) {}
+
+            this.count = count;
+            this.sum = sum;
+            this.min = min;
+            this.max = max;
+        }
+    }
+
+    /**
      * Records a new value into the summary information
      *
      * @param value the input value
diff --git a/src/java.base/share/classes/java/util/LongSummaryStatistics.java b/src/java.base/share/classes/java/util/LongSummaryStatistics.java
--- a/src/java.base/share/classes/java/util/LongSummaryStatistics.java
+++ b/src/java.base/share/classes/java/util/LongSummaryStatistics.java
@@ -28,6 +28,8 @@
 import java.util.function.LongConsumer;
 import java.util.stream.Collector;
 
+import static java.lang.Math.multiplyExact;
+
 /**
  * A state object for collecting statistics such as count, min, max, sum, and
  * average.
@@ -77,6 +79,49 @@
     public LongSummaryStatistics() { }
 
     /**
+     * Construct a non-empty instance with the supplied min, max, count, and sum.
+     *
+     * <p>If {@code count} is zero then the remaining parameters are ignored
+     * and an empty instance is constructed.
+     * <p>If the arguments are inconsistent then an {@code IllegalArgumentException}
+     * is thrown.  The necessary consistent argument conditions are:
+     * <ul>
+     *   <li>{@code count} &ge 0</li>
+     *   <li>{@code min} &le {@code max}</li>
+     *   <li>({@code count} &times {@code max}) &ge {@code sum}</li>
+     *   <li>({@code count} &times {@code min}) &le {@code sum}</li>
+     * </ul>
+     * This enforcement of argument consistency means that the retrieved set of
+     * values from a {@code LongSummaryStatistics} instance may not be a legal
+     * set of arguments for this constructor due to arithmetic overflow in the
+     * original instance.
+     *
+     * @param min the minimum value
+     * @param max the maximum value
+     * @param count the count of values
+     * @param sum the sum of all values
+     * @throws IllegalArgumentException if the arguments are inconsistent
+     */
+    public LongSummaryStatistics(long min, long max, long count, long sum) throws IllegalArgumentException {
+        if (count < 0L) {
+            throw new IllegalArgumentException("Negative count value");
+        } else if (count > 0L) {
+            if (min > max) throw new IllegalArgumentException("Minimum greater than maximum");
+            try {
+                if (multiplyExact(count, max) < sum) throw new IllegalArgumentException("Maximum inconsistent with sum and count");
+            } catch (ArithmeticException e) {}
+            try {
+                if (multiplyExact(count, min) > sum) throw new IllegalArgumentException("Minimum inconsistent with sum and count");
+            } catch (ArithmeticException e) {}
+
+            this.count = count;
+            this.sum = sum;
+            this.min = min;
+            this.max = max;
+        }
+    }
+
+    /**
      * Records a new {@code int} value into the summary information.
      *
      * @param value the input value


> On Apr 6, 2017, at 10:32 AM, Jonathan Bluett-Duncan <[hidden email]> wrote:
>
> Hi Chris,
>
> Unfortunately the patch you sent (in what I presume was an attachment) is missing. I believe the OpenJDK mailing list servers intentionally strip out attachments in all emails, which seems to be at odds with the advice given in http://openjdk.java.net/contribute/. (Either the contribution advice or the servers should be changed, ideally!)
>
> I understand that one can host patches somewhere official, but I've forgotten the details of the process.
>
> Can anyone help?
>
> Jonathan
>
>
> On 6 April 2017 at 15:08, Chris Dennis <[hidden email]> wrote:
> Hi Paul (et al)
>
> Like all things API there are wrinkles here when it comes to implementing.
>
> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>
> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>
>  * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>  * I chose to fail if count is negative.
>  * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>
> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>
> Would appreciate thoughts/feedback.  Thanks.
>
> Chris
>
>
> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>
>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Paul Sandoz
In reply to this post by Chris Dennis
Hi Chris,

Thanks for looking at this.

There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.

I think we should enforce constraints where we reliably can:

1) count >= 0

2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.

3)  min <= max

Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.


Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.

Paul.


> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>
> Hi Paul (et al)
>
> Like all things API there are wrinkles here when it comes to implementing.
>
> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>
> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>
> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
> * I chose to fail if count is negative.
> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>
> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>
> Would appreciate thoughts/feedback.  Thanks.
>
> Chris
>
>
> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>
> <8178117.patch>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

joe darcy
On an API design note, I would prefer to DoubleSummaryStatistics took a
double... argument to pass in the state of the summation. This
flexibility is necessary to more fully preserve the computed sum. Also,
the we want flexibility to change the amount of internal state
DoubleSummaryStats keeps so we don't want to hard-code that into as
aspect of the API.

Thanks,

-Joe


On 4/11/2017 12:53 PM, Paul Sandoz wrote:

> Hi Chris,
>
> Thanks for looking at this.
>
> There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.
>
> I think we should enforce constraints where we reliably can:
>
> 1) count >= 0
>
> 2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.
>
> 3)  min <= max
>
> Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.
>
>
> Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.
>
> Paul.
>
>
>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>>
>> Hi Paul (et al)
>>
>> Like all things API there are wrinkles here when it comes to implementing.
>>
>> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>>
>> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>>
>> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>> * I chose to fail if count is negative.
>> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>>
>> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>>
>> Would appreciate thoughts/feedback.  Thanks.
>>
>> Chris
>>
>>
>> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>>
>> <8178117.patch>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Chris Dennis
Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”.  That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?

Chris

> On Apr 11, 2017, at 4:16 PM, joe darcy <[hidden email]> wrote:
>
> On an API design note, I would prefer to DoubleSummaryStatistics took a double... argument to pass in the state of the summation. This flexibility is necessary to more fully preserve the computed sum. Also, the we want flexibility to change the amount of internal state DoubleSummaryStats keeps so we don't want to hard-code that into as aspect of the API.
>
> Thanks,
>
> -Joe
>
>
> On 4/11/2017 12:53 PM, Paul Sandoz wrote:
>> Hi Chris,
>>
>> Thanks for looking at this.
>>
>> There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.
>>
>> I think we should enforce constraints where we reliably can:
>>
>> 1) count >= 0
>>
>> 2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.
>>
>> 3)  min <= max
>>
>> Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.
>>
>>
>> Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.
>>
>> Paul.
>>
>>
>>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>>>
>>> Hi Paul (et al)
>>>
>>> Like all things API there are wrinkles here when it comes to implementing.
>>>
>>> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>>>
>>> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>>>
>>> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>>> * I chose to fail if count is negative.
>>> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>>>
>>> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>>>
>>> Would appreciate thoughts/feedback.  Thanks.
>>>
>>> Chris
>>>
>>>
>>> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>>>
>>> <8178117.patch>
>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Chris Dennis
Updated patch with the changed parameter validation, updated javadoc and added test coverage.




> On Apr 11, 2017, at 4:48 PM, Chris Dennis <[hidden email]> wrote:
>
> Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”.  That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>
> Chris
>
>> On Apr 11, 2017, at 4:16 PM, joe darcy <[hidden email]> wrote:
>>
>> On an API design note, I would prefer to DoubleSummaryStatistics took a double... argument to pass in the state of the summation. This flexibility is necessary to more fully preserve the computed sum. Also, the we want flexibility to change the amount of internal state DoubleSummaryStats keeps so we don't want to hard-code that into as aspect of the API.
>>
>> Thanks,
>>
>> -Joe
>>
>>
>> On 4/11/2017 12:53 PM, Paul Sandoz wrote:
>>> Hi Chris,
>>>
>>> Thanks for looking at this.
>>>
>>> There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.
>>>
>>> I think we should enforce constraints where we reliably can:
>>>
>>> 1) count >= 0
>>>
>>> 2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.
>>>
>>> 3)  min <= max
>>>
>>> Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.
>>>
>>>
>>> Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.
>>>
>>> Paul.
>>>
>>>
>>>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>>>>
>>>> Hi Paul (et al)
>>>>
>>>> Like all things API there are wrinkles here when it comes to implementing.
>>>>
>>>> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>>>>
>>>> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>>>>
>>>> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>>>> * I chose to fail if count is negative.
>>>> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>>>>
>>>> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>>>>
>>>> Would appreciate thoughts/feedback.  Thanks.
>>>>
>>>> Chris
>>>>
>>>>
>>>> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>>>>
>>>> <8178117.patch>
>>
>


8178117.2.patch.txt (14K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Peter Levart
In reply to this post by Chris Dennis
Hi Dennis,

On 04/11/2017 10:48 PM, Chris Dennis wrote:
> Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”.  That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?

And how would you compute the value for BigDecimal getPreciseSum() so
that you could then set 3 internal double fields from BigDecimal without
loss of information?

I can imagine a "distributed" implementation of Stream that must
transport partial results across VM boundaries and then combine them. It
would be bad if the results of such summation differed from default
local Stream implementation. If we don't want to make
XxxSummaryStatistics objects Serializable themselves, then perhaps we
need some other API to support transporting the state over the wire with
evolvability in mind. What about something like the following:

     public DoubleSummaryStatistics(DataInput dataInput) throws
IOException {
         int version = dataInput.readByte();
         if (version == 1) {
             long count = dataInput.readLong();
             double sum = dataInput.readDouble();
             double sumCompensation = dataInput.readDouble();
             double simpleSum = dataInput.readDouble();
             double min = dataInput.readDouble();
             double max = dataInput.readDouble();
             // validation of arguments ...

             // ... assignment to fields
             this.count = count;
             this.sum = sum;
             this.sumCompensation = sumCompensation;
             this.simpleSum = simpleSum;
             this.min = min;
             this.max = max;
         } else {
             throw new IllegalArgumentException(
                 "Invalid version in DataInput stream: " + version);
         }
     }

     public void writeTo(DataOutput dataOutput) throws IOException {
         dataOutput.writeByte(1);
         dataOutput.writeLong(count);
         dataOutput.writeDouble(sum);
         dataOutput.writeDouble(sumCompensation);
         dataOutput.writeDouble(simpleSum);
         dataOutput.writeDouble(min);
         dataOutput.writeDouble(max);
     }


Both members could be made protected so that a Serializable subclass
could be devised and they would not show up as public API.

Regards, Peter


>
> Chris
>
>> On Apr 11, 2017, at 4:16 PM, joe darcy <[hidden email]> wrote:
>>
>> On an API design note, I would prefer to DoubleSummaryStatistics took a double... argument to pass in the state of the summation. This flexibility is necessary to more fully preserve the computed sum. Also, the we want flexibility to change the amount of internal state DoubleSummaryStats keeps so we don't want to hard-code that into as aspect of the API.
>>
>> Thanks,
>>
>> -Joe
>>
>>
>> On 4/11/2017 12:53 PM, Paul Sandoz wrote:
>>> Hi Chris,
>>>
>>> Thanks for looking at this.
>>>
>>> There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.
>>>
>>> I think we should enforce constraints where we reliably can:
>>>
>>> 1) count >= 0
>>>
>>> 2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.
>>>
>>> 3)  min <= max
>>>
>>> Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.
>>>
>>>
>>> Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.
>>>
>>> Paul.
>>>
>>>
>>>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>>>>
>>>> Hi Paul (et al)
>>>>
>>>> Like all things API there are wrinkles here when it comes to implementing.
>>>>
>>>> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>>>>
>>>> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>>>>
>>>> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>>>> * I chose to fail if count is negative.
>>>> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>>>>
>>>> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>>>>
>>>> Would appreciate thoughts/feedback.  Thanks.
>>>>
>>>> Chris
>>>>
>>>>
>>>> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>>>>
>>>> <8178117.patch>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Peter Levart


On 04/12/2017 04:41 PM, Peter Levart wrote:

> On 04/11/2017 10:48 PM, Chris Dennis wrote:
>> Color me confused… what would the javadoc on the parameter say? It
>> could I guess have an @implNote documenting the meanings of the
>> parameters… but then what use is it? The proposed API simply limits
>> the precision with which a DoubleSummaryStatistic can be copied to be
>> the same as the precision with which it can be “accessed”. That
>> doesn’t seem an enormous problem since I suspect that bulk of usages
>> would be to marshall a “finished” instance and therefore there is no
>> real loss occuring. If we wanted to support arbitrary precision
>> wouldn’t it be better to have a constructor variant that took a
>> BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>
> And how would you compute the value for BigDecimal getPreciseSum() so
> that you could then set 3 internal double fields from BigDecimal
> without loss of information?


...perhaps we could use the BigDecimal getPreciseSum() together with
getSum() to reconstruct the state. For exmaple:

     /**
      * Construct an instance with values obtained from another instance.
      *
      * @param count what was returned from another instance's {@link
#getCount()}
      * @param sum what was returned from another instance's {@link
#getSum()}
      * @param preciseSum what was returned from another instance's
{@link #getPreciseSum()}
      * @param min what was returned from another instance's {@link
#getMin()}
      * @param max what was returned from another instance's {@link
#getMax()}
      */
     public DoubleSummaryStatistics(long count,
                                    double sum, BigDecimal preciseSum,
                                    double min, double max) {
         if (count < 0L) {
             throw new IllegalArgumentException("count < 0");
         } else if (count > 0L) {
             if (min > max) {
                 throw new IllegalArgumentException("min > max");
             }
             this.count = count;
             this.min = min;
             this.max = max;
             setSum(sum, preciseSum);
         }
     }

     /**
      * If {@link #getSum()} returns {@link Double#NaN} or {@link
Double#POSITIVE_INFINITY}
      * or {@link Double#NEGATIVE_INFINITY}, then this method returns
{@code null},
      * otherwise it returns a value that is a more precise
representation of the
      * sum of values recorded and can be used in
      * {@link #DoubleSummaryStatistics(long, double, BigDecimal,
double, double)}
      * to construct an object with internal state that closely
resembles the state of
      * original object.
      *
      * @return a precise sum in BigDecimal form.
      */
     public final BigDecimal getPreciseSum() {
         double sum = getSum();
         if (Double.isNaN(sum) || Double.isInfinite(sum)) {
             return null;
         } else {
             return new BigDecimal(this.sum).add(
                 new BigDecimal(this.sumCompensation));
         }
     }

     private void setSum(double sum, BigDecimal preciseSum) {
         if (preciseSum == null) {
             if (!Double.isNaN(sum) && !Double.isInfinite(sum)) {
                 throw new IllegalArgumentException(
                     "preciseSum is null but sum is not a NaN and not
Infinite");
             }
             this.sum = this.simpleSum = sum;
             this.sumCompensation = 0d;
         } else {
             if (Double.isNaN(sum) || Double.isInfinite(sum)) {
                 throw new IllegalArgumentException(
                     "preciseSum is non-null but sum is NaN or Infinite");
             }
             this.sum = this.simpleSum = preciseSum.doubleValue();
             this.sumCompensation = preciseSum.subtract(new
BigDecimal(this.sum)).doubleValue();
         }
     }


Hm....


Regards, Peter

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Chris Dennis
Something like that was what I was envisioning (of course the inability to represent NaN or infinity with a BigDecimal is annoying). To me this approach really only breaks down when you stop using an actual iteration to perform the computation… as long as a “sum” in some form is part of the internal state then this still works.

> On Apr 12, 2017, at 12:27 PM, Peter Levart <[hidden email]> wrote:
>
>
>
> On 04/12/2017 04:41 PM, Peter Levart wrote:
>> On 04/11/2017 10:48 PM, Chris Dennis wrote:
>>> Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”. That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>>
>> And how would you compute the value for BigDecimal getPreciseSum() so that you could then set 3 internal double fields from BigDecimal without loss of information?
>
>
> ...perhaps we could use the BigDecimal getPreciseSum() together with getSum() to reconstruct the state. For exmaple:
>
>    /**
>     * Construct an instance with values obtained from another instance.
>     *
>     * @param count what was returned from another instance's {@link #getCount()}
>     * @param sum what was returned from another instance's {@link #getSum()}
>     * @param preciseSum what was returned from another instance's {@link #getPreciseSum()}
>     * @param min what was returned from another instance's {@link #getMin()}
>     * @param max what was returned from another instance's {@link #getMax()}
>     */
>    public DoubleSummaryStatistics(long count,
>                                   double sum, BigDecimal preciseSum,
>                                   double min, double max) {
>        if (count < 0L) {
>            throw new IllegalArgumentException("count < 0");
>        } else if (count > 0L) {
>            if (min > max) {
>                throw new IllegalArgumentException("min > max");
>            }
>            this.count = count;
>            this.min = min;
>            this.max = max;
>            setSum(sum, preciseSum);
>        }
>    }
>
>    /**
>     * If {@link #getSum()} returns {@link Double#NaN} or {@link Double#POSITIVE_INFINITY}
>     * or {@link Double#NEGATIVE_INFINITY}, then this method returns {@code null},
>     * otherwise it returns a value that is a more precise representation of the
>     * sum of values recorded and can be used in
>     * {@link #DoubleSummaryStatistics(long, double, BigDecimal, double, double)}
>     * to construct an object with internal state that closely resembles the state of
>     * original object.
>     *
>     * @return a precise sum in BigDecimal form.
>     */
>    public final BigDecimal getPreciseSum() {
>        double sum = getSum();
>        if (Double.isNaN(sum) || Double.isInfinite(sum)) {
>            return null;
>        } else {
>            return new BigDecimal(this.sum).add(
>                new BigDecimal(this.sumCompensation));
>        }
>    }
>
>    private void setSum(double sum, BigDecimal preciseSum) {
>        if (preciseSum == null) {
>            if (!Double.isNaN(sum) && !Double.isInfinite(sum)) {
>                throw new IllegalArgumentException(
>                    "preciseSum is null but sum is not a NaN and not Infinite");
>            }
>            this.sum = this.simpleSum = sum;
>            this.sumCompensation = 0d;
>        } else {
>            if (Double.isNaN(sum) || Double.isInfinite(sum)) {
>                throw new IllegalArgumentException(
>                    "preciseSum is non-null but sum is NaN or Infinite");
>            }
>            this.sum = this.simpleSum = preciseSum.doubleValue();
>            this.sumCompensation = preciseSum.subtract(new BigDecimal(this.sum)).doubleValue();
>        }
>    }
>
>
> Hm....
>
>
> Regards, Peter
>

Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

joe darcy
In reply to this post by Chris Dennis
On 4/11/2017 1:48 PM, Chris Dennis wrote:
> Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”.  That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>

The javadoc would say something like "takes an array of double values
representing a sum." In other words, rather than forcing the in-progress
sum to be represented as a single double, it could be represented as
more than one double.

The current implementation basically uses two doubles internally to
represent the on-going summation. This extra state prevents many
numerically poor outcomes occurring when computing the sum. Greater
accuracy would be possible if additional internal state were used.

-Joe


Reply | Threaded
Open this post in threaded view
|

Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Peter Levart
Hi Joe,

On 04/12/2017 07:59 PM, Joseph D. Darcy wrote:

> On 4/11/2017 1:48 PM, Chris Dennis wrote:
>> Color me confused… what would the javadoc on the parameter say? It
>> could I guess have an @implNote documenting the meanings of the
>> parameters… but then what use is it? The proposed API simply limits
>> the precision with which a DoubleSummaryStatistic can be copied to be
>> the same as the precision with which it can be “accessed”.  That
>> doesn’t seem an enormous problem since I suspect that bulk of usages
>> would be to marshall a “finished” instance and therefore there is no
>> real loss occuring. If we wanted to support arbitrary precision
>> wouldn’t it be better to have a constructor variant that took a
>> BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>>
>
> The javadoc would say something like "takes an array of double values
> representing a sum." In other words, rather than forcing the
> in-progress sum to be represented as a single double, it could be
> represented as more than one double.
>
> The current implementation basically uses two doubles internally to
> represent the on-going summation. This extra state prevents many
> numerically poor outcomes occurring when computing the sum. Greater
> accuracy would be possible if additional internal state were used.
>
> -Joe
>
>

But constructor taking this state is not enough for the solution to the
problem. There would have to be a getter returning that state too,
otherwise there's no point in having such constructor, right?

Are you envisioning that any possible future enhancements of the
summation would all keep the state as a series of double values?

Regards, Peter

Reply | Threaded
Open this post in threaded view
|

RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Paul Sandoz
In reply to this post by Chris Dennis
Hi Chris,

After some hiatus here is an updated webrev, i made some tweaks to the JavaDoc:

  http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8178117-summary-constructors/webrev/ <http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8178117-summary-constructors/webrev/>

And the CSR:

  https://bugs.openjdk.java.net/browse/JDK-8190381 <https://bugs.openjdk.java.net/browse/JDK-8190381>

The support for a double sum of consisting of an array of double is going beyond my complexity budget for this feature. If that is deemed important later on we can add another constructor.

Paul.

> On 11 Apr 2017, at 17:44, Chris Dennis <[hidden email]> wrote:
>
> Updated patch with the changed parameter validation, updated javadoc and added test coverage.
>
> <8178117.2.patch.txt>
>
>> On Apr 11, 2017, at 4:48 PM, Chris Dennis <[hidden email]> wrote:
>>
>> Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”.  That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>>
>> Chris
>>
>>> On Apr 11, 2017, at 4:16 PM, joe darcy <[hidden email]> wrote:
>>>
>>> On an API design note, I would prefer to DoubleSummaryStatistics took a double... argument to pass in the state of the summation. This flexibility is necessary to more fully preserve the computed sum. Also, the we want flexibility to change the amount of internal state DoubleSummaryStats keeps so we don't want to hard-code that into as aspect of the API.
>>>
>>> Thanks,
>>>
>>> -Joe
>>>
>>>
>>> On 4/11/2017 12:53 PM, Paul Sandoz wrote:
>>>> Hi Chris,
>>>>
>>>> Thanks for looking at this.
>>>>
>>>> There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.
>>>>
>>>> I think we should enforce constraints where we reliably can:
>>>>
>>>> 1) count >= 0
>>>>
>>>> 2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.
>>>>
>>>> 3)  min <= max
>>>>
>>>> Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.
>>>>
>>>>
>>>> Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.
>>>>
>>>> Paul.
>>>>
>>>>
>>>>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>>>>>
>>>>> Hi Paul (et al)
>>>>>
>>>>> Like all things API there are wrinkles here when it comes to implementing.
>>>>>
>>>>> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>>>>>
>>>>> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>>>>>
>>>>> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>>>>> * I chose to fail if count is negative.
>>>>> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>>>>>
>>>>> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>>>>>
>>>>> Would appreciate thoughts/feedback.  Thanks.
>>>>>
>>>>> Chris
>>>>>
>>>>>
>>>>> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>>>>>
>>>>> <8178117.patch>
>>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Brian Burkhalter-2
Hi Paul,

On the specification, e.g., at line 95 of DoubleSummaryStatistics:

s/recorded the count of values/recorded count of values/

Brian

On Oct 30, 2017, at 3:49 PM, Paul Sandoz <[hidden email]> wrote:

> And the CSR:
>
>  https://bugs.openjdk.java.net/browse/JDK-8190381<https://bugs.openjdk.java.net/browse/JDK-8190381>

Reply | Threaded
Open this post in threaded view
|

Re: RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Paul Sandoz

> On 30 Oct 2017, at 16:14, Brian Burkhalter <[hidden email]> wrote:
>
> Hi Paul,
>
> On the specification, e.g., at line 95 of DoubleSummaryStatistics:
>
> s/recorded the count of values/recorded count of values/
>

Thanks! updated the webrev and CSR.

Paul.

> Brian
>
> On Oct 30, 2017, at 3:49 PM, Paul Sandoz <[hidden email] <mailto:[hidden email]>> wrote:
>
>> And the CSR:
>>
>>  https://bugs.openjdk.java.net/browse/JDK-8190381 <https://bugs.openjdk.java.net/browse/JDK-8190381><https://bugs.openjdk.java.net/browse/JDK-8190381 <https://bugs.openjdk.java.net/browse/JDK-8190381>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Brian Burkhalter-2
Looks good but I was wondering why there are static imports of a couple of j.l.Math methods in Int/LongSummaryStatistics.

Brian

On Oct 30, 2017, at 4:23 PM, Paul Sandoz <[hidden email]> wrote:

> Thanks! updated the webrev and CSR.

Reply | Threaded
Open this post in threaded view
|

Re: RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

joe darcy
In reply to this post by Paul Sandoz
Hello,

Is it intentional that "DoubleSummaryStatistics" is used in a sentence like

   91      * @apiNote
   92      * The enforcement of argument correctness means that the
retrieved set of
   93      * recorded values obtained from a {@code
DoubleSummaryStatistics} source

in all three types being updated?

For the code in the constructor, if you don't want to have something
like an explicit switch on Long.signum(count), I'd prefer to see at
least a comment like

// Use default field values if count == 0

For the double case, I'd recommend future work change the new
constructor to

     DoubleSummaryStatistics(long count, double min, double max, double
sum, double... additionalSum)

where the final argument could be used to convey additional state.

For the double case, if a NaN is used than max and min will be NaN.
Therefore, I'd recommend change the correctness constraint for that type to

     <li>{@code min} &le; {@code max} || (isNaN(min) && isNaN(max)) </li>

with an analogous update to the code.

Thanks,

-Joe

On 10/30/2017 3:49 PM, Paul Sandoz wrote:

> Hi Chris,
>
> After some hiatus here is an updated webrev, i made some tweaks to the
> JavaDoc:
>
> http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8178117-summary-constructors/webrev/ 
> <http://cr.openjdk.java.net/%7Epsandoz/jdk10/JDK-8178117-summary-constructors/webrev/>
>
> And the CSR:
>
> https://bugs.openjdk.java.net/browse/JDK-8190381
>
> The support for a double sum of consisting of an array of double is
> going beyond my complexity budget for this feature. If that is deemed
> important later on we can add another constructor.
>
> Paul.
>
>> On 11 Apr 2017, at 17:44, Chris Dennis <[hidden email]
>> <mailto:[hidden email]>> wrote:
>>
>> Updated patch with the changed parameter validation, updated javadoc
>> and added test coverage.
>>
>> <8178117.2.patch.txt>
>>
>>> On Apr 11, 2017, at 4:48 PM, Chris Dennis <[hidden email]
>>> <mailto:[hidden email]>> wrote:
>>>
>>> Color me confused… what would the javadoc on the parameter say? It
>>> could I guess have an @implNote documenting the meanings of the
>>> parameters… but then what use is it? The proposed API simply limits
>>> the precision with which a DoubleSummaryStatistic can be copied to
>>> be the same as the precision with which it can be “accessed”.  That
>>> doesn’t seem an enormous problem since I suspect that bulk of usages
>>> would be to marshall a “finished” instance and therefore there is no
>>> real loss occuring. If we wanted to support arbitrary precision
>>> wouldn’t it be better to have a constructor variant that took a
>>> BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>>>
>>> Chris
>>>
>>>> On Apr 11, 2017, at 4:16 PM, joe darcy <[hidden email]
>>>> <mailto:[hidden email]>> wrote:
>>>>
>>>> On an API design note, I would prefer to DoubleSummaryStatistics
>>>> took a double... argument to pass in the state of the summation.
>>>> This flexibility is necessary to more fully preserve the computed
>>>> sum. Also, the we want flexibility to change the amount of internal
>>>> state DoubleSummaryStats keeps so we don't want to hard-code that
>>>> into as aspect of the API.
>>>>
>>>> Thanks,
>>>>
>>>> -Joe
>>>>
>>>>
>>>> On 4/11/2017 12:53 PM, Paul Sandoz wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Thanks for looking at this.
>>>>>
>>>>> There is some rudimentary testing using streams in
>>>>> CollectAndSummaryStatisticsTest.java.
>>>>>
>>>>> I think we should enforce constraints where we reliably can:
>>>>>
>>>>> 1) count >= 0
>>>>>
>>>>> 2) count = 0, then min/max/sum are ignored and it’s as if the
>>>>> default constructor was called. (I thought it might be appropriate
>>>>> to reject since a caller might be passing in incorrect information
>>>>> in error, but the defaults for min/max are important and would be
>>>>> a burden for the caller to pass those values.) In this respect
>>>>> having count as the first parameter of the constructor would be
>>>>> useful.
>>>>>
>>>>> 3)  min <= max
>>>>>
>>>>> Since for count > 0 the constraints, count * max >= sum, count *
>>>>> min <= sum, cannot be reliably enforced due to overflow i am
>>>>> inclined to not bother and just document.
>>>>>
>>>>>
>>>>> Note this is gonna be blocked from pushing until the new
>>>>> Compatibility and Specification Review (CSR) process is opened up,
>>>>> which i understand is “soon"ish :-) but that should not block
>>>>> adding some further tests in the interim and tidying up the javadoc.
>>>>>
>>>>> Paul.
>>>>>
>>>>>
>>>>>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]
>>>>>> <mailto:[hidden email]>> wrote:
>>>>>>
>>>>>> Hi Paul (et al)
>>>>>>
>>>>>> Like all things API there are wrinkles here when it comes to
>>>>>> implementing.
>>>>>>
>>>>>> This patch isn’t final, there appears to be no existing test
>>>>>> coverage for these classes beyond testing the compensating
>>>>>> summation used in the double implementation, and I left off
>>>>>> adding any until it was decided how much parameter validation we
>>>>>> want (since that’s the only testing that can really occur here).
>>>>>>
>>>>>> The wrinkles relate to the issues around instances that have
>>>>>> suffered overflow in count and/or sum which the existing
>>>>>> implementation doesn’t defend against:
>>>>>>
>>>>>> * I chose to ignore all parameters if 'count == 0’ and just leave
>>>>>> the instance empty. I held off from validating min, max and count
>>>>>> however. This obviously 'breaks things’ (beyond how broken they
>>>>>> would already be) if count == 0 only due to overflow.
>>>>>> * I chose to fail if count is negative.
>>>>>> * I chose to enforce that max and min are logically consistent
>>>>>> with count and sum, this can break the moment either one of the
>>>>>> overflows as well (I’m also pretty sure that the FPU logic is
>>>>>> going to suffer here too - there might be some magic I can do
>>>>>> with a pessimistic bit of rounding on the FPU multiplication result).
>>>>>>
>>>>>> I intentionally went with the most aggressive parameter
>>>>>> validation here to establish one end of what could be
>>>>>> implemented.  There are arguments for this and also arguments for
>>>>>> the other extreme (no validation at all).  Personally I’m in
>>>>>> favor of the current version where the creation will fail if the
>>>>>> inputs are only possible through overflow (modulo fixing the FPU
>>>>>> precision issues above) even though it’s at odds with approach of
>>>>>> the existing implementation.
>>>>>>
>>>>>> Would appreciate thoughts/feedback.  Thanks.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> P.S. I presume tests would be required for the parameter checking
>>>>>> (once it is finalized)?
>>>>>>
>>>>>> <8178117.patch>
>>>>
>>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Paul Sandoz
Hi,

> On 30 Oct 2017, at 17:19, Joseph D. Darcy <[hidden email]> wrote:
>
> Hello,
>
> Is it intentional that "DoubleSummaryStatistics" is used in a sentence like
>
>   91      * @apiNote
>   92      * The enforcement of argument correctness means that the retrieved set of
>   93      * recorded values obtained from a {@code DoubleSummaryStatistics} source
>
> in all three types being updated?
>

No, copy’n’paste error. Fixed.


> For the code in the constructor, if you don't want to have something like an explicit switch on Long.signum(count), I'd prefer to see at least a comment like
>
> // Use default field values if count == 0
>

Done.


> For the double case, I'd recommend future work change the new constructor to
>
>     DoubleSummaryStatistics(long count, double min, double max, double sum, double... additionalSum)
>
> where the final argument could be used to convey additional state.
>

Ok.


> For the double case, if a NaN is used than max and min will be NaN. Therefore, I'd recommend change the correctness constraint for that type to
>
>     <li>{@code min} &le; {@code max} || (isNaN(min) && isNaN(max)) </li>
>
> with an analogous update to the code.
>

In that case we need to double (sorry) down on the NaNs and include sum as well:

*   <li>{@code (min <= max && !isNaN(sum)) || (isNaN(min) && isNaN(max) && isNaN(sum))}</li>

I updated the webrev and added some more tests:

  http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8178117-summary-constructors/webrev/

Thanks,
Paul.



> Thanks,
>
> -Joe
>
> On 10/30/2017 3:49 PM, Paul Sandoz wrote:
>> Hi Chris,
>>
>> After some hiatus here is an updated webrev, i made some tweaks to the JavaDoc:
>>
>>   http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8178117-summary-constructors/webrev/
>>
>> And the CSR:
>>
>>   https://bugs.openjdk.java.net/browse/JDK-8190381
>>
>> The support for a double sum of consisting of an array of double is going beyond my complexity budget for this feature. If that is deemed important later on we can add another constructor.
>>
>> Paul.
>>
>>> On 11 Apr 2017, at 17:44, Chris Dennis <[hidden email]> wrote:
>>>
>>> Updated patch with the changed parameter validation, updated javadoc and added test coverage.
>>>
>>> <8178117.2.patch.txt>
>>>
>>>> On Apr 11, 2017, at 4:48 PM, Chris Dennis <[hidden email]> wrote:
>>>>
>>>> Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”.  That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>>>>
>>>> Chris
>>>>
>>>>> On Apr 11, 2017, at 4:16 PM, joe darcy <[hidden email]> wrote:
>>>>>
>>>>> On an API design note, I would prefer to DoubleSummaryStatistics took a double... argument to pass in the state of the summation. This flexibility is necessary to more fully preserve the computed sum. Also, the we want flexibility to change the amount of internal state DoubleSummaryStats keeps so we don't want to hard-code that into as aspect of the API.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -Joe
>>>>>
>>>>>
>>>>> On 4/11/2017 12:53 PM, Paul Sandoz wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Thanks for looking at this.
>>>>>>
>>>>>> There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.
>>>>>>
>>>>>> I think we should enforce constraints where we reliably can:
>>>>>>
>>>>>> 1) count >= 0
>>>>>>
>>>>>> 2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.
>>>>>>
>>>>>> 3)  min <= max
>>>>>>
>>>>>> Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.
>>>>>>
>>>>>>
>>>>>> Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.
>>>>>>
>>>>>> Paul.
>>>>>>
>>>>>>
>>>>>>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>>>>>>>
>>>>>>> Hi Paul (et al)
>>>>>>>
>>>>>>> Like all things API there are wrinkles here when it comes to implementing.
>>>>>>>
>>>>>>> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>>>>>>>
>>>>>>> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>>>>>>>
>>>>>>> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>>>>>>> * I chose to fail if count is negative.
>>>>>>> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>>>>>>>
>>>>>>> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>>>>>>>
>>>>>>> Would appreciate thoughts/feedback.  Thanks.
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>>
>>>>>>> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>>>>>>>
>>>>>>> <8178117.patch>
>>>>>
>>>>
>>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

Brian Goetz-2
In reply to this post by Paul Sandoz
While I agree that the overflow-detecting constraint on min/max in the
early version was too aggressive, you could reasonably choose to enforce
the constraint that if count == 0, then so should sum, min, and max.

On 10/30/2017 6:49 PM, Paul Sandoz wrote:

> Hi Chris,
>
> After some hiatus here is an updated webrev, i made some tweaks to the JavaDoc:
>
>    http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8178117-summary-constructors/webrev/ <http://cr.openjdk.java.net/~psandoz/jdk10/JDK-8178117-summary-constructors/webrev/>
>
> And the CSR:
>
>    https://bugs.openjdk.java.net/browse/JDK-8190381 <https://bugs.openjdk.java.net/browse/JDK-8190381>
>
> The support for a double sum of consisting of an array of double is going beyond my complexity budget for this feature. If that is deemed important later on we can add another constructor.
>
> Paul.
>
>> On 11 Apr 2017, at 17:44, Chris Dennis <[hidden email]> wrote:
>>
>> Updated patch with the changed parameter validation, updated javadoc and added test coverage.
>>
>> <8178117.2.patch.txt>
>>
>>> On Apr 11, 2017, at 4:48 PM, Chris Dennis <[hidden email]> wrote:
>>>
>>> Color me confused… what would the javadoc on the parameter say? It could I guess have an @implNote documenting the meanings of the parameters… but then what use is it? The proposed API simply limits the precision with which a DoubleSummaryStatistic can be copied to be the same as the precision with which it can be “accessed”.  That doesn’t seem an enormous problem since I suspect that bulk of usages would be to marshall a “finished” instance and therefore there is no real loss occuring. If we wanted to support arbitrary precision wouldn’t it be better to have a constructor variant that took a BigDecimal, and a matching getPreciseSum() that returned a BigDecimal?
>>>
>>> Chris
>>>
>>>> On Apr 11, 2017, at 4:16 PM, joe darcy <[hidden email]> wrote:
>>>>
>>>> On an API design note, I would prefer to DoubleSummaryStatistics took a double... argument to pass in the state of the summation. This flexibility is necessary to more fully preserve the computed sum. Also, the we want flexibility to change the amount of internal state DoubleSummaryStats keeps so we don't want to hard-code that into as aspect of the API.
>>>>
>>>> Thanks,
>>>>
>>>> -Joe
>>>>
>>>>
>>>> On 4/11/2017 12:53 PM, Paul Sandoz wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Thanks for looking at this.
>>>>>
>>>>> There is some rudimentary testing using streams in CollectAndSummaryStatisticsTest.java.
>>>>>
>>>>> I think we should enforce constraints where we reliably can:
>>>>>
>>>>> 1) count >= 0
>>>>>
>>>>> 2) count = 0, then min/max/sum are ignored and it’s as if the default constructor was called. (I thought it might be appropriate to reject since a caller might be passing in incorrect information in error, but the defaults for min/max are important and would be a burden for the caller to pass those values.) In this respect having count as the first parameter of the constructor would be useful.
>>>>>
>>>>> 3)  min <= max
>>>>>
>>>>> Since for count > 0 the constraints, count * max >= sum, count * min <= sum, cannot be reliably enforced due to overflow i am inclined to not bother and just document.
>>>>>
>>>>>
>>>>> Note this is gonna be blocked from pushing until the new Compatibility and Specification Review (CSR) process is opened up, which i understand is “soon"ish :-) but that should not block adding some further tests in the interim and tidying up the javadoc.
>>>>>
>>>>> Paul.
>>>>>
>>>>>
>>>>>> On 6 Apr 2017, at 07:08, Chris Dennis <[hidden email]> wrote:
>>>>>>
>>>>>> Hi Paul (et al)
>>>>>>
>>>>>> Like all things API there are wrinkles here when it comes to implementing.
>>>>>>
>>>>>> This patch isn’t final, there appears to be no existing test coverage for these classes beyond testing the compensating summation used in the double implementation, and I left off adding any until it was decided how much parameter validation we want (since that’s the only testing that can really occur here).
>>>>>>
>>>>>> The wrinkles relate to the issues around instances that have suffered overflow in count and/or sum which the existing implementation doesn’t defend against:
>>>>>>
>>>>>> * I chose to ignore all parameters if 'count == 0’ and just leave the instance empty. I held off from validating min, max and count however. This obviously 'breaks things’ (beyond how broken they would already be) if count == 0 only due to overflow.
>>>>>> * I chose to fail if count is negative.
>>>>>> * I chose to enforce that max and min are logically consistent with count and sum, this can break the moment either one of the overflows as well (I’m also pretty sure that the FPU logic is going to suffer here too - there might be some magic I can do with a pessimistic bit of rounding on the FPU multiplication result).
>>>>>>
>>>>>> I intentionally went with the most aggressive parameter validation here to establish one end of what could be implemented.  There are arguments for this and also arguments for the other extreme (no validation at all).  Personally I’m in favor of the current version where the creation will fail if the inputs are only possible through overflow (modulo fixing the FPU precision issues above) even though it’s at odds with approach of the existing implementation.
>>>>>>
>>>>>> Would appreciate thoughts/feedback.  Thanks.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>>
>>>>>> P.S. I presume tests would be required for the parameter checking (once it is finalized)?
>>>>>>
>>>>>> <8178117.patch>

Reply | Threaded
Open this post in threaded view
|

Re: RFR Re: [PATCH] 8178117: Add public state constructors for Int/Long/DoubleSummaryStatistics

joe darcy
In reply to this post by Paul Sandoz
Hi Paul,


On 10/30/2017 9:33 PM, Paul Sandoz wrote:

> Hi,
>
>> On 30 Oct 2017, at 17:19, Joseph D. Darcy <[hidden email]> wrote:
>>
>> Hello,
>>
>> Is it intentional that "DoubleSummaryStatistics" is used in a sentence like
>>
>>    91      * @apiNote
>>    92      * The enforcement of argument correctness means that the retrieved set of
>>    93      * recorded values obtained from a {@code DoubleSummaryStatistics} source
>>
>> in all three types being updated?
>>
> No, copy’n’paste error. Fixed.
>
>
>> For the code in the constructor, if you don't want to have something like an explicit switch on Long.signum(count), I'd prefer to see at least a comment like
>>
>> // Use default field values if count == 0
>>
> Done.
>
>
>> For the double case, I'd recommend future work change the new constructor to
>>
>>      DoubleSummaryStatistics(long count, double min, double max, double sum, double... additionalSum)
>>
>> where the final argument could be used to convey additional state.
>>
> Ok.
>
>
>> For the double case, if a NaN is used than max and min will be NaN. Therefore, I'd recommend change the correctness constraint for that type to
>>
>>      <li>{@code min} &le; {@code max} || (isNaN(min) && isNaN(max)) </li>
>>
>> with an analogous update to the code.
>>
> In that case we need to double (sorry) down on the NaNs and include sum as well:
>
> *   <li>{@code (min <= max && !isNaN(sum)) || (isNaN(min) && isNaN(max) && isNaN(sum))}

A more complete test for the numerical consistency conditions would be
something like

     min <= sum/count  <= max

However, that would require a bit of thought due to potential round-off
so this might not be worth the complexity trade-off. (If any of min,
sum, and max were NaN, the comparisons would be false.)

Cheers,

-Joe


12