Quantcast

RFR(xs) JDK-8163861 JNI NewString() and GetStringLength() documentation incorrect

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

RFR(xs) JDK-8163861 JNI NewString() and GetStringLength() documentation incorrect

George Triantafillou
Please review this very small fix to the spec for JDK-8163861:

JBS: https://bugs.openjdk.java.net/browse/JDK-8163861

webrev: http://cr.openjdk.java.net/~gtriantafill/8163861-webrev/webrev 
<http://cr.openjdk.java.net/%7Egtriantafill/8163861-webrev/webrev>

The change provides a more precise definition for NewString() and
GetStringLength().  Thanks.

-George
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(xs) JDK-8163861 JNI NewString() and GetStringLength() documentation incorrect

David Holmes
Hi George,

Apologies in advance - this is not "xs" in the discussion. But this
needs to go through CSR process (which is not yet available for 10) so
it can't be pushed yet anyway (sorry). But consider this my CSR review. :)

On 20/04/2017 3:43 AM, George Triantafillou wrote:
> Please review this very small fix to the spec for JDK-8163861:
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8163861
>
> webrev: http://cr.openjdk.java.net/~gtriantafill/8163861-webrev/webrev
> <http://cr.openjdk.java.net/%7Egtriantafill/8163861-webrev/webrev>
>
> The change provides a more precise definition for NewString() and
> GetStringLength().  Thanks.

As the bug report states 'most of the Java documention was updated to
deal with this', so I think it important to ensure any changes are
consistent with that updated documentation and terminology. Specifically
we need to refer to the String class API specification and the Character
class specification which defines Unicode Character Representations for
the Java platform.

Looking at the JDK 8 JNI spec as referenced in the bug report we first see:

jsize GetStringLength(JNIEnv *env, jstring string);
Returns the length (the count of Unicode characters) of a Java string.

This should be functionally equivalent to java.lang.String.length(). If
anything the "count of Unicode characters" should just be deleted as
this method is intended to return the length of the Java string, just as
String.length does. The phrase "count of Unicode characters" is at best
"loose" as the reporter notes (and this terminology is mis-used in a few
of the JNI string functions!). The String.length() method states:

"The length is equal to the number of Unicode code units in the string."

where "Unicode code units" is defined in the Character class.

So the correct fix would be to change "characters" to "code units".

However that there are two descriptions of how the function behaves - in
javadoc terminology we have one description in the main body:

4168 <p>Returns the length (the count of Unicode
4169 characters) of a Java string.</p>

and one equivalent to @returns:

4180 <h4>RETURNS:</h4>
4181
4182 <p>Returns the length of the Java string.</p>

The latter is perfectly and accurately correct! If you want to know how
the length of a Java String is defined then you need to see the String
class.

So I would say that the correct fix here is to either delete "(the count
of Unicode characters)", or else change "characters" to "code units" as
previously discussed.

---

Turning to NewString .... the JNI spec states:

"Constructs a new java.lang.String object from an array of Unicode
characters."

and the @param equivalent for 'len' states:

"len: length of the Unicode string. May be 0."

So initially we have the problematic use of "Unicode characters" again;
then we have the problem as to what the definition of the length of a
"Unicode string" is.

I find no help elsewhere in the JNI spec to clarify this (despite a very
lengthy discussion on its use of UTF-8 encodings). I can only assume,
based on the implementation, that the expectation is that the array of
"Unicode characters" is in UTF-16 format. In which case a more
technically accurate fix would be to say:

"len: length of the Unicode string in UTF-16 code units. May be 0."

Thanks,
David

PS. I will also add the above to the bug report.

> -George
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(xs) JDK-8163861 JNI NewString() and GetStringLength() documentation incorrect

George Triantafillou
Hi David,

Thanks for the review.  I wasn't aware that the CSR isn't yet available
for 10.  Do you know when it will be available?

On 4/19/2017 8:48 PM, David Holmes wrote:

> Hi George,
>
> Apologies in advance - this is not "xs" in the discussion. But this
> needs to go through CSR process (which is not yet available for 10) so
> it can't be pushed yet anyway (sorry). But consider this my CSR
> review. :)
>
> On 20/04/2017 3:43 AM, George Triantafillou wrote:
>> Please review this very small fix to the spec for JDK-8163861:
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8163861
>>
>> webrev: http://cr.openjdk.java.net/~gtriantafill/8163861-webrev/webrev
>> <http://cr.openjdk.java.net/%7Egtriantafill/8163861-webrev/webrev>
>>
>> The change provides a more precise definition for NewString() and
>> GetStringLength().  Thanks.
>
> As the bug report states 'most of the Java documention was updated to
> deal with this', so I think it important to ensure any changes are
> consistent with that updated documentation and terminology.
> Specifically we need to refer to the String class API specification
> and the Character class specification which defines Unicode Character
> Representations for the Java platform.
>
> Looking at the JDK 8 JNI spec as referenced in the bug report we first
> see:
>
> jsize GetStringLength(JNIEnv *env, jstring string);
> Returns the length (the count of Unicode characters) of a Java string.
>
> This should be functionally equivalent to java.lang.String.length().
> If anything the "count of Unicode characters" should just be deleted
> as this method is intended to return the length of the Java string,
> just as String.length does. The phrase "count of Unicode characters"
> is at best "loose" as the reporter notes (and this terminology is
> mis-used in a few of the JNI string functions!). The String.length()
> method states:
>
> "The length is equal to the number of Unicode code units in the string."
>
> where "Unicode code units" is defined in the Character class.
>
> So the correct fix would be to change "characters" to "code units".
>
> However that there are two descriptions of how the function behaves -
> in javadoc terminology we have one description in the main body:
>
> 4168 <p>Returns the length (the count of Unicode
> 4169 characters) of a Java string.</p>
>
> and one equivalent to @returns:
>
> 4180 <h4>RETURNS:</h4>
> 4181
> 4182 <p>Returns the length of the Java string.</p>
>
> The latter is perfectly and accurately correct! If you want to know
> how the length of a Java String is defined then you need to see the
> String class.
>
> So I would say that the correct fix here is to either delete "(the
> count of Unicode characters)", or else change "characters" to "code
> units" as previously discussed.
That sounds reasonable!

>
> ---
>
> Turning to NewString .... the JNI spec states:
>
> "Constructs a new java.lang.String object from an array of Unicode
> characters."
>
> and the @param equivalent for 'len' states:
>
> "len: length of the Unicode string. May be 0."
>
> So initially we have the problematic use of "Unicode characters"
> again; then we have the problem as to what the definition of the
> length of a "Unicode string" is.
>
> I find no help elsewhere in the JNI spec to clarify this (despite a
> very lengthy discussion on its use of UTF-8 encodings). I can only
> assume, based on the implementation, that the expectation is that the
> array of "Unicode characters" is in UTF-16 format. In which case a
> more technically accurate fix would be to say:
>
> "len: length of the Unicode string in UTF-16 code units. May be 0."
>
> Thanks,
> David
>
> PS. I will also add the above to the bug report.
Thanks, I appreciate it.

-George
>
>> -George

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(xs) JDK-8163861 JNI NewString() and GetStringLength() documentation incorrect

David Holmes
On 21/04/2017 12:25 AM, George Triantafillou wrote:
> Hi David,
>
> Thanks for the review.  I wasn't aware that the CSR isn't yet available
> for 10.  Do you know when it will be available?

"real soon now"

David

> On 4/19/2017 8:48 PM, David Holmes wrote:
>> Hi George,
>>
>> Apologies in advance - this is not "xs" in the discussion. But this
>> needs to go through CSR process (which is not yet available for 10) so
>> it can't be pushed yet anyway (sorry). But consider this my CSR
>> review. :)
>>
>> On 20/04/2017 3:43 AM, George Triantafillou wrote:
>>> Please review this very small fix to the spec for JDK-8163861:
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8163861
>>>
>>> webrev: http://cr.openjdk.java.net/~gtriantafill/8163861-webrev/webrev
>>> <http://cr.openjdk.java.net/%7Egtriantafill/8163861-webrev/webrev>
>>>
>>> The change provides a more precise definition for NewString() and
>>> GetStringLength().  Thanks.
>>
>> As the bug report states 'most of the Java documention was updated to
>> deal with this', so I think it important to ensure any changes are
>> consistent with that updated documentation and terminology.
>> Specifically we need to refer to the String class API specification
>> and the Character class specification which defines Unicode Character
>> Representations for the Java platform.
>>
>> Looking at the JDK 8 JNI spec as referenced in the bug report we first
>> see:
>>
>> jsize GetStringLength(JNIEnv *env, jstring string);
>> Returns the length (the count of Unicode characters) of a Java string.
>>
>> This should be functionally equivalent to java.lang.String.length().
>> If anything the "count of Unicode characters" should just be deleted
>> as this method is intended to return the length of the Java string,
>> just as String.length does. The phrase "count of Unicode characters"
>> is at best "loose" as the reporter notes (and this terminology is
>> mis-used in a few of the JNI string functions!). The String.length()
>> method states:
>>
>> "The length is equal to the number of Unicode code units in the string."
>>
>> where "Unicode code units" is defined in the Character class.
>>
>> So the correct fix would be to change "characters" to "code units".
>>
>> However that there are two descriptions of how the function behaves -
>> in javadoc terminology we have one description in the main body:
>>
>> 4168 <p>Returns the length (the count of Unicode
>> 4169 characters) of a Java string.</p>
>>
>> and one equivalent to @returns:
>>
>> 4180 <h4>RETURNS:</h4>
>> 4181
>> 4182 <p>Returns the length of the Java string.</p>
>>
>> The latter is perfectly and accurately correct! If you want to know
>> how the length of a Java String is defined then you need to see the
>> String class.
>>
>> So I would say that the correct fix here is to either delete "(the
>> count of Unicode characters)", or else change "characters" to "code
>> units" as previously discussed.
> That sounds reasonable!
>
>>
>> ---
>>
>> Turning to NewString .... the JNI spec states:
>>
>> "Constructs a new java.lang.String object from an array of Unicode
>> characters."
>>
>> and the @param equivalent for 'len' states:
>>
>> "len: length of the Unicode string. May be 0."
>>
>> So initially we have the problematic use of "Unicode characters"
>> again; then we have the problem as to what the definition of the
>> length of a "Unicode string" is.
>>
>> I find no help elsewhere in the JNI spec to clarify this (despite a
>> very lengthy discussion on its use of UTF-8 encodings). I can only
>> assume, based on the implementation, that the expectation is that the
>> array of "Unicode characters" is in UTF-16 format. In which case a
>> more technically accurate fix would be to say:
>>
>> "len: length of the Unicode string in UTF-16 code units. May be 0."
>>
>> Thanks,
>> David
>>
>> PS. I will also add the above to the bug report.
> Thanks, I appreciate it.
>
> -George
>>
>>> -George
>
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: RFR(xs) JDK-8163861 JNI NewString() and GetStringLength() documentation incorrect

joe darcy
On 4/21/2017 12:07 AM, David Holmes wrote:
> On 21/04/2017 12:25 AM, George Triantafillou wrote:
>> Hi David,
>>
>> Thanks for the review.  I wasn't aware that the CSR isn't yet available
>> for 10.  Do you know when it will be available?
>
> "real soon now"

FYI, related discussions about CSR occurring over at
[hidden email].

-Joe
Loading...