RFR: 8263677: Improve Character.isLowerCase/isUpperCase lookups

RFR: 8263677: Improve Character.isLowerCase/isUpperCase lookups

Claes Redestad-2
This patch changes the otherLowercase / otherUppercase bits so that each is set if the codepoint either is of type LOWERCASE_LETTER / UPPERCASE_LETTER, respectively, or has the Unicode Other_Lowercase / Other_Uppercase property set. This reduces Character.isLowerCase/isUpperCase to a single table lookup, which appears to benefit performance.
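
As a rough illustration of the idea only (this is not the actual CharacterData or GenerateCharacter code), the sketch below folds the general-category check into a single precomputed bit, so that the runtime check becomes one table lookup. The class name `CaseLookupSketch`, the `LOWERCASE_BIT` mask, the `PROPS` table, and the two-element `OTHER_LOWERCASE` set are all hypothetical and chosen for the example; the table covers only Latin-1.

```java
import java.util.Set;

public class CaseLookupSketch {
    // Hypothetical flag bit inside a per-codepoint property word.
    static final int LOWERCASE_BIT = 0x1;

    // Small illustrative subset of codepoints carrying Other_Lowercase
    // (U+00AA and U+00BA); the real property covers many more.
    static final Set<Integer> OTHER_LOWERCASE = Set.of(0x00AA, 0x00BA);

    // Table built once, standing in for a table emitted by a generator.
    // Only valid for codepoints 0..255 in this sketch.
    static final byte[] PROPS = buildTable();

    // "Before" shape: a type check plus a separate property check.
    public static boolean isLowerCaseTwoStep(int cp) {
        return Character.getType(cp) == Character.LOWERCASE_LETTER
                || OTHER_LOWERCASE.contains(cp);
    }

    // Generation step: set the bit if either condition holds.
    static byte[] buildTable() {
        byte[] table = new byte[256];
        for (int cp = 0; cp < table.length; cp++) {
            if (isLowerCaseTwoStep(cp)) {
                table[cp] |= LOWERCASE_BIT;
            }
        }
        return table;
    }

    // "After" shape: a single table lookup and mask.
    public static boolean isLowerCaseSingleLookup(int cp) {
        return (PROPS[cp] & LOWERCASE_BIT) != 0;
    }

    public static void main(String[] args) {
        boolean agree = true;
        for (int cp = 0; cp < 256; cp++) {
            agree &= isLowerCaseSingleLookup(cp) == isLowerCaseTwoStep(cp);
        }
        System.out.println("Lookups agree on Latin-1: " + agree);
    }
}
```

The same folding would apply to the uppercase bit; the trade-off is that the combined condition is decided when the table is generated rather than on every call.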

I also took the opportunity to clean up the somewhat dated GenerateCharacter utility class.

Testing: tier1-3

-------------

Commit messages:
 - Merge branch 'master' into character_case
 - Cleanups and modernizations
 - Fix lookup in 00, 01, 0E planes
 - Widen the range of codepoints tested by Characters micro
 - Improve Character.isLowerCase/isUpperCase lookups

Changes: https://git.openjdk.java.net/jdk/pull/3028/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3028&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8263677
  Stats: 261 lines in 8 files changed: 13 ins; 129 del; 119 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3028.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3028/head:pull/3028

PR: https://git.openjdk.java.net/jdk/pull/3028

Re: RFR: 8263677: Improve Character.isLowerCase/isUpperCase lookups

Erik Joelsson-2
On Tue, 16 Mar 2021 12:51:02 GMT, Claes Redestad <[hidden email]> wrote:

> This patch changes the otherLowercase / otherUppercase bits so that each is set if the codepoint either is of type LOWERCASE_LETTER / UPPERCASE_LETTER, respectively, or has the Unicode Other_Lowercase / Other_Uppercase property set. This reduces Character.isLowerCase/isUpperCase to a single table lookup, which appears to benefit performance.
>
> I also took the opportunity to clean up the somewhat dated GenerateCharacter utility class.
>
> Testing: tier1-3

Looks good from a build point of view. I like the code cleanups.

-------------

Marked as reviewed by erikj (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3028

Re: RFR: 8263677: Improve Character.isLowerCase/isUpperCase lookups [v2]

Naoto Sato-2
On Tue, 16 Mar 2021 21:02:28 GMT, Claes Redestad <[hidden email]> wrote:

>> This patch changes the otherLowercase / otherUppercase bits so that each is set if the codepoint either is of type LOWERCASE_LETTER / UPPERCASE_LETTER, respectively, or has the Unicode Other_Lowercase / Other_Uppercase property set. This reduces Character.isLowerCase/isUpperCase to a single table lookup, which appears to benefit performance.
>>
>> I also took the opportunity to clean up the somewhat dated GenerateCharacter utility class.
>>
>> Testing: tier1-3
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
>
>   Roger review + additional cleanups

Marked as reviewed by naoto (Reviewer).

-------------

PR: https://git.openjdk.java.net/jdk/pull/3028