CFV: Project Trinity


CFV: Project Trinity

Karthik Ganesan
Hi,

I would like to propose the creation of a new Project:
Project Trinity with myself as the Lead and the Core
Libraries Group as the Sponsoring Group.

This Project would explore enhanced execution of bulk
aggregate calculations over Streams through offloading
calculations to hardware accelerators.

Streams allow developers to express calculations such
that data parallelism can be efficiently exploited. Such
calculations are prime candidates for leveraging enhanced
data-oriented instructions on CPUs (such as SIMD
instructions) or offloading to hardware accelerators
(such as the SPARC Data Accelerator co-processor, further
referred to as DAX [1]).
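
For readers unfamiliar with the kind of calculation meant here, the following is a minimal sketch of a bulk aggregate (filter, then sum) written with the standard java.util.stream API. The class and method names (BulkAggregateSketch, sumAbove) are made up for illustration and are not part of the proposal; the code runs on the CPU only and shows no offload.

```java
import java.util.stream.IntStream;

public class BulkAggregateSketch {

    // Sums every value greater than the threshold. The pipeline states what
    // to compute (filter, then sum) without prescribing iteration order,
    // which is what leaves the runtime free to execute it in parallel.
    static long sumAbove(int[] values, int threshold) {
        return IntStream.of(values)
                .parallel()                  // allow data-parallel execution
                .filter(v -> v > threshold)  // scan/select step
                .asLongStream()              // widen to avoid int overflow
                .sum();                      // aggregate step
    }

    public static void main(String[] args) {
        int[] data = IntStream.rangeClosed(1, 10).toArray();
        System.out.println(sumAbove(data, 5)); // 6+7+8+9+10 = 40
    }
}
```

Because the filter and the sum are visible as API calls rather than buried in a hand-written loop, a library or compiler could in principle map them onto SIMD instructions or an accelerator.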

To identify a path to improving performance and power
efficiency, Project Trinity will explore how libraries
like Streams can be enhanced to leverage data processing
hardware features to execute Streams more efficiently [2].

Directions for exploration include:
- Building a streams-like library optimized for offload to
-- hardware accelerators (such as DAX), or
-- a GPU, or
-- SIMD instructions;
- Optimizations in the Graal compiler to automatically
transform suitable Streams pipelines, taking advantage
of data processing hardware features;
- Explorations with Project Valhalla to expand the
range of effective acceleration to Streams of value types.

Success will be evaluated based upon:
(1) speedups and resource efficiency gains achieved for a
broad range of representative streams calculations under
offload,
(2) ease of use of the hardware acceleration capability, and
(3) ensuring that there is no time or space overhead for
non-accelerated calculations.

The project will host at least the following mailing lists:
- trinity-dev for development discussions and user feedback
- trinity-design for design and specification discussions

About the Lead:
Karthik Ganesan works for Oracle in the Performance and
Applications Engineering group with more than 5 years of
experience in Java/Hotspot performance projects [3].

The initial Reviewers and Committers will be:
* Ahmed Khawaja
* Karthik Ganesan
* Malcolm Kavalsky
* Shrinivas Joshi

Votes are due by May 4, 2017.

Only current OpenJDK Members [4] are eligible to vote on this
motion.  Votes must be cast in the open on the discuss list.
Replying to this message is sufficient if your mail program
honors the Reply-To header.

For Lazy Consensus voting instructions, see [5].

Karthik Ganesan

[1] https://community.oracle.com/docs/DOC-994842
[2] https://bugs.openjdk.java.net/browse/JDK-8150304
[3] Karthik Ganesan, Yao-Min Chen and X Pan, “Scaling
Java Virtual Machine on a Many-core System” at 14th
International Symposium on Integrated Circuits (ISIC 2014)
[4] http://openjdk.java.net/census#members
[5] http://openjdk.java.net/projects/#new-project-vote



Re: CFV: Project Trinity

Christian Thalinger
Why can’t we use Project Sumatra[1] for this?

[1] http://openjdk.java.net/projects/sumatra/



Re: CFV: Project Trinity

Paul Sandoz
In reply to this post by Karthik Ganesan
Hi Karthik,

Along with Valhalla, I strongly encourage you to also consider explorations with Project Panama, especially with regard to Intel’s Vector API work (under the Panama remit) and how to surface that at a higher, DSL-like level.

Paul.



Re: CFV: Project Trinity

Karthik Ganesan
In reply to this post by Christian Thalinger-4
Hi Christian,

Thanks for your interest. This question was brought up previously in the
discussion email thread for this project:

Project Sumatra was aimed at translating Java bytecode to execute on
GPUs, which was an ambitious goal and a challenging task to take up. In this
project, we aim to come up with APIs targeting the most common analytics
operations that can be readily offloaded to accelerators transparently. Most
of the information needed for offload to the accelerator is expected to be
readily provided by the API semantics, thereby reducing the need to
do tedious bytecode analysis.

Thanks,
Karthik
On 4/21/2017 3:18 PM, Christian Thalinger wrote:

> Why can’t we use Project Sumatra[1] for this?
>
> [1] http://openjdk.java.net/projects/sumatra/


Re: CFV: Project Trinity

Christian Thalinger

> On Apr 21, 2017, at 11:41 AM, Karthik Ganesan <[hidden email]> wrote:
>
> Hi Christian,
>
> Thanks for your interest. This question was brought up previously in the discussion email thread for this project:
>
> Project Sumatra was aimed at translation of Java byte code to execute on
> GPU, which was an ambitious goal and a challenging task to take up. In this
> project, we aim to come up with APIs targeting the most common Analytics
> operations that can be readily offloaded to accelerators transparently. Most
> of the information needed for offload to the accelerator is expected to be
> readily provided by the API semantics and there by, simplifying the need to
> do tedious byte code analysis.

I disagree.  The first paragraph on the Sumatra project page says:

"This primary goal of this project is to enable Java applications to take advantage of graphics processing units (GPUs) and accelerated processing units (APUs)--whether they are discrete devices or integrated with a CPU--to improve performance.”

while you state:

"This Project would explore enhanced execution of bulk
aggregate calculations over Streams through offloading
calculations to hardware accelerators.”

It’s the same thing.  I just don’t see the need to spin up yet another OpenJDK project that aims at the same goal.



Re: CFV: Project Trinity

Doug Simon @ Oracle

> On 21 Apr 2017, at 23:54, Christian Thalinger <[hidden email]> wrote:
>
> I disagree.  The first paragraph on the Sumatra project page says:
>
> "This primary goal of this project is to enable Java applications to take advantage of graphics processing units (GPUs) and accelerated processing units (APUs)--whether they are discrete devices or integrated with a CPU--to improve performance.”
>
> while you state:
>
> "This Project would explore enhanced execution of bulk
> aggregate calculations over Streams through offloading
> calculations to hardware accelerators.”
>
> It’s the same thing.  I just don’t see the need to spin up yet-another OpenJDK project that aims at the same goal.

Maybe this is just a discrepancy between the officially stated aims. I understood Sumatra to be about *automatically* offloading work for existing APIs (such as the Streams API) to a GPU, whereas Trinity seems to be more about designing an explicit API for GPU offloading.

-Doug

Re: CFV: Project Trinity

Mario Torre
Ah, if that's the case it's interesting, and I would certainly vote for
that.

Cheers,
Mario


Re: CFV: Project Trinity

Volker Simonis
In reply to this post by Doug Simon @ Oracle
On Sun, Apr 23, 2017 at 1:39 PM, Doug Simon <[hidden email]> wrote:

> Maybe this is just a discrepancy between the officially stated aims. I understood Sumatra to be about *automatic* offloading work for existing APIs (such as the Streams API) to a GPU where as Trinity seems to be more about designing an explicit API for GPU offloading.
>

So if this is about an explicit API for GPU offloading, will this be a
Java implementation/wrapper for already existing C/C++ APIs like
CUDA/OpenCL? Designing a completely new, Java-specific API does not
seem very promising to me.

> -Doug

Re: CFV: Project Trinity

Doug Simon @ Oracle

> On 24 Apr 2017, at 10:50, Volker Simonis <[hidden email]> wrote:
>
> So if this is about a explicit API for GPU offloading, will this be a
> Java implementation/wrapper for already existing C/C++ APIs like
> CUDA/OpenCL. Designing a completely new, Java-specific API seems to be
> not very promising to me.

I agree.

Karthik, maybe you could discuss the differences/similarities between Trinity and the Aparapi project (https://github.com/aparapi/aparapi).

-Doug

Re: CFV: Project Trinity

mark.reinhold
In reply to this post by Karthik Ganesan
2017/4/21 11:28:24 -0700, [hidden email]:

> I would like to propose the creation of a new Project:
> Project Trinity with myself as the Lead and the Core
> Libraries Group as the Sponsoring Group.
>
> ...
>
> Votes are due by May 4, 2017.
>
> Only current OpenJDK Members [4] are eligible to vote on this
> motion.  Votes must be cast in the open on the discuss list.
> Replying to this message is sufficient if your mail program
> honors the Reply-To header.

From a strictly procedural perspective, this CFV is not valid.
Calls for votes on the creation of a new Project must be sent
to the announcement list, as described here:

    http://openjdk.java.net/projects/#new-project-propose

At any rate it has provoked a useful discussion, so let's see
how that turns out before you initiate an actual CFV.

- Mark

Re: CFV: Project Trinity

Karthik Ganesan
In reply to this post by Doug Simon @ Oracle
I would like to thank Paul Sandoz, Christian Thalinger, Doug Simon,
Mario Torre and Volker Simonis for their support and the insightful
questions.

What we are proposing to do as part of this project is complementary to
existing efforts that enable offload to GPUs, such as Sumatra and Aparapi.
These existing projects provide implementations that translate existing
Java APIs, via bytecode, to a GPU language. Trinity extends these efforts
and takes them one step further by readily providing the building blocks
for programmers to construct complex bulk data/stream based algorithms
in Java that can be easily offloaded by these existing projects. Having
a route to offload to hardware accelerators is useful, but making it
easier for programmers to leverage will take it one step closer to
adoption.

Projects like Sumatra and Aparapi use the Stream forEach() method to
showcase offloads. Trinity will offer more such methods with richer
functionality, making it easier for these existing projects to leverage
and deliver hardware capabilities to be readily consumed by programmers.
Unlike the existing Streams API, the library for this new API is
envisioned to have a stronger focus on performance, a dedicated
implementation that will be offload friendly, and coverage of more
functions that are relevant to this domain of programmers.
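
To make the forEach() style concrete, here is a minimal sketch using only the standard java.util.stream API. It is not taken from Sumatra, Aparapi or Trinity; the class and method names are invented for illustration, and the reduction in dot() is only an assumption about what a "richer" bulk operation could look like, not a description of any planned API.

```java
import java.util.stream.IntStream;

public class ForEachOffloadSketch {

    // forEach style: the lambda body is an opaque per-index kernel. A tool
    // that wants to offload it has to analyse the lambda's bytecode to
    // learn that this is an element-wise addition.
    static int[] add(int[] a, int[] b) {
        int[] c = new int[a.length];
        IntStream.range(0, a.length)
                .parallel()
                .forEach(i -> c[i] = a[i] + b[i]);
        return c;
    }

    // Aggregate style: the map-then-sum shape is stated by the API calls
    // themselves, so the reduction is known without inspecting the lambda.
    static long dot(int[] a, int[] b) {
        return IntStream.range(0, a.length)
                .parallel()
                .mapToLong(i -> (long) a[i] * b[i])
                .sum();
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = {4, 5, 6};
        System.out.println(java.util.Arrays.toString(add(a, b))); // [5, 7, 9]
        System.out.println(dot(a, b)); // 1*4 + 2*5 + 3*6 = 32
    }
}
```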

Also, please note that Trinity casts a wider net when it comes to
accelerators, not just GPUs/APUs. These accelerators can include
Analytics accelerators like DAX, SIMD units on general purpose cores,
FPGA based accelerators for bulk aggregate operations, GPUs and whatever
more the future holds in terms of heterogeneous computing for bulk data
processing.

Inspired by the existing Streams API that brings succinct functional
programming to Java using lambdas, this project will try to retain such
rich features, significantly simplifying programming in Java for
performance-oriented developers focusing on bulk data processing.

Regards,

Karthik




Re: CFV: Project Trinity

Mario Torre-5
2017-04-24 17:00 GMT+02:00 Karthik Ganesan <[hidden email]>:
> SIMD

I still have an old PS3 with Linux, I'm looking forward to trying that out :)

I think the project idea is interesting, I personally would not mind
if it duplicated some efforts from the existing libraries, as long as
the goal is to have a generic framework within the JDK. The challenge
is that such a framework should work even if no specialised hardware is
available.

Is "Trinity" going to be something akin to the various provider-based
subsystems (like the Sound, the Filesystem, JavaScript, etc...), with
multiple possible backends, maybe using one of the already mentioned
projects?

I personally think that there is benefit in overcoming the inherent
verbosity (and complexity) of APIs like CUDA or OpenCL (or Metal), but
those frameworks are verbose for a reason (flexibility). Just wrapping
them (with a few added extras) doesn't really add much,
and in fact, it just makes things look even more alien (I think of
Jogl for example: it does the job perfectly but it really looks like C
wrapped in Java). If we go the extra length to make a project it
should really be a nice, modern, well-thought-out Java API.

To that extent, do you already have some code? It would be very nice
if we can look at something (including design and architecture
documents) before getting a huge patch bomb hitting the repos a week
after the project is approved.

I'll still likely vote for that, I'm intrigued.

Cheers,
Mario
--
pgp key: http://subkeys.pgp.net/ PGP Key ID: 80F240CF
Fingerprint: BA39 9666 94EC 8B73 27FA  FC7C 4086 63E3 80F2 40CF

Java Champion - Blog: http://neugens.wordpress.com - Twitter: @neugens
Proud GNU Classpath developer: http://www.classpath.org/
OpenJDK: http://openjdk.java.net/projects/caciocavallo/

Please, support open standards:
http://endsoftpatents.org/

Re: CFV: Project Trinity

Volker Simonis
In reply to this post by Karthik Ganesan
On Mon, Apr 24, 2017 at 5:00 PM, Karthik Ganesan
<[hidden email]> wrote:

> I would like to thank Paul Sandoz, Christian Thalinger, Doug Simon, Mario
> Torre and Volker Simonis for their support and the insightful questions.
>
> What we are proposing to do as part of this project is complementary to
> existing efforts that enable offload to GPUs like Sumatra, AparAPI etc.
> These existing projects provide implementations translating existing Java
> API via Bytecodes to GPU language. Trinity extends these efforts and takes
> it one step further by readily providing the building blocks for programmers
> to construct complex bulk data/stream based algorithms in Java that can be
> easily offloaded by these existing projects. Having a route to offload
> to hardware accelerators is useful, but making it easier for programmers to
> leverage will take it one step closer to adoption.
>
> Projects like Sumatra and AparAPI use the Stream forEach() method to
> showcase offloads. Trinity will offer more such methods with richer
> functionality, making it easier for these existing projects to leverage and
> deliver hardware capabilities to be readily consumed by programmers. Unlike
> the existing Streams API, the library for this new API is envisioned to have
> a stronger focus on performance, a dedicated implementation that will be
> offload friendly and cover more functions that are relevant to this domain
> of programmers.
>
> Also, please note that Trinity casts a wider net when it comes to
> accelerators, not just GPUs/APUs. These accelerators can include Analytics
> accelerators like DAX, SIMD units on general purpose cores, FPGA based
> accelerators for bulk aggregate operations, GPUs and whatever more the
> future holds in terms of heterogeneous computing for bulk data processing.
>
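
For context, the forEach()-style pipeline mentioned in the quoted text, next to an aggregate pipeline of the kind the proposal talks about, might look roughly like this in plain java.util.stream. This is ordinary JDK code that runs on the CPU; nothing here is Trinity, Sumatra or AparAPI API, and the class and method names are made up for illustration.

```java
import java.util.stream.IntStream;

// Illustrative only: plain JDK Streams, no offload involved.
final class ForEachVsAggregate {

    // Element-wise kernel written as a parallel forEach over an index range.
    // Each index writes a distinct slot of 'out', so the parallel writes do not conflict.
    static int[] addElementwise(int[] a, int[] b) {
        int[] out = new int[a.length];
        IntStream.range(0, a.length).parallel().forEach(i -> out[i] = a[i] + b[i]);
        return out;
    }

    // Bulk aggregate: filter followed by a reduction, widened to long to avoid overflow.
    static long sumOfEvens(int[] data) {
        return IntStream.of(data).parallel().filter(v -> v % 2 == 0).asLongStream().sum();
    }

    public static void main(String[] args) {
        int[] sums = addElementwise(new int[] {1, 2, 3}, new int[] {10, 20, 30});
        System.out.println(java.util.Arrays.toString(sums));
        System.out.println(sumOfEvens(new int[] {1, 2, 3, 4}));
    }
}
```

The first method is the "one lambda per element" shape; the second is a filter-then-reduce aggregate, which is closer to the analytics operations the proposal describes.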

This certainly sounds very ambitious! I'm not an expert in this area,
but I don't think there's even a good C/C++ API which covers this
broad range of "accelerators". What we should certainly avoid is
providing an API which only works with accelerator XXX of vendor YYY.
If the goal of this project is to eventually provide a standard Java
API, it should at least support a wide range of available
"accelerators" which, to repeat myself, makes it quite ambitious.

That said, how is this new library supposed to work? Will it be mainly
implemented in Java with various native C/C++ back-ends or do you plan
to still use VM (aka. HotSpot) support via intrinsics and various
other sorts of JIT compiler optimizations? As far as I understand now,
you plan to go for the first approach without HotSpot support,
right? In that case, Trinity would certainly be a good candidate for
using the new JNI work done in Project Panama.

> Inspired by the existing Streams API that brings succinct functional
> programming to Java using lambdas, this project will try to retain such rich
> features, significantly simplifying programming in Java for the performance
> oriented developers focusing on bulk data processing.
>
> Regards,
>
> Karthik

Re: CFV: Project Trinity

Karthik Ganesan
In reply to this post by Mario Torre-5

Hi Mario, please see comments inline.
On 4/24/2017 11:35 AM, Mario Torre wrote:
> I still have an old PS3 with Linux, I'm looking forward to trying that
> out :)
> I think the project idea is interesting, I personally would not mind
> if it duplicated some efforts from the existing libraries, as long as
> the goal is to have a generic framework within the JDK. The challenge
> is that such a framework should work even if no specialised hardware is
> available.
Our goal is also to ensure that such a framework will work even if no
specialized hardware is available.
>
> Is "Trinity" going to be something akin to the various provider-based
> subsystems (like the Sound, the Filesystem, JavaScript, etc...), with
> multiple possible backends, maybe using one of the already mentioned
> projects?
Yes, and I think that is a reasonable analogy.
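
To make the provider analogy concrete, here is a minimal sketch of backend selection with a pure-Java fallback when no specialised hardware is present. All type and method names below are hypothetical and not taken from the proposal. A real JDK subsystem would more likely discover providers through java.util.ServiceLoader; a plain list is used here only to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical backend contract; not taken from the Trinity proposal.
interface SumBackend {
    String name();
    boolean isAvailable();   // a real backend would probe for its device or driver here
    long sum(int[] data);
}

// Stand-in for an accelerator backend whose hardware is absent on this machine.
final class AbsentAcceleratorBackend implements SumBackend {
    public String name() { return "absent-accelerator"; }
    public boolean isAvailable() { return false; }
    public long sum(int[] data) { throw new UnsupportedOperationException("no device"); }
}

// Pure-Java fallback that is always available.
final class CpuBackend implements SumBackend {
    public String name() { return "cpu"; }
    public boolean isAvailable() { return true; }
    public long sum(int[] data) { return Arrays.stream(data).asLongStream().sum(); }
}

final class Backends {
    // Returns the first available candidate, or the CPU fallback if none is available.
    static SumBackend select(List<SumBackend> candidates) {
        for (SumBackend b : candidates) {
            if (b.isAvailable()) {
                return b;
            }
        }
        return new CpuBackend();
    }

    public static void main(String[] args) {
        List<SumBackend> candidates =
                Arrays.<SumBackend>asList(new AbsentAcceleratorBackend(), new CpuBackend());
        SumBackend chosen = select(candidates);
        System.out.println(chosen.name() + " -> " + chosen.sum(new int[] {1, 2, 3}));
    }
}
```

The caller only ever sees SumBackend, so the same program runs whether or not an accelerator is installed.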
> To that extent, do you already have some code? It would be very nice
> if we can look at something (including design and architecture
> documents) before getting a huge patch bomb hitting the repos a week
> after the project is approved.
Though we have done multiple prototypes trying to change the existing
Streams library to open up to accelerators and extend to cover a wider
range of analytic operations, most of the useful information we bring to
this project is our learning about the challenges involved rather than a
code patch bomb. :-)
>
> I'll still likely vote for that, I'm intrigued.
Appreciate that.

Thanks,
Karthik
>
> Cheers,
> Mario


Re: CFV: Project Trinity

Karthik Ganesan
In reply to this post by Volker Simonis
Hi Volker,

On 4/24/2017 12:30 PM, Volker Simonis wrote:
> This certainly sounds very ambitious! I'm not an expert in this area,
> but I don't think there's even a good C/C++ API which covers this
> broad range of "accelerators".
It may look ambitious, but if we restrict ourselves to a particular
domain of bulk data processing and look at this library as a
domain-specific Java library that offers a standard interface backend to
multiple accelerators, it is still a plausible goal to achieve. Such an
API will have the best chance for adoption in the "portable" Java world.

> What we should certainly avoid is
> providing an API which only works with accelerator XXX of vendor YYY.
Indeed, portability will be a key design goal.
> If the goal of this project is to eventually provide a standard Java
> API, it should at least support a wide range of available
> "accelerators" which, to repeat myself, makes it quite ambitious.
The biggest of the problems with accelerator offload is detection of
code patterns that are suitable to be translated to the accelerator.
With a dedicated API, and operations in a specific domain, we
significantly simplify this problem. Based on some prototyping work we
have done using DAX, we have clearly seen the merits of this approach.
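
As a rough illustration of why a dedicated API simplifies pattern detection: with today's Streams, a filter is an opaque lambda, while a domain-specific operation can carry its parameters as plain data that a library could inspect and hand to a backend. The sketch below is an assumption about what such an operation could look like; GreaterThan and both count methods are hypothetical names, and everything runs on the CPU.

```java
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

// Hypothetical declarative filter meaning "value > threshold".
// The threshold is visible as data, so a library could map it to a device compare operation.
final class GreaterThan {
    final int threshold;

    GreaterThan(int threshold) {
        this.threshold = threshold;
    }

    // Equivalent opaque form, as the existing Streams API would receive it.
    IntPredicate asPredicate() {
        return v -> v > threshold;
    }
}

final class DeclarativeFilterDemo {

    // Opaque path: the library only sees an IntPredicate and cannot tell what it computes
    // without analysing the lambda's bytecode.
    static long countOpaque(int[] data, IntPredicate predicate) {
        return IntStream.of(data).filter(predicate).count();
    }

    // Declarative path: the operation and its operand are known up front.
    // A backend could be chosen here; this sketch simply evaluates it in a loop.
    static long countDeclarative(int[] data, GreaterThan filter) {
        long matches = 0;
        for (int v : data) {
            if (v > filter.threshold) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        int[] data = {5, 12, 7, 30};
        GreaterThan gt10 = new GreaterThan(10);
        System.out.println(countOpaque(data, gt10.asPredicate()));
        System.out.println(countDeclarative(data, gt10));
    }
}
```

Both paths return the same count; the difference is only in how much the library knows about the operation before running it.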

We are just signing up to provide a better interface/implementation than
what the current Streams API provides for offload and acceleration in
this domain, which is already being used by existing offload-related
projects. We are not targeting any dynamic code generation for these
backends, which would be redundant given existing projects like
Sumatra and AparAPI.
>
> That said, how is this new library supposed to work? Will it be mainly
> implemented in Java with various native C/C++ back-ends or do you plan
> to still use VM (aka. HotSpot) support via intrinsics and various
> other sorts of JIT compiler optimizations?
It is something that we would like to explore as part of this project
with input from previous and ongoing offload-related projects. Based on
some of our initial experiments with both intrinsics and JNI, it does
not have to be one or the other and we are open to both.  Especially
with the artifacts offered by Panama, this will be very interesting to
explore further.

Thanks,
Karthik



Re: CFV: Project Trinity

Andrew Haley
On 24/04/17 20:54, Karthik Ganesan wrote:

> On 4/24/2017 12:30 PM, Volker Simonis wrote:
>
>> This certainly sounds very ambitious! I'm not an expert in this
>> area, but I don't think there's even a good C/C++ API which covers
>> this broad range of "accelerators".
>
> It may look ambitious, but if we restrict ourselves to a particular
> domain of bulk data processing and look at this library as a
> domain-specific Java library that offers a standard interface backend to
> multiple accelerators, it is still a plausible goal to achieve. Such an
> API will have the best chance for adoption in the "portable" Java world.

I'm rather troubled by this.  It means that we'll have a "high level"
streams approach and a "bulk data" approach, presumably with different
data structures and APIs.  It means that a programmer coming to a
problem will have to choose between these two approaches, because one
doesn't meet performance targets.

Is it the case that we cannot use the fairly new Java streaming
operations to handle bulk data accelerators?

Andrew.

Re: CFV: Project Trinity

mark.reinhold
In reply to this post by Karthik Ganesan
2017/4/24 12:54:54 -0700, [hidden email]:
> ...
>
> It may look ambitious, but if we restrict ourselves to a particular
> domain of bulk data processing and look at this library as a
> domain-specific Java library that offers a standard interface backend to
> multiple accelerators, it is still a plausible goal to achieve. Such an
> API will have the best chance for adoption in the "portable" Java world.

Aside from the potential overlap with existing Projects, which others
have already pointed out, what I don't yet see here is a strong reason
for why this needs to be an OpenJDK Project.

The Sumatra, Panama, and Valhalla Projects are exploring changes that
are intimately tied to the JDK.  Trinity, as far as I understand it,
proposes to build a library on top of the JDK.  The OpenJDK Community
aims (per the Bylaws [1]) to foster the development of "implementations
of present and future versions of the Java Platform, Standard Edition,
as defined by the Java Community Process, and ... closely-related
projects."  Trinity may, in the end, prove to be a huge success, but
if it's not intimately tied to the JDK then why does it need to be
developed in an OpenJDK Project?  Why can't this work be done on GitHub
or Bitbucket?

- Mark


[1] http://openjdk.java.net/bylaws

Re: CFV: Project Trinity

Karthik Ganesan
In reply to this post by Andrew Haley
Hi Andrew,

> Is it the case that we cannot use the fairly new Java streaming
> operations to handle bulk data accelerators?
The decision of whether to enhance existing Streams library for offload
to accelerators or create a new one is also part of the exploration
proposed by this project.

Thanks,
Karthik
>
> Andrew.


Re: CFV: Project Trinity

Karthik Ganesan
In reply to this post by mark.reinhold
Hi Mark,

>
> The Sumatra, Panama, and Valhalla Projects are exploring changes that
> are intimately tied to the JDK.  Trinity, as far as I understand it,
> proposes to build a library on top of the JDK.  The OpenJDK Community
> aims (per the Bylaws [1]) to foster the development of "implementations
> of present and future versions of the Java Platform, Standard Edition,
> as defined by the Java Community Process, and ... closely-related
> projects."  Trinity may, in the end, prove to be a huge success, but
> if it's not intimately tied to the JDK then why does it need to be
> developed in an OpenJDK Project?
Enhancing the existing Streams library as a possible direction was never
off the table for the proposed project. I am sorry if that was not clear
from the proposal. Based on our initial evaluation using multiple
prototypes, solving some problems, like cracking open lambdas, is very
important to enable offload to accelerators, and a solution may indeed
need a close tie to the JDK.

I would like to thank everyone who took the time to provide a lot of
good feedback for this proposal. We will reevaluate some of the
logistics related to this effort using all this input.

Regards,
Karthik
> Why can't this work be done on GitHub
> or Bitbucket?
>
> - Mark
>
>
> [1] http://openjdk.java.net/bylaws