
Project Proposal: Trinity


Project Proposal: Trinity

Karthik Ganesan
Hi,

I would like to propose the creation of a new Project: Project Trinity.

This Project would explore enhanced execution of bulk aggregate
calculations over Streams through offloading calculations to hardware
accelerators.

Streams allow developers to express calculations such that data
parallelism can be efficiently exploited. Such calculations are prime
candidates for leveraging enhanced data-oriented instructions on CPUs
(such as SIMD instructions) or offloading to hardware accelerators (such
as the SPARC Data Accelerator co-processor, further referred to as DAX [1]).
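For illustration, a bulk aggregate calculation of this kind can be expressed today with the existing java.util.stream API (this is plain Java, not a Trinity API; it simply shows the pipeline shape that would be a candidate for acceleration):

```java
import java.util.stream.IntStream;

public class BulkAggregateDemo {

    /** Sum of the squares of the even numbers in [1, n]: a stateless
     *  filter/map/reduce pipeline with no loop-carried dependencies,
     *  and hence the kind of calculation that could, in principle, be
     *  mapped to SIMD instructions or offloaded to an accelerator. */
    static long sumOfEvenSquares(int n) {
        return IntStream.rangeClosed(1, n)
                .filter(i -> i % 2 == 0)
                .mapToLong(i -> (long) i * i)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfEvenSquares(100)); // prints 171700
    }
}
```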

To identify a path to improving performance and power efficiency,
Project Trinity will explore how libraries like Streams can be enhanced
to leverage data processing hardware features to execute Streams more
efficiently.

Directions for exploration include:
- Building a streams-like library optimized for offload to
-- hardware accelerators (such as DAX), or
-- a GPU, or
-- SIMD instructions;
- Optimizations in the Graal compiler to automatically transform
suitable Streams pipelines, taking advantage of data processing hardware
features;
- Explorations with Project Valhalla to expand the range of effective
acceleration to Streams of value types.

Success will be evaluated based upon:
(1) speedups and resource efficiency gains achieved for a broad range of
representative streams calculations under offload,
(2) ease of use of the hardware acceleration capability, and
(3) ensuring that there is no time or space overhead for non-accelerated
calculations.

May I please request the support of the Core Libraries Group as the
Sponsoring Group, with myself as the Project Lead?

Warm Regards,
Karthik Ganesan

[1] https://community.oracle.com/docs/DOC-994842


Re: Project Proposal: Trinity

Volker Simonis
Hi Karthik,

we had Project "Sumatra" [1] for this, which has been inactive for
quite some time. We also have Project "Panama" [2] which, as far as I
understand, is also looking into auto-parallelization/vectorization.
See, for example, the "Vectors for Java" presentation from JavaOne [3],
which describes some ideas very similar to yours.

What justifies the creation of yet another project instead of doing
this work in the context of the existing projects? How is your approach
different from the one described in [3], which is already, at least
partially, implemented in Project Panama?

Thanks,
Volker

[1] http://openjdk.java.net/projects/sumatra/
[2] http://openjdk.java.net/projects/panama/
[3] http://cr.openjdk.java.net/~psandoz/conferences/2016-JavaOne/j1-2016-vectors-for-java-CON1560.pdf


Re: Project Proposal: Trinity

Karthik Ganesan
Hi Volker,

Thanks for your comments and the relevant questions. We have reviewed
projects Sumatra and Panama and talked to members who are familiar with
the projects.

Project Sumatra aimed to translate Java bytecode for execution on a
GPU, which was an ambitious goal and a challenging task to take up. In
this project, we instead aim to provide APIs targeting the most common
analytics operations that can be readily offloaded to accelerators
transparently. Most of the information needed for offload to the
accelerator is expected to be conveyed directly by the API semantics,
thereby avoiding the need for tedious bytecode analysis.

While the Vector API (part of Panama) brings much-needed abstraction
for vectors, it is still loop-based and is most useful for
superword-style operations that leverage the SIMD units on
general-purpose cores. The aim of this proposed project is to provide a
more abstract API (similar to the Streams API) that works directly on
streams of data and transparently accommodates a wider set of
heterogeneous accelerators, such as DAX and GPUs, underneath.
Initially, the project will focus on defining a complete set of APIs,
relevant input/output formats, and optimized data structures and
storage formats that can be used as building blocks for
high-performance analytics applications and frameworks in Java. Simple
examples of such operations include scan, select, filter, lookup,
transcode, merge, and sort. Additionally, the project will require
further functionality, such as operating-system library calls and
handling garbage-collection needs during offload.
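As a rough illustration of the kind of API envisioned here (all names below are hypothetical, not any actual Trinity or libdax interface), a scan/select pair over a primitive column might be sketched as follows, with a scalar CPU loop standing in for what an accelerator backend would provide:

```java
import java.util.Arrays;

// Hypothetical sketch only: ColumnOps, scanGreaterThan, and select are
// illustrative names, not a real Trinity or libdax API. A scalar CPU
// fallback stands in for the accelerator implementation.
public class ColumnOps {

    /** Bulk "scan": produce a mask of elements exceeding a threshold. */
    static boolean[] scanGreaterThan(int[] data, int threshold) {
        boolean[] mask = new boolean[data.length];
        for (int i = 0; i < data.length; i++) {
            mask[i] = data[i] > threshold;
        }
        return mask;
    }

    /** Bulk "select": keep the elements whose mask bit is set. */
    static int[] select(int[] data, boolean[] mask) {
        int n = 0;
        for (boolean b : mask) if (b) n++;
        int[] out = new int[n];
        int j = 0;
        for (int i = 0; i < data.length; i++) {
            if (mask[i]) out[j++] = data[i];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] column = {5, 12, 7, 30, 2};
        boolean[] mask = scanGreaterThan(column, 6); // offloadable scan
        int[] hits = select(column, mask);           // offloadable select
        System.out.println(Arrays.toString(hits));   // prints [12, 7, 30]
    }
}
```

The point of the sketch is that the operation's semantics (a whole-column comparison, a whole-column compaction) are fully visible at the API boundary, so a backend could dispatch them to DAX or a GPU without analyzing user bytecode.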

The artifacts provided by Project Panama, including the machine code
snippets (or even the Vector API), along with value types from Project
Valhalla, will be useful wherever they are applicable in this project.
Overall, I feel that the goals of this project and the work needed are
different from what the Vector API is targeting. I hope this answers
your question.

Regards,

Karthik



Re: Project Proposal: Trinity

Volker Simonis
Hi Karthik,

thanks a lot for your quick answer and the detailed description of the
project's goals and its relation to the other projects. This all sounds
reasonable and very interesting! I hope I'll find some time to take a
deeper look at your project :)

Wish you all the best,
Volker



Re: Project Proposal: Trinity

Vladimir Ivanov
In reply to this post by Karthik Ganesan
Thanks for the clarifications, Karthik.

Have you considered conducting the experiments in Project Panama?

IMO the goals you stated fit Panama really well. There are different
directions being explored at the moment, and most of them are relevant
to the project you propose.

Though current explorations in the Vector API are primarily focused on
SIMD support, there are no inherent barriers to extending the scope to
GPUs and special-purpose HW accelerators. That doesn't mean this work
should be part of the Vector API effort, but both projects should
benefit from such collaboration.

During your explorations you can rely on the new FFI, foreign data
layout support, machine code snippets, and value types (since these are
important for Panama, there will be regular syncs between the projects
and the engineering costs can be amortized).

Overall, I expect all participants to benefit from such synergy.

Best regards,
Vladimir Ivanov


Re: Project Proposal: Trinity

Karthik Ganesan
Hi Vladimir,

Thanks for the support. I certainly agree about the value of
collaboration between the ongoing Vector API efforts and this project
going forward. I believe the tools that will be provided by Panama and
Valhalla have set the stage for exploring something like Trinity, which
may consume some of these artifacts and/or improve and expand them over
the course of the project. It would be great if members familiar with
the Vector API could participate in the initial discussions of this
project and help steer the relevant aspects of Trinity in the right
direction.

Regards,
Karthik



Re: Project Proposal: Trinity

Paul Sandoz
In reply to this post by Karthik Ganesan
Hi Karthik,

Thanks for sending this. Some thoughts.

I can see a number of DAX API focused explorations here:

1) A DAX-specific API bound to libdax using JNI
2) A DAX-specific API bound to libdax using Panama
3) A DAX-like API leveraging technologies in either 1) or 2)

Each may allow one to get the most out of a DAX accelerator.

I think 2) and 3) are complementary to efforts in Panama.

3) is where alternative implementations leveraging SIMDs and GPUs might also be a good fit.

As one goes further down the abstraction road things get a little fuzzier and there may be duplication; IMO we should be vigilant and consider consolidating particular aspects in such cases.

And as one goes further down the abstraction road, to, say, java.util.stream.Stream, I believe the problem gets much harder. The set of valid j.u.stream.Stream pipelines that might map to DAX operations, and further might map efficiently, is likely to be quite small, and the performance model unclear to the developer. Cracking lambdas is certainly not easy, and is likely to be costly as well. To some extent Project Sumatra ran into such difficulties, although I think in your case the problem is a little easier than the one Sumatra was trying to solve. Still, it's not easy to detect and translate appropriate j.u.stream.Stream pipelines into another form.
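To make that concern concrete, here is a small sketch (my own illustration of the pipeline shapes in question, not any definitive mapping) contrasting a pipeline whose shape could plausibly be recognized and offloaded with one that could not:

```java
import java.util.stream.IntStream;

public class PipelineShapes {

    /** A pure, stateless filter/count over primitive data: the kind of
     *  narrow pipeline shape that might plausibly be recognized and
     *  mapped to a DAX-style scan operation. */
    static long countAbove(int[] data, int threshold) {
        return IntStream.of(data).filter(i -> i > threshold).count();
    }

    public static void main(String[] args) {
        int[] data = {3, 9, 4, 11, 6};
        System.out.println(countAbove(data, 5)); // prints 3

        // By contrast, a lambda that captures and mutates external
        // state has no bulk-hardware equivalent; detecting and
        // rejecting such shapes is part of the cost of "cracking"
        // lambdas mentioned above.
        int[] counter = new int[1];
        long orderDependent = IntStream.of(data)
                .filter(i -> ++counter[0] % 2 == 0)
                .count();
        System.out.println(orderDependent);
    }
}
```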

As I understand it, DAX provides a number of fairly simple bulk transformation operations over arrays of data, with some flexibility in the element layout of that data. Focusing an API on those operations and layouts is likely to be a more tractable problem. That might include off-heap memory with compatible Panama layouts, or on-heap memory somehow compatible with layouts for simple value types. Cue hand-waving :-) but in the spirit of 3) this might be the sweet spot.

Paul.



> On 14 Nov 2016, at 08:23, Karthik Ganesan <[hidden email]> wrote:
>
> Hi,
>
> I would like to propose the creation of a new Project: Project Trinity.
>
> This Project would explore enhanced execution of bulk aggregate calculations over Streams through offloading calculations to hardware accelerators.
>
> Streams allow developers to express calculations such that data parallelism can be efficiently exploited. Such calculations are prime candidates for leveraging enhanced data-oriented instructions on CPUs (such as SIMD instructions) or offloading to hardware accelerators (such as the SPARC Data Accelerator co-processor, further referred to as DAX [1]).
>
> To identify a path to improving performance and power efficiency, Project Trinity will explore how libraries like Streams can be enhanced to leverage data processing hardware features to execute Streams more efficiently.
>
> Directions for exploration include:
> - Building a streams-like library optimized for offload to
> -- hardware accelerators (such as DAX), or
> -- a GPU, or
> -- SIMD instructions;
> - Optimizations in the Graal compiler to automatically transform suitable Streams pipelines, taking advantage of data processing hardware features;
> - Explorations with Project Valhalla to expand the range of effective acceleration to Streams of value types.
>
> Success will be evaluated based upon:
> (1) speedups and resource efficiency gains achieved for a broad range of representative streams calculations under offload,
> (2) ease of use of the hardware acceleration capability, and
> (3) ensuring that there is no time or space overhead for non-accelerated calculations.
>
> Can I please request the support of the Core Libraries Group as the Sponsoring Group with myself as the Project Lead.
>
> Warm Regards,
> Karthik Ganesan
>
> [1] https://community.oracle.com/docs/DOC-994842
>
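For concreteness, the kind of bulk aggregate Streams calculation the proposal above targets might look like the following; it already expresses data parallelism on stock JDKs, and the project would explore executing such shapes on accelerators instead.

```java
import java.util.stream.IntStream;

// A data-parallel bulk aggregate expressed as a Stream pipeline: sum the
// even numbers in 1..100. The runtime may split the work across cores;
// an accelerator backend could in principle take it over wholesale.
public class BulkAggregate {
    public static void main(String[] args) {
        long sumOfEvens = IntStream.rangeClosed(1, 100)
                                   .parallel()              // data parallelism the runtime may exploit
                                   .filter(v -> v % 2 == 0) // candidate for a bulk predicate scan
                                   .asLongStream()
                                   .sum();                  // candidate for a bulk reduction
        System.out.println(sumOfEvens); // 2550
    }
}
```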


Re: Project Proposal: Trinity

Karthik Ganesan
Hi Paul,

Thanks for the well-thought-out comments and suggestions. Overall, the
suggested directions sound reasonable to me as a good starting point for
the project team to explore further.

Regards,
Karthik

On 16-11-21 06:11 PM, Paul Sandoz wrote:

> Hi Karthik,
>
> Thanks for sending this. Some thoughts.
>
> I can see a number of DAX API focused explorations here:
>
> 1) A DAX-specific API bound to libdax using JNI
> 2) A DAX-specific API bound to libdax using Panama
> 3) A DAX-like API leveraging technologies in either 1) or 2)
>
> Each may allow one to get the most out of a DAX accelerator.
>
> I think 2) and 3) are complementary to efforts in Panama.
>
> 3) is where alternative implementations leveraging SIMD instructions and GPUs might also be a good fit.
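One way to read option 3) is as an engine interface whose backends (JNI-bound libdax, Panama-bound libdax, SIMD, GPU) are swappable behind a shared contract. The names below (`SelectionEngine`, `countEquals`, `ScalarSelectionEngine`) are invented for illustration; this is a sketch, not a proposed API.

```java
// Sketch of a DAX-like API with pluggable backends. Only a plain-Java
// fallback is shown; a JNI-, Panama-, SIMD- or GPU-backed engine would
// implement the same interface.
interface SelectionEngine {
    // Count elements equal to the key -- one of the simple bulk operations
    // an accelerator backend could take over.
    long countEquals(int[] data, int key);
}

// Scalar reference implementation; doubles as the non-accelerated fallback.
final class ScalarSelectionEngine implements SelectionEngine {
    @Override
    public long countEquals(int[] data, int key) {
        long count = 0;
        for (int v : data) {
            if (v == key) count++;
        }
        return count;
    }
}

public class EngineDemo {
    public static void main(String[] args) {
        SelectionEngine engine = new ScalarSelectionEngine();
        System.out.println(engine.countEquals(new int[]{5, 2, 5, 5, 1}, 5)); // 3
    }
}
```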
>
> As one goes further down the abstraction road it gets a little fuzzier, and there may be duplication; IMO we should be vigilant and consider consolidating particular aspects in such cases.
>
> And as one goes further down the abstraction road, to say java.util.stream.Stream, I believe the problem gets much harder. The set of valid j.u.stream.Stream pipelines that might map to DAX operations, and further might map efficiently, is likely to be quite small, and the performance model unclear to the developer. Cracking lambdas is certainly not easy, and is likely to be costly as well. To some extent Project Sumatra ran into such difficulties, although I think in your case the problem is a little easier than the one Sumatra was trying to solve. Still, it's not easy to detect and translate appropriate j.u.stream.Stream pipelines into another form.
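For concreteness, the narrow pipeline shape under discussion might look like this: even a stateless filter feeding a count requires the runtime to crack the lambda before the pipeline could be recognized and mapped to a bulk scan.

```java
import java.util.stream.IntStream;

// About the simplest j.u.stream.Stream pipeline that could conceivably map
// to DAX-style bulk operations: a stateless predicate over primitive data
// feeding a count. Recognizing even this shape automatically is nontrivial.
public class PipelineShape {
    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        long hits = IntStream.of(data)
                             .filter(v -> v > 15) // candidate for a bulk "scan"
                             .count();            // candidate for counting the selection
        System.out.println(hits); // 3
    }
}
```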
>
> As I understand it, DAX provides a number of fairly simple bulk transformation operations over arrays of data, with some flexibility in the element layout of that data. Focusing an API on those operations and layouts is likely to be a more tractable problem. That might include off-heap memory with compatible Panama layouts, or on-heap memory somehow compatible with layouts for simple value types. Cue hand-waving :-) but in the spirit of 3) this might be the sweet spot.
>
> Paul.
>
>
>
>> On 14 Nov 2016, at 08:23, Karthik Ganesan <[hidden email]> wrote:
>> [...]


Re: Project Proposal: Trinity

Thomas Wuerthinger
Karthik,

On your point below about automatically transforming suitable Java code via the Graal compiler for execution with DAX: the Graal team is very interested in assisting you in exploring this area. It could also make DAX more widely applicable to third-party Java data processing libraries.

- thomas


> On 23 Nov 2016, at 21:43, Karthik Ganesan <[hidden email]> wrote:
> [...]


Re: Project Proposal: Trinity

Karthik Ganesan
Hi Thomas,

Thank you for the support and I look forward to the collaboration.

Regards,
Karthik

On 16-11-25 05:07 PM, Thomas Wuerthinger wrote:

> [...]
