Rendering images from PDF files slower in OpenJDK

Daniel Persson
Hi everyone,

We render a lot of images with PDFBox on Java 1.8.0 and want to upgrade to the current OpenJDK 11, but sadly we see a performance degradation when switching over to OpenJDK. Does anyone have a suggestion to remedy this issue, or can anyone explain why it is slower?

Using the current release of the PDFBox app, downloadable from
http://www-us.apache.org/dist/pdfbox/2.0.11/pdfbox-app-2.0.11.jar

Running the command
java -jar pdfbox-app-2.0.11.jar PDFToImage -time test.pdf

We see the following result

---------------------------------------------------------
java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
Rendered 1 page in 2762ms
---------------------------------------------------------
openjdk version "9.0.4"
OpenJDK Runtime Environment (build 9.0.4+11)
OpenJDK 64-Bit Server VM (build 9.0.4+11, mixed mode)
Rendered 1 page in 8034ms
---------------------------------------------------------
openjdk version "10.0.2" 2018-07-17
OpenJDK Runtime Environment 18.3 (build 10.0.2+13)
OpenJDK 64-Bit Server VM 18.3 (build 10.0.2+13, mixed mode)
Rendered 1 page in 4255ms
---------------------------------------------------------
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
Rendered 1 page in 4275ms
---------------------------------------------------------
openjdk version "12-ea" 2019-03-19
OpenJDK Runtime Environment 19.3 (build 12-ea+11)
OpenJDK 64-Bit Server VM 19.3 (build 12-ea+11, mixed mode)
Rendered 1 page in 4399ms
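
To put the timings above on a common scale, the slowdown of each release relative to the Java 8 run can be computed directly. The following is only an illustrative sketch: the class and method names are made up for this example, and the inputs are the single-run figures reported above, so small differences between 10, 11 and 12-ea should not be over-interpreted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SlowdownRatios {

    // Render times in milliseconds, copied from the results listed above.
    public static Map<String, Integer> reportedMillis() {
        Map<String, Integer> times = new LinkedHashMap<>();
        times.put("1.8.0_181", 2762);
        times.put("9.0.4", 8034);
        times.put("10.0.2", 4255);
        times.put("11", 4275);
        times.put("12-ea", 4399);
        return times;
    }

    // Slowdown factor of one run relative to a baseline run.
    public static double ratio(int millis, int baselineMillis) {
        return (double) millis / baselineMillis;
    }

    public static void main(String[] args) {
        Map<String, Integer> times = reportedMillis();
        int baseline = times.get("1.8.0_181");
        for (Map.Entry<String, Integer> e : times.entrySet()) {
            System.out.printf("%-10s %5d ms  %.2fx%n",
                    e.getKey(), e.getValue(), ratio(e.getValue(), baseline));
        }
    }
}
```

On these numbers Java 9 is roughly 2.9 times slower than Java 8, while 10, 11 and 12-ea are all between 1.5 and 1.6 times slower.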

The PDF file used in this example can be downloaded from
https://drive.google.com/file/d/139wP6PDmmQ6KBTyeJTETIrplSuOUgFfG/view?usp=sharing

Best regards
Daniel

Re: Rendering images from PDF files slower in OpenJDK

Philip Race
Multiple pieces are changing across these releases.

Is it the JPEG writing? Is it FreeType vs. T2K (font performance),
HarfBuzz vs. ICU (text layout), or Marlin vs. Ductus (rasterization)?

So it is very hard to say with any certainty what the cause of the
difference is, or why 10 got so much better than 9, even if it is
still not back to JDK 8 levels.

Please file a bug at java.com.

-phil.

On 09/25/2018 10:42 PM, Daniel Persson wrote:

> Hi everyone,
>
> We render a lot of images with PDFBox with Java 1.8.0 and we want to
> upgrade to the current OpenJDK 11 but sadly we see some performance
> degradation switching over to OpenJDK. Anyone have a suggestion to
> remedy this issue, or can explain why it is slower?
>
> Using the PDFBox app current release downloadable from
> http://www-us.apache.org/dist/pdfbox/2.0.11/pdfbox-app-2.0.11.jar
>
> Running the command
> java -jar pdfbox-app-2.0.11.jar PDFToImage -time test.pdf
>
> We see the following result
>
> ---------------------------------------------------------
> java version "1.8.0_181"
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
> Rendered 1 page in 2762ms
> ---------------------------------------------------------
> openjdk version "9.0.4"
> OpenJDK Runtime Environment (build 9.0.4+11)
> OpenJDK 64-Bit Server VM (build 9.0.4+11, mixed mode)
> Rendered 1 page in 8034ms
> ---------------------------------------------------------
> openjdk version "10.0.2" 2018-07-17
> OpenJDK Runtime Environment 18.3 (build 10.0.2+13)
> OpenJDK 64-Bit Server VM 18.3 (build 10.0.2+13, mixed mode)
> Rendered 1 page in 4255ms
> ---------------------------------------------------------
> openjdk version "11" 2018-09-25
> OpenJDK Runtime Environment 18.9 (build 11+28)
> OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
> Rendered 1 page in 4275ms
> ---------------------------------------------------------
> openjdk version "12-ea" 2019-03-19
> OpenJDK Runtime Environment 19.3 (build 12-ea+11)
> OpenJDK 64-Bit Server VM 19.3 (build 12-ea+11, mixed mode)
> Rendered 1 page in 4399ms
>
> The pdf file used in this example can be downloaded from
> https://drive.google.com/file/d/139wP6PDmmQ6KBTyeJTETIrplSuOUgFfG/view?usp=sharing
>
> Best regards
> Daniel


Re: Rendering images from PDF files slower in OpenJDK

Daniel Persson
Hi Phil

The PDFBox team told me it could have something to do with color mapping.

My quick profiling shows that the code spends 29% of its time inside java.awt.image.ColorConvertOp.filter on Java 11.

But I'll open an issue.

Best regards
Daniel


Re: Rendering images from PDF files slower in OpenJDK

Philip Race
Interesting, and I assume that it was somewhat less in JDK 8u?
Off the top of my head, that is one thing that didn't change in any big way since JDK 8u.

Perhaps something has changed so that it is now [considered] needed whereas before
it was not? So did it go from zero percent to 29%, or from 10% to 29%?

But even that doesn't on its own account for everything.
29% of 8 seconds would be about 2.5 seconds and doesn't explain going from
< 3 seconds to 8 seconds; we are still missing at least 2.5 seconds.


-phil.
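
The estimate above can be redone with the exact timings from the first post. This is a small illustrative sketch with hypothetical class and method names. Like the estimate itself, it assumes that the 29% share measured on Java 11 also applies to the 8-second Java 9 run, which has not been verified in this thread.

```java
public class UnexplainedTime {

    // Time attributed to the hotspot, given its share of the slow run.
    public static long hotspotMillis(long slowMillis, double hotspotShare) {
        return Math.round(slowMillis * hotspotShare);
    }

    // Part of the slowdown that the hotspot does not account for.
    public static long unexplainedMillis(long slowMillis, long fastMillis, double hotspotShare) {
        long gap = slowMillis - fastMillis;
        return gap - hotspotMillis(slowMillis, hotspotShare);
    }

    public static void main(String[] args) {
        long java8 = 2762;   // ms, reported for 1.8.0_181
        long java9 = 8034;   // ms, reported for 9.0.4
        double share = 0.29; // ColorConvertOp.filter share, measured on Java 11

        System.out.println("gap:         " + (java9 - java8) + " ms");
        System.out.println("hotspot:     " + hotspotMillis(java9, share) + " ms");
        System.out.println("unexplained: " + unexplainedMillis(java9, java8, share) + " ms");
    }
}
```

With these inputs, ColorConvertOp would explain roughly 2.3 of the 5.3 extra seconds, leaving close to 3 seconds unaccounted for.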


Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Hi,

FYI, I will run profilers on this test case to compare Oracle JDK 8 with OpenJDK 11, and will then give you my analysis.

Cheers,
Laurent


Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Hi,
I quickly ran your test with Oracle JDK 8 and OpenJDK 11.
Here are the CPU profiles:
http://cr.openjdk.java.net/~lbourges/pdfbox_profiles/

On JDK8:
java version "1.8.0_192-ea"
Java(TM) SE Runtime Environment (build 1.8.0_192-ea-b04)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b04, mixed mode)
Rendered 1 page in 3375ms

777 ms  org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage
  17.8% - 766 ms  org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit
  12.6% - 541 ms  org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createInputStream
  12.6% - 541 ms  org.apache.pdfbox.pdmodel.common.PDStream.createInputStream

On JDK11:

openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
Rendered 1 page in 4789ms

2,059 ms  org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.getRGBImage
  36.0% - 2,054 ms  org.apache.pdfbox.pdmodel.graphics.image.SampledImageReader.from8bit
  26.2% - 1,495 ms  org.apache.pdfbox.pdmodel.graphics.color.PDDeviceCMYK.toRGBImage
  26.2% - 1,495 ms  org.apache.pdfbox.pdmodel.graphics.color.PDDeviceCMYK.toRGBImageAWT
  26.2% - 1,495 ms  org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.toRGBImageAWT
  26.0% - 1,481 ms  java.awt.image.ColorConvertOp.filter

Actually, PDFBox's SampledImageReader.from8bit() does not take the same
path on JDK 8 and JDK 11, and ColorConvertOp costs 25% (in my case).
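
The figures quoted from the two profiles can be compared directly. The sketch below is illustrative only (the names are made up for this example) and uses nothing beyond the numbers listed above. The profiler percentages appear to be relative to the profiled total rather than to the reported render time, so only the millisecond values are used.

```java
public class ProfileGap {

    // Values in milliseconds, copied from the two profiles above.
    public static final int JDK8_TOTAL = 3375;
    public static final int JDK8_GET_RGB_IMAGE = 777;
    public static final int JDK11_TOTAL = 4789;
    public static final int JDK11_GET_RGB_IMAGE = 2059;
    public static final int JDK11_COLOR_CONVERT_OP = 1481;

    // Difference in reported render time between the two runs.
    public static int totalGap() {
        return JDK11_TOTAL - JDK8_TOTAL;
    }

    // Difference in time spent inside SampledImageReader.getRGBImage.
    public static int getRgbImageGap() {
        return JDK11_GET_RGB_IMAGE - JDK8_GET_RGB_IMAGE;
    }

    public static void main(String[] args) {
        System.out.println("render time gap:     " + totalGap() + " ms");
        System.out.println("getRGBImage gap:     " + getRgbImageGap() + " ms");
        System.out.println("ColorConvertOp (11): " + JDK11_COLOR_CONVERT_OP + " ms");
    }
}
```

In this run the extra time inside getRGBImage (about 1.3 s) is close to the whole difference in render time (about 1.4 s), and ColorConvertOp.filter alone takes about 1.5 s on JDK 11.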

Cheers,
Laurent


Re: Rendering images from PDF files slower in OpenJDK

Philip Race
Thanks for the profiles.

I can't tell, from the profile or even from [just looking at] the pdfbox
source, what might be causing it to behave differently or why
it adds up to being so much slower.

"Full stack" debugging, meaning building pdfbox, seems to be necessary.
So I think it might be best if the pdfbox devs do the initial evaluation.
Else it will wait a long time ...

-phil.


Re: Rendering images from PDF files slower in OpenJDK

Daniel Persson
Hi Phil and Laurent

Let's not compare apples and oranges. From what I can see, it takes the same route and behaves similarly.

If you look at

You can see that ConvertOp.filter takes 1.5s longer on Java 11.

If you want to see the full call trees to check for other possible culprits, you can find them here.

Best regards
Daniel


Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Hi Daniel,

> Let's not compare apples and oranges. From what I can see, it takes the same route and behaves similarly.

I agree, I did not take enough time to get accurate profiles, sorry.

> If you look at
>
> You can see that ConvertOp.filter takes 1.5s longer on Java 11.

I confirm: 1.8s vs 300ms.

Philip, do you know what could have changed in this 2D area?

I imagine ColorConvertOp delegates to native code, so the color profile (ICC) or HiDPI support may have an impact here (or the compiler options may simply be different).

If needed, I could profile native code using oprofile / perf.

Laurent

Re: Rendering images from PDF files slower in OpenJDK

Philip Race
I've spent some time examining what pdfbox is passing to ColorConvertOp.
It is called about 10 or 11 times in this test, with images typically 1-2K in each dimension.
The input image is a custom BufferedImage which uses an ICC_ColorSpace constructed
from a color profile file that is embedded in pdfbox, which is an open source equivalent
of what Acrobat uses. It has a 4-component raster and is opaque.

This is filtered into a 3-component standard INT_RGB ColorModel.

I've distilled this down into a small program which has a copy of the method
that is defined in pdfbox and is invoking the supposedly slow ColorConvertOp.

So I believe this is all exactly what is happening in pdfbox.

What I find is that it is actually much faster on JDK11 than JDK 8.

prrubuntu:~$ ~/jdk-11/bin/java CConv
4881
prrubuntu:~$ ~/jdk8u181/bin/java CConv
12529


I can't say why that would be but the results are clear.
So I am left to suppose that pdfbox really is doing something different in 8 vs 11.
Or that this not the real problem. What do others see ?

I've attached the program. The 1Mb color profile file can be got from the pdfbox sources.

-phil.


CConv.java (1K) Download Attachment
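
The attachment is not reproduced in this archive. Purely as an illustration, a test of the kind described above (an opaque image in a custom ICC color space, filtered into an INT_RGB image by ColorConvertOp) might look like the sketch below. The class name CConvSketch, the loop count, the all-zero pixel data, and the fallback to the built-in linear RGB color space when no profile path is given are assumptions made for this example; they are not taken from CConv.java or from pdfbox. The image size is the 2577x1540 figure mentioned later in the thread.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.WritableRaster;

class CConvSketch {

    // Builds an opaque byte image in the given color space (one band per
    // color component) and filters it into a TYPE_INT_RGB image.
    static BufferedImage convert(ColorSpace cs, int width, int height) {
        ColorModel cm = new ComponentColorModel(
                cs, false, false, Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
        WritableRaster raster = cm.createCompatibleWritableRaster(width, height);
        BufferedImage src = new BufferedImage(cm, raster, false, null);
        BufferedImage dest = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        // Null hints; source and destination color spaces come from the images.
        new ColorConvertOp(null).filter(src, dest);
        return dest;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: args[0], if present, is the path to an ICC profile file
        // (for example the CMYK profile shipped in the pdfbox sources).
        ColorSpace cs = args.length > 0
                ? new ICC_ColorSpace(ICC_Profile.getInstance(args[0]))
                : ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB);

        long start = System.currentTimeMillis();
        for (int i = 0; i < 11; i++) {   // iteration count is an assumption
            convert(cs, 2577, 1540);
        }
        System.out.println(System.currentTimeMillis() - start);
    }
}
```

Running it with the same profile file under each JDK would give numbers comparable in spirit to the ones above, though not necessarily in magnitude, since the real test may differ in pixel data and iteration count.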

Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Phil,

If you look at the given PDF file, it has large images that exceed 2K, so those may be more costly to convert.

As the JPEG decoder in OpenJDK 11 is different from the one in Oracle JDK 8, it may cause more ColorConvertOp filter operations ... if the color profiles are different.

Anyway, this performance issue is not related to the Marlin renderer, so I cannot help much except with diagnosing it.

Cheers,
Laurent


Re: Rendering images from PDF files slower in OpenJDK

Philip Race


On 10/3/18, 1:15 AM, Laurent Bourgès wrote:
> Phil,
>
> If you look at the given pdf file, it has large images that exceed 2k so such ones may be more costly to convert.

FWIW the one I profiled was by far the largest at 2577x1540.
The rest are more like 100x100, 200x200 or 500x500 - all approximations.

> As jpeg decoder in openjdk11 is different than oraclejdk8, it may cause more ColorConvertOp filter operations ... if color profiles are different.

That doesn't seem likely. In fact, since I instrumented ColorConvertOp in 8 & 11, I know exactly how many times it was invoked
by pdfbox (11 times in both cases) and that all the image data is the same. SRC and DEST are the same types, etc.

Also the version of LCMS is the same in 8 and 11 (v2.9).

-phil

> Anyway this performance is not related to Marlin renderer, so I can not help much except in its diagnostic.
>
> Cheers,
> Laurent


Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Very good job, Phil.

I will try your CConv test on my Linux machine to see if it is platform- or hardware-dependent.

Laurent


Re: Rendering images from PDF files slower in OpenJDK

Daniel Persson
Hi Philip and Laurent.

I've talked with Tilman and Andreas from the PDFBox team. They also see a connection to the ColorConvertOp filter, but wanted to try with one of the images from the PDF as a raster.

As we try different things, I thought it would help collaboration to create a repository with the code so everyone can contribute.


I've run the 3 different tests on my machine (ThinkPad P51s) with a custom Gentoo install, in case that matters to the conversation.

I tried to invite you all as collaborators to this repository. If you think this is a bad idea, let me know.

Best regards
Daniel


Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Hi,
I will get the code and add debugging logs: env & system properties and Java 2D RenderingHints.

I suspect these hints are different or have a noticeable impact: color interpolation & rendering quality.

I suppose the backend corresponds to software loops, but can some 2D operations be accelerated?

Anyway, I will push any changes to the code.

PS: I can run Linux perf to profile both Java & native code.

Cheers,
Laurent


Re: Rendering images from PDF files slower in OpenJDK

Philip Race


On 10/03/2018 11:58 PM, Laurent Bourgès wrote:
> Hi,
> I will get the code and add debugging logs: env & system properties and java2d RenderingHints.

The code in pdfbox passes null for the hints. So there should be no difference attributable to that.

-phil.

> I suspect these hints are different or have a noticeable impact: color interpolation & rendering quality.
>
> I suppose the backend corresponds to software loops but some 2d operations can be accelerated ?
>
> Anyway I will push any change in the code.
>
> PS: I can run linux perf to profile both java & native code....
>
> Cheers,
> Laurent


Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Phil,
I wondered if any RenderingHint defaults have changed since 8...

Moreover, I started playing with Linux perf + the JIT agent, and it is easier than before with oprofile + jvmtiagent.

I noticed that Oracle JDK 8 uses KCMS, while OpenJDK 11 uses LCMS for color conversion, as does OpenJDK 8; that could explain the performance gap.

Finally, the PDFToImage test is run only once, so the overhead may come from warm-up (JIT, G1)...

More later,
Laurent
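
One way to separate warm-up cost from steady-state cost, as suggested in the last point, is to run the same work several times in one JVM and look at the later runs only. The sketch below is a generic illustration of that idea; the class name WarmupTimer, the iteration counts and the placeholder workload are assumptions for this example and are not part of pdfbox or of the tests in this thread.

```java
class WarmupTimer {

    // Runs the task `warmup` times without timing it, then `measured` times
    // with timing, and returns the measured durations in milliseconds.
    static long[] timeRuns(Runnable task, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] millis = new long[measured];
        for (int i = 0; i < measured; i++) {
            long t0 = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - t0) / 1_000_000;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real test would render the page or call
        // the color conversion here instead.
        Runnable work = () -> {
            double sum = 0;
            for (int i = 1; i < 5_000_000; i++) {
                sum += Math.sqrt(i);
            }
            if (sum < 0) {
                System.out.println(sum); // keeps the loop from being optimized away
            }
        };
        for (long ms : timeRuns(work, 3, 5)) {
            System.out.println(ms + " ms");
        }
    }
}
```

If the first run is much slower than the later ones on one JDK but not on the other, start-up effects are a more likely explanation than the conversion code itself.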


Re: Rendering images from PDF files slower in OpenJDK

Philip Race
I might be losing it, but I am 99% sure that LCMS is the color conversion engine in 8.
KCMS was there only for backup. You'd have to know the magic flag to get it and
no one has said anything to the effect that they are using it.

-phil.


Re: Rendering images from PDF files slower in OpenJDK

Philip Race
Yep. LCMS is the default in 8u.

And although KCMS is a lot faster  on my CConv test ...

~/jdk8u181/bin/java CConv
13289

 ~/jdk8u181/bin/java -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider CConv
5131


It makes no difference on the pdf conversion :

~/jdk8u181/bin/java -jar pdfbox-app-2.0.11.jar PDFToImage  -time test.pdf Rendered 1 page in 4985ms

~/jdk8u181/bin/java -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider -jar pdfbox-app-2.0.11.jar PDFToImage  -time test.pdf
Rendered 1 page in 4723ms


Note: KCMS may be faster on CConv, but it has no support for modern ICC profiles
and I haven't checked if it is even applying the pdfbox one properly.
But it does have support to split a job into concurrent tasks for sub-images
which can help on the larger images like the one I am using in CConv.

-phil.

On 10/4/18, 2:24 PM, Philip Race wrote:
I might be losing it, but I am 99% sure that LCMS is the color conversion engine in 8.
KCMS was there only for backup. You'd have to know the magic flag to get it and
no one has said anything to the effect that they are using it.

-phil.

On 10/4/18, 11:33 AM, Laurent Bourgès wrote:
Phil,
I wondered if any RenderingHint defaults changed since 8...

Moreover, I started playing with Linux perf + JIT agent, and it is easier than before with oprofile + jvmtiagent.

I noticed that OracleJDK8 uses KCMS and OpenJDK11 uses LCMS for color conversion, as does OpenJDK8; that could explain the performance gap.

Finally, the PDFToImage test is run only once, so the overhead may come from warmup (JIT, G1)...

More later,
Laurent
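To make the warmup point concrete, here is a minimal sketch of timing the same task over several rounds in one JVM, so the first (cold) round can be compared with later ones. The class name WarmupTimer and the dummy workload are assumptions for illustration only; they are not part of pdfbox or of the tests discussed in this thread.

```java
public class WarmupTimer {

    // Runs the task `rounds` times and returns the wall-clock time of each round in ms.
    static long[] timeRounds(Runnable task, int rounds) {
        long[] millis = new long[rounds];
        for (int i = 0; i < rounds; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Placeholder workload (an assumption); swap in the rendering call under test.
        Runnable work = () -> {
            double acc = 0;
            for (int i = 0; i < 5_000_000; i++) {
                acc += Math.sqrt(i);
            }
            if (acc < 0) {
                throw new IllegalStateException("unreachable, keeps the loop observable");
            }
        };
        long[] millis = timeRounds(work, 5);
        for (int i = 0; i < millis.length; i++) {
            System.out.println("round " + i + ": " + millis[i] + " ms");
        }
    }
}
```

If only the first round is slow, the cost is more likely startup and JIT warmup than the conversion itself; if every round is slow, warmup is probably not the explanation.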

On Thu, Oct 4, 2018 at 8:03 PM, Phil Race <[hidden email]> wrote:


On 10/03/2018 11:58 PM, Laurent Bourgès wrote:
Hi,
I will get the code and add debugging logs: env & system properties and java2d RenderingHints.

The code in pdfbox passes null for the hints. So there should be no difference attributable to that.

-phil.

I suspect these hints are different or have a noticeable impact: color interpolation & rendering quality.

I suppose the backend corresponds to software loops but some 2d operations can be accelerated ?

Anyway I will push any change in the code.

PS: I can run linux perf to profile both java & native code....

Cheers,
Laurent

On Thu, Oct 4, 2018 at 7:50 AM, Daniel Persson <[hidden email]> wrote:
Hi Philip and Laurent.

I've talked with Tilman and Andreas from the PDFBox team and they see similar connections to the ColorConvertOp filter but wanted to try with one of the images of the PDF as a raster.

As we try different things I thought it good for collaboration to create a repository with the code so all can contribute.


I've run the 3 different tests on my machine (ThinkPad P51s) with a custom Gentoo install, in case that matters to the conversation.

I tried to invite you all as collaborators to this repository; if you think this is a bad idea, let me know.

Best regards
Daniel

On Wed, Oct 3, 2018 at 7:51 PM Laurent Bourgès <[hidden email]> wrote:
Very good job, phil.

I will try your CConv test on my Linux machine to see if it is platform-dependent ... or hardware-dependent?

Laurent

On Wed, Oct 3, 2018 at 7:19 PM, Philip Race <[hidden email]> wrote:


On 10/3/18, 1:15 AM, Laurent Bourgès wrote:
Phil,

If you look at the given pdf file, it has large images that exceed 2k so such ones may be more costly to convert.

FWIW the one I profiled was by far the largest at 2577x1540.
The rest are more like 100x100, 200x200 or 500x500 - all approximations.

As the JPEG decoder in OpenJDK 11 is different from the one in Oracle JDK 8, it may cause more ColorConvertOp filter operations ... if color profiles are different.

That doesn't seem likely. In fact, since I instrumented ColorConvertOp in 8 & 11, I know exactly how many times it was invoked
by pdfbox (11 times in both cases) and that all the image data is the same. SRC and DEST are the same types, etc.

Also the version of LCMS is the same in 8 and 11 (v2.9).

-phil

Anyway, this performance issue is not related to the Marlin renderer, so I cannot help much beyond diagnosing it.

Cheers,
Laurent

On Tue, Oct 2, 2018 at 11:35 PM, Philip Race <[hidden email]> wrote:
I've spent some time examining what pdfbox is passing to ColorConvertOp.
It is called about 10 or 11 times in this test with images typically 1-2K in each dimension.
The input image is a custom BufferedImage which uses an ICC_ColorSpace constructed
from a color profile file that is embedded in pdfbox which is an open source equivalent
of what Acrobat uses. It has a 4-component raster and is opaque.

This is filtered into a 3 component standard INT_RGB ColorModel.

I've distilled this down into a small program which has a copy of the method
that is defined in pdfbox and is invoking the supposedly slow ColorConvertOp.

So I believe this is all exactly what is happening in pdfbox.

What I find is that it is actually much faster on JDK11 than JDK 8.

prrubuntu:~$ ~/jdk-11/bin/java CConv
4881
prrubuntu:~$ ~/jdk8u181/bin/java CConv
12529


I can't say why that would be but the results are clear.
So I am left to suppose that pdfbox really is doing something different in 8 vs 11.
Or that this is not the real problem. What do others see?

I've attached the program. The 1 MB color profile file can be obtained from the pdfbox sources.

-phil.
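The attached CConv program is not reproduced in this archive. As a rough sketch of the kind of test described above (not the actual attachment), the following builds an opaque byte image over an ICC color space and times ColorConvertOp filtering it into a TYPE_INT_RGB image. The class name CConvSketch, the round count, and the fallback to the built-in CS_LINEAR_RGB space are assumptions; to approximate the real case, pass the path of a 4-component ICC profile (such as the one in the pdfbox sources) as the first argument.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.color.ICC_ColorSpace;
import java.awt.color.ICC_Profile;
import java.awt.image.BufferedImage;
import java.awt.image.ColorConvertOp;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.WritableRaster;
import java.io.IOException;

public class CConvSketch {

    // Builds an opaque, byte-per-component image in the given color space.
    static BufferedImage makeSource(ColorSpace cs, int width, int height) {
        ColorModel cm = new ComponentColorModel(
                cs, false, false, Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
        WritableRaster raster = cm.createCompatibleWritableRaster(width, height);
        return new BufferedImage(cm, raster, false, null);
    }

    // Converts src into an INT_RGB image `rounds` times; returns total elapsed ms.
    static long timeConversion(BufferedImage src, int rounds) {
        BufferedImage dst = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        ColorConvertOp op = new ColorConvertOp(null); // null hints, as pdfbox does
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            op.filter(src, dst);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: without an argument, fall back to a built-in 3-component space.
        ColorSpace cs = args.length > 0
                ? new ICC_ColorSpace(ICC_Profile.getInstance(args[0]))
                : ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB);
        BufferedImage src = makeSource(cs, 2577, 1540); // size of the largest image mentioned
        System.out.println(timeConversion(src, 10));
    }
}
```

Running it under each JDK, with and without -Dsun.java2d.cmm=..., gives a single number in milliseconds that can be compared in the same way as the CConv figures quoted in this thread, although the absolute values will differ.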


On 10/2/18, 9:35 AM, Laurent Bourgès wrote:
Hi Daniel,

Let's not compare apples and oranges. From what I can see, it takes the same route and behaves similarly.

 I agree, I did not take enough time to get accurate profiles, sorry.


If you look at

You can see that ConvertOp.filter takes 1.5s longer on Java 11.

I confirm: 1.8s vs 300ms.

Philip, do you know what could have changed in this 2D area?

I imagine ColorConvertOp delegates to native code so color profile (ICC) or hidpi support may have an impact here (or just compiler options may be different) ...

If needed, I could profile native code using oprofile / perf.

Laurent


Re: Rendering images from PDF files slower in OpenJDK

Laurent Bourgès
Phil,
I just googled a bit and found the PDFToImage source:

public static void main( String[] args ) throws IOException
{
    try
    {
        // force KCMS (faster than LCMS) if available
        Class.forName("sun.java2d.cmm.kcms.KcmsServiceProvider");
        System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider");
    }
    catch (ClassNotFoundException e)
    {
        LOG.debug("KCMS service not found - using LCMS", e);
    }
    // ...


That's all, folks!
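To see which situation a given JVM is in, a small check along these lines may help. It is only a sketch: the class name CmmCheck is made up for illustration, and it reports whether the KCMS provider class can be loaded and what the sun.java2d.cmm property is set to, not which CMM the JDK ends up using.

```java
public class CmmCheck {

    static final String KCMS = "sun.java2d.cmm.kcms.KcmsServiceProvider";

    // True if the KCMS provider class can be loaded in this JVM.
    static boolean kcmsAvailable() {
        try {
            Class.forName(KCMS, false, CmmCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Summarizes the JVM version, KCMS presence and the requested CMM property.
    static String describe() {
        String requested = System.getProperty("sun.java2d.cmm");
        return "java.version=" + System.getProperty("java.version")
                + ", KCMS provider present=" + kcmsAvailable()
                + ", sun.java2d.cmm=" + (requested == null ? "(unset, JDK default)" : requested);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On a JVM where the provider class is missing, the pdfbox code above falls into the catch branch and logs that LCMS is used.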

