MR JarFile was: Scanning multi version jars?

Greg Wilkins
All,

Alan suggested this as a more appropriate forum for some issues I'm having
with multi release jar files.

My issues break down into two aspects:

   1. How does a container know which classes within a jar should be
   scanned for annotations - and that is being discussed in the other thread:
   "Scanning multi version jars"
   2. Confusion about the JarFile API - which I'd like to discuss in this
   thread.

At the very least I think the javadoc on JarFile is wrong, but I also think
the behaviour is very confusing.

Consider a MR jar with the following contents:

example.jar
├── META-INF
│   ├── MANIFEST.MF
│   └── versions
│       └── 9
│           └── org
│               └── example
│                   ├── InBoth.class
│                   └── OnlyIn9.class
└── org
    └── example
        ├── InBoth.class
        └── OnlyInBase.class


where for debugging purposes I've made the contents of each file be its
path within the jar.

If I run the following code:

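// IO.toString(InputStream) is a stream-to-string helper
// (for example Jetty's org.eclipse.jetty.util.IO).
// Open the jar as a versioned (MR-aware) JarFile for the running JDK.
JarFile jarFile = new JarFile(new File("/tmp/example.jar"), false,
        JarFile.OPEN_READ, Runtime.version());
for (Enumeration<JarEntry> e = jarFile.entries(); e.hasMoreElements(); )
{
    // entry0 comes straight from the enumeration
    JarEntry entry0 = e.nextElement();
    String name0 = entry0.getName();
    String content0 = IO.toString(jarFile.getInputStream(entry0));
    // entry1 is looked up again by the same name
    JarEntry entry1 = jarFile.getJarEntry(name0);
    String name1 = entry1.getName();
    String content1 = IO.toString(jarFile.getInputStream(entry1));
    System.err.printf("%n=== %s ===%n -> %s%n => %s%n => %s%n",
            name0, name1, content0, content1);
}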



I get the following output:

=== META-INF/ ===
 -> META-INF/
 =>
 =>

=== META-INF/MANIFEST.MF ===
 -> META-INF/MANIFEST.MF
 => Manifest-Version: 1.0
Multi-Release: true
Created-By: 9 (Oracle Corporation)
 => Manifest-Version: 1.0
Multi-Release: true
Created-By: 9 (Oracle Corporation)

=== org/ ===
 -> org/
 =>
 =>

=== org/example/ ===
 -> org/example/
 =>
 =>

=== org/example/OnlyInBase.class ===
 -> org/example/OnlyInBase.class
 => org/example/OnlyInBase.class
 => org/example/OnlyInBase.class

=== org/example/InBoth.class ===
 -> org/example/InBoth.class
 => org/example/InBoth.class
 => META-INF/versions/9/org/example/InBoth.class

=== META-INF/versions/ ===
 -> META-INF/versions/
 =>
 =>

=== META-INF/versions/9/ ===
 -> META-INF/versions/9/
 =>
 =>

=== META-INF/versions/9/org/ ===
 -> META-INF/versions/9/org/
 =>
 =>

=== META-INF/versions/9/org/example/ ===
 -> META-INF/versions/9/org/example/
 =>
 =>

=== META-INF/versions/9/org/example/OnlyIn9.class ===
 -> META-INF/versions/9/org/example/OnlyIn9.class
 => META-INF/versions/9/org/example/OnlyIn9.class
 => META-INF/versions/9/org/example/OnlyIn9.class

=== META-INF/versions/9/org/example/InBoth.class ===
 -> META-INF/versions/9/org/example/InBoth.class
 => META-INF/versions/9/org/example/InBoth.class
 => META-INF/versions/9/org/example/InBoth.class


The issue I have is that sometimes the class tries to hide the MR
aspects, yet other times it does not.  Specifically, when iterating it
returns JarEntry instances that ignore versions and always return the
content to which they refer, yet if you obtain a JarEntry from the
getJarEntry API, it behaves differently and may return the versioned
content even if it has the un-versioned path.

In the above example, if I obtain a JarEntry for
org/example/InBoth.class from the enumerator, then it always returns the
content of the base entry.
But if I do entry = jarFile.getJarEntry(entry.getName()), I obtain a JarEntry
that has the name of the base entry, but gives me the versioned content
when used as a reference.  Moreover, there is nothing in the JarEntry API
that allows me to tell if it is versioned or not (no getVersion()).

My expectation of the versioned JarFile API was that the enumeration
should only return entries appropriate for the version. So when configured
for Java 8, the above jar would enumerate over:

META-INF/
META-INF/MANIFEST.MF

org/

org/example/

org/example/OnlyInBase.class
org/example/InBoth.class


and for Java 9 the enumeration should return:

META-INF/
META-INF/MANIFEST.MF

org/

org/example/

org/example/InBoth.class
org/example/OnlyIn9.class


I.e. the existence of the META-INF/versions structure should be hidden unless
a non-versioned JarFile is instantiated.

I had also expected that I would be able to query a JarEntry to ask what
version it was for.

The issue being discussed in the other thread is how containers can be
portable in their scanning of MR jars.  If the JarFile enumeration provided
the behaviour above, then containers would not need to implement their own
filtering and could just rely on the versioned JarFile enumeration.
As things stand, each container will have to implement its own logic to
determine which class files in a jar should be scanned for annotations etc.
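
To make the kind of filtering concrete, here is a minimal sketch of what a
container might do by hand with the entry names it gets from the enumeration.
The class name VersionedNames and the method resolve are hypothetical, not part
of any JDK API, and the rule it applies (highest versioned entry not newer than
the runtime wins, base entries that are not overridden stay visible) is the
standard MR overlay rule, so for version 9 it also keeps
org/example/OnlyInBase.class. It works on names only and does not attempt any
of the inner-class heuristics discussed in the other thread.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class VersionedNames {
    private static final String PREFIX = "META-INF/versions/";

    /**
     * Hypothetical helper: maps each effective (un-versioned) name to the real
     * entry name that a runtime of the given feature version would see.
     */
    public static Map<String, String> resolve(List<String> names, int runtimeVersion) {
        Map<String, String> realName = new TreeMap<>();
        Map<String, Integer> pickedVersion = new TreeMap<>();
        for (String name : names) {
            int version = 0;            // 0 stands for the base (un-versioned) section
            String effective = name;
            if (name.startsWith(PREFIX)) {
                int slash = name.indexOf('/', PREFIX.length());
                if (slash < 0) {
                    continue;           // the META-INF/versions/ directory itself
                }
                try {
                    version = Integer.parseInt(name.substring(PREFIX.length(), slash));
                } catch (NumberFormatException ex) {
                    continue;           // not a numeric version directory
                }
                effective = name.substring(slash + 1);
                if (effective.isEmpty() || version > runtimeVersion) {
                    continue;           // a version directory, or too new for this runtime
                }
            }
            // Keep the highest applicable version seen so far for this name.
            Integer previous = pickedVersion.get(effective);
            if (previous == null || version > previous) {
                pickedVersion.put(effective, version);
                realName.put(effective, name);
            }
        }
        return realName;
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "META-INF/", "META-INF/MANIFEST.MF", "org/", "org/example/",
                "org/example/OnlyInBase.class", "org/example/InBoth.class",
                "META-INF/versions/", "META-INF/versions/9/",
                "META-INF/versions/9/org/", "META-INF/versions/9/org/example/",
                "META-INF/versions/9/org/example/OnlyIn9.class",
                "META-INF/versions/9/org/example/InBoth.class");
        for (int version : new int[] {8, 9}) {
            System.out.println("=== version " + version + " ===");
            resolve(names, version).forEach((name, real) ->
                    System.out.println(name + " <- " + real));
        }
    }
}
```

The returned map keeps the real entry name alongside the effective one, which
is the piece of information that the JarEntry API does not expose.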

regards




On 15 September 2017 at 12:09, Greg Wilkins <[hidden email]> wrote:

>
> Alan,
>
> thanks for correcting me on the API of JarFile - I can see it kind of
> works, but in a very bizarre way (it gives different content for entries
> obtained via the enumerator vs the getJarEntry API, even though both
> entries report the same name).  But I'll discuss that elsewhere.
>
>
>
> The main issue that still remains is that it is entirely unclear what files we
> should scan.   I understand the nuanced point that you are trying to make,
> ie "that it depends"  on if the class is public or private, if it is an API
> change, if it is an alternate implementation rather than a new version of
> the same library etc. etc.  I also totally understand that there are
> intended uses and unintended uses for this feature.
>
> However, as an implementer of an application container, it does not matter
> if I understand the nuances of MR jars and intended usage.  What matters is
> do the developers of the 3rd party jars that will be deployed in my
> container understand those nuances?    We have to look at jars that are
> supplied by third parties, with various levels of understanding, perhaps
> with some tricky clever ideas how to mess with the system, and we have to
> decide which classes we are going to scan for annotations.
>
> This is NOT a performance issue.  It is a consistency/portability issue.
> We have to make exactly the same decisions as all the other application
> containers out there, else 3rd party library jars will act differently on
> different containers.
>
> Thus it looks like we need some kind of heuristic to guess what the 3rd
> party developer intended when they used the MR feature.    Some approaches
> will need us to scan all the outer and inner classes to determine if the
> inner classes are referenced and if they are public or private.
>
> The heuristic could then be to analyse an inner class IFF it is public and
> referenced.    Or perhaps that should be if it is public OR referenced?
>
> Alternately, can we just have an heuristic based only on the index.  If
> Foo exists as a versioned class, then only similarly versioned Foo$Bar
> classes should be scanned and base Foo$Bar classes will be ignored?
>
> All of these are possible.  But we need an official documented (perhaps
> tool enforced) policy so that all containers can implement the same
> heuristic so that we can have portability.
>
> Ideally, the containers would not need to implement this heuristic, as it
> would be implemented in the enumerator of JarFile.  Unfortunately that is
> not the case and the enumerator returns all the entries regardless of
> version.   So containers must implement their own enumeration and we need
> to make sure we all implement it the same!
>
> regards
>
>
>
>
>
> On 14 September 2017 at 20:44, Alan Bateman <[hidden email]>
> wrote:
>
>> On 14/09/2017 10:58, Weijun Wang wrote:
>>
>>> :
>>> I know an MR jar allows you to shadow a class file with a
>>> release-specific one, but what if the new release has removed an old class?
>>> It will not appear in the release-specific directory but still exists in
>>> the root. Should we describe this in the MANIFEST?
>>>
>> A MR JAR is not intended to support multiple versions of the same
>> library, instead the versioned sections are for classes that take advantage
>> of newer language or API features. They help with the migration from using
>> JDK internal APIs to supported/standard APIs for example. So I don't think
>> it should be complicated by an additional list of entries to "hide" in the
>> base or overlaid version sections.
>>
>> Greg's mail doesn't say if Bar is public so I can't tell if his example
>> involves an attempted API change or not. Assuming Bar is not public then
>> compiling the 9 version of Foo.java will generate Foo.class and no
>> Foo$Bar.class. This doesn't mean it's completely orphaned of course as
>> there may be other classes in the base section, and in the same package,
>> that were compiled with references to Bar. The `jar` tool could do some
>> additional validation to catch these references and so avoid
>> IncompatibleClassChangeError at runtime (as might arise if
>> getEnclosingClass were invoked on the inner class). That would help with
>> Greg's annotation scanning scenario too.
>>
>> -Alan
>>
>
>
>
> --
> Greg Wilkins <[hidden email]> CTO http://webtide.com
>



--
Greg Wilkins <[hidden email]> CTO http://webtide.com

Re: MR JarFile was: Scanning multi version jars?

Alan Bateman
On 15/09/2017 05:43, Greg Wilkins wrote:
>
> :
>
> The issue I have are that sometimes the class is trying to hide the MR
> aspects, yet other times it is not.   Specifically when iterating it
> returns JarEntry instances that ignore versions and always return the
> content to which they refer, yet if you obtain a JarEntry from the
> getJarEntry API, it behaves differently and may return the versioned
> content even if it has the un-versioned path.
The entries and stream methods have been discussed here several times.
The conclusion was that they would continue to enumerate or return a
stream over all entries in the JAR file.  A new versionedStream() method
was proposed and I agree is needed, it just didn't make it into Java SE
9. Now might be the time to put it back on the table.

-Alan

Re: MR JarFile was: Scanning multi version jars?

Alan Bateman
On 15/09/2017 11:37, Alan Bateman wrote:

> On 15/09/2017 05:43, Greg Wilkins wrote:
>>
>> :
>>
>> The issue I have are that sometimes the class is trying to hide the
>> MR aspects, yet other times it is not.   Specifically when iterating
>> it returns JarEntry instances that ignore versions and always return
>> the content to which they refer, yet if you obtain a JarEntry from
>> the getJarEntry API, it behaves differently and may return the
>> versioned content even if it has the un-versioned path.
> The entries and stream methods have been discussed here several times.
> The conclusion was that they would continue to enumerate or return a
> stream over all entries in the JAR file.  A new versionedStream()
> method was proposed and I agree is needed, it just didn't make it into
> Java SE 9. Now might be the time to put it back on the table.
Just to follow up on this thread from September.

jdk-10+34 has the JarFile::versionedStream and JarEntry::getRealName
methods that we've been discussing here to complete the API support for
MR JARs. It would be good to try them out and report back any issues
that you find.

-Alan
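
For anyone trying the two methods out, a minimal sketch follows. It assumes
JDK 10 or later and default multi-release settings. The class name
VersionedStreamDemo and its helper methods are made up for illustration; the
jar it builds mirrors the example from the start of the thread, without the
directory entries, with each entry's content set to its own path.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipFile;

public class VersionedStreamDemo {

    /** Builds a small MR jar in a temp file; each entry's content is its path. */
    public static File buildExampleJar() throws IOException {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().put(Attributes.Name.MULTI_RELEASE, "true");
        File file = File.createTempFile("example", ".jar");
        file.deleteOnExit();
        String[] paths = {
            "org/example/InBoth.class",
            "org/example/OnlyInBase.class",
            "META-INF/versions/9/org/example/InBoth.class",
            "META-INF/versions/9/org/example/OnlyIn9.class"
        };
        try (JarOutputStream out =
                 new JarOutputStream(new FileOutputStream(file), manifest)) {
            for (String path : paths) {
                out.putNextEntry(new JarEntry(path));
                out.write(path.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return file;
    }

    /** Collects getName() -> getRealName() for every entry of versionedStream(). */
    public static Map<String, String> realNames(File file) throws IOException {
        Map<String, String> result = new TreeMap<>();
        // Versioned JarFile, configured for the running JDK as in the original test.
        try (JarFile jar = new JarFile(file, false, ZipFile.OPEN_READ, Runtime.version())) {
            jar.versionedStream()
               .forEach(entry -> result.put(entry.getName(), entry.getRealName()));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        realNames(buildExampleJar()).forEach((name, real) ->
                System.out.println(name + " -> " + real));
    }
}
```

With this, an entry whose getName() differs from its getRealName() is one that
was resolved from a META-INF/versions directory, which covers the "is this
entry versioned?" question raised at the start of the thread.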