FileOutputStream.getFD() vs finalization

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|

FileOutputStream.getFD() vs finalization

Hans Boehm-2
The design of the getFD() calls in some Java *Stream classes seems
problematic, and doesn't seem cleanly fixable without a spec change. I
first noticed this in JDK 8 code, but Roger Riggs recent JDK 10 changes
also don't seem to fully address this specific problem, particularly since
the code paths remain more-or-less similar when a FOS subclass has a close
method. It probably remains easier to describe this in a JDK 8/9 setting,
so I'll do so initially.

Close() on the *Stream closes the FD. The finalize() method on the Stream
(not the FD!) calls close(). Assuming JLS 12.6.2 finalization semantics, if
my program consists of:

{
  FileOutputStream f1 = new FileOutputStream("foo");
  FileDescriptor fd = f1.getFD();
  A:
  FileOutputStream f2 = new FileOutputStream(fd);
}

f2 may be associated with a closed fd, since f1 could have been finalized
at any pointer after its construction, in particular at point A.

This is slightly aggravated by the fact that close() is implemented in a
way that doesn't guarantee reachability of the  FileOutputStream while, or
before, it runs. Close() could easily be inlined, and the *Stream fields
promoted to registers. Thus even if you always call close() explicitly, I
think you are still affected by this problem.

Fundamentally the result of getFD() is only valid as long as the underlying
*Stream is reachable, and that is not defined in a way that's
understandable by most programmers. That also makes this API un-Java-like,
in that the user has to worry about very subtle object lifetime rules.

The only semi-plausible solution seems to involve judicious use of
ReachabilityFence() on FileOutputStreams after the getFD() result is no
longer needed. I personally don't think that's really a very plausible
request of the programmer.

I think Roger's JDK 10 changes may help in the case in which there is not
an overridden close() method, since the cleanup action is registered on the
underlying FileDescriptor. But if a subclass overrides close(), I think
we're basically still  in the same boat. For the reasons he already gave,
that seems hard to avoid. And again I think the problem applies even if
close() is explicitly called, since that doesn't really prevent
finalization of the Stream object.

Has this been discussed anywhere? Opinions about how to deal with it?

(This arose while going through the core libraries, looking for premature
finalization issues. So far, this one seems special, in that It really
seems to require an API change.)

Hans
Reply | Threaded
Open this post in threaded view
|

Re: FileOutputStream.getFD() vs finalization

Peter Levart
Hi,

Maybe I'm missing something but...

Hans Boehm je 28. 12. 2017 ob 21:32 napisal:

> The design of the getFD() calls in some Java *Stream classes seems
> problematic, and doesn't seem cleanly fixable without a spec change. I
> first noticed this in JDK 8 code, but Roger Riggs recent JDK 10 changes
> also don't seem to fully address this specific problem, particularly since
> the code paths remain more-or-less similar when a FOS subclass has a close
> method. It probably remains easier to describe this in a JDK 8/9 setting,
> so I'll do so initially.
>
> Close() on the *Stream closes the FD. The finalize() method on the Stream
> (not the FD!) calls close(). Assuming JLS 12.6.2 finalization semantics, if
> my program consists of:
>
> {
>    FileOutputStream f1 = new FileOutputStream("foo");
>    FileDescriptor fd = f1.getFD();
>    A:
>    FileOutputStream f2 = new FileOutputStream(fd);
> }
>
> f2 may be associated with a closed fd, since f1 could have been finalized
> at any pointer after its construction, in particular at point A.

I think there is no such problem here. Each stream (FileInputStream,
FileOutputStream and RandomAccessFile) keeps a reference to the
FileDescriptor (via 'fd' field) and FileDescriptor keeps a reference to
all associated streams (established in stream constructor via
FileDescriptor.attach(Closeable) where the Closeable instance is the
stream instance). So as long as 'fd' above is reachable, so is 'f1'.

> This is slightly aggravated by the fact that close() is implemented in a
> way that doesn't guarantee reachability of the  FileOutputStream while, or
> before, it runs. Close() could easily be inlined, and the *Stream fields
> promoted to registers. Thus even if you always call close() explicitly, I
> think you are still affected by this problem.

File[Input|Output]Stream.close() is an instance method that among other
things, calls close0() native instance method which accesses field 'fd'
in native code via JNI. While close0() native instance method is being
executed, the stream instance is kept reachable (by the JNI spec). In
fact, all stream native methods that do make use of 'fd' are instance
methods that keep stream instance is reachable while being executed -
making the native resource access.

So I think FileInputStream, FileOutputStream and RandomAccessFile are
examples of code where finalization was actually done right, because it
is consistently combined with native instance methods. Unlike direct NIO
buffers that keep native resource (long address) in a field and pass its
value around to other classes (Unsafe) not making sure that the buffer
instance which is tracked by Cleaner is kept reachable everywhere the
underlying native memory is being accessed.


Regards, Peter

> Fundamentally the result of getFD() is only valid as long as the underlying
> *Stream is reachable, and that is not defined in a way that's
> understandable by most programmers. That also makes this API un-Java-like,
> in that the user has to worry about very subtle object lifetime rules.
>
> The only semi-plausible solution seems to involve judicious use of
> ReachabilityFence() on FileOutputStreams after the getFD() result is no
> longer needed. I personally don't think that's really a very plausible
> request of the programmer.
>
> I think Roger's JDK 10 changes may help in the case in which there is not
> an overridden close() method, since the cleanup action is registered on the
> underlying FileDescriptor. But if a subclass overrides close(), I think
> we're basically still  in the same boat. For the reasons he already gave,
> that seems hard to avoid. And again I think the problem applies even if
> close() is explicitly called, since that doesn't really prevent
> finalization of the Stream object.
>
> Has this been discussed anywhere? Opinions about how to deal with it?
>
> (This arose while going through the core libraries, looking for premature
> finalization issues. So far, this one seems special, in that It really
> seems to require an API change.)
>
> Hans

Reply | Threaded
Open this post in threaded view
|

Re: FileOutputStream.getFD() vs finalization

Hans Boehm-2
Thanks!

I had missed the fact that FileDescriptor keeps a list of all owners.
Apparently (1) closing a single File*Stream effectively closes all such
Streams using the same FileDescriptor? And (2) whenever one such Stream is
reachable they are all retained. Thus presumably none can be finalized
until the last JNI instance method is called, which indeed should be enough
to avoid the problem. (I'm not sure whether the JNI spec actually requires
that, but certainly in the absence of cross-language optimization, it's
sufficient.)

Should (1) and (2) be documented? They both seem to potentially matter to
users and, unless I missed something else, are not at all apparent from the
current documentation. I do have to admit that I'm hazy on what the usage
model here is intended to be, and why FileDescriptor is user visible.

Hans

On Sun, Dec 31, 2017 at 3:53 AM, Peter Levart <[hidden email]>
wrote:

> Hi,
>
> Maybe I'm missing something but...
>
> Hans Boehm je 28. 12. 2017 ob 21:32 napisal:
>
> The design of the getFD() calls in some Java *Stream classes seems
> problematic, and doesn't seem cleanly fixable without a spec change. I
> first noticed this in JDK 8 code, but Roger Riggs recent JDK 10 changes
> also don't seem to fully address this specific problem, particularly since
> the code paths remain more-or-less similar when a FOS subclass has a close
> method. It probably remains easier to describe this in a JDK 8/9 setting,
> so I'll do so initially.
>
> Close() on the *Stream closes the FD. The finalize() method on the Stream
> (not the FD!) calls close(). Assuming JLS 12.6.2 finalization semantics, if
> my program consists of:
>
> {
>   FileOutputStream f1 = new FileOutputStream("foo");
>   FileDescriptor fd = f1.getFD();
>   A:
>   FileOutputStream f2 = new FileOutputStream(fd);
> }
>
> f2 may be associated with a closed fd, since f1 could have been finalized
> at any pointer after its construction, in particular at point A.
>
>
> I think there is no such problem here. Each stream (FileInputStream,
> FileOutputStream and RandomAccessFile) keeps a reference to the
> FileDescriptor (via 'fd' field) and FileDescriptor keeps a reference to all
> associated streams (established in stream constructor via
> FileDescriptor.attach(Closeable) where the Closeable instance is the
> stream instance). So as long as 'fd' above is reachable, so is 'f1'.
>
> This is slightly aggravated by the fact that close() is implemented in a
> way that doesn't guarantee reachability of the  FileOutputStream while, or
> before, it runs. Close() could easily be inlined, and the *Stream fields
> promoted to registers. Thus even if you always call close() explicitly, I
> think you are still affected by this problem.
>
>
> File[Input|Output]Stream.close() is an instance method that among other
> things, calls close0() native instance method which accesses field 'fd' in
> native code via JNI. While close0() native instance method is being
> executed, the stream instance is kept reachable (by the JNI spec). In fact,
> all stream native methods that do make use of 'fd' are instance methods
> that keep stream instance is reachable while being executed - making the
> native resource access.
>
> So I think FileInputStream, FileOutputStream and RandomAccessFile are
> examples of code where finalization was actually done right, because it is
> consistently combined with native instance methods. Unlike direct NIO
> buffers that keep native resource (long address) in a field and pass its
> value around to other classes (Unsafe) not making sure that the buffer
> instance which is tracked by Cleaner is kept reachable everywhere the
> underlying native memory is being accessed.
>
>
> Regards, Peter
>
> Fundamentally the result of getFD() is only valid as long as the underlying
> *Stream is reachable, and that is not defined in a way that's
> understandable by most programmers. That also makes this API un-Java-like,
> in that the user has to worry about very subtle object lifetime rules.
>
> The only semi-plausible solution seems to involve judicious use of
> ReachabilityFence() on FileOutputStreams after the getFD() result is no
> longer needed. I personally don't think that's really a very plausible
> request of the programmer.
>
> I think Roger's JDK 10 changes may help in the case in which there is not
> an overridden close() method, since the cleanup action is registered on the
> underlying FileDescriptor. But if a subclass overrides close(), I think
> we're basically still  in the same boat. For the reasons he already gave,
> that seems hard to avoid. And again I think the problem applies even if
> close() is explicitly called, since that doesn't really prevent
> finalization of the Stream object.
>
> Has this been discussed anywhere? Opinions about how to deal with it?
>
> (This arose while going through the core libraries, looking for premature
> finalization issues. So far, this one seems special, in that It really
> seems to require an API change.)
>
> Hans
>
>
>