Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Daniel D. Daugherty
Adding core-libs-dev@... since this is RMI code. Thanks Alan!

Folks, this review spans three OpenJDK aliases so Thunderbird's
reply-to-list feature won't get it right. This is one of the
few times that reply-to-all is the right thing to do...

Dan

On 12/7/17 11:38 AM, Daniel D. Daugherty wrote:

> Greetings,
>
> I have a small fix for a very intermittent ServerSocket related test
> failure:
>
>     JDK-8182307: Error during JRMP connection establishment
>     https://bugs.openjdk.java.net/browse/JDK-8182307
>
> The fix is copied from Jerry's work on the following bugs:
>
>     JDK-8182757 JDWP: Socket Transport handshake hangs on Solaris
>     https://bugs.openjdk.java.net/browse/JDK-8182757
>
>     JDK-8178676 nsk/jvmti/AttachOnDemand/attach045 fails with Exception
>     https://bugs.openjdk.java.net/browse/JDK-8178676
>
> For the gory details of the reasons for this fix please see
> Jerry's bugs.
>
> Webrev URL: http://cr.openjdk.java.net/~dcubed/8182307-webrev/jdk10-0/
>
>
> I observed this test failure one time in my Thread-SMR work and have
> been including this fix in months of Thread-SMR stress testing. It was
> also included in both JPRT and Mach5 tier[1-5] testing of the Thread-SMR
> changes.
>
> Thanks, in advance, for any comments, suggestions or questions.
>
> Dan
>

Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Gerald Thornbrugh
Hi Dan,

Your fix looks good.

Jerry


Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Daniel D. Daugherty
Jerry,

Thanks for the review!

Dan



Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

roger riggs
In reply to this post by Gerald Thornbrugh
+1


Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Daniel D. Daugherty
Roger,

Thanks for the review!

Dan

P.S.
I'm planning to push this fix to jdk/hs since the only sightings
have been in jdk/hs testing or in projects that are parented to
jdk/hs repos... Hope that's okay...



Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

serguei.spitsyn@oracle.com
Hi Dan,

The fix looks good to me.
Nice, you have caught it.

Do you want this fixed in 10 or 11?
I thought that the jdk/hs is for 11 now.
Is it correct?

Thanks,
Serguei



Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

serguei.spitsyn@oracle.com
On 12/7/17 10:08, [hidden email] wrote:
> Hi Dan,
>
> The fix looks good to me.
> Nice, you have caught it.
>
> Do you want this fixed in 10 or 11?
> I thought that the jdk/hs is for 11 now.
> Is it correct?

Never mind.
I've just found a message from Jesper saying that jdk/hs is used for 10 pushes
for one more week.

Thanks,
Serguei


Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Daniel D. Daugherty
In reply to this post by serguei.spitsyn@oracle.com
On 12/7/17 1:08 PM, [hidden email] wrote:
> Hi Dan,
>
> The fix looks good to me.
> Nice, you have caught it.

Thanks for the review!


> Do you want this fixed in 10 or 11?
> I thought that the jdk/hs is for 11 now.
> Is it correct?

Please see Mark R's e-mail with the subject line of
"JDK 10 enters Rampdown Phase One in one week". In short,
the RDP1 deadline applies to jdk/jdk, jdk/hs and jdk/client
all at the same time.

Dan



Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Daniel D. Daugherty
In reply to this post by Daniel D. Daugherty
Greetings,

My fix for the following bug:

     JDK-8182307: Error during JRMP connection establishment
     https://bugs.openjdk.java.net/browse/JDK-8182307

broke two hs-tier3 tests. I'm backing out the fix via:

     JDK-8193225 [BACKOUT] fix for 8182307 Error during JRMP connection establishment
     https://bugs.openjdk.java.net/browse/JDK-8193225

Here's the webrev:

     http://cr.openjdk.java.net/~dcubed/8193225-webrev/jdk10-0/

This is a simple "hg backout". I need a single (R)eviewer. Thanks!

Dan




Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Alan Bateman
In reply to this post by Daniel D. Daugherty
On 07/12/2017 16:55, Daniel D. Daugherty wrote:

> :
>> Greetings,
>>
>> I have a small fix for a very intermittent ServerSocket related test
>> failure:
>>
>>     JDK-8182307: Error during JRMP connection establishment
>>     https://bugs.openjdk.java.net/browse/JDK-8182307
>>
>> :
>>
>> For the gory details of the reasons for this fix please see
>> Jerry's bugs.
>>
>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8182307-webrev/jdk10-0/
>>
It's not clear to me how this change solves the issue.  It's a "read
timeout" so this means the connection has been established. The client
will not care if the server has enabled SO_REUSEADDR or whether it
initially bound to a fixed or ephemeral port.

Is this issue Solaris only? I ask because there is an awkward issue on
Solaris where the kernel will accept a pending connection when the
process is at its file descriptor limit. We've seen this periodically,
esp. with tests that leave connections or files open. An unsuspecting
test runs later, establishes a connection but gets timeouts as no code
at the application level has accepted the connection.
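
A minimal sketch of the symptom described above, assuming only the
standard java.net API: the listening socket is bound but its owner never
calls accept(), yet the client's connect() still completes because the
pending connection sits in the listen backlog. The class and method names
are hypothetical, and the sketch does not reproduce the Solaris file
descriptor limit itself, only the "connected but read times out" outcome.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class UnacceptedConnectionDemo {
    /** Returns true if connect() succeeded but the following read timed out. */
    static boolean connectThenTimeout(int readTimeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The listener is bound and listening, but nothing ever calls accept().
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // The connect completes at the TCP level via the listen backlog.
            client.connect(
                new InetSocketAddress(loopback, listener.getLocalPort()), 5000);
            client.setSoTimeout(readTimeoutMillis);
            try {
                client.getInputStream().read(); // no application code will answer
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected but read timed out: "
                + connectThenTimeout(500));
    }
}
```

From the client's point of view this looks the same as talking to a peer
that accepted the connection and then stayed silent, which is why the
read timeout alone does not say which side is at fault.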

-Alan


Re: RFR(XXS): 8182307 - Error during JRMP connection establishment

Daniel D. Daugherty
On 12/8/17 4:28 AM, Alan Bateman wrote:

> On 07/12/2017 16:55, Daniel D. Daugherty wrote:
>> :
>>> Greetings,
>>>
>>> I have a small fix for a very intermittent ServerSocket related test
>>> failure:
>>>
>>>     JDK-8182307: Error during JRMP connection establishment
>>>     https://bugs.openjdk.java.net/browse/JDK-8182307
>>>
>>> :
>>>
>>> For the gory details of the reasons for this fix please see
>>> Jerry's bugs.
>>>
>>> Webrev URL: http://cr.openjdk.java.net/~dcubed/8182307-webrev/jdk10-0/
>>>
> It's not clear to me how this change solves the issue.  It's a "read
> timeout" so this means the connection has been established.

Yes, the connection has been established, but it has been established
to the wrong ServerSocket. The ServerSocket port that was picked by
the test with its "return new ServerSocket(port)" call was also picked
by another "interloper" process. It's the SO_REUSEADDR attribute that
allows these two processes to both think that they have the same
random port. We have only observed proven sightings of this bug on
Solaris SPARC and Solaris X64 machines.

So the interloper and the server side of the test both did accept()
calls on the same port. The interloper won the race in this case so
it is matched up with the test's client side connect(). The test's
client side starts doing its protocol reads, but the interloper
does not send what the test's client side expects so the test's
client side times out in read().
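
The client-side effect can be sketched as follows. This is an
illustration under stated assumptions, not the RMI code or the failing
test: the "interloper" is a stand-in thread that accepts the connection
and never replies, and the "HELLO" handshake and all names are
hypothetical. The sketch does not reproduce the port-sharing race, only
what the client sees once it is matched with the wrong peer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

final class InterloperDemo {
    /** Returns true if the client's handshake read timed out. */
    static boolean handshakeTimesOut(int readTimeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket interloper = new ServerSocket(0, 1, loopback)) {
            // Stand-in for the interloper: it wins the accept() and then
            // never sends what the client's protocol expects.
            Thread t = new Thread(() -> {
                try (Socket peer = interloper.accept()) {
                    InputStream in = peer.getInputStream();
                    while (in.read() != -1) {
                        // drain client bytes until the client disconnects
                    }
                } catch (IOException ignored) {
                    // socket closed underneath us; nothing to do
                }
            });
            t.setDaemon(true);
            t.start();

            try (Socket client = new Socket(loopback, interloper.getLocalPort())) {
                client.setSoTimeout(readTimeoutMillis);
                client.getOutputStream()
                      .write("HELLO".getBytes(StandardCharsets.US_ASCII));
                client.getOutputStream().flush();
                try {
                    client.getInputStream().read(); // wait for a reply that never comes
                    return false;
                } catch (SocketTimeoutException expected) {
                    return true;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("handshake read timed out: " + handshakeTimesOut(500));
    }
}
```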

Here's Jerry's eval note from
https://bugs.openjdk.java.net/browse/JDK-8182757:

> gthornbr Gerald Thornbrugh
> <https://bugs.openjdk.java.net/secure/ViewProfile.jspa?name=gthornbr>
> added a comment - 2017-07-27 11:33
> If a socket is being set up without a fixed port, using the SO_REUSEADDR
> flag can lead to other processes interfering with the poll/receive
> process of a debugger/debuggee configuring a socket for communication.
> When SO_REUSEADDR is used other processes can attempt a listen() on
> the same port and receive a connect from the debuggee. This causes the
> debugger to stay in poll() waiting for a connect and the debuggee
> stays in recv() waiting to receive data from the "rogue" process that
> will never send it.
>
> This can also lead to connections being terminated early on the
> debuggee side when the "rogue" process terminates the connection
> because it does not receive what it expected from the client process
> (i.e. the debuggee).
>
> The fix is to not use the SO_REUSEADDR flag for non-fixed port
> sockets. This keeps "rogue" processes from reusing the port address
> and from stealing the connects sent by the debuggee.
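
In Java terms, the idea in the note above might look like the sketch
below. It is an assumption-laden illustration, not the contents of the
webrev: the class and method names are hypothetical, and only
java.net.ServerSocket calls from the standard API are used. The socket
is created unbound so that SO_REUSEADDR can be turned off before bind()
when the caller asks for a non-fixed (port 0) socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

final class EphemeralPortSockets {
    /** Opens a listening socket; port 0 asks the OS for an ephemeral port. */
    static ServerSocket open(int port) {
        try {
            ServerSocket ss = new ServerSocket(); // created unbound
            try {
                if (port == 0) {
                    // Changing SO_REUSEADDR after bind() has no defined
                    // effect, so it is disabled here, before binding.
                    ss.setReuseAddress(false);
                }
                ss.bind(new InetSocketAddress(port));
                return ss;
            } catch (IOException e) {
                ss.close();
                throw e;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if an ephemeral-port socket is bound with reuse off. */
    static boolean ephemeralSocketHasReuseDisabled() {
        try (ServerSocket ss = open(0)) {
            return ss.isBound() && ss.getLocalPort() > 0 && !ss.getReuseAddress();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reuse disabled on ephemeral port: "
                + ephemeralSocketHasReuseDisabled());
    }
}
```

Fixed ports are left with the platform default in this sketch, since the
note only argues against SO_REUSEADDR for non-fixed port sockets.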

In the hunt for JDK-8182757 we were fortunate that the tests were
configured for the server side accept() call to _not_ timeout.
That allowed us to capture stacks from both the debuggee and
debugger sides. We were also able to capture debug info from
different points in the protocol stack in various repro attempts.
The only thing we didn't do was add debugging info in the kernel
to try and chase the race enabled by SO_REUSEADDR to ground.


This bug's (JDK-8182307) failure mode is more like the other failure
that Jerry fixed: https://bugs.openjdk.java.net/browse/JDK-8178676
The server side accept() is configured to timeout so we don't have
a stack from the server side hang point to prove that the JDK-8178676
failure is the same as the JDK-8182757 failure.


With the fixes for JDK-8182757 and JDK-8178676 in place, we have not
seen these failure modes reproduce. The fix for JDK-8182757 was pushed
on 2017-08-03 and the fix for JDK-8178676 was pushed on 2017-08-14. It
is not proof, but it is a strong indicator that these instances of
this failure mode are fixed.


> The client will not care if the server has enabled SO_REUSEADDR or
> whether it initially bound to a fixed or ephemeral port.

True, but the client has been connected to the interloper process
which is why the read() times out. It is the SO_REUSEADDR attribute
that allows the interloper to accept() the test's server side port
and that does break the client side of the test.


> Is this issue Solaris only? I ask because there is an awkward issue on
> Solaris where the kernel will accept a pending connection when the
> process is at its file descriptor limit. We've seen this periodically,
> esp. with tests that leave connections or files open. An unsuspecting
> test runs later, establishes a connection but gets timeouts as no code
> at the application level has accepted the connection.

We have only seen provable sightings of this failure mode on Solaris
SPARC and Solaris X64 machines. Folks have added sightings on other
platforms to the older bug that was tracking the original issue:

     JDK-6303969 JDWP: Socket Transport handshake fails rarely on InstancesTest.java
     https://bugs.openjdk.java.net/browse/JDK-6303969

but Jerry and I were never able to prove a sighting on anything other
than Solaris.

This "file descriptor limit" issue is new to me. Do you have a pointer
to it? It's entirely possible that there is more than one bug at play
here...

Dan

