Re: jtreg testing integrated


Re: jtreg testing integrated

Martin Buchholz-3
[+compiler-dev, jtreg-use]

On Mon, May 19, 2008 at 7:56 AM, Andrew John Hughes
<[hidden email]> wrote:
> 2008/5/19 Mark Wielaard <[hidden email]>:
>>
>> make jtregcheck -k runs the testsuites of hotspot (4 tests, all PASS),
>> langtools (1,342 PASS, 1 FAIL - the version check) and jdk (2,875 tests
>> of which about 130 fail - rerunning tests now). corba, jaxp and jaxws
>> don't come with any tests. This takes about 3 hours on my machine.

Once upon a time, I wrote a test that made sure the hotspot
and jdk library's idea of the current version and supported targets
were in sync.  Unfortunately, it is not a requirement on hotspot
integrations that they pass this test, so the test starts failing whenever
hotspot starts supporting the class file version number for the next
major release.  At least this is a strong hint to the javac team to
catch up soon by incrementing their supported targets, etc...

I like a policy of "Read my lips; no new test failures" but OpenJDK
is not quite there; we get test failure creep when changes in
one component break another component's tests.

>> Most of the failures are because the host javaweb.sfbay.sun.com cannot
>> be resolved.

The jtreg tests were originally designed to be run only by Sun JDK
development and test engineers.  If someone can come up with a
portable way of testing network services (like ftp clients) without
setting up a dedicated machine with a well-known name, that
would be good.  Alternatively, making the name of this
machine configurable when jtreg is run would also
be an improvement, and a much simpler one.  But the obvious
idea of using environment variables doesn't work.  Most environment
variables are not passed to the running java test program.
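For illustration, one shape such a configuration hook could take is a JVM system property read by the test, with a fallback to the historical hard-coded name; unlike most environment variables, a `-D` option can reach the test VM. This is only a sketch, and the property name `test.ftp.host` is invented here, not an existing jtreg convention.

```java
// Sketch only: a hypothetical way for a network test to pick up a
// configurable service host. The property name "test.ftp.host" is made up.
public class HostConfig {
    static final String DEFAULT_HOST = "javaweb.sfbay.sun.com";

    // Returns the host to test against: the system property if set,
    // otherwise the historical well-known machine name.
    static String testHost() {
        return System.getProperty("test.ftp.host", DEFAULT_HOST);
    }

    public static void main(String[] args) {
        System.out.println("testing against " + testHost());
    }
}
```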

If it's considered acceptable for IcedTea hackers to get their
hands dirty with not-100%-free technology, y'all could try
running the jtreg tests against IcedTea, vanilla OpenJDK7,
OpenJDK6, and JDK 6u6, and comparing the test failures.

I once wrote a script to compare two jtreg test runs, diff-javatest.
Jonathan et al, could you work (with me) on releasing that as open source?

Martin

>> But there are also some genuine failures in java.awt.color,
>> jmx.snmp, javax.script, javax.print, ... so enough to do for
>> enterprising hackers!
>>

Re: jtreg testing integrated

jonathan.gibbons
Martin,

jtreg is now open source, as of just before JavaOne. See http://openjdk.java.net/jtreg

There is a small new utility called "jtdiff" that comes with jtreg that may do what
you want.  It will do n-way comparison of any JavaTest-type results (meaning jtreg, JCK, etc)
where each set of results can be given as a JavaTest work directory, report directory,
or just the summary.txt file within a report directory. The output can be plain text or HTML.

jtdiff is within jtreg.jar, so the easiest way to invoke it is
        java -cp jtreg.jar com.sun.javatest.diff.Main <args>

-- Jon



Re: jtreg testing integrated

Martin Buchholz-3
Jonathan,

Thanks for jtdiff.

Suggestions:

- The various help options are confusing.
  Just print all the help available if given -help or -usage.
- The usage should give a paragraph explaining what it does.
  Not too much work; why, you've practically written the required
  words in the quoted text below.
- My first version of diff-javatest was symmetrical.  It printed any
  difference between the two runs, in both directions.  Later I
  realized that (at least my own) usage invariably had the notion
  of a "reference" JDK and a "test" JDK.  I was interested in tests
  run in the "test" JDK but not in the reference JDK, but not vice
  versa; typically the reference JDK results were historical ones
  produced by someone wearing a QA hat, and were more complete
  than the ones in the test JDK, where results were more likely to be
  part of an edit-compile-test cycle.
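The asymmetric "reference vs test" comparison described above can be sketched as a one-direction diff over per-test statuses. The map-based representation (test name to status string) is an assumption made for illustration; real input would come from summary.txt files or work directories.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of an asymmetric comparison: report tests whose status in the
// "test" run is new or differs from the "reference" run, while ignoring
// tests that appear only in the (typically more complete) reference run.
public class ReferenceDiff {
    static Map<String, String> diff(Map<String, String> reference,
                                    Map<String, String> test) {
        Map<String, String> changes = new TreeMap<>();
        for (Map.Entry<String, String> e : test.entrySet()) {
            String before = reference.get(e.getKey());
            if (!e.getValue().equals(before))
                changes.put(e.getKey(),
                            (before == null ? "(not run)" : before)
                            + " -> " + e.getValue());
        }
        return changes;
    }
}
```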

Try printing the usage message from my own diff-javatest script,
which should still be accessible inside Sun, in /java/tl, for example.

Martin

On Mon, May 19, 2008 at 9:52 AM, Jonathan Gibbons
<[hidden email]> wrote:

> Martin,
>
> jtreg is now open source, as of just before JavaOne. See
> http://openjdk.java.net/jtreg
>
> There is a small new utility called "jtdiff" that comes with jtreg that may do what
> you want.  It will do n-way comparison of any JavaTest-type results (meaning jtreg, JCK, etc)
> where each set of results can be given as a JavaTest work directory, report directory,
> or just the summary.txt file within a report directory. The output can be plain text or HTML.
>
> jtdiff is within jtreg.jar, so the easiest way to invoke it is
>        java -cp jtreg.jar com.sun.javatest.diff.Main <args>
>
> -- Jon

Re: jtreg testing integrated

jonathan.gibbons
Martin,

(removed compiler-dev)

Your comments about unsymmetric runs are interesting.  jtdiff performs an n-way
comparison, and I'd want to keep that functionality.

The two use cases I had in mind were:

-- Given a set of nightly builds on a set of platforms, compare the results
    across all the platforms, and report differences

-- Given the same set of nightly builds on a set of platforms, for each platform
    perform a pair-wise comparison against the corresponding results
    last night/week/month.

I'll see about adding an option to specify a reference set of results, for your
"developer time" use case.

---

Separately, check out the options for handling @ignore tests. Even on older
versions of jtreg you can use "-k:!ignore" to ignore @ignore tests.  (This
works because @ignore tests are given an implicit "ignore" keyword.)  With
later versions of jtreg, you can use -ignore:{quiet,error,run} to control how
@ignore tests should be handled. Using this option, you should be able to get
closer to the goal of "all tests should pass", meaning that there are fewer
failures and so less need to compare the output results with jtdiff.

-- Jon





Re: jtreg testing integrated

Christian Thalinger
On Mon, 2008-05-19 at 16:07 -0700, Jonathan Gibbons wrote:

> Martin,
>
> (removed compiler-dev)
>
> Your comments about unsymmetric runs are interesting.  jtdiff performs an n-way
> comparison, and I'd want to keep that functionality.
>
> The two use cases I had in mind were:
>
> -- Given a set of nightly builds on a set of platforms, compare the results
>    across all the platforms, and report differences
>
> -- Given the same set of nightly builds on a set of platforms, for each platform
>    perform a pair-wise comparison against the corresponding results
>    last night/week/month.

That's exactly what I want to have.  Does jtdiff currently support
anything of the above?

- twisti


Re: jtreg testing integrated

Mark Wielaard
In reply to this post by Martin Buchholz-3
Hi Martin,

On Mon, 2008-05-19 at 08:30 -0700, Martin Buchholz wrote:

> On Mon, May 19, 2008 at 7:56 AM, Andrew John Hughes
> <[hidden email]> wrote:
> > 2008/5/19 Mark Wielaard <[hidden email]>:
> >>
> >> make jtregcheck -k runs the testsuites of hotspot (4 tests, all PASS),
> >> langtools (1,342 PASS, 1 FAIL - the version check) and jdk (2,875 tests
> >> of which about 130 fail - rerunning tests now). corba, jaxp and jaxws
> >> don't come with any tests. This takes about 3 hours on my machine.
>
> Once upon a time, I wrote a test that made sure the hotspot
> and jdk library's idea of the current version and supported targets
> were in sync.  Unfortunately, it is not a requirement on hotspot
> integrations that they pass this test, so the test starts failing whenever
> hotspot starts supporting the class file version number for the next
> major release.  At least this is a strong hint to the javac team to
> catch up soon by incrementing their supported targets, etc...

In this case it is a more mundane version check failure:
tools/javac/versions/check.sh
javac reports its version as "javac 1.6.0-internal" rather than "javac 1.6.0".

> I like a policy of "Read my lips; no new test failures" but OpenJDK
> is not quite there; we get test failure creep when changes in
> one component break another component's tests.

Yes, that would be ideal. At least for openjdk6/icedtea we seem to be
pretty close actually. It will be more challenging for openjdk7. I
haven't quite figured out all the dynamics around "workspace
integration". But I assume we can get the master tree to zero failures and
then demand that any integration cycle doesn't introduce regressions.

> The jtreg tests were originally designed to be run only by Sun JDK
> development and test engineers.  If someone can come up with a
> portable way of testing network services (like ftp clients) without
> setting up a dedicated machine with a well-known name, that
> would be good.  Alternatively, making the name of this
> machine configurable when jtreg is run would also
> be an improvement, and a much simpler one.  But the obvious
> idea of using environment variables doesn't work.  Most environment
> variables are not passed to the running java test program.

Making it configurable, or even ignorable with keywords would be crucial
for distribution testing. Most distributions don't allow their build
daemons to access the network. But for quality control it is essential
that they do run the full test suite.

I haven't made an inventory of what services would be needed to replace
javaweb.sfbay.sun.com with a public machine, but we can certainly run
some services on icedtea.classpath.org or maybe one of the
openjdk.java.net machines.

> If it's considered acceptable for IcedTea hackers to get their
> hands dirty with not-100%-free technology, y'all could try
> running the jtreg tests against IcedTea, vanilla OpenJDK7,
> OpenJDK6, and JDK 6u6, and comparing the test failures.

:) Well, the whole idea behind IcedTea is to provide an OpenJDK
derivative that doesn't depend on any non-free build or runtime
requirements.

But I am certainly interested in comparing results. I do think OpenJDK6
and IcedTea are now so close that we shouldn't be seeing any test result
differences between the two.

That brings up the question of how to export a JTreport/JTwork environment.
I only posted the text results at http://icedtea.classpath.org/~mjw/jtreg/
since the JTreport and JTwork files have all kinds of hard-coded
absolute path references. It would be nice to be able to export it all
so I can upload it to some public site for others to look at and compare
with.

Cheers,

Mark


Re: jtreg testing integrated

Martin Buchholz-3
In reply to this post by jonathan.gibbons
Wait, now I remember, I used to wear the integrator hat too,
and I wanted the ability to data-mine a long series of javatest runs.
I never tackled that problem.

The general problem is really hard.
You want a report that says things like

Test FOO has been failing intermittently since the Ides of March,
but only on 64-bit x86 platforms.
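A toy version of that mining, assuming run results have already been reduced to per-run maps of test name to status (the real input would be a series of summary.txt files or work directories, and this reduced representation is an assumption for the sketch), could look like:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Sketch: find intermittent failures across a series of runs. Each run is
// labelled (e.g. "2008-03-15/linux-amd64") and maps test names to a status
// string such as "Passed." or "Failed. exit code 1".
public class FailureHistory {
    static Map<String, String> intermittent(Map<String, Map<String, String>> runs) {
        // test name -> set of run labels in which it failed
        Map<String, Set<String>> failingRuns = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> run : runs.entrySet())
            for (Map.Entry<String, String> result : run.getValue().entrySet())
                if (result.getValue().startsWith("Failed"))
                    failingRuns.computeIfAbsent(result.getKey(), k -> new TreeSet<>())
                               .add(run.getKey());
        Map<String, String> report = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : failingRuns.entrySet())
            if (e.getValue().size() < runs.size())  // fails sometimes, not always
                report.put(e.getKey(), "intermittent; fails on " + e.getValue());
        return report;
    }
}
```

Slicing the run labels by platform (or date) would then give the "only on 64-bit x86" part of the report.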

Martin


Re: jtreg testing integrated

Martin Buchholz-3
In reply to this post by Mark Wielaard
On Tue, May 20, 2008 at 2:32 AM, Mark Wielaard <[hidden email]> wrote:
>> I like a policy of "Read my lips; no new test failures" but OpenJDK
>> is not quite there; we get test failure creep when changes in
>> one component break another component's tests.
>
> Yes, that would be ideal. At least for openjdk6/icedtea we seem to be
> pretty close actually. It will be more challenging for openjdk7. I
> haven't quite figured out all the dynamics around "workspace
> integration". But I assume we can get the master tree to zero failures and
> then demand that any integration cycle doesn't introduce regressions.

There are too many tests to require team integrators to run
them all on each integration cycle.  For a few years I've advocated
adding another level to the tree of workspaces.  My model is to
rename the current MASTER workspace to PURGATORY, and
add a "golden MASTER".
The idea is that once a week or so all tests are run exhaustively,
and when it is confirmed that there are no new test failures,
the tested code from PURGATORY is promoted to MASTER.

> That brings up the question of how to export a JTreport/JTwork environment.
> I only posted the text results at http://icedtea.classpath.org/~mjw/jtreg/
> since the JTreport and JTwork files have all kinds of hard-coded
> absolute path references. It would be nice to be able to export it all
> so I can upload it to some public site for others to look at and compare
> with.

My (unavailable) diff-javatest script had to contend with absolute
paths in the html in the report directory as well.  It made paths
relative by removing root dirs.  It would be good if javatest's output
was made more "portable" in this sense.  It's hard, because you
really do want direct pointers to failing tests and .jtr files, and
their location relative to the report directory cannot in general
be relativized.
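A minimal version of that root-stripping step (just textual removal of a known root directory, which is what making the paths relative amounts to when the files stay where they are) might be:

```java
// Sketch: make report text comparable across machines by stripping a known
// root directory prefix from any absolute paths it contains.
public class Relativize {
    static String stripRoot(String reportText, String rootDir) {
        String prefix = rootDir.endsWith("/") ? rootDir : rootDir + "/";
        return reportText.replace(prefix, "");
    }
}
```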

Martin

Re: jtreg testing integrated

jonathan.gibbons
In reply to this post by Christian Thalinger
Twisti,

As I said, jtdiff is (just) a relatively simple n-way diff program for
comparing sets of jtreg results.

By itself, it does not provide any infrastructure for maintaining those
results for later, wider comparison. However, if you were to organize your
results into a directory tree such as
        jtreg/DATE/PLATFORM
then it should be reasonably easy to write scripts to do the platform-wide
or date-based comparison runs.  I don't think I mentioned yesterday that
jtdiff has an Ant task too, so maybe you can do the processing you need
inside Ant. You should also be able to invoke jtdiff directly from other
Java code.
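Sketched against that layout, the two kinds of comparison scripts would just assemble jtdiff command lines (using the invocation quoted earlier in the thread; the directory names are examples, not a prescribed scheme):

```java
import java.util.List;

// Sketch: build jtdiff invocations over a jtreg/DATE/PLATFORM results tree.
public class DiffPlans {
    static final String JTDIFF = "java -cp jtreg.jar com.sun.javatest.diff.Main";

    // Cross-platform comparison: one n-way diff over all platforms for a date.
    static String acrossPlatforms(String root, String date, List<String> platforms) {
        StringBuilder cmd = new StringBuilder(JTDIFF);
        for (String p : platforms)
            cmd.append(' ').append(root).append('/').append(date).append('/').append(p);
        return cmd.toString();
    }

    // Date-based comparison: pair-wise diff of one platform against an earlier run.
    static String overTime(String root, String platform, String oldDate, String newDate) {
        return JTDIFF + ' ' + root + '/' + oldDate + '/' + platform
                      + ' ' + root + '/' + newDate + '/' + platform;
    }
}
```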

-- Jon





Re: jtreg testing integrated

jonathan.gibbons
In reply to this post by Martin Buchholz-3
Martin,

One benefit of jtdiff is that it only requires you to keep the summary.txt
file, and not the whole report or work directory. So it might be possible
to keep more historical data and analyse it going forward.

Also, jtdiff has "pluggable output formatters", so you could write an XML
output formatter and just keep jtdiff reports, and analyse those for
historical trends.

-- Jon




Re: jtreg testing integrated

jonathan.gibbons
In reply to this post by Martin Buchholz-3

On May 20, 2008, at 6:00 AM, Martin Buchholz wrote:

>
> My (unavailable) diff-javatest script had to contend with absolute
> paths in the html in the report directory as well.  It made paths
> relative by removing root dirs.  It would be good if javatest's output
> was made more "portable" in this sense.  It's hard, because you
> really do want direct pointers to failing tests and .jtr files, and
> their location relative to the report directory cannot in general
> be relativized.
>
> Martin

In times past, we tried to resolve the "report" problem you describe
within JavaTest. We actually tried to use relative pointers where possible.
The problem was that every solution we came up with broke someone's
use case.  A particularly notable problem was people running tests on
one system and then moving results around to another system.

The solution was to provide a utility called "EditLinks" within the
JavaTest framework. I assume it is still available within JT Harness. This
is a simple utility for post-processing the links within report files so
that you can move report files and work directories around as you choose.

[ Another possibility in the JT Harness space is that it now has a much
more configurable report generator. Perhaps the time has come to look again
at the relationship between the work and report directories.  JT Harness
lives at http://jtharness.dev.java.net with a mailing list at
[hidden email]. ]

-- Jon






Re: jtreg testing integrated

jonathan.gibbons
In reply to this post by Mark Wielaard

On May 20, 2008, at 2:32 AM, Mark Wielaard wrote:

> Hi Martin,
>
> On Mon, 2008-05-19 at 08:30 -0700, Martin Buchholz wrote:
>> On Mon, May 19, 2008 at 7:56 AM, Andrew John Hughes
>> <[hidden email]> wrote:
>>> 2008/5/19 Mark Wielaard <[hidden email]>:
>>>>
>>>> make jtregcheck -k runs the testsuites of hotspot (4 tests, all PASS),
>>>> langtools (1,342 PASS, 1 FAIL - the version check) and jdk (2,875 tests
>>>> of which about 130 fail - rerunning tests now). corba, jaxp and jaxws
>>>> don't come with any tests. This takes about 3 hours on my machine.
>>
>> Once upon a time, I wrote a test that made sure the hotspot
>> and jdk library's idea of the current version and supported targets
>> were in sync.  Unfortunately, it is not a requirement on hotspot
>> integrations that they pass this test, so the test starts failing whenever
>> hotspot starts supporting the class file version number for the next
>> major release.  At least this is a strong hint to the javac team to
>> catch up soon by incrementing their supported targets, etc...
>
> In this case it is a more mundane version check failure:
> tools/javac/versions/check.sh
> javac reports its version as "javac 1.6.0-internal" rather than "javac 1.6.0".
>
>> I like a policy of "Read my lips; no new test failures" but OpenJDK
>> is not quite there; we get test failure creep when changes in
>> one component break another component's tests.
>
> Yes, that would be ideal. At least for openjdk6/icedtea we seem to be
> pretty close actually. It will be more challenging for openjdk7. I
> haven't quite figured out all the dynamics around "workspace
> integration". But I assume we can get the master tree to zero failures and
> then demand that any integration cycle doesn't introduce regressions.
>
>> The jtreg tests were originally designed to be run only by Sun JDK
>> development and test engineers.  If someone can come up with a
>> portable way of testing network services (like ftp clients) without
>> setting up a dedicated machine with a well-known name, that
>> would be good.  Alternatively, making the name of this
>> machine configurable when jtreg is run would also
>> be an improvement, and a much simpler one.  But the obvious
>> idea of using environment variables doesn't work.  Most environment
>> variables are not passed to the running java test program.
>
> Making it configurable, or even ignorable with keywords would be crucial
> for distribution testing. Most distributions don't allow their build
> daemons to access the network. But for quality control it is essential
> that they do run the full test suite.

jtreg allows tests to be tagged with keywords, which can be used on the
command line to filter the tests to be executed.

>
>
> I haven't made an inventory of what services would be needed to replace
> javaweb.sfbay.sun.com with a public machine, but we can certainly run
> some services on icedtea.classpath.org or maybe one of the
> openjdk.java.net machines.
>
>> If it's considered acceptable for IcedTea hackers to get their
>> hands dirty with not-100%-free technology, y'all could try
>> running the jtreg tests against IcedTea, vanilla OpenJDK7,
>> OpenJDK6, and JDK 6u6, and comparing the test failures.
>
> :) Well, the whole idea behind IcedTea is to provide an OpenJDK
> derivative that doesn't depend on any non-free build or runtime
> requirements.
>
> But I am certainly interested in comparing results. I do think OpenJDK6
> and IcedTea are now so close that we shouldn't be seeing any test result
> differences between the two.
>
> That brings up the question of how to export a JTreport/JTwork
> environment. I only posted the text results at
> http://icedtea.classpath.org/~mjw/jtreg/
> since the JTreport and JTwork files have all kinds of hard-coded
> absolute path references. It would be nice to be able to export it all
> so I can upload it to some public site for others to look at and
> compare with.

At a minimum, you'd want to publish the summary.txt files from the
report directory. Note also that JT Harness comes with a couple of
servlets you can install for pretty viewing of .jtr and .jtx files.

>
>
> Cheers,
>
> MArk
>

Reply | Threaded
Open this post in threaded view
|

Re: jtreg testing integrated

Mark Wielaard
In reply to this post by jonathan.gibbons
Hi Jonathan,

On Mon, 2008-05-19 at 16:07 -0700, Jonathan Gibbons wrote:

> Separately, check out the options for handling @ignore tests. Even on
> older versions of jtreg you can use "-k:!ignore" to exclude @ignore
> tests.  (This works because @ignore tests are given an implicit
> "ignore" keyword.)  With later versions of jtreg, you can use
> -Ignore:{quiet,error,run} to control how @ignore tests should be
> handled. Using this option, you should be able to get closer to the
> goal of "all tests should pass", meaning that there are fewer failures
> and so less need to compare the output results with jtdiff.

This is really a great feature! For icedtea we now use "-v1 -a
-ignore:quiet", which gives output and results that should be pretty
familiar to people. And this is the set that I hope we can get to be all
PASS in the default case.

One extension might be a "-ignore:try" that does try to run the test,
doesn't report it as a failure, but does flag it as an unexpected XPASS
to alert people to bugs that are (accidentally) fixed but whose testcase
was not yet enabled.

Cheers,

Mark


Re: jtreg testing integrated

Mark Wielaard
In reply to this post by Martin Buchholz-3
Hi Martin,

On Tue, 2008-05-20 at 06:00 -0700, Martin Buchholz wrote:

> On Tue, May 20, 2008 at 2:32 AM, Mark Wielaard <[hidden email]> wrote:
> >> I like a policy of "Read my lips; no new test failures" but OpenJDK
> >> is not quite there; we get test failure creep when changes in
> >> one component break another component's tests.
> >
> > Yes, that would be ideal. At least for openjdk6/icedtea we seem to be
> > pretty close actually. It will be more challenging for openjdk7. I
> > haven't quite figured out all the dynamics around "workspace
> > integration". But I assume we can get the master tree to zero fail and
> > then demand that any integration cycle doesn't introduce regressions.
>
> There are too many tests to require team integrators to run
> them all on each integration cycle.

I am not sure. It does take about 3 hours to run all the included tests
(and I assume that when we add more tests or integrate things like mauve
it will rise). But I do hope people, not just integrators, will run them
regularly, especially when they are working on or integrating larger
patches. And we can always fall back on autobuilders, so we have a full
report soon after something bad happens and some chance to revert a
change relatively quickly.

>   For a few years I've advocated
> adding another level to the tree of workspaces.  My model is to
> rename the current MASTER workspace to PURGATORY, and
> add a "golden MASTER".
> The idea is that once a week or so all tests are run exhaustively,
> and when it is confirmed that there are no new test failures,
> the tested code from PURGATORY is promoted to MASTER.

This is fascinating. Intuitively I would call for fewer levels instead
of more, because that makes issues show up earlier. It is one of the
things I haven't really wrapped my head around: the proliferation of
separate branches/workspaces. One main master tree where all work goes
in by default, with separate (ad hoc) branches/workspaces only for
larger work items that might be destabilizing, seems an easier model to
work with.

Cheers,

Mark


Re: jtreg testing integrated

Martin Buchholz-3
[+quality-discuss, jdk7-gk]

On Thu, May 22, 2008 at 7:27 AM, Mark Wielaard <[hidden email]> wrote:

> Hi Martin,
>
> On Tue, 2008-05-20 at 06:00 -0700, Martin Buchholz wrote:
>> On Tue, May 20, 2008 at 2:32 AM, Mark Wielaard <[hidden email]> wrote:
>> >> I like a policy of "Read my lips; no new test failures" but OpenJDK
>> >> is not quite there; we get test failure creep when changes in
>> >> one component break another component's tests.
>> >
> > Yes, that would be ideal. At least for openjdk6/icedtea we seem to be
>> > pretty close actually. It will be more challenging for openjdk7. I
>> > haven't quite figured out all the dynamics around "workspace
>> > integration". But I assume we can get the master tree to zero fail and
>> > then demand that any integration cycle doesn't introduce regressions.
>>
>> There are too many tests to require team integrators to run
>> them all on each integration cycle.
>
> I am not sure. It does take about 3 hours to run all the included tests
> (and I assume that when we add more tests or integrate things like mauve
> it will rise).

Not all the regression tests are open source yet, and not all the
test suites available are open source (and some are likely to be
permanently encumbered).  And we should be adding more
static analysis tools to the testing process.

It sure would be nice to run all tests with -server and -client,
with different GCs, on 32- and 64-bit platforms,
with Java assertions enabled and disabled,
and with C++ assertions enabled and disabled.

Soon a "full" testing cycle looks like it might take a week.
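A back-of-the-envelope sketch of why the matrix explodes; the dimension values below are illustrative assumptions, not an actual OpenJDK test plan:

```java
import java.util.*;

public class TestMatrix {
    // Multiply out the sizes of each configuration dimension.
    static long countCombos(Map<String, List<String>> dims) {
        long combos = 1;
        for (List<String> values : dims.values())
            combos *= values.size();
        return combos;
    }

    public static void main(String[] args) {
        // Illustrative dimensions only.
        Map<String, List<String>> dims = new LinkedHashMap<>();
        dims.put("compiler", Arrays.asList("-server", "-client"));
        dims.put("gc", Arrays.asList("serial", "parallel", "concurrent"));
        dims.put("data model", Arrays.asList("32-bit", "64-bit"));
        dims.put("java asserts", Arrays.asList("on", "off"));
        dims.put("c++ asserts", Arrays.asList("on", "off"));

        long combos = countCombos(dims);  // 2 * 3 * 2 * 2 * 2 = 48
        // At roughly 3 hours per full jtreg run (the figure quoted
        // earlier in the thread), that is combos * 3 machine-hours.
        System.out.println(combos + " configurations, ~" + (combos * 3)
                + " machine-hours per full cycle");
    }
}
```

Even these modest assumed dimensions yield 48 configurations and on the order of 144 machine-hours per cycle, which is roughly the "week" mentioned above.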

> But I do hope people, not just integrators, will run them
> regularly. Especially when they are working on/integrating larger
> patches. And we can always fall back on autobuilders so we have a full
> report at least soon after something bad happens so there is some chance
> to revert a change relatively quickly.

Much of the world works on this model:
commit to trunk, wait for trouble, revert.
It's certainly much cheaper, and gets feedback quicker,
but it creates fear among developers ("Notoriously careless
developer X just did a commit.  I think I'll wait a week
before pulling.")

>>   For a few years I've advocated
>> adding another level to the tree of workspaces.  My model is to
>> rename the current MASTER workspace to PURGATORY, and
>> add a "golden MASTER".
>> The idea is that once a week or so all tests are run exhaustively,
>> and when it is confirmed that there are no new test failures,
>> the tested code from PURGATORY is promoted to MASTER.
>
> This is fascinating. Intuitively I would call for less levels instead of
> more because that makes issues show up earlier. It is one of the things
> I haven't really wrapped my head around. The proliferation of separate
> branches/workspaces. One main master tree where all work goes into by
> default and only have separate (ad hoc) branches/workspaces for larger
> work items that might be destabilizing seems an easier model to work
> with.

It's certainly more work for the integrators.  But for the developers
my model is simple and comfortable.  Your integrator will give you
a workspace to commit changes to.
Commit there whenever you feel like it.  Go on to the next coding task.
Your changes will take a while to percolate into MASTER,
but what do you care?
When you sync, you pull in changes from MASTER, which are
*guaranteed* to not break any of your tests.  If you want specific
changes quickly, pull from PURGATORY or a less-tested team
workspace.

If you have a project where you need to share your work
with other developers immediately,
no problem - just create a project-specific shared workspace
that all project team members can commit to directly.
Decide on a level of testing the team is comfortable with -
including none at all.

Developers in my model are more productive partly because
they don't have to be afraid of breaking other developers.
They can do enough testing for 95% confidence
(which for many changes might mean no testing at all)
then commit.  The system will push back buggy changes
automatically.

Too many times I've suffered because tests in library land
have been broken by changes in hotspot.  Nevertheless,
the JDK MASTER is remarkably stable for a project with so
many developers, largely because of the gradual integration
process, with changes going into MASTER only after being
tested by integrators.  JDK developers don't go around chatting
about "build weather" - is the build broken today?  AGAIN?

This development model doesn't work as well for most
open source projects, because they have fewer, smarter, and more
dedicated developers, so there is less need.
Also, it's hard to find good integrators.  Most people (like myself)
end up doing it as a part-time job.  But just like source code
control systems have gotten sexy, perhaps someday
"code integration and testing systems" will become sexy,
and everyone will want to write one.

Martin

Re: jtreg testing integrated

Mark Wielaard
In reply to this post by jonathan.gibbons
Hi Jon,

On Tue, 2008-05-20 at 08:24 -0700, Jonathan Gibbons wrote:
The solution was to provide a utility called "EditLinks" within the
JavaTest framework. I assume it is still available within JT Harness.
This is a simple utility for post-processing the links within report
files so that you can move report files and work directories around as
you choose.

Hey, that is pretty neat!

So I just did this in my icedtea6 dir:
$ for i in hotspot langtools jdk; do java -cp test/jtreg.jar \
  com.sun.javatest.EditLinks -e `pwd` \
  http://icedtea.classpath.org/~mjw/jtreg test/$i; done

And uploaded the results:
http://icedtea.classpath.org/~mjw/jtreg/test/

And indeed the links now work and you can get a html overview of the
failure lists and correct links to the .jtr files such as:
http://icedtea.classpath.org/~mjw/jtreg/test/jdk/JTreport/html/failed.html

Great,

Mark


Re: jtreg testing integrated

Ismael Juma
In reply to this post by Martin Buchholz-3
Martin Buchholz <martinrb@...> writes:
> This development model doesn't work as well for most
> open source projects, because they have fewer, smarter, and more
> dedicated developers, so there is less need.

It's also related to the rate of change taking place (which is often
correlated with the number of developers in the project). As an external
observer, it seems to me that the Linux kernel has a similar model to
OpenJDK, with several integrators at different levels (Linus, Andrew
Morton, subsystem maintainers, arch maintainers, etc.). The rate of
change in each Linux kernel release is huge and would be hard to achieve
in any other way.

Regards,
Ismael


Re: jtreg testing integrated

Martin Buchholz-3
It's true that the Linux kernel is developed in a distributed
tree-of-trees model, just like OpenJDK, and Linus himself
is the most outspoken advocate of distributed source code
control systems.  (I mostly agree with him there)

One difference is the culture of testing.  The Linux kernel
is hard to test, and doesn't seem to have a strong culture
of testing, while the JDK has on the order of a million tests
available to be run, which makes great stability and reliability
possible.

Martin

On Thu, May 22, 2008 at 8:35 AM, Ismael Juma <[hidden email]> wrote:

> Martin Buchholz <martinrb@...> writes:
>> This development model doesn't work as well for most
>> open source projects, because they have fewer, smarter, and more
>> dedicated developers, so there is less need.
>
> It's also related to the rate of change taking place (which is often
> correlated with the number of developers in the project). As an external
> observer, it seems to me that the Linux kernel has a similar model to
> OpenJDK, with several integrators at different levels (Linus, Andrew
> Morton, subsystem maintainers, arch maintainers, etc.). The rate of
> change in each Linux kernel release is huge and would be hard to achieve
> in any other way.
>
> Regards,
> Ismael
>
>

Re: jtreg testing integrated

Ismael Juma
Martin Buchholz <martinrb@...> writes:
> One difference is the culture of testing.  The Linux kernel
> is hard to test, and doesn't seem to have a strong culture
> of testing, while the JDK has on the order of a million tests
> available to be run, which makes great stability and reliability
> possible.

While I understand the testing point, I am not convinced that the conclusion
follows in practice. More specifically, it seems like you're implying that the
JDK is more stable and reliable than the Linux kernel. :)

It's always hard to make general judgements based on personal experiences, but I
don't remember when I last had a kernel panic and I always use the latest stable
release. Admittedly I am mostly a desktop/server user, so I don't touch the
flakier parts of the kernel like suspend and resume.

HotSpot -server on the other hand has been less than inspiring since jdk6u4 when
running some very popular applications.

I ran across an easy-to-reproduce crash running eclipse[1], a different crash
running the eclipse compiler[2] and index corruption in Lucene[3]. All of them
started from jdk6u4 and none have been fixed as of jdk6u10 b24. So maybe there's
still some work to be done to achieve great stability and reliability. :)

Yes, I am aware that bugs will always exist, I am just a bit sad that such nasty
problems were introduced in a stable release and no fix exists yet. It also
doesn't help that HotSpot has always been rock-solid in the past, so you could
call me spoiled. :)

Regards,
Ismael

[1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6614100 -> Note that this
still happens with the latest jdk6u10 beta (b24) unlike what is implied by the
resolution of 6659207 (which someone decided 6614100 was a duplicate of).

[2] http://icedtea.classpath.org/bugzilla/show_bug.cgi?id=152 -> Note that some
people posted similar crash dumps in [1], but the original description of [1]
was for a crash at a different point.

[3] http://tinyurl.com/64c9px (Lucene JIRA)



Re: jtreg testing integrated

jonathan.gibbons
In reply to this post by Mark Wielaard
Mark,

The suggestion regarding -ignore:try is interesting, but I'd have to
think about how it could be done. The underlying JT Harness does not
support such a concept, and I doubt that it would be easy to add. We'd
have to create a side table in jtreg of @ignored tests that were
executed and which passed. I'll have to go talk to the JT Harness folk
to see if I could add that info into a report.

One idea that has been on the table for a while is a "known failure
list".  You do a test run and then compare results against a list
containing tests which are regrettably known to fail.  Seems to me we
could use -ignore:run for jtreg, then invoke jtdiff against the KFL to
get a report of which tests did not behave as expected, including tests
which now pass but previously did not.

-- Jon
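The known-failure-list comparison described above can be sketched minimally like this (an illustration only, not the actual jtdiff implementation; the test names and helper are hypothetical):

```java
import java.util.*;

public class KflCheck {
    // Compare test results against a known failure list (KFL):
    // a failure not on the KFL is a regression; a pass that is on the
    // KFL is a (possibly accidental) fix whose KFL entry can be retired.
    static List<String> unexpected(Map<String, Boolean> passed, Set<String> kfl) {
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : passed.entrySet()) {
            boolean onKfl = kfl.contains(e.getKey());
            if (!e.getValue() && !onKfl)
                report.add("NEW FAILURE: " + e.getKey());
            else if (e.getValue() && onKfl)
                report.add("NOW PASSES:  " + e.getKey());
        }
        Collections.sort(report);
        return report;
    }

    public static void main(String[] args) {
        Map<String, Boolean> results = new TreeMap<>();
        results.put("java/net/FtpTest.java", false);       // fails, not on KFL
        results.put("java/lang/VersionCheck.java", false); // fails, on KFL
        results.put("java/util/OldBug.java", true);        // passes, still on KFL
        Set<String> kfl = new HashSet<>(Arrays.asList(
                "java/lang/VersionCheck.java", "java/util/OldBug.java"));
        for (String line : unexpected(results, kfl))
            System.out.println(line);
    }
}
```

Only the first and third tests are reported: the known failure behaves as expected and stays quiet, which is the point of the KFL.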

On May 22, 2008, at 7:16 AM, Mark Wielaard wrote:

> Hi Jonathan,
>
> On Mon, 2008-05-19 at 16:07 -0700, Jonathan Gibbons wrote:
>> Separately, check out the options for handling @ignore tests. Even on
>> older versions of jtreg you can use "-k:!ignore" to exclude @ignore
>> tests.  (This works because @ignore tests are given an implicit
>> "ignore" keyword.)  With later versions of jtreg, you can use
>> -Ignore:{quiet,error,run} to control how @ignore tests should be
>> handled. Using this option, you should be able to get closer to the
>> goal of "all tests should pass", meaning that there are fewer
>> failures and so less need to compare the output results with jtdiff.
>
> This is really a great feature! For icedtea we now use "-v1 -a
> -ignore:quiet", which gives output and results that should be pretty
> familiar to people. And this is the set that I hope we can get to be
> all PASS in the default case.
>
> One extension might be a "-ignore:try" that does try to run the test,
> doesn't report it as a failure, but does flag it as an unexpected
> XPASS to alert people to bugs that are (accidentally) fixed but whose
> testcase was not yet enabled.
>
> Cheers,
>
> Mark
>
