Discussion:
RFR: 8214230: Classes generated by SystemModulesPlugin.java are not reproducable
Erik Helin
2018-11-28 09:59:54 UTC
Permalink
Hi all,

this small patch ensures that two classes generated by jlink's
SystemModulePlugin (SystemModules$default and SystemModules$all) are
generated in a reproducible manner. That is, if you run jlink two times
in a row with identical input, then the generated classes from both runs
should be identical. This was almost the case already, the only issue is
that some code in SystemModulesPlugin.java are iterating over Sets, and
Set does not have a specified iteration order (the iteration order did
differ between runs).

This patch ensures that the iteration order over the sets is
well-defined by creating an ArrayList from the Set, sorting that
ArrayList and then finally iterate over the sorted ArrayList instead of
the Set.

For smaller programs jlink will now produce reproducible results, the
entire lib/modules file will be reproducible [0]. I have added a JTReg
test to verify this property (again, for smaller programs).

Webrev:
https://cr.openjdk.java.net/~ehelin/8214230/00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8214230

Testing:
- tier1, tier2 and tier3 on macOS, Windows, Linux (all x86-64).
- Run newly developed JTReg test on macOS, Windows, Linux (all x86-64).
- Run newly developed JTReg test with the fix absent and verified that
the test failed.

Many thanks to Alan Bateman for helping out with the patch!

Thanks,
Erik

[0]: https://reproducible-builds.org/docs/definition/
Claes Redestad
2018-11-28 13:49:37 UTC
Permalink
Hi Erik,

looks good to me!

(Nit) You could consider the old TreeMap trick to ensure key-sorted map
iteration order:

// convert to a TreeMap to ensure sorted iteration order for (var e :
new TreeMap<>(map).entrySet()) { Thanks!

/Claes
Post by Erik Helin
Hi all,
this small patch ensures that two classes generated by jlink's
SystemModulePlugin (SystemModules$default and SystemModules$all) are
generated in a reproducible manner. That is, if you run jlink two
times in a row with identical input, then the generated classes from
both runs should be identical. This was almost the case already, the
only issue is that some code in SystemModulesPlugin.java are iterating
over Sets, and Set does not have a specified iteration order (the
iteration order did differ between runs).
This patch ensures that the iteration order over the sets is
well-defined by creating an ArrayList from the Set, sorting that
ArrayList and then finally iterate over the sorted ArrayList instead
of the Set.
For smaller programs jlink will now produce reproducible results, the
entire lib/modules file will be reproducible [0]. I have added a JTReg
test to verify this property (again, for smaller programs).
https://cr.openjdk.java.net/~ehelin/8214230/00/
https://bugs.openjdk.java.net/browse/JDK-8214230
- tier1, tier2 and tier3 on macOS, Windows, Linux (all x86-64).
- Run newly developed JTReg test on macOS, Windows, Linux (all x86-64).
- Run newly developed JTReg test with the fix absent and verified that
  the test failed.
Many thanks to Alan Bateman for helping out with the patch!
Thanks,
Erik
[0]: https://reproducible-builds.org/docs/definition/
Alan Bateman
2018-11-28 15:01:36 UTC
Permalink
Post by Claes Redestad
Hi Erik,
looks good to me!
(Nit) You could consider the old TreeMap trick to ensure key-sorted
new TreeMap<>(map).entrySet()) { Thanks!
I agree TreeMap would be simpler in generate method where we should just
need to change "locals" to be a TreeMap rather than a HashMap. The rest
looks okay to me.

-Alan
Erik Helin
2018-11-29 08:47:58 UTC
Permalink
Alan, Claes, Mandy,

thank you all for your quick reviews, appreciate it.
Post by Alan Bateman
Post by Claes Redestad
(Nit) You could consider the old TreeMap trick to ensure key-sorted
new TreeMap<>(map).entrySet()) { Thanks!
I agree TreeMap would be simpler in generate method where we should just
need to change "locals" to be a TreeMap rather than a HashMap.
I agree as well, nice suggestion Claes! Please see updated patches below:

- incremental: http://cr.openjdk.java.net/~ehelin/8214230/00-01/
- full: http://cr.openjdk.java.net/~ehelin/8214230/01/

Re-tested that this works as expected on Linux x86-64 by running the new
JTReg test.

Thanks,
Erik
Mandy Chung
2018-11-29 17:57:50 UTC
Permalink
Post by Erik Helin
- incremental: http://cr.openjdk.java.net/~ehelin/8214230/00-01/
- full: http://cr.openjdk.java.net/~ehelin/8214230/01/
+1

Mandy
Alan Bateman
2018-11-29 19:44:29 UTC
Permalink
Post by Erik Helin
- incremental: http://cr.openjdk.java.net/~ehelin/8214230/00-01/
- full: http://cr.openjdk.java.net/~ehelin/8214230/01/
Re-tested that this works as expected on Linux x86-64 by running the
new JTReg test.
Looks okay.

-Alan
Claes Redestad
2018-11-30 13:35:10 UTC
Permalink
Post by Erik Helin
- incremental: http://cr.openjdk.java.net/~ehelin/8214230/00-01/
- full: http://cr.openjdk.java.net/~ehelin/8214230/01/
+1

/Claes

Mandy Chung
2018-11-28 17:55:04 UTC
Permalink
Post by Erik Helin
Hi all,
this small patch ensures that two classes generated by jlink's
SystemModulePlugin (SystemModules$default and SystemModules$all) are
generated in a reproducible manner. That is, if you run jlink two
times in a row with identical input, then the generated classes from
both runs should be identical. This was almost the case already, the
only issue is that some code in SystemModulesPlugin.java are iterating
over Sets, and Set does not have a specified iteration order (the
iteration order did differ between runs).
This patch ensures that the iteration order over the sets is
well-defined by creating an ArrayList from the Set, sorting that
ArrayList and then finally iterate over the sorted ArrayList instead
of the Set.
For smaller programs jlink will now produce reproducible results, the
entire lib/modules file will be reproducible [0]. I have added a JTReg
test to verify this property (again, for smaller programs).
https://cr.openjdk.java.net/~ehelin/8214230/00/
Looks good to me.   I agree with Claes that you could simply use TreeMap.

Mandy
Loading...