Confirm results are reproducible #19

joewiz · 2021-07-06T22:03:41Z

On yesterday's Community Call we discussed evidence indicating users obtained different results despite using identical versions of the exist-xqts-runner and command line flags. I propose we use the latest directions derived from #17 and gather results from as many users as possible:

1. git clone https://github.com/exist-db/exist-xqts-runner.git (ensure a fresh clone of the master branch, with no local modifications)
2. cd exist-xqts-runner
3. sbt assembly
4. target/scala-2.13/exist-xqts-runner-assembly-1.0.0.jar -x HEAD
5. Open target/junit/html/index.html and copy and paste the "Summary" table into your reply to this issue. GitHub is smart enough to transform your HTML into GFM, no fiddling needed. For example, here are my results:

Tests	Failures	Errors	Skipped	Success rate	Time
31557	5167	551	1260	81.88%	439.974
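
As a quick sanity check when pasting results, the success rate in the Summary table appears to be (Tests - Failures - Errors) / Tests, with Skipped tests not subtracted. This is inferred from the numbers in the row above rather than confirmed from the report's source, so treat the formula as an assumption. A minimal Python sketch (the function name is made up for illustration):

```python
def success_rate(tests: int, failures: int, errors: int) -> float:
    """Percentage of tests that neither failed nor errored (assumed formula)."""
    return round(100 * (tests - failures - errors) / tests, 2)

# Row reported above: 31557 tests, 5167 failures, 551 errors.
rate = success_rate(31557, 5167, 551)
print(f"{rate:.2f}%")
```

If the formula holds, a change of one or two Failures moves the rate by well under 0.01 percentage points, so the Failures column is a more sensitive thing to compare than the rounded rate.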

adamretter · 2021-07-07T06:48:01Z

My laptop:

macOS 11.2.3
Java 1.8.0_292
sbt 1.3.3 / 1.5.0

Results:

Iteration #	Tests	Failures	Errors	Skipped	Success rate	Time
1	31557	5157	551	1260	81.91%	800.824
2	31557	5156	551	1260	81.92%	1556.271
3	31557	5157	551	1260	81.91%	791.146

EB Ubuntu Dev Env:

Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-1040-kvm x86_64)
Java 1.8.0_292
sbt 1.3.3 / 1.5.4

Results:

Iteration #	Tests	Failures	Errors	Skipped	Success rate	Time
1	31557	5171	551	1260	81.87%	626.455
2	31557	5170	551	1260	81.87%	810.823
3	31557	5172	551	1260	81.86%	616.676

NOTE: The difference between the test runs on my two machines and that of @joewiz appears to be in the number of Failures; all other numbers are consistent. I think the next things to test are:

If we each run several times on the same machine, do we get the same results?
We need to compare individual failures:
1. Where the test failures vary on the same machine, I suspect a concurrency issue (most likely in eXist-db as opposed to XQTS), as the XQTS is highly concurrent.
2. Where the test failures vary between different machines, this could be the same concurrency issue, or, if the results are dramatically different, I would suspect a machine-specific environment issue...

duncdrum · 2021-07-07T11:28:20Z

Desktop

macOS: 11.4
java: 1.8.0_292
sbt: 1.3.3 / 1.5.4

Results

Iteration	Tests	Failures	Errors	Skipped	Success rate	Time
1	31557	5159	551	1260	81.91%	523.675
2	31557	5160	551	1260	81.90%	661.144
3	31557	5159	551	1260	81.91%	516.675

marmoure · 2021-07-12T21:01:52Z

Desktop
Windows: 10 Pro 19043.906
java: 1.8.0_281
sbt: 1.3.3 / 1.3.3

Iteration	Tests	Failures	Errors	Skipped	Success rate	Time
1	31557	5161	551	1260	81.91%	699.980
2	31557	5159	551	1260	81.91%	688.543
3	31557	5159	551	1260	81.91%	715.330

joewiz · 2021-07-13T03:09:36Z

My iMac

macOS: 11.4
java: 1.8.0_292 (liberica-jdk8-full)
sbt: 1.5.5

Results

Iteration	Tests	Failures	Errors	Skipped	Success rate	Time
1	31557	5166	551	1260	81.88%	454.876
2	31557	5168	551	1260	81.88%	470.561
3	31557	5166	551	1260	81.88%	495.145

This matches Adam's observation that only the Failures vary: here they alternate between 5166 and 5168, while all other non-Time values remain constant.

The variation in Failures for all of the reported results in this issue is always 0, 1, or 2....
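
The within-machine variation can be checked directly from the three-iteration tables posted above. The sketch below is only illustrative: the dictionary keys are shorthand labels, and "spread" (largest minus smallest Failures count on one machine) is my own choice of measure.

```python
# Failures per iteration, copied from the tables posted in this issue.
failures_by_machine = {
    "adamretter laptop (macOS 11.2.3)": [5157, 5156, 5157],
    "adamretter Ubuntu 20.04.2": [5171, 5170, 5172],
    "duncdrum desktop (macOS 11.4)": [5159, 5160, 5159],
    "marmoure desktop (Windows 10)": [5161, 5159, 5159],
    "joewiz iMac (macOS 11.4)": [5166, 5168, 5166],
}

# Spread = largest minus smallest Failures count seen on one machine.
spreads = {name: max(runs) - min(runs) for name, runs in failures_by_machine.items()}

for name, spread in spreads.items():
    print(f"{name}: spread of {spread}")

# Spread across every run on every machine, for comparison.
all_runs = [n for runs in failures_by_machine.values() for n in runs]
overall_spread = max(all_runs) - min(all_runs)
print(f"all machines: spread of {overall_spread}")
```

No machine varies by more than 2 between its own runs, while the counts across machines span a wider range, which is why the same-machine and cross-machine cases are worth investigating separately.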

joewiz · 2021-08-01T05:32:50Z

In eXist-db/exist#3966 (comment) I performed a similar batch of 3 runs of exist-xqts-runner. The results varied across those 3 runs, similar to the +/- 0-3 differences we saw here.

In test 1 of the PR, there were 5,162 failures, but in test 2 a few minutes later, there were 3 fewer failures - 5,159 failures. The 3 differences all occurred within the tests for regular expressions in the fn:matches function:

re00062

Test:

   <test-case name="re00062">
      <description>Test regex syntax</description>
      <created by="Michael Kay" on="2011-07-04"/>
      <test>(every $s in tokenize('', ',') satisfies matches($s, '^(?:[^\p{IsBasicLatin}]*)$')) and (every $s in tokenize('a', ',') satisfies not(matches($s, '^(?:[^\p{IsBasicLatin}]*)$')))</test>
      <result>
         <assert-true/>
      </result>
   </test-case>

Failure in 1st test that passed in 2nd test:

Expected: 'AssertTrue', but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression "^(?:[^\p{IsBasicLatin}]*)$": invalid block name (BasicLatin) [at line 1, column 135])

Stacktrace:

junit.framework.AssertionFailedError: Expected: &apos;AssertTrue&apos;, but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&amp;O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression &quot;^(?:[^\p{IsBasicLatin}]*)$&quot;: invalid block name (BasicLatin) [at line 1, column 135])
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
	at scala.collection.immutable.List.foreach(List.scala:333)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
	at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
	at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
	at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
	at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
	at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
	at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
	at cats.effect.IO.unsafeRunAsync(IO.scala:274)
	at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
	at cats.effect.IO.unsafeRunTimed(IO.scala:342)
	at cats.effect.IO.unsafeRunSync(IO.scala:256)
	at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
	at akka.actor.Actor.aroundReceive(Actor.scala:537)
	at akka.actor.Actor.aroundReceive$(Actor.scala:535)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
	at akka.actor.ActorCell.invoke(ActorCell.scala:548)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
	at akka.dispatch.Mailbox.run(Mailbox.scala:231)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

re00225

Test:

   <test-case name="re00225">
      <description>Test regex syntax</description>
      <created by="Michael Kay" on="2011-07-04"/>
      <test>(every $s in tokenize('&#1536;&#1791;,&#1536;&#1537;&#1538;&#1539;&#1540;&#1541;&#1542;&#1543;&#1544;&#1545;&#1546;&#1547;&#1548;&#1549;&#1550;&#1551;&#1552;&#1553;&#1554;&#1555;&#1556;&#1557;&#1558;&#1559;&#1560;&#1561;&#1562;&#1563;&#1564;&#1565;&#1566;&#1567;&#1568;&#1569;&#1570;&#1571;&#1572;&#1573;&#1574;&#1575;&#1576;&#1577;&#1578;&#1579;&#1580;&#1581;&#1582;&#1583;&#1584;&#1585;&#1586;&#1587;&#1588;&#1589;&#1590;&#1591;&#1592;&#1593;&#1594;&#1595;&#1596;&#1597;&#1598;&#1599;&#1600;&#1601;&#1602;&#1603;&#1604;&#1605;&#1606;&#1607;&#1608;&#1609;&#1610;&#1611;&#1612;&#1613;&#1614;&#1615;&#1616;&#1617;&#1618;&#1619;&#1620;&#1621;&#1622;&#1623;&#1624;&#1625;&#1626;&#1627;&#1628;&#1629;&#1630;&#1631;&#1632;&#1633;&#1634;&#1635;&#1636;&#1637;&#1638;&#1639;&#1640;&#1641;&#1642;&#1643;&#1644;&#1645;&#1646;&#1647;&#1648;&#1649;&#1650;&#1651;&#1652;&#1653;&#1654;&#1655;&#1656;&#1657;&#1658;&#1659;&#1660;&#1661;&#1662;&#1663;&#1664;&#1665;&#1666;&#1667;&#1668;&#1669;&#1670;&#1671;&#1672;&#1673;&#1674;&#1675;&#1676;&#1677;&#1678;&#1679;&#1680;&#1681;&#1682;&#1683;&#1684;&#1685;&#1686;&#1687;&#1688;&#1689;&#1690;&#1691;&#1692;&#1693;&#1694;&#1695;&#1696;&#1697;&#1698;&#1699;&#1700;&#1701;&#1702;&#1703;&#1704;&#1705;&#1706;&#1707;&#1708;&#1709;&#1710;&#1711;&#1712;&#1713;&#1714;&#1715;&#1716;&#1717;&#1718;&#1719;&#1720;&#1721;&#1722;&#1723;&#1724;&#1725;&#1726;&#1727;&#1728;&#1729;&#1730;&#1731;&#1732;&#1733;&#1734;&#1735;&#1736;&#1737;&#1738;&#1739;&#1740;&#1741;&#1742;&#1743;&#1744;&#1745;&#1746;&#1747;&#1748;&#1749;&#1750;&#1751;&#1752;&#1753;&#1754;&#1755;&#1756;&#1757;&#1758;&#1759;&#1760;&#1761;&#1762;&#1763;&#1764;&#1765;&#1766;&#1767;&#1768;&#1769;&#1770;&#1771;&#1772;&#1773;&#1774;&#1775;&#1776;&#1777;&#1778;&#1779;&#1780;&#1781;&#1782;&#1783;&#1784;&#1785;&#1786;&#1787;&#1788;&#1789;&#1790;&#1791;', ',') satisfies matches($s, '^(?:\p{IsArabic}+)$')) and (every $s in tokenize('', ',') satisfies not(matches($s, '^(?:\p{IsArabic}+)$')))</test>
      <result>
         <assert-true/>
      </result>
   </test-case>

Failure in 1st test that passed in 2nd test:

Expected: 'AssertTrue', but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 16 in regular expression "^(?:\p{IsArabic}+)$": invalid block name (Arabic) [at line 1, column 301])

Stacktrace:

junit.framework.AssertionFailedError: Expected: &apos;AssertTrue&apos;, but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&amp;O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 16 in regular expression &quot;^(?:\p{IsArabic}+)$&quot;: invalid block name (Arabic) [at line 1, column 301])
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
	at scala.collection.immutable.List.foreach(List.scala:333)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
	at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
	at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
	at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
	at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
	at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
	at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
	at cats.effect.IO.unsafeRunAsync(IO.scala:274)
	at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
	at cats.effect.IO.unsafeRunTimed(IO.scala:342)
	at cats.effect.IO.unsafeRunSync(IO.scala:256)
	at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
	at akka.actor.Actor.aroundReceive(Actor.scala:537)
	at akka.actor.Actor.aroundReceive$(Actor.scala:535)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
	at akka.actor.ActorCell.invoke(ActorCell.scala:548)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
	at akka.dispatch.Mailbox.run(Mailbox.scala:231)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

re00061

Test:

   <test-case name="re00061">
      <description>Test regex syntax</description>
      <created by="Michael Kay" on="2011-07-04"/>
      <test>(every $s in tokenize('&#256;', ',') satisfies matches($s, '^(?:[^\p{IsBasicLatin}]+)$')) and (every $s in tokenize('', ',') satisfies not(matches($s, '^(?:[^\p{IsBasicLatin}]+)$')))</test>
      <result>
         <assert-true/>
      </result>
   </test-case>

Failure in 1st test that passed in 2nd test:

Expected: 'AssertTrue', but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression "^(?:[^\p{IsBasicLatin}]+)$": invalid block name (BasicLatin) [at line 1, column 43])

Stacktrace:

junit.framework.AssertionFailedError: Expected: &apos;AssertTrue&apos;, but query returned an error: QueryError(FORX0002,err:FORX0002 Conversion from XPath F&amp;O 3.0 regular expression syntax to Java regular expression syntax failed: Error at character 22 in regular expression &quot;^(?:[^\p{IsBasicLatin}]+)$&quot;: invalid block name (BasicLatin) [at line 1, column 43])
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
	at scala.collection.immutable.List.foreach(List.scala:333)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
	at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
	at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
	at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
	at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
	at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
	at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
	at cats.effect.IO.unsafeRunAsync(IO.scala:274)
	at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
	at cats.effect.IO.unsafeRunTimed(IO.scala:342)
	at cats.effect.IO.unsafeRunSync(IO.scala:256)
	at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
	at akka.actor.Actor.aroundReceive(Actor.scala:537)
	at akka.actor.Actor.aroundReceive$(Actor.scala:535)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
	at akka.actor.ActorCell.invoke(ActorCell.scala:548)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
	at akka.dispatch.Mailbox.run(Mailbox.scala:231)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

I can't speculate why two runs of exist-xqts-runner run a few minutes apart would produce invalid block name errors in one run but not the next. But these variations do not appear to be related to the PR I was investigating.

Tests 2 vs. 3 differed only by 1 test, and this one was in a different location:

group-015

Test:

   <test-case name="group-015">
      <description>No value comparisons are available to compare the grouping keys.</description>
      <created by="Josh Spiegel" on="2012-10-02"/>
      <modified by="Michael Kay" on="2017-03-17" change="avoid assert-xml for non-XML results"/>
      <test>
          for $x in (true(), "true", xs:QName("true"))
          group by $x
          return $x
      </test>
      <result>
        <assert-permutation>true(), "true", xs:QName("true")</assert-permutation>
      </result>
   </test-case>

The failure in test 3 that passed in test 2:

assert-permutation: expected='true(), "true", xs:QName("true")', actual='Q{}true, "true", true()'

Stacktrace:

junit.framework.AssertionFailedError: assert-permutation: expected=&apos;true(), &quot;true&quot;, xs:QName(&quot;true&quot;)&apos;, actual=&apos;Q{}true, &quot;true&quot;, true()&apos;
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6(JUnitResultsSerializerActor.scala:142)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$6$adapted(JUnitResultsSerializerActor.scala:129)
	at scala.collection.immutable.List.foreach(List.scala:333)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.$anonfun$formatJunitTestSet$1(JUnitResultsSerializerActor.scala:129)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:104)
	at cats.effect.internals.IORunLoop$.restartCancelable(IORunLoop.scala:51)
	at cats.effect.internals.IOBracket$BracketStart.run(IOBracket.scala:100)
	at cats.effect.internals.Trampoline.cats$effect$internals$Trampoline$$immediateLoop(Trampoline.scala:67)
	at cats.effect.internals.Trampoline.startLoop(Trampoline.scala:35)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.super$startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.$anonfun$startLoop$1(TrampolineEC.scala:90)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
	at cats.effect.internals.TrampolineEC$JVMTrampoline.startLoop(TrampolineEC.scala:90)
	at cats.effect.internals.Trampoline.execute(Trampoline.scala:43)
	at cats.effect.internals.TrampolineEC.execute(TrampolineEC.scala:42)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:80)
	at cats.effect.internals.IOBracket$BracketStart.apply(IOBracket.scala:58)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:183)
	at cats.effect.internals.IORunLoop$.restart(IORunLoop.scala:41)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1(IOBracket.scala:48)
	at cats.effect.internals.IOBracket$.$anonfun$apply$1$adapted(IOBracket.scala:34)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1(IOAsync.scala:37)
	at cats.effect.internals.IOAsync$.$anonfun$apply$1$adapted(IOAsync.scala:37)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1(IORunLoop.scala:321)
	at cats.effect.internals.IORunLoop$.$anonfun$suspendAsync$1$adapted(IORunLoop.scala:320)
	at cats.effect.internals.IORunLoop$RestartCallback.start(IORunLoop.scala:447)
	at cats.effect.internals.IORunLoop$.cats$effect$internals$IORunLoop$$loop(IORunLoop.scala:156)
	at cats.effect.internals.IORunLoop$.start(IORunLoop.scala:38)
	at cats.effect.IO.unsafeRunAsync(IO.scala:274)
	at cats.effect.internals.IOPlatform$.unsafeResync(IOPlatform.scala:39)
	at cats.effect.IO.unsafeRunTimed(IO.scala:342)
	at cats.effect.IO.unsafeRunSync(IO.scala:256)
	at org.exist.xqts.runner.JUnitResultsSerializerActor$$anonfun$receive$1.applyOrElse(JUnitResultsSerializerActor.scala:55)
	at akka.actor.Actor.aroundReceive(Actor.scala:537)
	at akka.actor.Actor.aroundReceive$(Actor.scala:535)
	at org.exist.xqts.runner.JUnitResultsSerializerActor.aroundReceive(JUnitResultsSerializerActor.scala:42)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
	at akka.actor.ActorCell.invoke(ActorCell.scala:548)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
	at akka.dispatch.Mailbox.run(Mailbox.scala:231)
	at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)

Comparing test 1 to test 3, the same 4 tests described above accounted for all of the differences.

This would explain the consistent range of variation of 0-3 in the results that we all reported:

0 for the case where both tests happened to return identical results
1 for the case where 1 test failed the 1 GroupByClause test
2 for the case where 1 test failed the 3 matches.re.xml tests and the other failed the 1 GroupByClause test
3 for the case where 1 test failed the 3 matches.re.xml tests

Note that we didn't see a variation of 4, which would be the case where 1 test failed both the 1 GroupByClause test and the 3 matches.re.xml tests while the other failed neither. Perhaps we'd see this if we performed more test runs, or perhaps the two groups of failing tests don't occur in the same run, i.e., they're inter-related?
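
The arithmetic behind this theory can be enumerated. The sketch below assumes a simplified model that is not established by the data: each run either fails all 3 regex tests or none of them, and either fails the 1 GroupByClause test or not. All names are illustrative.

```python
from itertools import combinations_with_replacement

# Assumed model: a run either fails all 3 fn:matches regex tests or none,
# and either fails the single GroupByClause test or not.
REGEX_TESTS = 3
GROUP_BY_TESTS = 1

def extra_failures(regex_fails: bool, group_by_fails: bool) -> int:
    """Flaky failures added on top of a run's stable failure count."""
    return REGEX_TESTS * regex_fails + GROUP_BY_TESTS * group_by_fails

states = [(r, g) for r in (False, True) for g in (False, True)]

def possible_differences(run_states):
    """All absolute differences in Failures between any two runs."""
    counts = [extra_failures(r, g) for r, g in run_states]
    return sorted({abs(a - b) for a, b in combinations_with_replacement(counts, 2)})

all_diffs = possible_differences(states)
# Same model, but excluding runs where both groups fail together.
exclusive_diffs = possible_differences([s for s in states if s != (True, True)])

print(all_diffs)
print(exclusive_diffs)
```

Under this model a difference of 4 only appears when some run fails both groups at once; if that never happens, the differences stay within 0-3.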

This is just a running theory. Perhaps there are other tests that fail besides these, and only additional runs and comparisons would reveal them.

To check which testsuites were responsible for the difference between 2 test runs, save the target/junit/data/TESTS-TestSuites.xml files from each run, store them in eXist, and then run this query, adjusting the paths bound to $tss1 and $tss2 to point to the two files:

xquery version "3.1";

let $tss1 := doc("/db/apps/exist-xqts-results/data/5.4.0-SNAPSHOT-with-Juri-PR/test01/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
let $tss2 := doc("/db/apps/exist-xqts-results/data/5.4.0-SNAPSHOT-with-Juri-PR/test03/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
return
    array {
        for $ts1 in $tss1
        let $ts1-failures := $ts1/@failures
        let $ts2 := $tss2[@package eq $ts1/@package and @name eq $ts1/@name]
        let $ts2-failures := $ts2/@failures
        return
            if ($ts1-failures ne $ts2-failures) then
                map {
                    "package": $ts1/@package/string(),
                    "name": $ts1/@name/string(),
                    "ts1-failures": $ts1/@failures cast as xs:integer,
                    "ts2-failures": $ts2/@failures cast as xs:integer
                }
            else
                ()
    }

This returns a result like:

[
    {
        "package": "XQTS_HEAD.fn-matches",
        "name": "re",
        "ts1-failures": 7,
        "ts2-failures": 4
    },
    {
        "package": "XQTS_HEAD",
        "name": "prod-GroupByClause",
        "ts1-failures": 15,
        "ts2-failures": 16
    }
]
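
For anyone who would rather not load the files into eXist first, roughly the same per-testsuite comparison can be sketched in Python. It assumes only the structure the query relies on (testsuites/testsuite elements carrying package, name, and failures attributes). The inline documents are tiny stand-ins, and the prod-UnaryLookup row in them is made up to show an unchanged suite.

```python
import xml.etree.ElementTree as ET

def failures_by_suite(junit_xml: str) -> dict:
    """Map (package, name) of each testsuite to its failures count."""
    root = ET.fromstring(junit_xml)
    return {
        (ts.get("package"), ts.get("name")): int(ts.get("failures"))
        for ts in root.iter("testsuite")
    }

def diff_suites(xml_1: str, xml_2: str) -> list:
    """List testsuites present in both runs whose failures counts differ."""
    run_1, run_2 = failures_by_suite(xml_1), failures_by_suite(xml_2)
    return [
        {"package": pkg, "name": name,
         "ts1-failures": run_1[(pkg, name)], "ts2-failures": run_2[(pkg, name)]}
        for (pkg, name) in run_1
        if (pkg, name) in run_2 and run_1[(pkg, name)] != run_2[(pkg, name)]
    ]

# Stand-in documents; real input would be the two TESTS-TestSuites.xml files.
RUN_1 = """<testsuites>
  <testsuite package="XQTS_HEAD.fn-matches" name="re" failures="7"/>
  <testsuite package="XQTS_HEAD" name="prod-GroupByClause" failures="15"/>
  <testsuite package="XQTS_HEAD" name="prod-UnaryLookup" failures="2"/>
</testsuites>"""
RUN_2 = """<testsuites>
  <testsuite package="XQTS_HEAD.fn-matches" name="re" failures="4"/>
  <testsuite package="XQTS_HEAD" name="prod-GroupByClause" failures="16"/>
  <testsuite package="XQTS_HEAD" name="prod-UnaryLookup" failures="2"/>
</testsuites>"""

changed = diff_suites(RUN_1, RUN_2)
for entry in changed:
    print(entry)
```

For real files, `ET.parse(path).getroot()` can replace `ET.fromstring(...)`. Note that equal failure counts do not guarantee that the same tests failed, so this is only a first filter.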

To derive a table like the one I posted in the PR comment linked above, which listed the tests that returned different results in 2 test runs, I uploaded the entire junit directories to eXist and ran the following query:

xquery version "3.1";

declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";

declare option output:method "html5";
declare option output:media-type "text/html";

declare function local:compare-testcase($testcase-1, $testcase-2) {
    element tr {
        element td { $testcase-1/../@package/string() },
        element td { $testcase-1/../@name/string() },
        element td { $testcase-1/@name/string() },
        element td { ($testcase-1/*/name(), "pass")[. ne ""][1] },
        element td { ($testcase-2/*/name(), "pass")[. ne ""][1] }
    }
};

declare function local:compare-testcases($testcases-1, $testcases-2) {
    for $tc1 in $testcases-1
    let $name := $tc1/@name
    let $tc2 := $testcases-2[@name eq $name]
    order by $name
    return
        if (
                (empty($tc1/node()) and empty($tc2/node()))
                or 
                ($tc1/error and $tc2/error)
                or 
                ($tc1/failure and $tc2/failure)
                or 
                ($tc1/skipped and $tc2/skipped)
            ) then
            ()
        else
            local:compare-testcase($tc1, $tc2)
};

declare function local:compare-testsuites($testsuites-1, $testsuites-2) {
    element table {
        element thead {
            element tr {
                element th { "testsuite package" },
                element th { "testsuite name" },
                element th { "testcase name" },
                element th { "test 1" },
                element th { "test 2" }
            }
        },
        element tbody {
            for $ts1 in $testsuites-1
            let $package := $ts1/@package
            let $name := $ts1/@name
            let $ts2 := $testsuites-2[@package eq $package and @name eq $name]
            order by $package, $name
            return 
                if ($ts1/@errors eq "0" and $ts2/@errors eq "0") then
                    ()
                else
                    local:compare-testcases($ts1/testcase, $ts2/testcase)
        }
    }
};

let $data-collection := "/db/apps/exist-xqts-results/data"
let $testsuites-1 := 
    doc($data-collection || "/5.4.0-SNAPSHOT-before-Juri-PR/test01/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
let $testsuites-2 := 
    doc($data-collection || "/5.4.0-SNAPSHOT-with-Juri-PR/test01/junit/data/TESTS-TestSuites.xml")/testsuites/testsuite
return
    local:compare-testsuites($testsuites-1, $testsuites-2)

... returns a table like this:

testsuite package	testsuite name	testcase name	test 1	test 2
XQTS_HEAD	prod-UnaryLookup	UnaryLookup-011	failure	error
XQTS_HEAD	prod-UnaryLookup	UnaryLookup-015	failure	pass
XQTS_HEAD	prod-UnaryLookup	UnaryLookup-016	failure	error
XQTS_HEAD	prod-UnaryLookup	UnaryLookup-017	failure	error
XQTS_HEAD	prod-UnaryLookup	UnaryLookup-023	failure	pass
XQTS_HEAD	prod-UnaryLookup	UnaryLookup-025	failure	pass
XQTS_HEAD	prod-UnaryLookup	UnaryLookup-046	failure	pass

I hope these results and queries help us nail down the sources of unexpected variation in the results of exist-xqts-runner.

dizzzz · 2021-08-01T14:42:34Z

Impressive research and analysis!

dizzzz · 2021-08-01T14:43:41Z

Order variations in results sound like... usage of a HashMap somewhere.

joewiz mentioned this issue Aug 1, 2021

[fix] unary lookups eXist-db/exist#3966

Merged

line-o mentioned this issue Aug 24, 2021

easy comparison of two runs #18

Open
