-
Notifications
You must be signed in to change notification settings - Fork 7
/
draft-vaidya-test-anything-protocol-00.xml
701 lines (617 loc) · 28.9 KB
/
draft-vaidya-test-anything-protocol-00.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
<?xml version="1.0" encoding="UTF-8"?>
<!-- Based on http://xml.resource.org/authoring/sample.xml -->
<!-- Converted from http://testanything.org/wiki/index.php?title=TAP_at_IETF:_Draft_Standard&oldid=1863 -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml'>
<!ENTITY rfc5234 PUBLIC ''
'http://xml.resource.org/public/rfc/bibxml/reference.RFC.5234.xml'>
]>
<rfc
category="info"
ipr="full3978"
docName="draft-vaidya-test-anything-protocol-00"
>
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<?rfc toc="yes" ?>
<?rfc symrefs="yes" ?>
<?rfc sortrefs="yes"?>
<?rfc iprnotified="no" ?>
<?rfc strict="yes" ?>
<front>
<title>The Test Anything Protocol (TAP)</title>
<author initials='G.G.' surname="Vaidya" fullname='Gaurav Vaidya' role="editor">
<organization/>
</author>
<author initials='M.G.' surname="Schwern" fullname='Michael G Schwern'>
<organization/>
</author>
<date/>
<abstract>
<t>
The Test Anything Protocol (TAP) is a protocol to allow
communication between unit tests and a test harness. It allows
individual tests (TAP producers) to communicate test results to the
testing harness in a language-agnostic way. Tests can indicate
success, failure, as well as mark tests as unimplemented or
skipped, and to provide additional information to aid in debugging
of failed tests. Unlike other testing systems, it can indicate
missing tests and duplicated tests.
</t>
<t>
This specification defines TAP version 13.
</t>
</abstract>
</front>
<middle>
<section title="Introduction">
<t>
The Test Anything Protocol (TAP) describes a standard format for
testing suites to use. This format provides both human and
machine-readable information on which tests were run, whether they
were successful or failed, as well as other information which might
be useful in tracing the cause of the failure. Being
machine-readable allows this information to be easily parsed by an
automated testing harness. Test results may then be summarized and
analyzed. Being human-readable allows the results of a testing
system to be read manually; this is useful for quickly determining
whether individual tests pass and to aid in debugging the test
system itself.
</t>
<t>
TAP has been used by the Perl interpreter's test suite since 1987.
</t>
<section title="Requirements notation">
<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
"SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
and "OPTIONAL" in this document are to be interpreted as
described in <xref target="RFC2119"/>.</t>
<t>
The grammatical rules in this document are following ABNF and
are to be interpreted as described in <xref target="RFC5234"/>.
</t>
</section>
</section>
<section title="Definitions">
<t>
<list style="hanging" hangIndent="6">
<t hangText="Test:">
<vspace/>
A piece of software which determines if a particular facility or
component is functional.
</t>
<t hangText="Test result:">
<vspace/>
The encoding in TAP of the result of a single test. It may have
exactly one TAP result directive.
</t>
<t hangText="Test set:">
<vspace/>
Zero or more test results with exactly one plan. A test set can
be determined to have failed or passed.
</t>
<t hangText="Test suite:">
<vspace/>
A collection of test sets.
</t>
<t hangText="TAP document:">
<vspace/>
One test set which parses correctly as defined in this
document. Future expansion may allow multiple test sets to be
contained in a single TAP document, allowing an entire test
suite to be stored as a single TAP document.
</t>
<t hangText="Plan:">
<vspace/>
The number of tests which are expected to pass in a single test
set.
</t>
<t hangText="Directive:">
<vspace/>
A directive changes the meaning of a test result; it is
commonly used to indicate that a failing test should be
reported as passing (TODO) or that a passing test should be
tagged as suspicious (SKIP).
</t>
<t hangText="Reason:">
<vspace/>
A reason explains why a directive was necessary.
</t>
<t hangText="Description:">
<vspace/>
A description says what a test result is testing.
</t>
<t hangText="Producer:">
<vspace/>
A producer is a thing which can generate a valid TAP document.
</t>
<t hangText="Consumer:">
<vspace/>
A consumer is a thing which can parse and interpret a TAP
document.
</t>
<t hangText="Filter:">
<vspace/>
A filter a piece of software which consumes a TAP document, and
produces a new TAP document based in some way on the TAP input.
For example, it can reproduce the TAP document exactly,
normalize the formatting, combine multiple documents, or
summarize the result of tests as a new TAP document.
</t>
</list>
</t>
</section>
<section title="Producers and consumers">
<t>
A TAP document might exist as a file, data stream, encoded in
another medium (stored in a PDF file, for instance) or in any other
form which can be parsed as described below. For maximum
compatibility, it may be produced by any TAP producer, and must
then be consumable by any TAP consumer.
</t>
<t>
One example system of producers and consumers is the
Test::More/Test::Harness system in Perl testing. In this system,
test producers are executable Perl scripts with a '.t' extension,
collected in a number of folders, which themselves use the
Test::More library to simplify testing. These scripts can be
executed individually, and emit TAP documents on standard output.
</t>
<t>
A TAP consumer, such as the Perl module TAP::Parser, can interpret
the output of a TAP producer to determine how many tests were run,
which tests succeeded, and which diagnostic information might be
usefully reported to the user.
</t>
<t>
Note that these two processes are often decoupled. The TAP document
emitted by the TAP producer can be stored before being interpreted
by the TAP parser; this allows all diagnostic and testing
information to be stored for later analysis and usage. The TAP
document might be generated on one server, then downloaded to
another before being processed. A single TAP parser can be used to
process files in all these cases, illustrating TAP's utility as a
cross-language, cross-environment tool.
</t>
<t>
The complete flow in this system can look something like this:
</t>
<figure>
<artwork><![CDATA[
------------
| Producer |
------------
|
V
---------------- ----------- --------------------
| TAP document | --+--> | Storage | +--> | Formatted Output |
---------------- | ----------- | --------------------
| | |
| V |
| -------------- | --------------------
+--> | TAP parser |---+--> | Test summaries |
-------------- --------------------
]]></artwork>
</figure>
<t>
Additionally, utilities like "prove" can further simplify running a
suite of TAP producers, by searching for files having certain
characteristics or matching particular patterns. For instance,
conventionally, all tests for a Perl module are stored in the 't/'
folder, and consist of executable scripts (TAP producers) with
extensions of 't'. The "prove" utility, part of the Test::Harness
module, can then be used to find these files and run them all. In
the following example, prove executes t/error.t, t/id.t and t/url.t
and evaluates the produced TAP documents in turn, checking which
ones passed and failed, and providing a complete summary of the
entire test suite run.
</t>
<figure><artwork><![CDATA[
$ prove t/*.t
t/error.t...ok
t/id.t......ok
t/url.t.....ok
Result: PASS
]]></artwork></figure>
</section>
<section title="The Structure of a TAP document">
<t>
A TAP test set consists of one version, one plan, zero or more test
results, and any number of comments and ignored elements. A TAP
test set is split into lines, separated by newline characters. Any
line beginning with the letters 'ok' or 'not ok' is a test result.
All unparsable lines must be ignored by TAP consumers. In order to
keep TAP readable on the system on which it is produced, the
end-of-line character for TAP may be LF or CRLF. The same
end-of-line character must be used throughout a document.
</t>
<!--
(Alternative suggestion is to just use CRLF or LF and pretend
there's a pre-processor which normalizes newlines. Apparently this
is what XML does).
http://www.w3.org/TR/REC-xml/ "2.11 End-of-Line Handling XML parsed
entities are often stored in computer files which, for editing
convenience, are organized into lines. These lines are typically
separated by some combination of the characters CARRIAGE RETURN
(#xD) and LINE FEED (#xA). To simplify the tasks of applications,
the XML processor MUST behave as if it normalized all line breaks
in external parsed entities (including the document entity) on
input, before parsing, by translating both the two-character
sequence #xD #xA and any #xD that is not followed by #xA to a
single #xA character."
-->
<t>
Free-form strings in TAP, such as reasons and descriptions, must be
in UTF8 unless the interchange parties agree otherwise. Strings
MUST NOT contain an EOL or NUL character. Since directives start
with a "#" and come after a description, descriptions MUST NOT
contain a "#".
</t>
<!--
Rationale: This is expressed in the grammar below as a Safe-String
without "#" and a more general String with "#". But we need an
escape system to have a # This whole section is repeating the
grammar but leaving out important details.
-->
<section title="Version">
<t>
Every test set as defined in this document should contain a
version definition. A test set without a version definition
is assumed to be written in TAP version 12, which is not
covered by this document.
</t>
</section>
<section title="Plan">
<t>
Every test set should contain one and only one plan. A test
set containing multiple plans can be parsed as a TAP
document, but will be judged to have failed.
</t>
</section>
<section title="Test results">
<t>
A test result reports the status of one single test. The
test must summarize itself as passing ("ok") or failing
("not ok"), and must have a test number. A test result
without an explicit test number will be assumed to have the
next number in increasing numeric sequence from the start
of the test set. A test result can also have a description
and a directive, which modifies the meaning of the test,
but both are optional.
</t>
<t>
Test results must be present in the correct order, without
missing tests or duplicate test numbers, for the test set
to be judged to pass.
</t>
</section>
<section title="Bail Out">
<t>
A Bail-Out line indicates a gross failure in the TAP
producer. Results from this test set can no longer be
trusted. The TAP consumer should ignore the rest of the TAP
syntax and may terminate the TAP producer if it can do so
safely. If processing multiple files, the TAP consumer
should stop processing other files which are part of the
same test suite.
</t>
</section>
<section title="Comments">
<t>
Comments are lines which have no meaning in TAP. They do
not alter the result of a test. A TAP document with and
without comments has exactly the same meaning.
</t>
<t>
Comments typically contain debugging information. TAP
consumers should not display comments by default, as there
will likely be a large number of tests in such a suite.
</t>
<t>
Note that TAP does not provide a mechanism for comments to
be associated with particular test results; for instance,
comments of a general nature might be interspersed with
comments specific to a particular test.
</t>
</section>
<section title="Ignored elements">
<t>
In order to allow extension of the protocol while
maintaining backwards compatibility, a TAP consumer must
ignore certain lines and elements. Any line which does not
parse must be ignored. Any directive or plan-directive
which is not recognized must also be ignored.
</t>
<t>
Here's an example of a TAP stream which contains ignored elements.
</t>
<figure><artwork><![CDATA[
1..2 # Line 1
Error at line 12 # Line 2
ok 1 # Line 3
ok 2 # BANG # Line 4
]]></artwork></figure>
<t>
Line 2 should be ignored, it does not parse. The directive
"BANG" on test #2 should be treated as an ordinary comment,
it is not a TAP directive.
</t>
</section>
</section>
<section title="Grammar">
<t>
The test set is defined below using Augmented Backus-Naur Form
(ABNF), as defined in <xref target="RFC5234"/>.
</t>
<figure><artwork><![CDATA[
; Overall structure
Testset = Header (Plan Body / Body Plan) Footer
Header = [Comments] [Version]
Footer = [Comments]
Body = *(Comment / TAP-Line)
TAP-Line = Test-Result / Bail-Out
; TAP version
Version = "TAP version" SP Version-Number EOL
Version-Number = Positive-Integer
; Plan
Plan = ( Plan-Simple / Plan-Todo / Plan-Skip-All ) EOL
Plan-Simple = "1.." Number-Of-Tests
Plan-Todo = Plan-Simple "todo" 1*(SP Test-Number) ";"
Plan-Skip-All = "1..0" SP "skip" SP Reason
Reason = String
Number-Of-Tests = 1*DIGIT
Test-Number = Positive-Integer
; Test result
Test-Result = Status [SP Test-Number] [SP Description]
[SP "#" SP Directive [SP Reason]] EOL
Status = "ok" / "not ok"
Description = Safe-String
Directive = "SKIP" / "TODO"
; Bail out
Bail-Out = "Bail out!" [SP Reason] EOL
; Comments
Comment = "#" String EOL
Comments = 1*Comment
; Values
EOL = LF / CRLF
Safe-String = 1*(%x01-09 %x0B-0C %x0E-22 %x24-FF)
String = 1*(Safe-String / "#")
Positive-Integer =
("1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" / "9") *DIGIT
]]></artwork></figure>
<t>
A document must parse as per the grammar above to qualify as a TAP
document. The grammar presented below may be used by TAP consumers
to ensure that a test set contains all the required parts.
</t>
<figure><artwork><![CDATA[
; Overall structure
Passing-Testset = Header
(
Plan-With-Tests Passing-Body
/
Passing-Body Plan-With-Tests
/
Plan-Skip-All
)
Footer
Passing-Body =
[Comments] 1*Passing-Line *(Passing-Line / Comment) [Comments]
Passing-Line =
(Simple-OK / Passing-TODO / Failing-TODO / Passing-SKIP) EOL
Failing-Line =
(Simple-Fail / Failing-SKIP) EOL
; Plans that must have tests
Plan-With-Tests = ( Plan-Simple / Plan-Todo )
; Passing and failing test lines
Simple-OK =
"ok" [SP Test-Number] [SP Description]
Simple-Fail =
"not ok" [SP Test-Number] [SP Description]
Passing-TODO = Simple-OK SP TODO-Directive
Failing-TODO = Simple-Fail SP TODO-Directive
Passing-SKIP = Simple-OK SP SKIP-Directive
Failing-SKIP = Simple-Fail SP SKIP-Directive
TODO-Directive = "# TODO" [SP Reason]
SKIP-Directive = "# SKIP" [SP Reason]
]]></artwork></figure>
</section>
<section title="Passing and Failing">
<t>
Both test sets and test results can be determined to have
passed or failed as detailed below. Note that a test set might
fail even if every single test result contained in it passes.
</t>
<section title="Test results">
<t>
A test result indicates whether the test it reports has
passed ("ok") or failed ("not ok"). It is itself said to
pass or fail, depending on the result of the underlying
test and on any directives applied to that result. It can
be modified with any of the directives listed below. The
table below summarises how directives can change a test
result.
</t>
<texttable title="Test results as modified by directives">
<ttcol></ttcol>
<ttcol align='center'>No directives</ttcol>
<ttcol align='center'>TODO</ttcol>
<ttcol align='center'>SKIP</ttcol>
<c>ok</c>
<c>pass</c>
<c>pass (optionally with warning)</c>
<c>pass</c>
<c>not ok</c>
<c>fail</c>
<c>pass</c>
<c>fail (with warning)</c>
</texttable>
</section>
<section title="The "SKIP" directive">
<t>
This directive indicates that the test was not begun. Usually
this is caused by environmental reasons: a missing optional
library, an operating specific test, or an expensive test that
is only run on request.
</t>
<t>
Since the test was skipped, the test result is expected to be
"ok" (indicating that the test was skipped correctly). A
skipped test with a result of "not ok" is suspicious, and the
TAP consumer should report a warning.
</t>
</section>
<section title="The "TODO" directive">
<t>
This directive indicates that the test was run, but failure is
expected and should not cause the test set to fail. This is
usually because the functionality being tested has not been
completely implemented or is obstructed by a known bug.
</t>
<t>
Neither a failing nor a passing TODO test will cause the test
set to fail, but since passing TODOs are suspicious, they may
optionally be reported by TAP consumers.
</t>
<!--
Rationale: In other testing systems, failing tests which
cannot be immediately fixed are commented out. In TAP, todo
tests allow one to write tests for bugs or missing features
and leave them running. In this way, a todo test becomes an
executable bug tracker. Known, unresolved issues are
silently ignored, leaving the test suite to pass. If an
issue is resolved and a test starts to pass, the parser
warns the developer about the passing todo test and they
can take action.
-->
</section>
<section title="Plan">
<t>
The plan indicates the number of tests which are expected to be
run in the current test set. Two types of plans are defined,
with different requirements for passing:
</t>
<t>
<list style="numbers">
<t>
A simple plan: the number of expected tests must be a
positive integer. In order for a test set with a simple
plan to pass, it must contain exactly the number of test
results indicated in the plan, in ascending numeric order,
without either any gaps in test number or any duplicate
tests. Every test result must pass.
</t>
<t>
A skip-all plan: the number of expected tests is zero. A
test set with a skip-all plan passes unless it contains any
test results.
</t>
</list>
</t>
<t>
Allowing simple plans to plan for zero tests is being
considered, but is not a part of this specification.
</t>
</section>
<section title="Other factors">
<t>
TAP consumers may use any other system-specific factors to
determine whether a test set passes or fails. If such a failure
is to be reported, it MUST inform the user of the state of the
TAP parsing and whether all tests appear to have passed, and
then must separately describe the nature of the system-specific
failure which caused the consumer to become suspicious of the
results.
</t>
<t>
TAP consumers which also act as TAP producers could add
additional test results to the end of the output TAP document
if such a failure occurs. For instance, take a TAP producer
which emits ten successful tests, then throws an exception
and exits with a failure. A filter might add a new (11th)
failing ("not ok") test result to the end of the TAP it
emits, which informs the user of the exception or failure
exit code. Note that such a test result would cause the
test set to fail; unless the test set planned one more test
than it emitted, the number of tests would not equal the
number of tests planned; even if it did this, this 11th
failing test would cause the test set to fail.
</t>
<t>
An example of the use of this option is to check exit codes of
TAP producers for a failure, which might indicate that the
producer failed after having emitted a complete TAP document.
It seems unpragmatic to ignore such exit codes, but this
information cannot be reliably expressed into its own TAP
output by a failing producer, and must be treated separately.
This also allows for language-specific and system-specific
features to be used by the TAP consumer at its discretion to
improve the rigour of its testing.
</t>
</section>
</section>
<section title="Security Considerations">
<!--
See RFC 3552 for tips on how to write this section.
Problems with parsing debugging status?
-->
<t>
A parser which stores test results in a dynamically sized array may
be vulnerable to memory starvation by a test which uses a very high
test number. For example:
</t>
<figure><artwork><![CDATA[
1..3
ok 1
ok 2
ok 123456789
]]></artwork></figure>
<t>
The above test result would cause an array of 123456789 elements to
be allocated. So it is recommended that test results, if stored at
all, are stored in a sparse array.
</t>
</section>
<section title="IANA considerations">
<!-- Still needs much work! Current content is:
The MIME media type for TAP is application/tap.
Type name: application
Subtype name: tap
Required parameters: n/a
Optional parameters: n/a
Encoding considerations: 8bit if UTF-8; binary if UTF-16 or UTF-32
TAP may be represented using UTF-8, UTF-16, or UTF-32. When TAP is written in UTF-8, TAP is 8bit compatible. When TAP iswritten in UTF-16 or UTF-32, the binary content-transfer-encoding must be used.
Security considerations:
Interoperability considerations: n/a
Published specification: TODO: insert once we have a spec (i.e. this document)
Applications that use this media type: Perl's test system (including the core Test::More and Test::Harness modules), Perl testing modules such as Test::Most and Test::Class, applications which store and summarize test results such as Smolder, and testing suites in other systems, including pgTAP and
Additional information:
Magic number(s): "TAP version" in the encoding used.
File extension(s): .tap
Macintosh file type code(s): TEXT
Person & email address to contact for further information: TODO (Can we get TPF to monitor this?)
Intended usage: COMMON
Restrictions on usage: none
Author: TODO
Change controller: TODO
-->
<t>TBD</t>
</section>
<section title="Acknowledgements">
<t>
This document is based on Andy Armstrong's description of the TAP
protocol, version 13, which is itself based on Andy Lester's
description of the TAP protocol, version 1.00. The basis for the
TAP format was created by Larry Wall in the original test script
for Perl 1. Tim Bunce and Andreas Koenig developed it further
with their modifications to the Test::Harness module.
</t>
</section>
</middle>
<back>
<references title='Normative References'>
&rfc2119;
&rfc5234;
</references>
</back>
</rfc>