Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactoring the ics23 spec #975

Open
wants to merge 27 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
9a079cd
refactor the ICS23 spec
konnov Jun 20, 2023
52419f0
ignore empty leaf keys
konnov Jun 20, 2023
aa20aba
lowcase definitions and remove isTree
konnov Jun 20, 2023
446f4d3
clean up the specs
konnov Jun 20, 2023
bbf3f44
fix the test
konnov Jun 20, 2023
c13f256
symbolic hashing
konnov Jun 20, 2023
33bc57d
fix the comments
konnov Jun 21, 2023
8534b5e
rename words to hashes
konnov Jun 21, 2023
89f1559
refactoring the spec
konnov Jun 21, 2023
99c6fe7
refactored hashes, ics23, and ics23 tests
konnov Jun 22, 2023
c690426
uncomment one test
konnov Jun 22, 2023
db92c4c
fix all the tests in ics23test
konnov Jun 23, 2023
b2334dd
clean unneeded files and update the readme
konnov Jun 23, 2023
b81495c
Apply suggestions from code review
konnov Jun 23, 2023
60a2df4
apply comments and fix the spec
konnov Jun 23, 2023
4bcb8eb
fix lessThan
konnov Jun 23, 2023
de4cc26
fix the type error in termConcat
konnov Jul 6, 2023
2335cf9
fixing tests
konnov Jul 6, 2023
3813404
fix all but one test
konnov Jul 6, 2023
8ff175a
Merge branch 'main' into igor/ics23-fix
konnov Sep 4, 2023
2baba60
Merge remote-tracking branch 'origin/main' into igor/ics23-fix
bugarela Sep 10, 2024
ac11ff5
Fix command on README
bugarela Sep 10, 2024
c9f8ca0
Use a sum type to represent term nodes
bugarela Sep 10, 2024
2eea315
Merge remote-tracking branch 'origin/main' into igor/ics23-fix
bugarela Dec 11, 2024
f2f5e2d
Refactor and generalize to any spec
bugarela Dec 20, 2024
1115a51
Fix proof generation
bugarela Dec 20, 2024
5734cb4
Add stronger invariant
bugarela Dec 20, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 21 additions & 33 deletions examples/cosmos/ics23/README.md
Original file line number Diff line number Diff line change
@@ -1,60 +1,48 @@
This is a formal specification and invariants of [ICS23 Spec][] in Quint and TLA+:
This is a formal specification and invariants of [ICS23 Spec][] in Quint.
If this is the first Quint specification you see, we recommend checking simpler
specifications first. This one is quite advanced.

## Quint specification

Specification [ics23.qnt](./ics23.qnt) is the Quint specification that contains
four modules:

- Module `basics` contains basic type definitions and auxiliary definitions.
- Module `hashes` introduces a specification of hashes via a symbolic
representation of irreversible hashes.

- Module `ics23` contains membership and non-membership proofs.
- Module `ics23` contains membership and non-membership proofs,
following the ICS23 code.

- Module `ics23pbt` contains definitions for randomized tests similar to PBT.
- Module `ics23test` contains basic unit tests that demonstrate simple
scenarios of membership and non-membership verification.

- Module `trees` contains advanced randomized tests that use tree generation.
- Module `treeGen` contains advanced randomized tests
that generate randomized tests for ICS23.

**Unit tests.** You can use Quint REPL to run basic unit tests as follows:

```sh
$ quint repl
>>> .load ics23.qnt
>>> import ics23test.*
>>> allTests
$ quint test ics23test.qnt
```

**Randomized generation of examples.** You can run randomized tests to produce
examples of successful membership and non-membership verification as follows:

konnov marked this conversation as resolved.
Show resolved Hide resolved
```sh
$ quint repl
>>> .load ics23.qnt
>>> import ics23pbt.*
>>> _test(1000, 10, "Init", "Next", "TestVerify")
>>> _test(1000, 10, "Init", "Next", "TestNonMem")
```

**Randomized invariant checking.** Finally, you can check membership and
**Randomized invariant checking.** Also, you can check membership and
non-membership invariants with the random simulator:

```sh
$ quint repl
>>> .load ics23.qnt
>>> import trees.*
>>> _test(1000, 1, "Init", "Next", "NonMemInv")
>>> _test(1000, 1, "Init", "Next", "MemInv")
$ quint run --invariant=nonMemInv --max-samples=1000 ics23trees.qnt
$ quint run --invariant=memInv --max-samples=1000 ics23trees.qnt
```

## TLA+ specification
If the simulator finds a violation to the above invariants, please let us know,
as these invariants should hold true.

Specifications [ics23.tla](./ics23.tla) and [ics23trees.tla](./ics23trees.tla)
is the manual translation from Quint to TLA+. The main value of this
specification is the ability to check the invariants with Apalache:
If you would like to produce several examples of checking membership and
non-membership, run the simulator as follows:

```sh
$ apalache-mc check --inv=MemInv ics23trees.tla
$ apalache-mc check --inv=NonMemInv ics23trees.tla
$ quint run --invariant=memExample --max-samples=1000 treeGen.qnt
$ quint run --invariant=nonMemExample --max-samples=1000 treeGen.qnt
```


[ICS23 Spec]: https://github.com/cosmos/ibc/tree/main/spec/core/ics-023-vector-commitments
[ICS23]: https://github.com/confio/ics23/
156 changes: 156 additions & 0 deletions examples/cosmos/ics23/hashes.qnt
Original file line number Diff line number Diff line change
@@ -0,0 +1,156 @@
/// A symbolic specification of a string that is suitable for reasoning about
/// strings and hashes (such as SHA256). The exact hash function is irrelevant.
/// The only assumption is that a hash is irreversible.
///
/// Similar to symbolic reasoning by Dolev and Yao, we would like to represent:
/// - a raw sequence of bytes, e.g., [ 1, 2, 3 ],
/// - a hashed sequence, e.g., h([ 1, 2, 3 ]),
/// - a concatenation of a raw sequence and a hash (in any order),
/// e.g., [ 1, 2 ] + h([ 3, 4 ])],
/// - nested hashes, e.g., h([ 1, 2 ] + h([ 3, 4 ])).
///
/// So the idea is basically to represent words as terms, where the atoms are
/// sequences of integers, and terms are constructed as n-ary applications
/// of the symbol "h", which means "hash". Since Quint enforces strict typing
/// and it does not allow for inductive/recursive algebraic data types,
/// we represent terms as maps that encode trees. For instance,
/// the term h([ 1, 2 ] + h([ 3, 4 ])) + [ 5, 6 ] corresponds to the tree:
///
/// *
/// / \
/// "h" [ 5, 6 ]
/// / \
/// [ 1, 2 ] "h"
konnov marked this conversation as resolved.
Show resolved Hide resolved
/// /
/// [ 3, 4 ]
///
/// The above tree is represented as a map in Quint. Each key corresponds to a path in the tree; for example, the first root's child is [ 0 ], the second root's child is [ 1 ], the first child of [ 0 ] is [ 0, 0], etc.
///
/// Map([ 0 ] -> { t: "h", b: [] },
/// [ 0, 0 ] -> { t: "r", b: [ 1, 2 ] },
/// [ 0, 1 ] -> { t: "h", b: [] } ),
/// [ 0, 1, 0 ] -> { t: "r", b: [ 3, 4 ] },
/// [ 1 ] -> { t: "r", b: [ 5, 6 ] })
///
/// Igor Konnov, Informal Systems, 2022-2023
module hashes {
/// A sequence of bytes is simply a list of integers
type Bytes_t = List[int]

/// compare two lists of integers (e.g., bytes) lexicographically
pure def lessThan(w1: Bytes_t, w2: Bytes_t): bool = {
pure val len1 = length(w1)
pure val len2 = length(w2)
or {
len1 < len2 and indices(w1).forall(i => w1[i] <= w2[i]),
and {
len1 == len2,
indices(w1).exists(i => and {
w1[i] < w2[i],
indices(w1).forall(j => j < i implies w1[j] == w2[j])
})
}
}
}

/// Returns true iff `w` is the nil value.
/// In our spec, nil is equivalent to the empty list.
pure def isNil(w: Bytes_t): bool = {
w == []
}

/// A tree node that represents a fragment of a term
type TERM_NODE_T = {
// node type: either "h" (for hash), or "r" (for raw sequence)
t: str,
// the bytes when t = "r", and [] when t = "h"
b: Bytes_t
}
bugarela marked this conversation as resolved.
Show resolved Hide resolved

/// A word is a map from a path to the term node.
/// The first root's child is [ 0 ], the second root's child is [ 1 ],
/// the first child of [ 0 ] is [ 0, 0], etc.
type Term_t = Bytes_t -> TERM_NODE_T
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a binary tree, This would also be encoded today as sum type, e.g., a list of LeftNode, RightNode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to note that Quint sum types are not inductive/recursive, for a reason of not admitting unbounded data structures. So it would not be possible to encode a tree with Quint sum types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But now it is a list of integers, where only 0 and 1 is used. Why cannot it be a list of a sum type instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to write it in any language that has complete support for sum types, e.g., in OCaml. You will see that you need recursion. Basically, you have to say that a node refers to the children nodes (of the same type). If you use a mathematical function instead, e.g., $f: Paths \rightarrow Nodes$, you do not need an inductive definition.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A list of a sum type would not give you much, as you would not want a leaf to appear in the middle of the list.


/// Compute term length in bytes,
/// assuming that a hash occupies 32 bytes (as SHA256 does)
pure def termLen(term: Term_t): int = {
// the roots' children
pure val top =
keys(term).filter(p => length(p) == 1).map(p => term.get(p))
Comment on lines +69 to +71
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Root's children is checked with length(p) == 1. If this would be a defined predicate, perhaps it would be more readable.

top.fold(0, (s, t) => if (t.t == "h") s + 32 else s + length(t.b))
}

/// Construct the term that encodes a raw sequence of bytes
pure def raw(bytes: Bytes_t): Term_t = {
Map([ 0 ] -> { t: "r", b: bytes })
}

/// Is the term representing a raw term?
pure def isRaw(term: Term_t): bool = {
size(keys(term)) == 1 and term.get([0]).t == "r"
}

/// Hash a term
pure def termHash(term: Term_t): Term_t = {
// add "h" on top, which has the key [ 0 ], and attach term to it
pure val paths = Set([ 0 ]).union(keys(term).map(p => [ 0 ].concat(p)))
paths.mapBy(p =>
if (p == [ 0 ]) {
{ t: "h", b: [] }
} else {
term.get(p.slice(1, length(p)))
}
)
}

/// Concatenate two terms. Special attention is paid to the case when the
/// both terms are raw sequences, which requires them to be merged.
pure def termConcat(left: Term_t, right: Term_t): Term_t = {
pure val l = left.get([0])
pure val r = right.get([0])
pure val isLeftRaw = isRaw(left)
pure val isRightRaw = isRaw(right)
if (isLeftRaw and l.b == []) {
right
} else if (isRightRaw and r.b == []) {
left
} else if (isLeftRaw and isRightRaw) {
// both arguments are raw sequences, produce a single raw sequence
Map([0] -> { t: "r", b: l.b.concat(r.b) })
} else {
// Merge the arguments as trees representing terms.
// The number of root's children in the left term:
pure val lwidth = size(keys(left).filter(p => length(p) == 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure I understand what is going on here. Does this only work if left and right don't contain an entry for [1]? Otherwise the expression lwidth + p[0] below would suddenly introduce keys whose value is greater than 1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember, every node is encoded as the path to this node. For instance, [0, 1, 0, 1] in a binary tree would mean that you first go left, then right, then left, then right. So if we count the number of paths of length 1, then we get the number of root's children.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but if we have 2 root's children, we get lwidth = 2, but then two lines below we have

 keys(right).map(p => [ lwidth + p[0] ].concat(p.slice(1, length(p))))

So it seems that I could get a 2 and a 3 as first list entry, and this is what I don't understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That definition merges two trees. Say, if you have root's children [0, 1] in tree 1 and root's children [0, 1] in tree 2, then you would like to have root's children [0, 1, 2, 3]. This is what addition does for you: It simply renames the immediate root's children in tree 2, for all paths in that tree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I just thought we would stay with binary trees...

// the paths of the right term shifted to the right by lwidth
pure val rshifted =
keys(right).map(p => [ lwidth + p[0] ].concat(p.slice(1, length(p))))
// the paths of the concatenation
pure val paths = keys(left).union(rshifted)
// the resulting term as a map
paths.mapBy(p =>
if (p[0] < lwidth) {
left.get(p)
} else {
right.get( [ p[0] - lwidth ].concat(p.slice(1, length(p))))
}
)
}
}

// prepend the hash with the length marker
pure def termHashWithLen(term: Term_t): Term_t = {
raw([ 32 ]).termConcat(termHash(term))
}

/// Slice a raw sequence represented by a term.
/// Non-raw sequences are returned unmodified.
pure def termSlice(term: Term_t, start: int, end: int): Term_t = {
pure val first = term.get([ 0 ])
if (isRaw(term)) {
raw(first.b.slice(start, end))
} else {
term
}
}
}
Loading