Various Komodo development issues
Here I collect various complex situations related to komodod (and iguana).
Error in console or debug log:
tx inputs not found tx=982ca8a3fea28d2c19f145ff7796450df4ffabb6faa95ab2d29d6dd9ec7f14a4 prev=a8e6f4673a760c558bec610c68ebde4368fc21a579ca221ec4a92548a24c1607
ERROR: AcceptToMemoryPool: tx inputs not found
tx inputs not found tx=1681b0f9c04ad1850c09676131559ba5d6519ae4b5757c90decc3ed11e5eed8c prev=cbf952effe5335db952092976dcffba69d571ba44cc5561aeaee541bc4083810
ERROR: AcceptToMemoryPool: tx inputs not found
...
The 'prev' is the coinbase tx. Research showed that its block (whose hash was not output by the gettransaction rpc but was present in the wtx object) was not in the chain. It seems I once sent a tx like 982ca8a3fea28d2c19f145ff7796450df4ffabb6faa95ab2d29d6dd9ec7f14a4 and it was saved in the wallet. Later the chain was reorged and its block was disconnected. On start komodod tried to add the wallet txns to the mempool and got that error.
In my case it was because I added key chain params (namely ac_import=pubkey) not when the chain first ran but at some later height, so the magic value changed with the new params; but the magic number is used in the first coinbase tx (h=1) to calculate the reward. So I temporarily removed the new param and restarted the node from h=0, and the resync was okay.
This error could also be returned from the getassetchainproof rpc. The problem was that I did not update to the latest version in April 2019 when notary elections were going on. So my komodod had several outdated notary pubkeys, therefore it could not validate some incoming back notarisations and did not save them in the database. As a result, when I tried to obtain the proof for a txn (for which there was no bn in the db), the program searched until the next saved back notarisation, which did not include the MoM for the txn's block, and failed. (Actually the merkle root value was changed because an incorrect nIndex was passed into the function that gets the merkle root for the MoM branch: the nIndex, calculated from the txn's block to the bn's height, was bigger than the bn's MoMdepth. That was an unexpected side way to find out that the txn's block was not in the bn's MoM.)
Sometimes CPubKey::IsFullyValid() does not detect a bad pubkey (correction: pubkey2pk did not check the source array length).
For example, the tokentransfer rpc calls IsFullyValid(). Once it returned true for a pubkey with the last nibble removed (the '2' after '59e' was removed):
./komodo-cli -ac_name=DIMXY11 tokentransfer 9c2775cd8c7302d79b0664adc26e709db59b74683f3d964ca40ff572b9da9604 02850be3666b776f745d5ea420a8f08984300ebf898c6719ed012420663b4659e 66
{
"result": "success",
"hex": "0400008085202f8902127d77543ac1f4ae1618a385cb9da76e61da68c3522c683d4ab4091a9cf86457000000004847304402203501fb7a518bbccffcc6f68b8f6eca5d6b52aa8a0f3f262db4c5896ec03c2c4202206c849f25ada57efd63403f42b307ba6f6381f2f6e9b78cc641ad0dfb60cda78e01ffffffff0496dab972f50fa44c963d3f68749bb59d706ec2ad64069bd702738ccd75279c010000007b4c79a276a072a26ba067a5658021035d3b0f2e98cf0fba19f80880ec7c08d770c6cf04aa5639bc57130d5ac54874db8140f3100b478b1e6371ec0f1215c59d52eea3758b68f04459c251deb10ccf9f3c827c65bc05cb9d89ffb6f1b0dc5b6443b414acdc26740a30f894d0616cd32f47b8a100af038001f2a10001ffffffff044200000000000000302ea22c8020c8b3f3d643228289e7c1dd772e43a640ff2e2837152e520620cc4186c78fd9b98103120c008203000401ccc819000000000000302ea22c80206befaca20275e1c8880ad164a673ee6a024758b6bb78e044f7a2f70839e1f6318103120c008203000401cccc3c0000000000002321035d3b0f2e98cf0fba19f80880ec7c08d770c6cf04aa5639bc57130d5ac54874dbac0000000000000000476a45f2749c2775cd8c7302d79b0664adc26e709db59b74683f3d964ca40ff572b9da9604012102850be3666b776f745d5ea420a8f08984300ebf898c6719ed012420663b46594b000000005b0000000000000000000000000000"
But another call returned 'invalid pubkey' for the same params. Strange! Suggestion: check that the pubkey length is 33 instead (or in addition).
CORRECTION: no, it seems the problem is not in CPubKey::IsFullyValid(); it is pubkey2pk() that does not check the array length before converting it to CPubKey. It must check the length and, if it is not 33, return an invalid empty CPubKey.
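The length check suggested above can be sketched as a standalone pre-check. This is only an illustration under assumptions: strict_parse_hex and looks_like_compressed_pubkey are hypothetical names, not functions from the komodo sources, and the sketch does not verify that the point lies on the curve (that remains the job of IsFullyValid()).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not Komodo's ParseHex): decodes a hex string strictly,
// returning false on an odd number of digits or on a non-hex character.
bool strict_parse_hex(const std::string &hex, std::vector<uint8_t> &out)
{
    out.clear();
    if (hex.size() % 2 != 0)
        return false; // a dropped nibble is caught here
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        const int hi = nibble(hex[i]);
        const int lo = nibble(hex[i + 1]);
        if (hi < 0 || lo < 0)
            return false;
        out.push_back(static_cast<uint8_t>((hi << 4) | lo));
    }
    return true;
}

// Hypothetical pre-check to run before converting user input to a pubkey:
// a compressed secp256k1 pubkey is exactly 33 bytes and starts with 0x02 or 0x03.
bool looks_like_compressed_pubkey(const std::string &hex)
{
    std::vector<uint8_t> bytes;
    if (!strict_parse_hex(hex, bytes))
        return false;
    if (bytes.size() != 33)
        return false;
    return bytes[0] == 0x02 || bytes[0] == 0x03;
}
```

With such a check, the 65-digit pubkey from the tokentransfer call above would be rejected before any CPubKey object is built.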
Some rpc methods (like getbalance) accept not only string params but also integer or boolean params. For this to work, the client.cpp source has a conversion table that specifies the index of each param to convert. Methods that are not in the table expect only string params. This might be confusing for curl users: for some methods you need to pass numbers as unquoted values, for other methods as quoted strings.
On two- or three-node test chains where only one node is mining (as is advised for such small chains) and ac_supply=100000 by default, I often see a tx sent on a non-mining node get stuck in that node's mempool for a long time without propagating to the mining node (getrawmempool is empty there). getpeerinfo shows 'synced_blocks=-1' on the mining node, that is, the sync is really not going. Sometimes rebooting the mining node and then the others helps. If the sync is still not going, to kick it I send any tx (like sendtoaddress) on the mining node. Then the mining node begins to mine, and the other nodes begin to sync blocks and are able to propagate txns from their mempools.
It was in the marmara cc project.
On the second node I did marmarareceive. Then on the first node I did marmaraissue with the txid of the marmarareceive call as a param, and sendrawtransaction returned a good txid. But the marmaraissue tx seemed to disappear, as getrawtransaction returned no info about it afterwards.
The first node was staking (with the staking customization for marmara, though); the second node was PoW.
At first I thought the problem was mempool related: for some time marmaraissue did not check that the marmarareceive tx was not in the mempool, so my first suspicion was that the marmaraissue tx disappeared because the marmarareceive tx was in the mempool at that moment and somehow the marmaraissue tx was not mined successfully because of that.
Anyway, I added a check that the marmarareceive tx should not be in the mempool. But I have no proof that this was the cause. I noticed, though, that once when a marmaraissue tx disappeared again, I had restarted komodod after the sendrawtransaction call. So it seems the marmaraissue tx was simply lost from the node mempool when the daemon was restarted.
Hm, I got this issue many times with no restart. I discovered that total vin == total vout, so it seems the problem was that the tx had no txfee.
See also a similar issue below - 'Valid Transaction seems lost in the chain'
Also in the log there were messages 'ht.111 PoW diff violation PoSperc.74 vs goalperc.75'. The problem was that my Marmara_PoScheck() function incorrectly reported that a PoS block was invalid (it checked the staked tx addresses using an incorrect pubkey). So PoS blocks were recognized not as PoS but as PoW; but for PoW blocks the difficulty is higher and these blocks were not correct PoW blocks, hence the errors.
There were messages 'ht.101 PoW diff violation PoSperc.74 vs goalperc.75'.
Investigation showed that komodo_segid in komodo_PoWtarget always returned -1. This, in turn, was because of an incorrect check of the stake tx vout size (before some block height it was required to be 2, but later the requirement was changed to 1). So validation of the stake txns in early blocks failed. I set the validation to allow both 1 and 2.
Correction: there is a function komodo_segids that looks 100 blocks back and retrieves their segids using the komodo_segid function, which in turn does a quick check of the stake tx.
Make sure that komodo_segids works identically on both block creation and validation (especially if you change the stake tx structure). If komodo_segids does not work identically on creation and validation, you might get a different difficulty calculation and the block will be considered invalid on validation.
If you changed the stake tx structure, you need to define at what height it changed and check both the old and the new structure in komodo_segid as well.
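A height-dependent structure check like the one described above might look as follows. This is a minimal sketch under assumptions: the function name, its parameters and the example height are hypothetical and do not come from the komodo sources; only the rule (2 vouts before the change height, 1 vout from it on) is taken from the notes.

```cpp
#include <cstddef>

// Hypothetical sketch: 'changeHeight' is the height at which the stake tx
// format switched from 2 vouts to 1 vout. The same function must be used both
// when creating a block and when validating it, so that both paths agree.
bool stake_vout_count_ok(int height, std::size_t voutCount, int changeHeight)
{
    if (height < changeHeight)
        return voutCount == 2; // old stake tx structure
    return voutCount == 1;     // new stake tx structure
}
```

Keeping the rule in a single function called from both block creation and validation is one way to avoid the two paths diverging.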
On Windows GMP mpz_get_si returned a truncated value (after setting a 64-bit value with mpz_set_si 71000000000LL). On Linux it worked okay. The problem was with the GMP repo we use to build komodo, https://github.com/joshuayabut/libgmp (it seems to be unofficial). It uses this signature:
void mpz_set_si (mpz_ptr dest, signed long int val);
This 'signed long int' type has different sizes on Linux (8 bytes) and 64-bit Windows (4 bytes) (https://en.cppreference.com/w/cpp/language/types). We developed a patch for this issue that internally uses mpz_set_ui, which takes 'unsigned long long int'; it is always 8 bytes in size. Interestingly, in the official doc mpz_set_ui still takes the 8/4-byte 'unsigned long int', so our patch would not work if we switched to the official GMP repo. It seems better to use this solution: on forums it is suggested to use the basic mpz_import and mpz_export functions to make mpz_set/get for custom types https://stackoverflow.com/questions/6598265/convert-uint64-to-gmp-mpir-number (note
See more info here: Using GMP Library with int64_t
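To illustrate why 71000000000 is truncated on 64-bit Windows, here is a small self-contained sketch that does not depend on GMP. The helper names are hypothetical. It checks whether a value fits a 4-byte 'long' and shows how a 64-bit value can be split into two 32-bit words so that it can pass through an API limited to 4-byte arguments.

```cpp
#include <cstdint>
#include <limits>

// True if v can be stored in a 32-bit signed integer, which is the width of
// 'signed long' on 64-bit Windows (LLP64 data model).
bool fits_in_32bit_long(int64_t v)
{
    return v >= std::numeric_limits<int32_t>::min() &&
           v <= std::numeric_limits<int32_t>::max();
}

// Splits an unsigned 64-bit value into its high and low 32-bit words.
void split_u64(uint64_t v, uint32_t &hi, uint32_t &lo)
{
    hi = static_cast<uint32_t>(v >> 32);
    lo = static_cast<uint32_t>(v & 0xffffffffULL);
}

// Recombines the two 32-bit words into the original 64-bit value.
uint64_t join_u64(uint32_t hi, uint32_t lo)
{
    return (static_cast<uint64_t>(hi) << 32) | lo;
}
```

With GMP the two words could then be combined using mpz_set_ui, mpz_mul_2exp and mpz_add_ui, or the whole value passed in one go with mpz_import as the forum answer suggests; which variant is preferable for komodo is left open here.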
It is recommended to have even amounts on all 64 segids. Also, no segid should have an amount greater than the total amount on the rest of the segids (in that case such a segid would be the winner all the time).
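The recommendation can be turned into a quick sanity check over the per-segid totals. This is a sketch under assumptions: dominating_segid is a hypothetical helper, and how the per-segid amounts are collected from the wallet is not shown.

```cpp
#include <array>
#include <cstdint>
#include <numeric>

// Hypothetical check: returns the index of a segid whose staked amount is
// greater than the sum of all other segids, or -1 if there is no such segid.
int dominating_segid(const std::array<int64_t, 64> &amounts)
{
    const int64_t total =
        std::accumulate(amounts.begin(), amounts.end(), int64_t{0});
    for (int i = 0; i < 64; ++i) {
        if (amounts[i] > total - amounts[i])
            return i;
    }
    return -1;
}
```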
Again I got into the situation with 'invalid block' on the validation node. I checked that all 100 komodo_segid calls returned correct segids (so that was not the cause). But the winner was still 0 and 'eligible' was also returned as 0 in the komodo_stake(1,...) call. The problem was in the MarmaraGetStakeMultiplier function, which returned a different multiplier value on block validation (because I erroneously compared mypubkey with the pk in the opret, which of course should not have been done). As a consequence, the staking utxo values changed and the winner could not be determined.
A Marmara PoS node got into fast cycling of new block creation, but with no block actually generated
The problem cleared when the marmara-specific opret (with the description of locked-in-loop coins) began to be parsed correctly. Before that a 1x multiplier was returned instead of 100x. How this incorrect multiplier influenced the problem, I really don't know. Probably the multiplied value was used somewhere.
The log was changing very fast, with no delays; it seems the node could not find the winner ('iterated till 600' in the log). It seems this resolves itself, but why such fast cycling with no delay? It overloads the processor.
Cryptocondition cc_conditionFromJSON returns NULL for a cc previously serialized with cc_conditionToJSON
Actually the function that failed was secp256k1_ec_pubkey_parse, called internally, which could not parse some pubkeys generated by the CCtxidaddr function. Only a few of the pubkeys generated this way caused the issue. My understanding is that CCtxidaddr sometimes generates invalid pubkeys, and to solve this we need to tweak the generated pubkey if it does not pass validation.
/bin/bash ./libtool --tag=CXX --mode=compile x86_64-w64-mingw32-g++ -std=gnu++11 -DHAVE_CONFIG_H -I. -I./src -DFD_SETSIZE=16384 -DZMQ_STATIC -D_REENTRANT -D_THREAD_SAFE -pedantic -Werror -Wall -Wno-long-long -g -O2 -Wno-atomic-alignment -Wno-tautological-constant-compare -MT src/src_libzmq_la-mailbox_safe.lo -MD -MP -MF src/.deps/src_libzmq_la-mailbox_safe.Tpo -c -o src/src_libzmq_la-mailbox_safe.lo `test -f 'src/mailbox_safe.cpp' || echo './'`src/mailbox_safe.cpp
libtool: compile: x86_64-w64-mingw32-g++ -std=gnu++11 -DHAVE_CONFIG_H -I. -I./src -DFD_SETSIZE=16384 -DZMQ_STATIC -D_REENTRANT -D_THREAD_SAFE -pedantic -Werror -Wall -Wno-long-long -g -O2 -Wno-atomic-alignment -Wno-tautological-constant-compare -MT src/src_libzmq_la-mailbox_safe.lo -MD -MP -MF src/.deps/src_libzmq_la-mailbox_safe.Tpo -c src/mailbox_safe.cpp -o src/src_libzmq_la-mailbox_safe.o
In file included from src/mailbox_safe.hpp:43:0,
from src/mailbox_safe.cpp:31:
src/condition_variable.hpp:140:10: error: ‘condition_variable_any’ in namespace ‘std’ does not name a type
std::condition_variable_any _cv;
^
src/condition_variable.hpp: In member function ‘int zmq::condition_variable_t::wait(zmq::mutex_t*, int)’:
src/condition_variable.hpp:122:13: error: ‘_cv’ was not declared in this scope
_cv.wait (
^
src/condition_variable.hpp:124:20: error: ‘_cv’ was not declared in this scope
} else if (_cv.wait_for (*mutex_, std::chrono::milliseconds (timeout_))
^
src/condition_variable.hpp:125:28: error: ‘std::cv_status’ has not been declared
== std::cv_status::timeout) {
^
src/condition_variable.hpp: In member function ‘void zmq::condition_variable_t::broadcast()’:
src/condition_variable.hpp:136:9: error: ‘_cv’ was not declared in this scope
_cv.notify_all ();
^
At global scope:
cc1plus: error: unrecognized command line option ‘-Wno-tautological-constant-compare’ [-Werror]
cc1plus: error: unrecognized command line option ‘-Wno-atomic-alignment’ [-Werror]
cc1plus: all warnings being treated as errors
Makefile:4213: recipe for target 'src/src_libzmq_la-mailbox_safe.lo' failed
make[1]: *** [src/src_libzmq_la-mailbox_safe.lo] Error 1
make[1]: Leaving directory '/home/komodo/depends/work/build/x86_64-w64-mingw32/zeromq/4.3.1-032d3192597'
funcs.mk:257: recipe for target '/home/komodo/depends/work/build/x86_64-w64-mingw32/zeromq/4.3.1-032d3192597/./.stamp_built' failed
make: *** [/home/komodo/depends/work/build/x86_64-w64-mingw32/zeromq/4.3.1-032d3192597/./.stamp_built] Error 2
Solution:
just built latest win jl777/komodo -b jl777 without problems. Probably you've tried to build in the same folder where built for linux b4 or don't have some deps installed (first scenario is more probable according to your log).
The install docs were missing the step:
sudo update-alternatives --config x86_64-w64-mingw32-gcc
sudo update-alternatives --config x86_64-w64-mingw32-g++
and select the POSIX version from the options after each command.
Could be several problems:
- oracle subscriptions have run out of funds on the mypk oracle cc address (can't AddOracleInputs) -> add new subscriptions with more funding
- oraclefeed sends several data sample txns within one block, more than the number of subscriptions (actually all the utxos on the mypk oracle cc address are already spent in the mempool by the previous oraclefeed data txns until they are mined) -> add more subscriptions to accommodate all the oraclefeed data samples sent within one block. This is relevant for asset chains with large blocktimes
Sometimes after changing options in Makefile.am it is better to rebuild the autoconf script files (configure, libtool and config.site) with touch configure.ac and make clean.
Otherwise the old options are somehow kept in those files after Makefile.am is changed.
When I tried to build nspv as a shared library for MINGW I got a libtool error:
Warning: This system cannot link to static lib archive /home/ubuntu/repo/test-dll/depends/x86_64-w64-mingw32/share/../lib/libevent.la.
I have the capability to make that library automatically link in when
you link to this library. But I can only do this if you have a
shared version of the library, which you do not appear to have.
So it seems the point is that libtool can't link installed libs like libsodium (?)
This was only for MINGW (building for Linux was okay). I looked into the libtool script and it seems I found some issues regarding MINGW around that warning message in the script.
Found a workaround (or is this by design, as maybe libtool can link only uninstalled libs, that is, ones in a .libs subdirectory?): instead of using libshared_la_LIBADD = -levent, which causes libtool to search for the libsodium shared library, I just passed the option -Wl,-levent right to the linker in the libshared_la_CFLAGS = param.
Another issue with linking the nspv shared library to libbtc as a static library: the static lib should be noinst
I noticed that static libraries that a shared library links to should be noinst in Makefile.am (set in the noinst_LTLIBRARIES= param, not in lib_LTLIBRARIES=).
Occurs on older branches because libsodium-1.0.15 was moved into the releases/old directory on the web source. Update to the latest branch with the updated libsodium.mk or fix the download path (in new komodod releases the path is fixed).
My two-node MARMARAXY8 PoS chain sometimes stalls: it gets into long looping, then eventually comes out of the loop but still waits for a number of seconds that is more than blocktime=180
526 seconds until elegible, waiting.
connect() to 95.216.252.178:14722 failed after select(): Connection refused (111)
486 seconds until elegible, waiting.
465 seconds until elegible, waiting.
connect() to 185.25.48.72:14722 failed after select(): No route to host (113)
422 seconds until elegible, waiting.
358 seconds until elegible, waiting.
Unsolved yet. At first I thought it was because there were no utxos in some segids, but in this case there is 1 utxo in every segid.
In cc modules we use a statement to convert string coins to satoshis, similar to this:
int64_t amount = atof((char *)params[1].get_str().c_str()) * COIN + 0.00000000499999;
For the coin value '2.22226666' it returns '222226665', which might be a problem if some function expects the exact value.
I suggest switching to the AmountFromValue or ParseFixedPoint function (the first one throws an exception, the second returns bool). The functions check satoshi accuracy, for example they treat 2.222266666666 as an incorrect amount, so we should check the bool result, otherwise the value could be uninitialized.
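To show the idea of exact conversion without floating point, here is a minimal self-contained sketch. coins_to_satoshis is a hypothetical helper written for this note; it is not the ParseFixedPoint or AmountFromValue implementation from the komodo sources and handles only plain non-negative decimals.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts a decimal coin string to satoshis using
// integer arithmetic only. Returns false for malformed input, for more than
// 8 decimal places, or for values too large to fit into int64_t.
bool coins_to_satoshis(const std::string &s, int64_t &out)
{
    const int64_t COIN = 100000000; // satoshis per coin
    std::size_t i = 0;

    int64_t whole = 0;
    std::size_t wholeDigits = 0;
    while (i < s.size() && s[i] >= '0' && s[i] <= '9') {
        whole = whole * 10 + (s[i] - '0');
        ++i;
        if (++wholeDigits > 10)
            return false; // keeps whole * COIN inside the int64_t range
    }
    if (wholeDigits == 0)
        return false;

    int64_t frac = 0;
    std::size_t fracDigits = 0;
    if (i < s.size() && s[i] == '.') {
        ++i;
        while (i < s.size() && s[i] >= '0' && s[i] <= '9') {
            if (++fracDigits > 8)
                return false; // finer than one satoshi
            frac = frac * 10 + (s[i] - '0');
            ++i;
        }
        if (fracDigits == 0)
            return false; // "1." is rejected
    }
    if (i != s.size())
        return false; // sign, exponent or trailing garbage

    for (std::size_t k = fracDigits; k < 8; ++k)
        frac *= 10; // scale the fraction to exactly 8 decimals
    out = whole * COIN + frac;
    return true;
}
```

The caller must check the returned bool before using the output value, which is the same discipline the paragraph above asks for with ParseFixedPoint.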
It seems the connection to a restarted node could be restored from the restarted komodod spv node's side. When nspv.exe is connected, it shuts down its event loop if no server nodes remain connected. If libnspv is run as a library (inside some GUI app) it does not advertise its IP address, so the restarted node does not know where to connect. It seems we need to implement polling in the GUI or, even better, check in each call whether there are no nodes and then rediscover peers.
When Prices DTO is running, 'ERROR: CheckTransaction(): invalid script data for coinbase time lock' appears
Errors in console or debug.log
ERROR: CheckTransaction(): invalid script data for coinbase time lock
ContextualCheckBlock failed ht.352
This is because the blocktime is too low (< 180) and the algorithm processing the received prices data calculates an incorrect timing value. Use blocktime >= 180 sec.
To force rebuild libcc.dll remove src/cc/customcc.so library
When cross-compiling komodod for Windows with build-win.sh on Debian, mingw ld failed with a message:
BFD (GNU Binutils) 2.28 assertion fail ../../../upstream/bfd/cofflink.c:265
Solution:
- Use Ubuntu OS
- Upgrade mingw compiler:
sudo update-alternatives --config x86_64-w64-mingw32-gcc
(select POSIX option)
sudo update-alternatives --config x86_64-w64-mingw32-g++
(select POSIX option)
In a PoS + PoW two-node chain I sent a valid marmaraissue tx. At first getrawtransaction showed the tx in the mempool (no confirmations), then getrawtransaction showed 'no information about tx', and eventually the tx appeared in the chain. It seems the reason was a chain reorganization, as both nodes were mining.
There are errors in debug.log like this:
2020-01-13 05:44:50 ERROR: ContextualCheckTransaction(): transaction 658c48625fadfb6757c388daaa1b3093edf69a5b4575abeee5510e071c10491a is expired, expiry block 1306 vs current block 1307
txhex.0400008085202f8901a82e1ceb46842795cdcaf73f29be2ca522b8cdf24e967cb1fdf3b12eab425a0500000000a74ca5a281a1a0819ca28194a067a565802103afc5be570d0ff419425cfcc580cc762ab82baad88c148f5b028d7db7bfeee61d8140cadebfd3bbb50d3e05b5b3ea0a0aaa9ae8ed77cf1a3ebe05ec5fdb0674cbd56f23724697bd563f5c27172c1fb6f613e3bd26dfcb7744d11b635bdc47c1cdcf39a129a527802073cfdd7f8cb80f7b882be14983bee715d6729cb0ec454a42f6e0bc514eadf0bd8103020000af038001efa10001ffffffff02005ed0b200000000652ea22c8020666769295db3781db424626331621aade93d3535d49baa6d900e99acf5d0280e81031210008203000401cc330401ef01022def43012103b70f952cb2b38ca844fe1683cebf58c838cd0f9615f89f19bffe32badc47790e96010000feffff7f750000000000000000236a214d29fa4b654b51b9f5ef5372dbf083770280f5d6e581d6e40579c26e2c3be2b2b6eb031c5e1a0500000000000000000000000000
2020-01-13 05:44:50 ERROR: AcceptToMemoryPool: ContextualCheckTransaction failed
This is a stake tx whose ExpiryHeight is set to the block where it is created (here 1306). It seems this error appears when the node is replicated over a bad network and a PoS block was downloaded twice and tried to connect at H+1 (1307). So all is good: the bad block was detected by one of the rules.
The configuration: 2 PoS nodes and 1 PoW node. Once, after start, all the nodes were repeatedly mining at the same height without advancing.
There were errors in all the debug.logs:
2020-02-08 09:15:58 ERROR: ConnectBlock: ac_staked chain failed slow komodo_checkPOW
2020-02-08 09:16:01 komodostaking ERROR: [MCLT0:292] bnhash=0cb32256f1f68a25ace79cd5a49bbfea > bnTarget=0071a000000000000000000000000000 ht.292 PoW diff violation PoSperc.76 vs goalperc.75
I called setgenerate false on all the nodes, then called setgenerate true with 0 or 2 (as appropriate), and the chain resumed.
It seems something bad happened with bnTarget.
Maybe it is bad to start nodes with the -gen option set and better to call setgenerate from komodo-cli.
The error cleared after make clean and cd src/cryptoconditions; make clean.
Update: sometimes, after I have built komodod for the Linux target platform (on the same Ubuntu box), make clean ends with an error that it could not remove libdb_ccx. Repeating make clean and make clean in cryptoconditions does not clear this error. If I try to build, the build also ends with an error. But after this build, make clean works well, with no errors.
To stop variables from being hidden (optimized out) in the debugger, turn off compiler optimisation (change the default -O1 to -O0 or -Og). The default flags are set by ./depends/hosts/*.mk in the host_CFLAGS env vars. The optimisation flags are also set additionally in ./src/Makefile.am for each library included into komodod (but now I receive an error after komodod start, '_nDecemberHardfork not found'; this looks like a module organisation issue, maybe this var should be moved into the common lib).
I sent two tokencreate and tokentransfer txns (enabled to use mempool txns), with mining turned off. The mempool was:
./komodo-cli -ac_name=DIMXY14 getrawmempool true
{
"3c8ac39975b5d85886a171add1e4f922929fc77614ac359c6ae4b1372ae0c812": {
"size": 353,
"fee": 0.00010000,
"time": 1589628739,
"height": 627,
"startingpriority": 0,
"currentpriority": 0,
"depends": [
]
},
"d5b58ff8cf99541848d5e368580aeaf05b545538dbf30072695ef5ad11b5d341": {
"size": 352,
"fee": 0.00010000,
"time": 1589628652,
"height": 627,
"startingpriority": 0,
"currentpriority": 0,
"depends": [
]
},
"3c067f73487e6723ea80d94a8e71fc6345db3bbe74ba8c49e25578abf6d88448": {
"size": 546,
"fee": 0.00010000,
"time": 1589628716,
"height": 627,
"startingpriority": 1281.138790035587,
"currentpriority": 1281.138790035587,
"depends": [
"d5b58ff8cf99541848d5e368580aeaf05b545538dbf30072695ef5ad11b5d341"
]
},
"66eee08e49ce7b8926d30a9f0540070d15e4a2eca1575822174e9d53ae6a1a97": {
"size": 546,
"fee": 0.00010000,
"time": 1589628787,
"height": 627,
"startingpriority": 45978.64768683274,
"currentpriority": 45978.64768683274,
"depends": [
"3c8ac39975b5d85886a171add1e4f922929fc77614ac359c6ae4b1372ae0c812"
]
}
}
After mining was turned on, one of those four txns stayed in the mempool:
./komodo-cli -ac_name=DIMXY14 getrawmempool
[
"3c067f73487e6723ea80d94a8e71fc6345db3bbe74ba8c49e25578abf6d88448"
]
The tx was not in the chain (so this was not the case where a cc tx is both in a block and in the mempool as a result of a failed late block). On trying to resend it, I got a '-25 missing inputs' error. It turned out that another tx (db24da14c05cb3f37d1f383e6cb1325a44d1d142438ef24f544506578e8f20cc), which had been created a day before on another node, was able to spend an input of the 'd5b5' tx, on which the '3c06' tx also depended (it seems the db24 tx had not been mined and stayed in the mempool, so it was resumed when the node started and mining was turned on).
Here is the other node's mempool state before mining:
[
"3c8ac39975b5d85886a171add1e4f922929fc77614ac359c6ae4b1372ae0c812",
"d5b58ff8cf99541848d5e368580aeaf05b545538dbf30072695ef5ad11b5d341",
"66eee08e49ce7b8926d30a9f0540070d15e4a2eca1575822174e9d53ae6a1a97",
"db24da14c05cb3f37d1f383e6cb1325a44d1d142438ef24f544506578e8f20cc"
]
Could this case at some time create a problem where cc validation code behaves differently on two nodes (if we allow referring to txns in the mempool), thus creating forks?
The situation where a tx is both in the mempool and in a block occurs because, on a cc chain, when a block is validated its cc txns are put into the mempool. This is good because cc txns might refer to mempool txns. If the block is added to the chain, its txns are removed from the mempool (in the removeForBlock() function). The problem happens when this block fails (because a remote block came in earlier and was added first) and its txns stay in the mempool. (This is actually the cause of that "Mempool missing inputs" error.) So I think we need to do some clean-up and restore the previous mempool state if a block fails. This is being tested in the marmara chain.
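The clean-up idea above can be sketched with a small rollback guard. This is only an illustration under assumptions: the mempool is modelled as a plain set of txids, and BlockMempoolGuard is a hypothetical class, not the code under test in the marmara chain.

```cpp
#include <set>
#include <string>
#include <vector>

// Simplified stand-in for the real mempool: just a set of txids.
using Mempool = std::set<std::string>;

// Hypothetical RAII guard: remembers which txids were added to the mempool
// while a block is being validated and removes them again on destruction,
// unless commit() was called because the block was connected successfully.
class BlockMempoolGuard {
public:
    explicit BlockMempoolGuard(Mempool &pool) : pool_(pool) {}

    ~BlockMempoolGuard()
    {
        if (!committed_) {
            for (const auto &txid : added_)
                pool_.erase(txid); // restore the previous mempool state
        }
    }

    void add(const std::string &txid)
    {
        // Track only txids that were not in the mempool before, so that
        // pre-existing entries are never removed by the rollback.
        if (pool_.insert(txid).second)
            added_.push_back(txid);
    }

    // Call when the block was connected; in the real node the block's txns
    // would then be removed by removeForBlock() instead.
    void commit() { committed_ = true; }

private:
    Mempool &pool_;
    std::vector<std::string> added_;
    bool committed_ = false;
};
```

The point of the guard is that a failed block leaves the mempool exactly as it was before validation started, so its cc txns cannot linger and later cause 'missing inputs' errors.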
When called in libnspv from FinalizeCCtx, CCPubKey sometimes hung in cc_conditionBinary. I cannot remember exactly what the reason was. Later a fix was added to cryptocondition.c, which probably fixed that issue:
ConditionTypes_t asnSubtypes(uint32_t mask) {
ConditionTypes_t types;
memset(&types, '\0', sizeof(types)); // the fix added this line
uint8_t buf[4] = {0,0,0,0};
int maxId = 0;
When syncing a marmara node, errors like this could be found:
2020-06-14 17:37:45 marmara MarmaraValidateStakeTx ERROR: found activated opret not equal to vintx opret, opret=OP_RETURN ef430121035d94f42e0545a7a5a10988b5b3b359739ed59de7566b69d62f250281cfb5ae4d422c0300feffff7f vintx opret= h=207939
Investigation showed that the stake vintx could not be loaded at that moment (but this vintx exists in the chain and could be loaded later). I saw this even on a non-mining node doing a full sync. That would mean there is probably a fork in the chain and the node was receiving a block referring to a vintx not in the chain(?). But how do such errors happen on a node doing a full sync? Could these be errors in the synchronisation implementation? The errors might happen while syncing pretty deep below the tip, so these could not be blocks from forked nodes.
More log examples:
2020-06-16 13:01:38 UpdateTip: new best=0a24e667d5ece678c4d45cc87bc45295d4edf22fd20504b0941e99e9eef5c80d height=113273 log2_work=36.875327 log2_stake=-inf tx=219701 date=2020-04-05 15:34:31 progress=0.526814 cache=19.7MiB(59869tx)
2020-06-16 13:01:38 UpdateTip: new best=0123f852a6b0962f2af214932ebca7af64bb449ab30ddec8f03cfbd3afe88465 height=113274 log2_work=36.875342 log2_stake=-inf tx=219703 date=2020-04-05 15:35:25 progress=0.526819 cache=19.7MiB(59870tx)
2020-06-16 13:01:38 UpdateTip: new best=0b06f801e68ac418303c7bf655f0141fc6f08b82b3d726cefa04df06c27146fd height=113275 log2_work=36.875357 log2_stake=-inf tx=219705 date=2020-04-05 15:40:13 progress=0.526824 cache=19.7MiB(59871tx)
2020-06-16 13:01:39 GetTransaction could not find txid=5c5295d4491a450868c119b5d94bbdd21bdb9e020eec26fafc5c29cdd54af2ad
2020-06-16 13:01:39 marmara MarmaraValidateStakeTx ERROR: found bad lock-in-loop opret not equal to vintx opret, opret=OP_RETURN ef4b01d28e18a06354ab1a13268d7d3da80e2142b9bd161567275021b201fae1babdc92103ba684a133f4a1cdef02b0ccb28e1b67e7e3fb8abcda115e24728c4adaac58242 vintx opret= h=113526
Note this height 113526: it is greater than the current sync height (113275).
...
...
2020-06-16 13:01:40 UpdateTip: new best=00c98569c03418496269fe7c57b4d113f1c71e95b302b04a5259c972511ec325 height=113330 log2_work=36.876073 log2_stake=-inf tx=219818 date=2020-04-05 16:37:11 progress=0.527080 cache=19.7MiB(59829tx)
2020-06-16 13:01:40 ConnectBlock adding tx to index 00c02b2b6b3a4fd226f36160fd64a9ba6c937d55cfbf1e4273d084f0271909fd
2020-06-16 13:01:40 ConnectBlock adding tx to index 8ff58ca6284073454f16aacfc3e4270100c8a2c2fa067eaa18d90a473008ccb2
Here the stake vintx that could not be loaded with GetTransaction was added:
2020-06-16 13:01:40 ConnectBlock adding tx to index 5c5295d4491a450868c119b5d94bbdd21bdb9e020eec26fafc5c29cdd54af2ad <-- tx that was not found
It looks like when the node was syncing, at height=113275 it received a block with a greater height (this block referred to the stake vintx that was mined in the block at height 113330, still greater than the current height=113275, so GetTransaction failed for it). It is not clear why some node sent block 113526 at that moment. It was not a new block though, as the actual chain height was > 215000 at the moment.
I created a simple multithreaded error class with public static set and get methods and a private static var storing the error value. The get and set methods were defined in the class definition in the include header, and the static error var was defined in one of the cpp files. The build was okay, but the app crashed on MacOS with a BAD_ACCESS error on the first use of a method (the Linux build worked alright though). Apparently the error var was not linked correctly. The issue on MacOS was resolved when the get and set method definitions were moved from the header to the same cpp where the error var was defined, and only the get and set declarations remained in the header:
CCinclude.h:
// multithreaded CCerror version
class CCerrorMT {
private:
static thread_local std::string th_cc_error;
public:
static void clear();
static void set(const std::string &err);
static std::string get();
static bool empty();
};
CCutilsbits.cpp:
thread_local std::string CCerrorMT::th_cc_error;
void CCerrorMT::clear()
{
th_cc_error.clear();
}
void CCerrorMT::set(const std::string &err)
{
th_cc_error = err;
}
std::string CCerrorMT::get()
{
return th_cc_error;
}
bool CCerrorMT::empty()
{
return th_cc_error.empty();
}
The cc validation code should not rely on mempool txns being exactly the same on all nodes. Delivery of a tx to the nodes' mempools is not reliable, and some mempool txns might get lost on their way to other nodes. So the validation code should not check that a tx (that is not directly spent) exists in the mempool; it is better to check only for confirmed txns (though this is also not very good, as txns might get lost as a result of a reorg).
An important thing is, if validation code still needs to check that some tx is available in mempool, it should not just list them
Different mempool state on the same node when cc tx is added to mempool and the block with this tx is connected
There is a problem that might affect cc module validation code: the mempool state is different when a cc tx is validated the first time, on addition to the mempool, and the second time, when the block with this tx is connected to the chain. The first time the cc validation code does not yet see the tx in the mempool, but the second time it does, because when a block is being connected all of its cc txns are added to the mempool.
One side effect is that the cc validation code might fail if it checks for some utxo: in the first validation run it finds that the utxo is not spent in the mempool, but the second time it fails to find this utxo because it is spent by the validated tx itself.
(Example: the kogs cc baton validation code looks for the token utxo deposited to the game. But the game finishing tx sends the tokens back to the owners, so when kogs validates the game finish baton it finds that the tokens are sent back by the finishing baton itself.)
As a recommendation: cc code should never check for txns while they are in the mempool; it should check only confirmed txns, and should not check whether those txns are spent in the mempool.
On start komodod reapplies txns from the wallet:
* frame #0: 0x00000001002ee6b2 komodod`AcceptToMemoryPool(pool=0x0000000101507880, state=0x00007ffeefbfe110, tx=<unavailable>, fLimitFree=<unavailable>, pfMissingInputs=0x0000000000000000, fRejectAbsurdFee=<unavailable>, dosLevel=0) at main.cpp:1935:141
frame #1: 0x00000001006f2575 komodod`CWallet::ReacceptWalletTransactions(this=0x0000000103533a00) at wallet.cpp:2982:34
frame #2: 0x00000001002270d8 komodod`AppInit2(threadGroup=<unavailable>, scheduler=<unavailable>) at init.cpp:2089:48
frame #3: 0x00000001000084b6 komodod`AppInit(argc=<unavailable>, argv=<unavailable>) at bitcoind.cpp:236:24
frame #4: 0x0000000100bdc14f komodod`main(argc=15, argv=0x00007ffeefbff858) at bitcoind.cpp:264:20
Some txns might already be spent, hence the 'missing inputs' msgs. Note: normally the 'missing inputs' message is commented out.
Normally the 'missing inputs' msg is commented out. When a remote block is connected its txns are sent to the mempool and then removed (roughly speaking). So it could be that txns sent to the mempool on the remote node are relayed to this node after the block is connected. So the inputs might already be removed from the mempool and the relayed txns fail to be put into the mempool.
Yet another 'missing inputs' might happen if the address index is inconsistent with the blocks. This might happen after a node crash or an improper shutdown. Use -reindex to rebuild the indexes.
On console there is an error like this, with no extra notes:
ERROR: CScriptCheck(): b1761b73d69583a7227f0657f4afc21d1bd6bbbf7b935d660668397476f0c861:0 VerifySignature failed: Script evaluated without error but finished with a false/empty top stack element
ERROR: AcceptToMemoryPool: ConnectInputs failed b1761b73d69583a7227f0657f4afc21d1bd6bbbf7b935d660668397476f0c861
On the client side a -26 error code is returned with an error message '16: mandatory-script-verify-flag-failed (Script evaluated without error but finished with a false/empty top stack element)'
This most probably indicates an incorrect signature hash; it shows up on an nspv client with a bad hash calculation.
If a tx is sent with a low txfee it is accepted to the local mempool (with default settings) but may be rejected by the rate limiter on remote nodes. So the tx stays in the local node mempool and is never mined. It is recommended to set the txfee depending on the tx size (by default 100 sat/kB * size-in-kB).
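A minimal sketch of the size-based fee from the note above, assuming 1 kB = 1000 bytes and rounding up to a whole satoshi. The function name and the rounding rule are my assumptions, not the wallet's actual code.

```python
DEFAULT_FEE_RATE_SAT_PER_KB = 100  # the default rate quoted in the note above


def size_based_fee(tx_size_bytes, rate_sat_per_kb=DEFAULT_FEE_RATE_SAT_PER_KB):
    """Fee in satoshis proportional to the tx size, rounded up.

    Integer arithmetic is used to avoid floating point rounding surprises.
    """
    if tx_size_bytes < 0:
        raise ValueError("tx size must not be negative")
    return (tx_size_bytes * rate_sat_per_kb + 999) // 1000


for size in (250, 1000, 2500):
    print(size, "bytes ->", size_based_fee(size), "sat")
```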
The komodod nspv server has a rate limiter that does not allow more than 1 identical request per second. In such a case the komodod nspv server simply ignores the requests, which causes timeouts on the client side.
This is possible.
I am testing the agreements cc. Suppose I have one mining node and a second, non-mining node.
- First, on the node1 I create an agreement create proposal tx and send it to node1
- Then on the node2 I create an accept tx (spending the proposal tx, after checking it is mined by verifying that hashBlock is not null) and send it to node2. This tx becomes the agreementtx.
- Then on the node1 I create an update proposal tx and send it to node1
- Then on the node2 I create the second update accept tx (spending the update proposal tx and the agreementtx) and send it to node2
- Just after, with no delay, on the node1 I create a dispute tx and pass the agreementtxid as a parameter. The dispute creation rpc iterates from the first agreementtx to the last tx using the CCgetspenttxid function. The last tx should be the second accept tx, but as it was sent to the node1 and has not yet reached the node2, CCgetspenttxid returns the first agreement tx. So the dispute tx is created spending the agreement tx again. I send the dispute tx to the node2. As the second accept tx has not yet been relayed into the node2 mempool, the dispute tx gets into the node2 mempool with no problem. When the dispute tx is relayed to the node1 it is rejected there as its inputs are already spent. But when the second accept tx is mined and synced in the block to node2, it turns out that the second accept tx and the dispute tx both spend the same agreement tx.
So we should be careful with mempool: don't trust mempool txns if there is important data in such txns, as they could be double spent (I think we could trust txns in mempool if these are game txns or the like and losing them does not affect anything seriously).
If you spend a utxo with a tx and, while this tx is in mempool, try to spend the same utxo with yet another tx, then on sending the second tx to the chain you will receive a -25 error with no message (in the test-eval-nft branch an error message 'tx replacement in mempool not allowed' was added).
If a coinbase (without maturity checking) is added to a tx and a reorg occurs, this coinbase might become a bad input (if its block becomes orphaned). I saw this on a single mining node after the node's failure and restart: the block became orphaned.
The qa test test_channels.py runs for a very long time because it waits for 2 confirmations after most of the test txns, while blocks are generated very slowly on my VPC (perhaps because of high difficulty in the bootstrapped test chain).
do make clean
and do ./autogen.sh; ./configure
run make test
if errors, note that python test_secp256k1.py may require missing secp256k1 module
Install it with .env/bin/pip install secp256k1
(maybe first you'll need pip3 python3.x-dev but not sure)
It seems this happened when I built the dependencies manually by running make. Apparently libcurl was built with a wrong option requiring libidn2. The error was resolved when I removed all libcurl.a files, including intermediate libcurl.a files (it is even simpler to recreate the repo).
config.log shows that conftest file must be compiled with c++17 option. Changing the test function __gmpn_sub_n fixed that:
old: AC_CHECK_LIB([gmp],[[__gmpn_sub_n]],GMP_LIBS=-lgmp, [AC_MSG_ERROR(libgmp missing)])
- failed
fixed: AC_CHECK_LIB([gmp],[__gmpn_sub_n],GMP_LIBS=-lgmp, [AC_MSG_ERROR(libgmp missing)])
- worked
The Digital Signature Trust certificate expired on 30 Sept 2021. Remove it and update the certs: https://stackoverflow.com/questions/69387175/git-for-windows-ssl-certificate-problem-certificate-has-expired
I made a 1of2 threshold with two simple eval conditions A and B. Condition A always returns false and B always returns true. To spend such a 1of2 threshold you should construct the scriptSig fulfilment in the correct order: you must put the good condition first in the fulfilment, in this case B. So B would be an eval condition and A would be an anon condition (as this is 1of2, only one condition is needed in a non-anon state, and it must be at the beginning of the condition array).
Note that I could not simply create a 1of2 of two eval conditions: the lib did not allow this. I had to add at least one more secp256k1 condition.
When I changed the cryptoconditions sources and ran ./zcutils/build-mac.sh (in the tokel repo) I got an error 'g++-8: error: cryptoconditions/.libs/libcryptoconditions_core.a: No such file or directory'. I could not find such a directory in any Makefile though. It looks like this '/.libs/' subdirectory is used by default when a subdir is defined in Makefile.am, so the build looks there for .la files (maybe --prefix can change this). The problem was cured by moving 'cryptoconditions' from DIST_SUBDIRS to SUBDIRS in Makefile.am (I found this fix in the komodo research-new branch):
DIST_SUBDIRS = secp256k1 univalue
SUBDIRS = cryptoconditions
(Seems DIST_SUBDIRS does not invoke rebuild in some cases and SUBDIRS does this better)
When I started a chain of three nodes, I ran mining on node1 and ran a test which sent txns to the node3 mempool. For some reason the txns did not reach the other nodes' mempools. Excerpt from the log:
DNS seeding disabled
net thread start
addcon thread start
opencon thread start
init message: Done loading
msghand thread start
nLocalServices 70000005 1, 1
receive version message: /komodod:0.6.0(tokeld:0.3.1)/: version 170010, blocks=489, us=0.0.0.0:0, peer=1
Added time data, samples 1, offset +0 (+0 minutes)
AddToWallet 6a29b8dd9388011d64fdbbd2566134633dcac26bbe4f2ea8e14735fffda25f16 new
AddToWallet 68e1b0da5d2f5df0020c5dce07bbc1ea872102cbe748078b879e967124274953 new
AddToWallet a2e8dcc633cc605dd901362430c62d076f9aa00f5210babfdf49336ae3e02c7f new
AddToWallet fc47d0080279772d67c80e88ee641038419891cf4141dd658ce2d1a37098ac92 new
AddToWallet c5a7a92ec3a263672c816d3a8cdb73eab9ccbb6e0021718aee2c548f9bb303cd new
AddToWallet 72146a7ff7373cf78e78f1e2637304e6152fe28275703555b7da526bd13070fa new
UpdateTip: new best=c136efb58aee7afbb4df8d76944f0f19bc142ffaa50b84cc70da20af8b9ef7ce height=490 log2_work=9.0416592 log2_stake=-inf tx=1124 date=2022-05-07 11:51:54 progress=1.000000 cache=0.0MiB(17tx)
Leaving InitialBlockDownload (latching to false)
AddToWallet 6a29b8dd9388011d64fdbbd2566134633dcac26bbe4f2ea8e14735fffda25f16
AddToWallet 68e1b0da5d2f5df0020c5dce07bbc1ea872102cbe748078b879e967124274953
AddToWallet a2e8dcc633cc605dd901362430c62d076f9aa00f5210babfdf49336ae3e02c7f
AddToWallet fc47d0080279772d67c80e88ee641038419891cf4141dd658ce2d1a37098ac92
AddToWallet c5a7a92ec3a263672c816d3a8cdb73eab9ccbb6e0021718aee2c548f9bb303cd
AddToWallet 72146a7ff7373cf78e78f1e2637304e6152fe28275703555b7da526bd13070fa
Adding fixed seed nodes.
We can see that at least one node had sent its version, so it was connected. This is unclear because even resendwallettransactions did not trigger tx propagation; it was triggered only after the node3 restart. Maybe the reason was that the txns came before leaving InitialBlockDownload (so this status was checked somewhere in SendMessages and this prevented sending inv messages)? But why did calling resendwallettransactions later not help? Strange... Repeating the test so that InitialBlockDownload never occurred did not reproduce the problem either (txns were propagated). Sending txns while a node was not connected and then calling resendwallettransactions after the node reconnected worked well (in the failed test resendwallettransactions did not trigger relaying). That is, I could not reproduce this; probably a bug or some complex node state was the reason...
When I once ran my split.sh on the 3p node it received an error 'couldnt create duplicates tx' from iguana splitting "splitfunds" rpc. This was on the MCL chain. What I discovered from that:
- It happened at the beginning when initially I had only one utxo of 0.5 MCL in the wallet (txid="381a13", nvout=4).
- I called my split3p.sh and got the error 'couldnt create duplicates tx'
- checked mempool and found hundreds of txns in it
- among them there was a tx with txid="7dcece" which spent the utxo 381a13/4
- at this moment listunspent showed an empty list and listaddressgroupings showed 0.49980000 amount
- I restarted the komodod to clear the mempool. After the restart mempool became empty and I could run splitting for MCL. It created a tx with txid=62730a
- The tx with txid="62730a" in mempool was never mined in the end. This tx spent the 381a13/4 utxo. (Apparently it got an 'inputs already spent' error on other nodes as 7dcece spent the same utxo).
- Running split3p.sh again produced the 'couldnt create duplicates tx' error for marmara.
- After yet another komodod restart the tx 62730a was cleared and I could again execute split3p.sh for marmara without an error. However this needs to be resolved as it was blocking notarisation and required manual intervention to clear.
- Study revealed that 7dcece was in the chain and 62730a was not.
The question is why 7dcece stayed in mempool (although it must have been removed by the incoming block on sync) and why, after the 1st restart, this tx probably was not in the chain, as it did not prevent creating another tx, 62730a, spending the same utxo.
Maybe it was sync issues on the node?
Happily I called getinfo before the 1st restart. It showed the value "blocks": 1262788,
and for the tx 7dcece "height": 1265400. So indeed the node was not fully synced. Apparently we have sync problems in marmara. But I think I had the same situation on the RICK or MORTY chain, so it was not just on marmara. Also, the marmara debug.log contains many records like this: ERROR: ContextualCheckBlockHeader: forked chain 1266463 older than last notarized (height 1266846) vs
So maybe the reason was that not all peers were properly upgraded?
Update: of RICK and MORTY, I could see the 'forked chain' error only on MORTY, and the reason is one and the same node sending bad headers.
Some hints on merging outdated branches with conflicts, using SourceTree:
When you merge an old branch and have conflicts in a source file, do not use the option 'Resolve Conflicts using theirs' by default.
Here is a situation: you have fixes in a fixpak1 branch growing from the dev branch. The fixpak1 branch has not been merged (rebased) for some time, so the dev branch has also been modified, and other new features and fixes have been added there from other branches during this time.
So when you decide to merge fixpak1 you see conflicts. Do not resolve the conflicts automatically with the 'theirs' option, and do not apply this option after you have manually resolved the conflicts. If you do this you could lose those new fixes and features added to the dev branch after you created fixpak1. You may not even see how your new features are overwritten in the SourceTree conflicts view: by default SourceTree shows changes with the 'Diff vs parent' option. I assume it is actually the 'git diff dev-last-commit' command, which shows the changes in the working tree vs the last commit in the dev branch, that is, what git merge could not merge itself. So you won't see the other new changes in files in the dev branch. If you choose 'Resolve conflicts by theirs' those new changes may be overwritten by the old code from the fixpak1 branch, and this might not even be marked as changes in the default SourceTree view.
To check the other code changes, select the 'Diff vs merged' view option and you will see the new changes with respect to your old fixpak1 code (I think this actually calls 'git diff fixpak1-last-commit' and basically shows both the new changes in the dev branch and the merge results). If you are okay with both views just use the 'Mark file as resolved' option, or make any changes you believe are correct and then still use the 'Mark file as resolved' option.
Simply put, if you choose 'Resolve by theirs' the merge process will consider the fixpak1 files as current and just replace the dev content with the fixpak1 files (with your corrections if you made any).
Thus you will avoid surprising build errors caused by your new changes being overwritten by code from old branches without you even noticing.
Obvious recommendation: try to build before you use 'Mark as resolved' for a file.
When komodod is stopped by 'komodo-cli stop' it often takes maybe a few minutes before the process actually leaves memory. Tracing showed that the delay occurs even after the final message 'Shutdown: done' in debug.log. Debugging and added printing showed that the delay happens in the ~CMainCleanup() destructor cleaning the big mapBlockIndex object (3 mln entries at the present moment). I think the delay would be even bigger if the PC had been in sleep mode while komodod was running.
I tried to create a zero tx fee transaction (by disabling some code adding the fee in wallet.cpp) and send it in a two-node testnet. Here is a trace from AcceptToMemoryPool():
AcceptToMemoryPool port=17771 case0 nFees=0 GetMinRelayFee(tx, nSize, true)=0 fLimitFree=0
AcceptToMemoryPool port=17771 case1 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 !AllowFree=0
AcceptToMemoryPool port=17771 case2 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 fLimitFree=0
AcceptToMemoryPool port=18881 case0 nFees=0 GetMinRelayFee(tx, nSize, true)=0 fLimitFree=1
AcceptToMemoryPool port=18881 case1 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 !AllowFree=0
AcceptToMemoryPool port=18881 case2 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 fLimitFree=1
AcceptToMemoryPool port=18881 case2.1 dFreeCount=0 GetArg("-limitfreerelay", 15)*10*1000=150000 fLimitFree=1
AcceptToMemoryPool port=17771 case0 nFees=0 GetMinRelayFee(tx, nSize, true)=0 fLimitFree=0
AcceptToMemoryPool port=17771 case1 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 !AllowFree=1
AcceptToMemoryPool port=17771 case2 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 fLimitFree=0
AcceptToMemoryPool port=18881 case0 nFees=0 GetMinRelayFee(tx, nSize, true)=0 fLimitFree=1
AcceptToMemoryPool port=18881 case1 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 !AllowFree=1
AcceptToMemoryPool port=18881 case2 nFees=0 ::minRelayTxFee.GetFee(nSize)=19 fLimitFree=1
AcceptToMemoryPool port=18881 case2.1 dFreeCount=185.042 GetArg("-limitfreerelay", 15)*10*1000=150000 fLimitFree=1
The komodod AcceptToMemoryPool function contains a rate limiter whose purpose is to limit the throughput (in bytes) of low-fee transaction data.
The rate limiter is implemented by a function that is close to the exponential decay function.
An exponential decay function decreases by the same ratio over any time period of the same length, f.e. P(t0+600sec) = P(t9+60sec), where t9 = t0+540sec, P(t9+60sec) = F(P(t9),60), P(t9) = F(P(t8),60), ... down to t0.
The decay rate is defined by a 10 min window (the 'tau' const), which means that the value P(t0+10min) is P(t0) / 2.718, where 2.718 is Euler's number e.
By default the rate limiter throughput param (-limitfreerelay) is set to 15, which means 15 kB/min.
The actual rate limiter formula is P(t) = P0 * (1 - r)^t (which is close to the true decay function P(t) = P0 * e^(-l*t), where l is 'lambda', the decay rate, equal to 1/'tau').
In our formula r is 1/600 per second.
The rate limiter uses the time difference between consecutive low-fee transaction attempts as the t (time) value in the formula.
The rate limiter code calculates the upper limit for P(t) as 15 * 10 * 1000 = 150,000 bytes of transaction size and drops a transaction if the current P(t) is over that number.
How is this related to the throughput of 15 kB/min?
Let's calculate P1 = P(t+1 sec) for P0 = 150,000. It is equal to 149,750, which makes the difference P0 - P1 equal to 250, that is, 250 bytes/sec, which equals our param of 15 kbytes/min.
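The arithmetic above can be checked with a small Python model of the limiter. This is a sketch, not the komodod source: the class and method names are hypothetical, and the order of operations (decay, then compare with the limit, then add the tx size) and the `>=` comparison are assumptions taken from the upstream Bitcoin code that komodod derives from.

```python
import math

DECAY_PER_SECOND = 1.0 - 1.0 / 600.0      # (1 - r) with r = 1/600 per second
LIMIT_FREE_RELAY = 15                     # default -limitfreerelay value
THRESHOLD = LIMIT_FREE_RELAY * 10 * 1000  # 150,000 bytes


class FreeRelayLimiter:
    """Toy model of the low-fee tx rate limiter (names are hypothetical)."""

    def __init__(self):
        self.free_count = 0.0  # plays the role of dFreeCount in the trace
        self.last_time = 0     # time of the previous low-fee attempt, seconds

    def try_accept(self, now, tx_size):
        # Decay the counter by the time passed since the previous attempt.
        self.free_count *= DECAY_PER_SECOND ** (now - self.last_time)
        self.last_time = now
        if self.free_count >= THRESHOLD:  # assumed comparison, see lead-in
            return False                  # the tx is dropped
        self.free_count += tx_size
        return True


# One second of decay at the limit frees about 250 bytes (15 kB/min).
drained_per_second = THRESHOLD - THRESHOLD * DECAY_PER_SECOND
print(round(drained_per_second, 6))

# The discrete formula is close to the true exponential decay over 10 min.
discrete_10min = DECAY_PER_SECOND ** 600
continuous_10min = math.exp(-1.0)
print(discrete_10min, continuous_10min)
```

In this model a sender that keeps the counter at the limit can push through roughly 250 bytes of low-fee txns per second, which is where the 15 kB/min figure comes from.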
If you need to remove spent transactions from your wallet (for performance reasons) but your blockchain does not have cleanwallettransactions rpc see this note: https://gist.github.com/DeckerSU/e94386556a7a175f77063e2a73963742
The IsFinalTx() function ensures that the tx could be placed into the chain. This is true if:
- tx.nLockTime == 0
- or, tx.nLockTime <= current tip height or tx.nLockTime <= current tip blocktime
- or, all tx inputs are final, that is, their nSequence == UINT32_MAX (0xFFFFFFFF).
Now we can have OP_CLTV in some tx1 scriptPubKey, with an nLockTime param. OP_CLTV ensures that the spending tx2 is not placed into the chain before nLockTime.
That is, for OP_CLTV the interpreter checks that tx2.nLockTime is not less than the nLockTime in OP_CLTV. Also, when tx2 is placed into the chain the ContextualCheckBlock() function checks that tx2 is final (that is, tx2.nLockTime <= tip height or tip blocktime). But IsFinalTx may also be true if the nSequence of all tx2 inputs is set to 0xFFFFFFFF. In this latter case OP_CLTV would be bypassed. To fix this, the interpreter code for OP_CLTV also checks that the nSequence of the current tx2 input IS NOT set to 0xFFFFFFFF, so tx2 won't be allowed into the chain if tx2.nLockTime has an unsatisfying value.
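The interplay of the two checks can be sketched in Python. This models the upstream Bitcoin logic (IsFinalTx and the BIP65 rules) as I understand it; the function names are hypothetical and komodod may differ in details. Here `block_height` is the height of the block that would include the tx (tip height + 1), which is why the strict `<` in the code matches the `<=` against the tip height in the text.

```python
LOCKTIME_THRESHOLD = 500_000_000  # below: a block height, otherwise a unix time
SEQUENCE_FINAL = 0xFFFFFFFF


def is_final_tx(lock_time, input_sequences, block_height, block_time):
    """Model of IsFinalTx: may the tx go into a block at this height/time?"""
    if lock_time == 0:
        return True
    limit = block_height if lock_time < LOCKTIME_THRESHOLD else block_time
    if lock_time < limit:
        return True
    # Not yet unlocked: still final if every input opts out of nLockTime.
    return all(seq == SEQUENCE_FINAL for seq in input_sequences)


def check_cltv(script_lock_time, tx_lock_time, input_sequence):
    """Model of the OP_CLTV checks for the input being verified."""
    # Both values must be of the same kind (height vs time).
    if (script_lock_time < LOCKTIME_THRESHOLD) != (tx_lock_time < LOCKTIME_THRESHOLD):
        return False
    # The spending tx must be locked at least as long as the script demands.
    if script_lock_time > tx_lock_time:
        return False
    # A final input would make IsFinalTx ignore nLockTime, so reject it.
    if input_sequence == SEQUENCE_FINAL:
        return False
    return True


# tx2 locked until height 150, evaluated for a block at height 100:
print(is_final_tx(150, [0], 100, 0))               # not final yet
print(is_final_tx(150, [SEQUENCE_FINAL], 100, 0))  # final only via nSequence
print(check_cltv(150, 150, SEQUENCE_FINAL))        # but OP_CLTV rejects it
```

The second and third calls show the bypass and its fix together: a tx with all-final inputs passes IsFinalTx before its lock time, but the nSequence check in OP_CLTV stops it from spending the locked output.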