[WIP] Improve Index Structure
Closes: #2028
alexanderkiel committed Sep 11, 2024
1 parent 4a0683f commit 6192848
Showing 99 changed files with 4,546 additions and 518 deletions.
1 change: 1 addition & 0 deletions dev/blaze/dev/rocksdb.clj
@@ -21,6 +21,7 @@
(ac/supply-async #(rocksdb/compact-range! (index-kv-store) :resource-as-of-index))

(doseq [index [:search-param-value-index
:type-search-param-token-full-resource-index
:resource-value-index
:compartment-search-param-value-index
:compartment-resource-type-index
139 changes: 130 additions & 9 deletions docs/implementation/database.md
@@ -120,14 +120,27 @@ The `SystemStats` index keeps track of the total number of resources, and the nu

The indices not depending on `t` directly point to the resource versions by their content hash.

| Name | Key Parts | Value |
|-------------------------------------|----------------------------------------------------------------|-------|
| SearchParamValueResource | search-param, type, value, id, hash-prefix | - |
| ResourceSearchParamValue | type, id, hash-prefix, search-param, value | - |
| CompartmentSearchParamValueResource | comp-code, comp-id, search-param, type, value, id, hash-prefix | - |
| CompartmentResourceType | comp-code, comp-id, type, id | - |
| SearchParam | code, type | id |
| ActiveSearchParams | id | - |
| Name | Key Parts | Value | Since |
|-------------------------------------------|----------------------------------------------------------------|-------|------:|
| SearchParamValueResource | search-param, type, value, id, hash-prefix | - | |
| ResourceSearchParamValue | type, id, hash-prefix, search-param, value | - | |
| CompartmentSearchParamValueResource | comp-code, comp-id, search-param, type, value, id, hash-prefix | - | |
| CompartmentResourceType | comp-code, comp-id, type, id | - | |
| TypeSearchParamTokenFullResource | search-param, type, value, system, id, hash-prefix | - | 0.27 |
| TypeSearchParamTokenSystemResource | search-param, type, system, id, hash-prefix | - | 0.27 |
| TypeSearchParamReferenceCanonicalResource | search-param, type, url, version, id, hash-prefix | - | 0.27 |
| TypeSearchParamReferenceUrlResource | search-param, type, url, id, hash-prefix | - | 0.27 |
| TypeSearchParamReferenceLocalResource | search-param, type, ref-id, ref-type, id, hash-prefix | - | 0.27 |
| ResourceSearchParamTokenFull | type, id, hash-prefix, search-param, value, system | - | 0.27 |
| ResourceSearchParamTokenSystem | type, id, hash-prefix, search-param, system | - | 0.27 |
| ResourceSearchParamReferenceCanonical | type, id, hash-prefix, search-param, url, version | - | 0.27 |
| ResourceSearchParamReferenceUrl | type, id, hash-prefix, search-param, url | - | 0.27 |
| ResourceSearchParamReferenceLocal | type, id, hash-prefix, search-param, ref-id, ref-type | - | 0.27 |
| PatientTypeSearchParamTokenFullResource | patient-id, search-param, type, value, system, id, hash-prefix | - | 0.27 |
| SearchParam | code, type | id | |
| ActiveSearchParams | id | - | |
| SearchParamCode | code | id | |
| System | code | id | |

#### SearchParamValueResource

@@ -137,7 +150,7 @@ The `SearchParamValueResource` index is used to find resources based on search p
* `type` - a 4-byte hash of the resource type
* `value` - the encoded value of the resource reachable by the search parameter's FHIRPath expression. The encoding depends on the search parameter's type.
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

How the `SearchParamValueResource` index is used depends on the type of the search parameter. The following sections explain this in detail for each type:

@@ -223,6 +236,102 @@ The `ResourceSearchParamValue` index is used to decide whether a resource contai
* `search-param` - a 4-byte hash of the search parameter's code used to identify the search parameter
* `value` - the encoded value of the resource reachable by the search parameter's FHIRPath expression. The encoding depends on the search parameter's type.

#### TypeSearchParamTokenFullResource

New index in v0.27.0. It is used to find resources by the full value of a search parameter of type token. A full value consists of the system plus the value (for Identifiers) or the code (for Codings). If the resource does not provide a system, the special value `0x000000` is used for the system.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `value` - the full code/value
* `system` - a 3-byte identifier of the system URI
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
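
As an illustration only, the following sketch concatenates the key parts listed above into one byte array using plain `java.nio.ByteBuffer`. The function names `put-3-byte-id!` and `token-full-key`, the big-endian byte order, the null terminator after `value`, and the UTF-8 encoding of `value` and `id` are assumptions made for this example. The actual encoding used by Blaze may differ.

```clojure
(import '[java.nio ByteBuffer]
        '[java.nio.charset StandardCharsets])

(defn put-3-byte-id!
  "Writes the lower 3 bytes of `id` in big-endian order (assumed layout)."
  [^ByteBuffer buf id]
  (let [id (long id)]
    (.put buf (unchecked-byte (bit-shift-right id 16)))
    (.put buf (unchecked-byte (bit-shift-right id 8)))
    (.put buf (unchecked-byte id))))

(defn token-full-key
  "Hypothetical helper: builds a key from the parts type, search-param,
  value, system, id and hash-prefix. A nil `system-id` is written as the
  special value 0x000000."
  [type-byte search-param-id ^String value system-id ^String id ^bytes hash-prefix]
  (let [value-bytes (.getBytes value StandardCharsets/UTF_8)
        id-bytes (.getBytes id StandardCharsets/UTF_8)
        buf (ByteBuffer/allocate
             (+ 1 3 (alength value-bytes) 1 3 (alength id-bytes) 4))]
    (.put buf (unchecked-byte type-byte))   ; one type byte
    (put-3-byte-id! buf search-param-id)    ; 3-byte search param identifier
    (.put buf value-bytes)                  ; full code/value
    (.put buf (byte 0))                     ; assumed null terminator
    (put-3-byte-id! buf (or system-id 0))   ; 3-byte system identifier
    (.put buf id-bytes)                     ; logical id of the resource
    (.put buf hash-prefix 0 4)              ; 4-byte content hash prefix
    (.array buf)))
```

Calling `(token-full-key 1 2 "a" nil "b" (byte-array 4))` produces a 14-byte key in which the three system bytes are all zero, which mirrors the "system not available" case described above.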

#### TypeSearchParamTokenSystemResource

New index in v0.27.0. It is used to find resources by the system alone of a search parameter of type token. If the resource does not provide a system, no index entry is written.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `system` - a 3-byte identifier of the system URI
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

#### TypeSearchParamReferenceCanonicalResource

New index in v0.27.0. It is used to find resources by the value of a search parameter of type reference if that value is a canonical URL.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - a 4-byte identifier of the canonical URL
* `version` - the full version
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
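
In FHIR, a canonical reference may carry a version after a `|` separator, which is why this index keeps `url` and `version` as separate key parts. A minimal sketch of that split follows. The function name `split-canonical` is hypothetical and the source does not show how Blaze itself parses the reference.

```clojure
(require '[clojure.string :as str])

(defn split-canonical
  "Splits a canonical reference of the form `url|version` into its parts.
  `:version` is nil if the reference carries no version."
  [canonical]
  (let [[url version] (str/split canonical #"\|" 2)]
    {:url url :version version}))

;; Example with made-up URLs:
(split-canonical "http://example.org/fhir/Library/foo|1.0")
(split-canonical "http://example.org/fhir/Library/foo")
```

The first call separates the version `1.0` from the URL, while the second yields a nil `:version`.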

#### TypeSearchParamReferenceUrlResource

New index in v0.27.0. It is used to find resources by the value of a search parameter of type reference if that value is a URL.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - the full url
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

#### TypeSearchParamReferenceLocalResource

New index in v0.27.0. It is used to find resources by the value of a search parameter of type reference if that value is a local reference.

* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `ref-id` - the logical id of the referenced resource
* `ref-type` - the type byte of the referenced resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version

#### ResourceSearchParamTokenFull

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `value` - the full code/value
* `system` - a 3-byte identifier of the system URI

#### ResourceSearchParamTokenSystem

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `system` - a 3-byte identifier of the system URI

#### ResourceSearchParamReferenceCanonical

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - a 4-byte identifier of the canonical URL
* `version` - the full version

#### ResourceSearchParamReferenceUrl

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `url` - the full url

#### ResourceSearchParamReferenceLocal

* `type` - the type byte of the resource type (one byte)
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
* `search-param` - a 3-byte identifier of the search parameter's code
* `ref-id` - the logical id of the referenced resource
* `ref-type` - the type byte of the referenced resource type (one byte)

#### CompartmentSearchParamValueResource

The `CompartmentSearchParamValueResource` index is used to find resources of a particular compartment based on search parameter values.
@@ -236,6 +345,18 @@ The `CompartmentResourceType` index is used to find all resources that belong to
* `type` - a 4-byte hash of the resource type of the resource that belongs to the compartment, ex. `Observation`
* `id` - the logical id of the resource that belongs to the compartment, ex. the logical id of the Observation

#### PatientTypeSearchParamTokenFullResource

New index in v0.27.0. It is used to find resources by the full value of a search parameter of type token. A full value consists of the system plus the value (for Identifiers) or the code (for Codings). If the resource does not provide a system, the special value `0x000000` is used for the system.

* `patient-id` - the logical id of the patient
* `type` - the type byte of the resource type (one byte)
* `search-param` - a 3-byte identifier of the search parameter's code
* `value` - the full code/value
* `system` - a 3-byte identifier of the system URI
* `id` - the logical id of the resource
* `hash-prefix` - a 4-byte prefix of the content-hash of the resource version
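
One plausible reading of placing `patient-id` first is that all entries of one patient become contiguous in key order, so a lookup can be served by a prefix scan over an ordered key-value store. The sketch below uses a Clojure `sorted-set` as a stand-in for such a store. The string keys with `/` separators, the sample data and the name `scan-prefix` are illustrative assumptions, not the byte encoding Blaze uses.

```clojure
(require '[clojure.string :as str])

(def toy-index
  "Toy keys of the form patient-id/type/search-param/value/id."
  (sorted-set
   "p1/Condition/code/I10/c1"
   "p1/Observation/code/8462-4/o2"
   "p1/Observation/code/8480-6/o1"
   "p2/Observation/code/8480-6/o3"))

(defn scan-prefix
  "Seeks to the first key that is >= `prefix` and returns keys for as long
  as they still start with `prefix`."
  [index prefix]
  (take-while #(str/starts-with? % prefix)
              (subseq index >= prefix)))

;; All Observation code entries of patient p1, in key order:
(scan-prefix toy-index "p1/Observation/code/")
```

The scan stops at the first key of another patient, so entries of `p2` are never visited in this toy setup.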

#### ActiveSearchParams

Currently not used.
42 changes: 7 additions & 35 deletions modules/admin-api/test/blaze/admin_api_test.clj
@@ -3,7 +3,7 @@
[blaze.admin-api :as admin-api]
[blaze.async.comp :as ac :refer [do-sync]]
[blaze.db.api :as d]
[blaze.db.api-stub]
[blaze.db.api-stub :as api-stub]
[blaze.db.impl.index.patient-last-change :as plc]
[blaze.db.kv :as-alias kv]
[blaze.db.kv.rocksdb :as rocksdb]
@@ -97,50 +97,21 @@
{:dir (str dir "/index")
:block-cache (ig/ref ::rocksdb/block-cache)
:column-families
{:search-param-value-index
{:write-buffer-size-in-mb 1
:max-write-buffer-number 1
:max-bytes-for-level-base-in-mb 1
:target-file-size-base-in-mb 1}
:resource-value-index nil
(assoc
api-stub/index-kv-store-column-families
:compartment-search-param-value-index
{:write-buffer-size-in-mb 1
:max-write-buffer-number 1
:max-bytes-for-level-base-in-mb 1
:target-file-size-base-in-mb 1}
:compartment-resource-type-index nil
:active-search-params nil
:tx-success-index {:reverse-comparator? true}
:tx-error-index nil
:t-by-instant-index {:reverse-comparator? true}
:resource-as-of-index nil
:type-as-of-index nil
:system-as-of-index nil
:patient-last-change-index
{:write-buffer-size-in-mb 1
:max-write-buffer-number 1
:max-bytes-for-level-base-in-mb 1
:target-file-size-base-in-mb 1}
:type-stats-index nil
:system-stats-index nil
:cql-bloom-filter nil
:cql-bloom-filter-by-t nil}}
:target-file-size-base-in-mb 1})}

[::kv/mem :blaze.db.admin/index-kv-store]
{:column-families
{:search-param-value-index nil
:resource-value-index nil
:compartment-search-param-value-index nil
:compartment-resource-type-index nil
:active-search-params nil
:tx-success-index {:reverse-comparator? true}
:tx-error-index nil
:t-by-instant-index {:reverse-comparator? true}
:resource-as-of-index nil
:type-as-of-index nil
:system-as-of-index nil
:type-stats-index nil
:system-stats-index nil}}
{:column-families api-stub/index-kv-store-column-families}

::rs/kv
{:kv-store (ig/ref :blaze.db/resource-kv-store)
@@ -169,7 +140,8 @@
[:blaze.db.node.resource-indexer/executor :blaze.db.node.resource-indexer.admin/executor] {}

:blaze.db/search-param-registry
{:structure-definition-repo structure-definition-repo}
{:kv-store (ig/ref :blaze.db.main/index-kv-store)
:structure-definition-repo structure-definition-repo}

::rocksdb/block-cache {:size-in-mb 1}

25 changes: 17 additions & 8 deletions modules/byte-buffer/src/blaze/byte_buffer.clj
@@ -127,6 +127,13 @@
[byte-buffer position]
(.position ^ByteBuffer byte-buffer (int position)))

(defn inc-position!
{:inline
(fn [byte-buffer amount]
`(set-position! ~byte-buffer (unchecked-add-int (position ~byte-buffer) (int ~amount))))}
[byte-buffer amount]
(set-position! byte-buffer (+ (position byte-buffer) (long amount))))

(defn remaining
"Returns the number of elements between the current position and the limit."
{:inline
@@ -236,19 +243,21 @@
[byte-buffer]
(when (pos? (remaining byte-buffer))
(mark! byte-buffer)
(loop [byte (bit-and (long (get-byte! byte-buffer)) 0xFF)
(loop [byte (long (get-byte! byte-buffer))
size 0]
(cond
(zero? byte)
(if (zero? byte)
(do (reset! byte-buffer)
size)

(pos? (remaining byte-buffer))
(recur (bit-and (long (get-byte! byte-buffer)) 0xFF) (inc size))
(if (zero? (remaining byte-buffer))
(do (reset! byte-buffer)
nil)
(recur (long (get-byte! byte-buffer)) (inc size)))))))

:else
(do (reset! byte-buffer)
nil)))))
(defn skip-null-terminated! [byte-buffer]
(if-let [size (size-up-to-null byte-buffer)]
(set-position! byte-buffer (+ (position byte-buffer) (long size) 1))
(throw (Exception. "Can't skip null terminated byte sequence."))))

(defn mismatch
"Finds and returns the relative index of the first mismatch between `a` and
4 changes: 4 additions & 0 deletions modules/byte-buffer/src/blaze/byte_buffer_spec.clj
@@ -65,3 +65,7 @@
(s/fdef bb/size-up-to-null
:args (s/cat :byte-buffer byte-buffer?)
:ret (s/nilable nat-int?))

(s/fdef bb/skip-null-terminated!
:args (s/cat :byte-buffer byte-buffer?)
:ret byte-buffer?)
8 changes: 7 additions & 1 deletion modules/byte-string/Makefile
@@ -10,6 +10,12 @@ test:
test-coverage:
clojure -M:test:coverage

deps-tree:
clojure -X:deps tree

deps-list:
clojure -X:deps list

cloc-prod:
cloc src

@@ -19,4 +25,4 @@ cloc-test:
clean:
rm -rf .clj-kondo/.cache .cpcache target

.PHONY: fmt lint test test-coverage cloc-prod cloc-test clean
.PHONY: fmt lint test test-coverage deps-tree deps-list cloc-prod cloc-test clean
9 changes: 5 additions & 4 deletions modules/byte-string/src/blaze/byte_string.clj
@@ -64,14 +64,15 @@

(defn from-byte-buffer-null-terminated!
"Returns the bytes from `byte-buffer` up to (exclusive) a null byte (0x00) as
byte string ot nil if `byte-buffer` doesn't include a null byte.
byte string or nil if `byte-buffer` doesn't include a null byte.
Increments the position of `byte-buffer` up to including the null byte."
[byte-buffer]
(when-let [size (bb/size-up-to-null byte-buffer)]
(if-let [size (bb/size-up-to-null byte-buffer)]
(let [bs (from-byte-buffer! byte-buffer size)]
(bb/get-byte! byte-buffer)
bs)))
(bb/set-position! byte-buffer (inc (bb/position byte-buffer)))
bs)
(throw (Exception. "Can't read null terminated byte string."))))

(defn from-hex [s]
(ByteString/copyFrom (.decode (BaseEncoding/base16) s)))
2 changes: 1 addition & 1 deletion modules/byte-string/src/blaze/byte_string_spec.clj
@@ -27,7 +27,7 @@

(s/fdef bs/from-byte-buffer-null-terminated!
:args (s/cat :byte-buffer byte-buffer?)
:ret (s/nilable bs/byte-string?))
:ret bs/byte-string?)

(s/fdef bs/from-hex
:args (s/cat :s string?)
8 changes: 5 additions & 3 deletions modules/byte-string/test/blaze/byte_string_test.clj
@@ -3,7 +3,7 @@
[blaze.byte-buffer :as bb]
[blaze.byte-string :as bs]
[blaze.byte-string-spec]
[blaze.test-util :as tu :refer [ba bb bytes=]]
[blaze.test-util :as tu :refer [ba bb bytes= given-thrown]]
[clojure.spec.test.alpha :as st]
[clojure.test :as test :refer [are deftest is testing]]))

@@ -50,10 +50,12 @@

(deftest from-byte-buffer-null-terminated-test
(testing "empty byte buffer"
(is (nil? (bs/from-byte-buffer-null-terminated! (bb)))))
(given-thrown (bs/from-byte-buffer-null-terminated! (bb))
:message := "Can't read null terminated byte string."))

(testing "one non-null byte"
(is (nil? (bs/from-byte-buffer-null-terminated! (bb 0x01)))))
(given-thrown (bs/from-byte-buffer-null-terminated! (bb 0x01))
:message := "Can't read null terminated byte string."))

(testing "one null byte"
(let [bb (bb 0x00)]
6 changes: 5 additions & 1 deletion modules/db-protocols/src/blaze/db/impl/protocols.clj
@@ -99,7 +99,9 @@
(-pull-many [pull resource-handles] [pull resource-handles elements]))

(defprotocol SearchParam
(-of-type [search-param tb])
(-compile-value [search-param modifier value] "Can return an anomaly.")
(-compile-value-composite [search-param modifier value] "Can return an anomaly.")
(-chunked-resource-handles
[search-param batch-db tid modifier compiled-value])
(-resource-handles
@@ -113,6 +115,7 @@
(-compartment-keys [search-param context compartment tid compiled-value])
(-matcher [_ batch-db modifier values])
(-compartment-ids [_ resolver resource])
(-index-entries [_ resolver linked-compartments hash resource])
(-index-values [_ resolver resource])
(-index-value-compiler [_]))

@@ -123,4 +126,5 @@
(-list-by-type [_ type])
(-list-by-target [_ target])
(-linked-compartments [_ resource])
(-compartment-resources [_ compartment-type] [_ compartment-type type]))
(-compartment-resources [_ compartment-type] [_ compartment-type type])
(-tb [_ type]))