Edge1 #3
There is a bug when saving tap data as a sample in a pipeline. The cause of this issue is a 400 error from the pipeline-service when the payload contains one or more fields with an empty string as a value. OTLP Logs, Metrics and Traces can have empty strings in attributes, scope and other fields. As a solution:
1. Filter all the empty string fields out of `tags`
2. Replace all the empty strings with `Null` for all other fields.
Ref: LOG-18908
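The two rules above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `Value` enum is a simplified stand-in for Vector's event value type, and the function names `replace_empty_with_null` and `strip_empty_tags` are hypothetical.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for an event value (assumption: not Vector's real type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Text(String),
    Map(BTreeMap<String, Value>),
    List(Vec<Value>),
}

/// Rule 2: outside of `tags`, every empty string becomes `Null`.
/// Any map entry named `tags` is handed to `strip_empty_tags` instead.
fn replace_empty_with_null(value: Value) -> Value {
    match value {
        Value::Text(s) if s.is_empty() => Value::Null,
        Value::Map(map) => Value::Map(
            map.into_iter()
                .map(|(key, inner)| {
                    let cleaned = if key == "tags" {
                        strip_empty_tags(inner)
                    } else {
                        replace_empty_with_null(inner)
                    };
                    (key, cleaned)
                })
                .collect(),
        ),
        Value::List(items) => {
            Value::List(items.into_iter().map(replace_empty_with_null).collect())
        }
        other => other,
    }
}

/// Rule 1: inside `tags`, fields whose value is an empty string are dropped.
/// Only the top level of the tags map is inspected in this sketch.
fn strip_empty_tags(tags: Value) -> Value {
    match tags {
        Value::Map(map) => Value::Map(
            map.into_iter()
                .filter(|(_, inner)| !matches!(inner, Value::Text(s) if s.is_empty()))
                .collect(),
        ),
        other => other,
    }
}
```

Dropping keys from `tags` while keeping keys elsewhere (with a `Null` value) preserves the shape of the rest of the payload, which is presumably why the two cases are treated differently.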
fix(otlp): replace or remove fields with an empty string
## [3.1.1](answerbook/vector@v3.1.0...v3.1.1) (2024-01-22)
### Bug Fixes
* **otlp**: replace or remove fields with an empty string [db87922](answerbook/vector@db87922) - Sergey Opria [LOG-18908](https://logdna.atlassian.net/browse/LOG-18908)
### Miscellaneous
* Merge pull request #392 from answerbook/sopria/LOG-18908 [a599fdd](answerbook/vector@a599fdd) - GitHub [LOG-18908](https://logdna.atlassian.net/browse/LOG-18908)
summary: bumping vrl to include new mezmo groks for psql, golang, and elastic search logs ref: LOG-18993 ref: LOG-18994 ref: LOG-18999
…9-add_groks chore: bump vrl dep to 0.12.0
## [3.1.2](answerbook/vector@v3.1.1...v3.1.2) (2024-01-23)
### Chores
* bump vrl depey to rev=v0.12.0 [6d08598](answerbook/vector@6d08598) - dominic-mcallister-logdna [LOG-18993](https://logdna.atlassian.net/browse/LOG-18993) [LOG-18994](https://logdna.atlassian.net/browse/LOG-18994) [LOG-18999](https://logdna.atlassian.net/browse/LOG-18999)
### Miscellaneous
* Merge pull request #401 from answerbook/dominic/LOGs-18993_18994_18999-add_groks [edc9090](answerbook/vector@edc9090) - GitHub
Add limits to sliding aggregate processor.
1. Cardinality limit: limits the number of unique keys that can be stored
2. Window limit: limits the number of sliding windows for a given key
3. Window minimum size: restricts the number of new windows that are allocated for a given key.
Ref: LOG-18818
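One way to picture how the three limits interact is the sketch below. It is not the processor's implementation: all type and field names are hypothetical, the aggregate is a plain sum, and the policies chosen here (reject new keys at the cardinality limit, evict the oldest window at the window limit, and only open a new window once the newest one is at least `min_window_ms` old) are assumptions made for illustration.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Debug, PartialEq)]
enum Outcome {
    Recorded,
    RejectedNewKey,
}

#[derive(Clone, Copy)]
struct Limits {
    max_keys: usize,            // cardinality limit
    max_windows_per_key: usize, // window limit
    min_window_ms: u64,         // window minimum size
    window_ms: u64,             // length of each sliding window
}

struct Window {
    start_ms: u64,
    end_ms: u64, // exclusive
    sum: f64,
}

struct SlidingAggregate {
    limits: Limits,
    windows: HashMap<String, VecDeque<Window>>,
}

impl SlidingAggregate {
    fn new(limits: Limits) -> Self {
        Self { limits, windows: HashMap::new() }
    }

    fn record(&mut self, key: &str, value: f64, now_ms: u64) -> Outcome {
        let limits = self.limits;

        // Cardinality limit: do not start tracking a new key once the map is full.
        if !self.windows.contains_key(key) && self.windows.len() >= limits.max_keys {
            return Outcome::RejectedNewKey;
        }
        let windows = self.windows.entry(key.to_string()).or_default();

        // Add the value to every existing window that covers `now_ms`.
        for window in windows.iter_mut() {
            if window.start_ms <= now_ms && now_ms < window.end_ms {
                window.sum += value;
            }
        }

        // Window minimum size: only open a new window when the newest one
        // started at least `min_window_ms` ago.
        let may_allocate = match windows.back() {
            Some(newest) => now_ms >= newest.start_ms + limits.min_window_ms,
            None => true,
        };
        if may_allocate {
            // Window limit: evict the oldest window before adding another.
            if windows.len() >= limits.max_windows_per_key {
                windows.pop_front();
            }
            windows.push_back(Window {
                start_ms: now_ms,
                end_ms: now_ms + limits.window_ms,
                sum: value,
            });
        }
        Outcome::Recorded
    }
}
```

Taken together, the limits bound memory at roughly `max_keys * max_windows_per_key` windows, and the minimum size keeps a burst of events from allocating one window per event.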
## [3.1.3](answerbook/vector@v3.1.2...v3.1.3) (2024-01-23)
### Code Refactoring
* Add memory and window limits to sliding aggregate (#400) [aa050e2](answerbook/vector@aa050e2) - GitHub [LOG-18818](https://logdna.atlassian.net/browse/LOG-18818)
…#18922)
* feat(http_server source): add all headers to the namespace metadata
* feat(http_server source): allow wildcard matching in headers
* style: whitespace typo
* rework header glob matching, add docs and tests
* examples, docs, tests, error on misconfiguration
* fmt & clippy cleanup
* Generate docs
Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>
* docs grammar adjustment
* add some code docs
Ref: LOG-19103
---------
Signed-off-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>
Co-authored-by: Jesse Szwedko <jesse.szwedko@datadoghq.com>
Co-authored-by: neuronull <neuronull@pm.me>
When using `query_parameters` to specify which request parameters should be saved into the event, it was only saving the values. Ref: LOG-19104
Much like was done for headers, specifying which `query_parameters` to be saved into the event should support wildcards. Ref: LOG-19105
feat(sources): `http_server` accepts query parameter wildcards
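As a rough illustration of the behavior described in the two commits above (saving query parameters as key/value pairs and selecting them with wildcards), here is a standalone sketch. It does not reproduce the `http_server` source's actual matching code; `wildcard_match` and `select_query_params` are hypothetical helpers, and only the `*` wildcard is handled.

```rust
use std::collections::BTreeMap;

/// Returns true when `text` matches `pattern`, where `*` matches any run
/// of characters (including none). Other glob syntax is not supported here.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    let mut star: Option<usize> = None; // position of the last `*` seen
    let mut mark = 0; // text position to retry from after that `*`

    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Keeps each query parameter whose name matches any configured pattern,
/// storing it as a name -> value pair rather than the value alone.
fn select_query_params(patterns: &[&str], query: &[(&str, &str)]) -> BTreeMap<String, String> {
    query
        .iter()
        .filter(|(name, _)| patterns.iter().any(|p| wildcard_match(p, name)))
        .map(|(name, value)| (name.to_string(), value.to_string()))
        .collect()
}
```

With patterns `["user_*", "id"]`, a request carrying `user_id`, `user_name`, `page` and `id` would keep every parameter except `page`.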
This adds the PersistenceConnection trait and associated RocksDB implementation, along with the RocksDB dependency. Persistence is also wired into the `sliding_aggregate` transform such that state is initially loaded from the db (if it exists), and will be persisted back on a timer + on shutdown. Ref: LOG-18959
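The load-on-start, persist-on-flush flow might look something like the sketch below. Only the trait name `PersistenceConnection` comes from the commit message; its methods (`get`, `set`), the in-memory implementation used in place of RocksDB, the `AggregateState` type and the `key=count` text encoding are all assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical shape of the trait; the real method set is not shown in this PR text.
trait PersistenceConnection {
    fn get(&self, key: &str) -> Result<Option<String>, String>;
    fn set(&mut self, key: &str, value: &str) -> Result<(), String>;
}

/// In-memory stand-in for the RocksDB-backed implementation.
#[derive(Default)]
struct InMemoryConnection {
    data: HashMap<String, String>,
}

impl PersistenceConnection for InMemoryConnection {
    fn get(&self, key: &str) -> Result<Option<String>, String> {
        Ok(self.data.get(key).cloned())
    }

    fn set(&mut self, key: &str, value: &str) -> Result<(), String> {
        self.data.insert(key.to_string(), value.to_string());
        Ok(())
    }
}

/// Toy aggregate state: a counter per key.
struct AggregateState {
    counts: HashMap<String, u64>,
}

impl AggregateState {
    /// Loads prior state if the store has any; otherwise starts empty.
    fn load(conn: &dyn PersistenceConnection) -> Result<Self, String> {
        let mut counts = HashMap::new();
        if let Some(raw) = conn.get("state")? {
            for line in raw.lines() {
                if let Some((key, count)) = line.split_once('=') {
                    let count = count.parse::<u64>().map_err(|e| e.to_string())?;
                    counts.insert(key.to_string(), count);
                }
            }
        }
        Ok(Self { counts })
    }

    /// Writes the state back; intended to be called on a timer and on shutdown.
    fn persist(&self, conn: &mut dyn PersistenceConnection) -> Result<(), String> {
        let raw = self
            .counts
            .iter()
            .map(|(key, count)| format!("{}={}", key, count))
            .collect::<Vec<_>>()
            .join("\n");
        conn.set("state", &raw)
    }
}
```

Hiding the store behind a trait lets the transform be tested against an in-memory store like this one, with the database-backed implementation swapped in at runtime.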
# [3.2.0](answerbook/vector@v3.1.3...v3.2.0) (2024-01-24)
### Bug Fixes
* **sources**: `http_server` is not saving query params as key/val [733f3a9](answerbook/vector@733f3a9) - Darin Spivey [LOG-19104](https://logdna.atlassian.net/browse/LOG-19104)
### Features
* **http_server source**: add all headers to the namespace metadata (#18922) [3772b19](answerbook/vector@3772b19) - Darin Spivey [LOG-19103](https://logdna.atlassian.net/browse/LOG-19103)
* **sources**: `http_server` accepts query parameter wildcards [6627a95](answerbook/vector@6627a95) - Darin Spivey [LOG-19105](https://logdna.atlassian.net/browse/LOG-19105)
### Miscellaneous
* Merge pull request #403 from answerbook/darinspivey/LOG-19103 [896eaed](answerbook/vector@896eaed) - GitHub [LOG-19103](https://logdna.atlassian.net/browse/LOG-19103)
Adds libclang for building rocksdb in our release image. Ref: LOG-18959
This pulls the CI environment variable into a cfg option which can be used to control whether specific tests are run in CI. Ref: LOG-18959
feat(persistence): add state persistence implementation and wire into sliding_aggregate
# [3.3.0](answerbook/vector@v3.2.0...v3.3.0) (2024-01-26)
### Bug Fixes
* **build**: add libclang-dev dependency to the environment [cfa0321](answerbook/vector@cfa0321) - Mike Del Tito [LOG-18959](https://logdna.atlassian.net/browse/LOG-18959)
### Features
* **persistence**: define persistence trait/impl and wire into aggregate [85e5d1a](answerbook/vector@85e5d1a) - Mike Del Tito [LOG-18959](https://logdna.atlassian.net/browse/LOG-18959)
### Miscellaneous
* Merge pull request #402 from answerbook/mdeltito/LOG-18959 [a1ec760](answerbook/vector@a1ec760) - GitHub [LOG-18959](https://logdna.atlassian.net/browse/LOG-18959)
### Tests
* allow conditional compiling of tests under ci [67d6f7d](answerbook/vector@67d6f7d) - Mike Del Tito [LOG-18959](https://logdna.atlassian.net/browse/LOG-18959)
Adds a PVC and mount for component state storage, with sizing and storageclass defined in the global `resources` configmap. Ref: LOG-19044
In order for us to share a volume across all partitions (to start), each partition needs a separate database. This will also facilitate moving accounts across partitions in the future, since this process involves pipelines for an account running in multiple partitions for a brief period of time while buffers are drained and components are shut down. Since we still have the HPA in play, and since the pod name includes the partition name, the `POD_NAME` environment variable is used as the path component. Ref: LOG-19044
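A minimal sketch of deriving a per-pod database directory from `POD_NAME` is shown below. The function names, the idea of a configurable base directory, and the fallback to the bare base directory when the variable is unset are assumptions for illustration; only the use of `POD_NAME` as a path component comes from the commit message.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Builds the state database directory for this pod. When a pod name is
/// available it becomes the final path component, so pods sharing one
/// volume never open the same database.
fn state_db_path(base_dir: &Path, pod_name: Option<&str>) -> PathBuf {
    match pod_name {
        Some(name) if !name.is_empty() => base_dir.join(name),
        // Assumed fallback: use the base directory as-is (e.g. local runs).
        _ => base_dir.to_path_buf(),
    }
}

/// Convenience wrapper that reads `POD_NAME` from the environment.
fn state_db_path_from_env(base_dir: &Path) -> PathBuf {
    let pod_name = env::var("POD_NAME").ok();
    state_db_path(base_dir, pod_name.as_deref())
}
```

Keeping the pure function separate from the environment lookup makes the path logic testable without mutating process-wide state.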
feat(persistence): add pvc/mount for component state storage
# [3.4.0](answerbook/vector@v3.3.0...v3.4.0) (2024-01-26)
### Bug Fixes
* **persistence**: include pod name in db directory [a23107a](answerbook/vector@a23107a) - Mike Del Tito [LOG-19044](https://logdna.atlassian.net/browse/LOG-19044)
### Features
* **persistence**: add pvc/mount for component state storage [ff5929f](answerbook/vector@ff5929f) - Mike Del Tito [LOG-19044](https://logdna.atlassian.net/browse/LOG-19044)
### Miscellaneous
* Merge pull request #404 from answerbook/mdeltito/LOG-19044 [fdeb534](answerbook/vector@fdeb534) - GitHub [LOG-19044](https://logdna.atlassian.net/browse/LOG-19044)
Adds a namespace to the PVC, and marks the resource vars as optional. Ref: LOG-19044
Some changes do not require a re-run of the full integration suite, including changes to docs and changes to the deployment template.
fix(deployment): adjust resource vars and add namespace
## [3.4.1](answerbook/vector@v3.4.0...v3.4.1) (2024-01-26)
### Bug Fixes
* **deployment**: adjust resource vars and add namespace [8f8aff6](answerbook/vector@8f8aff6) - Mike Del Tito [LOG-19044](https://logdna.atlassian.net/browse/LOG-19044)
### Miscellaneous
* Merge pull request #405 from answerbook/mdeltito/LOG-19044-fix [c5b3474](answerbook/vector@c5b3474) - GitHub [LOG-19044](https://logdna.atlassian.net/browse/LOG-19044)
Previously the intent was to share a volume across all partitions to start. This, however, forces the pods for all partitions to be scheduled on the same node. Instead, this moves to a volume per partition by defining a claim template on the sts. In practice this will result in a volume per pod (with a nominal replica count of 1) in each sts. Resources are now also pulled from the partition definition in the `vector-partitions` configmap. Ref: LOG-19044
…tition fix(persistence): define pvc per-partition
## [3.4.2](answerbook/vector@v3.4.1...v3.4.2) (2024-01-29)
### Bug Fixes
* **persistence**: define pvc per-partition [f4f2b09](answerbook/vector@f4f2b09) - Mike Del Tito [LOG-19044](https://logdna.atlassian.net/browse/LOG-19044)
### Miscellaneous
* Merge pull request #407 from answerbook/mdeltito/LOG-19044-pvc-by-partition [21dc87a](answerbook/vector@21dc87a) - GitHub [LOG-19044](https://logdna.atlassian.net/browse/LOG-19044)
Update the cargo build dependencies for the vrl package to point to the Mezmo open source fork instead of a private repo. Ref: LOG-19155
This adds a (configurable) jitter amount for the flush interval, which will prevent a flood of disk IO when many components are started at nearly the same time. Ref: LOG-19171
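The effect of the jitter can be sketched as below. This is an illustration under stated assumptions rather than the committed code: the function names are hypothetical, the jitter is assumed to be purely additive, and the sample is derived from a hash of a component id so the example needs only the standard library (a real implementation may well draw from a random number generator instead).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::Duration;

/// Returns the flush interval plus a fraction of `max_jitter`.
/// `unit_sample` is expected in [0, 1] and is clamped to that range.
fn jittered_interval(base: Duration, max_jitter: Duration, unit_sample: f64) -> Duration {
    let clamped = unit_sample.clamp(0.0, 1.0);
    base + max_jitter.mul_f64(clamped)
}

/// Derives a value in [0, 1) from a component id, so that different
/// components started at the same moment land on different offsets.
fn unit_sample_for(component_id: &str) -> f64 {
    let mut hasher = DefaultHasher::new();
    component_id.hash(&mut hasher);
    (hasher.finish() % 10_000) as f64 / 10_000.0
}
```

For example, a 60 second base interval with up to 10 seconds of jitter and a sample of 0.5 yields a 65 second interval, so components started together spread their flushes across a 10 second band instead of writing to disk at the same instant.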
This name better reflects the capabilities of this processor. The previous name is still supported for backwards compat. Ref: LOG-19134
refactor: rename `sliding_aggregate` to `mezmo_aggregate_v2` + add jitter to state flush
# [3.5.0](answerbook/vector@v3.4.5...v3.5.0) (2024-01-31)
### Code Refactoring
* rename `sliding_aggregate` to `mezmo_aggregate_v2` [0485391](answerbook/vector@0485391) - Mike Del Tito [LOG-19134](https://logdna.atlassian.net/browse/LOG-19134)
### Features
* **persistence**: add jitter to state flush [db3cae3](answerbook/vector@db3cae3) - Mike Del Tito [LOG-19171](https://logdna.atlassian.net/browse/LOG-19171)
### Miscellaneous
* Merge pull request #410 from answerbook/mdeltito/LOG-19134 [0c71dd0](answerbook/vector@0c71dd0) - GitHub [LOG-19134](https://logdna.atlassian.net/browse/LOG-19134)
summary: update contains groks for k8s and timestamp manipulations ref: LOG-18911
chore(version): bump vrl@0.14.0
## [3.5.1](answerbook/vector@v3.5.0...v3.5.1) (2024-02-01)
### Chores
* **version**: bump vrl@0.14.0 [21ee8f0](answerbook/vector@21ee8f0) - dominic-mcallister-logdna [LOG-18911](https://logdna.atlassian.net/browse/LOG-18911)
### Miscellaneous
* Merge pull request #411 from answerbook/dominic/LOG-18911 [7e4e670](answerbook/vector@7e4e670) - GitHub [LOG-18911](https://logdna.atlassian.net/browse/LOG-18911)
…config Currently the reduce transform gets its threshold limits from env vars, but those vars are missing from the deployment config, so they cannot be pulled from the `vector` config map. Ref: LOG-19194
chore(deployment): added reduce threshold limits into the deployment …
## [3.5.2](answerbook/vector@v3.5.1...v3.5.2) (2024-02-05)
### Chores
* **deployment**: added reduce threshold limits into the deployment config [eb244bc](answerbook/vector@eb244bc) - Sergey Opria [LOG-19194](https://logdna.atlassian.net/browse/LOG-19194)
### Miscellaneous
* Merge pull request #414 from answerbook/sopria/LOG-19194 [bcd3d4b](answerbook/vector@bcd3d4b) - GitHub [LOG-19194](https://logdna.atlassian.net/browse/LOG-19194)
summary: in the scenario of logdna-agent sending logs to pipeline, which in turn are sent to log-analysis, the fact that the agent originally picked up the log is lost. To help out support, start forwarding the original user agent info to LA if available. ref: LOG-19196
feat(mezmo-sink): include _originating_user_agent
# [3.6.0](answerbook/vector@v3.5.2...v3.6.0) (2024-02-06)
### Features
* **mezmo-sink**: include _originating_user_agent [d897521](answerbook/vector@d897521) - dominic-mcallister-logdna [LOG-19196](https://logdna.atlassian.net/browse/LOG-19196)
### Miscellaneous
* Merge pull request #413 from answerbook/dominic/LOG-19196 [78c0c8d](answerbook/vector@78c0c8d) - GitHub [LOG-19196](https://logdna.atlassian.net/browse/LOG-19196)
Refactoring to break up the single source file implementation into multiple, smaller source files. Also extracted some logic into smaller methods/functions to reduce the size of the `record()` and `flush_finalized()` bodies. Ref: LOG-19116
Injects the prior accumulated event as the namespaced value `previous` inside of the flush_condition VRL script. This allows us to write a flush condition like:
```
res, err = (.message.value.value / %previous.message.value.value) >= 1.50
if err != null {
  false
} else {
  res
}
```
This example would flush the aggregated value early if the value has grown by at least 50% since the previous window value. The error checking code is required because the division and the greater-than comparison are fallible in VRL. Ref: LOG-19116
# [3.7.0](answerbook/vector@v3.6.0...v3.7.0) (2024-02-07)
### Chores
* refactoring MezmoAggregateV2 [9cb5dbc](answerbook/vector@9cb5dbc) - Dan Hable [LOG-19116](https://logdna.atlassian.net/browse/LOG-19116)
### Features
* Expose prior aggregate value in flush condition [aff984c](answerbook/vector@aff984c) - Dan Hable [LOG-19116](https://logdna.atlassian.net/browse/LOG-19116)
This commit adds a unit test to prove that the Mezmo aggregate transform behaves like a sliding aggregate control with the correct configuration. One small issue was found with the new window allocation on the boundary that was fixed alongside the test. Ref: LOG-18963
## [3.7.1](answerbook/vector@v3.7.0...v3.7.1) (2024-02-07)
### Tests
* **aggregate-v2**: Supporting tumbling window config [0d12d6f](answerbook/vector@0d12d6f) - Dan Hable [LOG-18963](https://logdna.atlassian.net/browse/LOG-18963)
The original implementation for the `Legacy` namespace (which Mezmo uses) dumps headers and query parameters into the root of the message. Since Mezmo maintains a separate metadata field in the envelope, move those values into the vector `metadata` for the Legacy namespace as it does for Vector namespacing. This prevents the message from being polluted and makes the data available to be inserted into the Mezmo envelope downstream. The Edge http source will retrieve this metadata using the `%` VRL operator when it constructs the `message` envelope (which does not initially exist in the http_server). Ref: LOG-19242
Builds are taking too long. One reason is the "test image build" step, which can take 30 minutes. To help dev flow, *only* do this during PRs so that developers can just push branches to check CI/tests prior to opening a PR that will do the test image build. Also, we regularly see the '1 HOUR' timeouts and jobs are killed. This is a huge waste of time and is usually due to CI servers not being up, but also cargo caches not being primed. Increasing the timeout will allow more time for such things. Ref: LOG-19242
fix(http_server): Store request metadata in PathPrefix::Metadata
## [3.7.2](answerbook/vector@v3.7.1...v3.7.2) (2024-02-08)
### Bug Fixes
* **ci**: Speed improvements [99a767f](answerbook/vector@99a767f) - Darin Spivey [LOG-19242](https://logdna.atlassian.net/browse/LOG-19242)
* **http_server**: Store request metadata in `PathPrefix::Metadata` [9875ad9](answerbook/vector@9875ad9) - Darin Spivey [LOG-19242](https://logdna.atlassian.net/browse/LOG-19242)
### Miscellaneous
* Merge pull request #415 from answerbook/darinspivey/LOG-19242 [1b99c75](answerbook/vector@1b99c75) - GitHub [LOG-19242](https://logdna.atlassian.net/browse/LOG-19242)
For remote task executions (which power Edge tap), include the metadata as well. Ref: LOG-19261
fix(graphql): Include `metadata` in remote task execution
## [3.7.3](answerbook/vector@v3.7.2...v3.7.3) (2024-02-08)
### Bug Fixes
* **graphql**: Include `metadata` in remote task execution [2e13c69](answerbook/vector@2e13c69) - Darin Spivey [LOG-19261](https://logdna.atlassian.net/browse/LOG-19261)
### Miscellaneous
* Merge pull request #419 from answerbook/darinspivey/LOG-19261 [4301694](answerbook/vector@4301694) - GitHub [LOG-19261](https://logdna.atlassian.net/browse/LOG-19261)
darinspivey approved these changes Feb 14, 2024
darinspivey pushed a commit that referenced this pull request Apr 4, 2024
## [3.13.3](answerbook/vector@v3.13.2...v3.13.3) (2024-03-27)
### Chores
* **build**: Use open source vrl fork [347e20f](answerbook/vector@347e20f) - Dan Hable [LOG-19155](https://logdna.atlassian.net/browse/LOG-19155)
### Miscellaneous
* Merge pull request #6 from mezmo/darinspivey/answerbook_sync [bd66904](answerbook/vector@bd66904) - GitHub
* Merge remote-tracking branch 'answerbook/master' into darinspivey/answerbook_sync [98280d6](answerbook/vector@98280d6) - Darin Spivey
* Merge pull request #5 from mezmo/holmberg/LOG-19506 [35fed6f](answerbook/vector@35fed6f) - GitHub [LOG-19506](https://logdna.atlassian.net/browse/LOG-19506)
* Merge pull request #3 from mezmo/edge1 [a049b0c](answerbook/vector@a049b0c) - GitHub
* Merge branch 'master' into edge1 [596ba22](answerbook/vector@596ba22) - GitHub
* Merge pull request #2 from mezmo/dhable/LOG-19155 [7ddc8c2](answerbook/vector@7ddc8c2) - GitHub [LOG-19155](https://logdna.atlassian.net/browse/LOG-19155)
No description provided.