Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KATC] Update config schema #1770

Closed

Conversation

RebeccaMahany
Copy link
Contributor

This PR makes the following config updates:

  • Rename source column to path to be more consistent with our e.g. dataflatten tables, which also return a path column
  • Rename source to source_paths for clarity; additionally, make it an array rather than a single string to allow for matching against multiple patterns
  • Rename query to source_query to disambiguate from the query column we use elsewhere
  • Rename platform to filter to clarify that we may have other types of filters in the future
  • Update config structure to be a list of tables instead of a map, so that K2 is able to send down config definitions for all tables instead of filtering by platform; launcher handles the filtering instead
  • To support the above, move the table name inside the table config

These changes were made in roughly this order; it is possible to review commit-by-commit if that's easier.

@RebeccaMahany RebeccaMahany marked this pull request as draft July 3, 2024 18:51
@RebeccaMahany RebeccaMahany marked this pull request as ready for review July 3, 2024 19:13
zackattack01
zackattack01 previously approved these changes Jul 3, 2024
James-Pickett
James-Pickett previously approved these changes Jul 3, 2024
@@ -17,7 +17,7 @@ import (
// that performs the query against the source.
type katcSourceType struct {
name string
dataFunc func(ctx context.Context, slogger *slog.Logger, sourcePattern string, query string, sourceConstraints *table.ConstraintList) ([]sourceData, error)
dataFunc func(ctx context.Context, slogger *slog.Logger, sourcePaths []string, query string, pathConstraints *table.ConstraintList) ([]sourceData, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little unsure about how pathConstraints will be used. Can you provide an example? (In this comment is fine, doesn't need to be docs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what the code is doing, but I'm not sure it adds a lot over a pushing sourcePath through `filepath.Glob(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added documentation with an example

"err", err,
)
continue
}

if cfg.Platform != runtime.GOOS {
// For now, the filter is simply the OS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably want to future proof this a little.

Brainstorming... We could define a rich DSL with boolean logic. Or maybe we can just have a series of strings? Or maybe a series of K:V pairs? Early use might be goos:darwin or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about making filter into filters, a map[string]string? Simple to unmarshal and therefore parse, and flexible for future additions?

{
						"name": "kolide_indexeddb_leveldb_test",
						"source_type": "indexeddb_leveldb",
						"filters": {
						     "goos": "darwin"
						},
						"columns": ["data"],
						"source_paths": ["/some/path/to/db.indexeddb.leveldb"],
						"source_query": "db.store",
						"row_transform_steps": ["deserialize_chrome"]
					}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filters makes sense to me. I don't know if that should be an array of strings, or a map. Array might be little more flexible, I think either can work.

I think we should ponder whether that's an AND or an OR. If we go the way of overlays, this should be an AND, because the overlays handle OR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated to filters in the other PR #1772 -- if we decide to go with this approach I'll make the update here as well.

@RebeccaMahany
Copy link
Contributor Author

Closing in favor of #1772

@RebeccaMahany RebeccaMahany deleted the becca/katc-cfg-update branch July 11, 2024 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants