Merge pull request #340 from opencybersecurityalliance/develop
v1.6.0
subbyte authored May 17, 2023
2 parents 8f5e221 + b9ea090 commit 9adb547
Showing 23 changed files with 490 additions and 237 deletions.
2 changes: 0 additions & 2 deletions .github/workflows/code-coverage.yml
@@ -15,10 +15,8 @@ on:
- 'src/**'
types:
- opened
- edited
- reopened
- synchronize
- unlocked

jobs:
codecov:
2 changes: 0 additions & 2 deletions .github/workflows/code-style.yml
@@ -15,10 +15,8 @@ on:
- 'src/**'
types:
- opened
- edited
- reopened
- synchronize
- unlocked

jobs:
codestyle:
71 changes: 71 additions & 0 deletions .github/workflows/integration-testing.yml
@@ -0,0 +1,71 @@
name: Integration Testing With stix-shifter and Live Data Sources
on:
push:
branches:
- develop
- develop_*
- release
paths:
- 'src/**'
pull_request:
branches:
- develop
- develop_*
- release
paths:
- 'src/**'
types:
- opened
- synchronize
- reopened
workflow_dispatch:
inputs:
organization:
description: 'Kestrel repo organization'
required: true
default: 'opencybersecurityalliance'
repository:
description: 'Kestrel repo name'
required: true
default: 'kestrel-lang'
branch:
description: 'Kestrel repo branch'
required: true
default: 'develop'

jobs:
launch:
name: Launch Integration Testing
runs-on: ubuntu-latest
steps:
- name: Initialize testing workflow parameters
run: |
if [[ ${{ github.event_name }} == "workflow_dispatch" ]]; then
echo "organization=${{ github.event.inputs.organization }}" >> $GITHUB_ENV
echo "repository=${{ github.event.inputs.repository }}" >> $GITHUB_ENV
echo "branch=${{ github.event.inputs.branch }}" >> $GITHUB_ENV
elif [[ ${{ github.event_name }} == "push" ]]; then
echo "got a push event. ${{ github.event }}"
echo "organization=${{ github.event.repository.owner.login }}" >> $GITHUB_ENV
echo "repository=${{ github.event.repository.name }}" >> $GITHUB_ENV
GITHUB_REF=${{ github.ref }}
echo "GITHUB_REF=$GITHUB_REF"
echo "branch=$(echo ${GITHUB_REF#refs/heads/} | tr / -)" >> $GITHUB_ENV
elif [[ ${{ github.event_name }} == "pull_request" ]]; then
echo "organization=${{ github.event.pull_request.head.repo.owner.login }}" >> $GITHUB_ENV
echo "repository=${{ github.event.pull_request.head.repo.name }}" >> $GITHUB_ENV
echo "branch=${{ github.event.pull_request.head.ref }}" >> $GITHUB_ENV
fi
- name: Launch integration testing workflows
uses: convictional/trigger-workflow-and-wait@v1.6.5
with:
owner: opencybersecurityalliance
repo: hunting-stack-testing
github_token: ${{ secrets.KESTREL_STIXSHIFTER_INTEGRATION_TESTING_TOKEN }}
workflow_file_name: kestrel-integration-testing-flow.yml
ref: main
wait_interval: 10
propagate_failure: true
trigger_workflow: true
wait_workflow: true
client_payload: '{"organization": "${{ env.organization}}", "repository": "${{ env.repository }}", "branch": "${{ env.branch }}"}'
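Note on the `push` case of the initialization step above: the branch name is derived with the shell expression `${GITHUB_REF#refs/heads/} | tr / -`, which strips the `refs/heads/` prefix and then replaces every `/` with `-`. A minimal Python sketch of the same transformation follows. The function name `branch_from_ref` is hypothetical and used only for illustration; it is not part of the workflow.

```python
def branch_from_ref(github_ref: str) -> str:
    """Mirror `${GITHUB_REF#refs/heads/} | tr / -` from the workflow step.

    Hypothetical helper for illustration only.
    """
    prefix = "refs/heads/"
    # `${VAR#pattern}` removes the prefix only if the value starts with it.
    if github_ref.startswith(prefix):
        github_ref = github_ref[len(prefix):]
    # `tr / -` replaces every remaining slash with a dash.
    return github_ref.replace("/", "-")


if __name__ == "__main__":
    print(branch_from_ref("refs/heads/develop"))
    print(branch_from_ref("refs/heads/feature/raw-string"))
```

One consequence worth keeping in mind when reading the workflow: a branch whose name contains a slash is passed on with dashes instead, so the value in `client_payload` may differ from the actual branch name in that case.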
2 changes: 0 additions & 2 deletions .github/workflows/unit-testing.yml
@@ -17,10 +17,8 @@ on:
- 'tests/**'
types:
- opened
- edited
- reopened
- synchronize
- unlocked

jobs:
unittest:
2 changes: 0 additions & 2 deletions .github/workflows/unused-import.yml
@@ -15,10 +15,8 @@ on:
- 'src/**'
types:
- opened
- edited
- reopened
- synchronize
- unlocked

jobs:
unusedimports:
25 changes: 25 additions & 0 deletions CHANGELOG.rst
@@ -9,6 +9,31 @@ The format is based on `Keep a Changelog`_.
Unreleased
==========

1.6.0 (2023-05-17)
==================

Changed
-------

- Upgrade stix-shifter from v4 to v5 in the stix-shifter datasource interface
- Bump stix-shifter version to v5.3.0 to include latest Elasticsearch ECS mappings
- Restrict scopes of GitHub workflows to eliminate unnecessary executions

Added
-----

- stix-shifter datasource interface query procedure pipelining: a producer-consumer model for transmission and translation/ingestion
- Integration testing with stix-shifter and the first live data source---Elasticsearch
- Raw String implemented in Kestrel
- Documentation on raw String

Fixed
-----

- Logging module reimplemented to fix #334
- asyncio bug in ``tests/test_fast_translate.py``


1.5.14 (2023-04-19)
===================

2 changes: 1 addition & 1 deletion docs/installation/runtime.rst
@@ -8,7 +8,7 @@ please use Python inside Windows Subsystem for Linux (WSL).
General Requirements
====================

Python 3 is required. Refer to the `Python installation guide`_ if you do not have Python 3.
Python 3.8 is required. Refer to the `Python installation guide`_ if you need to install or upgrade Python.

OS-specific Requirements
========================
60 changes: 39 additions & 21 deletions docs/language/ecgp.rst
@@ -477,30 +477,17 @@ Two examples of variable references in an ECGP:
# enrich the IPs in network-traffic with x-force threat intelligence
APPLY python://xfeipenrich ON nt_outter
Escaped String
==============
String and Raw String
=====================

Kestrel string literals in comparison expressions are like standard Python
strings (not Python raw string). It supports escaping for special characters,
e.g., ``\n`` means new line.
strings. It supports escaping for special characters, e.g., ``\n`` means new
line.

Some basic rules:

#. If double quotes are used to mark a string literal, any double quote
character inside the string needs to be escaped. Otherwise, escaping for it
is not necessary.

#. If single quotes are used to mark a string literal, any single quote
character inside the string needs to be escaped. Otherwise, escaping for it
is not necessary.

#. Backslash character ``\`` always needs to be escaped in a string literal,
i.e., write ``\\`` to mean a single character ``\`` such as
``'C:\\Windows\\System32\\cmd.exe'``.

The 3rd rule means when writing regular expressions, one can first write a
regular expression in raw string, then replace each ``\`` with ``\\`` before
putting it into Kestrel.
String literals can be enclosed in matching single quotes (``'``) or double
quotes (``"``). The backslash (``\\``) character is used to escape characters
that otherwise have a special meaning, such as newline, backslash itself, or
the quote character.

Examples:

@@ -535,6 +522,36 @@ Examples:
ps5 = GET process FROM stixshifter://edp1
WHERE name MATCHES '\\w+\\.exe'
Escaped strings are inconvenient for regular expressions: one has to write four
backslashes ``\\\\`` to mean a single literal backslash character, e.g., a STIX
pattern requires ``"[artifact:payload_bin MATCHES
'C:\\\\Windows\\\\system32\\\\svchost\\.exe']"`` to match the raw path
``C:\Windows\system32\svchost.exe``. This is explained in `Python re library`_.

To overcome this inconvenience, Kestrel provides *raw strings* as Python does:
a Kestrel raw string has no escape character and is interpreted without
escape evaluation.

.. code-block:: coffeescript
# f1 and f2 describe the same pattern:
# using regex to match an exact string 'C:\Windows\System32\cmd.exe'
f1 = GET file FROM stixshifter://edp1
WHERE name MATCHES 'C:\\\\Windows\\\\System32\\\\cmd\\.exe'
f2 = GET file FROM stixshifter://edp1
WHERE name MATCHES r'C:\\Windows\\System32\\cmd\.exe'
# raw string can be used not only in regex (keyword MATCHES), but any comparison expression
# f3/f4 will get the same results as f1/f2, yet they use exact match instead of regex
f3 = GET file FROM stixshifter://edp1
WHERE name = 'C:\\Windows\\System32\\cmd.exe'
f4 = GET file FROM stixshifter://edp1
WHERE name = r'C:\Windows\System32\cmd.exe'
Time Range
==========

@@ -597,3 +614,4 @@ range specified.
.. _STIX Cyber Observable Objects: http://docs.oasis-open.org/cti/stix/v2.0/stix-v2.0-part4-cyber-observable-objects.html
.. _OCA/stix-extension: https://github.com/opencybersecurityalliance/stix-extensions
.. _STIX Observation Expression: http://docs.oasis-open.org/cti/stix/v2.0/cs01/part5-stix-patterning/stix-v2.0-cs01-part5-stix-patterning.html#_Toc496717745
.. _Python re library: https://docs.python.org/3/library/re.html
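Note on the raw-string documentation in `docs/language/ecgp.rst` above: the text says Kestrel strings behave like Python strings and Kestrel raw strings like Python raw strings. Under that assumption, the equivalences claimed for `f1`/`f2` and `f3`/`f4` can be checked in plain Python with the standard `re` module. This is only a sketch of the Python-side behavior, not a test of the Kestrel parser itself.

```python
import re

# The exact path the documentation example wants to match.
path = r"C:\Windows\System32\cmd.exe"

# f1 vs. f2: the same regular expression, written as an escaped string
# (four backslashes per literal backslash) and as a raw string (two).
escaped_regex = 'C:\\\\Windows\\\\System32\\\\cmd\\.exe'
raw_regex = r'C:\\Windows\\System32\\cmd\.exe'

# f3 vs. f4: the same exact-match literal, escaped and raw.
escaped_literal = 'C:\\Windows\\System32\\cmd.exe'
raw_literal = r'C:\Windows\System32\cmd.exe'

if __name__ == "__main__":
    print(escaped_regex == raw_regex)
    print(re.fullmatch(raw_regex, path) is not None)
    print(escaped_literal == raw_literal == path)
```

The raw form halves the number of backslashes in both cases, which is the inconvenience the new documentation section is addressing.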
2 changes: 1 addition & 1 deletion docs/tutorial.rst
@@ -66,7 +66,7 @@ Kestrel + Jupyter
=================

To develop a hunt flow using Jupyter Notebook, you need to first follow the
instructions in :ref:`installation/runtime:Front-Ends Installation` to install
instructions in :ref:`installation/runtime:Kestrel Front-End Setup` to install
the Kestrel Jupyter Notebook kernel if you haven't done so.

Creating A Hunt Book
8 changes: 4 additions & 4 deletions setup.cfg
@@ -1,6 +1,6 @@
[metadata]
name = kestrel-lang
version = 1.5.14
version = 1.6.0
description = Kestrel Threat Hunting Language
long_description = file:README.rst
long_description_content_type = text/x-rst
@@ -35,9 +35,9 @@ install_requires =
lark>=1.1.5
pyarrow>=5.0.0
docker>=5.0.0
stix-shifter>=4.6.3,<5.0.0
stix-shifter-utils>=4.6.3,<5.0.0
firepit>=2.3.17
stix-shifter>=5.3.0
stix-shifter-utils>=5.3.0
firepit>=2.3.19
typeguard
tests_require =
pytest
35 changes: 7 additions & 28 deletions src/kestrel/__main__.py
@@ -6,36 +6,17 @@

import argparse
import logging
import os

from kestrel.session import Session

logfile = "session.log"
from kestrel.utils import add_logging_handler, clear_logging_handlers

_logger = logging.getLogger(__name__)


def logging_setup(session, verbose_mode, debug_mode):
# setup logging format, channel and granularity
log_format = "%(asctime)s %(levelname)s %(name)s %(message)s"
log_console = logging.StreamHandler()
if session:
log_file_path = os.path.join(session.runtime_directory, logfile)
log_file = logging.FileHandler(log_file_path)
log_handlers = [log_console, log_file] if verbose_mode else [log_file]
else:
log_handlers = [log_console]
# remove existing handlers attached to root logger
root_logger = logging.getLogger()
for h in root_logger.handlers[:]:
root_logger.removeHandler(h)
h.close()
logging.basicConfig(
format=log_format,
datefmt="%H:%M:%S",
level=logging.DEBUG if debug_mode else logging.INFO,
handlers=log_handlers,
)
def logging_setup(if_verbose_mode, if_debug_mode):
clear_logging_handlers()
if if_verbose_mode:
add_logging_handler(logging.StreamHandler(), if_debug_mode)


if __name__ == "__main__":
@@ -49,11 +30,9 @@ def logging_setup(session, verbose_mode, debug_mode):
)
args = parser.parse_args()

logging_setup(None, args.verbose, args.debug)
logging_setup(args.verbose, args.debug)

with Session(debug_mode=args.debug) as session:
if not args.verbose:
_logger.debug(f"redirect logging to {logfile} in `/tmp/kestrel")
logging_setup(session, args.verbose, args.debug)
with open(args.huntflow, "r") as fp:
huntflow = fp.read()
outputs = session.execute(huntflow)
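Note on the logging change in `src/kestrel/__main__.py` above: the new `logging_setup` delegates to `clear_logging_handlers` and `add_logging_handler` from `kestrel.utils`, whose bodies are not part of the hunks shown here. The sketch below is an assumption about what such helpers could look like, reconstructed from the removed inline code (same log format, same `datefmt`, same handler cleanup on the root logger). The `_sketch` names are hypothetical and this is not the actual `kestrel.utils` implementation.

```python
import logging

# Format string taken from the removed inline logging_setup().
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s %(message)s"


def clear_logging_handlers_sketch():
    """Detach and close every handler on the root logger (hypothetical)."""
    root_logger = logging.getLogger()
    for handler in root_logger.handlers[:]:
        root_logger.removeHandler(handler)
        handler.close()


def add_logging_handler_sketch(handler, if_debug):
    """Attach one handler to the root logger and set the level (hypothetical)."""
    handler.setFormatter(logging.Formatter(fmt=LOG_FORMAT, datefmt="%H:%M:%S"))
    root_logger = logging.getLogger()
    root_logger.addHandler(handler)
    root_logger.setLevel(logging.DEBUG if if_debug else logging.INFO)
    return handler


if __name__ == "__main__":
    clear_logging_handlers_sketch()
    add_logging_handler_sketch(logging.StreamHandler(), if_debug=True)
    logging.getLogger(__name__).debug("console logging enabled")
```

Splitting setup into "clear" and "add one handler" lets different callers attach handlers independently, for example a console handler in verbose mode here and a file handler elsewhere, instead of rebuilding the whole configuration with `logging.basicConfig` each time.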
1 change: 1 addition & 0 deletions src/kestrel/config.yaml
@@ -9,6 +9,7 @@ language:
session:
cache_directory_prefix: "kestrel-session-" # under system temp directory
local_database_path: "local.db"
log_path: "session.log"
show_execution_summary: true

# whether/how to prefetch all records/observations for entities
13 changes: 12 additions & 1 deletion src/kestrel/datasource/manager.py
@@ -5,6 +5,14 @@
InvalidDataSourceInterfaceImplementation,
ConflictingDataSourceInterfaceScheme,
)
import asyncio
import inspect

# TODO: better solution to avoid using nest_asyncio for run_until_complete()
# maybe putting entire Kestrel in async mode
import nest_asyncio

nest_asyncio.apply()


class DataSourceManager(InterfaceManager):
@@ -30,6 +38,9 @@ def list_data_sources_from_scheme(self, scheme):
def query(self, uri, pattern, session_id, store):
scheme, uri = self._parse_and_complete_uri(uri)
i, c = self._get_interface_with_config(scheme)
rs = i.query(uri, pattern, session_id, c, store)
if inspect.iscoroutinefunction(i.query):
rs = asyncio.run(i.query(uri, pattern, session_id, c, store))
else:
rs = i.query(uri, pattern, session_id, c, store)
self.queried_data_sources.append(uri)
return rs
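Note on the `DataSourceManager.query` change above: the method now checks whether the interface's `query` is a coroutine function and, if so, drives it with `asyncio.run`; otherwise it calls it directly. The self-contained sketch below shows that dispatch pattern in isolation. The classes `SyncInterface` and `AsyncInterface` and the function `run_query` are hypothetical stand-ins, not Kestrel classes, and the sketch omits `nest_asyncio`, so it assumes no event loop is already running in the calling thread.

```python
import asyncio
import inspect


class SyncInterface:
    """Hypothetical interface with a plain (blocking) query."""

    @staticmethod
    def query(uri):
        return f"sync:{uri}"


class AsyncInterface:
    """Hypothetical interface with a coroutine query."""

    @staticmethod
    async def query(uri):
        await asyncio.sleep(0)  # stand-in for real asynchronous I/O
        return f"async:{uri}"


def run_query(interface, uri):
    """Call interface.query, running it to completion if it is a coroutine."""
    if inspect.iscoroutinefunction(interface.query):
        return asyncio.run(interface.query(uri))
    return interface.query(uri)


if __name__ == "__main__":
    print(run_query(SyncInterface, "stixshifter://edp1"))
    print(run_query(AsyncInterface, "stixshifter://edp1"))
```

Plain `asyncio.run` raises `RuntimeError` when called from a thread that already has a running event loop. The commit applies `nest_asyncio` at import time, presumably to allow the call in such environments (a Jupyter kernel is one likely case), and its TODO comment flags this as a stopgap.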