Utility Modules

Context

Context used to pass data between main Scanner and post-processing code

class utils.context.Context(value_id=None, scan=None, client=None, question_list=None, raw_output=None, human_id=None)

All data needed to perform post processing

An instance of this is passed in to the postprocessing script with the variable name ctx. Virtually all scripts need access to the command output. Use raw for this value (i.e., ctx.raw in your script).

add_additional_output(name: str, output: str)

Similar to set_additional_output, but rather than creating a new data column, this adds more items to an existing column, or creates a new column with the data provided if one does not yet exist.

Parameters:

name – A free-form name to use for this data. Xylok internally uses names that begin with an underscore (_), so using names that begin this way may conflict with internal operation and may change without notice. If appending data to an existing key, make sure to use that name.
output – A string to use for this additional output. This string may appear in reports and spreadsheets, so nonprintable characters should be avoided.

add_software_item(name, version: str | None = ''): Add a software item (name and version)

additional_output: Dict[str, str | None] = None: Additional data that the PP script wishes to include beyond the default

answer(name, default=None) → str

Returns the answer to the question as a string

Questions may come from the runner, execution environment, or benchmark. Answers are generally user supplied. If a user has not answered a question, the value given for default will be returned.

Parameters:: default – Value to return if there is no answer.
Returns:: Question answer if available or default if not

clear_output(): Clears all output and additional output from PP script

client: box.Box = None

Client JSON this scan belongs to. Together with the scan, this can be used to look up the machine/location details. Fields include (but are not limited to):

name - Long form name used for client
short_name - Short form name used for client
is_classified - Boolean; if true, this network is classified

property date: datetime | None

Return the date of the scan if available, otherwise returns today’s date.

Parameters:: fallback – bool: If False, don’t fallback to returning today’s date and instead return None
Returns:: Datetime of scan

get_additional_output(): Returns all additional outputs set for this context

ics: str = '***XYLOK ICS***': Default ICS separator between multiple commands in a single raw output. Can be manually set if needed.

ics_split(idx=None, *, ics=None, splitlines: bool = False) → str | List[str]

Split raw output based on ctx.ics

If given an index, returns only that instance of the split. If not, returns a list of the raw output split by the ics.

If ics is given in this call, it is stored as ctx.ics and reused on subsequent calls.

Parameters:: splitlines – If True, each ICS group will have splitlines() called on it, resulting in a list of strings for each output section.
Returns:: raw output at the given ICS separator or a list of all raw outputs based on the separator
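
The splitting behaviour described above can be pictured with plain string operations. The following is a minimal stand-in, not Xylok's implementation: the name split_on_ics is hypothetical, and because this page does not say whether the real method strips whitespace around each section, the sketch leaves every section untouched.

```python
from typing import List, Optional, Union

DEFAULT_ICS = "***XYLOK ICS***"


def split_on_ics(
    raw: str,
    idx: Optional[int] = None,
    *,
    ics: str = DEFAULT_ICS,
    splitlines: bool = False,
) -> Union[str, List[str], List[List[str]]]:
    """Hypothetical stand-in mirroring the documented ics_split behaviour."""
    sections = raw.split(ics)
    if splitlines:
        # Each section becomes its own list of lines.
        sections = [section.splitlines() for section in sections]
    # Without an index, return every section; otherwise only the requested one.
    return sections if idx is None else sections[idx]


# Two commands whose outputs were concatenated with the separator between them.
raw = "root\n" + DEFAULT_ICS + "\nLinux\n5.10\n"
all_sections = split_on_ics(raw)
second_as_lines = split_on_ics(raw, 1, splitlines=True)
```

In a real script the equivalent calls would be ctx.ics_split() and ctx.ics_split(1, splitlines=True).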

property location: Dict

Return the location this scan was for

Returns:: Location dictionary

property machine: Dict

Return the machine this scan was for

Returns:: Machine dictionary

print_recommendations()

Display current recommendation status in an easily parsable format

Not typically needed by post-processing scripts directly. It will be used as needed by the post-processing manager.

property questions: Dict[str, str]

Returns all questions and answers as a dictionary of processing_id -> answer

Returns:: Dictionary of all questions for this runner, with a key of the processing_id and value of the answer. It is recommended to use ctx.answer("name") instead, as it is more efficient and clearer.

property raw: str

Get the raw output for this command

Returns:: string version of the raw output of this command

recommend_comment(comment: str | List[str], issues: List[str] | None = None)

Recommend a comment for this check

Typically this is only used indirectly through the other recommend_* functions

An optional recommended comment and “issue” list may be given, where the final user-seen comment will be of the form:

<comment>
- <issue 0>
- <issue 1>
- <issue 2>
- …

For backwards compatibility, if comment is a list instead of a string, it will be used as the issue list and the header comment will be roughly “The following items were found which may have contributed to this item’s status:”
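
The backwards-compatibility rule can be sketched as a small normalisation step. This is an illustration only: normalize_comment is a hypothetical helper, the header text is the approximate wording quoted above, and what the real method does when a list comment and an issues list are both supplied is not documented here.

```python
from typing import List, Optional, Tuple, Union

# Approximate header; the documentation only says the real text is "roughly" this.
DEFAULT_HEADER = (
    "The following items were found which may have contributed to this item's status:"
)


def normalize_comment(
    comment: Union[str, List[str], None],
    issues: Optional[List[str]] = None,
) -> Tuple[Optional[str], List[str]]:
    """Hypothetical helper: return (header comment, issue list)."""
    if isinstance(comment, list):
        # Legacy call style: the list is really the issue list.
        return DEFAULT_HEADER, list(comment)
    return comment, list(issues or [])


modern = normalize_comment("Password policy is too weak", ["minlen is 8"])
legacy = normalize_comment(["minlen is 8", "dcredit is 0"])
```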

recommend_compliant(comment: str | List[str] = None, issues: List[str] | None = None)

Recommend this check be marked “compliant” (not a finding)

An optional recommended comment may be given, typically a reason why this check is compliant. In addition, a list of “issues” may be given, which might be multiple items that contributed to this status. See recommend_comment() for more details.

recommend_manual_review(comment: str | List[str] = None, issues: List[str] | None = None)

Recommend this check be reviewed manually

An optional recommended comment may be given, typically a reason why this check isn’t answerable automatically. In addition, a list of “issues” may be given, which might be multiple items that contributed to this status. See recommend_comment() for more details.

recommend_na(comment: str | List[str] = None, issues: List[str] | None = None)

Recommend this check be marked “not applicable”

An optional recommended comment may be given, typically a reason why this check isn’t applicable. In addition, a list of “issues” may be given, which might be multiple items that contributed to this status. See recommend_comment() for more details.

recommend_noncompliant(comment: str | List[str] = None, issues: List[str] | None = None)

Recommend this check be marked “non-compliant” (a finding)

An optional recommended comment may be given, typically a reason why this check isn’t compliant. In addition, a list of “issues” may be given, which might be multiple items that contributed to this status. See recommend_comment() for more details.

recommend_status(status, comment: str | List[str] = None, issues: List[str] | None = None)

Set the recommended status for this check explicitly

Generally other recommend_* calls are used in scripts

See recommend_comment() for information on comment and issues
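
To show how the recommend_* methods typically fit together in a script, here is a sketch that runs outside Xylok. StubContext is a hypothetical stand-in for the real ctx object that only records the last recommendation, and the question name "max_password_age" and the PASS_MAX_DAYS check are invented for illustration.

```python
from typing import Optional


class StubContext:
    """Hypothetical stand-in for ctx; records the last recommendation made."""

    def __init__(self, raw: str, answers: Optional[dict] = None):
        self.raw = raw
        self._answers = answers or {}
        self.recommendation = None

    def answer(self, name, default=None):
        return self._answers.get(name, default)

    def recommend_compliant(self, comment=None, issues=None):
        self.recommendation = ("compliant", comment, issues)

    def recommend_noncompliant(self, comment=None, issues=None):
        self.recommendation = ("noncompliant", comment, issues)

    def recommend_manual_review(self, comment=None, issues=None):
        self.recommendation = ("manual_review", comment, issues)


def check_max_days(ctx) -> None:
    """Example script body: compare PASS_MAX_DAYS against a user-supplied limit."""
    limit = int(ctx.answer("max_password_age", default="60"))
    for line in ctx.raw.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "PASS_MAX_DAYS" and fields[1].isdigit():
            if int(fields[1]) <= limit:
                ctx.recommend_compliant(f"PASS_MAX_DAYS is {fields[1]}")
            else:
                ctx.recommend_noncompliant(
                    "Password lifetime is too long",
                    issues=[f"PASS_MAX_DAYS is {fields[1]}, limit is {limit}"],
                )
            return
    # The setting was not found, so a person needs to look at the output.
    ctx.recommend_manual_review("PASS_MAX_DAYS was not found in the output")


ctx = StubContext("# /etc/login.defs\nPASS_MAX_DAYS 90\n")
check_max_days(ctx)
```

In a real post-processing script the ctx variable is supplied by the scanner, so only the body of check_max_days would be needed.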

scan: Dict = None

Scan JSON this value belongs to. Can be used to get scan metadata, including the date of the scan and what machine/location it was run on. Fields include (but are not limited to):

date - Date the scan was run, in string format ("2019-06-26T20:13:16+00:00"). Generally use date (ctx.date) to get a more easily usable Python datetime object.
items - A list of scan items, see scan_item for details. Due to performance optimizations, this is generally NOT all scan items in a scan.
benchmark_ids - IDs of all benchmarks used in this scan
client_id - ID of the client this scan is for. See client
location_id - ID of the location this scan was for. See location
machine_id - ID of the machine this scan was for. See machine
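
If the date string has to be handled by hand, the sample value shown in the field list parses with the standard library. This sketch uses a plain dictionary in place of ctx.scan and makes no claim about how ctx.date performs the conversion internally.

```python
from datetime import datetime, timedelta

# Plain dict standing in for ctx.scan; the value is the sample from the field list.
scan = {"date": "2019-06-26T20:13:16+00:00"}

# fromisoformat understands the "+00:00" offset and returns an aware datetime.
scan_date = datetime.fromisoformat(scan["date"])
is_utc = scan_date.utcoffset() == timedelta(0)

# Aware datetimes can be subtracted from each other, e.g. to compute scan age.
age = datetime(2019, 7, 1, tzinfo=scan_date.tzinfo) - scan_date
```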

property scan_item: Dict

Get the scan item as a dict for this command value

A scan item contains at least the following fields:

benchmark_id - Benchmark ID of this benchmark check (e.g., “rhel_6_stig”)
check_id - Database primary key of check in benchmark
human_id - Human ID for this check (i.e., “r)
status - Current compliance status as a string
comment - Current compliance status comment
notable_change - Whether this change was considered “notable” by the user
values - dict item that contains all command values that were run for the same benchmark check.

Returns:: Scan item dictionary

property scan_value: Dict

Get the scan item value as a dict for this command

A scan item value contains at least the following fields (but may contain more):

output - Raw command output
output_hash - Hash of raw command output
pp_output - Existing PP result
pp_hash - Hash of existing PP

Returns:: Scan value dictionary

set_additional_output(name: str, output: str)

Set an additional data column

Parameters:

name – A free-form name to use for this data. Xylok internally uses names that begin with an underscore (_), so using names that begin this way may conflict with internal operation and may change without notice.
output – A string to use for this additional output. This string may appear in reports and spreadsheets, so nonprintable characters should be avoided.

set_device_manufacturer(manufacturer): Set the make (manufacturer) for the device

set_device_model(model): Set the model for the device

set_host_name(host_name): Set the name for the device

set_ip_address(ip_address): Set the IP address(es) for the device

set_mac_address(mac_address): Set the MAC address(es) for the device

set_os_key(os_key): Set the OS for the device

set_pps_information(pps_information): Set ports, protocols, & services information for the device

set_serial_number(serial_number): Set the serial number for the device

Recommendations

Compatibility layer that calls context methods

utils.recommendations.recommend_comment(ctx: Context, comment): Compatibility layer that calls ctx.recommend_comment

utils.recommendations.recommend_compliant(ctx: Context, comment: str | List[str] = None): Compatibility layer that calls ctx.recommend_compliant

utils.recommendations.recommend_manual_review(ctx: Context, comment: str | List[str] = None): Compatibility layer that calls ctx.recommend_manual_review

utils.recommendations.recommend_na(ctx: Context, comment: str | List[str] = None): Compatibility layer that calls ctx.recommend_na

utils.recommendations.recommend_noncompliant(ctx: Context, comment: str | List[str] = None): Compatibility layer that calls ctx.recommend_noncompliant

Text Processing and Output

Text

Search and parse strings

utils.text.grep(regex: str, text: str, full_line: bool = True, case_sensitive: bool = True, invert_match: bool = False) → str

Searches the given text with the regex and returns matching line(s)

Works one line at a time, so multiline regexes won’t work; use the re library for more complex needs.

full_line causes the entire matching line to be returned, not just the matching portion.
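
The line-at-a-time behaviour can be pictured with the standard re module. The following is an illustrative stand-in rather than the utils.text.grep source: grep_sketch is a hypothetical name, and joining the matches with newlines is an assumption based on the documented str return type.

```python
import re


def grep_sketch(regex, text, full_line=True, case_sensitive=True, invert_match=False):
    """Hypothetical stand-in: search each line separately and collect the hits."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(regex, flags)
    results = []
    for line in text.splitlines():
        match = pattern.search(line)
        if invert_match:
            # Inverted search keeps the lines that did NOT match.
            if match is None:
                results.append(line)
        elif match is not None:
            # Either the whole line or only the part the regex matched.
            results.append(line if full_line else match.group(0))
    return "\n".join(results)


sshd_config = "PermitRootLogin no\n#Banner none\nPort 22"
port_line = grep_sketch(r"^Port\s+\d+", sshd_config)
port_number = grep_sketch(r"\d+", sshd_config, full_line=False)
active_lines = grep_sketch(r"^#", sshd_config, invert_match=True)
```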

utils.text.ics_split(raw, idx=None, *, ics='***XYLOK ICS***', splitlines: bool = False) → str | List[str]

Split raw output based on ics

If given an index, returns only that instance of the split. If not, returns a list of the raw output split by the ics

Parameters:: splitlines – If True, each ICS group will have splitlines() called on it, resulting in a list of strings for each output section.
Returns:: raw output at the given ICS separator or a list of all raw outputs based on the separator

utils.text.levenshtein(string1: str, string2: str, collapse_whitespace: bool = True) → int

Computes Levenshtein distance (the number of character changes) between two strings

Parameters:

string1 – First string
string2 – Second string
collapse_whitespace – Whether to compress all whitespace from the strings to single spaces before computing changes. Defaults to True

Returns:

Integer distance, the number of characters that changed between the strings
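
For readers unfamiliar with the metric, this is the textbook dynamic-programming formulation of Levenshtein distance, with whitespace collapsing added as a first step. It is a sketch under the name levenshtein_sketch, not the utils.text source. Note that this way of collapsing whitespace also drops leading and trailing whitespace, which the real function may or may not do.

```python
def levenshtein_sketch(string1: str, string2: str, collapse_whitespace: bool = True) -> int:
    """Textbook edit distance: insertions, deletions and substitutions cost 1."""
    if collapse_whitespace:
        # Replace every run of whitespace with a single space.
        string1 = " ".join(string1.split())
        string2 = " ".join(string2.split())

    # previous[j] holds the distance between the processed prefix of string1
    # and the first j characters of string2.
    previous = list(range(len(string2) + 1))
    for i, char1 in enumerate(string1, start=1):
        current = [i]
        for j, char2 in enumerate(string2, start=1):
            cost = 0 if char1 == char2 else 1
            current.append(
                min(
                    previous[j] + 1,         # delete char1
                    current[j - 1] + 1,      # insert char2
                    previous[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        previous = current
    return previous[-1]


distance = levenshtein_sketch("kitten", "sitting")
```

One plausible use in a post-processing script is comparing a configured banner against the required text while ignoring line-wrapping differences.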

utils.text.parse_table(text: str, has_header: bool = True, columns: List[str] = None) → List[Dict]

Processes a terminal “table” into a list of Python dicts

For example, the output of ps -A might be:

  PID TTY          TIME CMD
    1 ?        00:00:04 systemd
    2 ?        00:00:00 kthreadd
    4 ?        00:00:00 kworker/0:0H
    7 ?        00:00:00 mm_percpu_wq

parse_table() will turn this into:

[
    {'CMD': 'systemd', 'PID': '1', 'TIME': '00:00:04', 'TTY': '?'},
    {'CMD': 'kthreadd', 'PID': '2', 'TIME': '00:00:00', 'TTY': '?'},
    {'CMD': 'kworker/0:0H', 'PID': '4', 'TIME': '00:00:00', 'TTY': '?'},
    {'CMD': 'mm_percpu_wq', 'PID': '7', 'TIME': '00:00:00', 'TTY': '?'},
]

The data must either have a header or have columns given; if neither is available, a ValueError will be raised.
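
One way such a parser can work is to take column names from the header and split each row on whitespace. The sketch below assumes that approach (the real parse_table may use column positions or other heuristics), and parse_table_sketch is a hypothetical name.

```python
from typing import Dict, List, Optional


def parse_table_sketch(
    text: str, has_header: bool = True, columns: Optional[List[str]] = None
) -> List[Dict[str, str]]:
    """Hypothetical stand-in: whitespace-separated table to list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if columns is None:
        if not has_header or not lines:
            raise ValueError("need a header row or explicit columns")
        columns = lines[0].split()
    if has_header:
        lines = lines[1:]  # the header row is not data

    rows = []
    for line in lines:
        # Limit the split so a final column containing spaces stays intact.
        values = line.strip().split(None, len(columns) - 1)
        rows.append(dict(zip(columns, values)))
    return rows


ps_output = """  PID TTY          TIME CMD
    1 ?        00:00:04 systemd
    2 ?        00:00:00 kthreadd
"""
rows = parse_table_sketch(ps_output)
```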

utils.text.remove_comments(text: str, comment_str: str = '#') → str

Removes all lines that are commented out by the given comment string

Lines may begin with whitespace
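
A minimal sketch of this filtering, assuming that only whole-line comments are dropped and that trailing comments on a line of content are left alone (the description above does not say otherwise). The name remove_comments_sketch is hypothetical.

```python
def remove_comments_sketch(text: str, comment_str: str = "#") -> str:
    """Hypothetical stand-in: drop lines whose first non-blank text is comment_str."""
    kept = [
        line
        for line in text.splitlines()
        if not line.lstrip().startswith(comment_str)
    ]
    return "\n".join(kept)


config = "# top comment\n   # indented comment\nPort 22 # trailing note\n"
cleaned = remove_comments_sketch(config)
```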

Display

Make output prettier and work with tables

utils.display.print_simple_to_fancy_table(text: str, has_header: bool = True, columns: List[str] = None) → str

Generates and prints a fancy table from the given boring table

A boring table is one in the format:

ColHead1 ColHead2
row1val1 row1v2
row2val1 row2v2

It may or may not be column-aligned.

The generated table is both returned and printed.

utils.display.print_table(table: str, columns: List[str] = None) → str

Given a table in parse_table() format (a list of dicts), prints it nicely

If columns is given, only the given keys will be in the table. If columns is None, all keys from the list table row will be used.

The generated table is both returned and printed.

utils.display.table_to_string(table: List[Dict], columns: List[str] = None) → str

Given a table in parse_table() format (a list of dicts), returns a string representation

If columns is given, only the given keys will be in the table. If columns is None, all keys from the list table row will be used.

Windows-specific

Files

Windows file and file-listing utilities

utils.windows.files.childitem_to_table(text: str) → List[Dict]

Given a Get-ChildItem call output for director(ies), converts it to a table

Table is returned in the format suitable for table_to_string, a list of dictionaries. Each dictionary has the keys:

mode (file permissions)

write-date (last date file was modified)

write-time (last time file was modified)

size (file size, 0 for directories)

path (parent directory path)

name (item file name)

utils.windows.files.simplify_file_childitem(text: str, show_size: bool = False, show_date: bool = False)

Simplifies a list of Get-ChildItem calls

Takes standard get-childitem format and changes it into a table

Registry

Windows registry utilities

utils.windows.registry.simplify_reg_itemproperty(text: str, keys: List[str])

Simplifies a list of Get-ItemProperty calls and displays the requested keys

Any keys requested that don’t exist will display <key not found>. In addition, it always displays the PSPath as a header.

A key may also be given as a regular expression. It will be called with re.fullmatch(), which effectively implies ‘^…$’.
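
The practical effect of re.fullmatch() is that a key pattern must cover the entire property name. The short demonstration below uses only the standard library; the property names are sample values (EnableLUAExtra is made up for contrast) and matching_keys is a hypothetical helper.

```python
import re

# Sample property names; "EnableLUAExtra" is invented to show the difference.
property_names = ["EnableLUA", "EnableLUAExtra", "ConsentPromptBehaviorAdmin"]


def matching_keys(key_pattern, names):
    """Return the names that the pattern matches in full."""
    return [name for name in names if re.fullmatch(key_pattern, name)]


exact = matching_keys("EnableLUA", property_names)    # plain key: exact name only
prefix = matching_keys("Enable.*", property_names)    # regex key: anything starting with Enable
partial = matching_keys("Consent", property_names)    # a partial name matches nothing
```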

Users

Windows users and groups utilities

utils.windows.users.resolve_sids(text: str) → str: Translate well-known Windows SIDs in text to their human-readable equivalent
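
As a rough picture of what such a translation involves, the sketch below replaces a handful of well-known SIDs using a lookup table. The table is deliberately tiny, the display names chosen are one common rendering, and resolve_sids_sketch is a hypothetical name; the real function's table and output format may differ.

```python
import re

# A few well-known SIDs; the real function presumably knows many more.
WELL_KNOWN_SIDS = {
    "S-1-1-0": "Everyone",
    "S-1-5-11": "NT AUTHORITY\\Authenticated Users",
    "S-1-5-18": "NT AUTHORITY\\SYSTEM",
    "S-1-5-32-544": "BUILTIN\\Administrators",
    "S-1-5-32-545": "BUILTIN\\Users",
}

# Match a complete SID so that a short SID never replaces part of a longer one.
SID_PATTERN = re.compile(r"S-1-\d+(?:-\d+)+")


def resolve_sids_sketch(text: str) -> str:
    """Hypothetical stand-in: swap known SIDs for names, leave others untouched."""
    return SID_PATTERN.sub(
        lambda match: WELL_KNOWN_SIDS.get(match.group(0), match.group(0)), text
    )


line = "SeBackupPrivilege = *S-1-5-32-544,*S-1-5-21-1-2-3-500"
resolved = resolve_sids_sketch(line)
```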

Other

Windows PowerShell utilities

utils.windows.powershell.ps_list_cleanup(input_string: str, split_string: str = '\n\n', output_json: bool = False) → List[str]: Clean up and remove empty sets from PowerShell Format-List results for easier processing. Can also be used to clean up other repeating groups of items with a different split_string.
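
The core of this cleanup can be sketched as a split followed by a filter. The version below covers only the default list output; the output_json option is not modelled, and list_cleanup_sketch is a hypothetical name.

```python
from typing import List


def list_cleanup_sketch(input_string: str, split_string: str = "\n\n") -> List[str]:
    """Hypothetical stand-in: split into groups and drop the empty ones."""
    groups = (group.strip() for group in input_string.split(split_string))
    return [group for group in groups if group]


# Format-List output usually has blank lines before, between and after records.
format_list_output = (
    "\n\nName   : Spooler\nStatus : Running"
    "\n\n\n\nName   : WinRM\nStatus : Stopped\n\n"
)
records = list_cleanup_sketch(format_list_output)
```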

Unix/POSIX-specific

Auditing

Auditing helpers

utils.unix.audit.check_audit_lines(ctx: Context, *, syscalls: List[str], is_x64: bool | None = None) → List[str]

Check if the given syscall(s) are missing from the auditd configuration file

By default, determines if the configuration should include b64 as well as b32 automatically, but this may be explicitly given if needed.

Internally calls recommend_compliant/noncompliant, so callers should not need to do anything else.

Returns the comment sent for recommend_comment. Primarily used for testing purposes, generally not needed by caller.
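
The kind of check involved can be sketched as follows. This is an assumption-laden illustration, not the real function: it takes the rules text directly instead of a Context, requires is_x64 to be passed rather than detecting it, does not call any recommend_* method, and missing_audit_syscalls is a hypothetical name.

```python
import re
from typing import List


def missing_audit_syscalls(rules: str, syscalls: List[str], is_x64: bool = True) -> List[str]:
    """Hypothetical sketch: list "arch: syscall" pairs with no matching -S entry."""
    arches = ["b32", "b64"] if is_x64 else ["b32"]
    covered = {arch: set() for arch in arches}

    for line in rules.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        arch_match = re.search(r"-F\s+arch=(b32|b64)\b", line)
        if arch_match is None or arch_match.group(1) not in covered:
            continue
        # -S may be repeated and may carry a comma-separated list of syscalls.
        for names in re.findall(r"-S\s+(\S+)", line):
            covered[arch_match.group(1)].update(names.split(","))

    return [
        f"{arch}: {name}"
        for arch in arches
        for name in syscalls
        if name not in covered[arch]
    ]


rules = """
-a always,exit -F arch=b32 -S chmod -S fchmod -F auid>=1000 -k perm_mod
-a always,exit -F arch=b64 -S chmod,fchmod -F auid>=1000 -k perm_mod
-a always,exit -F arch=b64 -S chown -k perm_mod
"""
missing = missing_audit_syscalls(rules, ["chmod", "chown"])
```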

Files

Unix file and file-listing utilities

utils.unix.files.perms_to_octal(perms: str): Convert a Unix perm string of the form “rwxrwxrwx.” to numeric

utils.unix.files.perms_to_octal_set(perm_str: str): Convert a single three-char section of Unix perms
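
The conversion itself is simple arithmetic: within each three-character set, r counts 4, w counts 2 and x counts 1. The sketch below illustrates that under several assumptions: the result is returned as a string of digits, special bits (s, S, t, T) are rejected instead of translated, and both function names are hypothetical stand-ins for the two helpers above.

```python
def perm_set_to_digit(perm_set: str) -> int:
    """Convert one three-character set such as "r-x" to its octal digit."""
    if len(perm_set) != 3:
        raise ValueError(f"expected three characters, got {perm_set!r}")
    digit = 0
    for flag, expected, value in zip(perm_set, "rwx", (4, 2, 1)):
        if flag == expected:
            digit += value
        elif flag != "-":
            # setuid/setgid/sticky flags are outside the scope of this sketch
            raise ValueError(f"unsupported permission flag {flag!r}")
    return digit


def perms_to_octal_sketch(perms: str) -> str:
    """Convert "rwxr-xr-x" (optionally with type char or trailing marker) to "755"."""
    body = perms.rstrip(".+")  # drop a trailing SELinux (.) or ACL (+) marker
    if len(body) == 10:
        body = body[1:]  # drop the leading file-type character from ls -l
    if len(body) != 9:
        raise ValueError(f"unexpected permission string {perms!r}")
    return "".join(str(perm_set_to_digit(body[i:i + 3])) for i in range(0, 9, 3))


octal = perms_to_octal_sketch("rwxrwxrwx.")
```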

utils.unix.files.simplify_ls(ls_output: str, show_total: bool = False, **kwargs) → str

Given an ls -l output, simplifies it to the important parts

kwargs are passed to simplify_ls_line

utils.unix.files.simplify_ls_line(line: str, convert_perms: bool = True, show_perms: bool = True, show_links: bool = False, show_user: bool = True, show_group: bool = True, show_size: bool = False, show_date: bool = False, show_filename: bool = True, raise_exception: bool = False) → str

Takes a single line in standard ls -l format and removes unimportant pieces

If convert_perms is True, will return perms as an octal permission set. If raise_exception is True, a line that cannot be parsed will raise a ValueError instead of being returned as-is.

Expected format: -rw-r--r-- 1 traherom traherom 62 Sep 27 10:06 ansible.cfg
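
To make the expected format concrete, the following sketch splits such a line into named fields, from which a caller could keep only the stable ones. It is not the simplify_ls_line source: split_ls_line and the dictionary keys are hypothetical, and symlink targets or unusual date formats are not handled.

```python
from typing import Dict


def split_ls_line(line: str) -> Dict[str, str]:
    """Hypothetical helper: break one `ls -l` line into its nine fields."""
    # Split at most eight times so a file name containing spaces stays whole.
    fields = line.split(None, 8)
    if len(fields) != 9:
        raise ValueError(f"not in ls -l format: {line!r}")
    perms, links, user, group, size, month, day, time_or_year, name = fields
    return {
        "perms": perms,
        "links": links,
        "user": user,
        "group": group,
        "size": size,
        "date": f"{month} {day} {time_or_year}",
        "name": name,
    }


entry = split_ls_line("-rw-r--r-- 1 traherom traherom 62 Sep 27 10:06 ansible.cfg")
# Keep only the fields that do not change between otherwise identical scans.
stable = f"{entry['perms']} {entry['user']} {entry['group']} {entry['name']}"
```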

Running

Parsing and handling process and other run-time information

class utils.unix.running.NetstatSection(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None): Types of ports we might find

class utils.unix.running.PsMode(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None): Process list modes

utils.unix.running.simplify_ifconfig(ifconf_output: str) → str

Given an ifconfig output, redacts all the IP addresses and MACs

Replaces them with “X.X.X.X” if they exist
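
A redaction of this kind can be approximated with two regular expressions. The sketch below is an assumption about the approach, not the real implementation: the "X.X.X.X" placeholder comes from the description above, while the MAC placeholder, the lack of IPv6 handling and the name redact_addresses are choices made for illustration.

```python
import re

# Dotted-quad IPv4 addresses (no range validation) and colon-separated MACs.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
MAC_PATTERN = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def redact_addresses(ifconfig_output: str) -> str:
    """Hypothetical stand-in: mask addresses so output is comparable across hosts."""
    text = IPV4_PATTERN.sub("X.X.X.X", ifconfig_output)
    # The MAC placeholder is an assumption; the documentation only names X.X.X.X.
    return MAC_PATTERN.sub("XX:XX:XX:XX:XX:XX", text)


sample = (
    "eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
    "        inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255\n"
    "        ether 52:54:00:12:34:56  txqueuelen 1000  (Ethernet)\n"
)
redacted = redact_addresses(sample)
```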

utils.unix.running.simplify_netstat(text: str, section: NetstatSection = NetstatSection.NETWORK, listen_only: bool = False) → str

Given the output of a netstat call, returns a table of just the key data

Key data for:

Network connections: Proto, Port, and State

Unix sockets: Proto, Flags, Type, State, Path

utils.unix.running.simplify_ps(text: str, show_uid: bool = True, mode: PsMode = PsMode.AUTO) → str

Given the output of a ps call, returns just non-volatile information

This typically means just the executable name and UID if available.

Works with most ps outputs including ps -A, ps -Al, ps -Af, etc.