Technical Details¶
easy_infra.yml¶
easy_infra
is a project which makes heavy use of generated code and composable container images. easy_infra.yml
is the centralized
configuration to instruct those generation and composition processes. It describes which version of software to use, where, how to generate the very
important /functions.sh
script (more on that in functions.sh and functions.j2), which is what provides all of the hooking and capabilities.
Here is a fictitious easy_infra.yml
that concisely demonstrates the various features that are possible:
_anchors:
file_extensions: &id001
- tf
security: &id002
checkov:
command: checkov -d . --download-external-modules True --skip-download --output
json --output-file-path ${CHECKOV_JSON_REPORT_PATH}/checkov.json,
arg_customizations:
CHECKOV_BASELINE: --baseline
env_customizations:
CHECKOV_LOG_LEVEL: LOG_LEVEL
description: directory scan
validation: &id003
- command: terraform init -backend=false
description: initialization
- command: terraform validate
description: validation
packages:
fluent-bit:
helper: ["all"]
version: v2.0.5
version_argument: --version
checkov:
allow_update: False
version: 2.2.8
version_argument: --version
tfenv:
aliases:
- tfe
allow_filter:
- match: exec
position: 0
file_extensions: *id001
helper: ["terraform"]
security: *id002
monitor:
env_vars:
- TF_DATA_DIR
tool:
name: customname
environments:
- none
- aws
- azure
validation: *id003
version: v3.0.0
version_argument: --version
Note
Not all of these features are always in use in easy_infra.yml
YAML Anchors and Aliases¶
Anchors and Aliases are a yaml concept, and are fully supported in easy_infra.yml
to support reusable components. See &id001
and *id001
above to see an example. Learn more here and here.
Packages¶
All of the terms under packages
are the names of packages, and the details of how and when they are installed are described in that object of the
easy_infra.yml
. In order to register a runtime hook against a package (Learn more about hooks here), or define what
security scans occur prior to executing, it must be defined under packages
.
Alias¶
Certain tools have multiple ways to run them, such as by running kubectl
or simply k
. These aliases often point to the exact same binary, and
if you’d like to support multiple aliases for a tool, provide a list of those aliases under aliases
in easy_infra.yml
, which will result in
/functions.sh
containing an appropriate function hook for each of the aliases.
Allow filter¶
The allow_filter
allows you to perform security scans only for a very specific sub-command of a given package or alias. For instance, in the above
example, we have:
tfenv:
allow_filter:
- match: exec
position: 0
This ensures that, in the generated tfenv
function in /functions.sh
, it will check for exec
in the 0
position (0-indexed, starting after the
command, and after any easy_infra specific arguments have been removed (i.e. --skip-checkov
)), and only if there’s a match will it continue to perform
security scans as described in the security
object under the respective package
(i.e. tfenv
).
Allow update¶
When projects are added to easy_infra
they are automatically on-boarded to our automated maintenance scripts (see def update
in tasks.py
for how
that works). All projects that are properly configured will be automatically updated when invoke update
is run, and allow_update
is a boolean field
under that package in easy_infra.yml
which allows the on-boarding of a package, while exempting it from automatic updates. This is typically temporary, and
only done when a given project changes how it performs releases or makes a breaking changes that we have yet to accommodate.
File extensions¶
file_extensions
exist to support the AUTODETECT
function. If a package
doesn’t have file extensions defined, the project’s autodetect
logic is unable to detect where files that relate to the command being run exist.
Monitor¶
The monitor
section is how you can specify which additional locations on the filesystem we should monitor for
changes to determine if we should rerun the security scans. For instance, if AUTODETECT
is true
and you’re
running a command in multiple directories, and/or if you’re chaining commands which may modify the filesystem.
Monitor currently only supports a env_vars
key containing a list of names of environment variables that the related
tool uses to identify alternative locations to store runtime-critical files. This may be expanded in the future.
Security¶
The backbone of this project is the security
section. All of the terms underneath security define the series of security tools which will be run
every time the related command is run. An alternative easy_infra.yml
would look something like this:
packages:
checkov:
version: 2.2.8
version_argument: --version
kics:
version: v1.5.1
version_argument: version
tfenv:
aliases:
- tfe
allow_filter:
- match: exec
position: 0
file_extensions:
- .tf
security:
checkov:
command: checkov -d . --download-external-modules True --skip-download --output
json --output-file-path ${CHECKOV_JSON_REPORT_PATH}/checkov.json,
arg_customizations:
CHECKOV_BASELINE: --baseline
CHECKOV_EXTERNAL_CHECKS_DIR: --external-checks-dir
CHECKOV_SKIP_CHECK: --skip-check
env_customizations:
CHECKOV_LOG_LEVEL: LOG_LEVEL
description: directory scan
kics:
command: kics scan --type Terraform --no-progress --queries-path ${KICS_INCLUDE_QUERIES_PATH}
--libraries-path ${KICS_LIBRARY_PATH} --report-formats json --output-path
${KICS_JSON_REPORT_PATH} --output-name kics --path .
arg_customizations:
KICS_EXCLUDE_SEVERITIES: --exclude-severities
KICS_INCLUDE_QUERIES: --include-queries
description: directory scan
version: v3.0.0
version_argument: --version
After building easy_infra
with this configuration, you should be able to expect that when you run tfenv exec init
inside of an easy_infra
container,
then it would run both the kics
and checkov
security tools as described under kics: command: ...
and checkov: command: ...
, with additional arg_
or env_ customizations as defined under security_tool: arg_customizations: ...
or security_tool: env_customizations: ...
, when the associated
environment variables are set.
As an example, if you ran tfenv exec init
and also had the CHECKOV_BASELINE
environment variable set to /iac/.checkov.baseline
then the
actual checkov command that would be run would be:
checkov -d . --download-external-modules True --skip-download --output json --output-file-path ${CHECKOV_JSON_REPORT_PATH}/checkov.json,
--baseline /iac/.checkov.baseline
Note
The --baseline ...
at the end was dynamically added due to the environment variable.
Tool¶
If you have a situation where you’d like the tool
to have a different name from the package
, you can simply add the optional tool
key in
your easy_infra.yml
, along with a name
sub-key, and then it will use the provided value during building and testing. A good example of this is
the aws-cli
package and the cloudformation
tool (which is technically aws cloudformation
commands inside the container).
If your tool
only supports a subset of all the possible environments, you can specify environments
as a sub-key under the optional tool
.
For instance, our cloudformation
images are not relevant to azure
.
Validation¶
Sometimes security scanning tools are only equipped to run against IaC which is in a certain state, such as ensuring that the IaC is formatted properly and
valid. validation
is where you can specify what those are, and you can specify a list of commands to run in the specified order, prior to running the
security scanning tools.
Version¶
version
is where you can specify which versions of tool you want to include when you’re building an easy_infra
image. This is what is maintained by this
project’s automated maintenance scripts, and it is parsed into build arguments which are passed into the container image building process.
Version Argument¶
version_argument
is a way for us to describe how a command requests its version inside of easy_infra
. This is useful to know because we avoid running
security scans (and validation, if any is specified) when the version of a tool is being queried inside of an easy_infra
container.
build/¶
All build/Dockerfile*
files must all be able to be built independently, as long as their pre-requisites are met. Typically this means you pass in
the appropriate *_VERSION
build arguments, and you pass in an EASY_INFRA_TAG
build argument that maps to a seiso/easy_infra_base tag locally.
For example, a command like the following should work when run from the build
directory if seiso/easy_infra_base:2022.11.06-terraform-943a052 is
available locally:
docker build -t ansible-test --build-arg ANSIBLE_VERSION=2.9.6+dfsg-1 --build-arg EASY_INFRA_TAG=2022.11.06-terraform-943a052 . -f
Dockerfile.ansible
All build/Dockerfrag*
files cannot be built individually and are only fragments of an image specification. They are meant to be layered on top of
their respective Dockerfile
.
functions.sh and functions.j2¶
functions.j2
is a Jinja2 template, which is rendered into a functions
script, and then copied into each easy_infra
image at build time. This all
works based on the combination of this /functions.sh
file existing inside of the container, commands being run from within a shell (whether or not you
specify bash -c
or not when running a container), and the BASH_ENV
environment variable pointing to /functions.sh
. The way that we ensure that all
commands are run inside a shell is by using "$@"
in the easy_infra
image entrypoint
of docker-entrypoint.sh
.
Because BASH_ENV
will ensure that /functions.sh
is loaded into the shell at initialization, and /functions.sh
contains functions which match the
name of tools which we are protecting, we can use those functions to perform security scans, arbitrary hooks, and logging prior to executing the original
command.
Ultimately, this means that when you run terraform
(or some other properly defined package in easy_infra.yml) inside of easy_infra
, it will actually
run the function “terraform”, which will run the security scans, hooks, and logging, and only after evaluating the precursor logic will it run command
terraform
which runs the terraform
binary from the PATH
.
Marking Git Directories Safe¶
As of git version 2.35.2, git safe.directory was added to mitigate CVE-2022-24765; however, support for configuration of safe directories via environment variables was not added until 2.38.0. We decided using environment variables was best way to do this, because it is the most dynamic.
Mounted directories must be considered “safe” by git, in order for logging to function properly. When mounting a .git folder into the container, the following variables work together to flag it as a safe directory for git:
export GIT_CONFIG_COUNT=1
export GIT_CONFIG_KEY_0="safe.directory"
GIT_CONFIG_VALUE_0="$(git rev-parse --show-toplevel 2>/dev/null || echo /iac)"
export GIT_CONFIG_VALUE_0
We set all of these vars once immediately after setting the hooks and then refresh the GIT_CONFIG_VALUE_0 for each iteration of the dirs loop
to accommodate AUTODETECT=True
.
Internal naming¶
Tool: An executable file in the easy_infra and root user’s
PATH
which perform IaC actions and has an associated security tool, as described in the easy_infra.yml used when building the image.Security tool: An executable file in the easy_infra and root user’s
PATH
which is configured to perform a security scan for an associated “tool” (see above), as configured in theeasy_infra.yml
file used to build the image.Package: The name of a package that can be installed to perform a necessary function. It could be a tool, a security tool, or a generic helper such as
fluent-bit
orenvconsul
.Command: A runtime command, following the use of the term by bash (see this documentation). This could be an alias, a package, or some other executable on the user’s
PATH
.Alias: An executable file in the easy_infra and root user’s
PATH
which executes the installed by a package. Whileaws-cli
would be a package,aws
would be the associated alias.Environment: A supported destination that a tool (see above) may deploy into, such as a cloud provider. An environment constitutes a bundle of packages.
Scan-only functions¶
In addition to the Alias and Package functions which are generated, scan_
functions are also created to allow you to run only the security scans (and
related hooks) for a given Package or Alias. So, for instance, if you have a Package of terraform
, the related scan function would be scan_terraform
.
Whereas, if you have a Package of ansible
which has Aliases of ansible
and ansible-playbook
, you will get scan functions of scan_ansible
and
scan_ansible-playbook
.
High-Level Design of the image build process¶
When building the easy_infra
images, the high level design is that files in the build/
directory are composed together using tasks.py
to
create multiple final container images for various use cases. Those use cases are primarily based around the use of an IaC “tool” (i.e. terraform
or ansible
), and an associated set of “security tools” (i.e. checkov
or kics
) which will run transparently when the IaC tool is used
inside of a container. There are also sometimes optional “environment” (i.e. aws
or azure
) images which add environment-specific helpers or
tools, based on the tool that the image focuses on.
There are two general types of files in build/
; Dockerfile*
and Dockerfrag*
.
All Dockerfile*
files should be able to be built and tested independently, and are effectively the “install” step of building the easy_infra
images. It is possible that an easy_infra
Dockerfile
may only contain a FROM
statement, if we are using a container built and distributed
by the upstream project. Dockerfile
suffixes MUST also be the same as a given package
as outlined in the easy_infra.yml
(aliases are not
supported), with the single exception of Dockerfile.base
(for example, the terraform
package’s Dockerfile
must be
Dockerfile.terraform
).
All Dockerfrag*
files should not be built and tested independently, as they are solely fragments which depend on the related Dockerfile
. For
instance, Dockerfrag.terraform
is meant to build on top of Dockerfile.terraform
. The contents of a Dockerfrag
often hinge around running
COPY
commands to pull files from the Dockerfile
. This model allows us to create extremely minimal final images with limited bloat and
consideration of extraneous packages or dependencies which are only needed at build time.
In order for a Dockerfile
and a Dockerfrag
to be “linked” together, they must share the same suffix. For example, Dockerfrag.abc
should
build on top of Dockerfile.abc
, and it is both expected that in Dockerfrag.abc
it copies files using COPY --from=abc ...
, and that in
Dockerfile.abc
the FROM
statement ends with ... as abc
.
Runtime user support¶
By default, easy_infra
runs as the easy_infra
user and should be fully functional, however we also support the root
user due to various file system
permission issues that often occur in pipelines when running as non-root users. Where possible, the easy_infra
user should be used due to the security risks
of running containers as root
.
Adding to the project¶
Adding a tool¶
Add the package to
easy-infra.yml
underpackages
and include a validsecurity
,version
, andversion_argument
section. Consider other optional configurations as they apply (see easy_infra.yml for more details).Modify
docker-entrypoint.sh
to print the tool version if the correct binary exists inside of the container.Create a
Dockerfile.{tool}
andDockerfrag.{tool}
in thebuild/
directory.You may need to add the tool name or any aliases in
.github/etc/dictionary.txt
if it is not a standard English word, assuming it is used in documentation.Create a new folder in
docs/
and add documentation regarding the tool. Reference the new docs in thetoctree
ofdocs/index.rst
in line with the othertoctree
entries.Consider developing any specialized hooks, using the hooks framework.
Write tests in
tests/test.py
by creating a new function namedrun_{tool}
and following the pattern that otherrun_*
functions follow by creating a list of 3-tuple tests, and then using theexec_tests
function to perform the tests and return the number of tests that were successfully run, logging the amount and type of tests performed at the end of the function.Add a folder under
tests/
aligned to the tool name, and create a variety of different configuration files that will be referenced by the tests intests/test.py
. Ensure that there are:invalid
andsecure
folders containing aligned configuration files, typically undertests/{tool}/general/
.At least one
security_tool/{security_tool}
folder undertests/{tool}
containing insecure code.If you developed hooks which register to the tool, create a
tests/{tool}/hooks/
directory, containing a variety of folders that exercise those built-in hooks.
Identify how the latest released version of the tool (either the “package” or tool name under the package) can be retrieved. Ensure that the
update
function intasks.py
will retrieve the latest version appropriately. You may be able to use some of the existing mechanisms (such as usingapt
, GitHub repo releases, GitHub repo tags, python package versions, etc.) which are maintained ineasy_infra/constants.py
and whose update functions exist ineasy_infra/utils.py
(see theget_latest_release_from_*
functions).
Note
If you need any special configuration at build time specific to the combination of a tool and an environment, you can create a
Dockerfile.{tool}-{environment}
and Dockerfrag.{tool}-{environment}
. These are entirely optional.