4. I’m Andrey
● Enjoying life as a technology specialist, father and endurance athlete
● 10+ years in the industry
● Writing tools
● Fixing automation, projects and organisations
● Meetups/conferences organizer
● Trainer
● Certified this and that
5. Why this presentation? What to expect?
● Not pretending to be an expert, just sharing what worked and what didn’t
● Share what worked for us and hopefully save some time for some of you
● A lot of technical details and references
● Slides will be available online, so you don’t have to remember or photograph everything
7. Why Vault?
● Centrally manage secrets to reduce secrets sprawl
● Shift from static secrets to short-lived, dynamically generated ones
● Protect sensitive data across clouds and private datacenters
8. Where do we start?
Collect requirements and clarify context
9. Questions to ask - deployment and operations
● Where to deploy? VM? Container? Bare metal?
● Patch or scratch?
● How to access? VPN? Public? Service mesh?
● How to auto-unseal?
● How to get the initial secrets in? (e.g. TLS certificates)
● What storage is available?
● Where to stream logs?
● Where to stream telemetry?
● How to extract audit files?
● HA? DR?
● One per env or one for all?
10. Look for best practices and templates
● Why not make use of https://github.com/hashicorp/terraform-aws-vault, right?
● With some small tweaks
● And a few more tweaks
● And some more tweaks
(TLS certs handling, consul-less, dynamic configuration depending on environment,
rolling upgrades via multiple auto scaling configurations, audit files sync, etc)
Another option - https://github.com/hashicorp/vault-helm
Some inspiration
https://learn.hashicorp.com/vault/operations/ops-reference-architecture
11. Vault production (min) readiness checklist
● TLS termination
● Vault HA storage - ACL and encryption
● Local storage encryption
● Auto-unseal using KMS
● Stripped-down image, infra as code, encryption, minimal exec rights
● No SSH or other kind of remote access, NACL for outgoing traffic
● IDS
● Backups and DR
● Logs and telemetry export from the node
● Audit on, sync audit files to remote storage, integrity check for audit files
● Sync audit files to archive
● Audit files parsing and anomaly detection
● Availability/performance monitoring and alerting
More here https://learn.hashicorp.com/vault/operations/production-hardening
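The "Audit on" item can itself be expressed in the configuration Terraform spec. This is only a sketch using the Vault provider's vault_audit resource; the log path is a placeholder, and syncing, archiving and integrity checking of the file still have to happen outside Vault.

```hcl
# Sketch: enable a file audit device.
# The path below is hypothetical; pick a location that your
# log-shipping job is allowed to read.
resource "vault_audit" "file" {
  type = "file"

  options = {
    file_path = "/var/log/vault/audit.log"
  }
}
```

Once an audit device is enabled, Vault refuses requests it cannot record on at least one enabled device, so disk space and permissions for this path deserve monitoring.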
12. Context - before Vault
● Applications running in containers
● Orchestrated by Kubernetes
● Running in AWS
● Secrets in configmaps/secrets
● Apps require database connections and connections to other cloud services (i.e. database creds and cloud access creds), plus other static secrets
● Developers pulling secrets from k8s secrets
13. Deployment
● EC2, classic ELB, auto-scaling group per AZ (for rolling updates)
● Immutable infra
● Initial secrets baked in as an encrypted archive that can be decrypted only with a key accessible to Vault instances
● Behind VPN
● Auto-unseal with KMS
● HA storage in DynamoDB with point in time restore
● Logs in CloudWatch
● Telemetry in Prometheus
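The KMS auto-unseal, DynamoDB HA storage and Prometheus telemetry choices above map onto a Vault server configuration roughly like the sketch below. Region, table name, KMS key ID, certificate paths and hostname are all placeholders, not the values used in this deployment.

```hcl
# Vault server configuration sketch (all values are hypothetical).

# HA storage in DynamoDB
storage "dynamodb" {
  ha_enabled = "true"
  region     = "eu-west-1"
  table      = "vault-data"
}

# Auto-unseal with AWS KMS
seal "awskms" {
  region     = "eu-west-1"
  kms_key_id = "00000000-0000-0000-0000-000000000000"
}

# TLS terminated on the Vault node itself
listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/etc/vault/tls/vault.crt"
  tls_key_file  = "/etc/vault/tls/vault.key"
}

# Keep metrics in memory so Prometheus can scrape them
telemetry {
  prometheus_retention_time = "30s"
  disable_hostname          = true
}

api_addr = "https://vault.example.internal:8200"
```

Point-in-time restore is a property of the DynamoDB table, so it belongs in the deployment Terraform spec rather than in this file.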
14. It would be a good idea to split the deployment Terraform spec and the configuration Terraform spec
16. To start configuring Vault via Terraform we need...
● Vault URL configured as VAULT_ADDR env variable
● Vault token (a root token will do for a start, but revoke it afterwards together with the rest of the root tokens)
● A good idea of what you are after…
More here https://www.youtube.com/watch?v=fOybhcbuxJ0 and here
https://www.terraform.io/docs/providers/vault/index.html
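With those prerequisites in place, the configuration spec can start as small as the sketch below. It assumes the token is exported as VAULT_TOKEN next to VAULT_ADDR, which the Vault provider reads when address and token are not set explicitly. The mount path is a placeholder.

```hcl
# Sketch: minimal starting point for the configuration spec.
# address and token are intentionally not set here; the provider
# picks them up from VAULT_ADDR and VAULT_TOKEN.
provider "vault" {}

# A first resource: mount the database secrets engine.
# The path "database" is hypothetical.
resource "vault_mount" "db" {
  path        = "database"
  type        = "database"
  description = "Dynamic database credentials"
}
```

Keeping the token out of the .tf files means it never lands in version control, although anything Terraform writes to Vault still ends up in state.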
17. One slide Vault intro
[Diagram: Vault overview]
Auth methods: LDAP, k8s, AppRole, AWS, ...
Vault token
Vault policies
Secret engines: AWS, Database, RabbitMQ, PKI, KV, Transit
Secrets returned: database login credentials, AWS access keys, RabbitMQ login credentials, certificates, secret value (KV), encrypted data (Transit)
Lease
Audit device
More here https://www.youtube.com/watch?v=VYfl-DpZ5wM
20. You probably need more than one...
● Humans - operators and developers
● Machines - CI/CD, bots, etc
● Things - Apps, Infra etc
It is a good idea to use MFA for humans and to limit where auth methods can be invoked from
21. Auth -> Role -> Token with policy
Example:
LDAP -> LDAP Group -> Token with policy
22. LDAP
● Leverages existing IAM setup
● Delegates credentials validation
● Used by humans
● It is a good idea to simplify the login procedure for your users
More here https://www.vaultproject.io/docs/auth/ldap.html
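The LDAP -> LDAP group -> token with policy chain can be sketched with the Vault provider as follows. The server URL, DNs, group name and policy name are made up for illustration, and a real setup will usually also need bind credentials and TLS settings.

```hcl
# Sketch: LDAP auth method plus one group-to-policy mapping.
# All names and DNs below are hypothetical.
resource "vault_ldap_auth_backend" "ldap" {
  path      = "ldap"
  url       = "ldaps://ldap.example.com"
  userdn    = "ou=Users,dc=example,dc=com"
  userattr  = "uid"
  groupdn   = "ou=Groups,dc=example,dc=com"
  groupattr = "cn"
}

# Members of the LDAP group "operators" receive tokens
# carrying the Vault policy "operators".
resource "vault_ldap_auth_backend_group" "operators" {
  backend   = "${vault_ldap_auth_backend.ldap.path}"
  groupname = "operators"
  policies  = ["operators"]
}
```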
26. Policy
data "vault_policy_document" "example" {
  rule {
    path         = "secret/*"
    capabilities = ["create", "read", "update", "delete", "list"]
    description  = "allow all on secrets"
  }
}

resource "vault_policy" "example" {
  name   = "example_policy"
  policy = "${data.vault_policy_document.example.hcl}"
}
https://www.terraform.io/docs/providers/vault/d/policy_document.html
27. Policy
● You will need a policy to manage policy...
● Deny by default
● Policy names do not have to match LDAP group names, but it is easier for users if they do
● Member of multiple groups gets multiple policies
More here https://learn.hashicorp.com/vault/getting-started/policies
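A "policy to manage policy" might look like the sketch below, which would let a non-root token run the configuration spec for policies. The policy name is hypothetical. Both the legacy sys/policy and the newer sys/policies/acl paths are listed because which one is called depends on the client version.

```hcl
# Sketch: allow the holder to manage ACL policies.
# Everything not listed here stays denied by default.
data "vault_policy_document" "policy_admin" {
  rule {
    path         = "sys/policies/acl/*"
    capabilities = ["create", "read", "update", "delete", "list"]
    description  = "manage ACL policies"
  }

  rule {
    path         = "sys/policy/*"
    capabilities = ["create", "read", "update", "delete", "list"]
    description  = "manage ACL policies (legacy endpoint)"
  }
}

resource "vault_policy" "policy_admin" {
  name   = "policy_admin"
  policy = "${data.vault_policy_document.policy_admin.hcl}"
}
```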
28. AppRole if you really have to...
● If you don’t have a better way
● Mostly used for CI
● The initial secret still has to be delivered somehow
● No good way to audit access
More here https://www.vaultproject.io/docs/auth/approle.html
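If AppRole is unavoidable, the damage a leaked secret ID can do can at least be bounded. The sketch below assumes a reasonably recent Vault provider (older releases use policies instead of token_policies); the role name, policy name, TTLs and CIDR range are placeholders.

```hcl
# Sketch: a tightly scoped AppRole for a CI system.
resource "vault_auth_backend" "approle" {
  type = "approle"
}

resource "vault_approle_auth_backend_role" "ci" {
  backend        = "${vault_auth_backend.approle.path}"
  role_name      = "ci"
  token_policies = ["ci"]

  # Short-lived tokens
  token_ttl     = 900
  token_max_ttl = 1800

  # Secret IDs expire quickly and can be used only once
  secret_id_ttl      = 600
  secret_id_num_uses = 1

  # Limit where login and token use may come from
  secret_id_bound_cidrs = ["10.0.0.0/16"]
  token_bound_cidrs     = ["10.0.0.0/16"]
}
```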
54. Secrets rotation
# Find every mount whose type starts with "database"
DB_SECRET_ENGINE_MOUNTS=$(vault secrets list -format=json | jq -r 'to_entries[] | select(.value.type | startswith("database")) | .key')
for DB_SECRET_ENGINE_MOUNT in ${DB_SECRET_ENGINE_MOUNTS}; do
  # Mount keys end with "/", so "config" can be appended directly
  DB_CONNECTION_NAMES=$(vault list -format=json "${DB_SECRET_ENGINE_MOUNT}config" | jq --raw-output '.[]')
  for DB_CONNECTION_NAME in ${DB_CONNECTION_NAMES}; do
    # Rotate the root credentials of each configured connection
    vault write -force "${DB_SECRET_ENGINE_MOUNT}rotate-root/${DB_CONNECTION_NAME}"
  done
done
55. Secrets rotation
# Delete all access keys of IAM users whose names start with "vault-aws-"
AWS_USERS=$(aws iam list-users --query "Users[?starts_with(UserName, 'vault-aws-')].UserName" --output text)
for AWS_USER in ${AWS_USERS}; do
  KEYS_ID=$(aws iam list-access-keys --user-name "${AWS_USER}" --query "AccessKeyMetadata[*].AccessKeyId" --output text)
  for KEY_ID in ${KEYS_ID}; do
    aws iam delete-access-key --access-key-id "${KEY_ID}" --user-name "${AWS_USER}"
  done
done
# Re-run Terraform afterwards
terraform apply
Note! Keys are still in Terraform state - encrypt state storage and state itself!
57. KV state issue
● The Terraform provider for Vault in some cases (?) does not re-read KV, so newly added values are not readable/found
● Workaround: terraform state rm data-source
data "vault_generic_secret" "rundeck_auth" {
  path = "secret/rundeck_auth"
}

provider "rundeck" {
  url        = "http://rundeck.example.com/"
  auth_token = "${data.vault_generic_secret.rundeck_auth.data["auth_token"]}"
}
58. Vault and Terraform do not always play together
resource "vault_database_secret_backend_connection" "postgres" {
  count         = "${var.enable_postgresql}"
  backend       = "${vault_mount.db.path}"
  name          = "${var.postgresql_db_name}"
  allowed_roles = ["${var.postgresql_role_name}", "${local.read_only_role_name}", "${local.admin_role_name}"]

  data = {
    username = "${var.postgresql_db_username}"
    password = "${var.postgresql_db_username_password}"
  }

  postgresql {
    connection_url       = "postgres://{{username}}:{{password}}@${var.postgresql_db_endpoint}:${var.postgresql_db_port}/${var.postgresql_db_name}"
    max_open_connections = "${var.postgresql_max_open_connections}"
  }

  lifecycle {
    ignore_changes = ["data.password"]
  }
}
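The connection above only tells Vault how to reach the database; credentials are issued through the roles named in allowed_roles. A read-only role could look like the sketch below. The SQL statements and TTLs are illustrative assumptions, and older provider versions expect creation_statements as a single string rather than a list.

```hcl
# Sketch: a role that issues short-lived, read-only PostgreSQL users.
resource "vault_database_secret_backend_role" "read_only" {
  backend = "${vault_mount.db.path}"
  name    = "${local.read_only_role_name}"
  db_name = "${var.postgresql_db_name}"

  # Vault fills in {{name}}, {{password}} and {{expiration}}
  creation_statements = [
    "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';",
    "GRANT SELECT ON ALL TABLES IN SCHEMA public TO \"{{name}}\";",
  ]

  default_ttl = 3600
  max_ttl     = 14400
}
```

Because the root password is rotated by Vault after the first apply, the ignore_changes on data.password keeps Terraform from trying to write the old value back.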
59. Vault and Terraform do not always play together
resource "aws_iam_access_key" "key" {
  user = "${aws_iam_user.user.name}"
}

resource "aws_iam_user_policy" "policy" {
  name   = "Allow-Vault-to-create-temp-users"
  user   = "${aws_iam_user.user.name}"
  policy = "${data.aws_iam_policy_document.document.json}"
}

resource "vault_aws_secret_backend" "aws" {
  description               = "AWS secret engine for operators to get temporary keys"
  path                      = "$humans-aws"
  region                    = "${data.aws_region.r.name}"
  access_key                = "${aws_iam_access_key.key.id}"
  secret_key                = "${aws_iam_access_key.key.secret}"
  default_lease_ttl_seconds = "28800"
  max_lease_ttl_seconds     = "86400"
}
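Operators get temporary keys through a role on this backend. The sketch below assumes the iam_user credential type and an AWS managed policy picked purely for illustration; the role name is hypothetical.

```hcl
# Sketch: a role on the AWS secrets engine defined above.
# Each read of creds/operator creates a temporary IAM user
# with the listed policy attached.
resource "vault_aws_secret_backend_role" "operator" {
  backend         = "${vault_aws_secret_backend.aws.path}"
  name            = "operator"
  credential_type = "iam_user"
  policy_arns     = ["arn:aws:iam::aws:policy/ReadOnlyAccess"]
}
```

The access key and secret key passed to the backend are stored in Terraform state in plain text, which is why the state and its storage need encryption.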