Detecting secrets in code committed to GitLab (in real time)

Chandrapal Badshah

About Me
● Chandrapal Badshah
● Security Engineer
● Stoic and spends time with philosophy
● Pentest, Automation, Read books
● Manage @HackwithGithub on Twitter

Context
● Product-based company: fail fast, learn fast
● Hires a lot of devs*
● Uses GitLab Community Edition for code storage and CI/CD
● We audit the code for secrets at regular intervals, but by then it is too late

Problem Statement
Need to detect and remove sensitive API keys (secrets) from code
This would reduce the impact when:
● Devs make an internal repo public
● Devs push commits to their personal GitHub repos by mistake
● Unauthorized members gain access to the code (insider threat)

This would help us in situations like
Source : https://www.bleepingcomputer.com/news/security/microsofts-github-account-hacked-private-repositories-stolen/

Git flow
→ git commit → git push →

Git hooks
● Git hooks are scripts that git executes before or after events such as:
commit, push, and receive
● Git hooks are a built-in feature - no need to download anything.
● There are many types of git hooks. Check out https://githooks.com/
● We are interested in commit and receive based hooks:
○ pre-commit
○ post-commit
○ pre-receive
○ post-receive

Git hooks in the flow
Source: https://blog.gitguardian.com/git-hooks-automated-secrets-detection/

Comparison of Git hooks
Pre-commit and post-commit hooks - run the scripts on dev machines.
Advantages:
● Stops secrets even before they are committed
Disadvantages:
● Adding new regexes and managing the script on dev machines is hard
● False positives make for a bad user experience
● Privacy issues? Also, nothing stops devs from removing the git hooks

Pre-receive hook - it can't do many checks, as the pushed code has not yet been accepted by the server.
There is also a pre-push hook, which executes before the pre-receive hook runs on the server side. But the pre-push hook is still on the client side.

Post-receive hook - runs on the server side.
Advantages:
● Can be configured for no delay when a user does a git push. Devs don’t really see the difference.
● Easy to manage the scripts
● False positives are manageable
Disadvantages:
● The secrets are already on the server

Final Decision
Go with the use of post-receive hooks.
If a secret is detected:
● automatically raise a confidential GitLab issue in the repo
● get feedback - check if it’s a false positive
● if it’s a secret, ask the devs to rotate the secret
Post-receive hooks should be configured per repository
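
Raising the confidential issue can be done through GitLab's REST API. The sketch below is only an illustration, not the code used in this talk: it builds (but does not send) a `POST /api/v4/projects/:id/issues` request with `confidential` set to true. The function name `build_issue_request`, the host `gitlab.example.com` and the wording of the issue text are assumptions.

```python
import json
import urllib.request


def build_issue_request(base_url, project_id, token, commit_sha, file_path, rule_name):
    """Build a request that opens a confidential issue for a suspected secret.

    Hypothetical helper: the title and description wording are illustrative.
    """
    payload = {
        "title": f"Possible secret detected ({rule_name})",
        "description": (
            f"A possible secret was found in `{file_path}` at commit {commit_sha}.\n\n"
            "Please confirm whether this is a false positive. "
            "If it is a real secret, rotate it."
        ),
        # Confidential issues are hidden from users without sufficient project access.
        "confidential": True,
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v4/projects/{project_id}/issues",
        data=json.dumps(payload).encode("utf-8"),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would create the issue. The description deliberately points at the file and commit rather than repeating the secret itself, so the issue does not become a second copy of it.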

GitLab feature to help post-receive hooks
● GitLab has System hooks
● GitLab system hooks send an HTTP POST request for many events such as push, group create, repo create, etc.
● More details at
https://docs.gitlab.com/ee/system_hooks/system_hooks.html
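
A receiver for these POST requests can be quite small. The following is a rough sketch, assuming the `X-Gitlab-Event: System Hook` header and the `event_name`, `project_id`, `before`, `after` and `ref` fields of the push payload as described in the GitLab docs; verify them against the GitLab version in use. `parse_push_event`, `HookHandler` and `SCAN_QUEUE` are hypothetical names.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler

# Work queue drained by a separate scanner worker (not shown).
SCAN_QUEUE = queue.Queue()


def parse_push_event(headers, body):
    """Return the fields a scanner needs from a push event, or None for anything else."""
    if headers.get("X-Gitlab-Event") != "System Hook":
        return None
    event = json.loads(body)
    if event.get("event_name") != "push":
        return None  # ignore group create, repo create, etc.
    return {
        "project_id": event["project_id"],
        "before": event["before"],
        "after": event["after"],
        "ref": event["ref"],
    }


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        push = parse_push_event(self.headers, self.rfile.read(length))
        if push is not None:
            SCAN_QUEUE.put(push)  # scan later, so the hook call returns quickly
        self.send_response(200)
        self.end_headers()
```

Starting it with `http.server.HTTPServer(("127.0.0.1", 8080), HookHandler).serve_forever()` (address and port are placeholders) gives GitLab a URL to call. Queueing the event instead of scanning inside the request keeps the hook call fast.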

Existing secret detection tools
There are lots of open source tools:
● truffleHog
● gitleaks
● git-secrets by AWS Labs
● detect-secrets by Yelp
● talisman by ThoughtWorks
● and more...

TruffleHog
● Python based tool
● Customizable regex
● Easy install and CLI commands
● Good documentation
● https://github.com/dxa4481/truffleHog

Gitleaks
● Written in Golang
● Customizable regex
● Supports whitelisting of secrets
● Lots of options in CLI commands, lacks documentation
● Allows scanning a single commit, but downloads the entire repo
● https://github.com/zricethezav/gitleaks

Comparison of truffleHog and gitleaks
truffleHog
1. Efficient for smaller commits
2. Less memory intensive
3. After configuring with GitLab system hooks, the total time taken to complete scanning was less.
gitleaks
1. Same time as truffleHog for smaller commits. Comparatively fast for huge commits.
2. Very greedy for CPU and memory
3. After configuring with GitLab system hooks, the total time taken to complete scanning was less, but at the cost of CPU and memory.

Changes made
● Took all the necessary code from truffleHog and stripped the rest. We internally call it “tattletale-rt”.
● The scan logic is as follows:
○ Get the code changes in the commit (only the added content, not the removed)
○ Get all the regexes we need to scan
○ For each line in the code change, check if any regex matches
○ If it matches, report it
● Have a separate service called “Issue Manager” which manages issues.
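
The scan logic above can be sketched roughly as follows. This is not the tattletale-rt code; it is a minimal illustration that assumes the commit's changes arrive as unified diff text (for example from `git show`). The two patterns and the name `scan_diff` are illustrative only.

```python
import re

# Illustrative patterns only; a real rule set would be much larger.
PATTERNS = {
    "AWS access key ID": re.compile(r"AKIA[0-9A-Z]{16}"),
    "Private key header": re.compile(r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"),
}


def scan_diff(diff_text, patterns=PATTERNS):
    """Return (rule name, file path, line text) for every added line that matches."""
    findings = []
    current_file = None
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            in_hunk = False  # a new file section starts with headers
        elif not in_hunk and line.startswith("+++ "):
            path = line[4:]  # header such as "+++ b/config.py"
            current_file = path[2:] if path.startswith("b/") else path
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and line.startswith("+"):
            added = line[1:]  # only added content is scanned, not removed lines
            for name, regex in patterns.items():
                if regex.search(added):
                    findings.append((name, current_file, added))
    return findings
```

Each finding would then be handed to the issue-managing service. Tracking whether the loop is inside a hunk avoids mistaking an added line that happens to begin with `++ ` for a file header.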

Thanks to
Fahri Shihab
@fahrishb
Sanjog Panda
@sanjogpanda

What we learnt
● Not all API keys are sensitive. Google API keys are everywhere and are
intended to be public - Google Maps API key, Firebase key, etc
● Deployments are different for each project - no “one solution” that fits all
● This detection is regex based. API keys / secrets will not be detected if:
○ The API key doesn't match the regex
○ The secrets are in a different language. пароль (parol’) is “password” in Russian.
● Entropy based detection is noisy but can detect some secrets.
● Learn the secure way to store secrets for each tech stack.
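
For reference, entropy based detection usually means computing the Shannon entropy of each candidate token and flagging long tokens that look random. A minimal sketch follows; the length and threshold values are illustrative assumptions, not tuned settings from this project.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


def looks_random(token, min_length=20, threshold=4.0):
    """Flag long, high-entropy tokens; both limits are assumptions for illustration."""
    return len(token) >= min_length and shannon_entropy(token) > threshold
```

Hashes, minified assets and UUIDs also score high, which is where the noise comes from.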

What are we working on now?
Follow on Twitter to get more updates on:
● Mobile App Security Pipeline (Android & iOS)
● SAST
