
Making Spinnaker Go @ Stitch Fix

A talk I gave at the recent Advanced AWS Meetup. This is a detailed guide to how I installed and set up Spinnaker to work with our infrastructure at Stitch Fix. I go over the various problems I ran into and how I solved them. I hope this is useful for others who are setting up Spinnaker, or are interested in doing so.

**Big thanks to Armory for recording the talks! Video for this talk can be found here: https://youtu.be/ywzPblFpIE0 (I'm the second speaker)**


  1. Making Spinnaker Go @ Stitch Fix. Diana Tkachenko, Data Platform Engineer
  2. Spinnaker Is Not Yet in Production. Let me tell you an awesome story of how to install and set up Spinnaker to make it work for you!
  3. I. Our Infrastructure; II. Setting Up Spinnaker; III. Authentication on Spinnaker
  4. PART I: Our Infrastructure Pre-Spinnaker
  5. 100% of Infrastructure on AWS: 3 Peered VPCs. Isolate environments into different VPCs:
      ● TEST
        ○ testing deployments before pushing to prod
      ● PROD
        ○ all production deployments
      ● INFRA
        ○ tools that both prod and test need to use
      (Diagram labels: prod, test, infra; jenkins, artifactory, spinnaker, flotilla.)
  6. Deployment Pipeline: Immutable Server Pattern
      ● Package Code into RPMs
      ● Bake AMI from RPM
      ● Deploy
        ○ Set up Launch Config with AMI
        ○ Create ASG
        ○ Set up ELBs, Route53
  7. Process Overview
      ● Definition of Application (to create an application, this would be the one-time setup):
        ○ create ELB
        ○ create Route53 (app “scaffolding” on AWS; Route53 points to ELB)
        ○ create spec (RPM built from this recipe)
      ● Repeatable Deployment Process (iterative process for deploying new versions):
        ○ make changes to code
        ○ build RPM
        ○ bake AMI
        ○ launch ASG (attach to ELB)
  8. Step 1: Build RPM from Spec. Wrote up simple tools to create the RPM:
      ● Create spec file from template
      ● Customize spec file
      ● Jenkins job to build RPM
      The process appears complex:
      ● The spec file seems scary for user
      ● But it makes deployment easy down the line!
      sf-helloworld.spec:

          Name: sf-helloworld
          Version: 0.0.1
          Release: 1
          Summary: YOUR SUMMARY HERE!
          Group: Development/Libraries
          License: stitchfix-internal
          BuildArch: noarch
          AutoReqProv: no
          BuildRequires:
          Requires: sf-base, sf-aa, sf-nginx

          %install
          mkdir -p $RPM_BUILD_ROOT{/stitchfix,/etc/init.d}
          cp -R %{_sourcedir} $RPM_BUILD_ROOT/stitchfix/%{base_name}
          cp %{_topdir}/SCRIPTS/sf-%{base_name} $RPM_BUILD_ROOT/etc/init.d/sf-%{base_name}

          %files
          /stitchfix/%{base_name}
          /etc/init.d/sf-%{base_name}

          %post
          ln -s /etc/nginx/sites-available/sf-app.conf /etc/nginx/sites-enabled/sf-app.conf
          /usr/bin/pip-2.7 install -e /stitchfix/%{base_name}
          chkconfig --add %{name}
          chkconfig --levels 345 %{name} on
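The "create spec file from template" step could look roughly like the sketch below. It is not the tool used at Stitch Fix: the template text, the `render_spec` helper, and its parameters are hypothetical, and only a trimmed-down header modeled on sf-helloworld.spec is rendered.

```python
from string import Template

# Hypothetical, trimmed-down template. The field names mirror the header of
# the sf-helloworld.spec file shown on the slide; the real template is not
# part of the talk.
SPEC_TEMPLATE = Template(
    "Name: sf-$app\n"
    "Version: $version\n"
    "Release: $release\n"
    "Summary: $summary\n"
    "BuildArch: noarch\n"
    "Requires: $requires\n"
)


def render_spec(app, version, summary, requires, release=1):
    """Fill in the spec header for one application."""
    return SPEC_TEMPLATE.substitute(
        app=app,
        version=version,
        release=release,
        summary=summary,
        requires=", ".join(requires),
    )
```

Calling `render_spec("helloworld", "0.0.1", "YOUR SUMMARY HERE!", ["sf-base", "sf-aa", "sf-nginx"])` produces a header starting with `Name: sf-helloworld`, which the user would then customize before the Jenkins job builds the RPM.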
  9. Step 2: Bake AMI
      ● Used aminator (also from Netflix) to create AMIs
      ● Jenkins job for baking
      How does AMI get baked?
      1. Create volume from base AMI id
      2. Attach and mount volume
      3. Chroot into volume
      4. Install RPM on volume
      5. Create snapshot from volume
      6. Register AMI from snapshot
      (Diagram: the EC2 instance (baking machine) gets the RPM from Artifactory (RPM repo) and installs the RPM on the volume.)
  10. Step 3: Deploy
      (Diagram: the RPM is baked into the AMI; the AMI and Launch Config are both used to create the EC2 instances (immutable servers) in the ASG; internet traffic arrives via Route53 and the ELB routes traffic to the instances.)
  11. Why Spinnaker? 80 Data Scientists, 10 Platform Engineers. Our data scientists are responsible for:
      ● Building ETLs
      ● Deploying Dashboards and Services
      We value self service!
  12. PART II: Setting Up Spinnaker In Our Infrastructure
  13. Key Differences from the Netflix Setup (and how to handle them)
      1. Amazon Linux instead of Ubuntu
         a. Adding RPM support to Gradle
         b. System V instead of Upstart
      2. Nginx instead of Apache
      3. Secured Redis on AWS
      4. No Cassandra in Existing Architecture
  14. Diff #1: You drew the short straw with Amazon Linux (Red Hat) instead of Ubuntu
  15. Adding RPM Support to Gradle. Create the buildRpm block:
      ● add our rpm repo in /etc/yum.repos.d on bake machine
      ● add dependency rpms inside the block
      ● make sure to build all the other spinnaker rpms and push to your rpm repo
      Build with: ./gradlew buildRpm
      [spinnaker] build.gradle:

          // Ubuntu
          buildDeb {
              requires('redis-server', '3.0.5', GREATER | EQUAL)
              requires('spinnaker-clouddriver')
              requires('spinnaker-deck')
              requires('spinnaker-echo')
              requires('spinnaker-front50')
              requires('spinnaker-gate')
              requires('spinnaker-igor')
              requires('spinnaker-orca')
              requires('spinnaker-rosco')
              requires('spinnaker-rush')
              requires('apache2')
          }

          // Centos
          buildRpm {
              requires('sf-nginx')
              requires('sf-base')
              requires('spinnaker-clouddriver')
              requires('spinnaker-deck')
              requires('spinnaker-echo')
              requires('spinnaker-front50')
              requires('spinnaker-gate')
              requires('spinnaker-igor')
              requires('spinnaker-orca')
              requires('spinnaker-rosco')
              requires('spinnaker-rush')
              os = LINUX // ⇐ YOU NEED THIS MAGIC LINE!
          }
  16. Upstart on Amazon Linux. Different startup systems:
      ● We use System V (ancient)
        ○ service nginx start
        ○ startup scripts in /etc/init.d
        ○ chkconfig for starting on bootup
      ● Spinnaker uses upstart
        ○ initctl start spinnaker
        ○ conf files in /etc/init
      Another issue:
      ● Amazon Linux ships upstart 0.6.5, which is way older than the 1.4 on Ubuntu
      [rosco] /etc/init/rosco.conf:

          description "rosco"
          start on filesystem or runlevel [2345]
          # not supported in old version
          # so for amazon linux we remove these lines:
          setuid spinnaker
          setgid spinnaker
          expect fork
          stop on stopping spinnaker
          env HOME=/home/spinnaker
          exec /opt/rosco/bin/rosco 2>&1 > /var/log/spinnaker/rosco/rosco.log &
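Removing the unsupported lines by hand for every service gets tedious, so the edit could be scripted. The sketch below is an assumption about how one might do it, not something shown in the talk; `strip_unsupported` and `UNSUPPORTED_STANZAS` are hypothetical names. It drops only `setuid` and `setgid`, the stanzas that were introduced in Upstart 1.4.

```python
# Stanzas that Upstart 0.6.5 does not understand (added in Upstart 1.4).
UNSUPPORTED_STANZAS = ("setuid", "setgid")


def strip_unsupported(conf_text):
    """Return the job file text without stanzas old Upstart rejects."""
    kept = []
    for line in conf_text.splitlines():
        words = line.split()
        # Skip lines whose first word is an unsupported stanza name.
        if words and words[0] in UNSUPPORTED_STANZAS:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

Note that once `setuid` and `setgid` are gone, the job runs as whichever user Upstart starts it as, so file ownership under /var/log/spinnaker may need a second look.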
  17. Diff #2: You’re hip and use Nginx instead of Apache
  18. Namespace Gate and Rosco in Nginx
      ● include /etc/nginx/sites-enabled in main nginx conf
      ● on deploy, symlink /etc/nginx/sites-available/spinnaker.conf => /etc/nginx/sites-enabled/spinnaker.conf
      [spinnaker] /etc/nginx/sites-available/spinnaker.conf:

          # all services on the same machine
          server {
              listen 80;
              location / {
                  root /opt/deck/html;
              }
              # namespacing gate
              location ~* ^/gate/ {
                  rewrite ^/gate/(.*) /$1 break;
                  proxy_pass http://localhost:8084;
              }
              # namespacing rosco
              location ~* ^/rosco/ {
                  rewrite ^/rosco/(.*) /$1 break;
                  proxy_pass http://localhost:8087;
              }
          }

      (Diagram: ELB for spinnaker.<internal-domain>.com, HTTP 80 ⇒ HTTP 80, in front of nginx on port 80 on the EC2 instance.)
      ● / => /opt/deck/html
      ● /gate/health => localhost:8084/health
      ● /rosco/health => localhost:8087/health
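To make the namespacing concrete, here is a small Python model of what the rewrite rules do to a request path. It only mimics the three `location` blocks; the `route` function and `UPSTREAMS` table are illustrative names, and the case-insensitive `~*` matching is left out for simplicity.

```python
import re

# Upstream addresses taken from the proxy_pass lines in spinnaker.conf.
UPSTREAMS = {
    "gate": "http://localhost:8084",
    "rosco": "http://localhost:8087",
}


def route(path):
    """Return where a request path ends up under the namespacing rules."""
    match = re.match(r"^/(gate|rosco)/(.*)", path)
    if match is None:
        # Everything else is served as static deck files.
        return "/opt/deck/html" + path
    service, rest = match.group(1), match.group(2)
    # The rewrite strips the namespace prefix before proxying.
    return UPSTREAMS[service] + "/" + rest
```

For example, `route("/gate/health")` gives `http://localhost:8084/health`, matching the mapping on the slide.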
  19. Diff #3: You happily use AWS Elasticache for Redis, but find out Spinnaker angers it
  20. AWS Elasticache is Special. AWS Redis won’t let you issue CONFIG commands!
      ● Redis version has to be >= 2.8.0
      ● On AWS elasticache console, add notify-keyspace-events=Egx to a new parameter group
        ○ this enables redis keyspace events for generic commands and expired events
      ● In gate.yml, add redis.configuration.secure=true
      [spinnaker] /config/gate.yml:

          server:
            port: ${services.gate.port:8084}
            address: ${services.gate.host:localhost}
          ...
          redis:
            connection: ${services.redis.connection}
            # add the following two lines if using aws redis
            configuration:
              secure: true

      (Diagram: AWS Redis 2.8.0 with a "spinnaker" parameter group, notify-keyspace-events=Egx.)
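The `Egx` value is a string of single-character flags defined by Redis. As a quick reference, the sketch below decodes such a string; the `describe` helper is a hypothetical name, while the flag meanings follow the Redis keyspace notifications documentation.

```python
# Flag characters accepted by the notify-keyspace-events setting.
EVENT_FLAGS = {
    "K": "keyspace events (published with the __keyspace@<db>__ prefix)",
    "E": "keyevent events (published with the __keyevent@<db>__ prefix)",
    "g": "generic commands such as DEL, EXPIRE, RENAME",
    "$": "string commands",
    "l": "list commands",
    "s": "set commands",
    "h": "hash commands",
    "z": "sorted set commands",
    "x": "expired events",
    "e": "evicted events",
}


def describe(flags):
    """Explain each flag in a notify-keyspace-events value, in order."""
    return [EVENT_FLAGS[flag] for flag in flags]
```

So `describe("Egx")` lists keyevent notifications for generic commands and for expired keys, which is what the parameter group on the slide turns on.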
  21. Diff #4: You’d like a quick Cassandra hack since you are Cassandra-less
  22. Quick EBS Backed Cassandra Node. Don’t want an entire cluster, want fast setup, so create single-node Cassandra:
      ● EBS backed store for cassandra data
      ● Startup script remaps route53 entry on each deployment
        ○ Point straight to EC2, not ELB
      On redeploy or termination:
      ● EBS detaches, so data is not lost
      ● cassandra.<internal-domain>.com mapped to new EC2
      (Diagram: Cassandra at cassandra.<internal-domain>.com with EBS storage at /cassandra-storage.)
      /etc/cassandra/conf/cassandra.yaml:

          # change all store dirs to EBS
          data_file_directories:
              - /cassandra-storage/data
          commitlog_directory: /cassandra-storage/commitlog
          saved_caches_directory: /cassandra-storage/saved_caches
          # point all to private route53 entry
          seed_provider:
              parameters:
                  - seeds: cassandra.<internal-domain>.com
          listen_address: cassandra.<internal-domain>.com
          rpc_address: cassandra.<internal-domain>.com
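The startup script that remaps the Route53 entry is not shown in the talk. A minimal sketch of that step with boto3 might look like this, assuming an A record that points at the instance's private IP; the function names, the TTL, and the record type are all assumptions.

```python
def upsert_change_batch(record_name, private_ip, ttl=60):
    """Build a Route53 change batch that points record_name at private_ip."""
    return {
        "Comment": "repoint Cassandra record at the new instance",
        "Changes": [
            {
                # UPSERT creates the record or overwrites the existing one.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": private_ip}],
                },
            }
        ],
    }


def remap(zone_id, record_name, private_ip):
    """Apply the change; needs boto3 and AWS credentials on the instance."""
    import boto3  # third-party, imported lazily so the builder stays testable

    client = boto3.client("route53")
    return client.change_resource_record_sets(
        HostedZoneId=zone_id,
        ChangeBatch=upsert_change_batch(record_name, private_ip),
    )
```

A short TTL keeps clients from caching the old address for long after a redeploy.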
  23. Overview: Spinnaker on AWS
      ● route53 cname for load balancer: spinnaker.<internal-domain>.com
      ● load balancer listeners: ELB HTTP 80 ⇒ HTTP 80
      ● ASG with one EC2 instance running all services:
        ○ clouddriver 7002
        ○ front50 8080
        ○ orca 8083
        ○ gate 8084
        ○ rush 8085
        ○ rosco 8087
        ○ igor 8088
        ○ echo 8089
        ○ nginx 80, deck 80
      ● deck, rosco, gate through nginx
      ● gate calls everything else
      ● cassandra and redis as backing stores
  24. PART III: Auth on Spinnaker. Keep Calm
  25. SSL + Auth on Spinnaker
      ● Where to Terminate SSL?
      ● Glory and the Beast of Self Signed Certs
      ● Google OAuth2.0 Redirects Mess up Nginx Rewrites
      ● Tomcat Ignores Client Certs for Client Auth
      Get ready to read a lot of stack traces
  26. SSL: Dilemma #1. Where to terminate SSL:
      a. ELB
      b. Nginx
      c. Server
  27. Nginx to Terminate SSL for Deck, Rosco
      ● Configure nginx with cert and key and turn ssl on
      ● Nginx now cannot start on bootup - needs password?
        ○ Add password to a file, add to nginx
      ● Now our healthcheck is messed up
        ○ Add 5000 port for easy ELB healthcheck
      ● Optional 80 => 443 redirect
      ● Notice how gate rewrite is gone…
        ○ has to do with oauth redirects
      [spinnaker] /etc/nginx/sites-available/spinnaker.conf:

          server {
              listen 5000;
              location / {
                  add_header Content-Type text/plain;
                  return 200 'POOOOOOOOP';
              }
          }

          # optional redirect here
          server {
              listen 80;
              return 301 https://$host$request_uri;
          }

          server {
              listen 443 ssl;
              ssl_password_file /etc/keys/spinnaker.pass;
              ssl_certificate /opt/spinnaker/ssl/server.crt;
              ssl_certificate_key /opt/spinnaker/ssl/server.key;
              location / {
                  root /opt/deck/html;
              }
              location ~* ^/rosco/ {
                  rewrite ^/rosco/(.*) /$1 break;
                  proxy_pass http://localhost:8087;
              }
          }
  28. For Gate, Pass Through SSL Directly to Server. We want ELB to just pass traffic through to gate without decrypting:
      ● Bypass nginx for gate: ports 8084 ⇒ 8084 for gate SSL
      Gate is responsible for all types of authentication:
      ● Have client certificate?
        ○ Authenticate client certificate - this is why gate needs to terminate SSL
      ● No client certificate?
        ○ Send to google oauth
      ELB listeners for spinnaker.<internal-domain>.com:
      ● HTTP 80 ⇒ HTTP 80 (nginx, 80 ⇒ 443 redirect)
      ● TCP 443 ⇒ TCP 443 (nginx)
      ● TCP 8084 ⇒ TCP 8084 (gate)
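Expressed as data, the three listeners could be written in the shape that boto3's classic ELB client expects. This is only a sketch: the health check target (`HTTP:5000/`, based on the plain-text port 5000 server block in the nginx config) and its timing values are assumptions, as is the `passes_through_tls` helper.

```python
# Listener definitions in the format used by boto3's "elb" client
# (create_load_balancer / create_load_balancer_listeners).
LISTENERS = [
    # Plain HTTP, which nginx redirects to HTTPS.
    {"Protocol": "HTTP", "LoadBalancerPort": 80,
     "InstanceProtocol": "HTTP", "InstancePort": 80},
    # TCP listeners forward bytes untouched, so TLS ends on the instance.
    {"Protocol": "TCP", "LoadBalancerPort": 443,
     "InstanceProtocol": "TCP", "InstancePort": 443},
    {"Protocol": "TCP", "LoadBalancerPort": 8084,
     "InstanceProtocol": "TCP", "InstancePort": 8084},
]

# Assumed health check settings; only the port comes from the slides.
HEALTH_CHECK = {
    "Target": "HTTP:5000/",
    "Interval": 30,
    "Timeout": 5,
    "UnhealthyThreshold": 2,
    "HealthyThreshold": 2,
}


def passes_through_tls(listener):
    """True when the ELB does not decrypt traffic on this listener."""
    return listener["Protocol"] == "TCP" and listener["InstanceProtocol"] == "TCP"
```

Because the 8084 listener is plain TCP, the client certificate reaches gate intact, which is what lets Tomcat validate it.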
  29. SSL: Dilemma #2. Self signed certs? Meet your new best friends, the Java TrustStores
  30. Tomcat Needs CA to Be in Trust Store. Because we are using self-signed certs, it’s important to have our self created CA in the truststore:
      ● Add spinnaker cert to java keystore using keytool utility
      ● Add keystore/truststore file location to gate-local.yml config
      /opt/spinnaker/conf/gate-local.yml:

          server:
            ssl:
              enabled: true
              keyStore: /opt/spinnaker/ssl/keystore.jks
              keyStorePassword: poop
              keyAlias: server
              trustStore: /opt/spinnaker/ssl/keystore.jks
              trustStorePassword: poop

      But at some point I still had problems, so here’s a quick hack - add your CA to default java CA file: $JAVA_HOME/jre/lib/security/cacerts
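The keytool step is not spelled out on the slide. One plausible way to script it is sketched below; the helper names, the `ca` alias, and the file paths are placeholders, while the `keytool -importcert` flags are the standard ones.

```python
import subprocess


def keytool_import_args(ca_cert, keystore, storepass, alias="ca"):
    """Build the keytool command that adds a CA certificate to a keystore."""
    return [
        "keytool", "-importcert", "-trustcacerts", "-noprompt",
        "-alias", alias,
        "-file", ca_cert,
        "-keystore", keystore,
        "-storepass", storepass,
    ]


def import_ca(ca_cert, keystore, storepass, alias="ca"):
    """Run keytool; raises CalledProcessError if the import fails."""
    subprocess.run(
        keytool_import_args(ca_cert, keystore, storepass, alias), check=True
    )
```

The same helper would cover the cacerts hack by pointing `keystore` at $JAVA_HOME/jre/lib/security/cacerts, whose stock password is `changeit` unless it has been changed.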
  31. OAuth: Dilemma #3. Google OAuth2.0 redirects trample all over your Nginx rewrites
  32. Remove Namespacing for Gate & Bypass Nginx
      ● Set redirect_uri to our gate address: https://spinnaker.<internal-domain>.com:8084/login
      ● Gate can no longer be namespaced because on redirect, /gate in the path gets lost as only $host recorded
      Flow between the web browser (deck javascript), Spinnaker (gate) at https://spinnaker.<internal-domain>.com:8084/login, and the Google auth server:
      1. User authorization request
      2. User authorizes application
      3. Auth code grant
      4. Access token request
      5. Access token grant
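To see why the redirect target matters, it helps to look at the first request in the flow. The sketch below builds an authorization URL against Google's documented OAuth 2.0 endpoint. It is not how gate constructs the request internally; the client ID, scope, and `spinnaker.example.com` host are placeholders.

```python
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def authorization_url(client_id, redirect_uri, scope="openid email"):
    """Build the URL the browser is sent to for the authorization request."""
    query = urlencode({
        "client_id": client_id,
        # Google sends the browser back to exactly this address, so it has
        # to be a path that gate itself serves (no /gate prefix).
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "scope": scope,
    })
    return GOOGLE_AUTH_ENDPOINT + "?" + query
```

With `redirect_uri="https://spinnaker.example.com:8084/login"`, the auth code comes back straight to gate on port 8084, bypassing nginx entirely.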
  33. Client Auth: Dilemma #4. Tomcat doesn’t seem to care about your client cert
  34. Make Tomcat Request Client Cert for Client Auth. We need to enable scripts to post tasks to spinnaker with client authentication:
      ● Create certs for client
      ● Configure gate tomcat to validate client cert
      /opt/spinnaker/conf/gate-local.yml:

          x509:
            enabled: true
            subjectPrincipalRegex: CN=(.*?)
          server:
            ssl:
              clientAuth: want
              enabled: true
              keyStore: /opt/spinnaker/ssl/keystore.jks
              keyStorePassword: poop
              keyAlias: server
              trustStore: /opt/spinnaker/ssl/keystore.jks
              trustStorePassword: poop

      Beakhead (Spinnaker client) sends POST /tasks to Spinnaker Gate at spinnaker.<internal-domain>.com:8084 and includes the client cert in the request:
      ● Layer based authentication on gate
      ● Tomcat validates cert: has to recognize cert authority from truststore
      ● Returns response if authenticated
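A script posting a task with a client certificate could be sketched with the Python standard library as follows. Beakhead's actual code is not shown in the talk, so the function names, the task payload, and the certificate paths here are all hypothetical.

```python
import json
import ssl
import urllib.request


def build_task_request(gate_url, task):
    """Create the POST /tasks request without sending it."""
    body = json.dumps(task).encode("utf-8")
    return urllib.request.Request(
        gate_url.rstrip("/") + "/tasks",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def client_ssl_context(ca_file, cert_file, key_file):
    """Trust the self-created CA and present the client certificate."""
    context = ssl.create_default_context(cafile=ca_file)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def post_task(gate_url, task, ca_file, cert_file, key_file):
    """Send the task to gate and return (status, response body)."""
    request = build_task_request(gate_url, task)
    context = client_ssl_context(ca_file, cert_file, key_file)
    with urllib.request.urlopen(request, context=context) as response:
        return response.status, response.read()
```

With `clientAuth: want`, Tomcat asks for a certificate but still accepts connections without one, which is what allows browser users to fall through to Google OAuth.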
  35. PART IV: Takeaways. What we learned
  36. Spinnaker is complex! There are barriers to overcome if working with different infrastructure.
  37. I learned a lot about SSL, OAuth 2.0 and Client Authentication. Like a lot.
  38. Thanks for Listening! We are very much looking forward to having Spinnaker in production. Find me on Spinnaker Slack: @dtkachenko. All pictures used in this presentation are credited to Allie Brosh, hyperboleandahalf.blogspot.com