© Copyright 2017 Pivotal Software, Inc. All rights Reserved. Version 1.0
Hubert Zhang (hzhang@pivotal.io)
Jack WU (jwu@pivotal.io)
PL/Container
Introduction
Customize and Secure the Runtime and
Dependencies of Procedural Languages
Agenda
■ The Problem
■ What is PL/Container
■ How to use PL/Container
■ PL/Container Internals
■ Future Work
■ Q+A
The Problem
PGConf 2018: PL/Container Introduction
We generate More and More Data
2010: 1.2 ZB
2012: 2.8 ZB
2015: 8.5 ZB
2020: 40 ZB
1 ZB = 1,000,000,000,000,000,000,000 bytes
We want to analyze data for knowledge
We want to analyze data IN the database
But…
PL/Python and PL/R are UNTRUSTED
Languages
Only a superuser can create UDFs in untrusted
languages
system("rm -rf /data")
The Problem: Triangle Dependency
Data Scientist
DBA
UDF
Review & Create
Run UDF
Greenplum
Package
UDF
Package
1. Greenplum
2. Operating System
3. Python / R
4. TensorFlow
Resolving the Problem: untrusted -> trusted
Data Scientist
DBA
UDF
Review & Create
Run UDF
Create UDF
How to make an untrusted language trusted?
PL/Container
What is PL/Container
What is PL/Container?
PL/Container is a customizable, secure
runtime for Greenplum Database Procedural
Languages.
● Greenplum Database Extension
● Stateless
● Based on Docker Container
● Customizable
● Secure
● Isolated
PL/Container Architecture
GPDB Segment host
1. Query plan
2. Parse container name from UDF body
3. Read configuration
4. Bring up container
5. Start
6. Ping
7. Ping
8. Call
9. SPI
10. Result
11. Result
Repeat steps 8-11
12. Result
Repeat step 12
How does a UDF run in PL/Container?
PL/Container is a customizable, secure runtime
for Greenplum Database Procedural Languages.
a. PL/Container Extension starts a docker
container (only in 1st call)
b. Transfer UDF and data to docker container
c. Run the UDF in docker container
d. Contact the docker container to get the results
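Step 2 of the architecture ("Parse container name from UDF body") can be illustrated with a small sketch. It assumes the runtime is declared on the first non-empty line of the function body as a comment of the form `# container: <name>`, as in the examples in this deck. The function name `parse_container_name`, the regular expression, and the whitespace tolerance are assumptions for illustration, not PL/Container's actual parser.

```python
import re
from typing import Optional

# Hypothetical pattern: the declaration must look like "# container: <name>".
_DECL = re.compile(r"^\s*#\s*container\s*:\s*([A-Za-z0-9_]+)\s*$")


def parse_container_name(udf_body: str) -> Optional[str]:
    """Return the runtime name declared at the top of a UDF body, or None."""
    for line in udf_body.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        match = _DECL.match(line)
        # Only the first non-empty line is inspected.
        return match.group(1) if match else None
    return None


body = """
# container: plc_python_shared
import math
return math.log10(input)
"""
print(parse_container_name(body))
```

A body without the declaration, or with a misspelled keyword, yields `None` in this sketch, which is the point where a real implementation would have to report an error to the user.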
How to use PL/Container
Install PL/Container on Greenplum
Install from Source Code
● source $GPHOME/greenplum_path.sh
● make install
Install from GPPKG
● gppkg -i plcontainer-1.1.0-rhel7-x86_64.gppkg
● no additional dependencies
Prerequisites
Platform: CentOS 6.6+ or 7.x
Database: Greenplum 5.2+
Docker: Docker 17.05+ on CentOS 7, Docker 1.7+ on CentOS 6
Build Custom Docker Image (optional)
Minimum Requirement:
● Python or R environment
● Add location of libpython.so and libR.so to LD_LIBRARY_PATH
FROM continuumio/anaconda
ENV LD_LIBRARY_PATH "/opt/conda/lib:$LD_LIBRARY_PATH"
Customize Your image:
● Install specific packages
FROM continuumio/anaconda3
RUN conda install -c conda-forge -y tensorflow
ENV LD_LIBRARY_PATH "/opt/conda/lib:/usr/local/lib:$LD_LIBRARY_PATH"
Configure PL/Container
Image
● image add
● image delete
● image list
Runtime
● runtime add
● runtime delete
● runtime backup
● runtime restore
● runtime edit
● runtime show
XML (runtime)
● <id>
● <image>
● <command>
● <shared directory>
● <setting memory_mb>
● <setting cpu_share>
Container cgroup node
● memory.memsw.limit_in_bytes
● cpu.shares
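To make the relationship between a runtime entry and the container cgroup nodes more concrete, here is a minimal sketch. The element names come from the list on this slide; the exact tag spelling (`shared_directory`), the attribute layout, the image name, and the paths are assumptions. The mapping of `memory_mb` to `memory.memsw.limit_in_bytes` and of `cpu_share` to `cpu.shares` is inferred from the slide placing them side by side, so treat it as an illustration rather than the documented behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical runtime entry; tag spellings, image name and paths are assumed.
RUNTIME_XML = """
<runtime>
  <id>plc_python_shared</id>
  <image>example/plcontainer_python:latest</image>
  <command>/clientdir/pyclient.sh</command>
  <shared_directory access="ro" container="/clientdir" host="/opt/plc_clients"/>
  <setting memory_mb="1024"/>
  <setting cpu_share="1024"/>
</runtime>
"""


def cgroup_limits(runtime_xml: str) -> dict:
    """Collect <setting> attributes and express them as cgroup node values."""
    root = ET.fromstring(runtime_xml)
    settings = {}
    for node in root.findall("setting"):
        settings.update(node.attrib)
    limits = {}
    if "memory_mb" in settings:
        # Megabytes to bytes (assumed mapping to the memory cgroup node).
        limits["memory.memsw.limit_in_bytes"] = int(settings["memory_mb"]) * 1024 * 1024
    if "cpu_share" in settings:
        limits["cpu.shares"] = int(settings["cpu_share"])
    return limits


limits = cgroup_limits(RUNTIME_XML)
print(limits)
```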
Run PL/Container
Running a simple plpython UDF to calculate log10
postgres=# CREATE LANGUAGE plpythonu;
postgres=# CREATE OR REPLACE FUNCTION pylog10(input double precision)
RETURNS double precision AS $$
import math
return math.log10(input)
$$ LANGUAGE plpythonu;
postgres=# SELECT pylog10(100);
pylog10
--------------
2
(1 row)
Run PL/Container
Running a simple PL/Container UDF to calculate log10
postgres=# CREATE EXTENSION plcontainer;
postgres=# CREATE OR REPLACE FUNCTION pylog10(input double precision)
RETURNS double precision AS $$
# container: plc_python_shared
import math
return math.log10(input)
$$ LANGUAGE plcontainer;
postgres=# SELECT pylog10(100);
pylog10
--------------
2
(1 row)
Demos
How to use PL/Container
PL/Container Internals
PL/Container Internals
■ Message Protocol
■ SPI Support
■ Pluggable Backend
■ Resource Management
■ Error Handling
■ Performance
Message Protocol
PL/Container uses messages to communicate between QEs (query executors) and containers.
● plcMsgPing
● plcMsgCallreq
● plcMsgResult
● plcMsgError
● plcMsgLog
● plcMsgSQL
● plcMsgSubtransaction
● plcMsgRaw
[Diagram: messages exchanged between the Query Executor and the Container: PING, PONG, CallReq, SQL, SQLResult, Error/Log, Result]
SPI support
The Server Programming Interface (SPI) enables UDFs to run SQL queries.
Container (where the query is generated) -> Query Executor (where the query is executed)
● plpy.execute(query) -> SPI_execute()
● plan = plpy.prepare -> create & cache plan
● plpy.execute(plan) -> SPI_execute_plan
Problem: SPI is called inside container but executed at QE side.
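The round trip behind this problem can be sketched with a small in-process simulation: the UDF (container side) issues a query, the query executor runs it and sends the rows back, and only then does the UDF produce its result. The message names `plcMsgSQL` and `plcMsgResult` follow the message protocol slide. The tuple layout, the function names, and the canned `fake_spi` executor are assumptions made for illustration; no sockets or database are involved.

```python
from typing import Callable, Generator, List


def container_udf() -> Generator[tuple, list, None]:
    """Container side: the UDF asks for rows, then returns a result."""
    # Stand-in for plpy.execute(query): send the SQL out and wait for rows.
    rows = yield ("plcMsgSQL", "SELECT 1 AS x UNION ALL SELECT 2")
    total = sum(row["x"] for row in rows)
    yield ("plcMsgResult", total)


def query_executor(udf: Generator, run_sql: Callable[[str], List[dict]]):
    """QE side: drive the call and execute any SQL on the container's behalf."""
    kind, payload = next(udf)            # plays the role of the call request
    while kind == "plcMsgSQL":
        rows = run_sql(payload)          # stand-in for SPI_execute() in the QE
        kind, payload = udf.send(rows)   # rows travel back as the SQL result
    if kind != "plcMsgResult":
        raise RuntimeError(f"unexpected message: {kind}")
    return payload


def fake_spi(query: str) -> List[dict]:
    # Hypothetical executor returning canned rows instead of touching a database.
    return [{"x": 1}, {"x": 2}]


result = query_executor(container_udf(), fake_spi)
print(result)
```

The loop in `query_executor` is the part that matters: every query issued inside the container costs one extra exchange with the QE before the UDF can continue.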
Pluggable Backend
● Docker as a sandbox
● Container as a service
● Separate process
GPDB Query Executor: C code
Python Executor: Python code
Resource Management
Container level
● Memory: memory limit, minimum is 4M
● CpuShares: relative weight of CPU
Extension level
● Integrated with GPDB resource group extension
framework.
CREATE RESOURCE GROUP plgroup WITH
(concurrency=0,
cpu_rate_limit=10,
memory_limit=30,
memory_auditor='cgroup');
[Diagram: cgroup memory hierarchy with nodes memory, gpdb, default_group OID, plgroup OID, and containers aefs9err8e and flkr345ere4; size labels 38G, 1G, 2G; OOM]
Error Handling
Container failure should not affect GPDB core
● Containers fail to create
● Containers fail to start
● Containers crash when running
● Cached containers crash
Container cleanup
● Query Cancel
● QE error
● Cached QE quit when idle for a long time (By Cleanup process)
Performance
Optimization
● Cached container (lifecycle same as QE)
● Unix domain socket
● Type conversion
● Resource management (CPU share)
Best practices
● Array instead of multiple rows.
● Complex UDF instead of simple one
[Diagram: QE and Container timeline: start container once, then tuple 1, tuple 2, ..., tuple n for query 1 and tuple 1, ..., tuple n for query 2]
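The "array instead of multiple rows" advice can be motivated with a rough message count. The sketch below assumes each UDF call costs one call request plus one result message and ignores both payload size and UDF compute time. The per-message cost and row count are made-up numbers, so only the ratio between the two cases is meaningful.

```python
def messages_needed(rows: int, rows_per_call: int) -> int:
    """Count request/response messages when each call carries rows_per_call rows."""
    calls = -(-rows // rows_per_call)  # ceiling division
    return 2 * calls                   # one call request plus one result per call


def estimated_overhead_ms(rows: int, rows_per_call: int, per_message_ms: float) -> float:
    """Messaging overhead only; UDF compute time is deliberately ignored."""
    return messages_needed(rows, rows_per_call) * per_message_ms


ROWS = 100_000
PER_MESSAGE_MS = 0.05  # hypothetical cost of one message over the socket

row_at_a_time = estimated_overhead_ms(ROWS, 1, PER_MESSAGE_MS)
one_array = estimated_overhead_ms(ROWS, ROWS, PER_MESSAGE_MS)
print(row_at_a_time, one_array)
```

The same reasoning applies to "complex UDF instead of simple one": the fixed cost of each call is amortized better when a call does more work.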
Performance
Test Environment
● Hardware: 6 virtual machines, each with 19G memory and 5 processors (Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz)
● Software: CentOS 7, GPDB 5.2 with 30 segments.
Workloads
● Long-running function
● Large input array function
● Large output array function
Performance
Long-running function
CREATE OR REPLACE FUNCTION pysleep(i int) RETURNS void AS $$
# container: plc_python_shared
import time
time.sleep(i)
$$ LANGUAGE plcontainer;
SELECT count(pysleep(1)) FROM tbl;
Result: no performance degradation
Performance
Large input array function
CREATE OR REPLACE FUNCTION pylargeint8in(a int8[]) RETURNS float8 AS $$
# container: plc_python_shared
return sum(a)/float(len(a))
$$ LANGUAGE plcontainer;
SELECT count(pylargeint8in(ARRAY(SELECT column1 FROM tbl1))) FROM tbl2;
Result: 2x performance degradation for Python, 30% performance improvement for R
Performance
Large output array function
CREATE OR REPLACE FUNCTION pylargeoutfloat8(num int) RETURNS float8[] AS $$
# container: plc_python_shared
return [x/3.0 for x in range(num)]
$$ LANGUAGE plcontainer;
SELECT count(pylargeoutfloat8(n)) FROM tbl;
Result: 4x performance improvement
Future Work
PL/Container Future Work (subject to change)
● Support More Languages
● Support More Technology
● Container Orchestrator / Cloud
https://github.com/greenplum-db/plcontainer
https://gpdb.docs.pivotal.io/570/ref_guide/extensions/pl_container.html

Customize and Secure the Runtime and Dependencies of Your Procedural Languages Using PL/Container - Greenplum Summit 2018
