Google Certified Professional - Data Engineer
Job Role Description
A Google Certified Professional - Data Engineer enables data-driven decision making by collecting,
transforming, and visualizing data. The data engineer should be able to design, build, maintain, and
troubleshoot data processing systems with a particular emphasis on the security, reliability,
fault-tolerance, scalability, fidelity, and efficiency of such systems. The data engineer should also be able
to analyze data to gain insight into business outcomes, build statistical models to support
decision-making, and create machine learning models to automate and simplify key business processes.
Certification Exam Guide
Section 1: Designing data processing systems
1.1 Designing flexible data representations. Considerations include:
● future advances in data technology
● changes to business requirements
● awareness of current state and how to migrate the design to a future state
● data modeling
● tradeoffs
● distributed systems
● schema design
1.2 Designing data pipelines. Considerations include:
● future advances in data technology
● changes to business requirements
● awareness of current state and how to migrate the design to a future state
● data modeling
● tradeoffs
● system availability
● distributed systems
● schema design
● common sources of error (e.g., removing selection bias)
1.3 Designing data processing infrastructure. Considerations include:
● future advances in data technology
● changes to business requirements
● awareness of current state and how to migrate the design to a future state
● data modeling
● tradeoffs
● system availability
● distributed systems
● schema design
● capacity planning
● different types of architectures: message brokers, message queues, middleware, and
service-oriented architecture
Section 2: Building and maintaining data structures and databases
2.1 Building and maintaining flexible data representations
2.2 Building and maintaining pipelines. Considerations include:
● data cleansing
● batch and streaming
● transformation
● acquiring and importing data
● testing and quality control
● connecting to new data sources
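
As a rough illustration of the data cleansing and transformation items above, the following is a minimal sketch in plain Python. The record layout (`name` and `amount` fields) and the function names `clean_record` and `clean_batch` are assumptions made for this example; they do not come from the exam guide or from any Google Cloud API.

```python
from typing import Iterable, List, Optional


def clean_record(raw: dict) -> Optional[dict]:
    """Normalize one raw record; return None if it cannot be repaired."""
    # Hypothetical fields: "name" (text) and "amount" (number or numeric text).
    name = (raw.get("name") or "").strip()
    amount_text = str(raw.get("amount", "")).replace(",", "").strip()
    if not name:
        return None
    try:
        amount = float(amount_text)
    except ValueError:
        return None
    return {"name": name.title(), "amount": amount}


def clean_batch(rows: Iterable[dict]) -> List[dict]:
    """Clean a batch of records, dropping invalid rows and exact duplicates."""
    seen = set()
    cleaned = []
    for raw in rows:
        record = clean_record(raw)
        if record is None:
            continue
        key = (record["name"], record["amount"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned
```

The same per-record function could in principle be reused in a streaming pipeline, since it holds no state; only the deduplication step in `clean_batch` assumes the whole batch is available.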
2.3 Building and maintaining processing infrastructure. Considerations include:
● provisioning resources
● monitoring pipelines
● adjusting pipelines
● testing and quality control
Section 3: Analyzing data and enabling machine learning
3.1 Analyzing data. Considerations include:
● data profiling
● data correlation
● patterns and insights
● anomaly detection
● statistical models
● machine learning
● assessing the statistical relevance of conclusions
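
One simple way to approach the anomaly detection item above is a z-score check, sketched below with Python's standard `statistics` module. The function name `zscore_outliers` and the threshold values are assumptions chosen for illustration; the exam guide does not prescribe a particular method.

```python
from statistics import mean, stdev


def zscore_outliers(values, threshold=3.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold."""
    if len(values) < 2:
        return []  # sample standard deviation is undefined for fewer than 2 points
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


readings = [10, 11, 9, 10, 12, 10, 11, 9, 10, 100]
print(zscore_outliers(readings, threshold=2.5))
```

In small samples a single extreme value inflates the standard deviation, which caps how large its own z-score can get, so a lower threshold or a median-based method may be more appropriate there.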
3.2 Transforming data to enable machine learning and pattern discovery. Considerations
include:
● repeatability
● generalization
● distributed computing
● improved model accuracy
3.3 Identifying or building data visualization and reporting tools. Considerations include:
● automation
● decision support
● data summarization
● enabling patterns and insights
Section 4: Modeling business processes for analysis and optimization
4.1 Mapping business requirements to data representations. Considerations include:
● working with business users
● gathering business requirements
4.2 Optimizing data representations, data infrastructure performance and cost.
Considerations include:
● resizing and scaling resources
● data cleansing, distributed systems
● high performance algorithms
● common sources of error (e.g., removing selection bias)
Section 5: Ensuring reliability
5.1 Performing quality control. Considerations include:
● verification
● building and running test suites
● pipeline monitoring
5.2 Assessing, troubleshooting, and improving data representations and data processing
infrastructure.
5.3 Recovering data. Considerations include:
● planning (e.g., fault tolerance)
● executing (e.g., rerunning failed jobs, performing retrospective re-analysis)
● stress testing data recovery plans and processes
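
The "rerunning failed jobs" item above could be sketched as a retry loop with exponential backoff. This is a generic illustration, not a Google Cloud API: `run_with_retries`, its parameters, and the injectable `sleep` argument are all hypothetical names introduced here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    job: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run job(), retrying on any exception with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

A retry loop like this is only safe when the job is idempotent, meaning that a partially completed run can be repeated without duplicating or corrupting output.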
Section 6: Visualizing data and advocating policy
6.1 Building (or selecting) data visualization and reporting tools. Considerations include:
● automation
● decision support
● data summarization (e.g., translation up the chain, fidelity, trackability, integrity)
6.2 Advocating policies and publishing data and reports.
Section 7: Designing for security and compliance
7.1 Designing secure data infrastructure and processes. Considerations include:
● Identity and Access Management (IAM)
● data security
● penetration testing
● Separation of Duties (SoD)
● security control
7.2 Designing for legal compliance. Considerations include:
● Health Insurance Portability and Accountability Act (HIPAA), Children’s Online
Privacy Protection Act (COPPA), etc.
● audits
