MiTAC MiCloud - Google Cloud Platform Partner @ APAC2014Q2 BigQuery Workshop
Google BigQuery
Big data with SQL-like queries, but fast...
http://goo.gl/XZmqgN
[Architecture diagram labels: RESTful, GCE, LB]
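Preface:
● This is a hands-on session. If you want to follow along, please open your laptop...
● GCP project created?
● Billing enabled?
● google_cloud_sdk installed?
● Wireless AP for this room:
○ Account:
○ Password: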
[Architecture diagram labels: Data Access, Big Data Access, Frontend Services, Backend Services]
BigQuery is...
● TB-level data analysis
● Fast mining response
● SQL-like query language
● Interactive queries across multiple datasets
● Inexpensive, pay per use
● Offline job support
Getting Started
BigQuery Web UI
https://bigquery.cloud.google.com/
BigQuery structure
● Project
● Dataset
● Table
● Job
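The Project / Dataset / Table levels show up later in the deck as the `[project]:[dataset].[table]` reference passed to `bq load`. As a minimal sketch, assuming that naming form only, the helpers below build and parse such a reference. `formatTableRef` and `parseTableRef` are hypothetical names for illustration, not part of any BigQuery library, and domain-scoped project IDs that themselves contain a colon are not handled.

```javascript
// Hypothetical helpers for the "[project]:[dataset].[table]" form.
// Not part of any BigQuery client library.
function formatTableRef(ref) {
  return ref.projectId + ':' + ref.datasetId + '.' + ref.tableId;
}

function parseTableRef(text) {
  // project = everything before the first colon,
  // dataset = everything up to the next dot, table = the rest.
  var match = /^([^:]+):([^.]+)\.(.+)$/.exec(text);
  if (!match) {
    throw new Error('Expected project:dataset.table, got: ' + text);
  }
  return { projectId: match[1], datasetId: match[2], tableId: match[3] };
}

console.log(formatTableRef({
  projectId: 'publicdata',
  datasetId: 'samples',
  tableId: 'shakespeare'
}));
```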
Hands-on - Import Sample Data...
The easy way - Import Wizard
JCMB_2014.csv Schema
date_time:String,atmospheric_pressure:float,rainfall:float,wind_speed:float,wind_direction:float,surface_temperature:float,relative_humidity:float,solar_flux:float,battery:float
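The schema above uses the inline `name:type,name:type` form. A JSON schema file, like the `./schema.json` used for the JSON load later, carries the same information as a list of `{name, type}` objects. The sketch below converts one form to the other. `inlineSchemaToFields` is a hypothetical helper name, and upper-casing the type is a stylistic assumption.

```javascript
// Hypothetical helper: turn "name:type,name:type" into the
// [{name, type}] list used by a JSON schema file.
function inlineSchemaToFields(inlineSchema) {
  return inlineSchema.split(',').map(function (pair) {
    var parts = pair.trim().split(':');
    if (parts.length !== 2 || !parts[0] || !parts[1]) {
      throw new Error('Expected name:type, got: ' + pair);
    }
    // Upper-casing the type is an assumption made for consistency.
    return { name: parts[0], type: parts[1].toUpperCase() };
  });
}

var fields = inlineSchemaToFields(
  'date_time:String,atmospheric_pressure:float,rainfall:float'
);
console.log(JSON.stringify(fields, null, 2));
```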
Load Data to BigQuery from the Command Line
CSV / JSON → Cloud Storage → BigQuery
Load CSV to BigQuery
gsutil cp [source] gs://[bucket-name]
# gsutil cp ~/Desktop/log.csv gs://your-bucket/
Copying file:///Users/simonsu/Desktop/log.csv [Content-Type=text/csv]...
Uploading: 4.59 MB/36.76 MB
bq load [project]:[dataset].[table] gs://[bucket]/[csv path] [schema]
# bq load project:dataset.table gs://your-bucket/log.csv IP:STRING,DNS:STRING,TS:STRING,URL:STRING
Waiting on bqjob_rf4f3f1d9e2366a6_00000142c1bdd36f_1 ... (24s) Current status: DONE
Load JSON to BigQuery
bq load --source_format NEWLINE_DELIMITED_JSON [project]:[dataset].[table] [json file] [schema file]
# bq load --source_format NEWLINE_DELIMITED_JSON testbq.jsonTest ./sample.json ./schema.json
Waiting on bqjob_r7182196a0278f1c6_00000145f940517b_1 ... (39s) Current status: DONE
# bq load --source_format NEWLINE_DELIMITED_JSON testbq.jsonTest gs://your-bucket/sample.json ./schema.json
Waiting on bqjob_r7182196a0278f1c6_00000145f940517b_1 ... (39s) Current status: DONE
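`NEWLINE_DELIMITED_JSON` means one JSON object per line, not a single JSON array. As a rough sketch, a file like `sample.json` could be produced from in-memory records as follows. `toNewlineDelimitedJson` and the sample records are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: serialize records as newline-delimited JSON,
// one object per line, with a trailing newline.
function toNewlineDelimitedJson(records) {
  return records.map(function (record) {
    return JSON.stringify(record);
  }).join('\n') + '\n';
}

// Made-up sample rows, only to show the shape of the output.
var sample = [
  { IP: '10.0.0.1', DNS: 'host-a', TS: '2014-05-22 10:00:00', URL: '/index.html' },
  { IP: '10.0.0.2', DNS: 'host-b', TS: '2014-05-22 10:00:05', URL: '/about.html' }
];

process.stdout.write(toNewlineDelimitedJson(sample));
```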
Hands-on - Query
Web way - Query Console
Install google_cloud_sdk (https://developers.google.com/cloud/sdk/)
Shell way - bq command
bq query <sql_query>
# bq query 'select charge_unit,charge_desc,one_charge from testbq.test'
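The same `bq query` call can be driven from a script. The sketch below assumes `bq` is on the PATH and uses its `--format=json` flag. Whether the output parses cleanly as JSON is itself an assumption, since progress messages may need to be suppressed first. `buildBqQueryArgs` and `runBqQuery` are hypothetical names, and nothing is executed until `runBqQuery` is called.

```javascript
var execFile = require('child_process').execFile;

// Hypothetical helper: argument list for "bq --format=json query <sql>".
function buildBqQueryArgs(sql) {
  return ['--format=json', 'query', sql];
}

// Hypothetical helper: run the query through the bq CLI.
// Assumes bq is installed and already authenticated.
function runBqQuery(sql, callback) {
  execFile('bq', buildBqQueryArgs(sql), function (err, stdout) {
    if (err) return callback(err);
    try {
      // Assumption: stdout holds only the JSON result.
      callback(null, JSON.parse(stdout));
    } catch (parseErr) {
      callback(parseErr);
    }
  });
}

console.log(buildBqQueryArgs(
  'select charge_unit,charge_desc,one_charge from testbq.test'
));
```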
BigQuery - Query Language
Query syntax
● SELECT
● WITHIN
● FROM
● FLATTEN
● JOIN
● WHERE
● GROUP BY
● HAVING
● ORDER BY
● LIMIT
Query support
Supported functions and operators
● Aggregate functions
● Arithmetic operators
● Bitwise operators
● Casting functions
● Comparison functions
● Date and time functions
● IP functions
● JSON functions
● Logical operators
● Mathematical functions
● Regular expression functions
● String functions
● Table wildcard functions
● URL functions
● Window functions
● Other functions
Select
select charge_unit,charge_desc,one_charge from testbq.test
+-------------+-------------+------------+
| charge_unit | charge_desc | one_charge |
+-------------+-------------+------------+
| M           | 按月計費    | 0          |
| D           | 按日計費    | 0          |
| HH          | 小時計費    | 0          |
| T           | 分計費      | 0          |
| SS          | 按次計費    | 1          |
+-------------+-------------+------------+
Join
SELECT a.order_id, a.sales, b.begin_use_date
FROM testbq.order_master a LEFT JOIN testbq.order_detail b
ON a.order_id = b.order_id
+------------+---------+-------------------------+
| a_order_id | a_sales | b_begin_use_date        |
+------------+---------+-------------------------+
| OM2003     | D589    | 2011-11-01 17:43:00 UTC |
| OM2004     | D589    | 2011-11-01 09:43:00 UTC |
| OM2005     | D589    | 2011-11-01 17:55:00 UTC |
| OM2006     | D589    | 2011-11-01 17:54:00 UTC |
| OM2007     | D589    | 2011-11-03 16:31:00 UTC |
+------------+---------+-------------------------+
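To make the LEFT JOIN semantics concrete, here is a small in-memory sketch: every row of the left table is kept, and rows with no match on the right get `null` in place of the right-hand row. `leftJoin` and the sample rows are hypothetical. This illustrates the semantics only, not how BigQuery executes a join.

```javascript
// Hypothetical helper: in-memory LEFT JOIN on a shared key.
function leftJoin(leftRows, rightRows, key) {
  // Index the right-hand rows by key so each lookup is constant time.
  var index = new Map();
  rightRows.forEach(function (row) {
    if (!index.has(row[key])) index.set(row[key], []);
    index.get(row[key]).push(row);
  });

  var joined = [];
  leftRows.forEach(function (left) {
    var matches = index.get(left[key]);
    if (!matches) {
      // No match: keep the left row, right side is null.
      joined.push({ left: left, right: null });
      return;
    }
    matches.forEach(function (right) {
      joined.push({ left: left, right: right });
    });
  });
  return joined;
}

// Made-up rows shaped like order_master / order_detail.
var orderMaster = [
  { order_id: 'OM2003', sales: 'D589' },
  { order_id: 'OM9999', sales: 'D100' }
];
var orderDetail = [
  { order_id: 'OM2003', begin_use_date: '2011-11-01 17:43:00 UTC' }
];
console.log(JSON.stringify(leftJoin(orderMaster, orderDetail, 'order_id'), null, 2));
```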
Flatten
SELECT
  fullName,
  age,
  gender,
  citiesLived.place
FROM (FLATTEN([dataset.tableId], children))
WHERE
  (citiesLived.yearsLived > 1995) AND
  (children.age > 3)
GROUP BY fullName, age, gender, citiesLived.place
+------------+-----+--------+-------------------+
| fullName   | age | gender | citiesLived_place |
+------------+-----+--------+-------------------+
| John Doe   | 22  | Male   | Stockholm         |
| Mike Jones | 35  | Male   | Los Angeles       |
| Mike Jones | 35  | Male   | Washington DC     |
| Mike Jones | 35  | Male   | Portland          |
| Mike Jones | 35  | Male   | Austin            |
+------------+-----+--------+-------------------+
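FLATTEN turns a repeated (list-valued) field into one row per element, copying the other fields into each row. The sketch below imitates that on plain objects. `flattenField` is a hypothetical helper, the sample record is made up, and emitting a single row with `null` for an empty list is an assumption rather than documented BigQuery behaviour.

```javascript
// Hypothetical helper: one output row per element of a repeated field.
function flattenField(records, field) {
  var rows = [];
  records.forEach(function (record) {
    var values = record[field];
    if (!Array.isArray(values) || values.length === 0) {
      // Assumption: a record with no elements yields one row with null.
      var empty = Object.assign({}, record);
      empty[field] = null;
      rows.push(empty);
      return;
    }
    values.forEach(function (value) {
      var row = Object.assign({}, record);
      row[field] = value;
      rows.push(row);
    });
  });
  return rows;
}

// Made-up record with a repeated "children" field.
var people = [
  { fullName: 'John Doe', age: 22, children: [{ name: 'Jane', age: 6 }, { name: 'John', age: 15 }] }
];
console.log(JSON.stringify(flattenField(people, 'children'), null, 2));
```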
Regular Expression
SELECT
  word,
  COUNT(word) AS count
FROM
  publicdata:samples.shakespeare
WHERE
  (REGEXP_MATCH(word,r'\w\w\'\w\w'))
GROUP BY word
ORDER BY count DESC
LIMIT 3;
+-------+-------+
| word  | count |
+-------+-------+
| ne'er | 42    |
| we'll | 35    |
| We'll | 33    |
+-------+-------+
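The result rows (`ne'er`, `we'll`, `We'll`) suggest the pattern is meant to match two word characters, an apostrophe, and two more word characters. Assuming that reading, the sketch below applies the equivalent JavaScript regex to an in-memory word list and mimics the GROUP BY / ORDER BY / LIMIT steps. `topMatchingWords` and the sample words are hypothetical.

```javascript
// Assumed equivalent of the query's pattern: \w\w'\w\w (unanchored).
var APOSTROPHE_PATTERN = /\w\w'\w\w/;

// Hypothetical helper: count matching words, sort by count, keep the top N.
function topMatchingWords(words, pattern, limit) {
  var counts = new Map();
  words.forEach(function (word) {
    if (!pattern.test(word)) return;           // WHERE REGEXP_MATCH(...)
    counts.set(word, (counts.get(word) || 0) + 1); // GROUP BY word
  });
  return Array.from(counts, function (entry) {
    return { word: entry[0], count: entry[1] };
  }).sort(function (a, b) {
    return b.count - a.count;                  // ORDER BY count DESC
  }).slice(0, limit);                          // LIMIT
}

// Made-up input, only to exercise the helper.
var words = ["ne'er", "we'll", "ne'er", 'the', "We'll", "ne'er", "we'll"];
console.log(topMatchingWords(words, APOSTROPHE_PATTERN, 3));
```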
Time Function
SELECT
  TOP (FORMAT_UTC_USEC(timestamp * 1000000), 5)
    AS top_revision_time,
  COUNT(*) AS revision_count
FROM
  [publicdata:samples.wikipedia];
+----------------------------+----------------+
| top_revision_time          | revision_count |
+----------------------------+----------------+
| 2002-02-25 15:51:15.000000 | 20971          |
| 2002-02-25 15:43:11.000000 | 15955          |
| 2010-01-14 15:52:34.000000 | 3              |
| 2009-12-31 19:29:19.000000 | 3              |
| 2009-12-28 18:55:12.000000 | 3              |
+----------------------------+----------------+
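In the query, `timestamp` is in seconds, so it is multiplied by 1000000 to get microseconds before `FORMAT_UTC_USEC` renders it as `YYYY-MM-DD HH:MM:SS.uuuuuu`. The sketch below reproduces that formatting in JavaScript for non-negative values. `formatUtcUsec` is a hypothetical name, and this is an approximation of the output format, not BigQuery's implementation.

```javascript
// Hypothetical helper: format microseconds since the Unix epoch as
// "YYYY-MM-DD HH:MM:SS.uuuuuu" in UTC.
function formatUtcUsec(usec) {
  if (!Number.isSafeInteger(usec) || usec < 0) {
    throw new RangeError('Expected a non-negative safe integer of microseconds');
  }
  var micros = usec % 1000000;            // sub-second part
  var millis = (usec - micros) / 1000;    // whole seconds, in milliseconds
  var iso = new Date(millis).toISOString(); // "YYYY-MM-DDTHH:MM:SS.mmmZ"
  return iso.slice(0, 19).replace('T', ' ') + '.' +
    String(micros).padStart(6, '0');
}

// Seconds-based timestamp scaled to microseconds, as in the query.
var seconds = Date.UTC(2002, 1, 25, 15, 51, 15) / 1000;
console.log(formatUtcUsec(seconds * 1000000));
```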
URL Function
SELECT
  DOMAIN(repository_homepage) AS user_domain,
  COUNT(*) AS activity_count
FROM
  [publicdata:samples.github_timeline]
GROUP BY
  user_domain
HAVING
  user_domain IS NOT NULL AND user_domain != ''
ORDER BY
  activity_count DESC
LIMIT 5;
+-----------------+----------------+
| user_domain     | activity_count |
+-----------------+----------------+
| github.com      | 281879         |
| google.com      | 34769          |
| khanacademy.org | 17316          |
| sourceforge.net | 15103          |
| mozilla.org     | 14091          |
+-----------------+----------------+
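The query groups rows by the domain of a URL, drops empty domains, and keeps the most frequent ones. The sketch below walks through the same steps on an in-memory list. `naiveDomain` is a hypothetical and deliberately crude stand-in for `DOMAIN()`: it keeps only the last two host labels, which is wrong for suffixes such as `co.uk`. `topDomains` and the sample URLs are also made up.

```javascript
// Hypothetical, simplified stand-in for DOMAIN(): last two host labels.
// Assumption: good enough for illustration, wrong for e.g. "co.uk".
function naiveDomain(url) {
  var host;
  try {
    host = new URL(url).hostname;
  } catch (e) {
    return null; // not a parseable absolute URL
  }
  return host.split('.').slice(-2).join('.');
}

// Hypothetical helper: GROUP BY domain, drop empty, ORDER BY count DESC, LIMIT.
function topDomains(urls, limit) {
  var counts = new Map();
  urls.forEach(function (url) {
    var domain = naiveDomain(url);
    if (!domain) return; // HAVING user_domain IS NOT NULL AND != ''
    counts.set(domain, (counts.get(domain) || 0) + 1);
  });
  return Array.from(counts, function (entry) {
    return { user_domain: entry[0], activity_count: entry[1] };
  }).sort(function (a, b) {
    return b.activity_count - a.activity_count;
  }).slice(0, limit);
}

console.log(topDomains([
  'http://github.com/a',
  'https://www.github.com/b',
  'http://code.google.com/p/x',
  ''
], 5));
```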
Hands-on - Programming
Prepare
● Prepare a Google Cloud Platform project
● Create a Service Account
● Generate key from Service Account p12 key
Google Service Account
web server application vs. service account
Prepare Authentication
Convert the p12 key → pem key
$ openssl pkcs12 -in privatekey.p12 -out privatekey.pem -nocerts
$ openssl rsa -in privatekey.pem -out key.pem
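A service account key like `key.pem` is typically used to sign a JWT that is then exchanged for an access token. The sketch below shows only the signing step with Node's built-in `crypto` module. `buildSignedJwt` is a hypothetical helper, the e-mail address is a placeholder, and this is not necessarily how the bigquery module used in this workshop does it internally.

```javascript
var crypto = require('crypto');

// URL-safe base64 without padding, as used by JWTs.
function base64url(input) {
  return Buffer.from(input).toString('base64')
    .replace(/=/g, '').replace(/\+/g, '-').replace(/\//g, '_');
}

// Hypothetical helper: build an RS256-signed JWT for a service account.
// clientEmail and nowSeconds are supplied by the caller.
function buildSignedJwt(clientEmail, privateKeyPem, nowSeconds) {
  var header = { alg: 'RS256', typ: 'JWT' };
  var claims = {
    iss: clientEmail,
    scope: 'https://www.googleapis.com/auth/bigquery',
    aud: 'https://oauth2.googleapis.com/token',
    iat: nowSeconds,
    exp: nowSeconds + 3600 // one hour validity
  };
  var unsigned = base64url(JSON.stringify(header)) + '.' +
    base64url(JSON.stringify(claims));
  var signature = crypto.createSign('RSA-SHA256')
    .update(unsigned)
    .sign(privateKeyPem);
  return unsigned + '.' + base64url(signature);
}

// Usage sketch (placeholder path and e-mail, not executed here):
// var pem = require('fs').readFileSync('/path/to/key.pem', 'utf8');
// var jwt = buildSignedJwt('your-service-account@example.com', pem,
//                          Math.floor(Date.now() / 1000));
```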
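Node.js - bigquery module
var bq = require('bigquery')
  , prjId = 'your-bigquery-project-id';

// Initialize the module with the client secret and the converted PEM key.
bq.init({
  client_secret: '/path/to/client_secret.json',
  key_pem: '/path/to/key.pem'
});

// List the datasets in the project; e is the error, d the returned data.
bq.job.listds(prjId, function(e, r, d) {
  if (e) return console.log(e);
  console.log(JSON.stringify(d));
});
To operate, call the functions under job through bq.
For the bigquery module, see: https://github.com/peihsinsu/bigquery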
Google Drive way - Apps Script
/* Ref: https://developers.google.com/apps-script/advanced/bigquery */
// Placeholder: replace with your own project ID.
var projectId = 'your-bigquery-project-id';
var request = { query: 'SELECT TOP(word, 30) AS word, COUNT(*) AS word_count ' +
    'FROM publicdata:samples.shakespeare WHERE LENGTH(word) > 10;' };
// Start the query, then fetch the first page of results by job ID.
var queryResults = BigQuery.Jobs.query(request, projectId);
var jobId = queryResults.jobReference.jobId;
queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId);
var rows = queryResults.rows;
// Keep following pageToken until every page has been appended to rows.
while (queryResults.pageToken) {
  queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId, {
    pageToken: queryResults.pageToken
  });
  rows = rows.concat(queryResults.rows);
}
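The `pageToken` loop in the Apps Script code is a general pattern: fetch a page, append its rows, and repeat while the response carries a token for the next page. The sketch below isolates that loop so it can run outside Apps Script. `collectAllRows` and `makeFakePager` are hypothetical, and the fake pager merely stands in for `BigQuery.Jobs.getQueryResults`.

```javascript
// Hypothetical helper: gather rows from a paged source.
// getPage(pageToken) must return { rows: [...], pageToken: string|undefined }.
function collectAllRows(getPage) {
  var rows = [];
  var pageToken;
  do {
    var page = getPage(pageToken);
    rows = rows.concat(page.rows || []);
    pageToken = page.pageToken;
  } while (pageToken);
  return rows;
}

// Hypothetical stand-in for a paged API: serves allRows in fixed-size
// pages and uses the next start offset as the page token.
function makeFakePager(allRows, pageSize) {
  return function (pageToken) {
    var start = pageToken ? Number(pageToken) : 0;
    var end = start + pageSize;
    return {
      rows: allRows.slice(start, end),
      pageToken: end < allRows.length ? String(end) : undefined
    };
  };
}

console.log(collectAllRows(makeFakePager(['a', 'b', 'c', 'd', 'e'], 2)));
```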
References
● Features: https://cloud.google.com/products/bigquery#features
● Case Studies: https://cloud.google.com/products/bigquery#case-studies
● Pricing: https://cloud.google.com/products/bigquery#pricing
● Documentation: https://cloud.google.com/products/bigquery#documentation
● Query Reference: https://developers.google.com/bigquery/query-reference
http://goo.gl/LD4RN4

Workshop 20140522 BigQuery Implementation

  • 1. MiTAC MiCloud - Google Cloud Platform Partner @ APAC2014Q2 BigQuery Workshop Google BigQuery Big data with SQL like query feature, but fast... Google BigQueryGoogle BigQuery http://goo.gl/XZmqgN
  • 2. RESTful GCE LB 前言: ● 我們要實作喔~ 有興趣的 朋友,請打開您的電腦... ● 開好GCP專案? ● Enable Billing了? ● 裝好google_cloud_sdk? ● 這裡的無線AP: ○ 帳號: ○ 密碼: Data Access Big Data Access Frontend Services Backend Services
  • 3. BigQuery它是... ● TB level data analysis ● Fast mining response ● SQL like query language ● Multi-dataset interactive support ● Cheap and pay by use ● Offline job support
  • 6. BigQuery structure ● Project ● Dataset ● Table ● Job
  • 9. The easily way - Import Wizard
  • 11. Load Data to BigQuery in CMD CSV / JSON Cloud Storage BigQuery
  • 12. Load CSV to BigQuery gsutil cp [source] gs://[bucket-name] # gsutil cp ~/Desktop/log.csv gs://your-bucket/ Copying file:///Users/simonsu/Desktop/log.csv [Content-Type=text/csv]... Uploading: 4.59 MB/36.76 MB bq load [project]:[dataset].[table] gs://[bucket]/[csv path] [schema] # bq load project.dataset gs://your-bucket/log.csv IP:STRING,DNS:STRING,TS:STRING,URL:STRING Waiting on bqjob_rf4f3f1d9e2366a6_00000142c1bdd36f_1 ... (24s) Current status: DONE
  • 13. Load JSON to BigQuery bq load --source_format NEWLINE_DELIMITED_JSON [project]:[dataset].[table] [json file] [schema file] # bq load --source_format NEWLINE_DELIMITED_JSON testbq.jsonTest ./sample.json ./schema.json Waiting on bqjob_r7182196a0278f1c6_00000145f940517b_1 ... (39s) Current status: DONE # bq load --source_format NEWLINE_DELIMITED_JSON testbq.jsonTest gs://your-bucket/sample.json ./schema. json Waiting on bqjob_r7182196a0278f1c6_00000145f940517b_1 ... (39s) Current status: DONE
  • 15. Web way - Query Console
  • 17. Shell way - bq commad bq query <sql_query> # bq query 'select charge_unit,charge_desc,one_charge from testbq.test'
  • 18. BigQuery - Query Language
  • 19. Query syntax ● SELECT ● WITHIN ● FROM ● FLATTEN ● JOIN ● WHERE ● GROUP BY ● HAVING ● ORDER BY ● LIMIT Query support Supported functions and operators ● Aggregate functions ● Arithmetic operators ● Bitwise operators ● Casting functions ● Comparison functions ● Date and time functions ● IP functions ● JSON functions ● Logical operators ● Mathematical functions ● Regular expression functions ● String functions ● Table wildcard functions ● URL functions ● Window functions ● Other functions
  • 20. select charge_unit,charge_desc,one_charge from testbq.test Select +-----------------+----------------+--------------------+ | charge_unit | charge_desc | one_charge | +-----------------+----------------+--------------------+ | M | 按月計費 |0 | | D | 按日計費 |0 | | HH | 小時計費 |0 | | T | 分計費 |0 | | SS | 按次計費 |1 | +-----------------+----------------+--------------------+
  • 21. Join
      SELECT a.order_id,a.sales,b.begin_use_date
      FROM testbq.order_master a
      LEFT JOIN testbq.order_detail b ON a.order_id = b.order_id
      +------------+---------+-------------------------+
      | a_order_id | a_sales | b_begin_use_date        |
      +------------+---------+-------------------------+
      | OM2003     | D589    | 2011-11-01 17:43:00 UTC |
      | OM2004     | D589    | 2011-11-01 09:43:00 UTC |
      | OM2005     | D589    | 2011-11-01 17:55:00 UTC |
      | OM2006     | D589    | 2011-11-01 17:54:00 UTC |
      | OM2007     | D589    | 2011-11-03 16:31:00 UTC |
      +------------+---------+-------------------------+
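A LEFT JOIN keeps every row of the left table and fills the right-hand columns with NULL when no match exists. The sketch below reproduces that behaviour locally with Python's built-in sqlite3 rather than BigQuery, so it only illustrates the join semantics. The row `OM9999` is a made-up order with no detail record, and the function name `left_join_demo` is hypothetical.

```python
import sqlite3


def left_join_demo():
    """Run the slide's LEFT JOIN against a tiny in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE order_master (order_id TEXT, sales TEXT);
        CREATE TABLE order_detail (order_id TEXT, begin_use_date TEXT);
        INSERT INTO order_master VALUES ('OM2003', 'D589');
        INSERT INTO order_master VALUES ('OM9999', 'D000');  -- made-up, no detail row
        INSERT INTO order_detail VALUES ('OM2003', '2011-11-01 17:43:00');
    """)
    rows = conn.execute("""
        SELECT a.order_id, a.sales, b.begin_use_date
        FROM order_master a
        LEFT JOIN order_detail b ON a.order_id = b.order_id
        ORDER BY a.order_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in left_join_demo():
        print(row)
```

The unmatched order still appears in the result, with None (SQL NULL) in the begin_use_date column.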
  • 22. Flatten
      SELECT fullName, age, gender, citiesLived.place
      FROM (FLATTEN([dataset.tableId], children))
      WHERE (citiesLived.yearsLived > 1995) AND (children.age > 3)
      GROUP BY fullName, age, gender, citiesLived.place
      +------------+-----+--------+-------------------+
      | fullName   | age | gender | citiesLived_place |
      +------------+-----+--------+-------------------+
      | John Doe   | 22  | Male   | Stockholm         |
      | Mike Jones | 35  | Male   | Los Angeles       |
      | Mike Jones | 35  | Male   | Washington DC     |
      | Mike Jones | 35  | Male   | Portland          |
      | Mike Jones | 35  | Male   | Austin            |
      +------------+-----+--------+-------------------+
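Conceptually, FLATTEN turns a record holding a repeated field into one row per element of that field. A rough Python sketch of the idea, assuming a hypothetical `flatten` helper and made-up child data. Records whose list is empty are simply dropped here, which may not match how BigQuery treats them.

```python
def flatten(records, field):
    """Yield one copy of each record per element of its repeated `field`."""
    for rec in records:
        for item in rec.get(field, []):
            row = dict(rec)    # shallow copy so the input is left untouched
            row[field] = item  # replace the list with a single element
            yield row


# Made-up nested records, for illustration only.
PEOPLE = [
    {"fullName": "John Doe", "children": [{"name": "Jane", "age": 6}, {"name": "Jim", "age": 2}]},
    {"fullName": "Mike Jones", "children": [{"name": "Earl", "age": 10}]},
]

if __name__ == "__main__":
    for row in flatten(PEOPLE, "children"):
        # After flattening, a filter like children.age > 3 applies per row.
        if row["children"]["age"] > 3:
            print(row["fullName"], row["children"]["name"])
```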
  • 23. Regular Expression
      SELECT word, COUNT(word) AS count
      FROM [publicdata:samples.shakespeare]
      WHERE (REGEXP_MATCH(word,r'\w\w\'\w\w'))
      GROUP BY word
      ORDER BY count DESC
      LIMIT 3;
      +-------+-------+
      | word  | count |
      +-------+-------+
      | ne'er | 42    |
      | we'll | 35    |
      | We'll | 33    |
      +-------+-------+
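The pattern in this query looks for two word characters, an apostrophe, then two more word characters. A quick way to try it outside BigQuery is Python's `re` module; this sketch assumes REGEXP_MATCH behaves like an unanchored search, and the helper name `matches` is hypothetical.

```python
import re

# Two word characters, an apostrophe, two word characters.
PATTERN = re.compile(r"\w\w'\w\w")


def matches(word):
    """Return True if the pattern occurs anywhere in the word."""
    return PATTERN.search(word) is not None


if __name__ == "__main__":
    for w in ["ne'er", "we'll", "We'll", "o'er", "the"]:
        print(w, matches(w))
```

Words such as "o'er" do not match because only one character precedes the apostrophe.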
  • 24. Time Function
      SELECT
        TOP (FORMAT_UTC_USEC(timestamp * 1000000), 5) AS top_revision_time,
        COUNT (*) AS revision_count
      FROM [publicdata:samples.wikipedia];
      +----------------------------+----------------+
      | top_revision_time          | revision_count |
      +----------------------------+----------------+
      | 2002-02-25 15:51:15.000000 | 20971          |
      | 2002-02-25 15:43:11.000000 | 15955          |
      | 2010-01-14 15:52:34.000000 | 3              |
      | 2009-12-31 19:29:19.000000 | 3              |
      | 2009-12-28 18:55:12.000000 | 3              |
      +----------------------------+----------------+
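In this query the timestamp column holds seconds since the Unix epoch, so it is multiplied by 1000000 to get the microseconds that FORMAT_UTC_USEC expects. A minimal Python sketch of the same conversion, assuming the output format shown in the result table; `format_utc_usec` is a hypothetical local stand-in, not the BigQuery function.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_utc_usec(usec):
    """Format microseconds since the Unix epoch as 'YYYY-MM-DD HH:MM:SS.uuuuuu' in UTC."""
    return (EPOCH + timedelta(microseconds=usec)).strftime("%Y-%m-%d %H:%M:%S.%f")


if __name__ == "__main__":
    seconds = 1014652275  # epoch seconds for the first row of the result table
    print(format_utc_usec(seconds * 1000000))
```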
  • 25. URL Function
      SELECT DOMAIN(repository_homepage) AS user_domain, COUNT(*) AS activity_count
      FROM [publicdata:samples.github_timeline]
      GROUP BY user_domain
      HAVING user_domain IS NOT NULL AND user_domain != ''
      ORDER BY activity_count DESC
      LIMIT 5;
      +-----------------+----------------+
      | user_domain     | activity_count |
      +-----------------+----------------+
      | github.com      | 281879         |
      | google.com      | 34769          |
      | khanacademy.org | 17316          |
      | sourceforge.net | 15103          |
      | mozilla.org     | 14091          |
      +-----------------+----------------+
  • 27. Prepare ● Prepare a Google Cloud Platform project ● Create a Service Account ● Generate a p12 key from the Service Account
  • 28. Google Service Account: web server application vs. service account
  • 29. Prepare Authentication: convert the p12 key to a pem key
      $ openssl pkcs12 -in privatekey.p12 -out privatekey.pem -nocerts
      $ openssl rsa -in privatekey.pem -out key.pem
  • 30. Node.js - bigquery module
      var bq = require('bigquery')
        , prjId = 'your-bigquery-project-id';
      bq.init({
        client_secret: '/path/to/client_secret.json',
        key_pem: '/path/to/key.pem'
      });
      bq.job.listds(prjId, function(e,r,d){
        if(e) console.log(e);
        console.log(JSON.stringify(d));
      });
      To operate on BigQuery, call the functions under bq.job.
      For the bigquery module, see: https://github.com/peihsinsu/bigquery
  • 31. Google Drive way - Apps Script
      /* Ref: https://developers.google.com/apps-script/advanced/bigquery */
      // projectId is not defined on the slide; set it to your own project ID.
      var projectId = 'your-bigquery-project-id';
      var request = {
        query: 'SELECT TOP(word, 30) AS word, COUNT(*) AS word_count ' +
               'FROM publicdata:samples.shakespeare WHERE LENGTH(word) > 10;'
      };
      var queryResults = BigQuery.Jobs.query(request, projectId);
      var jobId = queryResults.jobReference.jobId;
      queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId);
      var rows = queryResults.rows;
      // Keep fetching while the response carries a pageToken for the next page.
      while (queryResults.pageToken) {
        queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId, {
          pageToken: queryResults.pageToken
        });
        rows = rows.concat(queryResults.rows);
      }
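The loop in the Apps Script example is a generic page-token pattern: fetch a page, collect its rows, and repeat while the response carries a pageToken. A small Python sketch of the same pattern, using a hypothetical `fetch_all_rows` helper and made-up in-memory pages in place of real BigQuery.Jobs.getQueryResults calls:

```python
def fetch_all_rows(get_page):
    """Collect rows from every page.

    get_page(token) must return a dict with 'rows' and, if more pages
    remain, a 'pageToken' for the next call.
    """
    page = get_page(None)
    rows = list(page.get("rows", []))
    while page.get("pageToken"):
        page = get_page(page["pageToken"])
        rows.extend(page.get("rows", []))
    return rows


# Made-up pages standing in for the paged API responses.
FAKE_PAGES = {
    None: {"rows": [1, 2], "pageToken": "p2"},
    "p2": {"rows": [3, 4], "pageToken": "p3"},
    "p3": {"rows": [5]},
}

if __name__ == "__main__":
    print(fetch_all_rows(FAKE_PAGES.get))
```

Passing the page fetcher in as a function keeps the loop testable without network access.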
  • 32. References
      ● Features: https://cloud.google.com/products/bigquery#features
      ● Case Studies: https://cloud.google.com/products/bigquery#case-studies
      ● Pricing: https://cloud.google.com/products/bigquery#pricing
      ● Documentation: https://cloud.google.com/products/bigquery#documentation
      ● Query Reference: https://developers.google.com/bigquery/query-reference