SAS 9.3
Solve next-generation
problems with SAS®
Give a man a fish and you feed him for a day; teach
a man to fish and you feed him for a lifetime.
Presentation Outline
Introduction to the SAS Environment
1. SAS Introduction
2. SAS Programs
3. SAS Data Sets and Data Libraries
4. Creating SAS Data Sets
What is SAS?
• SAS is a comprehensive statistical software system which
integrates utilities for storing, modifying, analyzing, and
graphing data.
• SAS runs on both Windows and UNIX platforms
• SAS is used in a wide range of industries such as
healthcare, education, financial services, life sciences,…
• Check out the webpage to learn more
• http://www.sas.com/
Who is SAS?
2012 Worldwide Results
Breakdown by Industry Sector
2012 SAS Annual Report
More than 1,500 banks with SAS
Evolution of SAS
SAS Banking Analytics Architecture
SAS User Interface
Windows: Editor Window, Log Window, Explorer Window, Output Window (not shown), Results Window (not shown)
Tool bar (similar to Windows applications):
• Run button – click on this button to run SAS code
• Click here for SAS help
• New Window button
• Save button
Editor Window
The Editor Window contains inputted data
sets and SAS programs
Explorer Window
Libraries Folder - Contains data sets created in SAS
Libraries Folder
Contents of the Libraries
Folder
The Work Folder contains
data sets created in SAS
Contents of the Work Folder
These are the data sets that
have been created in SAS
through inputting data and
by creating data sets in SAS
programs
Log Window
The Log Window contains a record
of all commands submitted to
SAS and shows errors in the
commands.
Output Window
The Output Window contains output
based on SAS programs submitted in the
Editor Window.
Results Window
The Results Window shows a
listing of SAS programs
that have been submitted
in the order that they were
submitted.
Click on any procedure to
view all output parts of the
procedure and click on any
individual part to view the
actual output.
SAS Help
SAS Programs
• File extension - .sas
• Editor window has four uses:
– Access and edit existing SAS
programs
– Write new SAS programs
– Submitting SAS programs for
execution
– Saving SAS programs
• SAS program
– Sequence of steps that the user
submits for execution
• Submitting SAS programs
– Entire program
– Selection of the program
• 2 Basic steps in SAS programs:
– Data Steps
• Typically used to create SAS
datasets and manipulate
data,
• Begins with DATA statement
– Proc Steps
• Typically used to process
SAS data sets
• Begins with PROC statement
• The end of the data or proc steps
are indicated by:
– RUN statement – most steps
– QUIT statement – some steps
– Beginning of another step (DATA
or PROC statement)
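As a minimal sketch of the two kinds of steps, the program below has one DATA step and one PROC step, each ended by a RUN statement. It assumes the SASHELP.CLASS sample data set that ships with SAS is available; the names work.heights and height_cm are made up for illustration.

```sas
/* DATA step: read an existing data set and add a new variable */
data work.heights;
    set sashelp.class;
    height_cm = height * 2.54;   /* Height in SASHELP.CLASS is in inches */
run;

/* PROC step: process the data set created above */
proc print data=work.heights;
run;
```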
• SAS Data Libraries
– Contain SAS data sets
– Identified by assigning a library
reference name – libref
– Temporary
• Work library
• SAS data files are deleted
when session ends
• Library reference name not
necessary
– Permanent
• SAS data sets are saved
after session ends
• SASUSER library
• You can create and access
your own libraries
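The difference between temporary and permanent libraries can be sketched as follows. The libref mylib and the folder path are assumptions for illustration; the folder must already exist on your system.

```sas
/* Hypothetical folder; replace with a directory that exists on your machine */
libname mylib 'C:\sasdata';

/* Temporary: stored in the Work library and deleted when the session ends */
data work.class_temp;
    set sashelp.class;
run;

/* Permanent: stored as a file in the folder assigned to mylib */
data mylib.class_perm;
    set sashelp.class;
run;
```

A data set written without a libref (for example `data class_temp;`) goes to the Work library.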
SAS Data Sets and Data Libraries
Presentation Outline
1. Data Set Information
2. Data Set Manipulation
3. Combining Data Sets
A. Concatenating/Appending
B. Merging
Working With SAS Data Sets
• Proc Contents
– Output contains a table of contents of the specified data set
– Data Set Information
• Data set name
• Number of observations
• Number of Variables
– Variable Information
• Type (numeric or character)
• Length
– Syntax:
PROC CONTENTS DATA=input_data_set;
RUN;
Data Set Information
• Create a new SAS data set using an existing SAS data set as input
– Specify name of the new SAS data set after the DATA statement
– Use SET statement to identify SAS data set being read
– Syntax:
DATA output_data_set;
SET input_data_set;
<additional SAS statements>;
RUN;
– By default the SET statement reads all observations and variables from the
input data set into the output data set.
Data Set Manipulation
• Assignment Statements
– Evaluate an expression
– Assign resulting value to a variable
– General Form: variable = expression;
– Example: miles_per_hour = distance/time;
• SAS Functions
– Perform arithmetic functions, compute simple statistics, manipulate
dates, etc.
– General Form: variable=function_name(argument1, argument2,…);
– Example: Time_worked = sum(Day1,Day2, Day3, Day4, Day5);
Data Set Manipulation
• Conditional Processing
– Uses IF-THEN-ELSE logic
– General Form: IF <expression1> THEN <statement>;
ELSE IF <expression2> THEN <statement>;
ELSE <statement>;
– <expression> is a true/false statement, such as:
• Day1=Day2, Day1 > Day2, Day1 < Day2
• Day1+Day2=10
• Sum(day1,day2)=10
• Day1=5 and Day2=5
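The assignment, function, and IF-THEN-ELSE forms above can be combined in one DATA step. This is a sketch only: the data set work.timesheet, its values, and the variable Workload are invented for illustration.

```sas
/* Hypothetical input: hours worked on each of five days */
data work.timesheet;
    input Employee $ Day1 Day2 Day3 Day4 Day5;
    datalines;
Ana 8 8 8 8 8
Ben 4 . 6 5 5
Cam 9 9 9 9 9
;
run;

data work.timesheet2;
    set work.timesheet;
    /* The SUM function ignores missing values; Day1+Day2+... would not */
    Time_worked = sum(Day1, Day2, Day3, Day4, Day5);
    /* Declare the length first so longer values are not truncated */
    length Workload $ 9;
    if Time_worked > 40 then Workload = 'Overtime';
    else if Time_worked = 40 then Workload = 'Full time';
    else Workload = 'Part time';
run;
```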
Data Set Manipulation
• Conditional Processing
Symbolic    Mnemonic           Example
=           EQ                 IF region = 'Spain';
~= or ^=    NE                 IF region ne 'Spain';
>           GT                 IF rainfall > 20;
<           LT                 IF rainfall lt 20;
>=          GE                 IF rainfall ge 20;
<=          LE                 IF rainfall <= 20;
&           AND                IF rainfall ge 20 & temp < 90;
| or !      OR                 IF rainfall ge 20 OR temp < 90;
            IN                 IF region IN ('Rain', 'Spain', 'Plain');
            IS NOT MISSING     WHERE region IS NOT MISSING;
            BETWEEN AND        WHERE region BETWEEN 'Plain' AND 'Spain';
            CONTAINS           WHERE region CONTAINS 'ain';
Note: IS NOT MISSING, BETWEEN AND, and CONTAINS are accepted only in WHERE expressions (WHERE statement, WHERE= data set option, PROC SQL), not in IF statements.
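A short sketch of these operators in use. The data set work.weather and its values are invented for illustration. The first step uses a subsetting IF; the second uses a WHERE statement, which is where CONTAINS and BETWEEN AND are accepted in a DATA step.

```sas
/* Hypothetical weather data */
data work.weather;
    input region $ rainfall temp;
    datalines;
Spain 25 85
Plain 10 95
Rain 30 70
;
run;

/* Subsetting IF: keep rows where either condition is true */
data work.wet_or_mild;
    set work.weather;
    if rainfall ge 20 or temp < 90;
run;

/* WHERE statement: filters rows as they are read from the input data set */
data work.ain_regions;
    set work.weather;
    where region contains 'ain' and rainfall between 20 and 40;
run;
```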
Data Set Manipulation
• PROC SORT sorts data according to specified variables
• General Form:
PROC SORT DATA=input_data_set <options>;
BY Variable1 Variable2;
RUN;
• Sorts data according to Variable1 and then Variable2;
• By default, SAS sorts data in ascending order
– Number low to high
– A to Z
• Use the DESCENDING keyword before a variable for numbers high to low and letters Z to A
– BY City DESCENDING Population;
– SAS sorts data first by city A to Z and then Population high to low
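For instance, assuming the SASHELP.CLASS sample data set is available, the sketch below sorts by Sex (A to Z) and then by Age from high to low. The OUT= option writes the sorted copy to a new data set (work.class_sorted is a made-up name) so the input is left unchanged.

```sas
proc sort data=sashelp.class out=work.class_sorted;
    by Sex descending Age;   /* DESCENDING applies only to Age */
run;
```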
Data Set Manipulation
• Merging Data Sets
– One-to-One Match Merge
• A single record in a data set corresponds to a single record in all other
data sets
• Example: Patient and Billing Information
– One-to-Many Match Merge
• Matching one observation from one data set to multiple observations in
other data sets
• Example: County and State Information
– Note: Each data set must be sorted by the matching (BY) variables before
merging can be done (PROC SORT)
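A match merge can be sketched as below. The data sets work.patients and work.billing, their variables, and their values are invented for illustration; the IN= flags a and b are temporary variables that record which input contributed to each observation.

```sas
/* Hypothetical patient and billing data (one patient can have many bills) */
data work.patients;
    input patient_id name $;
    datalines;
1 Ana
2 Ben
3 Cam
;
run;

data work.billing;
    input patient_id amount;
    datalines;
2 150
1 200
1 75
;
run;

/* Both inputs must be sorted by the BY variable before merging */
proc sort data=work.patients; by patient_id; run;
proc sort data=work.billing;  by patient_id; run;

data work.patient_billing;
    merge work.patients (in=a) work.billing (in=b);
    by patient_id;
    if a and b;   /* keep only patients that also have billing records */
run;
```

Dropping the `if a and b;` line keeps every observation from either data set, with missing values where there is no match.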
Combining Data Sets
• Concatenating (or Appending)
• Stacks each data set upon the other
• If one data set does not have a variable that the other datasets do, the
variable in the new data set is set to missing for the observations from
that data set.
• General Form:
DATA output_data_set;
SET data1 data2;
run;
• PROC APPEND may also be used
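A sketch of both approaches, using invented data sets (work.q1, work.q2, work.q3) in which work.q2 has an extra variable:

```sas
data work.q1;
    input id sales;
    datalines;
1 100
2 200
;
run;

data work.q2;
    input id sales region $;
    datalines;
3 300 East
;
run;

data work.q3;
    input id sales;
    datalines;
4 400
;
run;

/* Stack q1 on top of q2; region is missing for the rows that come from q1 */
data work.all_sales;
    set work.q1 work.q2;
run;

/* PROC APPEND adds the rows of q3 to the end of the existing data set q1 */
proc append base=work.q1 data=work.q3;
run;
```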
Combining Data Sets
Presentation Outline
1. Print Procedure
2. Plot Procedure
3. Univariate Procedure
4. Means Procedure
5. Freq Procedure
Summary Procedures
• PROC PRINT is used to print data to the output window
• By default, prints all observations and variables in the SAS data set
• General Form: PROC PRINT DATA=input_data_set <options>;
<optional SAS statements>;
RUN;
• Some Options
– input_data_set (obs=n) - Specifies the number of observations to be printed in the output
– NOOBS - Suppresses printing of the observation number
– LABEL - Prints the labels instead of variable names
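As an illustration of these options, assuming the SASHELP.CLASS sample data set is available (the label text is made up):

```sas
/* Print the first 5 observations, without the Obs column, using labels */
proc print data=sashelp.class (obs=5) noobs label;
    var Name Age Height;
    label Height = 'Height (inches)';
run;
```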
Print Procedure
• PROC PLOT is used to create basic scatter plots of the data
• Use PROC GPLOT or PROC SGPLOT for more sophisticated plots
• General Form:
PROC PLOT DATA=input_data_set;
PLOT vertical_variable * horizontal_variable/<options>;
RUN;
• By default, SAS uses letters to mark points on plots
– A for a single observation, B for two observations at the same point, etc.
• To specify a different character to represent a point
– PLOT vertical_variable * horizontal_variable = '*';
• To specify a third variable to use to mark points
– PLOT vertical_variable * horizontal_variable = third_variable;
• To plot more than one variable on the vertical axis
– PLOT vertical_variable1 * horizontal_variable = '2'
vertical_variable2 * horizontal_variable = '1' / OVERLAY;
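A brief sketch of these PLOT requests, assuming the SASHELP.CLASS sample data set is available:

```sas
proc plot data=sashelp.class;
    plot Weight * Height = '*';             /* mark every point with an asterisk */
    plot Weight * Height = Sex;             /* mark points with the first character of Sex */
    plot Weight * Age = 'W'
         Height * Age = 'H' / overlay;      /* two vertical variables on one plot */
run;
```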
Plot Procedure
• PROC UNIVARIATE is used to examine the distribution of data
• Produces summary statistics for a single variable
– Includes mean, median, mode, standard
deviation, skewness, kurtosis, quantiles, etc.
• General Form:
PROC UNIVARIATE DATA=input_data_set <options>;
VAR variable1 variable2 variable3;
RUN ;
• If the VAR statement is not used, summary statistics will be produced for all
numeric variables in the input data set.
• Options include:
– PLOT – produces Stem-and-leaf plot, Box plot, and Normal probability plot;
– NORMAL – produces tests of Normality
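For example, assuming the SASHELP.CLASS sample data set is available, both options can be requested together:

```sas
/* Summary statistics, normality tests, and text plots for two variables */
proc univariate data=sashelp.class normal plot;
    var Height Weight;
run;
```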
Univariate Procedure
• PROC MEANS is similar to the Univariate procedure
• General Form:
PROC MEANS DATA=input_data_set <options>;
<Optional SAS statements>;
RUN;
• With no options or optional SAS statements, the Means procedure will print out
the number of non-missing values, mean, standard deviation, minimum, and
maximum for all numeric variables in the input data set
• Optional SAS Statements
– VAR Variable1 Variable2;
• Specifies which numeric variables statistics will be produced for
– BY Variable1 Variable2;
• Calculates statistics for each combination of the BY variables
– Output out=output_data_set;
• Creates data set with the default statistics
Means Procedure
• Options
– Statistics Available
– Note: The default confidence level for confidence limits is 95% (ALPHA=0.05).
Use the ALPHA= option to specify a different level.
Keyword          Statistic                      Keyword     Statistic
CLM              Two-Sided Confidence Limits    RANGE       Range
CSS              Corrected Sum of Squares       SKEWNESS    Skewness
CV               Coefficient of Variation       STDDEV      Standard Deviation
KURTOSIS         Kurtosis                       STDERR      Standard Error of Mean
LCLM             Lower Confidence Limit         SUM         Sum
MAX              Maximum Value                  SUMWGT      Sum of Weight Variables
MEAN             Mean                           UCLM        Upper Confidence Limit
MIN              Minimum Value                  USS         Uncorrected Sum of Squares
N                Number Non-missing Values      VAR         Variance
NMISS            Number Missing Values          PROBT       Probability for Student's t
MEDIAN (or P50)  Median                         T           Student's t
Q1 (P25)         25% Quantile                   Q3 (P75)    75% Quantile
P1               1% Quantile                    P5          5% Quantile
P10              10% Quantile                   P90         90% Quantile
P95              95% Quantile                   P99         99% Quantile
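A sketch that puts the PROC MEANS statements and statistic keywords together, assuming the SASHELP.CLASS sample data set is available. The output names (work.class_by_sex, work.class_stats, mean_height, mean_weight) are made up. The data are sorted first because the BY statement expects sorted input.

```sas
proc sort data=sashelp.class out=work.class_by_sex;
    by Sex;
run;

/* Request specific statistics and 90% confidence limits (ALPHA=0.10) */
proc means data=work.class_by_sex n mean median clm alpha=0.10;
    var Height Weight;
    by Sex;
    /* Save the means to a data set, one row per value of Sex */
    output out=work.class_stats mean=mean_height mean_weight;
run;
```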
Means Procedure
• PROC FREQ is used to generate frequency tables
• Most common usage is create table showing the distribution of categorical
variables
• General Form:
PROC FREQ DATA=input_data_set;
TABLE variable1*variable2*variable3/<options>;
RUN;
• Options
– LIST – prints cross tabulations in list format rather than grid
– MISSING – specifies that missing values should be included in the
tabulations
– OUT=output_data_set – creates a data set containing frequencies, list
format
– NOPRINT – suppress printing in the output window
• Use BY statement to get percentages within each category of a variable
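For example, assuming the SASHELP.CLASS sample data set is available (the output data set name is made up):

```sas
/* Cross-tabulate Sex by Age in list format and save the counts */
proc freq data=sashelp.class;
    tables Sex * Age / list missing out=work.sex_age_freq;
run;
```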
Freq Procedure
• Proc SQL is the SAS implementation of SQL
• Proc SQL is a powerful SAS procedure that combines the functionality
of the SAS data step with the SQL language
• Proc SQL can sort, subset, merge and summarize data – all at once
• Proc SQL can combine standard SQL functions with virtually all SAS
functions
• Proc SQL can work remotely with RDBMS such as Oracle
Introduction - What is PROC SQL
PROC SQL – What can it do?
– To perform a query – Using SELECT statement.
– To save queried result into SAS dataset – Using CREATE TABLE
statement
– To save the query itself – Using CREATE VIEW statement
– To sort dataset
– To merge more than one datasets in a number of ways
– To import dataset from Oracle Clinical to SAS
– To enter new records into a SAS dataset
– To modify/ edit the SAS dataset
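The last two items (entering and modifying records) can be sketched as below. The example works on a copy of the SASHELP.CLASS sample data set, assumed to be available; the table name work.class_copy and the inserted values are invented for illustration.

```sas
proc sql;
    /* Work on a copy so the sample data set is left unchanged */
    create table work.class_copy as
    select * from sashelp.class;

    /* Enter a new record */
    insert into work.class_copy (Name, Sex, Age, Height, Weight)
    values ('Zoe', 'F', 14, 62.0, 100.0);

    /* Modify an existing record */
    update work.class_copy
    set Weight = Weight + 1
    where Name = 'Zoe';
quit;
```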
PROC SQL - Why
• Advantages of using SQL
– Combined functionality
– Faster for smaller tables
– SQL code is more portable to non-SAS applications
– Does not require presorting
– Does not require common variable names to join on (the join columns
must have the same type)
• The SELECT statement performs a query. On its own it does not create any dataset.
• The simplest PROC SQL step needs 3 statements: PROC SQL, SELECT … FROM, and QUIT
• By default, the query result is printed; use the NOPRINT option on PROC SQL to
suppress this
• Begin with PROC SQL; end with QUIT, not RUN
• Each query needs at least one SELECT … FROM
Performing Query – SELECT
Statement
PROC SQL;
SELECT *
FROM VITALS;
QUIT;
Performing Query – SELECT
Statement
To select all the variables
use ‘*’ after SELECT
statement
PROC SQL;
SELECT Patient, pulse
FROM VITALS;
QUIT;
Performing Query – SELECT
Statement
To select only particular variable(s) write down the variable names after SELECT
statement. Variable names should be separated by commas.
PROC SQL;
SELECT DISTINCT Patient
FROM VITALS;
QUIT;
Performing Query – SELECT
Statement
To select only distinct values and remove duplicate rows from the query result, use DISTINCT.
PROC SQL ;
SELECT *
FROM Vitals
ORDER BY date;
QUIT;
Ordering/Sorting Query Results
• SELECT * means we select all variables from dataset VITALS
• Put ORDER BY after FROM.
Sorting by Date
PROC SQL;
SELECT *
FROM vitals
WHERE Name CONTAINS 'J';
QUIT;
Subsetting:
- Character searching in WHERE
• Always put WHERE after FROM
• CONTAINS in WHERE statement only for character variables
Print observations with name
containing ‘J’.
PROC SQL;
SELECT *
FROM vitals
WHERE Name LIKE '%o%';
QUIT;
Subsetting
- Character searching in WHERE
• LIKE in the WHERE clause works only for character variables; % matches any sequence of characters
Print observations with name
containing 'o' anywhere (the match is case-sensitive).
• In SELECT, the results of a query are converted to an output object (printing).
• Query results can also be stored as data.
• The CREATE TABLE statement creates a table with the results of a query.
• The CREATE VIEW statement stores the query itself as a view. Either way, the
data identified in the query can be used in later SQL statements or in other SAS
steps.
Creating New Data
PROC SQL;
CREATE TABLE bp
AS SELECT
patient, date, pulse
FROM Vitals
WHERE temp>98.5;
QUIT;
Creating New Data - Create Table
CREATE TABLE … AS…
Statement Creates a New
table from an existing table.
These statements will
copy all the variables to
the new dataset
PROC SQL;
CREATE TABLE bp
AS SELECT *
FROM Vitals
WHERE temp>98.5;
QUIT;
Creating New Data - Create Table
We can also assign a different variable name, label, length, and format
PROC SQL;
CREATE TABLE bp
AS SELECT
patient AS Patient LABEL='Subject number' LENGTH =5,
date AS Date LABEL='Date of Expt' FORMAT=WORDDATE8.,
pulse,
temp
FROM Vitals
WHERE temp>98.5;
QUIT;
PROC SQL;
CREATE VIEW bp
AS SELECT patient, date, pulse, temp
FROM Vitals
WHERE temp>98.5;
QUIT;
Creating New Data - Create View
• First step: creating a view. No output is produced.
• When a table is created, the query is executed and the resulting data is stored
in a file. When a view is created, the query itself is stored in the file. The data is
not accessed at all in the process of creating a view.
• The order of the clauses is important
• CASE … END AS should be placed between SELECT and FROM
• Use WHEN … THEN … ELSE … to redefine variables
• New variable GENDER is created from PATIENT.
Case Logic
- reassigning/recategorize
PROC SQL;
CREATE TABLE BP AS
SELECT Patient, Pulse,
CASE Patient
WHEN 101 THEN 'Male'
WHEN 102 THEN 'Female'
WHEN 103 THEN 'Female'
ELSE 'Male'
END AS Gender
FROM Vitals;
QUIT;
New Variable
Source variable
Combining Datasets: Joins
Join type and the equivalent DATA step MERGE condition (with IN= flags a and b):
Full Join    If a or b;
Inner Join   If a and b;
Left Join    If a;
Right Join   If b;
Dataset: Dosing
Combining Datasets: Joins
Dataset: Vitals
Combining Datasets: Joins
• No prior sorting required – one advantage over the DATA step MERGE
• Use a comma (,) to separate the two datasets in FROM
• Without WHERE, every possible combination of rows from the two tables is
produced (a Cartesian product), and all columns are included
Join Tables (Merge datasets)
- Inner Join: Using WHERE
PROC SQL;
CREATE TABLE new AS
SELECT dosing.patient,
dosing.date,
dosing.med,
vitals.pulse,
vitals.temp
FROM dosing, vitals
WHERE dosing.patient=vitals.patient
AND dosing.date=vitals.date;
QUIT;
Join Tables (Merge datasets)
- Left Joins using ON
The resulting dataset contains every observation from the DOSING dataset (the left
table), whether or not it has a match in VITALS; pulse and temp are missing for unmatched rows.
PROC SQL;
CREATE TABLE new1 AS
SELECT dosing.patient,
dosing.date,
dosing.med,
vitals.pulse,
vitals.temp
FROM dosing LEFT JOIN vitals
ON dosing.patient=vitals.patient
AND dosing.date=vitals.date;
QUIT;
Join Tables (Merge datasets)
- Right Joins using ON
The resulting dataset contains every observation from the VITALS dataset (the right
table), whether or not it has a match in DOSING; the DOSING columns are missing for unmatched rows.
PROC SQL;
CREATE TABLE new1 AS
SELECT dosing.patient,
dosing.date,
dosing.med,
vitals.pulse,
vitals.temp
FROM dosing RIGHT JOIN vitals
ON dosing.patient=vitals.patient
AND dosing.date=vitals.date;
QUIT;
Join Tables (Merge datasets)
- Full Joins using ON
The resulting dataset contains every observation that comes from at least one of the
two datasets.
PROC SQL;
CREATE TABLE new1 AS
SELECT dosing.patient,
dosing.date,
dosing.med,
vitals.pulse,
vitals.temp
FROM dosing FULL JOIN vitals
ON dosing.patient=vitals.patient
AND dosing.date=vitals.date;
QUIT;
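One point worth sketching for the full join: rows that exist only in VITALS have no DOSING values, so selecting dosing.patient and dosing.date leaves the key columns missing for those rows. A common remedy is the COALESCE function, which returns the first non-missing argument. The small DOSING and VITALS tables below are invented stand-ins for the presentation's data, and the table name work.new_full is made up.

```sas
/* Hypothetical stand-ins for the DOSING and VITALS datasets */
data work.dosing;
    input patient date :date9. med $;
    format date date9.;
    datalines;
101 01JAN2012 A
102 01JAN2012 B
;
run;

data work.vitals;
    input patient date :date9. pulse temp;
    format date date9.;
    datalines;
101 01JAN2012 72 98.6
103 01JAN2012 80 99.1
;
run;

proc sql;
    create table work.new_full as
    select coalesce(d.patient, v.patient) as patient,
           coalesce(d.date, v.date) as date format=date9.,
           d.med,
           v.pulse,
           v.temp
    from work.dosing as d full join work.vitals as v
         on d.patient = v.patient
        and d.date = v.date;
quit;
```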
SQL Functions
♦ PROC SQL supports almost all the functions available in the SAS DATA
step; they can be used in a PROC SQL SELECT statement
♦ Common Functions:
◘ COUNT
◘ DISTINCT
◘ MAX
◘ MIN
◘ SUM
◘ AVG
◘ VAR
◘ STD
◘ STDERR
◘ NMISS
◘ RANGE
◘ SUBSTR
◘ LENGTH
◘ UPPER
◘ LOWER
◘ CONCAT
◘ ROUND
◘ MOD
PROC SQL functions
PROC SQL;
SELECT avg(Age) AS mean,
std(Age) AS sd,
min(Age) AS min,
max(Age) AS max,
count(Age) AS count,
N (Age) AS n
FROM sashelp.class;
quit;
PROC SQL functions
PROC SQL;
SELECT sex,
avg(Age) AS mean,
std(Age) AS sd,
min(Age) AS min,
max(Age) AS max,
count(Age) AS count,
N (Age) AS n
FROM sashelp.class
GROUP BY Sex;
quit;
/*Deleting rows*/
PROC SQL;
DELETE
FROM class
WHERE age le 13;
QUIT;
Editing Data – Deleting rows and
Dropping columns
/*Dropping variables*/
PROC SQL;
CREATE TABLE New (DROP=age) AS
SELECT *
FROM Class;
QUIT;
• Columns can be dropped either by listing only the wanted columns in SELECT or by using the DROP= option on the created table
Importing data from OC to SAS
Learning SAS
SAS Global Certification Program
Questions and comments
THANK YOU VERY MUCH!
Luis Barragán Scavino
Jorge Rodríguez Mamani
Calle Alcanfores 1255
Miraflores, Lima 18, Perú
+51 99 417 6340
luis.barragan@bigdata.pe
jorge.rodriguez@bigdata.pe

 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Introduction to SAS Analytics Software

  • 2. Give a man a fish and you feed him for a day; teach a man to fish and you feed him for a lifetime.
  • 4. Introduction to the SAS Environment 1. SAS Introduction 2. SAS Programs 3. SAS Data Sets and Data Libraries 4. Creating SAS Data Sets
  • 5. What is SAS? • SAS is a comprehensive statistical software system which integrates utilities for storing, modifying, analyzing, and graphing data. • SAS runs on both Windows and UNIX platforms • SAS is used in a wide range of industries such as healthcare, education, financial services, life sciences,… • Check out the webpage to learn more • http://www.sas.com/
  • 7. 2012 Worldwide Results Breakdown by Industry Sector 2012 SAS Annual Report
  • 8. More than 1,500 banks with SAS
  • 10. SAS Banking Analytics Architecture
  • 11. SAS User Interface Log Window Explorer Window Editor Window Output Window (not shown) Results Window (not shown) Run button – click on this button to run SAS code Click here for SAS help New Window button Save button Tool bar similar to Windows applications
  • 12. Editor Window The Editor Window contains inputted data sets and SAS programs
  • 13. Explorer Window Explorer Window Libraries Folder - Contains data sets created in SAS
  • 14. Libraries Folder Contents of the Libraries Folder The Work Folder contains data sets created in SAS Contents of the Work Folder These are the data sets that have been created in SAS through inputting data and by creating data sets in SAS programs
  • 15. Log Window The Log Window contains a record of all commands submitted to SAS and shows errors in the commands.
  • 16. Output Window The Output Window contains output based on SAS programs submitted in the Editor Window.
  • 17. Results Window The Results Window shows a listing of SAS programs that have been submitted in the order that they were submitted. Click on any procedure to view all output parts of the procedure and click on any individual part to view the actual output.
  • 19. SAS Programs
      • File extension: .sas
      • The Editor window has four uses:
          – Access and edit existing SAS programs
          – Write new SAS programs
          – Submit SAS programs for execution
          – Save SAS programs
      • SAS program: a sequence of steps that the user submits for execution
      • Submitting SAS programs:
          – Entire program
          – Selection of the program
      • Two basic steps in SAS programs:
          – DATA steps
              • Typically used to create SAS data sets and manipulate data
              • Begin with a DATA statement
          – PROC steps
              • Typically used to process SAS data sets
              • Begin with a PROC statement
      • The end of a DATA or PROC step is indicated by:
          – RUN statement (most steps)
          – QUIT statement (some steps)
          – The beginning of another step (DATA or PROC statement)
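As a minimal sketch of the two step types, the program below has one DATA step followed by one PROC step. It assumes the sample data set sashelp.class that ships with SAS; the output name work.class_bmi and the variable bmi are made up for illustration.

```sas
/* DATA step: read sashelp.class and add a computed variable.       */
/* In sashelp.class, Height is in inches and Weight is in pounds.   */
data work.class_bmi;
    set sashelp.class;
    bmi = 703 * weight / (height ** 2);
run;

/* PROC step: print the first five observations of the new data set */
proc print data=work.class_bmi (obs=5) noobs;
run;
```

Each RUN statement closes its step; highlighting only the PROC PRINT lines and submitting them runs that step alone, provided the DATA step has already been run in the same session.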
  • 20. • SAS Data Libraries – Contain SAS data sets – Identified by assigning a library reference name – libref – Temporary • Work library • SAS data files are deleted when session ends • Library reference name not necessary – Permanent • SAS data sets are saved after session ends • SASUSER library • You can create and access your own libraries SAS Data Sets and Data Libraries
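A short sketch of the difference between a permanent and a temporary library. The libref mylib and the folder path are hypothetical; the path must point to a folder that already exists on your system.

```sas
/* Hypothetical folder: replace with a real, writable directory */
libname mylib 'C:\sasdata';

/* Two-level name: stored permanently in the mylib library */
data mylib.class_copy;
    set sashelp.class;
run;

/* One-level name: stored in the temporary Work library, */
/* so it is deleted when the SAS session ends            */
data class_temp;
    set sashelp.class;
run;
```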
  • 22. 1. Data Set Information 2. Data Set Manipulation 3. Combining Data Sets A. Concatenating/Appending B. Merging Working With SAS Data Sets
  • 23. • Proc Contents – Output contains a table of contents of the specified data set – Data Set Information • Data set name • Number of observations • Number of Variables – Variable Information • Type (numeric or character) • Length – Syntax: PROC CONTENTS DATA=input_data_set; RUN; Data Set Information
  • 24. • Create a new SAS data set using an existing SAS data set as input – Specify name of the new SAS data set after the DATA statement – Use SET statement to identify SAS data set being read – Syntax: DATA output_data_set; SET input_data_set; <additional SAS statements>; RUN; – By default the SET statement reads all observations and variables from the input data set into the output data set. Data Set Manipulation
  • 25. • Assignment Statements – Evaluate an expression – Assign resulting value to a variable – General Form: variable = expression; – Example: miles_per_hour = distance/time; • SAS Functions – Perform arithmetic functions, compute simple statistics, manipulate dates, etc. – General Form: variable=function_name(argument1, argument2,…); – Example: Time_worked = sum(Day1,Day2, Day3, Day4, Day5); Data Set Manipulation
  • 26. • Conditional Processing – Uses IF-THEN-ELSE logic – General Form: IF <expression1> THEN <statement>; ELSE IF <expression2> THEN <statement>; ELSE <statement>; – <expression> is a true/false statement, such as: • Day1=Day2, Day1 > Day2, Day1 < Day2 • Day1+Day2=10 • Sum(day1,day2)=10 • Day1=5 and Day2=5 Data Set Manipulation
  • 27. Data Set Manipulation: Conditional Processing (comparison and logical operators)
      Symbolic    Mnemonic          Example
      =           EQ                IF region = 'Spain';
      ~= or ^=    NE                IF region ne 'Spain';
      >           GT                IF rainfall > 20;
      <           LT                IF rainfall lt 20;
      >=          GE                IF rainfall ge 20;
      <=          LE                IF rainfall <= 20;
      &           AND               IF rainfall ge 20 & temp < 90;
      | or !      OR                IF rainfall ge 20 OR temp < 90;
                  IN                IF region IN ('Rain', 'Spain', 'Plain');
                  IS NOT MISSING    WHERE region IS NOT MISSING;
                  BETWEEN AND       WHERE region BETWEEN 'Plain' AND 'Spain';
                  CONTAINS          WHERE region CONTAINS 'ain';
      Note: IS NOT MISSING, BETWEEN ... AND, and CONTAINS are valid only in WHERE expressions, not in IF statements. Use straight quotes around character values.
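The data set manipulation slides can be pulled together in one DATA step. This is a sketch only, assuming sashelp.class as input; the names work.class_groups, age_group, and height_cm are invented for the example.

```sas
data work.class_groups;
    set sashelp.class;                     /* read all observations and variables */
    where sex is not missing;              /* WHERE expression filters the input  */

    length age_group $ 8;                  /* declare the character variable      */

    /* IF-THEN-ELSE logic */
    if age < 13 then age_group = 'Child';
    else if age <= 15 then age_group = 'Teen';
    else age_group = 'Older';

    /* assignment statement using a SAS function */
    height_cm = round(height * 2.54, 0.1);
run;
```

Declaring age_group with LENGTH first avoids truncation, since otherwise its length would be fixed by the first value assigned.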
  • 28. Data Set Manipulation: PROC SORT
      • PROC SORT sorts data according to the specified variables
      • General Form:
            PROC SORT DATA=input_data_set <options>;
                BY Variable1 Variable2;
            RUN;
      • Sorts data by Variable1 and then, within Variable1, by Variable2
      • By default, SAS sorts data in ascending order
          – Numbers low to high
          – Letters A to Z
      • Put the DESCENDING keyword before a variable in the BY statement to sort numbers high to low and letters Z to A
          – BY City DESCENDING Population;
          – SAS sorts data first by City A to Z and then by Population high to low
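A small illustration of the BY statement with DESCENDING, assuming sashelp.class as input. The OUT= option writes the sorted rows to a new data set (work.class_sorted, a made-up name) instead of replacing the input.

```sas
/* Sort by Sex ascending, then by Age from high to low within each Sex */
proc sort data=sashelp.class out=work.class_sorted;
    by sex descending age;
run;
```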
  • 29. • Merging Data Sets – One-to-One Match Merge • A single record in a data set corresponds to a single record in all other data sets • Example: Patient and Billing Information – One-to-Many Match Merge • Matching one observation from one data set to multiple observations in other data sets • Example: County and State Information – Note: Data must be sorted before merging can be done (PROC SORT) Combining Data Sets
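The slide gives no syntax for a match merge, so here is a hedged sketch of a one-to-many case. The data sets work.patients and work.billing and all their values are invented for illustration; the IN= flags a and b are just conventional names.

```sas
/* Hypothetical lookup table: one row per patient */
data work.patients;
    input patient name $;
    datalines;
101 Ana
102 Luis
103 Rosa
;
run;

/* Hypothetical detail table: zero or more rows per patient */
data work.billing;
    input patient amount;
    datalines;
101 250
101 75
103 120
;
run;

/* Both inputs must be sorted by the BY variable before merging */
proc sort data=work.patients; by patient; run;
proc sort data=work.billing;  by patient; run;

data work.patient_bills;
    merge work.patients (in=a) work.billing (in=b);
    by patient;
    if a and b;      /* keep only patients present in both data sets */
run;
```

Changing the last condition to `if a;` would keep every patient, with amount set to missing where no bill exists.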
  • 30. • Concatenating (or Appending) • Stacks each data set upon the other • If one data set does not have a variable that the other datasets do, the variable in the new data set is set to missing for the observations from that data set. • General Form: DATA output_data_set; SET data1 data2; run; • PROC APPEND may also be used Combining Data Sets
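Both ways of stacking data sets can be sketched as follows. The inputs work.boys and work.girls are hypothetical subsets built from sashelp.class just for this example.

```sas
/* Build two small data sets to stack */
data work.boys;
    set sashelp.class;
    where sex = 'M';
run;

data work.girls;
    set sashelp.class;
    where sex = 'F';
run;

/* Concatenate with SET: writes a new data set */
data work.all_students;
    set work.boys work.girls;
run;

/* Append in place: adds the rows of work.girls to the end of work.boys */
proc append base=work.boys data=work.girls;
run;
```

The SET approach rereads every observation, while PROC APPEND only reads the DATA= data set, which tends to matter when the BASE= data set is large.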
  • 32. 1. Print Procedure 2. Plot Procedure 3. Univariate Procedure 4. Means Procedure 5. Freq Procedure Summary Procedures
  • 33. • PROC PRINT is used to print data to the output window • By default, prints all observations and variables in the SAS data set • General Form: PROC PRINT DATA=input_data_set <options> <optional SAS statements>; RUN; • Some Options – input_data_set (obs=n) - Specifies the number of observations to be printed in the output – NOOBS - Suppresses printing observation number – LABEL - Prints the labels instead of variable names Print Procedure
  • 34. Plot Procedure
      • PROC PLOT is used to create basic scatter plots of the data
      • Use PROC GPLOT or PROC SGPLOT for more sophisticated plots
      • General Form:
            PROC PLOT DATA=input_data_set;
                PLOT vertical_variable * horizontal_variable / <options>;
            RUN;
      • By default, SAS uses letters to mark points on plots
          – A for a single observation, B for two observations at the same point, etc.
      • To specify a different character to represent a point
          – PLOT vertical_variable * horizontal_variable = '*';
      • To specify a third variable to use to mark points
          – PLOT vertical_variable * horizontal_variable = third_variable;
      • To plot more than one variable on the vertical axis
          – PLOT vertical_variable1 * horizontal_variable = '2' vertical_variable2 * horizontal_variable = '1' / OVERLAY;
  • 35. Univariate Procedure
      • PROC UNIVARIATE is used to examine the distribution of data
      • Produces summary statistics for each analysis variable
          – Includes mean, median, mode, standard deviation, skewness, kurtosis, quantiles, etc.
      • General Form:
            PROC UNIVARIATE DATA=input_data_set <options>;
                VAR variable1 variable2 variable3;
            RUN;
      • If the VAR statement is not used, summary statistics are produced for all numeric variables in the input data set
      • Options include:
          – PLOT: produces a stem-and-leaf plot, box plot, and normal probability plot
          – NORMAL: produces tests of normality
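For instance, the two options listed above can be combined in one call. This sketch assumes sashelp.class and looks at a single variable.

```sas
/* Distribution of Height, with normality tests and the basic plots */
proc univariate data=sashelp.class normal plot;
    var height;
run;
```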
  • 36. • Similar to the Univariate procedure • General Form: PROC MEANS DATA=input_data_set options; <Optional SAS statements>; RUN; • With no options or optional SAS statements, the Means procedure will print out the number of non-missing values, mean, standard deviation, minimum, and maximum for all numeric variables in the input data set • Optional SAS Statements – VAR Variable1 Variable2; • Specifies which numeric variables statistics will be produced for – BY Variable1 Variable2; • Calculates statistics for each combination of the BY variables – Output out=output_data_set; • Creates data set with the default statistics Means Procedure
  • 37. Means Procedure: Options – Statistics Available
      Keyword            Statistic
      N                  Number of Non-missing Values
      NMISS              Number of Missing Values
      MEAN               Mean
      MEDIAN (or P50)    Median
      MIN                Minimum Value
      MAX                Maximum Value
      RANGE              Range
      SUM                Sum
      SUMWGT             Sum of Weight Variables
      STDDEV             Standard Deviation
      STDERR             Standard Error of Mean
      VAR                Variance
      CV                 Coefficient of Variation
      CSS                Corrected Sum of Squares
      USS                Uncorrected Sum of Squares
      SKEWNESS           Skewness
      KURTOSIS           Kurtosis
      CLM                Two-Sided Confidence Limits
      LCLM               Lower Confidence Limit
      UCLM               Upper Confidence Limit
      T                  Student's t
      PROBT              Probability for Student's t
      P1                 1% Quantile
      P5                 5% Quantile
      P10                10% Quantile
      Q1 (P25)           25% Quantile
      Q3 (P75)           75% Quantile
      P90                90% Quantile
      P95                95% Quantile
      P99                99% Quantile
      Note: The default confidence level for confidence limits is 95% (ALPHA=0.05). Use the ALPHA= option to specify a different level.
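A sketch of PROC MEANS with a few of the statistic keywords, assuming sashelp.class. Because the BY statement needs sorted input, the data is sorted first; the names work.class_by_sex, work.class_stats, mean_height, and mean_weight are made up for the example.

```sas
/* BY-group processing requires data sorted by the BY variable */
proc sort data=sashelp.class out=work.class_by_sex;
    by sex;
run;

/* Request specific statistics and 90% confidence limits (ALPHA=0.10) */
proc means data=work.class_by_sex n mean median stddev clm alpha=0.10 maxdec=2;
    by sex;
    var height weight;
    /* Save the group means; the names map to the VAR variables in order */
    output out=work.class_stats mean=mean_height mean_weight;
run;
```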
  • 38. Freq Procedure
      • PROC FREQ is used to generate frequency tables
      • The most common usage is to create a table showing the distribution of categorical variables
      • General Form:
            PROC FREQ DATA=input_data_set;
                TABLE variable1*variable2*variable3 / <options>;
            RUN;
      • Options
          – LIST: prints cross tabulations in list format rather than as a grid
          – MISSING: specifies that missing values should be included in the tabulations
          – OUT=output_data_set: creates a data set containing the frequencies, in list format
          – NOPRINT: suppresses printing in the output window
      • Use a BY statement to get percentages within each category of a variable
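As an illustration of the options above, the following sketch cross-tabulates two variables of sashelp.class in list format and saves the counts. The output name work.sex_age_freq is hypothetical.

```sas
/* Two-way table of Sex by Age, printed as a list and saved to a data set */
proc freq data=sashelp.class;
    tables sex*age / list missing out=work.sex_age_freq;
run;
```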
  • 40. • Proc SQL is the SAS implementation of SQL • Proc SQL is a powerful SAS procedure that combines the functionality of the SAS data step with the SQL language • Proc SQL can sort, subset, merge and summarize data – all at once • Proc SQL can combine standard SQL functions with virtually all SAS functions • Proc SQL can work remotely with RDBMS such as Oracle Introduction - What is PROC SQL
  • 41. PROC SQL – What can it do?
      – Perform a query, using the SELECT statement
      – Save a query result into a SAS data set, using the CREATE TABLE statement
      – Save the query itself, using the CREATE VIEW statement
      – Sort a data set
      – Merge two or more data sets in a number of ways
      – Import a data set from Oracle Clinical to SAS
      – Enter new records into a SAS data set
      – Modify or edit a SAS data set
  • 42. PROC SQL – Why
      • The advantages of using SQL
          – Combined functionality
          – Faster for smaller tables
          – SQL code is more portable to non-SAS applications
          – Does not require presorting
          – Does not require common variable names to join on (the join variables need the same type and length)
  • 43. Performing Query – SELECT Statement
      • The SELECT statement is used to perform a query. It does not create any data set.
      • The simplest PROC SQL code needs three statements
      • By default, the query result is printed; use the NOPRINT option to suppress this
      • Begin with PROC SQL and end with QUIT; not RUN;
      • At least one SELECT ... FROM statement is needed
  • 44. PROC SQL; SELECT * FROM VITALS; QUIT; Performing Query – SELECT Statement To select all the variables use ‘*’ after SELECT statement
  • 45. PROC SQL; SELECT Patient, pulse FROM VITALS; QUIT; Performing Query – SELECT Statement To select only particular variable(s) write down the variable names after SELECT statement. Variable names should be separated by commas.
  • 46. PROC SQL; SELECT DISTINCT Patient FROM VITALS; QUIT; Performing Query – SELECT Statement To select only distinct observations and to delete duplicate observations.
  • 47. PROC SQL ; SELECT * FROM Vitals ORDER BY date; QUIT; Ordering/Sorting Query Results • SELECT * means we select all variables from dataset VITALS • Put ORDER BY after FROM. Sorting by Date
  • 48. PROC SQL; SELECT * FROM vitals WHERE Name CONTAINS 'J'; QUIT; Subsetting: - Character searching in WHERE • Always put WHERE after FROM • CONTAINS in WHERE statement only for character variables Print observations with name containing ‘J’.
  • 49. Subsetting - Character searching in WHERE
          PROC SQL;
              SELECT * FROM vitals WHERE Name LIKE '%o%';
          QUIT;
      • LIKE in the WHERE clause works only for character variables
      • Prints observations whose name contains a lowercase 'o' anywhere (% matches any run of characters, including none)
  • 50. Creating New Data
      • With SELECT alone, the results of a query are converted to an output object (printing).
      • Query results can also be stored as data.
      • The CREATE TABLE statement creates a table with the results of a query.
      • The CREATE VIEW statement stores the query itself as a view. Either way, the data identified in the query can be used in later SQL statements or in other SAS steps.
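The SELECT, WHERE, and ORDER BY pieces from the preceding slides can be combined in a single query. This sketch uses sashelp.class rather than the VITALS data set from the slides, since VITALS is not available outside the original presentation.

```sas
proc sql;
    /* Names starting with J, aged 12 or more, oldest first */
    select name, sex, age
        from sashelp.class
        where name like 'J%' and age >= 12
        order by age desc, name;
quit;
```

The clause order matters: WHERE follows FROM, and ORDER BY comes last.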
  • 51. Creating New Data - Create Table
      • The CREATE TABLE ... AS statement creates a new table from an existing table.
          PROC SQL;
              CREATE TABLE bp AS
              SELECT patient, date, pulse
              FROM Vitals
              WHERE temp>98.5;
          QUIT;
      • With SELECT *, all the variables are copied to the new data set:
          PROC SQL;
              CREATE TABLE bp AS
              SELECT *
              FROM Vitals
              WHERE temp>98.5;
          QUIT;
  • 52. Creating New Data - Create Table We can also assign different variable name, Label, Length, and format name PROC SQL; CREATE TABLE bp AS SELECT patient AS Patient LABEL='Subject number' LENGTH =5, date AS Date LABEL='Date of Expt' FORMAT=WORDDATE8., pulse, temp FROM Vitals WHERE temp>98.5; QUIT;
  • 53. Creating New Data - Create View
          PROC SQL;
              CREATE VIEW bp AS
              SELECT patient, date, pulse, temp
              FROM Vitals
              WHERE temp>98.5;
          QUIT;
      • Creating a view is only a first step; no output is produced.
      • When a table is created, the query is executed and the resulting data is stored in a file. When a view is created, the query itself is stored in the file. The data is not accessed at all in the process of creating a view.
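To show that a view behaves like a data set in later steps, here is a minimal sketch assuming sashelp.class; the view name work.tall and the 60-inch cutoff are arbitrary choices for the example.

```sas
/* Store the query itself; no data is read at this point */
proc sql;
    create view work.tall as
        select name, sex, height
        from sashelp.class
        where height > 60;
quit;

/* The query runs only now, when the view is used */
proc print data=work.tall noobs;
run;
```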
  • 54. Case Logic - reassigning/recategorizing
      • The order of the clauses is important
      • CASE ... END AS must come between SELECT and FROM
      • Use WHEN ... THEN ... ELSE to redefine variables
      • The new variable GENDER is created from the source variable PATIENT
          PROC SQL;
              CREATE TABLE BP AS
              SELECT Patient, Pulse,
                     CASE Patient
                         WHEN 101 THEN 'Male'
                         WHEN 102 THEN 'Female'
                         WHEN 103 THEN 'Female'
                         ELSE 'Male'
                     END AS Gender
              FROM Vitals;
          QUIT;
  • 55. Combining Datasets: Joins
      Join type     DATA step equivalent
      Full Join     If a or b;
      Inner Join    If a and b;
      Left Join     If a;
      Right Join    If b;
  • 58. Join Tables (Merge datasets) - Inner Join: Using WHERE
      • No prior sorting is required, which is one advantage over the DATA step MERGE
      • Use a comma (,) to separate the two data sets in FROM
      • Without WHERE, all possible combinations of rows from the two tables are produced, and all columns are included
          PROC SQL;
              CREATE TABLE new AS
              SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp
              FROM dosing, vitals
              WHERE dosing.patient=vitals.patient AND dosing.date=vitals.date;
          QUIT;
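CASE can also test conditions rather than exact values. The sketch below uses this searched form on sashelp.class; the table name work.class_case and the age bands are invented for illustration.

```sas
proc sql;
    create table work.class_case as
        select name, age,
               case
                   when age < 13 then 'Child'
                   when age <= 15 then 'Teen'
                   else 'Older'
               end as age_group
        from sashelp.class;
quit;
```

The WHEN conditions are checked in order, so the first one that is true determines the value.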
  • 59. Join Tables (Merge datasets) - Inner Join
  • 60. Join Tables (Merge datasets) - Left Joins using ON
      • The resulting data set contains every observation from the DOSING data set, plus the matching VITALS values where they exist. VITALS observations with no match in DOSING are excluded.
          PROC SQL;
              CREATE TABLE new1 AS
              SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp
              FROM dosing LEFT JOIN vitals
              ON dosing.patient=vitals.patient AND dosing.date=vitals.date;
          QUIT;
  • 61. Join Tables (Merge datasets) - Left Joins using ON
  • 62. Join Tables (Merge datasets) - Right Joins using ON
      • The resulting data set contains every observation from the VITALS data set, plus the matching DOSING values where they exist. DOSING observations with no match in VITALS are excluded.
          PROC SQL;
              CREATE TABLE new1 AS
              SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp
              FROM dosing RIGHT JOIN vitals
              ON dosing.patient=vitals.patient AND dosing.date=vitals.date;
          QUIT;
  • 63. Join Tables (Merge datasets) - Right Joins using ON
  • 64. Join Tables (Merge datasets) - Full Joins using ON
      • The resulting data set contains all observations that come from at least one of the data sets.
          PROC SQL;
              CREATE TABLE new1 AS
              SELECT dosing.patient, dosing.date, dosing.med, vitals.pulse, vitals.temp
              FROM dosing FULL JOIN vitals
              ON dosing.patient=vitals.patient AND dosing.date=vitals.date;
          QUIT;
  • 65. Join Tables (Merge datasets) - Full Joins using ON
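One detail worth checking in outer joins: a key column taken from only one table is missing on rows that exist only in the other table. A common way around this is COALESCE, sketched below. The tiny work.dosing and work.vitals tables and their values are invented so the example is self-contained; they are not the data from the slides.

```sas
/* Hypothetical inputs: patient 101 only in dosing, 103 only in vitals */
data work.dosing;
    input patient med $;
    datalines;
101 A
102 B
;
run;

data work.vitals;
    input patient pulse;
    datalines;
102 72
103 80
;
run;

proc sql;
    create table work.full_joined as
        /* COALESCE returns the first non-missing key from either table */
        select coalesce(d.patient, v.patient) as patient,
               d.med,
               v.pulse
        from work.dosing as d
             full join work.vitals as v
             on d.patient = v.patient;
quit;
```

The table aliases d and v are optional shorthand; the full table names work equally well.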
  • 66. SQL Functions ♦ PROC SQL supports almost all the functions available to the SAS DATA step that can be used in a proc sql select statement ♦ Common Functions: ◘ COUNT ◘ DISTINCT ◘ MAX ◘ MIN ◘ SUM ◘ AVG ◘ VAR ◘ STD ◘ STDERR ◘ NMISS ◘ RANGE ◘ SUBSTR ◘ LENGTH ◘ UPPER ◘ LOWER ◘ CONCAT ◘ ROUND ◘ MOD
  • 67. PROC SQL functions PROC SQL; SELECT avg(Age) AS mean, std(Age) AS sd, min(Age) AS min, max(Age) AS max, count(Age) AS count, N (Age) AS Count FROM sashelp.class; quit;
  • 68. PROC SQL functions
          PROC SQL;
              SELECT sex,
                     avg(Age) AS mean,
                     std(Age) AS sd,
                     min(Age) AS min,
                     max(Age) AS max,
                     count(Age) AS count,
                     N(Age) AS Count
              FROM sashelp.class
              GROUP BY Sex;
          quit;
  • 69. Editing Data – Deleting rows and Dropping columns
          /* Deleting rows */
          PROC SQL;
              DELETE FROM class WHERE age le 13;
          QUIT;
          /* Dropping variables */
          PROC SQL;
              CREATE TABLE New (DROP=age) AS
              SELECT * FROM Class;
          QUIT;
      • Columns can be dropped either by listing only the wanted columns in SELECT or by using the DROP= option on the created table
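The earlier list of PROC SQL capabilities also mentions entering and modifying records. A hedged sketch of those statements follows, working on a copy of sashelp.class so the sample data stays untouched; the table name work.class_edit and the inserted values are made up.

```sas
/* Work on a copy so the shipped sample data is not modified */
data work.class_edit;
    set sashelp.class;
run;

proc sql;
    /* Enter a new record */
    insert into work.class_edit (name, sex, age, height, weight)
        values ('Zoe', 'F', 12, 58.0, 90.0);

    /* Modify existing records */
    update work.class_edit
        set weight = weight + 1
        where name = 'Zoe';

    /* Delete records */
    delete from work.class_edit
        where age le 11;
quit;
```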
  • 70. Importing data from OC to SAS
  • 71. Importing data from OC to SAS
  • 87. THANK YOU VERY MUCH! Luis Barragán Scavino Jorge Rodríguez Mamani Calle Alcanfores 1255 Miraflores, Lima 18, Perú +51 99 417 6340 luis.barragan@bigdata.pe jorge.rodriguez@bigdata.pe