2. Why Use SAS for IR?
• Data Manipulation
• Strong, reliable statistical package
(regression, psychometrics, create
your own formula)
• Email reports (directly from SAS)
• Create reports in many different
file types (doc, xls, xml, pdf…)
• Macros (variable that can be used
throughout a SAS program)
• Create all THECB and IPEDS reports
• Create your own error check
reports
• Create tables for tools like Tableau
• Write table directly to other data
warehouses
• Record of exactly what you did
• Batch report (no need to even
open SAS)
• Create Maps
• …
3. Workshop Goals
• Make you a SAS Expert in Three Hours
• Learn the basic SAS programming for:
• Importing (In Programming)
• The DATA step (the most important step in SAS)
• Proc Sort
• Exporting Data
• Proc Print
• Proc Freq (the easy reporting procedure)
• Proc Summary
• Proc Tabulate (a little more difficult)
• If we have time:
• Touch of Macros to make life easy
• Creating Text files ready to send to the THECB
• Email reports (if we have time)
• Proc Report (I’ll show it. It’s great but harder to master)
5. Libname Statement
• libname Enrollme "w4aafsssSSITCBM SAS DatabasesCBM 001
Text FilesFinal";
• In English:
• I want to create a directory of files in SAS named Enrollme that is
currently stored here: w4aafsssSSITCBM SAS DatabasesCBM 001
Text FilesFinal. The “;” ends the statement.
• A library name can be at most 8 characters long.
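A minimal sketch of how a library name is used once it has been assigned. The libref `mylib`, the dataset name `class_copy`, and the folder path are hypothetical placeholders, not the workshop's actual locations; `sashelp.class` is a sample dataset that ships with SAS.

```sas
/* Assumption: C:\sasdata is a placeholder folder that already exists. */
libname mylib "C:\sasdata";

/* Two-level name: the dataset is saved permanently in the mylib folder. */
data mylib.class_copy;
    set sashelp.class;
run;

/* Release the libref when finished; the saved file stays on disk. */
libname mylib clear;
```

The next time SAS is opened, rerunning the same libname statement makes `mylib.class_copy` available again.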
12. Proc Import
• Colby, we have no SAS datasets, YET.
Proc Import file="w4aafsssSSITPlanningTAIRSAS DB Training 1.xlsx"
DBMS=Excel out=Students replace;
Run;
I want to import a file located at w4aafsssSSITPlanningTAIRSAS DB
Training 1.xlsx that is this type of file. I want to name it “Students”, and if I
rerun this code I want SAS to replace “Students”.
Please run this command.
All DBMS options:
http://support.sas.com/documentation/cdl/en/acpcref/63184/HTML/default/viewer.htm#a003102096.htm
Tips: In SAS programs, don’t use mapped drives.
Use network addresses especially if you plan to
share code.
13. The log
Note: SAS dataset names in SAS code have two
parts, the library name and the dataset name.
If you do not give a libname (library name), SAS
assumes the dataset goes to a temporary
library called “work”.
Caution: SAS deletes all data sets in “work” once
you close SAS.
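A short sketch of the one-part versus two-part naming described in the note. The dataset name `demo` is hypothetical, and `sashelp.class` is a sample dataset that ships with SAS.

```sas
/* One-part name: SAS stores this as work.demo (temporary). */
data demo;
    set sashelp.class;
run;

/* The same dataset, referred to by its full two-part name. */
proc print data=work.demo;
run;

/* List everything currently stored in the work library (see the log). */
proc datasets library=work;
quit;
```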
15. The Data Step- Creating a Dataset from
Existing data
• Data (Required)
• Length (Optional)
• Set or Merge (Required most times)
• By (optional, but required if you use merge statements)
• Where or IF Statements (optional)
• Keep and Drop Statements (optional)
• Rename Statement (optional)
• Run; (Required)
17. Data Statement
• Data work.StudentChanged;
• I want to create a dataset stored in the “work” library named
“StudentChanged”.
18. Set Statement
• Data work.StudentChanged;
• Set work.students;
• I want to create a dataset stored in the “work” library and named
“StudentChanged”.
• I want you to get the data to create this dataset from the dataset
stored in the library named “work” and named “students”.
19. Create a field with the same value for all
records.
• Data work.StudentChanged;
• Set work.students;
• Studentmarker=1;
• I want to create a dataset stored in the “work” library and named
“StudentChanged”.
• I want you to get the data to create this dataset from the dataset
stored in the library named “work” and named “students”.
• I want to add a field named Studentmarker where every record has the
value of 1.
Note: SAS variables exist in two forms: numeric or character
(string). Quotation marks (" " or ' ') must be used when creating or using
character values.
20. Create a field with the same value for all
records.
• Data work.StudentChanged;
• Set work.students;
• Studentmarker=1;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentChanged”.
• I want you to get the data to create this dataset from the dataset stored in the
library named “work” and named “students”.
• I want to add a field named Studentmarker where every record has the value of 1.
• I want you to do this.
Tip: Notice the semicolon at the end of each
statement. This is how SAS knows the statement
has ended. YOU WILL GET AN ERROR WITHOUT
THEM.
21. Create a Dataset with only some students
(records) based on selection criteria.
• I want a dataset of male students with a GPA equal to or greater than
3.5.
22. Data Statement
• Data StudentHiGPAMale;
• I want to create a dataset stored in the “work” library named
“StudentHiGPAMale”.
23. Set Statement
• Data work.StudentHiGPAMale;
• Set work.students;
• I want to create a dataset stored in the “work” library and named
“StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset
stored in the library named “work” and named “students”.
24. Where Statement
• Data work.StudentHiGPAMale;
• Set work.students;
• Where Gender="M" and GPA>=3.5;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students who are male and
have GPA’s greater than or equal to 3.5
• I want you to do this.
25. Where Statement 2
• Data work.StudentHiGPAMale;
• Set work.students;
• Where Gender="M" or GPA>=3.5;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students who are male or
have GPA’s greater than or equal to 3.5
• I want you to do this.
SAS Operators (And/Or):
AND: you only want records that meet both (or
all) conditions.
OR: you want records that meet either of the
conditions.
26. Where Statement 3
• Data work.StudentHiGPAMale;
• Set work.students;
• Where (Gender="M" or GPA>=3.5) and School="SOD";
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentHiGPAMale”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students who are male or
have GPA’s greater than or equal to 3.5, but only if they are in School SOD.
• I want you to do this.
27. Where Statement 4
• Data work.StudentLisaJoe;
• Set work.students;
• Where First_name in ("lisa" "Joe");
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentLisaJoe”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students whose names are
“lisa” or “Joe”.
• I want you to do this.
28. IF Statement 1
• Data work.StudentLisaJoe;
• Set work.students;
• if First_name in ("lisa" "Joe") then output; *use delete instead of output to drop these records;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentLisaJoe”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students whose names are
“lisa” or “Joe”.
• I want you to do this.
29. Where and IF Statement warning
• Data work.StudentLisaJoe;
• Set work.students;
• if First_name in ("Lisa" "Joe");
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentLisaJoe”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I don’t want all of “work.students”. I only want students whose names are
“Lisa” or “Joe”.
• I want you to do this.
Warning: WHERE and IF comparisons of character values are case sensitive (“lisa” does not match “Lisa”).
30. Use functions to make records similar
• data students1;
• set students;
• First_name1=propcase(First_name);
• run;
• Data work.StudentLisaJoe;
• Set work.students1;
• if First_name1 in ("Lisa" "Joe");
• Run;
• List of SAS Functions
• http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000245860.htm
32. Keep Statement
• Data Studentonlynames;
• Set students;
• Keep First_name Last_name;
• Run;
• I want to create a dataset stored in the “work” library and named
“Studentonlynames”.
• I want you to get the data to create this dataset from the dataset stored in the
library named “work” and named “students”.
• I want only the variables named First_name and Last_name to be dataset
“Studentonlynames”.
• I want you to do this.
Remember: if no libname is given, SAS
assumes you mean the “work”
library.
Keep will only “keep” the variables
named in the statement.
33. Drop Statement
• Data Studentwonames;
• Set students;
• drop First_name Last_name;
• Run;
• I want to create a dataset stored in the “work” library and named
“Studentwonames”.
• I want you to get the data to create this dataset from the dataset stored in the
library named “work” and named “students”.
• I want all the variables in dataset “Studentwonames” except First_name and
Last_name.
• I want you to do this.
Remember: if no libname is given, SAS
assumes you mean the “work”
library.
The Drop statement removes only the
variables named after it.
35. Rename variables in a Dataset
• Data StudentsCopyStudent_Loan;
• Set students;
• Rename Student_loan=Student_Loan2
School=School2;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentsCopyStudent_Loan”.
• I want you to get the data to create this dataset from the dataset stored in the
library named “work” and named “students”.
• I want the variable Student_loan to now be named “Student_Loan2” and School to
be named “School2”.
• I want you to do this.
36. Copy variables in a Dataset
• Data StudentsCopyStudent_Loan;
• Set students;
• Student_loan2=Student_Loan;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentsCopyStudent_Loan”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I want the variable Student_loan2 to be created by copying “Student_Loan”.
• I want you to do this.
37. Creating new variables from information
already in your dataset
• Start with numeric values
38. Simple equations with Numeric variables
• Data StudentsStuloansbyhalf;
• Set students;
• Student_loanbyhalf=Student_loan*0.5;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentsStuloansbyhalf”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I want a new variable created called Student_loanbyhalf which will be
equal to Student_loan * 0.5 .
• I want you to do this.
39. Simple equations with Numeric variables
• Data StudentsStuloanstimesyear;
• Set students;
• Student_loantimesyear=Student_loan*CalYear;
• Run;
• I want to create a dataset stored in the “work” library and named “StudentsStuloanstimesyear”.
• I want you to get the data to create this dataset from the dataset stored in the library named
“work” and named “students”.
• I want a new variable created called Student_loantimesyear which will be equal to Student_loan *
CalYear .
• I want you to do this.
• All SAS Operators
• http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000780367.htm
Warning: A missing value (.) is treated as smaller than any number (negative infinity) in comparisons, not as zero.
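A minimal sketch of why the warning about missing values matters. The dataset names, variable names, and the 10000 cutoff are hypothetical and chosen only for illustration.

```sas
/* Small made-up dataset; student 2 has a missing loan amount. */
data loancheck;
    input stud_id student_loan;
    datalines;
1 25000
2 .
3 5000
;
run;

data loanflags;
    set loancheck;
    /* A comparison returns 1 (true) or 0 (false).                    */
    /* Missing counts as less than 10000, so student 2 is flagged.    */
    naive_low = (student_loan < 10000);
    /* Checking for missing first keeps student 2 out of the group.   */
    safe_low  = (not missing(student_loan) and student_loan < 10000);
run;

proc print data=loanflags noobs;
run;
```

For student 2, `naive_low` is 1 while `safe_low` is 0. In arithmetic the behavior is different: a calculation that uses a missing value returns a missing result.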
40. Create new numeric variables with Functions
Round function
• Data StudentGPARound;
• Set students;
• GPAround=round(GPA,.01);
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentGPARound”.
• I want you to get the data to create this dataset from the dataset
stored in the library named “work” and named “students”.
• I want to create a new variable named “GPAround” by rounding
“GPA” to the hundredth place
42. Substr, compress, upcase Functions
• Data StudentIntials;
• Set students;
• Initials=compress(upcase(substr(first_name, 1,1)) || "." ||
upcase(substr(last_name, 1,1)));
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentIntials”.
• I want you to get the data to create this dataset from the dataset stored in
the library named “work” and named “students”.
• I want to create a variable named “Initials” where I take the first letter in
upper case of the first name then add a period then take the first letter in
upper case of the last name and then remove all spaces.
43. Concept Check
• Data Fred;
• Set red.color;
• Run;
• Data Students2;
• Set students;
• Where semester=“Fall”
• Run;
• Data time.student2;
• Set student;
• Keep gpa semester last_name;
• Run;
• Data time.student2;
• Set student;
• Year2=year*100;
• Run;
44. Concept Check
• Create a dataset named “grow”
using a SAS database named
“flowers”.
• Create a dataset named “sun”
using the dataset named “star”.
Copy the variable named “mass”
into a variable named
“mass2”.
• Create a dataset named “highSAT”
using a SAS database named
“satscores”. Only include records
if the variable SAT is greater than
1400.
45. Logic statements to create a new variable
• data studentmarkloan;
• length highloans $30; *set the length first so the longer value is not truncated;
• set students;
• If student_loan>20000 then highloans="High Student Loans";
• else highloans="Not High Student Loans";
• run;
• I want to create a dataset stored in the “work” library and named “studentmarkloan”.
• I want you to get the data to create this dataset from the dataset stored in the library named
“work” and named “students”.
• If student_loan is greater than 20000 then I want highloans to equal “High Student Loans”; if not, I
want it to equal “Not High Student Loans”.
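The same if/else logic extends to more than two categories by chaining else-if statements. This is a sketch on made-up data; the dataset names, the variable `loanlevel`, and the cutoffs are hypothetical.

```sas
/* Small made-up dataset so the example runs on its own. */
data loans;
    input stud_id student_loan;
    datalines;
1 25000
2 8000
3 1500
4 .
;
run;

data loanlevels;
    length loanlevel $10;   /* long enough for the longest label */
    set loans;
    /* Conditions are checked in order; the first true one wins. */
    if missing(student_loan) then loanlevel = "Unknown";
    else if student_loan > 20000 then loanlevel = "High";
    else if student_loan > 5000 then loanlevel = "Medium";
    else loanlevel = "Low";
run;

proc print data=loanlevels noobs;
run;
```

Checking for missing values first keeps records without a loan amount out of the “Low” group.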
46. Combining two datasets (SET)
• Data StudentDouble;
• Set studentmarkloan Students;
• Run;
• I want to create a dataset stored in the “work” library and named
“StudentDouble”.
• I want you to get the data to create this dataset by stacking
datasets studentmarkloan and Students (all records from the first, then all records from the second).
47. Proc sort
• Proc Sort data= StudentDouble;
• By Stud_id;
• Run;
• I want to sort in ascending order all dataset records using the variable
“stud_id” .
• Proc Sort data= StudentDouble;
• By descending Stud_id ;
• Run;
• I want to sort in descending order all dataset records using the variable
“stud_id” .
48. Creating multiple datasets in one DATA step
• Data first second;
• Set StudentDouble;
• By stud_id;
• If first.stud_id then output first; *Use any kind of condition;
• Else output second;
• Run;
• I want to create two datasets stored in the “work” library and named
“first” and “second”.
• I want to rely on the fact that “studentdouble” is sorted by “stud_id”.
• When you find the first record for a stud_id, store it in the dataset “first”. If it is
not the first record for that stud_id, store it in the dataset “second”.
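To see what first. (and its companion last.) contain, they can be copied into ordinary variables and printed. This is a sketch on made-up data; the dataset and variable names other than stud_id are hypothetical.

```sas
/* Small made-up dataset with repeated student IDs. */
data visits;
    input stud_id term $;
    datalines;
101 Fall
101 Spring
102 Fall
103 Fall
103 Spring
103 Summer
;
run;

/* BY-group processing requires the data to be sorted by the BY variable. */
proc sort data=visits;
    by stud_id;
run;

data flagged;
    set visits;
    by stud_id;
    is_first = first.stud_id;   /* 1 on the first record of each stud_id */
    is_last  = last.stud_id;    /* 1 on the last record of each stud_id  */
run;

proc print data=flagged noobs;
run;
```

Student 102 has only one record, so both flags are 1 on that row.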
49. Combining two datasets (Merge one to one)
• Data CoursegradeMerge;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If a then output;
• Run;
50. Combining two datasets (Merge one to one)
• Data CoursegradeMerge;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If b then output;
• Run;
51. Combining two datasets (Merging)
• Data CoursegradeMerge;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If a then output;
• Run;
52. Different Ways to Merge
• Data CoursegradeMerge2;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If b then output;
• Run;
• Data CoursegradeMerge3;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If a and b then output;
• Run;
• Data CoursegradeMerge4;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If a and not b then output;
• Run;
• Data CoursegradeMerge5;
• Merge students (in=a) Coursegrade (in=b);
• By stud_id;
• If B and not a then output;
• Run;
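A sketch of what the in= flags hold during a merge, using made-up data. The dataset and variable names other than stud_id are hypothetical. Both datasets are sorted by stud_id before the merge, which a merge with a BY statement requires.

```sas
data roster;
    input stud_id name $;
    datalines;
3 Cruz
1 Adams
2 Baker
;
run;

data grades;
    input stud_id grade $;
    datalines;
2 A
4 C
1 B
;
run;

/* Sort both datasets by the BY variable before merging. */
proc sort data=roster; by stud_id; run;
proc sort data=grades; by stud_id; run;

data merged;
    merge roster (in=a) grades (in=b);
    by stud_id;
    /* in= variables are temporary, so copy them to keep them. */
    in_roster = a;
    in_grades = b;
run;

proc print data=merged noobs;
run;
```

Students 1 and 2 appear in both datasets, student 3 only in roster, and student 4 only in grades. Conditions such as "if a and b" or "if a and not b" select among those rows.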
53. Proc Print
• Proc Print data=students;
• Var _all_;
• Run;
• I want to create output to the SAS viewer using data from a dataset
named “students”.
• In the output, I want you to include all variables and records in the
dataset.
54. More Proc print
• Proc Print data=students;
• Var First_name Last_name GPA;
• Run;
• I want to create output to the SAS viewer
using data from a dataset named “students”.
• In the output, I want you to include
First_name, Last_name, and GPA variables in
that order.
• Proc Print data=students noobs label;
• where gpa>3.8;
• label first_name="First Name";
• Var First_name Last_name GPA;
• Run;
• I want to create output to the SAS viewer
using data from a dataset named “students”.
• Please suppress the observation number and let
me relabel the field names.
• I only want records with a GPA greater than 3.8
to be in this output.
• Please relabel the variable first_name
as First Name.
• In the output, I want you to include
First_name, Last_name, and GPA variables in
that order.
55. Proc Freq
• Proc freq data=students;
• Table Last_name;
• Run;
• I want to create output in the form
of a frequency table to the SAS
viewer using data from a dataset
named “students”.
• Please make a frequency table of
the variable “last_name”
• Proc freq data=students;
• Table Last_name*semester
/NOPERCENT NOCOL;
• Run;
• I want to create output in the form
of a frequency table to the SAS
viewer using data from a dataset
named “students”.
• Please make a crosstab table of the
variables “last_name” and
“semester”. In the output, please
remove the overall percentages
and column percentages.
http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_sect010.htm
56. Proc Tabulate
• Proc tabulate data=students;
• Class semester;
• Var gpa;
• Table semester, gpa*mean="Average";
• Run;
• I want to create output in the form of a
summary table to the SAS viewer using
data from a dataset named “students”.
• I want to include the categorical variable
named “Semester” and the numeric
variable “GPA”.
• Please build a table with semesters in
the rows and the mean GPA for each
semester.
• Proc tabulate data=students;
• Class semester ethnic;
• Table semester, ethnic*(n="Count");
• Run;
• I want to create output in the form of a
frequency table to the SAS viewer using
data from a dataset named “students”.
• I want to include the categorical variables
name “Semester” and “ethnic”.
• Please build a table with semesters in
the rows and ethnicity in the columns and
give me the frequency for each cell.
Please add the label “Count” for the
frequencies.
57. More Proc Tabulate
• Proc tabulate data=students;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• I want to create output in the form of a
frequency table to the SAS viewer using data
from a dataset named “students”.
• Please relabel “ethnic” as “Ethnicity” in the
table.
• I want to include the categorical variables
name “Semester” and “ethnic”.
• Please build a table with semesters in the
rows and ethnicity in the columns and give me
the frequency for each cell. Please remove
the label for n on the columns.
• Proc tabulate data=students;
• where gpa>3.8;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester all="Total", ethnic*(n=" ") all="Total";
• Run;
• I want to create output in the form of a frequency table
to the SAS viewer using data from a dataset named
“students”.
• Please only include records with a GPA greater than a 3.8.
• Please relabel “ethnic” as “Ethnicity” in the table.
• I want to include the categorical variables name
“Semester” and “ethnic”.
• Please build a table with semesters in the rows and ethnicity
in the columns and give me the frequency for each cell.
Include row and column totals. Please remove the label
for n on the columns.
58. More Proc Tabulate
• Proc tabulate data=students;
• label ethnic="Ethnicity";
• Class semester ethnic gender;
• Table semester*ethnic all="Total", gender*(n=" ") all="Total";
• Run;
• I want to create output in the form of a frequency table to the SAS viewer using
data from a dataset named “students”.
• Please relabel “ethnic” as “Ethnicity” in the table.
• I want to include the categorical variables named “Semester”, “ethnic”, and “gender”.
• Please build a table with semesters in the rows crossed by ethnic and gender in the
columns, and give me the frequency for each cell. Include row and column totals.
Please remove the label for n on the columns.
59. Titles and Footnotes
Titles
• Title1 "Counts of all Students
Last Names";
• Title2 "AY 2014-2015";
• Proc freq data=students;
• Table Last_name;
• Run;
Footnotes
• Title1 "Counts of all Students Last Names";
• Title2 "AY 2014-2015";
• Footnote1 "Source: Certified Enrollment Records";
• Footnote2 "Office of Institutional Research";
• Proc freq data=students notitle;
• Table Last_name;
• Run;
61. Proc Summary
• Proc Summary data=Students nway;
• class Gender school;
• var GPA;
• output out=SchoolgenGPA2
mean=gpa;
• Run;
• Proc Summary data=Students nway;
• class Gender school;
• output out=SchoolgenGPA3 ;
• run;
• Proc Summary data=Students nway;
• Where ethnic="White";
• class Gender school;
• var gpa;
• output out=SchoolgenGPA4 max=gpa ;
• run;
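A sketch of what the OUT= dataset from Proc Summary looks like, using the `sashelp.class` sample dataset that ships with SAS. The output dataset name and the statistic names `avg_height` and `n_height` are hypothetical.

```sas
proc summary data=sashelp.class nway;
    class sex;
    var height;
    /* Name each statistic explicitly in the output dataset. */
    output out=height_by_sex mean=avg_height n=n_height;
run;

/* The output has one row per value of sex, plus the automatic  */
/* variables _TYPE_ and _FREQ_ (the number of records per row). */
proc print data=height_by_sex noobs;
run;
```

With nway, only the rows for the full combination of class variables are kept. Without it, the output also includes an overall row and, when there are several class variables, a row for each partial combination.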
62. Proc Format
• Proc format;
• value $gendern "M"="Male"
• "F"="Female";
• Value GPAcat low-3.5="Less than or equal to 3.5"
• 3.5<-high="Greater than 3.5";
• run;
• Title1 "Student list with Gender
and GPA Category";
• Title2 "AY 2014-2015";
• Proc Print data=students;
• Var First_name Last_name
gender GPA;
• format gender $gendern. GPA
GPAcat.;
• Run;
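A format can also be used to create a new grouped variable with the put function, which is handy when the categories are needed in later steps. This sketch uses the `sashelp.class` sample dataset that ships with SAS; the format name `agegrp`, the cutoff of 12, and the variable `age_group` are hypothetical.

```sas
proc format;
    /* 12<-high means greater than 12, so no value falls between ranges. */
    value agegrp low-12   = "12 or younger"
                 12<-high = "Older than 12";
run;

data class_grouped;
    set sashelp.class;
    length age_group $20;
    /* put() applies the format and returns the label as text. */
    age_group = put(age, agegrp.);
run;

proc freq data=class_grouped;
    table age_group;
run;
```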
63. Output Delivery System (PDF)
• options orientation=landscape;
• OdS pdf file="w4aafsssSSITPlanningTAIRreport1.pdf";
• Proc tabulate data=students;
• where gpa>3.8;
• label ethnic="Ethnicity";
• Class semester ethnic;
• Table semester, ethnic*(n=" ");
• Run;
• ods pdf close;
68. Creating a Text File
• data _NULL_;
• put n z5.;
• if 0 then set students nobs=n;
• call symputx('numofrecords',n);
• stop;
• run;
• DATA _null_;
• SET students nobs=j;
• FILE "w4aafsssSSITPlanningTAIRtest1.txt" ;
• if _n_=1 then put "HY2K000040CBM001012016C0150Sharon Carpenter datarequest@uthscsa.edu";
• put
• @1 Gender
• @2 School
• @6 First_name
• @14 Last_name ;
• If _n_=j then put "EOF100&numofrecords";
• RUN;
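If the record count in the last line of the file has to be zero-padded to a fixed width, one option is to apply the z5. format inside call symputx. This is a sketch under that assumption; the required layout of the trailer record is not stated here. It uses the `sashelp.class` sample dataset that ships with SAS.

```sas
data _null_;
    /* "if 0" is never true, so no records are read;         */
    /* nobs= still fills n with the number of records.       */
    if 0 then set sashelp.class nobs=n;
    /* put(n, z5.) turns the count into zero-padded text.    */
    call symputx('numofrecords', put(n, z5.));
    stop;
run;

/* Write the macro variable to the log to check its value. */
%put Record count: &numofrecords;
```

Because the macro variable holds text, the leading zeros are kept wherever &numofrecords is used inside double quotes.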