spss Help

6/18/13 Help
Help
Contents
1. Core System
1.1. Overview
1.1.1. What's new in version 21?
1.1.5. What's new in version 17.0?
1.1.8. What's new in version 14.0.1
1.1.9.1. Version 14.0 compatibility with previous releases
1.1.11. What's new in version 11.0
1.1.18. Windows
1.1.18.1. Designated window versus active window
1.1.18.1.1. Changing the designated window
1.1.19. Status Bar
1.1.20. Dialog boxes
1.1.21. Variable names and variable labels in dialog box lists
1.1.22. Resizing dialog boxes
1.1.23. Dialog box controls
1.1.24. Selecting variables
1.1.25. Data type, measurement level, and variable list icons
1.1.26. Getting information about variables in dialog boxes
1.1.27. Command line options
1.1.28. Basic steps in data analysis
1.1.29. Statistics Coach
1.2. Getting Help
1.2.1. Getting Help on Output Terms
1.3. Data files
1.3.1. Opening data files
1.3.1.1. To open data files
1.3.1.2. Data file types
1.3.1.3. Opening file options
1.3.1.4. Reading Excel Files
1.3.1.5. Reading Excel 95 or Later Files
1.3.1.5.1. How to Read Excel 95 or Later Files
1.3.1.6. Reading older Excel files and other spreadsheets
1.3.1.7. Reading dBASE files
1.3.1.8. Reading Stata files
1.3.1.9. Reading Database Files
1.3.1.9.1. To Read Database Files
1.3.1.9.2. Selecting a Data Source
1.3.1.9.3. Selecting Data Fields
1.3.1.9.4. Creating a Relationship between Tables

Simulation. Predictive models, such as linear regression, require a set of known inputs to predict
an outcome or target value. In many real world applications, however, values of inputs are
uncertain. Simulation allows you to account for uncertainty in the inputs to predictive models and
evaluate the likelihood of various outcomes in the presence of that uncertainty. See the topic
Simulation for more information.
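The idea of propagating input uncertainty through a predictive model can be sketched in a few lines of Python. This is only an illustration of the general Monte Carlo technique, not the product's Simulation procedure: the function names, the normal distribution chosen for the input, and the model coefficients are all assumptions made for the example.

```python
import random
import statistics


def simulate_outcomes(intercept, slope, input_mean, input_sd,
                      n_draws=10_000, seed=0):
    """Draw the uncertain input from a normal distribution (an assumption)
    and push each draw through a fixed linear model y = intercept + slope * x."""
    rng = random.Random(seed)
    return [intercept + slope * rng.gauss(input_mean, input_sd)
            for _ in range(n_draws)]


def probability_above(outcomes, threshold):
    """Share of simulated outcomes that exceed the threshold."""
    return sum(1 for y in outcomes if y > threshold) / len(outcomes)


# Hypothetical model and input distribution, for illustration only.
outcomes = simulate_outcomes(intercept=10.0, slope=2.0,
                             input_mean=5.0, input_sd=1.0)
print("mean simulated outcome:", round(statistics.mean(outcomes), 2))
print("estimated P(outcome > 22):", probability_above(outcomes, 22.0))
```

The second printed value estimates the likelihood of one outcome of interest given the uncertainty in the input, which is the kind of question the feature description refers to.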
One-click descriptive statistics. Select variables in the Data Editor and get summary descriptive
statistics (for example, mean, median, frequency counts). Appropriate statistics are automatically
determined based on measurement level. See the topic Obtaining Descriptive Statistics for Selected
Variables for more information.
Read Cognos Business Intelligence data. If you have access to an IBM® Cognos® Business
Intelligence server, you can read data packages and list reports into IBM® SPSS® Statistics. See
the topic Reading Cognos data for more information.
Merge data files without pre-sorting. Merge data files by values of key variables without pre-
sorting the files based on key values. You can also merge data files based on string keys of
different defined lengths in each file and merge a case data file with multiple table-lookup files with
different keys in each table-lookup file. See the topic STAR JOIN for more information.
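One common way to merge by key without sorting either file is to index the table-lookup file by key and look up each case in turn. The Python sketch below shows that general approach; it is not a description of how STAR JOIN is implemented, and the function and field names are hypothetical.

```python
def merge_by_key(cases, lookup_table, key):
    """Attach lookup-table fields to each case by key value.

    Neither input needs to be sorted, because the lookup table is
    indexed in a dictionary before the cases are processed.
    """
    index = {row[key]: row for row in lookup_table}
    merged = []
    for case in cases:
        match = index.get(case[key], {})  # unmatched cases keep only their own fields
        merged.append({**match, **case})  # fields from the case file win on conflict
    return merged


# Hypothetical unsorted inputs.
cases = [{"id": 3, "score": 7}, {"id": 1, "score": 9}, {"id": 8, "score": 2}]
regions = [{"id": 1, "region": "North"}, {"id": 3, "region": "South"}]
for row in merge_by_key(cases, regions, "id"):
    print(row)
```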
Compare datasets. Compare the data values and metadata attributes (dictionary information) of
two datasets. See the topic Comparing datasets for more information.
Password protect and encrypt data and output files. See the topic Encrypting data files and
output documents for more information.
Pivot table editing enhancements. After creating pivot tables, you can now:
• Toggle the display of names, values, and labels. See the topic Controlling display of variable and
value labels for more information.
• Sort table rows. See the topic Sorting rows for more information.
• Insert rows and columns. See the topic Inserting rows and columns for more information.
• Change the output language. See the topic Changing the output language for more information.
Export output in Excel 2007 and higher format. See the topic Export output for more
information.
Preserve table styles when exporting output to HTML. All pivot table style information (for
example, font styles, background colors) and column widths can now be preserved. See the topic
HTML options for more information.
Unicode default. SPSS Statistics now runs in Unicode mode by default instead of code page mode.
See the topic Unicode mode for more information.
© Copyright IBM Corporation 1989, 2012.
Maps. The Graphboard Template Chooser now includes templates for creating different types of
map visualizations, such as choropleth maps (color maps), maps with mini-charts, and overlay
maps. IBM® SPSS® Statistics ships with several map files, but you can use the Map Conversion
Utility to convert your existing map shapefiles for use with the Graphboard Template Chooser. See
the topic Using the Map Conversion Utility for more information.
Faster rendering of pivot tables. Pivot tables now render much faster than in previous versions,
while retaining full support for pivoting and editing. If you used fast rendering of lightweight tables
in version 19, you will find comparable results for pivot tables in version 20 and higher, without the
limitations of lightweight tables. Users who require compatibility with versions prior to 20 can
choose to generate legacy tables (referred to as full-featured tables in version 19). See the topic
Pivot table options for more information.
Background, disconnected execution for production jobs. Production jobs can be run in a
separate background session on a remote server. You can submit the jobs from your local
computer, disconnect from the remote server, reconnect later and retrieve your results. You don't
need to keep SPSS Statistics running on your local computer. You don't even need to keep your
local computer turned on. Progress of remote jobs can be monitored and results retrieved from the
new Background Job Status tab of the production facility dialog. See the topic Production jobs for
more information.
Ordinal Targets for Generalized linear mixed models. The Generalized linear mixed models
procedure now uses the information in the ordering of categories of targets with the ordinal
measurement level. Ordinal targets are modeled with an ordinal multinomial distribution, and the
target is linearly related to the factors and covariates via one of a number of cumulative link
functions. This feature is available in the Advanced Statistics add-on option.
Linear models. Linear models predict a continuous target based on linear relationships between
the target and one or more predictors. Linear models are relatively simple and give an easily
interpreted mathematical formula for scoring. The properties of these models are well understood,
and the models can typically be built very quickly compared to other model types (such as neural
networks or decision trees) on the same dataset. This feature is available in the Statistics Base add-on module.
See the topic Linear models for more information.
Generalized linear mixed models. Generalized linear mixed models extend the linear model so
that: the target is linearly related to the factors and covariates via a specified link function; the
target can have a non-normal distribution; and the observations can be correlated. Generalized
linear mixed models cover a wide variety of models, from simple linear regression to complex
multilevel models for non-normal longitudinal data. This feature is available in the Advanced
Statistics add-on module. See the topic Generalized linear mixed models for more information.
Lightweight tables. Lightweight tables can be rendered much faster than full-featured pivot
tables. Although they lack the editing features of pivot tables, they can easily be converted to
pivot tables with all editing features enabled. See the topic Pivot table options for more
information.
Scoring wizard. The new scoring wizard makes it easy to apply predictive models to score your
data, and scoring no longer requires IBM® SPSS® Statistics Server. See the topic Scoring data
with predictive models for more information.
Improved default measurement level. For data read from external sources and new variables
created in a session, the method for determining default measurement level has been improved to
evaluate more conditions than just the number of unique values. Since measurement level affects
the results of many procedures, correct measurement level assignment is often important. See the
topic Data Options for more information.

"Smart" output. The procedures in the Direct Marketing add-on module now provide "smart"
output: simple, non-technical explanations that help you evaluate your results.
Syntax editor enhancements. You can now split the editor pane into two panes arranged with
one above the other. You can indent or outdent blocks of syntax or automatically indent selections
with a format similar to pasted syntax. A new toolbar button allows you to uncomment text that
was previously commented out, and a new option setting allows you to paste syntax at the position
of the cursor. You can now also navigate to the next or previous syntactical error (such as an
unmatched quote), making it easier to locate these errors before running the syntax. See the topic
Using the Syntax Editor for more information.
Database drivers for salesforce.com. Database drivers for salesforce.com allow analysts to
access data in salesforce.com just as they would access data in a SQL database. Analysts can now
connect to salesforce.com, extract relevant data, and perform analysis.
Compiled transformations. When you use compiled transformations, transformation commands
(such as COMPUTE and RECODE) are compiled to machine code at run time to improve the
performance of these transformations for datasets with a large number of cases. This feature
requires SPSS Statistics Server.
Statistics portal. Statistics portal is a Web-based interface for IBM® SPSS® Collaboration and
Deployment Services users that allows them to analyze their data with the power of the SPSS
Statistics engine. They run analyses from custom user interfaces authored in SPSS Statistics (with
the Custom Dialog Builder) and stored in their IBM SPSS Collaboration and Deployment Services
Repository. Enhancements relevant to authors of custom user interfaces for Statistics portal
include: honoring a filter, specified for the active dataset, between successive analyses; hiding
small counts in tables generated by CROSSTABS, OLAP CUBES, and CTABLES; and displaying a set of
row and column dimensions as table layers in the CROSSTABS crosstabulation table.
Automated data preparation. Automated Data Preparation (ADP) handles the task of preparing
data for analysis, analyzing your data and identifying fixes, screening out fields (variables) that are
problematic or not likely to be useful, deriving new attributes when appropriate, and improving
performance through intelligent screening techniques. You can use the algorithm in fully automatic
fashion, allowing it to choose and apply fixes, or you can use it in interactive fashion, previewing
the changes before they are made and accepting or rejecting them as desired. Automated Data
Preparation is available in the Data Preparation add-on option. See the topic Automated Data
Preparation for more information.
Bootstrapping. Bootstrapping is a robust method for determining the properties of population
estimators (like the mean, median, percentiles, and correlation and regression coefficients) when
parametric assumptions do not hold, or when inferences based on parametric assumptions are
difficult to compute. Bootstrapping is available in the new Bootstrapping add-on option. See the
topic Introduction to Bootstrapping for more information.
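As a rough illustration of the technique, the Python sketch below computes a percentile bootstrap confidence interval for a statistic such as the median. The percentile method, the number of resamples, and the function name are assumptions chosen for the example; the text above does not say which interval types the Bootstrapping option produces.

```python
import random
import statistics


def bootstrap_ci(sample, statistic=statistics.median,
                 n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval: resample with replacement,
    recompute the statistic each time, and read off the tail percentiles."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = sorted(statistic(rng.choices(sample, k=n))
                       for _ in range(n_resamples))
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper


# Hypothetical data.
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15]
print("95% bootstrap interval for the median:", bootstrap_ci(sample))
```

No distributional assumption about the data is needed here, which is why the approach is useful when parametric assumptions do not hold.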
New nonparametric tests. Nonparametric tests make minimal assumptions about the underlying
distribution of the data. The new nonparametric tests provide a new user interface and Model
Viewer output, and include all of the tests available in the legacy nonparametric tests, including:
one-sample Wilcoxon signed-rank test, one-sample confidence intervals for the binomial distribution,
the related-samples marginal homogeneity test, and the Hodges-Lehmann confidence interval for the
median of the difference in paired-samples and the difference in medians of two independent
samples. Pairwise and stepwise step-down multiple comparisons are also available for all k
independent samples and k related samples tests. The Jonckheere-Terpstra test is available without
requiring the Exact Tests add-on option. The new nonparametric tests are available in the
Statistics Base add-on option. See the topic Nonparametric Tests for more information.
Programmability enhancements. The R Integration Plug-in now supports R debugging features.
Additionally, you can create pivot tables from R with multiple row and column dimensions and you
can nest multiple pivot tables under a common outline heading. R extension commands can be
implemented directly from R source code files, bypassing the need to distribute them as R packages.
Also, you can bundle together all components of a custom R or Python procedure, allowing end
users to easily install the procedure without manually copying files. Complete documentation for the
Python and R Integration Plug-ins is now integrated with the Help system.
Direct marketing tools. The new Direct Marketing add-on option provides a set of tools designed
to improve the results of direct marketing campaigns by identifying demographic, purchasing, and
other characteristics that define various groups of consumers and targeting specific groups to
maximize positive response rates. See the topic Direct Marketing for more information.
Custom Tables enhancements. The Custom Tables add-on option now offers computed
categories and significance results integrated into the same table as the values being tested. For
more information on computed categories, see Computed Categories. For more information on
significance tests in custom tables, see Custom Tables: Test Statistics Tab.
Improved SAS data file support. You can now write data files in SAS 9 format. See the topic
Saving data: Data file types for more information.
Improved Custom Dialog Builder. The Custom Dialog Builder now has a list box control that
supports single or multiple selection. Also, list items for combo box and list box controls can now be
dynamically populated with values associated with the variables in a specified target list. In
addition, radio buttons can now contain a set of nested controls. See the topic Creating and
Managing Custom Dialogs for more information.
Improved display of large pivot tables. New display options are now available that make it easier
to view and navigate large pivot tables (tables with hundreds or thousands of rows). See the topic
Set rows to display for more information.
Improved Twostep Cluster output. The Twostep Cluster procedure now provides interactive
model viewer output. Twostep Cluster is available in the Statistics Base option. See the topic The
Cluster Viewer for more information.
Additional rule-checking on quality control charts. Rule-checking is now performed on several
additional control charts. When rule-checking is requested for an X-bar chart, it will also be
performed on the accompanying R (range) or s (standard deviation) chart. Similarly, when rule-
checking is requested for an Individuals (Runs) chart, it will also be performed on the accompanying
Moving Range chart. Quality control charts are available in the Statistics Base option.
New syntax editor. The syntax editor has been completely redesigned with features such as auto-
completion, color coding, bookmarks, and breakpoints. Auto-completion provides you with a list of
valid command names, subcommands, and keywords; so you’ll spend less time referring to syntax
charts. Color coding allows you to quickly spot unrecognized terms as well as some common
syntactical errors. Bookmarks allow you to quickly navigate large command syntax files. Breakpoints
allow you to stop execution at specified points so you can inspect data or output before
proceeding. See the topic Using the Syntax Editor for more information.
Custom Dialog Builder. The Custom Dialog Builder allows you to create and manage custom
dialogs for generating command syntax. You can create custom dialogs to generate syntax from
multiple commands, including custom extension commands implemented in Python or R. See the
topic Creating and Managing Custom Dialogs for more information.
Multiple language support. In addition to the ability to change the output language available in
previous releases, you can now change the user interface language. See the topic General options
for more information.
Codebook. The Codebook procedure reports the dictionary information -- such as variable names,
variable labels, value labels, missing values -- and summary statistics for all or specified variables
and multiple response sets in the active dataset. For nominal and ordinal variables and multiple
response sets, summary statistics include counts and percents. For scale variables, summary
statistics include mean, standard deviation, and quartiles. See the topic Codebook for more
information.
Nearest Neighbor analysis. Nearest Neighbor analysis is a method for classifying cases based on
their similarity to other cases. In machine learning, it was developed as a way to recognize patterns
of data without requiring an exact match to any stored patterns, or cases. Similar cases are near
each other and dissimilar cases are distant from each other. Thus, the distance between two cases
is a measure of their dissimilarity. See the topic Nearest Neighbor Analysis for more information.
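The distance-based idea described above can be sketched as a small k-nearest-neighbor classifier in Python. Euclidean distance, majority voting, and k = 3 are assumptions made for this illustration; they are not taken from the procedure's documentation.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among the k closest training cases.

    `train` is a list of (features, label) pairs; distance is Euclidean,
    so a smaller distance means the two cases are more similar.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training cases with two features each.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]
print(knn_classify(train, (1.1, 1.0)))
```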
Multiple Imputation. The Multiple Imputation procedure performs multiple imputation of missing
data values. Given a dataset containing missing values, it outputs one or more datasets in which
missing values are replaced with plausible estimates. You can then obtain pooled results when
running other procedures. The procedure also summarizes missing values in the working dataset.
This feature is available in the Missing Values add-on option. See the topic Impute Missing Data
Values (Multiple Imputation) for more information.
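Pooling results across imputed datasets is often done with Rubin's rules, which combine the per-dataset estimates and their variances. The Python sketch below shows those rules as a general illustration; the text above does not state which pooling formulas the product applies, and the function name is hypothetical.

```python
import statistics


def pool_estimates(estimates, variances):
    """Combine one estimate and one squared standard error per imputed
    dataset using Rubin's rules. Requires at least two imputations."""
    m = len(estimates)
    pooled = statistics.mean(estimates)           # pooled point estimate
    within = statistics.mean(variances)           # average within-imputation variance
    between = statistics.variance(estimates)      # between-imputation variance (m - 1 divisor)
    total = within + (1.0 + 1.0 / m) * between    # total variance of the pooled estimate
    return pooled, total


# Hypothetical regression coefficient estimated on three imputed datasets.
pooled, total_variance = pool_estimates([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_variance)
```

The between-imputation term is what reflects the extra uncertainty due to the missing values; analyzing a single imputed dataset would ignore it.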
RFM analysis. RFM (recency, frequency, monetary) analysis is a technique used to identify existing
customers who are most likely to respond to a new offer. This technique is commonly used in direct
marketing. This feature is available in the EZ RFM add-on option. See the topic RFM Analysis for
more information.
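A simple form of RFM scoring ranks customers into bins on each of the three measures and concatenates the bin numbers. The Python sketch below assumes five rank-based bins and a three-digit combined score; these choices, the field names, and the arbitrary handling of ties are illustrative assumptions rather than the add-on's documented behavior.

```python
def bin_scores(values, n_bins=5, higher_is_better=True):
    """Assign each value a score from 1 (worst) to n_bins (best) by rank.
    Ties are broken by input order, which is an arbitrary choice."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = rank * n_bins // len(values) + 1
    return scores


def rfm_scores(customers, n_bins=5):
    """Combine recency, frequency, and monetary bins into one score (e.g. 555)."""
    r = bin_scores([c["recency_days"] for c in customers], n_bins,
                   higher_is_better=False)  # fewer days since last purchase is better
    f = bin_scores([c["frequency"] for c in customers], n_bins)
    m = bin_scores([c["monetary"] for c in customers], n_bins)
    return [100 * ri + 10 * fi + mi for ri, fi, mi in zip(r, f, m)]


# Hypothetical customers.
customers = [
    {"recency_days": 5, "frequency": 12, "monetary": 900.0},
    {"recency_days": 200, "frequency": 1, "monetary": 20.0},
    {"recency_days": 30, "frequency": 6, "monetary": 300.0},
    {"recency_days": 90, "frequency": 3, "monetary": 150.0},
    {"recency_days": 60, "frequency": 4, "monetary": 500.0},
]
print(rfm_scores(customers))
```

Customers with the highest combined scores are the ones the technique treats as most likely to respond to a new offer.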
Categorical Regression enhancements. Categorical Regression has been enhanced to include
regularization and resampling methods to assess and improve prediction accuracy. Together, these
new methods make it possible to create state-of-the-art models, even for high-volume data (where
there are more variables than observations, such as in genomics). This feature is available in the
Categories add-on option. See the topic Categorical Regression (CATREG) for more information.
Graphboard. Graphboard visualizations are graphs, charts, and plots created from a visualization
template. IBM® SPSS® Statistics ships with built-in visualization templates. You can also use a
separate product, IBM® SPSS® Visualization Designer, to create your own visualization templates.
The new visualization templates are effectively custom visualization types. See the topic Creating
and Editing Graphboard Visualizations for more information.
Exporting output. More output export format options and more control over exported content,
including:
• Wrap or shrink wide table in Word documents. See the topic Word/RTF options for more
information.
• Create new worksheets or append data to existing worksheets in an Excel workbook. See the
topic Excel options for more information.
• Save output export specifications in the form of command syntax with the OUTPUT EXPORT
command. All the features for exporting output in the Export Output dialog are now also available
in command syntax; so you can save and re-run your export specifications and include them in
automated production jobs. See the topic OUTPUT EXPORT for more information.
• The Output Management System (OMS) now supports these additional output formats: Word,
Excel, and PDF. See the topic Output Management System for more information.

Shift Values. Shift Values creates new variables that contain the values of existing variables from
preceding (lag) or subsequent (lead) cases. See the topic Shift Values for more information.
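The lag and lead operations can be pictured with a short Python sketch. The function name is hypothetical, and None stands in for whatever missing value the product assigns to cases that have no preceding or subsequent case.

```python
def shift_values(values, offset, fill=None):
    """Return a new list shifted by `offset` cases.

    offset > 0 gives a lag (each case receives the value from a preceding case);
    offset < 0 gives a lead (each case receives the value from a subsequent case).
    Cases with no source value receive `fill`.
    """
    n = len(values)
    return [values[i - offset] if 0 <= i - offset < n else fill
            for i in range(n)]


sales = [10, 12, 9, 15]
print(shift_values(sales, 1))    # lag by one case:  [None, 10, 12, 9]
print(shift_values(sales, -1))   # lead by one case: [12, 9, 15, None]
```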
Aggregate enhancements. You can now use the features of the Aggregate procedure without
specifying a break variable. See the topic Aggregate Data for more information.
Median function. A median function is now available for computing the median value across
selected variables for each case. See the topic Statistical functions for more information.
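Computing a median across variables means taking, for each case (row), the median of that case's values on the selected variables. A minimal Python sketch follows; the function name is hypothetical, and skipping missing values is an assumption made for the example.

```python
import statistics


def row_median(case, variables):
    """Median of the selected variables within a single case.
    Missing values (None) are skipped; returns None if nothing is left."""
    present = [case[v] for v in variables if case.get(v) is not None]
    return statistics.median(present) if present else None


# Hypothetical cases with three test scores each.
cases = [
    {"q1": 3, "q2": 9, "q3": 5},
    {"q1": 3, "q2": None, "q3": 9},
]
for case in cases:
    print(row_median(case, ["q1", "q2", "q3"]))
```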
User interface enhancements. Enhancements to the point-and-click interface include:
• All dialog boxes are now resizable. The ability to make a dialog box wider makes variable lists
wider so that you can see more of the variable names and/or descriptive labels. The ability to
make a dialog box longer makes variable lists longer so that you can see more variables without
scrolling.
• Drag-and-drop variable selection is now supported in all dialog boxes.
• Variable list display order and display characteristics can be changed on the fly in all dialog
boxes. Change the sort order (alphabetic, file order, measurement level) and/or switch between
display of variable names or variable labels whenever you want. See the topic Variable names
and variable labels in dialog box lists for more information.
Data and output management. Data and output management enhancements include:
• Read and write Excel 2007 files.
• Choose between working with multiple datasets or one dataset at a time. See the topic General
options for more information.
• Search and replace information in Viewer documents, including hidden items and layers in
multidimensional pivot tables. See the topic Finding and replacing information in the Viewer for
more information.
• Assign missing values and value labels to any string variable, regardless of the defined string
width (previously limited to strings with a defined width of 8 or fewer bytes).
• New character-based string functions. See the topic String functions for more information.
• Output Management System (OMS) support for Viewer file format (.spv) and VML-format charts
and image maps with pop-up chart information for HTML documents. See the topic Output
Management System for more information.
• Customize Variable View in the Data Editor. Change the display order of the attribute columns,
and control which attribute columns are displayed. See the topic Customizing Variable View for
more information.
• Sort variables in the active dataset alphabetically or by attribute (dictionary) values. See the
topic Sort variables for more information.
• Spell check variable labels and value labels in Variable View. See the topic Spell checking for
more information.
• Change basic variable type (string, numeric), change the defined width of string variables, and
automatically set the width of string variables to the longest observed value for each variable.
See the topic ALTER TYPE for more information.
• Read and write Unicode data and syntax files. See the topic General options for more information.
• Control the default directory location to look for and save files. See the topic File locations
options for more information.
Performance. For computers with multiple processors or processors with multiple cores,
multithreading for faster performance is now available for some procedures. See the topic THREADS
Subcommand (SET command) for more information.
Statistical enhancements. Statistical enhancements include:
• Partial Least Squares (PLS). A predictive technique that is an alternative to ordinary least
squares (OLS) regression, canonical correlation, or structural equation modeling, and it is
particularly useful when predictor variables are highly correlated or when the number of
predictors exceeds the number of cases. See the topic Partial Least Squares Regression for more
information.
• Multilayer perceptron (MLP). The MLP procedure fits a particular kind of neural network called a
multilayer perceptron. The multilayer perceptron uses a feed-forward architecture and can have
multiple hidden layers. The multilayer perceptron is very flexible in the types of models it can fit.
It is one of the most commonly used neural network architectures. This procedure is available in
the new Neural Networks option. See the topic Multilayer Perceptron for more information.
• Radial basis function (RBF). A Radial basis function (RBF) network is a feed-forward, supervised
learning network with only one hidden layer, called the radial basis function layer. Like the
multilayer perceptron (MLP) network, the RBF network can do both prediction and classification.
It can be much faster than MLP; however, it is not as flexible in the types of models it can fit.
This procedure is available in the new Neural Networks option. See the topic Radial Basis
Function for more information.
• Generalized Linear Models supports numerous new features, including ordinal multinomial and
Tweedie distributions, maximum likelihood estimation of the negative binomial ancillary parameter,
and likelihood-ratio statistics. This procedure is available in the Advanced Statistics option. See
the topic Generalized Linear Models Response for more information.
• Cox Regression now provides the ability to export model information to an XML (PMML) file. This
procedure is available in the Advanced Statistics option. See the topic Cox Regression Save New
Variables for more information.
• Complex Samples Cox Regression. Apply Cox proportional hazards regression to analysis of
survival times—that is, the length of time before the occurrence of an event for samples drawn
by complex sampling methods. This procedure supports continuous and categorical predictors,
which can be time-dependent. This procedure provides an easy way of considering differences in
subgroups as well as analyzing effects of a set of predictors. The procedure estimates variances
by taking into account the sample design used to select the sample, including equal probability
and probability proportional to size (PPS) methods and with replacement (WR) and without
replacement (WOR) sampling procedures. This procedure is available in the Complex Samples
option.
Programmability extension. Programmability extension enhancements include:
• R-Plugin. Combine the power of IBM® SPSS® Statistics with the ability to write your own
statistical routines with R. This plug-in is available only as a download from
http://www.ibm.com/developerworks/spssdevcentral.
• Nested Begin Program-End Program command structures. See the topic BEGIN PROGRAM-END
PROGRAM for more information.
• Ability to create and manage multiple datasets.

Command syntax. For a complete list of command syntax additions and changes, see Release
History.
Features no longer supported
• There is no longer a separate chart editor for "interactive" charts. Charts created from the
legacy "interactive" chart dialog boxes and from IGRAPH command syntax are created in the
same format as all other charts and edited in the same chart editor.
• Some features provided in the legacy "interactive" chart dialog boxes and IGRAPH command
syntax are no longer available. See the "Release History" section of IGRAPH for details.
• The Draft Viewer is no longer available.
• You cannot open Viewer files created in previous versions of SPSS Statistics (.spo files) in SPSS
Statistics 16.0. For Windows operating systems, the installation CD includes a Legacy Viewer
that you can install to view and edit Viewer files created in previous releases.
• The Maps option is no longer available.
• Dialog box interfaces for the legacy procedures in the Trends and Tables options are no longer
available. For Trends, this includes the following commands: AREG, ARIMA, and EXSMOOTH. For
Tables, this includes the TABLES command. If you have a license for either of these options that
includes the legacy procedures, command syntax for these commands is still supported.
Data management
Custom variable attributes. In addition to the standard variable attributes (for example, value
labels, missing values, measurement level), you can create your own custom variable attributes.
You can display and edit these attributes directly in Variable View of the Data Editor. Like standard
variable attributes, these custom attributes are saved with IBM® SPSS® Statistics data files. See
the topic Custom Variable Attributes for more information. (A syntax-only implementation of custom
variable attributes was introduced in version 14 with the VARIABLE ATTRIBUTE command.)
Variable sets. You can now use variable sets to control which variables are displayed in the Data
Editor as well as in dialog box variable lists. (In previous releases, variable sets affected only dialog
box variable lists.) Variable sets make it easier to work with data files that contain a large number
of variables. See the topic Defining variable sets for more information.
Export to Database wizard. Create new database tables, replace values for selected fields, and
add new fields to an existing table, all without having to write a single line of SQL code yourself.
See the topic Exporting to a Database for more information.
Export to Data Collection. Export to IBM® SPSS® Data Collection creates a data file in SPSS
Statistics format and a Data Collection metadata file that you can use to read the data into Data
Collection applications. See the topic Exporting to IBM SPSS Data Collection for more information.
(Note: This is only available on Microsoft Windows operating systems.)
Save data in CSV format. Save data in CSV (comma-separated values) format. CSV is a common
data format recognized by many applications. See the topic Saving data files in external formats for
more information. For information on command syntax for saving data in CSV format, see SAVE
TRANSLATE.

Reporting
Export results in PDF format. Export output in PDF format, including Viewer outline headings as
bookmarks in the PDF file. See the topic Export output for more information.
Control chart enhancements. You can now define rules for control charts to help you quickly
identify points that are out of control.
More chart types in Chart Builder. The Chart Builder has been expanded to include histograms,
boxplots, scatterplot matrices, overlay scatterplots, population pyramids, error bar charts, high-
low-close charts, difference area charts, range bar charts, dot plots, charts of separate variables,
and paneled charts. You can also create charts that were not previously available, such as charts
with dual, independent y axes.
Chart Editor enhancements. The Chart Editor now offers more control over your charts. Major
features include an updated Variables tab for changing chart types easily, automatic control of
white space, additional distribution curves for histograms, a tool for quickly rescaling axes, and the
ability to use custom equations to create reference lines. See the topic What's New and Different
Programmatic control of output documents. You can now create, open, activate, save, and
close Viewer documents with command syntax using OUTPUT NEW, OUTPUT NAME, OUTPUT
ACTIVATE, OUTPUT OPEN, OUTPUT SAVE, and OUTPUT CLOSE.
Statistical Enhancements
Ordinal Regression. This procedure, previously available as part of the Advanced Statistics add-
on option, is now available in the Core system. See the topic Ordinal Regression for more
information.
PMML model files with transformations. You can now include transformations in PMML model files
and merge information from model files using the TMS BEGIN-TMS END and TMS MERGE commands.
Generalized Linear Models. The Generalized Linear Models procedure expands the general linear
model so that the dependent variable is linearly related to the factors and covariates via a specified
link function. Moreover, the model allows for the dependent variable to have a non-normal
distribution. This procedure is available in the Advanced Statistics option. See the topic Generalized
Linear Models Response for more information.
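The role of the link function can be sketched in a few lines of Python. This is an illustration only, not SPSS code: the logit link is just one possible choice, and the coefficient values below are hypothetical.

```python
import math


def logit(mu):
    """Link function: maps a mean in (0, 1) onto the linear-predictor scale."""
    return math.log(mu / (1.0 - mu))


def inverse_logit(eta):
    """Inverse link: maps the linear predictor back to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients for a model with one covariate.
intercept, slope = -1.0, 0.5
x = 2.0

eta = intercept + slope * x   # linear predictor
mu = inverse_logit(eta)       # expected value of the dependent variable

print(eta, mu)
```

The linear relationship holds on the link scale (`logit(mu)` equals `eta`), while the mean itself stays between 0 and 1.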
Generalized Estimating Equations. The Generalized Estimating Equations procedure extends the
generalized linear model to allow for analysis of repeated measurements. This procedure is available
in the Advanced Statistics option. See the topic Generalized Estimating Equations for more
information.
Complex Samples Ordinal Regression. The Complex Samples Ordinal Regression procedure
performs regression analysis on a binary or ordinal dependent variable for samples drawn by complex
sampling methods. Optionally, you can request analyses for a subpopulation. This procedure is
available in the Complex Samples option. See the topic Complex Samples Ordinal Regression for more
information.
Optimal Binning. The Optimal Binning procedure discretizes one or more scale variables by
distributing the values of each variable into bins. Bin formation is optimal with respect to a
categorical guide variable that "supervises" the binning process. Bins can then be used instead of
the original data values for further analysis. This procedure is available in the Data Preparation
option. See the topic Optimal Binning for more information.
Programmability Extension
The Programmability Extension now allows you to write to the active dataset and create custom
pivot tables and custom procedures. For more information go to
http://www.ibm.com/developerworks/spssdevcentral
• Dialog box interface for new PREFSCAL procedure, introduced in version 14.0. This procedure is
available in the Categories option. See the topic Multidimensional Unfolding (PREFSCAL) for more
information.
• New add-on option: Adaptor for Predictive Enterprise Services enables you to publish and share
objects through a central repository. Objects can be shared among users, shared with other
applications, and tracked using versioning, thus eliminating the need for ad-hoc file systems to
manage enterprise-wide assets.
• Access to data for external programming languages. The Programmability Extension has been
enhanced to provide the ability to send data to an external programming language.
Data management
• Have multiple data sources open at the same time, making it easier to compare data files, copy
data and attributes from one file to another file, and merge multiple data sources without saving
each data source as a sorted data file first. For more information, see Basic Handling of Multiple
Data Sources and DATASET NAME.
• Read and write Stata-format data files. You can read Stata version 4–8 data files and write
Stata version 5–8 data files. For more information, see Reading Stata files, Saving data files in
Stata format, GET STATA, SAVE TRANSLATE.
• Read data from IBM® SPSS® Data Collection data sources. See the topic Reading IBM SPSS
Data Collection Data for more information. (Note: This is available only on Microsoft Windows
operating systems.)
• Read data from OLE DB data sources. See the topic Selecting a Data Source for more
information. (Note: This is available only on Microsoft Windows operating systems.)
• Define descriptive value labels up to 120 bytes (previous limit was 60 bytes).
• Create data values from value labels or use them in transformation logic with the VALUELABEL
function. See the topic VALUELABEL function for more information.
• Find and replace string values with the REPLACE function. See the topic String functions for more
information.
• Define custom variable attributes and data file attributes with the VARIABLE ATTRIBUTE and
DATAFILE ATTRIBUTE commands.
• Write data to database tables and other formats by using field/column names that are not
constrained by IBM® SPSS® Statistics variable-naming rules. SAVE TRANSLATE has been
enhanced to allow you to use quoted values for field/column names that contain spaces,
commas, or other characters that are not allowed in variable names. See the topic RENAME
Subcommand (SAVE TRANSLATE command) for more information.
• Use the new SQL subcommand of the SAVE TRANSLATE command to append new columns to
database tables, modify database table column attributes, join tables, and perform other actions
that are permitted with valid SQL statements. See the topic SQL Subcommand (SAVE
TRANSLATE command) for more information.
Charts
• Use the new Chart Builder interface (Graphs menu) to build charts from predefined gallery charts
or from the individual parts (for example, coordinate systems and bars) that make up a chart.
See the topic Building Charts for more information.
• Create custom chart types by using powerful GGRAPH and GPL command syntax. See the topic
GGRAPH for more information.
Statistical enhancements
• New Expert Modeler in the Forecasting option automatically identifies and estimates the best-
fitting model for one or more time series, thus eliminating the need to identify an appropriate
model through trial and error. For more information, see Time Series Modeler and TSMODEL.
• New Data Validation option provides a quick visual snapshot of your data and provides the ability
to apply validation rules that identify invalid data values. You can create rules that flag out-of-
range values, missing values, or blank values. You can also save variables that record individual
rule violations and the total number of rule violations per case. A limited set of predefined rules is
provided that you can copy or modify. For more information, see Introduction to Data Preparation
and VALIDATEDATA.
• New Anomaly Detection procedure in the Data Validation option finds unusual observations that
could adversely affect predictive models. Some of these outlying observations represent truly
unique cases and are thus unsuitable for prediction, while other observations are caused by
data-entry errors in which the values are technically “correct” and thus cannot be caught by the
Validate Data procedure. For more information, see Identify Unusual Cases and DETECTANOMALY.
• New Multidimensional Unfolding procedure (PREFSCAL) in the Categories option attempts to find
the structure in a set of proximity measures between row and column objects. This process is
accomplished by assigning observations to specific locations in a conceptual low-dimensional
space such that the distances between points in the space match the given (dis)similarities as
closely as possible. The result is a least-squares representation of the objects in that low-
dimensional space, which, in many cases, helps you further understand your data. This procedure
is currently available with PREFSCAL command syntax. See the topic PREFSCAL for more
information.
• New Predictor Selection procedure (SELECTPRED) in SPSS Statistics Server sifts through a very
large number of categorical and continuous predictor variables. The procedure selects a smaller
subset for use in predictive modeling procedures that cannot accept so many predictors. This
procedure is currently available with SELECTPRED command syntax. See the topic SELECTPRED
for more information.
• New Naïve Bayes procedure (NAIVEBAYES) in SPSS Statistics Server produces a simple and stable
model for predictor selection and classification. This procedure is currently available with
NAIVEBAYES command syntax. See the topic NAIVEBAYES for more information.
• Improved significance testing capabilities in the Custom Tables option now allow you to perform
significance tests on subtotals and multiple response sets. See the topic Custom Tables: Test
Statistics Tab for more information.
• More flexibility is available in defining multiple response sets for multiple dichotomies. See the
topic Defining Multiple Response Sets for more information.
Output
• Pivot table output is now provided for Rank Cases (RANK), Replace Missing Values (RMV), and
Create Time Series (CREATE) in the Core system; all procedures in the Conjoint option; Model
Selection Loglinear Analysis (HILOGLINEAR) in the Advanced Statistics option; and Probit Analysis
(PROBIT), Weight Estimation (WLS), and 2-Stage Least Squares (2SLS) in the Regression option.
Performance enhancements
• Table structures that previously took a long time to create or that might run out of memory with
the Custom Tables option (CTABLES) can now be created quickly and efficiently.
1.1.9.1. Version 14.0 compatibility with previous releases
Logistic Regression
In previous versions, the order of recoded string values was dependent on the order of values in
the data file. For example, when recoding the dependent variable, the first string value encountered
was recoded to 0, and the second string value encountered was recoded to 1. The procedure now
recodes string variables so that the order of recoded values is the alphanumeric order of the string
values. Thus, the procedure may recode string variables differently than in previous versions.
Logistic Regression is available in the Regression option.
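The difference between the two recoding orders can be illustrated with a short Python sketch. It is not SPSS code: the function names and sample values are hypothetical, and Python's default string ordering stands in for the procedure's alphanumeric order.

```python
def recode_by_first_seen(values):
    """Earlier behavior: codes follow the order in which values appear in the file."""
    codes = {}
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
    return codes


def recode_by_sorted_order(values):
    """Current behavior: codes follow the sorted order of the distinct values."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


# Hypothetical string dependent variable, in file order.
outcomes = ["yes", "no", "no", "yes"]

old_codes = recode_by_first_seen(outcomes)
new_codes = recode_by_sorted_order(outcomes)

print(old_codes)  # {'yes': 0, 'no': 1}
print(new_codes)  # {'no': 0, 'yes': 1}
```

With this data the two schemes assign opposite codes, which reverses the sign of the estimated coefficients even though the fitted model is equivalent.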
Macro facility
Improvements to the macro facility may cause errors in jobs that previously ran without errors.
Specifically, for syntax that is processed with interactive rules, if a macro call occurs at the end of
a command, and there is no command terminator (either a period or a blank line), the next command
after the macro expansion will be interpreted as a continuation line instead of a new command, as
in:
DEFINE !macro1()
var1 var2 var3
!ENDDEFINE.
FREQUENCIES VARIABLES = !macro1
DESCRIPTIVES VARIABLES = !macro1.
In interactive mode, the DESCRIPTIVES command will be interpreted as a continuation of the
FREQUENCIES command, and neither command will run.
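The effect of the missing terminator can be mimicked with a small Python sketch. This is a deliberately simplified model of interactive rules (a command ends at a trailing period or a blank line), not the actual SPSS parser, and the macro is shown already expanded.

```python
def split_commands(lines):
    """Group lines into commands; a trailing period or a blank line ends a command."""
    commands, current = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped:                  # blank line terminates the command
            if current:
                commands.append(" ".join(current))
                current = []
            continue
        current.append(stripped)
        if stripped.endswith("."):        # period terminates the command
            commands.append(" ".join(current))
            current = []
    if current:                           # unterminated trailing text
        commands.append(" ".join(current))
    return commands


# Macro call at the end of FREQUENCIES with no terminator.
broken = [
    "FREQUENCIES VARIABLES = var1 var2 var3",
    "DESCRIPTIVES VARIABLES = var1 var2 var3.",
]
# Same job with a period after the macro call.
fixed = [
    "FREQUENCIES VARIABLES = var1 var2 var3.",
    "DESCRIPTIVES VARIABLES = var1 var2 var3.",
]

print(len(split_commands(broken)))  # 1
print(len(split_commands(fixed)))   # 2
```

Adding a period (or a blank line) after the macro call restores the two separate commands.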
New data definition tools. Two new features make defining data faster and easier:
• The Copy Data Properties wizard provides the ability to use an external data file as a template
for defining file and variable properties in the active dataset. You can also use variables in the
active dataset as templates for other variables in the active dataset. Copy Data Properties is
available on the Data menu in the Data Editor window. See the topic Copying Data Properties for
more information.
• Define Variable Properties (also available on the Data menu in the Data Editor window) scans
your data and lists all unique data values for any selected variables, identifies unlabeled values,
and provides an auto-label feature. This is particularly useful for categorical variables that use
numeric codes to represent categories--for example, 0 = Male, 1 = Female. See the topic
Defining Variable Properties for more information.
Expanded support for SAS format data files. You can now save data files in SAS version 6, SAS
version 7, and SAS Transport file format. See the topic Saving data: Data file types for more
information.
Expanded output export capabilities. You can now export entire Viewer documents or selected
output objects in Word/RTF format and Excel format (charts are not included in Excel format). See
the topic Export output for more information.
Multiple output languages. You can now produce pivot table output in different languages and
switch languages during the same session. See the topic General options for more information.
TwoStep Cluster Analysis. This new clustering procedure offers the following features not
available in the other IBM® SPSS® Statistics clustering procedures:
• Automatic selection of the best number of clusters, in addition to measures for choosing
between cluster models.
• Ability to create cluster models simultaneously based on categorical and continuous variables.
• Ability to save the cluster model to an external XML file, then read that file and update the
cluster model using newer data.
• Ability to analyze large data files with a single clustering procedure.
See the topic TwoStep Cluster Analysis for more information.
New Custom Tables option. If you have used the Tables option in the past, you will discover that
almost everything is new in this release, including:
• A simple, drag-and-drop table builder interface that allows you to preview your table as you
select variables and options.
• A single, unified table builder interface instead of multiple menu choices and dialog boxes for
different types of tables.
• Subtotals for subsets of categories of a categorical variable.
• Custom control over category display order and ability to selectively show or hide categories.
Note: Custom Tables is not included in the Core system. It is only available if you have purchased
the Custom Tables add-on option.
Ratio Statistics. A new procedure provides a comprehensive list of summary statistics for
describing the ratio between two scale variables, including coefficient of dispersion, coefficient of
variation, price-related differential, and average absolute deviation.
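As a rough illustration of these statistics, the Python sketch below computes them for a handful of ratios. It assumes the commonly used definitions (average absolute deviation about the median, coefficient of dispersion as that deviation expressed as a percentage of the median, mean-centered coefficient of variation, and price-related differential as the mean ratio divided by the weighted mean ratio). The function name and sample data are hypothetical, and the procedure itself may offer additional variants.

```python
from statistics import mean, median, stdev


def ratio_statistics(numerators, denominators):
    """Summary statistics for the ratios numerator / denominator."""
    ratios = [n / d for n, d in zip(numerators, denominators)]
    med = median(ratios)
    avg = mean(ratios)
    # Average absolute deviation of the ratios about the median.
    aad = mean([abs(r - med) for r in ratios])
    # Weighted mean ratio: total of numerators over total of denominators.
    weighted_mean = sum(numerators) / sum(denominators)
    return {
        "median": med,
        "mean": avg,
        "aad": aad,
        "cod": 100.0 * aad / med,               # coefficient of dispersion (%)
        "cov": 100.0 * stdev(ratios) / avg,     # mean-centered coefficient of variation (%)
        "prd": avg / weighted_mean,             # price-related differential
    }


# Hypothetical appraised values and sale prices.
appraised = [50, 100, 300]
sold = [100, 100, 200]

stats = ratio_statistics(appraised, sold)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```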

Linear Mixed Models. A new procedure enables you to construct predictive models when you have
a nested data structure. You can formulate a wide variety of models, including Fixed Effects ANOVA
Model, Randomized Complete Blocks Design, Split-Plot Design, Purely Random Effects Model, Random
Coefficient Model, Multilevel Analysis, Unconditional Linear Growth Model, Linear Growth Model with
a person-level covariate, Repeated Measures Analysis, and Repeated Measures Analysis with time-
dependent covariate. In addition, you can work with repeated measure designs, such as incomplete
repeated measurements in which the number of observations varies across subjects. Available in
the Advanced Statistics option.
Performance enhancements. General Linear Models, Proximities, Hierarchical Cluster Analysis
(Core system), and Multinomial Logistic Regression (Regression) now run much faster than in
previous releases.
Data management. Restructure data to create a single case (record) from multiple cases or
create multiple cases from a single case.
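The two directions of restructuring can be sketched in Python with plain lists of dictionaries. This only illustrates the idea and is not how SPSS implements it; the function names, the `index` and `value` field names, and the `v` prefix are all hypothetical.

```python
def cases_to_multiple(rows, id_key, value_keys, index_name="index", value_name="value"):
    """One case per id -> one case per (id, measurement)."""
    long_rows = []
    for row in rows:
        for position, key in enumerate(value_keys, start=1):
            long_rows.append(
                {id_key: row[id_key], index_name: position, value_name: row[key]}
            )
    return long_rows


def multiple_to_cases(long_rows, id_key, index_name="index", value_name="value", prefix="v"):
    """One case per (id, measurement) -> a single case per id."""
    wide = {}
    for row in long_rows:
        case = wide.setdefault(row[id_key], {id_key: row[id_key]})
        case[f"{prefix}{row[index_name]}"] = row[value_name]
    return list(wide.values())


# Hypothetical data: two subjects, two measurements each.
wide_rows = [
    {"id": 1, "v1": 10, "v2": 12},
    {"id": 2, "v1": 7, "v2": 9},
]

long_rows = cases_to_multiple(wide_rows, "id", ["v1", "v2"])
round_trip = multiple_to_cases(long_rows, "id")

print(long_rows)
print(round_trip)
```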
Database Wizard. Now allows you to automatically recode categorical string values to numeric
variables (using the original values as value labels), auto-join tables using primary/foreign key
relationships, and obtain random samples from large data sources.
Text Wizard. Now allows you to read CSV-format text data that contains text qualifiers (such as,
"1,000", "2,000", ...).
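To show why text qualifiers matter, the following Python sketch reads a small comma-delimited sample with the standard `csv` module. The sample data is hypothetical, and this is only an analogy for what the Text Wizard does, not its implementation.

```python
import csv
import io

# Hypothetical CSV text: the double quotes are text qualifiers, so the
# commas inside them are part of the value, not field delimiters.
raw = '"1,000","2,000",plain\n"3,500","4,250",text\n'

rows = list(csv.reader(io.StringIO(raw), delimiter=",", quotechar='"'))

# Strip the thousands separators to recover numeric values.
numbers = [[int(cell.replace(",", "")) for cell in row[:2]] for row in rows]

print(rows)     # [['1,000', '2,000', 'plain'], ['3,500', '4,250', 'text']]
print(numbers)  # [[1000, 2000], [3500, 4250]]
```

Without the qualifier, each of the first two fields would be split at its embedded comma and every row would appear to have five fields instead of three.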
OLAP Cubes. Now allows you to calculate arithmetic and percentage differences between
categories of a grouping variable or between separate variables.
One-Way ANOVA. Now includes the Brown-Forsythe and Welch tests.
Scientific notation for small numbers. You can choose not to see it in your output (from the Edit
menu, choose Options, and then click the General tab).
Aggregate. Median has been added to the list of available Aggregate functions.
Arithmetic functions. Available functions in arithmetic expressions are expanded to include density
functions for continuous and discrete distributions.
Multinomial Logistic Regression. New functionality added to this procedure allows you to save
estimated response probabilities, predicted response categories, probability of predicted response
categories, and probability of actual response categories. Available in the Regression option.
Categorical Regression. This procedure has been redesigned to make it more powerful and easier
to use. Available in the Categories option.
Categorical Principal Components Analysis. Improvements in this procedure now make results
easier to understand. Available in the Categories option.
Read native SAS files with GET SAS. The GET SAS command has been enhanced to read native
SAS data files in addition to SAS transport files.
Cache data automatically. The Cache facility now automatically creates a data cache after a
certain number of changes to the active data file. By default, the number of changes is 20. You can
change the number with the SET command, CACHE subcommand.
See the Command Syntax Reference for more information on SET and GET SAS.

Improved data access. Analyze large data files without requiring large amounts of temporary disk
storage space. File size limitations are virtually eliminated because duplicate copies of the data file
(automatically created and stored in temporary disk space in previous releases) are no longer
required.
Distributed analysis. Dramatically improve the speed of your analysis by using a remote server
computer to perform data- and compute-intensive work for you. Using distributed analysis mode
with the server version, you can perform complex analyses on large data files without tying up your
desktop computer.
Multiple sessions. You can now run multiple sessions simultaneously on the same desktop
computer, making it possible to analyze more than one data file at the same time.
Direct Excel access. Read Excel 5 or later files directly into IBM® SPSS® Statistics simply by
selecting the Excel file in the Open File dialog box. You no longer need to use special Excel ODBC
drivers to read Excel files. And now you can read columns that contain mixed data types without
any loss of data. Columns with mixed data types are automatically read as string variables and all
values are read as valid string values.
New Data Editor. The Data Editor has been redesigned with a new Variables tab that makes it
much easier to view and define variable attributes such as data types and descriptive variable and
value labels.
Multiple test variables with ROC curves. The ROC Curve procedure has been enhanced to
compare multiple test variables.
Improved quality for interactive graphics used in other applications and improved printing
performance. Interactive graphs can now be copied as Windows metafiles, which are better suited
to resizing and printing in other applications without jagged lines and edges. Interactive graphs can
be printed as metafiles for faster results at the same high quality.
Polytomous Logit Universal Models (PLUM). Enables you to apply regression techniques to
ordinal outcomes (such as low, medium and high). Available in the Advanced Statistics option.
Thematic mapping. Enables you to graphically summarize data by geographic regions, using bar,
pie, range of value, graduated symbol, and dot density charts displayed on high-quality maps.
Available in the new Map option.
New optimal scaling procedure. A new nonlinear principal components analysis procedure
(CATPCA) is available in the Categories option.
Improved output for Logistic Regression and Cox Regression. Logistic Regression (Regression
option) and Cox Regression (Advanced Statistics option) now produce high-quality, flexible, pivot
table output.
A (slightly) new look. The Analyze menu replaces the Statistics menu, and:
• Procedures formerly available on the Summarize submenu have been reorganized into two new
submenus: Reports and Descriptive Statistics.
• Layered Reports is now OLAP Cubes.
• GLM-General Factorial is now GLM Univariate.
Interactive graphics. More charts and more features, including:
• More chart types, including area charts, stacked bar charts, and charts of multiple variables.
• More chart features, including reference lines, secondary axes, spikes in scatterplots, control of
category order and display of categories with missing data, more flexibility in key display, and
more control over panel display.
• Charts from pivot tables. Just select the part of the table you want to display as a chart, right-
click anywhere in the selected area, and select Create Graph.
Draft Viewer. Better-looking output in the Draft Viewer with:
• Improved table borders using box characters that produce clean, solid lines for row, column, and
cell borders.
• Better page breaks for multipage tables and more control over how multipage tables are
displayed.
Statistical enhancements. Statistical enhancements include:
• Reliability analysis, multidimensional scaling (ALSCAL), and the Matrix language are now available
in the Core system.
• New ROC Curve procedure for evaluating the performance of classification schemes where there
is one variable with two categories by which subjects are classified.
• Crosstabs procedure enhanced to include Cochran-Mantel-Haenszel statistic.
• New Nominal Regression procedure for analyzing the relationship between categorical variables
with two or more categories and multiple independent variables (available in the Regression
option).
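For the ROC Curve item above, the area under the curve is a common summary of how well a test variable separates the two categories. The Python sketch below computes it by comparing every positive case with every negative case, which is one standard way to obtain the area. The function name and sample data are hypothetical, and this is not the procedure's own algorithm.

```python
def area_under_roc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for the positive category, 0 for the other category.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("Both categories must be present.")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test variable and two-category state variable.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]

print(area_under_roc(scores, labels))  # 0.75
```

A value of 0.5 corresponds to a test variable that does no better than chance, and 1.0 to perfect separation.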
File management. The new Text Wizard makes it easier than ever to read text data files in a
variety of formats.
Titles. Support for syntax commands TITLE and SUBTITLE.
Dynamic, interactive charts and graphs. New graphing features make it easier to explore your
data visually.
• Drag and drop new variables, and update charts on the fly.
• Split a single chart into multiple panels for side-by-side comparisons.
• Paste "live" charts into applications that support ActiveX objects.
Statistical enhancements. Perform more in-depth analysis with additional statistics, including:
• New ANOVA procedure with custom models and post-hoc tests.
• Robust Levene test to compare variance between groups in the Explore procedure.
• Harmonic and geometric means in the Means procedure.
• Intraclass correlation in the Reliability procedure (Professional Statistics option).
• One-minus-survival functions in Survival procedures (Advanced Statistics option).
• Improved correspondence analysis and multiple regression for categorical data (Categories
option).
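As a reminder of how the harmonic and geometric means mentioned in the list above differ from the arithmetic mean, here is a minimal Python sketch using the standard `statistics` module (Python 3.8 or later). The sample values are hypothetical.

```python
from statistics import geometric_mean, harmonic_mean, mean

values = [1, 2, 4]

arithmetic = mean(values)            # sum of the values divided by their count
geometric = geometric_mean(values)   # nth root of the product of the values
harmonic = harmonic_mean(values)     # count divided by the sum of reciprocals

print(arithmetic, geometric, harmonic)
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which in turn is smaller than the arithmetic mean.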
Data management. Work faster and easier with data management improvements, including:
• Display descriptive variable labels instead of variable names in dialog boxes. Variable labels can
be up to 256 characters long.
• Create and execute database queries faster. Save queries created by the Database Wizard and
create prompted queries that allow you to use the same query to retrieve different subsets of
data (such as sales data for different quarters).
• Include a date and time stamp in journal files for archiving.
Output enhancements. Create more pivot tables and have more control over output, including:
• See the bottom line faster with the new Layered Reports procedure that provides summary
reports with each subgroup in a separate layer. Drill down to custom views of your results and
reveal key information.
• Pick the output format you want: interactive pivot tables or simple text output. (For text output,
choose Options from the Edit menu, click the General tab, and click Draft for output type.)
• Control default column width in pivot tables and save column width settings in the Data Editor.
• Use bookmarks to save different views of pivot tables.
• Display correlation coefficients and significance levels next to each other in the Bivariate
Correlations procedure.
• Organize results by variable or by table type in the Frequencies procedure.
Online Help. Online Help has been expanded to include more just-in-time training features,
including:
• Easily understand your results with the Results Coach. Annotated sample output that helps you
understand how to interpret your results is available for all pivot tables in the Core system.
• Get the information you need faster and easier with "Ask Me" help -- a natural language interface
that helps you find the answers you need without needing to know any complicated jargon.
Scripting and automation. With new scripting features and OLE automation, you can automate
many tasks, including customizing pivot table output. You can use the sample scripts, customize
them to meet your needs, or create your own scripts.
• Use Options on the Edit menu and select the Scripts tab on the Options dialog box to select
autoscripts (scripts that run automatically each time you create a specified table type) and
select specific autoscript functions.
• Use Create/Modify Autoscripts on the Utilities menu to create new autoscript functions for the
currently selected output object type in the Viewer.
• Use Run Scripts on the Edit menu to run a personal script on the currently selected output
object in the Viewer (a variety of sample personal scripts are supplied with IBM® SPSS®
Statistics).
• Use Open or New on the File menu to modify any personal script or create a new personal script.
HTML and ASCII format for exporting output. You can export output in HTML (HTML 3.0) and
ASCII text format. For HTML format, pivot tables can be exported as HTML tables, and charts can
be exported in JPEG format and automatically embedded by reference in your HTML document. Use
Export on the File menu of the Viewer to export output.
Expanded features for reading databases. The new Database Wizard enables you to specify
multiple joins, including both inner and outer joins. Use Database Capture on the File menu to read
databases into SPSS Statistics.
Customizable toolbars. You can modify toolbars and create your own toolbars to include the
features you use often, including personal scripts and any items available on the menus. Use
Toolbars on the View menu to customize toolbars.
Statistics Coach. For users who are not familiar with SPSS Statistics or with the available
statistical procedures, the Statistics Coach can help you get started with many of the basic
statistical techniques in the Core system.
More statistical procedures in the Core system. Factor analysis, discriminant analysis, cluster
analysis, and proximity and distance measures are now included in the Core system (Analyze menu)
and feature new, flexible, pivot table output.
Variance Components Analysis. A new procedure in the Advanced Statistics option, Variance
Components Analysis extends the analytic capabilities of the General Linear Model procedures.
Statistical enhancements. Many statistical procedures now have additional features:
• Crosstabs. McNemar test and clustered bar charts.
• Frequencies. Pie charts.
• Factor Analysis. Promax rotation method.
• Discriminant Analysis. Leave-one-out classification (similar to jackknifing).
• Logistic Regression (Professional Statistics). Pseudo R Squared measures and Hosmer-
Lemeshow goodness-of-fit statistics.
• General Linear Model (Advanced Statistics). Expanded set of analysis options and techniques.
New tables features. With the Custom Tables option, you can save multiple response set
information, and pivoting features have been enhanced to provide greater flexibility for pivoting
tables.
More printing control. Printing features have been expanded to include alignment control of
individual output items, user-specified page and column breaks in large tables, and widow and
orphan control for tables that break across pages.
• Use Align Left, Center, or Align Right on the Format menu in the Viewer to change the alignment
for the selected output item.
• Use Break on the Format menu in an activated pivot table to specify a page or column break at
the selected row or column. Use Keep Together to prevent page or column breaks between rows
or columns you want to remain together.
• Use Table Properties on the Format menu in an activated pivot table to change widow/orphan
settings.
More pivot table control. Features for modifying pivot tables have been expanded to include the
ability to: reorder categories by dragging and dropping selected rows or columns, rotate row and
column labels, and create groups of related rows and columns with group labels.
• Use Rotate on the Format menu in an activated pivot table to rotate inner column or outer row
labels.
• Use Group on the Edit menu in an activated pivot table to create groups of related rows or
columns with group labels.
Other new features. Other new features include:
• Variable labels up to 256 characters.
• Ability to read SYSTAT 6.0 for Windows files directly.
Version 7.0 takes full advantage of the improved features of Windows 95 to bring you:
New output display. Most of the statistical procedures in the Core system have been improved to
provide presentation-quality results.
Pivot tables. Pivot table output allows you to view your results in many different ways. You can
switch row and column variables, selectively show and hide categories, and manipulate
multidimensional tables.
Easy access context menus. A simple right-mouse button click anywhere in a pivot table opens a
context menu that provides access to common editing tasks right at your fingertips.
Improved online Help. Right-mouse button context menus also provide access to online Help on
individual controls on dialog boxes and selected items in output results.
Unconstrained multiple document interface. IBM® SPSS® Statistics windows are no longer
constrained to fit inside an overall application window with a single menu bar and toolbar. Each
window has its own menus and toolbars and can be placed anywhere on the screen.
New Summarize procedure. Create presentation-quality listings of the cases in your data file,
combined with summary statistics for subgroups of cases defined by one or more grouping variables.
You can also suppress display of cases to obtain totals and subtotals for all combinations of the
selected grouping variables.
New GLM procedure. The general linear model (GLM) is a flexible statistical model incorporating
analyses involving normally distributed dependent variables and combinations of categorical and
continuous predictor variables.
1.1.18. Windows
There are a number of different types of windows in IBM® SPSS® Statistics:
Data Editor. The Data Editor displays the contents of the data file. You can create new data files
or modify existing data files with the Data Editor. If you have more than one data file open, there is
a separate Data Editor window for each data file.
Viewer. All statistical results, tables, and charts are displayed in the Viewer. You can edit the
output and save it for later use. A Viewer window opens automatically the first time you run a
procedure that generates output.
Pivot Table Editor. Output that is displayed in pivot tables can be modified in many ways with the
Pivot Table Editor. You can edit text, swap data in rows and columns, add color, create
multidimensional tables, and selectively hide and show results.
Chart Editor. You can modify high-resolution charts and plots in chart windows. You can change
the colors, select different type fonts or sizes, switch the horizontal and vertical axes, rotate 3-D
scatterplots, and even change the chart type.
Text Output Editor. Text output that is not displayed in pivot tables can be modified with the Text
Output Editor. You can edit the output and change font characteristics (type, style, color, size).
Syntax Editor. You can paste your dialog box choices into a syntax window, where your selections
appear in the form of command syntax. You can then edit the command syntax to use special
features that are not available through dialog boxes. You can save these commands in a file for use
in subsequent sessions.
Data Editor and Viewer

1.1.18.1. Designated window versus active window
If you have more than one open Viewer window, output is routed to the designated Viewer
window. If you have more than one open Syntax Editor window, command syntax is pasted into the
designated Syntax Editor window. The designated windows are indicated by a plus sign in the icon
in the title bar. You can change the designated windows at any time.
The designated window should not be confused with the active window, which is the currently
selected window. If you have overlapping windows, the active window appears in the foreground. If
you open a window, that window automatically becomes the active window and the designated
window.
1.1.18.1.1. Changing the designated window

Make the window that you want to designate the active window (click anywhere in the window).
Click the Designate Window button on the toolbar (the plus sign icon).
or
From the menus choose:
Utilities > Designate Window
Note: For Data Editor windows, the active Data Editor window determines the dataset that is used
in subsequent calculations or analyses. There is no "designated" Data Editor window. See the topic
Basic Handling of Multiple Data Sources for more information.
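The rules above for Viewer and Syntax Editor windows can be pictured as a small state model. The following is only an illustrative sketch: the WindowManager class and its method names are hypothetical and are not part of the software.

```python
class WindowManager:
    """Toy model of the active/designated rules for one window type."""

    def __init__(self):
        self.windows = []
        self.active = None
        self.designated = None

    def open(self, name):
        # A newly opened window becomes both active and designated.
        self.windows.append(name)
        self.active = name
        self.designated = name

    def click(self, name):
        # Selecting a window makes it active; the designation is unchanged.
        self.active = name

    def designate(self):
        # Designate Window acts on whichever window is currently active.
        self.designated = self.active


viewers = WindowManager()
viewers.open("Output1")
viewers.open("Output2")      # Output2 is now active and designated
viewers.click("Output1")     # Output1 is active; Output2 is still designated
print(viewers.active, viewers.designated)   # Output1 Output2
viewers.designate()          # Output1 now receives new output
print(viewers.active, viewers.designated)   # Output1 Output1
```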
1.1.19. Status Bar
The status bar at the bottom of each IBM® SPSS® Statistics window provides the following
information:
Command status. For each procedure or command that you run, a case counter indicates the
number of cases processed so far. For statistical procedures that require iterative processing, the
number of iterations is displayed.
Filter status. If you have selected a random sample or a subset of cases for analysis, the message
Filter on indicates that some type of case filtering is currently in effect and not all cases in the data
file are included in the analysis.
Weight status. The message Weight on indicates that a weight variable is being used to weight
cases for analysis.
Split File status. The message Split File on indicates that the data file has been split into separate
groups for analysis, based on the values of one or more grouping variables.
1.1.20. Dialog boxes
Most menu selections open dialog boxes. You use dialog boxes to select variables and options for
analysis.
Dialog boxes for statistical procedures and charts typically have two basic components:
Source variable list. A list of variables in the active dataset. Only variable types that are allowed
by the selected procedure are displayed in the source list. Use of short string and long string
variables is restricted in many procedures.
Target variable list(s). One or more lists indicating the variables that you have chosen for the
analysis, such as dependent and independent variable lists.
1.1.21. Variable names and variable labels in dialog box lists
You can display either variable names or variable labels in dialog box lists, and you can control the
sort order of variables in source variable lists. To control the default display attributes of variables
in source lists, choose Options on the Edit menu. See the topic General options for more
information.
You can also change the variable list display attributes within dialogs. The method for changing the
display attributes depends on the dialog:
• If the dialog provides sorting and display controls above the source variable list, use those
controls to change the display attributes.
• If the dialog does not contain sorting controls above the source variable list, right-click on any
variable in the source list and select the display attributes from the context menu.
You can display either variable names or variable labels (names are displayed for any variables
without defined labels), and you can sort the source list by file order, alphabetical order, or
measurement level. (In dialogs with sorting controls above the source variable list, the default
selection of None sorts the list in file order.)
1.1.22. Resizing dialog boxes
You can resize dialog boxes just like windows, by clicking and dragging the outside borders or
corners. For example, if you make the dialog box wider, the variable lists will also be wider.
Resized dialog box
1.1.23. Dialog box controls
There are five standard controls in most dialog boxes:
OK or Run. Runs the procedure. After you select your variables and choose any additional
specifications, click OK to run the procedure and close the dialog box. Some dialogs have a Run
button instead of the OK button.
Paste. Generates command syntax from the dialog box selections and pastes the syntax into a
syntax window. You can then customize the commands with additional features that are not
available from dialog boxes.
Reset. Deselects any variables in the selected variable list(s) and resets all specifications in the
dialog box and any subdialog boxes to the default state.
Cancel. Cancels any changes that were made in the dialog box settings since the last time it was
opened and closes the dialog box. Within a session, dialog box settings are persistent. A dialog box
retains your last set of specifications until you override them.
Help. Provides context-sensitive Help. This control takes you to a Help window that contains
information about the current dialog box.
1.1.24. Selecting variables
To select a single variable, simply select it in the source variable list and drag and drop it into the
target variable list. You can also use the arrow button to move variables from the source list to the
target lists. If there is only one target variable list, you can double-click individual variables to move
them from the source list to the target list.
You can also select multiple variables:
• To select multiple variables that are grouped together in the variable list, click the first variable
and then Shift-click the last variable in the group.
• To select multiple variables that are not grouped together in the variable list, click the first
variable, then Ctrl-click the next variable, and so on (Macintosh: Command-click).
1.1.25. Data type, measurement level, and variable list icons
The icons that are displayed next to variables in dialog box lists provide information about the
variable type and measurement level.
Measurement level     Numeric   String   Date     Time
Scale (Continuous)    (icon)    n/a      (icon)   (icon)
Ordinal               (icon)    (icon)   (icon)   (icon)
Nominal               (icon)    (icon)   (icon)   (icon)
• For more information on measurement level, see Variable measurement level.
• For more information on numeric, string, date, and time data types, see Variable type.
1.1.26. Getting information about variables in dialog boxes
Many dialogs provide the ability to find out more about the variables displayed in the variable lists.
Right-click a variable in the source or target variable list.
Choose Variable Information.
1.1.27. Command line options
You can start IBM® SPSS® Statistics from the command line and include various switches to log on
to an analytic server or to start production mode, among other options. The command is named
stats and can be run from one of the following locations.
Platform   Executable Location
Windows    <installation directory>
Mac        <installation directory>/<version>.app/Contents/MacOS
Linux      <installation directory>/bin
Available Switches and Options
stats [-server <inet:hostname:port>] [-user <name>] [-password <password>]
[-switchserver]
[-singleseat]
[-nologo]
[-production [silent|prompt] [-background]]
[-symbol <values>]
[<filename>] ...
[-help|-h]
-server <inet:hostname:port> or -server <ssl:hostname:port>. The name or IP address and
port number of the server. Windows only.
-user <name>. A valid user name. If a domain name is required, precede the user name with the
domain name and a backslash (\). Windows only.
-password <password>. The user's password.
-switchserver. Display the "Server Login" dialog box. This switch has precedence over the previous
-server, -user, and -password switches. Windows only.
-singleseat. Start application in a single seat mode.
-nologo. Start the application without displaying the splash screen.
-production [prompt|silent]. Start the application in production mode. The prompt and silent
keywords specify whether to display the dialog box that prompts for runtime values if they are
specified in the job. The prompt keyword is the default and shows the dialog box. The silent
keyword suppresses the dialog box. If you use the silent keyword, you can define the runtime
symbols with the -symbol switch. Otherwise, the default value is used. The -switchserver and
-singleseat switches are ignored when using the -production switch.
-symbol <values>. List of symbol-value pairs used in the production job. Each symbol name starts
with @. Values that contain spaces should be enclosed in quotes. Rules for including quotes or
apostrophes in string literals may vary across operating systems, but enclosing a string that
includes single quotes or apostrophes in double quotes usually works (for example, "'a quoted
value'"). The symbols must be defined in the production job using the Runtime Values tab. See the
topic Runtime values for more information.
-background. Run the production job in the background on a remote server. Your local computer
does not have to remain on and does not have to remain connected to the remote server. You can
disconnect and retrieve the results later. You must also include the -production switch and
specify the server using the -server switch.
<filename> .... List of filenames, which can include all application supported file types. Enclose a
file name with double quotes if it contains spaces.
-help|-h. Display the command help.
If the -server, -user, -password, -switchserver, and -singleseat switches are omitted, SPSS
Statistics runs in the default mode.
Examples
Note: The following examples assume that you changed directories to the executable location. The
details may vary by operating system and may require path specifications.
Starting in distributed mode using a specific server:
stats -server mystatssvr:3016 -user myuser -password mypassword
Starting in distributed mode using a specific server and a domain name:
stats -server mystatssvr:3016 -user "mydomain\myuser" -password mypassword
Starting in single seat mode:
stats -singleseat
Starting in production mode, letting SPSS Statistics prompt for runtime values:
stats C:\job1.spj -production
Starting in production mode with defined symbol-value pairs:
stats C:\job1.spj -production silent -symbol @sex male @state "North Dakota"
Starting in default mode while opening a data file and a syntax file:
stats C:\cars.sav C:\analysis.sps
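When a production job is launched from a script, it can help to assemble the command line programmatically so that values containing spaces are quoted consistently. The sketch below uses only the switches described above; the helper name production_args is hypothetical, and the file path and symbol values are the illustrative ones from the examples.

```python
import subprocess


def production_args(job_file, silent=False, symbols=None):
    """Build an argument list for running a production job with `stats`."""
    args = ["stats", job_file, "-production"]
    if silent:
        args.append("silent")
    if symbols:
        args.append("-symbol")
        for name, value in symbols.items():
            # Each symbol name starts with @, as the -symbol description requires.
            args += ["@" + name.lstrip("@"), str(value)]
    return args


args = production_args(
    r"C:\job1.spj",
    silent=True,
    symbols={"sex": "male", "state": "North Dakota"},
)
# list2cmdline applies Windows quoting rules, so only the value with a space is quoted.
print(subprocess.list2cmdline(args))
# stats C:\job1.spj -production silent -symbol @sex male @state "North Dakota"
```

Passing the list to subprocess.run(args) from the executable location would start the job; the sketch stops at building and printing the command.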
1.1.28. Basic steps in data analysis
Analyzing data with IBM® SPSS® Statistics is easy. All you have to do is:
Get your data into SPSS Statistics. You can open a previously saved SPSS Statistics data file,
you can read a spreadsheet, database, or text data file, or you can enter your data directly in the
Data Editor.
Select a procedure. Select a procedure from the menus to calculate statistics or to create a
chart.
Select the variables for the analysis. The variables in the data file are displayed in a dialog box
for the procedure.
Run the procedure and look at the results. Results are displayed in the Viewer.
1.1.29. Statistics Coach
If you are unfamiliar with IBM® SPSS® Statistics or with the available statistical procedures, the
Statistics Coach can help you get started by prompting you with simple questions, nontechnical
language, and visual examples that help you select the basic statistical and charting features that
are best suited for your data.
To use the Statistics Coach, from the menus in any SPSS Statistics window choose:
Help > Statistics Coach
The Statistics Coach covers only a selected subset of procedures. It is designed to provide general
assistance for many of the basic, commonly used statistical techniques.
1.2. Getting Help
Help is provided in many different forms:
Help menu. The Help menu in most windows provides access to the main Help system, plus
tutorials and technical reference material.
• Topics. Provides access to the Contents, Index, and Search tabs, which you can use to find
specific Help topics.
• Tutorial. Illustrated, step-by-step instructions on how to use many of the basic features. You
don't have to view the whole tutorial from start to finish. You can choose the topics you want to
view, skip around and view topics in any order, and use the index or table of contents to find
specific topics. You can also click here to start the tutorial.
• Case Studies. Hands-on examples of how to create various types of statistical analyses and
how to interpret the results. The sample data files used in the examples are also provided so that
you can work through the examples to see exactly how the results were produced. You can
choose the specific procedure(s) that you want to learn about from the table of contents or
search for relevant topics in the index. You can also click here to open the Case Studies.
• Statistics Coach. A wizard-like approach to guide you through the process of finding the
procedure that you want to use. After you make a series of selections, the Statistics Coach
opens the dialog box for the statistical, reporting, or charting procedure that meets your
selected criteria. You can also click here to open the Statistics Coach.
• Command Syntax Reference. Detailed command syntax reference information is available in
two forms: integrated into the overall Help system and as a separate document in PDF form in
the Command Syntax Reference, available from the Help menu.
• Statistical Algorithms. The algorithms used for most statistical procedures are available in two
forms: integrated into the overall Help system and as a separate document in PDF form available
on the manuals CD. For links to specific algorithms in the Help system, choose Algorithms from
the Help menu.
Context-sensitive Help. In many places in the user interface, you can get context-sensitive Help.
• Dialog box Help buttons. Most dialog boxes have a Help button that takes you directly to a
Help topic for that dialog box. The Help topic provides general information and links to related
topics.
• Pivot table context menu Help. Right-click on terms in an activated pivot table in the Viewer
and choose What's This? from the context menu to display definitions of the terms.
• Command syntax. In a command syntax window, position the cursor anywhere within a syntax
block for a command and press F1 on the keyboard. A complete command syntax chart for that
command will be displayed. Complete command syntax documentation is available from the links
in the list of related topics and from the Help Contents tab.
Other Resources
Technical Support Web site. Answers to many common problems can be found at
http://www.ibm.com/support. (The Technical Support Web site requires a login ID and password.
Information on how to obtain an ID and password is provided at the URL listed above.)
If you're a student using a student, academic or grad pack version of any IBM SPSS software
product, please see our special online Solutions for Education pages for students. If you're a
student using a university-supplied copy of the IBM SPSS software, please contact the IBM SPSS
product coordinator at your university.
SPSS Community. The SPSS community has resources for all levels of users and application
developers. Download utilities, graphics examples, new statistical modules, and articles. Visit the
SPSS community at http://www.ibm.com/developerworks/spssdevcentral.
1.2.1. Getting Help on Output Terms
To see a definition for a term in pivot table output in the Viewer:
Double-click the pivot table to activate it.
Right-click on the term that you want explained.
Choose What's This? from the context menu.
A definition of the term is displayed in a pop-up window.
1.3. Data files
Data files come in a wide variety of formats, and this software is designed to handle many of them,
including:
• Spreadsheets created with Excel and Lotus
• Database tables from many database sources, including Oracle, SQL Server, Access, dBASE, and
others
• Tab-delimited and other types of simple text files
• Data files in IBM® SPSS® Statistics format created on other operating systems
• SYSTAT data files
• SAS data files
• Stata data files
• IBM® Cognos® Business Intelligence data packages and list reports
1.3.1. Opening data files
In addition to files saved in IBM® SPSS® Statistics format, you can open Excel, SAS, Stata, tab-
delimited, and other files without converting the files to an intermediate format or entering data
definition information.
• Opening a data file makes it the active dataset. If you already have one or more open data files,
they remain open and available for subsequent use in the session. Clicking anywhere in the Data
Editor window for an open data file will make it the active dataset. See the topic Working with
Multiple Data Sources for more information.
• In distributed analysis mode using a remote server to process commands and run procedures, the
available data files, folders, and drives are dependent on what is available on or from the remote
server. The current server name is indicated at the top of the dialog box. You will not have
access to data files on your local computer unless you specify the drive as a shared device and
the folders containing your data files as shared folders. See the topic Distributed Analysis Mode
for more information.
1.3.1.1. To open data files
File > Open > Data...
In the Open Data dialog box, select the file that you want to open.
Click Open.
Optionally, you can:
• Automatically set the width of each string variable to the longest observed value for that
variable using Minimize string widths based on observed values. This is particularly useful when
reading code page data files in Unicode mode. See the topic General options for more
information.
• Read variable names from the first row of spreadsheet files.
• Specify a range of cells to read from spreadsheet files.
• Specify a worksheet within an Excel file to read (Excel 95 or later).
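The effect of minimizing string widths can be sketched in a few lines: each string variable gets the width of its longest observed value. This is an illustration only. The function name and sample data are hypothetical, and the width is counted in characters here, which may differ from how the software measures it.

```python
def minimized_widths(columns):
    """Return, per string variable, the length of its longest observed value."""
    return {
        name: max((len(value) for value in values), default=0)
        for name, values in columns.items()
    }


data = {
    "city": ["Oslo", "Rio de Janeiro", ""],
    "code": ["A1", "B22"],
}
print(minimized_widths(data))  # {'city': 14, 'code': 3}
```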
For information on reading data from databases, see Reading Database Files. For information on
reading data from text data files, see Text Wizard. For information on reading IBM® Cognos® data,
see Reading Cognos data.
1.3.1.2. Data file types
SPSS Statistics. Opens data files saved in IBM® SPSS® Statistics format and also the DOS
product SPSS/PC+.
SPSS Statistics Compressed. Opens data files saved in SPSS Statistics compressed format.
SPSS/PC+. Opens SPSS/PC+ data files. This is available only on Windows operating systems.
SYSTAT. Opens SYSTAT data files.
SPSS Statistics Portable. Opens data files saved in portable format. Saving a file in portable
format takes considerably longer than saving the file in SPSS Statistics format.
Excel. Opens Excel files.
Lotus 1-2-3. Opens data files saved in 1-2-3 format for release 3.0, 2.0, or 1A of Lotus.
SYLK. Opens data files saved in SYLK (symbolic link) format, a format used by some spreadsheet
applications.
dBASE. Opens dBASE-format files for either dBASE IV, dBASE III or III PLUS, or dBASE II. Each
case is a record. Variable and value labels and missing-value specifications are lost when you save
a file in this format.
SAS. SAS versions 6–9 and SAS transport files. Using command syntax, you can also read value
labels from a SAS format catalog file. See the topic GET SAS for more information.
Stata. Stata versions 4–8.
For information on reading data from databases, see Reading Database Files. For information on
reading data from text data files, see Text Wizard. For information on reading IBM® Cognos® data,
see Reading Cognos data.
1.3.1.3. Opening file options
Read variable names. For spreadsheets, you can read variable names from the first row of the file
or the first row of the defined range. The values are converted as necessary to create valid
variable names, including converting spaces to underscores. For information on variable naming
rules, see Variable Names.
Worksheet. Excel 95 or later files can contain multiple worksheets. By default, the Data Editor
reads the first worksheet. To read a different worksheet, select the worksheet from the drop-down
list.
Range. For spreadsheet data files, you can also read a range of cells. Use the same method for
specifying cell ranges as you would with the spreadsheet application.
1.3.1.4. Reading Excel Files
Read variable names. You can read variable names from the first row of the file or the first row of
the defined range. Values that don't conform to variable naming rules are converted to valid
variable names, and the original names are used as variable labels. For information on variable
naming rules, see Variable Names.
Worksheet. Excel files can contain multiple worksheets. By default, the Data Editor reads the first
worksheet. To read a different worksheet, select the worksheet from the drop-down list.
Range. You can also read a range of cells. Use the same method for specifying cell ranges as you
would in Excel.
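As a rough illustration of the Read variable names option, the sketch below turns a header row into variable names and keeps the original heading as the label whenever the name had to be changed. Only the space-to-underscore conversion is modeled; the full naming rules are described in Variable Names, and the function name is hypothetical.

```python
def names_from_header(header_row):
    """Return (variable_name, variable_label) pairs for a header row.

    Only the space-to-underscore rule is modeled here; the real naming
    rules cover more cases.
    """
    result = []
    for heading in header_row:
        name = heading.strip().replace(" ", "_")
        # Keep the original heading as the label when it had to be changed.
        label = heading if name != heading else None
        result.append((name, label))
    return result


print(names_from_header(["id", "Annual Income", "region"]))
# [('id', None), ('Annual_Income', 'Annual Income'), ('region', None)]
```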
1.3.1.5. Reading Excel 95 or Later Files
The following rules apply to reading Excel 95 or later files:
Data type and width. Each column is a variable. The data type and width for each variable are
determined by the data type and width in the Excel file. If the column contains more than one data
type (for example, date and numeric), the data type is set to string, and all values are read as valid
string values.
Blank cells. For numeric variables, blank cells are converted to the system-missing value, indicated
by a period. For string variables, a blank is a valid string value, and blank cells are treated as valid
string values.
Variable names. If you read the first row of the Excel file (or the first row of the specified range)
as variable names, values that don't conform to variable naming rules are converted to valid
variable names, and the original names are used as variable labels. For information on variable
naming rules, see Variable Names. If you do not read variable names from the Excel file, default
variable names are assigned.
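The data type and blank-cell rules above can be summarized in a short sketch. It assumes a column arrives as a list of numbers, strings, or None for blank cells; the function name and the use of None as a stand-in for the system-missing value are illustrative choices, not how the software represents data internally.

```python
SYSMIS = None  # stand-in for the system-missing value (displayed as a period)


def read_column(cells):
    """Return (type, values) for one spreadsheet column.

    `cells` holds numbers, strings, or None for blank cells.
    """
    kinds = {type(c) for c in cells if c is not None}
    if kinds <= {int, float}:
        # Purely numeric column: blank cells become system-missing.
        return "numeric", [SYSMIS if c is None else float(c) for c in cells]
    # Mixed or string column: every value is read as a string,
    # and a blank cell is a valid (empty) string value.
    return "string", ["" if c is None else str(c) for c in cells]


print(read_column([1, None, 2.5]))    # ('numeric', [1.0, None, 2.5])
print(read_column([1, "n/a", None]))  # ('string', ['1', 'n/a', ''])
```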
1.3.1.5.1. How to Read Excel 95 or Later Files
File > Open > Data…
In the Open File dialog box, select the type of file and the file you want to open.
Click Open.
If the first row of the spreadsheet contains column headings or labels, click Read variable names in
the Opening File Options dialog box.
If the data you want to read do not start in the first row of the spreadsheet, enter the cell range in
the Opening File Options dialog box.
If the data you want to read are not on the first sheet of the file, select the sheet you want to
read.
If all of your data are read as string data, you probably tried to read the first row as data when it
really contains headings.
1.3.1.6. Reading older Excel files and other spreadsheets
The following rules apply to reading Excel files prior to Excel 95 and other spreadsheet data:
Data type and width. The data type and width for each variable are determined by the column
width and data type of the first data cell in the column. Values of other types are converted to the
system-missing value. If the first data cell in the column is blank, the global default data type for
the spreadsheet (usually numeric) is used.
Blank cells. For numeric variables, blank cells are converted to the system-missing value, indicated
by a period. For string variables, a blank is a valid string value, and blank cells are treated as valid
string values.
Variable names. If you do not read variable names from the spreadsheet, the column letters (A, B,
C, ...) are used for variable names for Excel and Lotus files. For SYLK files and Excel files saved in
R1C1 display format, the software uses the column number preceded by the letter C for variable
names (C1, C2, C3, ...).
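The default naming schemes described above can be sketched as follows. The continuation past column Z (AA, AB, and so on) follows the usual spreadsheet convention and is an assumption here, as are the function names.

```python
def column_letters(index):
    """Spreadsheet letters for a 1-based column index: 1 -> A, 27 -> AA."""
    letters = ""
    while index > 0:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def default_names(n_columns, r1c1=False):
    """Default variable names: C1, C2, ... for SYLK/R1C1 files, else A, B, ..."""
    if r1c1:
        return [f"C{i}" for i in range(1, n_columns + 1)]
    return [column_letters(i) for i in range(1, n_columns + 1)]


print(default_names(3))             # ['A', 'B', 'C']
print(default_names(3, r1c1=True))  # ['C1', 'C2', 'C3']
print(column_letters(27))           # AA
```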
1.3.1.7. Reading dBASE files
Database files are logically very similar to IBM® SPSS® Statistics data files. The following general
rules apply to dBASE files:
• Field names are converted to valid variable names. For information on variable naming rules, see
Variable Names.
• Colons used in dBASE field names are translated to underscores.
• Records marked for deletion but not actually purged are included. The software creates a new
string variable, D_R, which contains an asterisk for cases marked for deletion.
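A minimal sketch of the last two rules, assuming each record arrives as a (deleted flag, values) pair. The function name and record layout are hypothetical, and an empty string for records that are not marked for deletion is an assumption.

```python
def convert_dbase(field_names, records):
    """Convert dBASE-style records to cases.

    `records` is a list of (deleted_flag, values) tuples.
    """
    # Colons in field names are translated to underscores.
    names = [name.replace(":", "_") for name in field_names]
    cases = []
    for deleted, values in records:
        case = dict(zip(names, values))
        # D_R holds an asterisk for records marked for deletion.
        case["D_R"] = "*" if deleted else ""
        cases.append(case)
    return cases


rows = convert_dbase(["cust:id", "name"], [(False, [1, "Ann"]), (True, [2, "Bo"])])
print(rows)
# [{'cust_id': 1, 'name': 'Ann', 'D_R': ''}, {'cust_id': 2, 'name': 'Bo', 'D_R': '*'}]
```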
1.3.1.8. Reading Stata files
The following general rules apply to Stata data files:
• Variable names. Stata variable names are converted to IBM® SPSS® Statistics variable names
in case-sensitive form. Stata variable names that are identical except for case are converted to
valid variable names by appending an underscore and a sequential letter (_A, _B, _C, ..., _Z,
_AA, _AB, ..., and so forth).
• Variable labels. Stata variable labels are converted to SPSS Statistics variable labels.
• Value labels. Stata value labels are converted to SPSS Statistics value labels, except for Stata
value labels assigned to "extended" missing values.
• Missing values. Stata "extended" missing values are converted to system-missing values.
• Date conversion. Stata date format values are converted to SPSS Statistics DATE format (d-m-y)
values. Stata "time-series" date format values (weeks, months, quarters, and so on) are
converted to simple numeric (F) format, preserving the original, internal integer value, which is
the number of weeks, months, quarters, and so on, since the start of 1960.
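Two of the rules above lend themselves to a short worked example: the sequence of suffixes used for names that differ only by case, and the meaning of a preserved "time-series" integer. The sketch below generates the suffix sequence and decodes a monthly value into a year and month. Both function names are hypothetical, and which of the conflicting names receives which suffix is not modeled.

```python
def case_suffix(n):
    """n-th suffix in the sequence _A, _B, ..., _Z, _AA, _AB, ... (n starts at 1)."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return "_" + letters


def monthly_to_year_month(value):
    """Decode a monthly value: the number of months since the start of 1960."""
    years, month_index = divmod(value, 12)
    return 1960 + years, month_index + 1


print(case_suffix(1), case_suffix(26), case_suffix(27))  # _A _Z _AA
print(monthly_to_year_month(0))    # (1960, 1)
print(monthly_to_year_month(641))  # (2013, 6)
```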
1.3.1.9. Reading Database Files
You can read data from any database format for which you have a database driver. In local analysis
mode, the necessary drivers must be installed on your local computer. In distributed analysis mode
(available with IBM® SPSS® Statistics Server), the drivers must be installed on the remote
server. See the topic Distributed Analysis Mode for more information.
Note: If you are running the Windows 64-bit version of SPSS Statistics, you cannot read Excel,
Access, or dBASE database sources, even though they may appear on the list of available database
sources. The 32-bit ODBC drivers for these products are not compatible.
1.3.1.9.1. To Read Database Files
File > Open Database > New Query...
(Or choose Edit Query to work on a saved query)
Select the data source.
If necessary (depending on the data source), select the database file and/or enter a login name,
password, and other information.
Select the table(s) and fields. For OLE DB data sources (available only on Windows operating
systems), you can only select one table.
Specify any relationships between your tables.
Optionally:
• Specify any selection criteria for your data.
• Add a prompt for user input to create a parameter query.
• Save your constructed query before running it.
You can read data from any database format for which you have a database driver. In local analysis
mode, the necessary drivers must be installed on your local computer. In distributed analysis mode
(available with the server version), the drivers must be installed on the remote server.
To add data sources in distributed analysis mode, see your system administrator.
1.3.1.9.2. Selecting a Data Source
Use the first screen of the Database Wizard to select the type of data source to read.
ODBC Data Sources
If you do not have any ODBC data sources configured, or if you want to add a new data source,
click Add ODBC Data Source.
• On Linux operating systems, this button is not available. ODBC data sources are specified in
odbc.ini, and the ODBCINI environment variable must be set to the location of that file. For
more information, see the documentation for your database drivers.
• In distributed analysis mode (available with IBM® SPSS® Statistics Server), this button is not
available. To add data sources in distributed analysis mode, see your system administrator.
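On Linux, a quick way to check the setup described above is to confirm that ODBCINI points to a readable file and to list the sections it contains. The sketch below assumes the file uses ordinary INI syntax, which is typical but driver dependent; the function name is hypothetical.

```python
import configparser
import os


def list_odbc_sources(environ=os.environ):
    """Return the section names found in the file that ODBCINI points to."""
    path = environ.get("ODBCINI")
    if not path or not os.path.isfile(path):
        raise FileNotFoundError("ODBCINI is unset or does not point to a file")
    # Interpolation is disabled so that % characters in driver options are kept as-is.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(path)
    return parser.sections()
```

Calling list_odbc_sources() with no arguments reads the current environment; passing a dictionary such as {"ODBCINI": "/path/to/odbc.ini"} makes the check easy to try against a test file.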
An ODBC data source consists of two essential pieces of information: the driver that will be used
to access the data and the location of the database you want to access. To specify data
