4. 6/17/13 Visual Binning
publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/advanced/print.jsp?topic=/com.ibm.spss.statistics.help/idh_bander_gating.htm&pageBreak=true&breakConfi… 4/9
2. Binning Variables
The Visual Binning main dialog box provides the following information for the scanned variables:
Scanned Variable List. Displays the variables you selected in the initial dialog box. You can sort
the list by measurement level (scale or ordinal) or by variable label or name by clicking on the
column headings.
Cases Scanned. Indicates the number of cases scanned. All scanned cases without user-
missing or system-missing values for the selected variable are used to generate the distribution
of values used in calculations in Visual Binning, including the histogram displayed in the main
dialog box and cutpoints based on percentiles or standard deviation units.
Missing Values. Indicates the number of scanned cases with user-missing or system-missing
values. Missing values are not included in any of the binned categories. See the topic User-
Missing Values in Visual Binning for more information.
Current Variable. The name and variable label (if any) for the currently selected variable that
will be used as the basis for the new, binned variable.
Binned Variable. Name and optional variable label for the new, binned variable.
• Name. You must enter a name for the new variable. Variable names must be unique and must
follow variable naming rules. See the topic Variable names for more information.
• Label. You can enter a descriptive variable label up to 255 characters long. The default
variable label is the variable label (if any) or variable name of the source variable with (Binned)
appended to the end of the label.
Minimum and Maximum. Minimum and maximum values for the currently selected variable,
based on the scanned cases and not including values defined as user-missing.
Nonmissing Values. The histogram displays the distribution of nonmissing values for the
currently selected variable, based on the scanned cases.
• After you define bins for the new variable, vertical lines on the histogram are displayed to
indicate the cutpoints that define bins.
• You can click and drag the cutpoint lines to different locations on the histogram, changing the
bin ranges.
• You can remove bins by dragging cutpoint lines off the histogram.
Note: The histogram (displaying nonmissing values), the minimum, and the maximum are based on
the scanned values. If you do not include all cases in the scan, the true distribution may not be
accurately reflected, particularly if the data file has been sorted by the selected variable. If you
scan zero cases, no information about the distribution of values is available.
Grid. Displays the values that define the upper endpoints of each bin and optional value labels
for each bin.
• Value. The values that define the upper endpoints of each bin. You can enter values or use
Make Cutpoints to automatically create bins based on selected criteria. By default, a cutpoint
with a value of HIGH is automatically included. This bin will contain any nonmissing values
above the other cutpoints. The bin defined by the lowest cutpoint will include all nonmissing
values lower than or equal to that value (or simply lower than that value, depending on how
you define upper endpoints).
• Label. Optional, descriptive labels for the values of the new, binned variable. Since the values
6. 6/17/13 Visual Binning
publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/advanced/print.jsp?topic=/com.ibm.spss.statistics.help/idh_bander_gating.htm&pageBreak=true&breakConfi… 6/9
3. Automatically Generating Binned Categories
The Make Cutpoints dialog box allows you to auto-generate binned categories based on selected
criteria.
To Use the Make Cutpoints Dialog Box
From the menus in the Data Editor window choose:
Transform > Visual Binning...
Select the numeric scale and/or ordinal variables for which you want to create new categorical
(binned) variables.
Click Continue.
Select (click) a variable in the Scanned Variable List.
Click Make Cutpoints.
Select the criteria for generating cutpoints that will define the binned categories.
Click Apply.
Note: The Make Cutpoints dialog box is not available if you scanned zero cases.
Equal Width Intervals. Generates binned categories of equal width (for example, 1–10, 11–20,
and 21–30) based on any two of the following three criteria:
• First Cutpoint Location. The value that defines the upper end of the lowest binned category
(for example, a value of 10 indicates a range that includes all values up to 10).
• Number of Cutpoints. The number of binned categories is the number of cutpoints plus one.
For example, 9 cutpoints generate 10 binned categories.
• Width. The width of each interval. For example, a value of 10 would bin age in years into 10-
year intervals.
Equal Percentiles Based on Scanned Cases. Generates binned categories with an equal
number of cases in each bin (using the aempirical algorithm for percentiles), based on either
of the following criteria:
• Number of Cutpoints. The number of binned categories is the number of cutpoints plus one.
For example, three cutpoints generate four percentile bins (quartiles), each containing 25% of
the cases.
• Width (%). Width of each interval, expressed as a percentage of the total number of cases.
For example, a value of 33.3 would produce three binned categories (two cutpoints), each
containing 33.3% of the cases.
If the source variable contains a relatively small number of distinct values or a large number of
cases with the same value, you may get fewer bins than requested. If there are multiple
identical values at a cutpoint, they will all go into the same interval; so the actual percentages
may not always be exactly equal.
Cutpoints at Mean and Selected Standard Deviations Based on Scanned Cases. Generates
binned categories based on the values of the mean and standard deviation of the distribution of
the variable.