Introduction to Data Structure

For More Visit: Https://www.ThesisScientist.com
Unit 2
Introduction to Data Structure
Introduction
A computer is a machine that manipulates information. The study of Computer Science includes the study
of how information is organized in a Computer, how it can be manipulated and how it can be utilized. Thus,
it is exceedingly important to understand the concept of information organization and manipulation.
Computer Science can be defined as the study of the data, its representation and transformation by a digital
computer.
Data Structure
Data may be organized in many different ways; the logical or mathematical model of a particular
organization of data is called a "Data Structure". The choice of a particular data model depends on two
considerations:
• It must be rich enough in structure to reflect the actual relationships of the data in the real world.
• The structure should be simple enough that one can effectively process the data when necessary.
Data Structure Operations
The particular data structure that one chooses for a given situation depends largely on the nature of specific
operations to be performed.
The following are the four major operations associated with any data structure:
i. Traversing : Accessing each record exactly once so that certain items in the record may be
processed.
ii. Searching : Finding the location of the record with a given key value, or finding the locations
of all records which satisfy one or more conditions.
iii. Inserting : Adding a new record to the structure.
iv. Deleting : Removing a record from the structure.
Primitive and Composite Data Types
Primitive data types are the basic data types of any language. On most computers these are native to the
machine's hardware.
Some primitive data types are:
• Integer
• Character
• Real Number
• Logical Number
• Pointers
Integer
A quantity representing objects that are discrete in nature can be represented by an integer.
Ex. 2, 0, -56, 89998 etc. are integers
Ex. 34.89, -3.98, abc, uii*^lkd889 are not integers
Character
Information is not always interpreted numerically. Items such as names, job, addresses etc. must also be
represented in some fashion within a computer.
Character is a literal expression of some element selected from an alphabet. A wide variety of character sets
(or alphabets) are handled by the most popular computers. Two of the largest and most widely used
character sets are EBCDIC and ASCII.
Ex. 'A', 'b', '/', '*', '5' etc. are characters
Ex. 23, 45.5, "AJHJHA", 7&^&* are not characters
Real Number
The usual method used by computers to represent real numbers is floating-point notation. There are many
varieties of floating point notation and each has individual characteristics. The key concept is that a real
number is represented by a number, called a mantissa, times a base raised to an integer power called an
exponent. The base is usually fixed, and the mantissa and exponent vary to represent different real
numbers.
Example : If the base is fixed at 10, the number 485.43 can be represented as 48543 × 10^-2.
Logical Number
A logical data item is a primitive data type that can assume the values of either TRUE or FALSE.
Pointers
A pointer is a reference to a data structure. As a pointer is a single fixed-size data item, it provides a
homogeneous method of referencing any data structure, regardless of the structure's type or complexity. In
some instances pointers permit faster insertion and deletion of elements to and from a data structure.
int x;
int *p;
Here x is of integer type, while p is of pointer-to-integer type.

Abstract Data Type
A tool that permits a designer to consider a component at an abstract level, without worrying about the
details of its implementation is called ABSTRACTION. Treating the data as objects with some predefined
operations that can be performed on those objects is called Data Abstraction.
Data Abstraction, when supported as a data type in a language, is called an Abstract Data Type (ADT). An ADT
is a useful tool for specifying the logical properties of a data type.
The term "Abstract Data Type" refers to the basic mathematical concept that defines the data type.
Ex. Integer is an abstraction of a numerical quantity which is a whole number and can be positive or
negative.
Ex. Point is an abstraction of a two dimensional figure having no length and breadth.
Atomic Type
It is a value (constant or variable) which is treated as a single entity only and cannot be subdivided.
Data of integer, real or character type cannot be broken down into simpler data types; therefore, these
are atomic types.
By contrast, the name of a person is not atomic data, because it can be broken into a first name and a last
name. The first and last names can in turn be broken into characters.
Structured Type
It is a set of values, which has two ingredients:
i) It is made up of COMPONENT elements.
ii) There is a structure, i.e. a set of rules for putting the components together.
Ex. Address is structured data : it has the components house no., street no., city etc.
Ex. Date of birth is structured data : it has day, month and year data items.
Refinement Stages
Stages from any mathematical concept to its implementation as application are as follows:
Example 1
Any Sequence → Queue → Linear or Circular → Array with Counter or Array with Flag → Airport Simulation
Introduction to Algorithm Design
Algorithm
An algorithm is a well-defined list of steps for solving a particular problem. In other words an algorithm is
a finite step by step list of well-defined instructions for solving a particular problem. The steps of the
algorithm are executed one after the other, beginning with step 1.
Algorithms and their equivalent computer programs are more easily understood if they mainly use self-
contained modules and three types of logic, or flow of control called
1. Sequence Logic or Sequential Logic
2. Selection or Conditional Logic
3. Iteration or Repetitive Logic
In "Sequence Logic" unless instructions are given to the contrary, the modules are executed in the obvious
sequence. The sequence may be presented explicitly, by means of numbered steps, or implicitly, by the
order in which the modules are written.
"Selection Logic" employs a number of conditions, which lead to a selection of one out of several
alternative modules.
"Iteration Logic" refers to either of the two types of structures involving loops. Each type begins with a
Repeat statement and is followed by a module, called the body of the loop.
Algorithm Characteristics
Algorithm must satisfy the following criteria:
Input : These are zero or more quantities which are externally supplied.
Output : At least one quantity is produced.

Definiteness : Each instruction must be clear and unambiguous.
Finiteness : If we trace out the instructions of an algorithm, then in all cases the algorithm must terminate
after a finite number of steps. A program does not necessarily satisfy this condition.
Example : Operating System.
Effectiveness : Every instruction must be sufficiently basic and also feasible.
Creation of Programs (Algorithms)
Program or algorithm design is broken up into five phases :
• Requirements : Understanding the information given and what result is asked for.
• Design : Description of the order of processing through a proper notation.
• Analysis
• Refinement and Coding
• Verification
Algorithm Design Approaches
An algorithm consists of components, which have components of their own; indeed, an algorithm is a
hierarchy of components, the highest-level component corresponding to the total algorithm.
To design such hierarchies there are two possible approaches:
i. Top Down
ii. Bottom up
A Top Down design approach starts by identifying the major components of the system and decomposing them
into their lower-level components, iterating until the desired level of detail is achieved.
A Bottom up design approach starts by designing the most basic or primitive components and proceeds to
higher-level components that use these lower-level components.
For the bottom up approach to be successful, we must have a good notion of the top, where the design
should be heading.
For detailed example see Unit-1.
Program Analysis
One goal of 'Data Structures' is to develop skills for making evaluative judgements about programs. There
are many criteria upon which we can judge a program, for instance:
• Does it do what we want it to do?
• Does it work correctly according to the original specifications of the task?
• Is there proper documentation, which describes how to use it and how it works?
• Are subroutines created in such a way that they perform logical subfunctions?
• Is the code readable?
The above criteria are vitally important when it comes to writing software, especially for large
systems.
Time and Space Complexity of Algorithms
The analysis of algorithms is a major task in Computer Science. In order to compare algorithms, we must
have some criteria to measure the efficiency of our algorithms.
The total time needed by any algorithm in execution and memory space required by the algorithm are the
two main measures for the efficiency of any algorithm.
To determine the execution time the following information is required:
• The machine we are executing on.
• Its machine language instruction set.
• The time required by each machine instruction.
• The translation a compiler will make from the source to the machine language.
Even if we chose a real machine and an existing compiler, or designed a hypothetical machine (with imaginary
execution times), there would still be the problem of the compiler, which could vary from machine to
machine. All these considerations lead us to limit our goals.
Another approach is called the Frequency Count approach. In this approach the time is measured by
counting the number of key operations. Key operations of an algorithm are the operations or steps that
cannot be excluded from the algorithm. We count only key operations because the time for the other operations
is much less than, or at most proportional to, the time for the key operations.
The space is measured by counting the maximum amount of memory needed by the algorithm.
Consider the following examples :
Example 1
Consider a program segment :
for(i=1; i<=n; i++)
    for(j=1; j<=n; j++)
        x++;
Frequency Count = n.n = n^2
Order of complexity = O(n^2)
Array

Linear Data Structures
A "data structure" which displays the relationship of adjacency between elements is said to be "linear".
This type of data structure is used to represent one of the simplest and most commonly found data object
i.e. ordered or linear list.
Examples are Months of the Year
[January, February, March.............., December]
or the values in a card deck.
[2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King, Ace]
or the group of friends
[Deepak, Narendra, Abhishek, Ashish, Saurabh].
If we consider an ordered list more abstractly, we say that it is either empty or it can be written as
[x1, x2, x3, ..., xn],
where the xi's are atoms from some set A.
Operations on Linear Data Structures
There are a variety of operations that can be performed on these data structures. These operations include :
i) Find the length of the list, n.
ii) Read the list from left to right (or right to left).
iii) Retrieve the ith element, 1 ≤ i ≤ n.
iv) Store a new value into the ith position, 1 ≤ i ≤ n.
v) Insert a new element at position i, 1 ≤ i ≤ n, causing elements numbered i, i+1, ..., n to become
numbered i+1, i+2, ..., n+1.
vi) Delete the element at position i, 1 ≤ i ≤ n, causing elements numbered i+1, ..., n to become numbered i,
i+1, ..., n-1.
Arrays
The simplest data structure that makes use of computed addresses to locate its elements is the One
Dimensional Array. Normally a number of (contiguous) memory locations are sequentially allocated to the
Array. Assuming that each element requires one word of memory, an n element Array will occupy n
consecutive words in memory. Array size is always fixed; and hence requires a fixed number of memory
locations.
Categorization
Arrays may be categorized as follows:
• One Dimensional Arrays
• Two Dimensional Arrays
• Three Dimensional Arrays
• Multi Dimensional Arrays
Each type of Array has a different kind of memory representation.
Memory Representation of One-Dimensional Array
A one dimensional array is a list of finite number n of Homogeneous data elements (i.e. data elements of
the same type) such that:
i) The elements of the Array are referenced respectively by an index set consisting of n consecutive
numbers.
ii) The elements of the Array are stored respectively in successive memory locations.
The number n of elements is called the length or size of the Array.
If not explicitly stated, we will assume the index set consists of the integers 1, 2, ...n. In general, the length
or the number of data elements of the Array can be obtained from the index set by the formula
Length = UB-LB+1          (2.1)
where UB is the largest index, called the upper bound and LB is the smallest index, called the lower bound,
of the Array. Note that length = UB when LB=1.
The elements of an Array A may be denoted by the subscript notation A[1], A[2], A[3],... A[n].
In general a linear Array A with a subscript lower bound of "one" can be represented pictorially as in Fig.
3.1 given below.
Figure 3.1: Linear array A, showing the lower bound LB and an element A[I]
If s words are allocated for each element or node (i.e. the size of the data type of the Array's elements is s), then the total
memory space allocated to the array is given by :
Memory Space Required = (UB-LB+1)*s          (2.2)
If we want to calculate the memory space required for the elements that precede the ith element, i.e. A[LB] through A[i-1], a slight
modification of formula (2.2) is needed:
Space Required = (i-1-LB+1)*s
or Space Required = (i-LB)*s          (2.3)
If the Address of A[LB] is α, then the Address of the ith element of the Array is given by:
Address of A[i] = α+(i-LB)*s          (2.4)
Now we can write a program for addressing the ith element of a single dimensional Array. We can take α, i,
LB and s as input from the user and output the address of A[i].
Example 1
Consider a One-Dimensional Array of Fig. 3.2 given below. It has 5 elements. Each element takes 4 bytes
to store.
Suppose the address of a1 = 1000, and we want to calculate the address of a4. Then from formula (2.4) we
have,
α = 1000,
i = 4,
LB = 1,
s = 4
Address of a4 = 1000+(4-1)*4
Address of a4 = 1012.
Figure 3.2: One-dimensional array with elements a1, a2, a3, a4, a5
Memory Representation of Two Dimensional Arrays
Even though Multidimensional Arrays are provided as a standard data object in most of the high level
languages, it is interesting to see how they are represented in memory. Memory may be regarded as one
dimensional with words numbered from 1 to m. So we are concerned with representing n dimensional
Array in a one dimensional memory.
A two dimensional 'm x n' Array A is a collection of m.n data elements such that each element is specified
by a pair of integers (such as j, k), called subscripts, with the property that
1 ≤ j ≤ m and 1 ≤ k ≤ n.
The element of A with first subscript j and second subscript k will be denoted by A_{j,k} or A[j, k]. Two
dimensional arrays are called matrices in mathematics and tables in business applications.

Example 2
                 Columns
            1          2          3
Rows   1  A[1, 1]   A[1, 2]   A[1, 3]
       2  A[2, 1]   A[2, 2]   A[2, 3]
       3  A[3, 1]   A[3, 2]   A[3, 3]
Two Dimensional 3x3 Array A
A programming language stores the array in one of two ways :
i) Row Major Order
ii) Column Major Order
In Row Major Order the elements of the 1st row are stored first in linear order, then the elements of the next
row, and so on.
In Column Major Order the elements of the 1st column are stored first in linear order, then the elements of the next
column, and so on.
When the above matrix is stored in memory in Row Major Order, the representation is
as shown below:
Row 1: (1, 1) (1, 2) (1, 3)
Row 2: (2, 1) (2, 2) (2, 3)
Row 3: (3, 1) (3, 2) (3, 3)
Figure 3.3
Representation with Column Major form will be:

Column 1: (1, 1) (2, 1) (3, 1)
Column 2: (1, 2) (2, 2) (3, 2)
Column 3: (1, 3) (2, 3) (3, 3)
Figure 3.4
The number of elements in any two dimensional Array is given by :
No. of elements = (UB1-LB1+1) * (UB2-LB2+1)          (2.5)
where UB1 and LB1 are the upper and lower bounds of the 1st dimension, and UB2 and LB2
are the upper and lower bounds of the 2nd dimension.
Figure 3.5: Two dimensional array with rows LB1 to UB1 and columns LB2 to UB2; the ith row runs from A[i, LB2] to A[i, j]
If we want to calculate the number of elements up to and including the 1st row, then
No. of elements = (UB2-LB2+1) * (1-1+1)
or No. of elements = UB2-LB2+1          (2.6)
