2. SUBJECT NAME: BTCOC303 DATA STRUCTURES
Prof. Jayanand Kamble
Weekly Teaching Hrs: L 3, T 4, P 8
Evaluation Scheme: CA 20, MSE 20, ESE 60
Credit: 4
3. CONTENTS: UNIT 1 [6 HRS]
• Introduction: Data, Data types, Data structure,
• Abstract Data Type (ADT), representation of Information,
• characteristics of algorithm, program,
• analyzing programs.
4. INTRODUCTION: DATA
• Computer data is information processed or stored by a computer.
• This information may be in the form of text documents, images, audio clips, software
programs, or other types of data.
• Computer data may be processed by the computer's CPU and is stored in files and
folders on the computer's hard disk.
5. • Data can be defined as a representation of facts, concepts, or instructions in a formalized
manner, which should be suitable for communication, interpretation, or processing by
human or electronic machine.
• Data is represented with the help of characters such as alphabets (A-Z, a-z), digits (0-9)
or special characters (+,-,/,*,<,>,= etc.)
6. • Information is organized or classified data, which has some meaningful values for the
receiver.
• Information is the processed data on which decisions and actions are based.
7. DATA TYPES
• A data type is a collection of objects and a set of operations that act on those objects.
8. DATA STRUCTURE
• A data structure is a way of storing and organizing data in a computer so that it can be
used efficiently and so that efficient algorithms can operate on it.
• A data structure should be seen as a logical concept that must address two fundamental
concerns.
• First, how the data will be stored, and
• Second, what operations will be performed on it.
9. TYPES OF DATA STRUCTURES
• There are two types of data structures:
• Primitive data structure
• Non-primitive data structure
10. PRIMITIVE DATA STRUCTURE
• The primitive data structures are primitive data types.
• The int, char, float, double, and pointer are the primitive data structures that can hold a
single value.
11. NON-PRIMITIVE DATA STRUCTURE
• The non-primitive data structure is divided into two types:
• Linear data structure
• Non-linear data structure
12. LINEAR & NON-LINEAR DATA STRUCTURE
• A linear data structure arranges its data in a sequential manner.
• Arrays, linked lists, stacks, and queues are linear data structures.
• In these data structures, each element is connected to at most one preceding and one
following element, in a linear form.
• When one element can be connected to any number ('n') of other elements, the structure
is known as a non-linear data structure.
• The best examples are trees and graphs.
• In this case, the elements are not arranged in a sequence; they are arranged hierarchically
(trees) or as a network (graphs).
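The contrast between the two kinds of structure can be sketched in Python. This is only an illustration: the names `linear`, `tree`, and `traverse_tree` are made up for this example, and storing a tree as a dictionary of child lists is just one of several possible representations.

```python
# Linear: every element has at most one predecessor and one successor.
linear = ["a", "b", "c", "d"]

# Non-linear: a tree stored as a dictionary mapping each node to its children.
# One node ("root") is connected to several other nodes.
tree = {
    "root": ["left", "right"],
    "left": ["leaf1", "leaf2"],
    "right": [],
    "leaf1": [],
    "leaf2": [],
}


def traverse_tree(nodes, start):
    """Visit a node, then each of its subtrees in turn (pre-order)."""
    visited = [start]
    for child in nodes[start]:
        visited.extend(traverse_tree(nodes, child))
    return visited


# A single pass from first to last element covers the linear structure.
linear_order = [item for item in linear]

# The tree needs a recursive walk that branches at every node.
tree_order = traverse_tree(tree, "root")
```

The list is covered by one straight pass, while the tree needs a traversal strategy (pre-order here) because each node may lead to several others.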
13. DIFFERENCE BETWEEN LINEAR AND NON-LINEAR DATA STRUCTURE
Linear Data Structure                                | Non-Linear Data Structure
Every item is related to its previous and next item. | Every item is attached to many other items.
Data is arranged in linear sequence.                 | Data is not arranged in sequence.
Data items can be traversed in a single run.         | Data cannot be traversed in a single run.
E.g. array, stack, linked list, queue.               | E.g. tree, graph.
Implementation is easy.                              | Implementation is difficult.
14. DATA STRUCTURES CAN ALSO BE CLASSIFIED AS:
• Static data structure:
• It is a type of data structure whose size is fixed at compile time.
• Therefore, the maximum size is fixed.
• Dynamic data structure:
• It is a type of data structure whose memory is allocated at run time.
• Therefore, the maximum size is flexible.
15. • A few areas in which data structures are used:
• Compiler Design,
• Operating System,
• Database Management System,
• Statistical analysis package,
• Numerical Analysis,
• Graphics,
• Artificial Intelligence,
• and Simulation.
16. • Data structures are used in the following areas: RDBMS, the network data model, and
the hierarchical data model.
• An RDBMS uses the array data structure.
• The network data model uses graphs.
• The hierarchical data model uses trees.
17. ABSTRACT DATA TYPE (ADT)
• A data type is basically a type of data that can be used in different computer programs.
• It signifies the type, such as integer or float, and the space required; for example, an integer
typically takes 4 bytes and a character takes 1 byte.
• An abstract data type is a special kind of data type whose behavior is defined by a set of
values and a set of operations.
• An abstract data type (ADT) is a data type that is organized in such a way that the
specification of the objects and the operations on the objects is separated from the
representation of the objects and the implementation of the operations.
18. • The keyword “abstract” is used because we can use these data types and perform different
operations on them, but how those operations work is completely hidden from the user.
• An ADT is built from primitive data types, but the logic of its operations is hidden.
• Some examples of ADTs are Stack, Queue, and List.
19. ADT NAT_NO
structure Natural_Number is
  objects: an ordered subrange of the integers starting at zero and ending
           at the maximum integer (INT_MAX) on the computer
  functions:
    for all x, y ∈ Nat_Number; TRUE, FALSE ∈ Boolean
    and where +, -, <, and == are the usual integer operations.
    Nat_No Zero()            ::= 0
    Boolean Is_Zero(x)       ::= if (x) return FALSE
                                 else return TRUE
    Nat_No Add(x, y)         ::= if ((x+y) <= INT_MAX) return x+y
                                 else return INT_MAX
    Boolean Equal(x, y)      ::= if (x == y) return TRUE
                                 else return FALSE
    Nat_No Successor(x)      ::= if (x == INT_MAX) return x
                                 else return x+1
    Nat_No Subtract(x, y)    ::= if (x < y) return 0
                                 else return x-y
end Natural_Number
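The specification above says what each operation returns without saying how values are stored. One possible realization is sketched below in Python. The value chosen for `INT_MAX` (2**31 - 1) is an assumption made for illustration, and the lowercase function names are simply Python-style renderings of the operations in the specification.

```python
# Assumed upper bound; the specification only says "the maximum integer
# on the computer", so this particular value is illustrative.
INT_MAX = 2**31 - 1


def zero():
    return 0


def is_zero(x):
    # TRUE exactly when x is 0, matching "if (x) return FALSE else TRUE".
    return x == 0


def add(x, y):
    # Addition saturates at INT_MAX instead of overflowing.
    return x + y if x + y <= INT_MAX else INT_MAX


def equal(x, y):
    return x == y


def successor(x):
    # INT_MAX has no larger natural number in this ADT, so it maps to itself.
    return x if x == INT_MAX else x + 1


def subtract(x, y):
    # Natural numbers cannot go below zero, so the result is clamped at 0.
    return 0 if x < y else x - y
```

A caller only needs the six operations; whether the numbers live in a machine word or some other representation stays hidden, which is the point of the ADT.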
20. SOME COMMON ADTS, WHICH HAVE PROVED
USEFUL IN A GREAT VARIETY OF APPLICATIONS, ARE
• Container
• Deque
• List
• Map
• Multimap
• Multiset
• Priority queue
• Queue
• Set
• Stack
• String
• Tree
21. • Each of these ADTs may be defined in many ways and variants, not necessarily equivalent.
• For example, a stack ADT may or may not have a count operation that tells how many
items have been pushed and not yet popped.
• This choice makes a difference not only for its clients but also for the implementation.
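To make the point about variants concrete, here is a minimal sketch of a stack ADT in Python that does include a count operation. The class name `Stack` and its method names are chosen for this example only, and backing the stack with a Python list is one implementation choice among many.

```python
class Stack:
    """A stack ADT variant that also offers a count operation."""

    def __init__(self):
        # The list is an implementation detail hidden behind the operations.
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items

    def count(self):
        # Number of items pushed and not yet popped.
        return len(self._items)


s = Stack()
s.push(10)
s.push(20)
top = s.pop()
remaining = s.count()
```

With a list underneath, `count` is cheap. An implementation built on a linked list would have to either walk the list or maintain a counter, which is how the choice of operations affects the implementation.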
22. REPRESENTATION OF INFORMATION
• Information:
• Any type of knowledge that can be exchanged. In an exchange, it is represented by data.
• An example is a string of bits (the data) accompanied by a description of how to interpret the
string of bits as numbers representing temperature observations measured in degrees Celsius.
• Data:
• A reinterpretable representation of information in a formalized manner suitable for
communication, interpretation, or processing.
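The temperature example can be sketched in Python. The encoding used here is an assumption made for illustration: each observation is stored as a 16-bit little-endian signed integer holding tenths of a degree Celsius. The name `as_temperatures` is hypothetical.

```python
import struct

# The data: six raw bytes. On their own they carry no meaning.
raw = struct.pack("<3h", 215, 223, 198)


def as_temperatures(data):
    """Interpret bytes as 16-bit little-endian integers in tenths of a degree C."""
    count = len(data) // 2
    values = struct.unpack(f"<{count}h", data)
    return [v / 10 for v in values]


# The information: the same bytes read with the agreed description.
readings = as_temperatures(raw)
```

Without the description (integer width, byte order, and unit), the six bytes could just as well be read as text or as three unrelated counters.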
23. • Data Representation refers to the form in which data is stored, processed, and
transmitted.
• Devices such as smartphones, iPods, and computers store data in digital formats that can
be handled by electronic circuitry.
• Digitization is the process of converting information, such as text, numbers, photos, or
music, into digital data that can be manipulated by electronic devices.
24. • The 0s and 1s used to represent digital data are referred to as binary digits; from this
term we get the word bit, which stands for binary digit.
• A bit is a 0 or 1 used in the digital representation of data.
• A digital file, usually referred to simply as a file, is a named collection of data that exists on
a storage medium, such as a hard disk, CD, DVD, or flash drive.
26. REPRESENTING NUMBERS
• Numeric data consists of numbers that can be used in arithmetic operations.
• Digital devices represent numeric data using the binary number system, also called base 2.
• The binary number system only has two digits: 0 and 1.
• No numeral like 2 exists in the system, so the number “two” is represented in binary as 10
(pronounced “one zero”).
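A short Python sketch of converting a number to base 2 by repeated division follows. The function name `to_binary` is made up for this example; Python's built-in `bin` and `format` do the same job and are used here only as a cross-check.

```python
def to_binary(n):
    """Return the base-2 digits of a non-negative integer as a string."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next binary digit
        n //= 2                    # move on to the next higher position
    # Remainders come out lowest digit first, so reverse them.
    return "".join(reversed(digits))


two = to_binary(2)
five = to_binary(5)
```

Here `two` is the string "10", matching the slide: there is no digit 2, so "two" needs a second position.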
28. REPRESENTING TEXT
• Character data is composed of letters, symbols, and numerals that are not used in
calculations.
• Examples of character data include your name, address, and hair color.
• Character data is commonly referred to as “text.”
• Digital devices employ several types of codes to represent character data, including ASCII,
Unicode, and their variants.
• ASCII (American Standard Code for Information Interchange, pronounced “ASKee”) requires
seven bits for each character.
• The ASCII code for an uppercase A is 1000001.
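The code for a character can be checked with a few lines of Python. The helper name `ascii_bits` is hypothetical; `ord` and `format` are standard built-ins.

```python
def ascii_bits(ch, width=7):
    """Return the ASCII code of a single character as a bit string."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    # Pad with leading zeros to the requested number of bits.
    return format(code, f"0{width}b")


seven_bit_a = ascii_bits("A")
eight_bit_a = ascii_bits("A", width=8)
```

`seven_bit_a` is "1000001" (decimal 65). Padding the same code to eight bits gives "01000001".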
29. • Extended ASCII is a superset of ASCII that uses eight bits for each character.
• For example, Extended ASCII represents the uppercase letter A as 01000001.
• Using eight bits instead of seven bits allows Extended ASCII to provide codes for 256
characters.
• Unicode (pronounced “YOU ni code”) originally used sixteen bits, providing codes for 65,536
characters; the current standard defines code points up to U+10FFFF, which is more than a million.
• This is a bonus for representing the alphabets of multiple languages.
• UTF-8 is a variable-length coding scheme for Unicode that uses one byte (eight bits) for each
ASCII character and two to four bytes for all other characters.
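As a rough check of how the size of a UTF-8 encoded character varies, the following Python sketch encodes a few sample characters and counts the resulting bytes. The helper name `utf8_length` and the choice of sample characters are made for this example only.

```python
def utf8_length(ch):
    """Number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


lengths = {
    "A": utf8_length("A"),                    # plain ASCII letter
    "e-acute": utf8_length("\u00e9"),         # Latin letter with accent
    "euro": utf8_length("\u20ac"),            # euro currency sign
    "emoji": utf8_length("\U0001F600"),       # character beyond U+FFFF
}
```

ASCII characters keep their one-byte form, so plain ASCII text is already valid UTF-8, while characters further out in the code space take two, three, or four bytes.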
30. • ASCII codes are also used for numerals that are not used in calculations, such as Social
Security numbers and phone numbers.
• Plain, unformatted text is sometimes called ASCII text and is stored in a so-called text file
with a name ending in .txt.
• On Apple devices these files are labeled “Plain Text.” In Windows, these files are labeled
“Text Document”.
• ASCII text files contain no formatting. To create documents with styles and formats,
formatting codes have to be embedded in the text.
31. • Microsoft Word produces formatted text and creates documents in DOCX format.
• Apple Pages produces documents in PAGES format.
• Adobe Acrobat produces documents in PDF format.
• The HTML markup language used for Web pages produces documents in HTML format.
32. BITS AND BYTES
• All the data stored and transmitted by digital devices is encoded as bits.
• Terminology related to bits and bytes is extensively used to describe storage capacity and
network access speed.
• The word bit, an abbreviation for binary digit, can be further abbreviated as a lowercase b.
• A group of eight bits is called a byte and is usually abbreviated as an uppercase B.
• When reading about digital devices, you’ll frequently encounter references such as 90
kilobits per second, 1.44 megabytes, 2.8 gigahertz, and 2 terabytes.
• Kilo, mega, giga, tera, and similar terms are used to quantify digital data.
33. • Use bits for data rates, such as Internet connection speeds, and movie download speeds.
• Use bytes for file sizes and storage capacities.
• 104 KB: Kilobyte (KB or Kbyte) is often used when referring to the size of small
computer files.
• 56 Kbps: Kilobit (Kb or Kbit) can be used for slow data rates, such as a 56 Kbps (kilobits
per second) dial-up connection.
• 50 Mbps: Megabit (Mb or Mbit) is used for faster data rates, such as a 50 Mbps (megabits
per second) Internet connection.
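Because file sizes are quoted in bytes and data rates in bits, estimating a transfer time needs a factor of eight. The Python sketch below works through the figures from this slide. It assumes decimal prefixes (1 kilo = 1,000 and 1 mega = 1,000,000), and the function name `transfer_seconds` is made up for the example.

```python
BITS_PER_BYTE = 8
KILO = 1_000        # assumed decimal prefix
MEGA = 1_000_000    # assumed decimal prefix


def transfer_seconds(size_bytes, rate_bits_per_second):
    """Ideal time to move a file, ignoring protocol overhead."""
    return size_bytes * BITS_PER_BYTE / rate_bits_per_second


file_size = 104 * KILO                                   # a 104 KB file
dial_up_time = transfer_seconds(file_size, 56 * KILO)    # 56 Kbps link
broadband_time = transfer_seconds(file_size, 50 * MEGA)  # 50 Mbps link
```

Under these assumptions the 104 KB file is 832,000 bits, which takes a little under 15 seconds at 56 Kbps and under two hundredths of a second at 50 Mbps.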
34. CHARACTERISTICS OF ALGORITHM
• Algorithms:
• A finite sequence of instructions, each of which has a clear meaning and can be performed
with a finite amount of effort in a finite length of time.
• An algorithm is a finite set of instructions that accomplishes a particular task.
35. STRUCTURE AND PROPERTIES OF ALGORITHM:
• An algorithm has the following structure
1. Input Step
2. Assignment Step
3. Decision Step
4. Repetitive Step
5. Output Step
36. PROPERTIES OF ALGORITHM
• An algorithm is endowed with the following properties:
1. Finiteness: An algorithm must terminate after a finite number of steps.
2. Definiteness: The steps of the algorithm must be precisely defined or unambiguously specified.
3. Generality: An algorithm must be generic enough to solve all problems of a particular class.
4. Effectiveness: The operations of the algorithm must be basic enough to be carried out with pencil
and paper. They should not be so complex as to warrant writing another algorithm for the
operation.
5. Input-Output: The algorithm must have certain initial and precise inputs, and outputs that may
be generated both at its intermediate and final steps.
37. PRACTICAL ALGORITHM DESIGN ISSUES:
• To save time (Time Complexity): A program that runs faster is a better program.
• To save space (Space Complexity): A program that saves space over a competing program
is considered desirable.
38. EFFICIENCY OF ALGORITHMS
• The performance of algorithms can be measured on the scales of time and space.
• The performance of a program is the amount of computer memory and time needed to
run a program.
• We use two approaches to determine the performance of a program.
• One is analytical and the other is experimental.
• In performance analysis we use analytical methods, while in performance measurement
we conduct experiments.
39. • Time Complexity:
• The time complexity of an algorithm or a program expresses its running time as a function
of the size of its input.
• In other words, it is the amount of computer time it needs to run to completion.
• Space Complexity:
• The space complexity of an algorithm or program expresses, as a function of the size of its
input, the space it needs to run to completion.
40. • The time complexity of an algorithm can be computed by either an empirical or a
theoretical approach.
• The empirical, or a posteriori testing, approach calls for implementing the complete
algorithms and executing them on a computer for various instances of the problem.
• The times taken by the execution of the programs for various instances of the problem
are noted and compared.
• The algorithm whose implementation yields the least time is considered the best
among the candidate algorithmic solutions.
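The empirical approach can be sketched in a few lines of Python. This is only an illustration: the helper names `measure` and `time_for_sizes` are made up, Python's built-in `sorted` stands in for the candidate algorithm, and real measurements would repeat each run several times to smooth out noise.

```python
import time


def measure(func, data):
    """Run func once on data and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    func(data)
    return time.perf_counter() - start


def time_for_sizes(func, sizes):
    """Time func on reverse-ordered inputs of several sizes."""
    results = {}
    for n in sizes:
        data = list(range(n, 0, -1))  # one problem instance of size n
        results[n] = measure(func, data)
    return results


timings = time_for_sizes(sorted, [1_000, 10_000, 100_000])
```

Running the same harness on each candidate algorithm and comparing the resulting tables is the "noted and compared" step. The figures depend on the machine, the language, and the chosen instances, which is the main limitation of this approach.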
41. ANALYZING ALGORITHMS
• Suppose M is an algorithm, and suppose n is the size of the input data.
• Clearly the complexity f(n) of M increases as n increases.
• It is usually the rate of increase of f(n) that we want to examine, which is done by comparing
f(n) with some standard functions.
• The most common computing times are
• O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3), O(2^n).
44. • The total frequency counts of the program segments A, B and C given by 1, (3n+1) and
(3n^2+3n+1) respectively are expressed as O(1), O(n) and O(n^2).
• These are referred to as the time complexities of the program segments since they are
indicative of the running times of the program segments.
• In a similar manner space complexities of a program can also be expressed in terms of
mathematical notations, which is nothing but the amount of memory they require for
their execution.
45. REASONS FOR ANALYZING ALGORITHMS
• To predict the resources that the algorithm requires
• Computational Time(CPU consumption).
• Memory Space(RAM consumption).
• Communication bandwidth consumption.
• To predict the running time of an algorithm
• Total number of primitive operations executed.
46. PROGRAM: HOW TO CREATE PROGRAMS?
• Requirements:
• Make sure you understand the information you are given (the input) and what results you are
to produce (the output).
• Try to write down a rigorous description of the input and output which covers all cases.
• Design:
• You may have several data objects (such as a maze, a polynomial, or a list of names).
• For each object there will be some basic operations to perform on it (such as print the maze,
add two polynomials, or find a name in the list.)
• Assume that these operations already exist in the form of procedures and write an algorithm
which solves the problem according to the requirements.
• Use a notation which is natural to the way you wish to describe the order of processing.
47. • Analysis:
• If you can think of another algorithm, then write it down.
• Next try to compare the two algorithms you have in hand.
• It may already be possible to tell if one will be more desirable than the other.
• If you cannot distinguish between two, choose one to work on for now.
• Refinement and coding:
• You must now choose representations for your data objects and write algorithms for each of
the operations on these objects
48. • Verification:
• Verification consists of three distinct aspects: program proving, testing and debugging.
• Each of these is an art.
• Before executing your program you should attempt to prove it is correct.
• Testing is the art of creating sample data upon which to run your program.
• If the program fails to run correctly then debugging is needed to determine what went wrong
and how to correct it.
49. EXAMPLE OF CREATING PROGRAM
• Suppose we devise a program for sorting a set of n >= 1 integers.
• One simple idea is: "from those integers which remain unsorted, find the smallest and place
it next in the sorted list."
for i:=1 to n do
begin
  examine a[i] to a[n] and suppose the smallest integer is at a[j];
  interchange a[i] and a[j];
end;
50. • There now remain two clearly defined subtasks:
• (i) to find the minimum integer and
• (ii) to interchange it with a[i].
• This latter problem can be solved by the code
• t:=a[i]; a[i]:=a[j]; a[j]:=t;
51. • The first subtask can be solved by assuming the minimum is a[i], comparing a[i] with a[i+1],
a[i+2],... and, whenever a smaller element is found, regarding it as the new minimum.
• Eventually a[n] is compared to the current minimum and we are done.
• Putting all these observations together we get the procedure sort.
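The two subtasks combine into what is commonly called selection sort. The Python version below is a sketch written for these notes, not the procedure from the referenced textbook; it uses 0-based indexing where the pseudocode uses 1-based, and the function name `selection_sort` is our own.

```python
def selection_sort(a):
    """Sort the list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n):
        # Subtask (i): assume the minimum of a[i..n-1] is at position i,
        # then scan the rest and remember any smaller element found.
        j = i
        for k in range(i + 1, n):
            if a[k] < a[j]:
                j = k
        # Subtask (ii): interchange a[i] and a[j].
        a[i], a[j] = a[j], a[i]
    return a


numbers = [29, 10, 14, 37, 13]
selection_sort(numbers)
```

After the call, `numbers` holds the values in ascending order. Each pass of the outer loop extends the sorted prefix by one element, as the quoted idea describes.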
52. ANALYZING PROGRAMS
• There are many criteria upon which we can judge a program, for instance:
• Does it do what we want it to do?
• Does it work correctly according to the original specifications of the task?
• Is there documentation which describes how to use it and how it works?
• Are procedures created in such a way that they perform logical sub-functions?
• Is the code readable?
53. • There are other criteria for judging programs which have a more direct relationship to
performance.
• These have to do with computing time and storage requirements of the algorithms.
• Performance evaluation can be loosely divided into 2 major phases:
• (a) a priori estimates and
• (b) a posteriori testing.
54. CONSIDER THE EXAMPLES
x:=x+1;
We assume that the statement x:=x+1 is not contained within any loop, either explicit or
implicit. Then its frequency count is one.
for i:=1 to n do
  x:=x+1;
Now the same statement will be executed n times.
for i:=1 to n do
  for j:=1 to n do
    x:=x+1;
It will now be executed n*n times.
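The three frequency counts (1, n, and n*n) can be confirmed by counting how often the statement runs. The Python sketch below does exactly that; the function names are made up for the example, and the counter plays the role of x in the statement x:=x+1.

```python
def count_single():
    x = 0
    x = x + 1  # not inside any loop
    return x


def count_loop(n):
    x = 0
    for i in range(n):
        x = x + 1  # executed once per iteration
    return x


def count_nested(n):
    x = 0
    for i in range(n):
        for j in range(n):
            x = x + 1  # executed n times for each of the n outer iterations
    return x


n = 4
counts = (count_single(), count_loop(n), count_nested(n))
```

For n = 4 the counts are 1, 4, and 16. As n grows they grow as O(1), O(n), and O(n^2) respectively.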
55. END OF UNIT 01
• References:
• “Fundamentals of Data Structures” by Ellis Horowitz and Sartaj Sahni.