Data Structures 2004
1. Lecture Notes, Data Structures Sanjay Goel, JIIT, 2004
Data Structures (2004-2005)
Lecture 1 (2 hr.) (22.07.04) (Class strength : Approx. 230-240 students)
1. Question : What is Engineering?
2. Question : Is Engineering an Applied Science? Is it enough to say so?
3. Question : What is older, Engineering or Science?
4. Feedback forms on lecture format.
5. Design and execute an algorithm for collection of these forms.
6. Macro Analysis : Nobody raised the hand when asked if anybody has rated the textbook
explanation as the most important property of the lecture. Announced that I will not follow or
explain any text book in the class. Instead, I will do Engineering and create something new with
the participation of approx. 300 students.
7. Announced programming assignments.
8. Course Objectives and description.
9. Question : Give examples of clear and precise expressions.
10. Tools of expressions: Natural Language, Examples, Diagrams, Graphs, Mathematical expressions,
formal expressions, word definitions, Counter examples. Engineers have to create new tools for
enhancing the clarity and precision of expression. We will create some for our use in this course and
perhaps also for future.
11. In class exercise : Everybody write one example, some sharing and critique by Instructor.
Critique the example of your buddy
12. Design an algorithm : Everybody think of a number. Design and execute an algorithm for adding
these numbers. Identify the sequential and parallel phases.
13. In class exercise : Everybody write this algorithm clearly and precisely.
Critique your buddy’s algorithm for its clarity and precision.
Continue at home. Use examples, counter examples, diagrams and/or
mathematical formal expressions for bringing clarity and precision.
14. Question : What does creativity require?
15. Question : What kind of material/media has been used by man for expressing creativity?
16. Question : What is our material for creativity? Computer memory.
17. In class exercise : Think of any existing or imaginary thing. Think how you can represent it in
computer memory. Write it down. Show it to your buddy. Critique your buddy’s work.
18. A structured representation of data for computer memory is data structure.
19. What is Computer Science? Some Definitions.
Lecture 2 (1 hr.) (27.07.04) (Class strength : Approx. 200 students)
1. Write down, what new thing, if any, did you learn in the last class?
2. Write down, what new Question came to your mind, if any, for which you have not yet found the
answer?
3. Share your thoughts with your buddy.
4. Think of some real existing and imaginary thing (object of interest, OI), How to represent it in
computer memory? Can you write a clear and precise representation?
5. Question : How to represent __________ in the computer memory? Identify your OI to fill in the
blank.
6. Write five different OIs for this template.
7. Pick up any one of these and start representing it for computer memory.
8. The key to finding the answer to a question is to ask yourself another, more pinpointed question.
9. Question : What question should you ask to answer these questions on “How to represent ….”?
10. Question : What is the difference between something and its representation?
11. Representation represents only some aspect of the real for some specific purpose.
12. Question : What is the purpose of representation for each chosen OI (real or imaginary) by you?
13. Question : What aspect of chosen OI do you need to represent in order to meet the chosen
purpose?
14. Refer to Lect_1_01 and Lect_1_02 of 2002.
Lecture 3 (2 hr.) (29.07.04) (Class strength : Approx. 200 students)
1. Firm up your group of six.
2. Group exercise : Share your OIs. Pick up an OI for further work.
3. Develop the purpose for its representation. Class sharing.
4. Make it clear and precise with examples, counter examples, graphs, drawings, formulas,
scenarios and so on.
5. Re-examine : Is engineering an applied science? That is a scientist’s perspective. The engineer’s
perspective is that engineering uses science as a tool, e.g. carpentry is not applied hammering; instead,
carpentry uses hammers.
6. Find out those scientific concepts that were discovered as a result of solving some engineering
problem.
7. Keep the purely numerical computational problems out for this course. Work for non-
numeric problems or mixed problems.
8. Generic checklist of computational processes (like the Rasa’s in a drama) and try to have more
than one of these in your story. Enrich your story by checking if your design has got such
element in it or not.
9. Write individual design stories, clearly and precisely.
10. Refer Lect_2_01 of 2002.
Lecture 4 (1 hr.) (03.08.04) (Class strength : Approx. 200 students)
1. WAP to shift up the elements of an array after deleting one element from in between. Make
the last element as blank.
2. WAP to shift up the elements of a structure after deleting one element from in between.
3. Draw a lesson from the above two examples. A well structured data can help in one or more of
following areas :
i. Run time memory saving,
ii. Run time time saving
iii. Cleaner and simpler processing logic.
4. WAP to circularly shift up the elements of an array after deleting one element from in
between.
5. WAP to shift up the elements of an array, where each element is a structure, after deleting
one array element from in between. Make fields of the last array element as blank.
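The first exercise of this lecture can be sketched as follows. This is a minimal sketch, not a model answer: it assumes "shift up" means moving later elements towards lower indices, and it uses a hypothetical sentinel `kBlank` to mark the blank last slot, which only works if valid data never takes that value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "blank" marker; assumes legal data values never equal it.
const int kBlank = -1;

// Deletes the element at `index` by copying every later element one slot
// towards the front, then marks the last slot as blank. The array keeps
// its length; only the contents move.
void deleteAndShiftUp(std::vector<int>& a, std::size_t index) {
    assert(index < a.size());  // precondition: index refers to an existing slot
    for (std::size_t i = index; i + 1 < a.size(); ++i) {
        a[i] = a[i + 1];
    }
    a.back() = kBlank;
}
```

Deleting index 1 from {10, 20, 30, 40, 50} leaves {10, 30, 40, 50, kBlank}. The loop touches every element after the deleted one, so the cost grows with the distance from the deleted slot to the end of the array.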
Lecture 5 (2 hr.) (05.08.04) (Class strength : Approx. 120 students)
1. Design your memory representation for an OI (single variable polynomial). Purpose of
representation is to evaluate and display a single variable polynomial. Convert your
representation into a data structure. Use this data structure to WAP for display and evaluate
any user inputted single variable polynomial.
2. Evaluate different options of these representations (data structures) in terms of memory
requirement, run time and also simplicity of logic (of algorithm) design.
3. A well structured data can help in one or more of following areas :
i. Run time memory saving,
ii. Run time time saving
iii. Cleaner and simpler processing logic.
4. Use this data structure for WAP for adding, subtracting, multiplying, differentiating and
integrating single variable polynomials.
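One of several possible representations for the polynomial exercise is a dense coefficient array. The sketch below assumes that choice (the names `Polynomial`, `evaluate`, `differentiate` and `toString` are illustrative, not from the course material); a sparse list of (coefficient, exponent) pairs is an equally valid option with different memory and run time trade-offs.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Dense representation (an assumption): coeff[k] is the coefficient of x^k.
struct Polynomial {
    std::vector<double> coeff;
};

// Evaluates p at x with Horner's rule: one multiply and one add per term.
double evaluate(const Polynomial& p, double x) {
    double result = 0.0;
    for (std::size_t k = p.coeff.size(); k-- > 0;) {
        result = result * x + p.coeff[k];
    }
    return result;
}

// Term-by-term derivative: d/dx (c * x^k) = k * c * x^(k-1).
Polynomial differentiate(const Polynomial& p) {
    Polynomial d;
    for (std::size_t k = 1; k < p.coeff.size(); ++k) {
        d.coeff.push_back(static_cast<double>(k) * p.coeff[k]);
    }
    return d;
}

// Displays every term, highest power first, e.g. "3x^2 + 2x^1 + 1x^0".
std::string toString(const Polynomial& p) {
    if (p.coeff.empty()) return "0";
    std::ostringstream out;
    for (std::size_t k = p.coeff.size(); k-- > 0;) {
        out << p.coeff[k] << "x^" << k;
        if (k > 0) out << " + ";
    }
    return out.str();
}
```

With this layout, a polynomial such as x^100 + 1 wastes 99 slots on zero coefficients, which is the kind of trade-off item 2 asks you to evaluate.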
Lecture 6 (1 hr.) (10.08.04) (Class strength : Approx. 220 students)
1. WAP for deleting an element from a linked list of integers.
2. Writing good programs :
i. Give useful names to all variables
ii. Prefix variable names with type symbols, e.g. all integer variables with i_, all
character variables with c_ and so on.
iii. Initialize all variables
iv. Write readable programs rather than tricky programs
v. Consider pairing every malloc with one corresponding free.
3. Array elements are directly addressable, i.e. it takes the same amount of time to access any
element of an array, whereas access time varies in a linked list.
4. Compare the run time costs of element deletion from an array with element deletion from a
linked list.
5. You can use an additional variable i.e. counter of elements to keep the count of elements in
the array and linked list. You can also consider initialising array with values out of the
permitted range e.g. if the values are expected to be within –20 to 20, you can use –100 or
100 to indicate empty elements. Zero in this case is a legal value within the range and does not
indicate emptiness.
6. Design a data storage scheme for storing polynomial functions and series of numbers. Design
an algorithm to test if the given series is a Taylor series approximation (at x) for a given
polynomial function and x0.
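Taylor series approx. of a given function f about x0, truncated after N+1 terms, is :
$$f(x) \approx \sum_{n=0}^{N} \frac{f^{(n)}(x_0)}{n!}\,(x - x_0)^n$$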
7. When comparing real numbers for equality, do not use ==; instead, check whether abs(n1-n2) <= e1,
where e1 is a very small real number whose value depends on the application.
8. Design a dynamic data structure for storing a randomly ordered collection of single
variable polynomial functions and another static data structure for storing randomly-
ordered collection of real number-sequences. The records in both the collections also
should have additional provision for storing indices of all the matching entries (if any)
in another collection. One entry in any collection may match with none, one or many
entries in another. A number-sequence is declared as matching with a polynomial, if
all the numbers in the sequence match with corresponding terms of the Taylor series
expansion of a function for given x and x0 within the limits of a user-defined
‘permitted-mismatch’. Design an algorithm for updating matching indices in both the
collections for a given user-defined input of ‘permitted-mismatch’, x and x0.
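The matching test at the heart of items 6 to 8 can be sketched as below. This is one possible reading of the assignment, under stated assumptions: the polynomial is stored as a dense coefficient array, the n-th number of a sequence is compared with the n-th Taylor term f^(n)(x0)/n! * (x - x0)^n, and `eps` plays the role of the user-defined ‘permitted-mismatch’. All function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Tolerant comparison of reals, used instead of ==.
bool nearlyEqual(double a, double b, double eps) {
    return std::fabs(a - b) <= eps;
}

// Horner evaluation; coeff[k] multiplies x^k. An empty polynomial is 0.
double evalPoly(const std::vector<double>& coeff, double x) {
    double r = 0.0;
    for (std::size_t k = coeff.size(); k-- > 0;) r = r * x + coeff[k];
    return r;
}

// Returns the first `count` Taylor terms of the polynomial about x0,
// evaluated at x. `coeff` is taken by value because it is differentiated
// repeatedly inside the loop.
std::vector<double> taylorTerms(std::vector<double> coeff, double x0,
                                double x, std::size_t count) {
    std::vector<double> terms;
    double factorial = 1.0;  // n!
    double power = 1.0;      // (x - x0)^n
    for (std::size_t n = 0; n < count; ++n) {
        if (n > 0) {
            factorial *= static_cast<double>(n);
            power *= (x - x0);
        }
        terms.push_back(evalPoly(coeff, x0) / factorial * power);
        std::vector<double> derivative;  // coefficients of the next derivative
        for (std::size_t k = 1; k < coeff.size(); ++k) {
            derivative.push_back(static_cast<double>(k) * coeff[k]);
        }
        coeff = derivative;
    }
    return terms;
}

// True if every number in `seq` matches the corresponding Taylor term.
bool matchesTaylor(const std::vector<double>& seq,
                   const std::vector<double>& coeff,
                   double x0, double x, double eps) {
    std::vector<double> terms = taylorTerms(coeff, x0, x, seq.size());
    for (std::size_t i = 0; i < seq.size(); ++i) {
        if (!nearlyEqual(seq[i], terms[i], eps)) return false;
    }
    return true;
}
```

For f(x) = 1 + 2x + 3x^2 with x0 = 1 and x = 3, the terms are 6, 16 and 12, which sum to f(3) = 34. Updating the matching indices in both collections then amounts to calling `matchesTaylor` for every (polynomial, sequence) pair.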
Lecture 7 (2 hr.) (12.08.04) (Class strength : Approx. 200 students)
Post class Assignment:
1. Modify the last assignment of previous lecture to take input from text file and also output the
result into a file.
2. WAP to shift up the elements of an array stored in a file after deleting one element from in
between.
3. WAP to circularly shift up the elements of an array stored in a file after deleting one element
from in between.
Lecture 8 (1 hr.) (17.08.04) (Class strength : Approx. 230 students)
1. Write an algorithm for reversing the order of elements in an array.
2. What does the following algorithm do? Illustrate your answer with example.
void WhatdoIdo (NodePtr p)
{
    if (p)
    {
        WhatdoIdo (p -> next);
        cout << p -> data;
    }
}
2.1 What if the order of statements in the inner block is exchanged?
2.2 Transform this algorithm into an equivalent non-recursive algorithm.
What are your options ? Evaluate the options.
3. Option 1: Do not use any extra memory for doing the task
Option 2: Use an Array to copy the elements and then print the elements of array in reverse
order as array elements can be easily printed in any order.
Option 3: Use a linked list to copy the elements in reverse order , insert the new node in front
of the existing list (rather than at the back).
4. WAP based on each option.
5. As software designers, we use additional memory for designing more efficient algorithms.
This additional memory can be used in various forms. Each form has some merits and
demerits.
6. Unless specifically permitted or required, changing the input (form or the content) is not
acceptable in most cases.
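Options 2 and 3 from this lecture can be sketched as below. Both use extra memory and leave the input list unchanged, in line with item 6. The `Node` layout and the function names are assumptions made for illustration; the functions return the values rather than printing them so that the result is easy to inspect.

```cpp
#include <vector>

// Assumed node layout for a singly linked list of integers.
struct Node {
    int data;
    Node* next;
};

// Option 2: copy the values into an array, then read the array backwards.
std::vector<int> valuesInReverse(const Node* head) {
    std::vector<int> values;
    for (const Node* p = head; p != nullptr; p = p->next) {
        values.push_back(p->data);
    }
    return std::vector<int>(values.rbegin(), values.rend());
}

// Option 3: build a second list, inserting each new node at the front,
// so the copy ends up in reverse order. The caller owns the copy.
Node* reversedCopy(const Node* head) {
    Node* copy = nullptr;
    for (const Node* p = head; p != nullptr; p = p->next) {
        copy = new Node{p->data, copy};
    }
    return copy;
}

// Releases a list created with new, one delete per node.
void freeList(Node* head) {
    while (head != nullptr) {
        Node* next = head->next;
        delete head;
        head = next;
    }
}
```

Both options cost memory proportional to the list length. The array version allows printing in any order afterwards, while the linked version needs no contiguous block of memory.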
Lecture 9 (2 hr.) (19.08.04) (Class strength : Approx. 180 students)
1. Pick one of the following or any other similar popular family game and propose a design for a
software version
Snakes and Ladders
Ludo
Chess
Any Cards based Game
Make-a-word
Rubik's Cube
You can design your software either on the model of ‘play with computer’ or ‘play through
computer’
2. A design team has conceived the following initial specifications of a search engine for a
large company’s internal Digital Library
Only specially authorised users can upload new documents or new versions of old document.
All employees can look at the documents. Information systems department will create, update
and maintain a list of keywords for faster search facility. The search engine users can also
search by entering any word through the keyboard. Searched documents are to be listed as
follows:
Case A, Faster search on a listed keyword : As per the frequency of occurrence of the word
i.e. the documents having higher “density” of the chosen keyword will be listed before the
documents having lower density, where, density[k, d] =(Occurrence count of the word k in
d)/(word count in d)
Case B, Search by entering a word through the keyboard : As per the frequency of usage of
a document, where usage is defined as number of times a document is opened by users
through the search engine.
Draw a design diagram (Concept map) for this problem. Refer to ADS (DSII 2002-03) lecture
#6 slide 6 to 27 and Lecture #9 slide no 12-13.
2. Progressive development of software. Programming is only a small part of the process. All
engineering design requires and creates drawings. It is possible to create software through its
architectural drawings. This makes it a much simpler process that is easy to follow and monitor.
Stage 1. Write a design story.
Stage 2. Prepare a checklist of nouns and verbs in the design story.
Stage 3. Draw 1st level Concept map of software. Draw a graph of Nouns and verbs.
Stage 4. Draw 2nd level Concept map of software. Put verbs inside oval shape and nouns
in rectangular shaped boxes.
Stage 5. Draw 3rd level Concept map of software. Put examples of data in the noun
boxes.
Stage 6. Draw 4th level Concept map of software. Identify data tanks (rectangular boxes
with multiple record in the same format but with different values) and put them
inside double lined rectangular boxes. Suitably label data tanks.
Iterate through above cycle and critique in a group. Check for inconsistencies and
incompleteness at each iterative stage and update your concept map.
Coding will follow in later stages.
4. While flowchart gives a process centric view of the software, Concept map gives a data
centric view.
5. The data tanks will be realised through data structures and the verbs will be realised through
algorithms.
6. Draw the concept map for your chosen family game as per first exercise of the day.
7. Draw the concept map for your design stories of your earlier assignments.
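The Case A ordering rule from the digital library specification in this lecture, density[k, d] = (occurrence count of k in d)/(word count in d), can be sketched as below. This is only an illustration under simplifying assumptions: a document is a plain string, words are separated by whitespace, and matching is exact and case sensitive. The `Document` type and function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Assumed minimal document record: a name plus its full text.
struct Document {
    std::string name;
    std::string text;
};

// density[k, d] = (occurrences of keyword k in d) / (word count in d).
double density(const std::string& keyword, const Document& d) {
    std::istringstream words(d.text);
    std::string w;
    std::size_t total = 0;
    std::size_t hits = 0;
    while (words >> w) {
        ++total;
        if (w == keyword) ++hits;
    }
    return total == 0 ? 0.0 : static_cast<double>(hits) / static_cast<double>(total);
}

// Case A: document names listed from highest to lowest keyword density.
std::vector<std::string> rankByDensity(const std::string& keyword,
                                       std::vector<Document> docs) {
    std::stable_sort(docs.begin(), docs.end(),
                     [&keyword](const Document& a, const Document& b) {
                         return density(keyword, a) > density(keyword, b);
                     });
    std::vector<std::string> names;
    for (const Document& d : docs) names.push_back(d.name);
    return names;
}
```

This version rescans every document on each comparison, which is exactly the cost that a maintained keyword list (an ancillary data tank) is meant to avoid.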
Lecture 10 (1 hr.) (24.08.04) (Class strength : Approx. 180 students)
1. Elements of a design story:
- Plot / Theme
- Characters with defined roles : (Anything that has a Stimulus-Response behaviour is
a character. Identify the stimulus-response pairs for your characters)
- Events
- Sequence
- Time based events/objects
- Interactive events/objects
- Characters take Actions
- Actions have consequences
- Consequences are actions by other characters.
2. Design
- Regard engineering design practice as a process of "story telling".
- Design is story telling
Stories Design Concept Prototype ……… Product
3. Some Tips on story writing
- Experience lots of stories
- Start with the whole and move to the parts: Present the big picture within a whole- system
global context and connect to local initiatives.
4. Some Guidelines and hints on construction of Concept maps
- This concept map will provide a birds-eye view of a collection of interacting and
collaborating data tanks and data items.
- This Concept map will be a diagram of inter-connected data tanks via processing units
with marked labeled boxes and arrows.
- Give indication of what, when and how does some data move or change in any data tank.
- Use double line boxes for data tanks containing several homogeneous data items and
single line boxes for single data item/packets, if any. Give examples of the data inside
each data tank.
- Put a small circle on the top right corner of boxes, if it represents dynamic data i.e. the
data can change as a result of valid operations. Put another circle on the top left corner, if
the data population size can change during processing. This dynamic data is not to be
confused with dynamic data structure as this higher level of dynamism can be
implemented with dynamic or static data structures at lower layer.
- Use oval shape boxes for processing units.
- Your concept map should be hierarchical i.e. it should gradually show more details in
different diagrams rather than showing all the details in one diagram. Initially focus on
most critical aspects.
5. Write a detailed story (half a page to full page) for an automation problem. Create
hierarchical Concept Map for it. Document all the versions.
Lecture 11 (2 hr.) (26.08.04) (Class strength : Approx. 100 students)
1. Some more Guidelines and hints on construction of Concept maps:
- Mention the strategic positions and attributes of typical data item for every data tank.
Draw two dotted horizontal lines and write the strategic positions in the central portion
and attributes in the lower.
- If your data tank is compound i.e. you need some additional ancillary and smaller data
tanks (e.g. indices and so on) to support efficient searching of appropriate data items in
the principal data tank, include the names of these ancillary data tanks in the principal
data tank itself, by dividing your principal data tank into two units with a single horizontal
line and listing these names in the lower portion. The principal data tank in the upper unit
will have three portions separated by two dotted horizontal lines as mentioned in ‘8’.
- Later expand this compound data tank into principal data tank interconnected with
ancillary data tanks in a different diagram.
- Name your data tanks as plural nouns (e.g. trains, passengers, books, users and so on) that
contains many homogeneous items only.
- Name your ancillary data tanks using the name of principal data tank, followed by an
underscore and then by another plural noun e.g the ancillary data tanks for data tank
‘trains’ could be named as ‘trains_train-nos’, ‘trains_destinations’ and so on.
- Name your processing units as verbs only.
- Put the name of the data that flows in/out of data tanks.
- Write the properties and functional behaviour for each data tank by giving clear
description of the content and also permitted legal operations on each data tank and data
item.
- Do not mix or confuse this concept mapping technique with any other diagramming
technique like DFD or ER diagram and so on. There may be some similarities with some
but it is different.
2. Some guidelines for identifying generic data tanks:
- Look at each data tank as a collection of homogeneous (similarly structured) data items.
- These individual data items could be atomic or compound.
- This collection may or may not require some inter data item organisational constraints.
- Perform a mathematical reduction by Replacing problem specific nouns in each problem
with generic variables like x, y, z and so on.
- Examine these data tanks and look for structural similarities like all linear equation are
structurally similar to each other irrespective of number of terms, all polynomials are also
structurally similar, all first order differential equations are also structurally similar to
each other.
- Represent compound data items by a single generic macro variable in capital letters like
X, Y and so on. Repeat the process till there is scope of more variable grouping into
single macro variable i.e. reduce the data tank as a collection of some homogeneous
single macro variables.
- See if individual data items in such abstracted data tank are required to be arranged in
some specific order or not. If yes, define the order. All the data tanks that require same
definition of ordering can possibly be termed as similar.
- If the data tank is an ordered collection of X, see how the relative position of a specific
data item is defined with respect to other similar data items.
- Some possible arrangements of a data item X relative to the other data items:
a. Linear
b. Non linear : Descending Tree
c. Non linear : Ascending Tree
d. Non linear : Graph
- Relative position could be in terms of order of insertion or relative value of some data
item .
- Examine Relative Positional eligibility for Retrieval : All/ only some strategic relative
positions.
- Examine Relative Positional eligibility for Manipulation : All/None/ only some
strategic relative positions.
- Examine Relative Positional eligibility for Insertion : Any empty slot/ only some
strategic relative positions.
- Examine Relative Positional eligibility for Deletion : All/None/ only some strategic
positions.
- If retrieval, manipulation, insertion and deletion are dependent on some well defined
strategic relative position within the data tank; observe, identify and define these
positions.
- Some examples of strategic relative positions can be as follows :
Based on order of insertion: earliest, latest, after the latest insertion, before the
earliest insertion, 3rd earliest, 4th latest, next relative to the current position as per
insertion order, previous relative to the current position as per insertion order and so on;
Based on the value : Minimum, Maximum, 3rd minimum, in between a given range of
values, in between an appropriate range of values.
- Find out the similarities based on these observations.
- Create the detailed concept map of your design story
- Do the group exercise as per the project I details.
- Create the concept maps for the data tank representation of book, dictionary,
Thesaurus, Eicher road map, Periodic table, Atlas, Railway TimeTable
Lecture 12 (1 hr.) (31.08.04) (Class strength : Approx. 120 students)
1. What are the core competencies for engineering professionals ? Excerpts from a research project
(SPINE) in USA and Europe : Problem solving is the most important skill for engineers.
2. What is a problem ? What is problem solving ?
Option A: Applying well defined and known algorithms to well structured problems e.g given F
and A, find M; find integration of x^2. (We don’t talk of such problems while talking of
problem solving)
Option B: Finding/ designing/choosing and then applying a new algorithm (may be adaptation of
an existing one) to a well structured (new type) or not so well structured problems e.g.
program development
Option C: Applying your heuristics (thumb rules) to solve not so well structured problems e.g.
playing tic-tac-toe; Rubik's Cube, Chess playing, Management, Design
In this course, we are mainly learning second class (option B) of problem solving. We are also
learning how to define a problem as a more structured one starting from the initial not so structured
problem definition. We also intend to develop some heuristics to efficiently solve problems of this
class.
3. Program = Algorithm + Data Structure + User Interface; The design starts from design of UI, then
Data Structures and finally algorithm. This process is open to iteration.
4. Software is developed through progressive “zooming” of initial problem statement. Solution lies
in problem statement. The problem is “zoomed” into sub-problems and this “zooming” process
continues till each final sub-problem can be directly translated into a set of finite and precise
instructions (can be represented through flow chart, pseudo-code and finally translated into a
machine readable programming language).
5. Insertion sort : evaluation of several options
6. Examine how creation and disciplined management of some data can help in more efficient
algorithms.
7. Refer to Lect_2_02 and Lect_2_03 of 2002.
Lecture 13 (2 hr.) (02.09.04) (Class strength : Approx. 120 students)
1. Software is developed through progressive “zooming” of initial problem statement. Case
study of Insertion sort. Refer to Lect_2_02 and Lect_2_03 of 2002.
2. Defining Algorithmic problem in terms of Input characterisation and output specification.
Refer to Lect_2_06 of 2002.
3. Issues involved in analysis and design of algorithms
4. Are algorithms scalable ?
5. Algorithm correctness.
6. Time complexity and space complexity
7. Comparing the run time to execute Linear search and Binary search algorithms.
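The comparison in item 7 can be made concrete by counting how many elements each algorithm inspects. The sketch below is illustrative only: it counts one "comparison" per element probed, and `binarySearch` assumes the array is already sorted in ascending order. The `SearchResult` type and function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Position of the key (-1 if absent) and number of elements inspected.
struct SearchResult {
    int index;
    std::size_t comparisons;
};

// Linear search: inspects elements from the front until the key is found.
SearchResult linearSearch(const std::vector<int>& a, int key) {
    std::size_t comparisons = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        ++comparisons;
        if (a[i] == key) return {static_cast<int>(i), comparisons};
    }
    return {-1, comparisons};
}

// Binary search on a sorted array: halves the candidate range [lo, hi)
// after every probe.
SearchResult binarySearch(const std::vector<int>& a, int key) {
    std::size_t comparisons = 0;
    std::size_t lo = 0;
    std::size_t hi = a.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        ++comparisons;
        if (a[mid] == key) return {static_cast<int>(mid), comparisons};
        if (a[mid] < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return {-1, comparisons};
}
```

Searching for 16 in the sorted array 1..16 takes 16 probes with linear search and 4 with binary search; doubling the array size adds up to 16 more probes to the first and at most one more to the second.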
Lecture 14 (1 hr.) (14.09.04) (Class strength : Approx. 110 students)
1. Case Study on Concept Map Design for Homoeopathy reference by Kanisha (B6).
- Boger’s Materia Medica has several parts that are interlinked.
2. Comparing the run time to execute Linear search and Binary search algorithms.
3. Mathematical Analysis of step count for estimation of run time for iterative algorithms using
the example of the Insertion sort algorithm. Refer to Lect_2_05 of 2002.
4. Assignment : Using this method, estimate the step count for more iterative algorithms known
to you, e.g. linear search through unsorted array, linear search through sorted array,
binary search, matrix multiplication, bubble sort, selection sort and all the iterative
algorithms studied by you in Numerical Analysis course like Gauss Seidel SOR, Power
method for finding Eigen values, Jacobi Method for finding Eigen values.
5. Assignment : Write programs for insertion sort and any other two algorithms (at least one from
the Numerical Analysis set) mentioned in above assignment with a modification of introducing
the logic for step counting within each program itself. Run the programs with this added logic
for different data size. Generate data for step count Vs data size for each algorithm. Draw
Graphs showing this function using Excel. Members of same group should write programs
for step count of different algorithms.
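A minimal sketch of the step-counting modification for insertion sort follows. It assumes that only the key statement is counted, here taken to be the comparison of the element being inserted with an element of the sorted prefix; counting every statement would change the totals by a constant factor. The function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Sorts `a` in ascending order and returns the number of key comparisons.
std::size_t insertionSortSteps(std::vector<int>& a) {
    std::size_t steps = 0;
    for (std::size_t i = 1; i < a.size(); ++i) {
        int key = a[i];
        std::size_t j = i;
        while (j > 0) {
            ++steps;                       // one comparison of key with a[j - 1]
            if (a[j - 1] <= key) break;    // correct position found
            a[j] = a[j - 1];               // shift the larger element down
            --j;
        }
        a[j] = key;
    }
    return steps;
}
```

For 5 elements this gives 4 steps on already sorted input and 10 = 5*4/2 steps on reverse sorted input. Running it over a range of sizes produces the step count Vs data size data that the assignment asks you to plot.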
Lecture 15 (2 hr.) (16.09.04) (Class strength : Approx. 100 students)
1. Run time of program = C x S,
where S = step count of all statements and
C depends on computer speed,
and S = K x S1,
where S1 = step count of only the key statement and
K >= 1 depends on the algorithm.
2. Worst Case, Best Case and Average Case Time Complexity of iterative and recursive
algorithms using the example of Insertion sort, factorial and Fibonacci number Algorithm.
Refer to Lect_2_04 and 2_06a of 2002.
3. Assignment : Analyse the time complexity of the Ackermann function.
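For recursive algorithms, the number of calls is a natural key step to count. The sketch below instruments the usual recursive definitions of factorial and Fibonacci so that each call returns both its value and the number of calls made; the `Counted` type and function names are assumptions for illustration.

```cpp
#include <cstdint>

// A computed value together with the number of function calls it took.
struct Counted {
    std::uint64_t value;
    std::uint64_t calls;
};

// Factorial: one recursive call per level, so n + 1 calls in total.
Counted factorialCounted(unsigned n) {
    if (n == 0) return {1, 1};
    Counted rest = factorialCounted(n - 1);
    return {n * rest.value, rest.calls + 1};
}

// Naive Fibonacci: two recursive calls per level, so the call count
// grows as fast as the Fibonacci numbers themselves.
Counted fibonacciCounted(unsigned n) {
    if (n < 2) return {n, 1};
    Counted a = fibonacciCounted(n - 1);
    Counted b = fibonacciCounted(n - 2);
    return {a.value + b.value, a.calls + b.calls + 1};
}
```

Computing 5! takes 6 calls, whereas computing the 10th Fibonacci number (55) takes 177 calls. The same instrumentation idea can be applied to the Ackermann function, but keep the arguments very small.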
Lecture 16 (1 hr.) (21.09.04) (Class strength : Approx. 120 students)
1. Review of Best Case and Worst case Time Complexity Analysis
2. Average Case Time Complexity Analysis example of Linear Search through unordered array
and ordered array ; Binary Search
3. Assignment : Best case, Worst Case and Average case Time Complexity Analysis for
following algorithms:
i. Linear Search through unordered array
ii. Linear Search through ordered array
iii. Binary Search
iv. Insertion Sort
v. Bubble Sort
vi. Selection Sort
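As a worked example for case i, consider a successful linear search through an unordered array of n elements, assuming the key is equally likely to be at any of the n positions and counting one comparison per element inspected. Finding the key at position i costs i comparisons, so:

```latex
\begin{aligned}
T_{\text{best}}(n)  &= 1, \qquad T_{\text{worst}}(n) = n, \\
T_{\text{avg}}(n)   &= \sum_{i=1}^{n} \frac{1}{n}\, i
                     = \frac{1}{n}\cdot\frac{n(n+1)}{2}
                     = \frac{n+1}{2}.
\end{aligned}
```

The equal-likelihood assumption matters: if some keys are searched far more often than others, the average changes, and an unsuccessful search always costs n comparisons in an unordered array.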
Lecture 17 (2 hr.) (22.09.04) (Class strength : Approx. 100 students)
1. Selection and Bubble Sort Algorithms. Refer ADS (DS II 2002-03) Lecture #6 Slide 32-37.
2. Average Time Complexity Analysis of Selection and Bubble Sort.
3. Project demonstrations on visualization of wave file by P. Singh & Group. This software is
recommended to be used by all to do simulation for digital communication in ADC course.
4. Demonstration of Graphics project by Saransh.
5. Space Complexity, Refer DS 2002 Lecture #2-6.
6. Algorithm Visualisation : examples of Bubble, Selection, Insertion sort algorithms. Refer to
ADS (DS II 2002-03) Lecture #8 Slide 8-19.
7. Impact of Hardware speedup on the solvable problem size for any given algorithm. Refer DS
2002 Lecture #2-7.
8. Assignment : Write Bubble, Selection and Insertion sort programs with dynamic algorithmic
visualization
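A very simple starting point for the visualization assignment is to record the state of the array after every pass and then display those snapshots one by one. The sketch below does this for bubble sort; it is a text-level illustration only (no graphics), and the function name is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort that returns a snapshot of the array after each pass.
// The input is taken by value, so the caller's array is left unchanged.
std::vector<std::vector<int>> bubbleSortPasses(std::vector<int> a) {
    std::vector<std::vector<int>> snapshots;
    for (std::size_t pass = 0; pass + 1 < a.size(); ++pass) {
        bool swapped = false;
        // After `pass` passes, the last `pass` elements are already in place.
        for (std::size_t j = 0; j + 1 < a.size() - pass; ++j) {
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                swapped = true;
            }
        }
        snapshots.push_back(a);
        if (!swapped) break;  // no swaps in this pass: the array is sorted
    }
    return snapshots;
}
```

For the input {4, 3, 2, 1} the snapshots are {3, 2, 1, 4}, {2, 1, 3, 4} and {1, 2, 3, 4}, showing the largest remaining element settling at the end in each pass. A dynamic visualization could redraw these snapshots as bars with a short delay between them.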
Lecture 18 (1 hr.) (28.09.04) (Class strength : Approx. 130 students)
1. Relationship of Data Tanks and Algorithms. Data tanks have functions like the buttons on the
tanks. These functions are realized by software (algorithms). Discrete data flows in/out of the
data tanks at discrete time.
2. Categories of data tanks :
- Problem specific : Part of problem (explicitly occur in the detailed statement of problem)
- Solution specific: Part of solution (not mentioned in the problem but designer conceives
these data tanks to create designs to solve the problem. Different design solutions may
have different data tanks of this category).
- Often higher level complex processing units in the concept maps may zoom into a sub
concept map having lower level simpler processing units and additional solution
specific data tanks. This also may require further expansion in some cases until all
processing units are at the simplest level.
3. Consolidation of taxonomy of data tank types (as needed by students applications) on the
basis of structural properties.
4. Results of last year’s consolidation:
Abbreviations used in the table:
Structure       : Linear, Non Linear
Ordering        : None, Value, Time of insertion (TOI), Value+Time (VT)
Positions       : None, Anywhere (aw), Earliest Insertion (EI), After the last insertion (ALI),
                  After Last Low Priority Insertion (ALLPI), Before Last High Priority Insertion (BLHPI),
                  Last Low Priority Insertion (LLPI), Last High Priority Insertion (LHPI)

No.  Structure   Ordered by    Insertion at   Deletion at     Interrogation at  Manipulation at  ADT Name
Data tank Types needed for modelling student design
1    Linear      Value         ap             aw              aw                aw
2    Linear      TOI           ALI            aw              aw                None             Semi-Open Queue
3    Linear      Value         ap             aw              aw                None
4    Linear      None          aw             aw              aw                aw
5    Linear      None          None           None            aw                aw
6    Linear      TOI           ALI            None            aw                None
7    Linear      TOI           ALI            EI              LI                None             Queue with semi-open back
8    Linear      TOI           ALI            aw              aw                aw               Open Queue
9    Linear      Value         None           None            aw                None
10   Linear      VT            ap             None            aw                None
11   Linear      None          aw             aw              aw                aw
12   Linear      None          aw             aw              aw                None
13   Linear      Value         ap             aw              aw                aw
14   Linear      Value         ap             None            aw                aw
15   Linear      TOI           None           None            aw                None
16   Non Linear  VT            ap             sublist's last  aw                None
17   Linear      TOI           ALI            None            aw                aw
18   Linear      None          aw             None            aw                None
Additional types (Not needed) for conceived problems in 2003
19   Linear      TOI           ALI            LI              LI                None             Stack
20   Linear      TOI           ALI            EI              LI, EI            None             Queue
21   Linear      TOI           ALI            LI, EI          LI, EI            None             Shelf
22   Linear      TOI+Priority  ALLPI, BLHPI   LLPI            LLPI, LHPI        None             Scroll/Roll
23   Linear      TOI+Priority  ALLPI, BLHPI   LHPI            LLPI, LHPI        None             Scroll/Roll
24   Linear      TOI+Priority  ALLPI, BLHPI   LLPI, LHPI      LLPI, LHPI        None             Deque
4. Assignment: Label your concept maps with the Data tank type id as per the above list. Identify
additional structurally different data tanks for your application that are not part of the above list.
Expand this list.
Lecture 19 (2 hr.) (30.09.04) (Class strength : Approx. 100 students)
1. Result of new consolidation incorporating the requirements of new 2004 batch:
Abbreviations used in the table:
Structure       : Linear, Non Linear
Ordering        : None, Value, Time of insertion (TOI), Value+Time (VT)
Positions       : None, Anywhere (aw), Earliest Insertion (EI), After the last insertion (ALI),
                  After Last Low Priority Insertion (ALLPI), Before Last High Priority Insertion (BLHPI),
                  Last Low Priority Insertion (LLPI), Last High Priority Insertion (LHPI),
                  Before Last Insertion (BLI)

No.  Structure   Ordered by    Insertion at   Deletion at     Interrogation at  Manipulation at  ADT Name
Data tank Types (Templates) needed for modelling student design (2003 batch)
1    Linear      Value         ap             aw              aw                aw
2    Linear      TOI           ALI            aw              aw                None             Semi-Open Queue
3    Linear      Value         ap             aw              aw                None
4    Linear      None          aw             aw              aw                aw
5    Linear      None          None           None            aw                aw
6    Linear      TOI           ALI            None            aw                None
7    Linear      TOI           ALI            EI              LI                None             Queue with semi-open back
8    Linear      TOI           ALI            aw              aw                aw               Open Queue
9    Linear      Value         None           None            aw                None
10   Linear      VT            ap             None            aw                None
11   Linear      None          aw             aw              aw                aw
12   Linear      None          aw             aw              aw                None
13   Linear      Value         ap             aw              aw                aw
14   Linear      Value         ap             None            aw                aw
15   Linear      TOI           None           None            aw                None
16   Non Linear  VT            ap             sublist's last  aw                None
17   Linear      TOI           ALI            None            aw                aw
18   Linear      None          aw             None            aw                None
Additional types (Templates) required by 2004 batch
19   Linear      None          None           None            aw                None
20   Linear      VT            ap+aw          aw              aw                aw
21   Linear      VT            ALI            aw              aw                aw
22   Linear      TOI           BLI            EI              aw                None
23   Linear      TOI           None           None            aw                aw
24   Non Linear  VT            ap             aw              aw                None
25   Linear      Value         ap             None            aw                None
Additional important types (Templates) but not needed for students conceived problems (2003 and 2004 batch)
26   Linear      TOI           ALI            LI              LI                None             Stack
27   Linear      TOI           ALI            EI              LI, EI            None             Queue
28   Linear      TOI           ALI            LI, EI          LI, EI            None             Shelf
29   Linear      TOI+Priority  ALLPI, BLHPI   LLPI            LLPI, LHPI        None             Scroll/Roll
30   Linear      TOI+Priority  ALLPI, BLHPI   LHPI            LLPI, LHPI        None             Scroll/Roll
31   Linear      TOI+Priority  ALLPI, BLHPI   LLPI, LHPI      LLPI, LHPI        None             Deque
2. Data Storage : Options -
Amorphous or
Structured
3. Amorphous storage (collapsed structure) makes insertion of data very easy but retrieval of
required data is very inefficient.
4. Structured data storage requires more discipline and effort at the time of insertion of data
so that retrieval becomes more efficient.
5. Several types of primary data types and Data Structuring facilities are offered by
Programming Languages.
6. All application specific data for any data-tank needs to be represented and stored in terms
of language specified primary data types using structuring facilities for future usage and
processing.
7. Data of one data-tank can be stored on primary/secondary or mixed memory.
8. For storing structured data, the addresses of individual records within a data tank can be
realised through any of the following addressing mechanisms:
i. Formula based (Direct addressing)
ii. Linked List
iii. Indirect addressing (using a directly addressable index)
iv. Simulated Pointer.
9. All data tanks can be realised through any of the five mechanisms: amorphous storage or any of
the four addressing mechanisms.
10. Refer 2002 DS Lectures Lect 02-08, Lect 02-09, Lect 02-10, Lect 02-11, and Lect 03-01.
11. Assignment : Implement a data tank from your individual design story. Store the data on
file using amorphous (collapsed structure) storage. Implement the functions for Insertion,
deletion, interrogation and modification operations as per the requirement of your design.
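
The trade-off in points 3 and 4 can be sketched in C for the amorphous case. This is only a minimal sketch under stated assumptions: each record is one text line, fields appear in any order as `name=value` pairs, and the names `amorphous_insert`, `amorphous_find` and `MAX_RECORD` are hypothetical, not taken from the lecture. Insertion is a plain append, while retrieval has to scan the file from the start.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_RECORD 256   /* assumed upper bound on the length of one record line */

/* Insertion is trivial: append the record as one line at the end of the file.
   Returns 0 on success, -1 on failure. */
int amorphous_insert(FILE *f, const char *record)
{
    assert(f != NULL && record != NULL);
    if (fseek(f, 0L, SEEK_END) != 0)
        return -1;
    return (fprintf(f, "%s\n", record) < 0) ? -1 : 0;
}

/* Retrieval is a linear scan: read every line from the start until one
   contains the text `field` (a naive substring match).
   Returns the number of lines examined, or -1 if no line matches. */
int amorphous_find(FILE *f, const char *field, char *out, size_t out_size)
{
    char line[MAX_RECORD];
    int examined = 0;

    assert(f != NULL && field != NULL && out != NULL && out_size > 0);
    rewind(f);
    while (fgets(line, sizeof line, f) != NULL) {
        examined++;
        if (strstr(line, field) != NULL) {
            line[strcspn(line, "\n")] = '\0';   /* drop the trailing newline */
            strncpy(out, line, out_size - 1);
            out[out_size - 1] = '\0';
            return examined;
        }
    }
    return -1;
}
```

The return value of `amorphous_find` grows with the position of the record in the file, which is the retrieval cost the notes warn about. Deletion and modification are left out of the sketch.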
Lecture 20 (1 hr.) (5.10.04) (Class strength : Approx. 110 students)
1. Review of software design artefacts – design story, concept map, data tanks (collection of
records), algorithm, data tank template, data structure.
2. Taxonomy of data tanks :
Perspective 1 : Static Vs Dynamic Data Tank. Dynamic data tanks allow changes in
the data values and/or addition/deletion of records (not to be
confused with dynamic data structures).
Perspective 2 : Structural feature (Linear/Non linear and strategic positions for
access, insertion, modification and deletion) as per above table
Perspective 3 : Problem specific Vs Solution specific
3. Discussion of last class’s assignment: Implement a data tank from your individual design
story. Store the data on file using amorphous (completely collapsed structure i.e. no
predictable relative ordering of the records within the data tank and no predictable relative
ordering of the fields with a record) storage. Implement the functions for Insertion, deletion,
interrogation and modification operations as per the requirement of your design.
4. Complete this assignment before next class.
Lecture 21 (2 hr.) (7.10.04) (Class strength : Approx. 70 students)
1. Review of the last Assignment.
2. Program demonstration by Mayank (MCA).
3. Learn about file modes & file operations and string.h.
4. WAP to create a test file of 10,000 records for your individual application for a data tank as
identified by you in your individual concept map in following different styles of data
storage:
1. Completely amorphous: unordered collection of variable length strings for records with
a variable number of fields.
2. Semi amorphous :
a. unordered collection of fixed length strings for records with a fixed number of fields.
b. unordered collection of variable length strings for records with a fixed number of fields.
3. Formula based : Ordered collection of fixed length records without any missing keys.
4. Indirect addressing : Using Array based Index structure on multiple fields. Use binary
search over the index array. Store the index in a separate file and make it RAM resident at
the run time of the application.
5. Write programs for following operations on all the above files:
1. Insertion
2. Deletion
3. Retrieval
4. Updation
6. Integrate the above collection of functions in one application with a common UI.
7. Compare the time performance of the four types of storage formats for your application on
the following parameters:
1. Average insertion time as a function of file size (in terms of number of records, at file
sizes of 10, 100, 1000, 4000, 6000, 8000, 10,000 records).
2. Average deletion time as a function of file size (in terms of number of records, at file
sizes of 10, 100, 1000, 4000, 6000, 8000, 10,000 records).
3. Average retrieval time as a function of file size (in terms of number of records, at file
sizes of 10, 100, 1000, 4000, 6000, 8000, 10,000 records).
4. Average updation time as a function of file size (in terms of number of records, at file
sizes of 10, 100, 1000, 4000, 6000, 8000, 10,000 records).
Draw four graphs for each comparison. You are encouraged to write a program for drawing
your graphs.
8. Compare the results across the different applications within every group.
Note: Group members are encouraged to collaborate, but every group member finally has
to submit a different application that he or she has created individually.
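
For the formula based style above, a record's position in the file can be computed directly from its key. The following C sketch assumes fixed length records, consecutive keys starting at a known first key, and a binary file opened for update; the `Record` layout and the function names are hypothetical illustrations, not the course's prescribed design.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 32

typedef struct {
    long key;              /* e.g. a roll number */
    char name[NAME_LEN];   /* fixed length field, so every record has the same size */
} Record;

/* Build a fully zeroed record so no uninitialised bytes reach the file. */
Record record_make(long key, const char *name)
{
    Record r;
    memset(&r, 0, sizeof r);
    r.key = key;
    strncpy(r.name, name, NAME_LEN - 1);
    return r;
}

/* The address formula: byte offset of the slot that belongs to `key`,
   assuming keys run first_key, first_key + 1, ... with none missing. */
long slot_offset(long key, long first_key)
{
    assert(key >= first_key);
    return (key - first_key) * (long)sizeof(Record);
}

/* Store a record in its computed slot. Returns 0 on success, -1 on failure. */
int formula_write(FILE *f, long first_key, const Record *r)
{
    if (fseek(f, slot_offset(r->key, first_key), SEEK_SET) != 0)
        return -1;
    return fwrite(r, sizeof *r, 1, f) == 1 ? 0 : -1;
}

/* Fetch the record for `key` with one seek and one read, no searching.
   Returns 0 on success, -1 if the slot is missing or holds another key. */
int formula_read(FILE *f, long first_key, long key, Record *out)
{
    if (fseek(f, slot_offset(key, first_key), SEEK_SET) != 0)
        return -1;
    if (fread(out, sizeof *out, 1, f) != 1)
        return -1;
    return out->key == key ? 0 : -1;
}
```

Retrieval cost here does not depend on the number of records, which is what the timing comparison in point 7 should bring out against the amorphous files.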
Lecture 22 (1 hr.) (12.10.04) (Class strength : Approx. 70 students)
1. Formula based addressing: if the formula function is 1:1, we need N memory slots for N
potential keys, even if only n << N keys are in use, e.g. a database of records with 6
digit roll numbers will require 10^6 memory slots even if there are only 2000 records in the
database. This results in huge wastage of memory as we need to keep memory reserved for
all possible keys even if there is no record with that key in the database.
2. Hashing provides a solution to this problem and gives a huge memory saving. We use an address
calculation formula function which is M:1 rather than 1:1, so many keys contend for one slot.
3. Discussion on Hashing (invented in the mid fifties) as a formula based addressing scheme for
databases. Hash Table, Hash function, Synonym, Collision Detection, Collision Resolution,
Open Hashing, Bucket Hashing, Closed Hashing. Refer 2002 DS-II Lect 34 slide 54 to 74.
4. Load factor = Number of records / Number of slots. For collision avoidance, load factor < 1.
Load factor of 0.5 to 0.7 results in good performance.
5. The worst hash function maps all keys to the same address.
6. The best hash function maps all keys to distinct addresses. Difficult to design.
7. Data files and Index files can also be stored as hash files.
8. Assignment : WAP to copy your structured data file of earlier assignment into a RAM based
Hash Table using Open Hashing/ Bucket hashing/ Closed Hashing using appropriate hashing
function depending on the following criteria:
Enrollment number %3 = 0 : Open Hashing, Enrollment number %3 = 1 : Bucket Hashing
Enrollment number %3 = 2 : Closed hashing
9. Assignment: Analyze the best case, worst case and average case time complexity of the
three hashing techniques for insertion, deletion, search for a record, modification of a non key
field and modification of the key field.
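
The M:1 address formula, synonyms and collision resolution can be sketched in C as follows. This is a sketch under assumptions: "closed hashing" is read here as keeping every record inside the table and resolving a collision by probing the next slot (linear probing); the referenced 2002 slides may define the term differently. The table size `SLOTS`, the `key % SLOTS` hash function and all names are hypothetical.

```c
#include <assert.h>

#define SLOTS 11        /* table size, assumed; far smaller than the key space */
#define EMPTY (-1L)     /* marks an unused slot; real keys are non-negative */

typedef struct {
    long keys[SLOTS];
    int count;          /* number of records currently stored */
} HashTable;

void ht_init(HashTable *t)
{
    for (int i = 0; i < SLOTS; i++)
        t->keys[i] = EMPTY;
    t->count = 0;
}

/* M:1 address formula: many keys map to the same home slot. */
static int ht_hash(long key)
{
    return (int)(key % SLOTS);
}

/* Insert by linear probing. Returns the slot used (also when the key is
   already present), or -1 if the table is full. */
int ht_insert(HashTable *t, long key)
{
    assert(key >= 0);
    int i = ht_hash(key);
    for (int probes = 0; probes < SLOTS; probes++) {
        if (t->keys[i] == key)
            return i;
        if (t->keys[i] == EMPTY) {
            t->keys[i] = key;
            t->count++;
            return i;
        }
        i = (i + 1) % SLOTS;    /* collision: try the next slot */
    }
    return -1;
}

/* Search follows the same probe sequence. Returns the slot or -1. */
int ht_search(const HashTable *t, long key)
{
    assert(key >= 0);
    int i = ht_hash(key);
    for (int probes = 0; probes < SLOTS; probes++) {
        if (t->keys[i] == EMPTY)
            return -1;
        if (t->keys[i] == key)
            return i;
        i = (i + 1) % SLOTS;
    }
    return -1;
}

/* Load factor = number of records / number of slots. */
double ht_load_factor(const HashTable *t)
{
    return (double)t->count / SLOTS;
}
```

Keys 22 and 33 are synonyms here (both hash to slot 0), so the second one lands in slot 1. Deletion is omitted because with probing it needs extra bookkeeping (for example a "deleted" marker) to keep later searches correct.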
Lecture 23 (1 hr.) (26.10.04) (Class strength : Approx. 45 students)
1. Data Storage using Linked Addressing (RAM based storage)
2. Data Storage using Simulated Pointers (File as well RAM based storage)
3. Both these approaches efficiently facilitate ‘logically ordered (sorted) storage’ of data
elements without expensive data movement at insertion and deletion operations. Data files as
well as index files can be stored using simulated pointer addressing.
4. Forward traversal and Reverse order traversal (iterative and recursive) through Linked as well
as Simulated pointer based data storage.
5. Cost of Recursion : very costly in terms of run time as well as memory requirement. Refer
2002 DS Lect 3-17 slide 10 to 18.
6. Recursive algorithms can be converted to iterative algorithms.
7. Simple recursive functions e.g. factorial, forward linked list printing and so on (where the
function call is the last executable statement within the function) can be easily converted to more
efficient (in terms of run time and also run time memory) iterative algorithms. Other recursive
algorithms require more sophisticated approaches using additional solution specific data
tanks (usually Stacks or Queues) for buffering the accessed but unprocessed data elements
from the problem specific data tanks.
8. Revised Individual Assignment (This is expanded and consolidated from the assignments
given in lecture 21 and lecture 22) : (last date 5th Nov)
8.1 Identify a data tank in your individual design story and implement it using the following
storage techniques: Completely amorphous file, Semi amorphous file, Formula based file, RAM
based Hash table (as per the criteria declared in lecture # 22), Indirect addressing using Array
based Index, Indirect addressing using Hash Table based Index, Indirect addressing using
Linked list based Index, and Simulated pointer based ordered file. Write programs for the
following operations:
1. Insertion of one record at a time.
2. Deletion of one record at a time.
3. Retrieval of one record at a time.
4. Modification of non key fields of one record at a time.
5. Modification of Key field of one record at a time.
6. Ordered List Display of all records as per alphabetical ordering of key.
7. Reverse List Ordered Display of all records as per alphabetical ordering of key.
8. Ordered Range Display of all records having key value within a range of Key1 to
Key2 as per alphabetical ordering of key.
8.2 Compare the time performance of the above storage formats for your application for all the above
operations by measuring average case performance as a function of file size (in terms of
number of records, at file sizes of 10, 100, 1000, 4000, 6000, 8000, 10,000 records).
Generate data for this experimental comparison by running your program with different data
size.
8.3 Draw graphs for each comparison. You are encouraged to write a program for drawing your
graphs. You are also encouraged to use Lagrange interpolation for drawing smooth
graph.
9. Group Assignment: (Last date 5th Nov)
9.1 Compare the results of above individual assignment across the different applications within
every group.
9.2 Analyze the best case, worst case and average case time performance for these storage
structures for all the above mentioned operations at ‘8’.
10. Individual Assignment in lieu of Minor II: (Last date 10th Nov, 15 marks)
Implement the Program for your individual design story (or part of it) with at least 4 data
tanks. Use different data storage formats for different data tanks. If your story does not have
at least four data tanks, enhance your story and have at least 4 data tanks.
11. Complete your group project as per your group’s original design story (or part of it)
before 15th Nov. It will have a weightage of 20 marks for each student.
12. Complete all your Lab Assignments in time. They carry total weightage of 100 marks for
each student. This does not include the marks mentioned above at 10 and 11. The total
course marks inclusive of all components are 200 which includes 15 marks for Minor I,
35 marks for Major and 15 marks for Tutorials and regularity.
Lecture 24 (1 hr.) (1.11.04) (Class strength : Approx. 50 students)
1. Software Design Process: Design story -> Concept Map of Problem -> Concept map of
Solution (with simpler verbs and new data tanks, if needed) -> data structures and detailed
algorithms -> performance analysis (using time and space complexity analysis techniques) ->
program -> Software Testing and evaluation.
2. Usually software design is an iterative process, and the design and development team may decide
to go back to an earlier stage after analysing the results of some later stage before proceeding
further. For example, if performance analysis gives an impression that space utilisation or time
performance is not satisfactory, then a new data structure/algorithm needs to be created before
writing the program. If this also does not improve the situation, then a new solution (with a
new concept map) has to be designed with a different approach.
3. Popular data structures like Stack, Queue and Deque are not used so much for modeling the
problems themselves; however, they are very useful for designing solutions for a variety of
problems. They are used like buffers for temporary storage of produced/arrived/accessed/seen
but unprocessed data.
4. Many applications can be modeled as data producer(s), buffer(s) and server(s).
5. Usually buffers are non-empty for most of the run time because
- often, producers outnumber the server(s), or
- they produce faster than the server can serve, or
- the server cannot immediately process certain type(s) of data produced by producer(s).
6. Buffers are temporary data tanks that get data only during the run time of the application. They
are empty at the beginning and at the end of the application. In many algorithms an emptied
buffer is a terminating condition for the algorithm.
7. Producer is that processing module which continuously/periodically inputs a single data item
into the buffer for processing by the server.
8. Server is that processing module which continuously/periodically takes a single data item out
of the buffer and processes it (the processing could be as simple as ‘print’, or a
very comprehensive detailed algorithm could be executed in some applications).
9. Stack is used when the server needs to access the buffer as LIFO. Many applications require
such a buffer e.g. undo in editors, back in internet browsers and so on.
10. Queue is used when the server needs to access the data as FIFO. Many applications require such
a buffer e.g. print server, internet server, job scheduler, process scheduler, file server, database
server, mother dairy booth and so on.
11. Deque is used when the server needs to access the data as LIFO and also as FIFO depending on
certain conditions. Few applications require such a buffer e.g. certain special types of
process schedulers.
12. Performance comparison of different storage schemes.
13. Implementing Index for Indexed storage: simple ordered array based storage offers good
search performance using Binary search but makes insertion and deletion slow as it requires
expensive movement of index records. An ordered linear linked list will give flexibility of
insertion and deletion but search will be slow as it will require linear search. We want to have
both of the following:
i. Good search time (similar to Binary search over ordered array) and
ii. Good insertion and deletion time (similar to singly linked list)
12. Essentially use a special non-linear Linked structure rather than ordered array such that it
also follows (at least tries a close approximation of) a search strategy like Binary search over
an ordered array.
13. Binary Search Tree is a special Binary tree to facilitate such a search mechanism. It follows
some constraints for positioning the nodes within the tree i.e. values on the left side are less
than the parent and values on the right side are more than the parent.
14. BST search requires no computation at run time to find the next location for comparison; instead
the two next candidate locations are pointed to by the current location (as in a linked list in which
each location points to one next location).
15. New node insertion algorithm in BST can be fast and very simple i.e. it attaches the new
arriving nodes at the leaf level only, appropriate leaf has to be searched within the current
BST for linking new node as its left/right child. This simple insertion may make the tree
unbalanced and it results into deeper tree requiring more comparisons on an average as
compared to binary search through an ordered array.
16. It is possible to design more sophisticated algorithms for insertion into a BST which keep the
BST balanced after every insertion by reorganizing part of the BST. This helps in optimizing
the average number of comparisons during search.
17. More sophisticated search tree structures are used for storing index structures in real
applications like DBMS and so on. They essentially follow multi-way search through
multiway trees with multiple pointers (even 100 or more) rather than just 2 (left and right)
pointers.
18. Indexing facilitates faster retrieval. Key to improving retrieval time often lies in
designing better indexing structures. Index Structure Design has been a very active CS
research area for several decades and it continues to get new breakthroughs for specialized
and newer applications.
19. Ref DS 2002 lect 4_01, 4_02_03 and DS-II 2002 lect 18.
20. Assignment: WAP to convert your linear index structure into BST index.
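
The simple leaf level insertion and the pointer following search described above can be sketched in C as below. This is a minimal sketch with integer keys standing in for index entries; the node layout and the function names are hypothetical, duplicates are ignored, and no balancing is attempted.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct BstNode {
    int key;
    struct BstNode *left, *right;
} BstNode;

/* Simple insertion: walk down to the appropriate empty link and attach the
   new node there as a leaf. Returns the (possibly new) root. */
BstNode *bst_insert(BstNode *root, int key)
{
    BstNode **link = &root;
    while (*link != NULL) {
        if (key == (*link)->key)
            return root;                    /* duplicate key: nothing to do */
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    BstNode *n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = key;
    n->left = n->right = NULL;
    *link = n;
    return root;
}

/* Search by following left/right pointers. Returns the number of nodes
   compared before the key was found, or 0 if the key is absent. */
int bst_search_cost(const BstNode *root, int key)
{
    int visited = 0;
    while (root != NULL) {
        visited++;
        if (key == root->key)
            return visited;
        root = (key < root->key) ? root->left : root->right;
    }
    return 0;
}

/* Height in nodes: 0 for an empty tree. */
int bst_height(const BstNode *root)
{
    if (root == NULL)
        return 0;
    int l = bst_height(root->left);
    int r = bst_height(root->right);
    return 1 + (l > r ? l : r);
}

void bst_free(BstNode *root)
{
    if (root == NULL)
        return;
    bst_free(root->left);
    bst_free(root->right);
    free(root);
}
```

Inserting the keys 40, 20, 60, 10, 30, 50, 70 in that order gives a tree of height 3, whereas inserting the same seven keys in sorted order gives height 7, so a search degenerates into a linear scan. This is the imbalance that point 15 warns about and that the balanced insertion of point 16 is meant to prevent.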
Lecture 25 (1 hr.) (9.11.04) (Class strength : Approx. 50 students)
1. Software Design Process: Design story -> Concept Map of Problem -> Concept map of
Solution (with simpler verbs and new data tanks, if needed) -> data structures and detailed
algorithms -> performance analysis (using time and space complexity analysis techniques) ->
program -> Software Testing and evaluation.
2. Data Tanks will be
i. Problem specific (usually persistent) usually stored on well formatted files.
ii. Solution specific (often temporary)
a. Ancillary Data tanks for the Problem specific primary data tanks i.e.
Index structures. Index structures can be RAM based or can also be on
file (if the index is also large). Index structures are also often persistent and
are stored on files. At run time the index (or part of it) is loaded into RAM.
An index can be stored as a sorted list, a hashed list or, more often, as some kind
of a search tree. BST is the simplest form of such a search tree. There are
more sophisticated single as well as multi-dimensional index structures.
Refer 2002 DS II Lect 18.
b. Buffer data tanks e.g. stack, queue, deque and so on for temporary
buffering of data.
3. Detailed Process for 3rd stage i.e. Concept map of Solution (with simpler verbs and new data
tanks, if needed) -> data structures and detailed algorithms :
iii. Detailed modeling of each data tank with examples and identification of fields.
iv. Structural abstraction of each Data Tank -> Abstract Data Type (ADT)
v. ADT specification of the public view of the ADT in terms of a list of permitted and
required operations for construction, destruction, manipulation and access.
Have sufficient access functions to test the preconditions and post conditions
for other operations.
vi. Interface design for each operation in terms of function name and parameters
list and their description.
vii. Pre-conditions (in terms of access functions) for each operation. Refer 2002 DS
lect 3_01 to 3_03.
viii. Post conditions (in terms of access functions) for each operation.
ix. Case specific Data Structure for each ADT .
x. Algorithm for each operation -> sub program (function) for each operation.
4. Array storage single and multi-dimensional array : (refer 2002 DS lect 3_04 and 3_05)
i. Single dimensional array : Space overhead (starting location, array length)
ii. Multi dimensional array storage:
a. Row major : requires a single large contiguous chunk of memory. Used
by C, C++. More memory overhead per chunk. Overhead
does not depend on the size. It only depends on the number
of dimensions.
b. Array of array : stores a multidimensional array as a hierarchically
organized collection of single dimensional arrays (e.g. array
of planes within a 3d matrix, array of rows within a plane,
array of elements within a row).
Requires multiple small contiguous chunks of memory.
Used by Java. Less overhead for each such chunk, but overall
more overhead as there are several such chunks. Hence,
overhead depends on the size and also on the number of
dimensions.
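
The two multi dimensional layouts can be contrasted with a small C sketch. The row major offset formula below is the standard one for a planes x rows x cols block; the array of arrays layout is only emulated in C with a table of separately allocated rows (two dimensions, for brevity). All function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Row major: element (i, j, k) of a planes x rows x cols block lives at this
   offset (counted in elements) from the start of one contiguous chunk. */
size_t row_major_offset(size_t rows, size_t cols, size_t i, size_t j, size_t k)
{
    assert(j < rows && k < cols);
    return (i * rows + j) * cols + k;
}

/* Array of arrays: a table of row pointers, each row being its own small
   contiguous chunk. Returns NULL if any allocation fails. */
int **rows_alloc(size_t rows, size_t cols)
{
    int **m = malloc(rows * sizeof *m);
    if (m == NULL)
        return NULL;
    for (size_t r = 0; r < rows; r++) {
        m[r] = calloc(cols, sizeof **m);
        if (m[r] == NULL) {
            while (r > 0)
                free(m[--r]);       /* undo the rows allocated so far */
            free(m);
            return NULL;
        }
    }
    return m;
}

void rows_free(int **m, size_t rows)
{
    for (size_t r = 0; r < rows; r++)
        free(m[r]);
    free(m);
}
```

In the row major case the only bookkeeping is the dimension sizes used by the formula, whatever the array size. In the array of arrays case there is one pointer and one separately managed chunk per row, so the bookkeeping grows with the number of rows.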
Lecture 26 (2 hr.) (11.11.04) (Class strength : 21 students)
1. Implementation and performance analysis of ADT Matrix for Diagonal, lower triangular,
upper triangular, tridiagonal and sparse matrices using following realization strategies :
i. Amorphous, semi amorphous
ii. Direct addressing (formula based addressing in a single dimensional array)
iii. Linked
iv. Indexed
v. Hash table: open hashing, bucket hashing, closed hashing
vi. Simulated pointer
Refer 2002 DS lect 3_05 to 3_14
2. Implementing sparse matrices with orthogonal lists.
3. Implementation and performance analysis of ADT Stack using following realization strategies:
i. Amorphous, semi amorphous
ii. Direct addressing (formula based addressing in a single dimensional array)
iii. Linked
iv. Indexed
v. Simulated pointer
Refer 2002 DS lect 3_15 and 3_19
4. Implementing 2 stacks in a single array. Refer 2002 DS lect 3_16.
5. Implementation and performance analysis of ADT Queue using following realization
strategies:
i. Amorphous, semi amorphous
ii. Direct addressing (formula based addressing in a single dimensional array)
iii. Linked
iv. Indexed
v. Simulated pointer
Refer 2002 DS lect 3_21 to 3_24.
6. Implementation and performance analysis of ADT ‘Double Ended Queue’ using following
realization strategies:
i. Amorphous, semi amorphous
ii. Direct addressing (formula based addressing in a single dimensional array)
iii. Linked
iv. Indexed
v. Simulated pointer
Refer 2002 DS lect 3_31.
7. Evaluating the buffer requirements (matrix, stack, queue, deque and so on) depending on the
application. Examples of print server, parenthesis matching, palindrome checking, infix to
postfix. Refer 2002 DS lect 3_16 to 3_20 and 3_31.
8. Assignment: WAP to implement two stacks in a single array.
9. Assignment: WAP to implement a deque using indexed and linked implementation.
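
One common way to place two stacks in a single array (point 4) is to let them grow towards each other from the two ends, so that neither is full until the whole array is used. The C sketch below assumes this arrangement; the referenced 2002 lecture may use a different one. The capacity and the names are hypothetical, and the `ts_full` and `ts_empty` access functions double as the preconditions of push and pop.

```c
#include <assert.h>
#include <stdbool.h>

#define CAPACITY 6

typedef struct {
    int data[CAPACITY];
    int top1;   /* top of stack 1, grows upward from index 0; -1 when empty */
    int top2;   /* top of stack 2, grows downward; CAPACITY when empty */
} TwoStacks;

void ts_init(TwoStacks *s)
{
    s->top1 = -1;
    s->top2 = CAPACITY;
}

/* The shared array is full only when the two tops meet. */
bool ts_full(const TwoStacks *s)
{
    return s->top1 + 1 == s->top2;
}

bool ts_empty(const TwoStacks *s, int which)
{
    assert(which == 1 || which == 2);
    return which == 1 ? s->top1 == -1 : s->top2 == CAPACITY;
}

/* Push onto stack 1 or 2. Returns false if there is no free slot left. */
bool ts_push(TwoStacks *s, int which, int value)
{
    assert(which == 1 || which == 2);
    if (ts_full(s))
        return false;
    if (which == 1)
        s->data[++s->top1] = value;
    else
        s->data[--s->top2] = value;
    return true;
}

/* Pop from stack 1 or 2 into *out. Returns false if that stack is empty. */
bool ts_pop(TwoStacks *s, int which, int *out)
{
    assert(which == 1 || which == 2);
    if (ts_empty(s, which))
        return false;
    *out = (which == 1) ? s->data[s->top1--] : s->data[s->top2++];
    return true;
}
```

With a capacity of 6, stack 1 may hold four items while stack 2 holds two, or any other split that adds up to six; a fixed half and half partition would have refused the fourth push.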
Lecture 27 (1 hr.) (16.11.04) (Class strength : 40 students)
1. Design and performance analysis of ADT ‘Binary Tree’ using following realization
strategies:
a. Amorphous, semi amorphous
b. Direct addressing (formula based addressing in a single dimensional array)
c. Linked
d. Indexed
e. Simulated pointer
Refer 2002 DS lect 4_01 and 4_02.
2. BST Refer 2002 DS II lect 18.
3. Non-evaluative Assignment: WAP to implement a binary tree using Indexed and Hashed
storage.
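
For realization strategy (b), a binary tree can be stored in a single array with the usual level order address formula: the node in slot i has its children in slots 2i+1 and 2i+2. The C sketch below assumes this 0-based convention and a sentinel value for unused slots; the slot count and the names are hypothetical.

```c
#include <assert.h>

#define TREE_SLOTS 15   /* enough slots for a complete tree of 4 levels */
#define EMPTY (-1)      /* marks a slot with no node; real keys are non-negative */

/* Address formulas: no pointers are stored, positions are computed. */
int left_child(int i)  { return 2 * i + 1; }
int right_child(int i) { return 2 * i + 2; }

int parent(int i)
{
    assert(i > 0);      /* the root (slot 0) has no parent */
    return (i - 1) / 2;
}

/* Number of nodes in the subtree rooted at slot i. */
int subtree_size(const int tree[], int i)
{
    if (i >= TREE_SLOTS || tree[i] == EMPTY)
        return 0;
    return 1 + subtree_size(tree, left_child(i))
             + subtree_size(tree, right_child(i));
}
```

The formula needs no link fields at all, but every missing node still occupies a slot, so a sparse or skewed tree wastes most of the array. This is the same space trade-off as in 1:1 formula based addressing of files.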
Lecture 28 (3 hr.) (18.11.04) (Class strength : 22 students)
1. n-ary tree representation, Forest of n-ary tree representation, Operations on binary tree,
operations on Forests, Expression Tree.
2. Modeling 2d/3d/nd spaces connected through neighbours/relationships/.. as n-ary trees.
3. Tree traversals, pre-order, in-order, post-order, level order, non recursive pre-order, non
recursive in-order, non recursive post-order. Refer 2002 DS Lect 4_01 to 4_02_03.
4. Tower of Hanoi Refer 2002 DS Lect 3_18.
5. Rat in the Maze, depth first, breadth first, shortest path, recursion tree Refer 2002 DS Lect
3_18 and Lect 3_25.
6. Binary and n-ary Tree traversals using recursion, stack or queue.
7. Designing Simulations as producer(s)-server(s)-buffer(s) model. Refer 2002 DS Lect 3_26 to
Lect 3_29.
8. More Sorting algorithms using advanced techniques, radix sort, merge sort, non recursive
merge sort, shell sort and quick sort. Refer 2002 DS II Lect 6 to Lect 12.
9. Non-evaluative Assignment: Send me an email about your learning experiences in this
course. I would also like to know about the learning that happened by doing the individual and
group assignments. What are the most important things that you have learnt? What was
missing? And so on… (sanjay.goel@jiit.ac.in).
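
The non recursive in-order traversal listed in point 3 can be sketched in C with an explicit stack acting as the buffer for nodes that have been seen but not yet processed. This is a minimal sketch: the node layout, the fixed stack depth `MAX_DEPTH` and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 64    /* assumed bound on the height of the tree */

typedef struct TNode {
    int key;
    struct TNode *left, *right;
} TNode;

/* In-order traversal without recursion. Writes at most `max` keys to `out`
   and returns how many were written. */
int inorder_iterative(const TNode *root, int out[], int max)
{
    const TNode *stack[MAX_DEPTH];   /* nodes seen but not yet processed */
    int top = 0, n = 0;
    const TNode *cur = root;

    while (cur != NULL || top > 0) {
        while (cur != NULL) {        /* go left, buffering the path */
            assert(top < MAX_DEPTH);
            stack[top++] = cur;
            cur = cur->left;
        }
        cur = stack[--top];          /* most recently seen node is served first */
        if (n == max)
            return n;
        out[n++] = cur->key;
        cur = cur->right;            /* then traverse its right subtree */
    }
    return n;
}
```

The loop ends when the current node is null and the stack is empty, an instance of the emptied buffer acting as the terminating condition. Replacing the stack with a queue and enqueuing both children of each served node would give a level order traversal instead.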
Best of luck!!!!!!!!