Complete database
A database is a collection of data that is related in some respect. Data is a collection of facts and figures that can be processed to produce information. The name of a student, her age, class and subjects can all be counted as data for recording purposes.
Data mostly represents recordable facts, and it aids in producing information that is based on those facts. For example, if we have the marks obtained by all students, we can draw conclusions about toppers, average marks and so on.
A database management system (DBMS) stores data in a way that makes it easier to retrieve and manipulate it and to produce information.
Characteristics
Traditionally, data was organized in file formats. The DBMS was a new concept then, and the research behind it aimed to overcome the deficiencies of the traditional style of data management. A modern DBMS has the following characteristics:
Real-world entity: A modern DBMS is more realistic and uses real-world entities to design its architecture. It uses their behavior and attributes too. For example, a school database may use student as an entity and age as an attribute.
Relation-based tables: A DBMS allows entities and the relations among them to form tables. This eases the concept of data saving. A user can understand the architecture of the database just by looking at the table names.
Isolation of data and application: A database system is entirely different from its data. The database is said to be an active entity, whereas data is a passive one on which the database works and which it organizes. A DBMS also stores metadata, which is data about data, to ease its own processing.
Less redundancy: A DBMS follows the rules of normalization, which splits a relation when any of its attributes has redundancy in its values. Following normalization, which is itself a mathematically rich and scientific process, keeps redundancy across the entire database as low as possible.
Consistency: A DBMS keeps the database in a consistent state, whereas earlier forms of data-storing applications such as file processing do not guarantee this. Consistency is a state in which every relation in the database remains consistent. There exist methods and techniques that can detect an attempt to leave the database in an inconsistent state.
Query language: A DBMS is equipped with a query language, which makes it more efficient to retrieve and manipulate data. A user can apply as many and as varied filtering options as he or she wants. This was not possible with traditional file-processing systems.
ACID properties: A DBMS follows the concepts of the ACID properties, which stand for Atomicity, Consistency, Isolation and Durability. These concepts are applied to transactions, which manipulate data in the database. The ACID properties keep the database in a healthy state in a multi-transactional environment and in case of failure.
Multiuser and concurrent access: A DBMS supports a multi-user environment and allows users to access and manipulate data in parallel. There are restrictions on transactions when they attempt to handle the same data item, but users are generally unaware of them.
Multiple views: A DBMS offers multiple views for different users. A user in the sales department will have a different view of the database than a person working in the production department. This lets each user have a view of the database focused on their requirements.
Security: Features like multiple views offer security to some extent, since users are unable to access the data of other users and departments. A DBMS offers methods to impose constraints while entering data into the database and while retrieving it at a later stage. It offers many different levels of security features, which enable multiple users to have different views with different features. For example, a user in the sales department cannot see the data of the purchase department; in addition, how much of the sales department's data he can see can also be managed. Because a DBMS does not save data on disk the way a traditional file system does, it is harder for an intruder to read the data directly.
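The atomicity property listed among the characteristics above can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module and a hypothetical account table with a CHECK constraint; neither comes from the original text. The transfer either applies both updates or neither.

```python
import sqlite3

# Hypothetical schema for illustration: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src; undo both if the debit is rejected."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

A transfer that would overdraw the source account fails as a whole, so the credit already applied to the destination is rolled back as well.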
Users
A DBMS is used by various users for various purposes. Some are involved in retrieving data and some in backing it up. Some of them are described as follows:
Administrators: A group of users who maintain the DBMS and are responsible for administrating the database. They look after its usage and decide by whom it should be used. They create user access and apply limitations to maintain isolation and enforce security. Administrators also look after DBMS resources such as system licenses, required software applications and tools, and other hardware-related maintenance.
Designers: This is the group of people who actually work on the design of the database. The actual database starts with requirement analysis followed by a good design process. These people keep a close watch on what data should be kept and in what format. They identify and design the whole set of entities, relations, constraints and views.
End users: This group contains the people who actually take advantage of the database system. End users can be simple viewers who pay attention to logs or market rates, or they can be as sophisticated as business analysts who make the most of it.
DBMS - Architecture
The design of a database management system depends heavily on its architecture. It can be centralized, decentralized or hierarchical. DBMS architecture can be seen as single-tier or multi-tier. An n-tier architecture divides the whole system into n related but independent modules, which can be independently modified, altered, changed or replaced.
In a 1-tier architecture, the DBMS is the only entity: the user sits directly on the DBMS and uses it. Any changes made here are made directly on the DBMS itself. It does not provide handy tools for end users, so it is mainly database designers and programmers who use single-tier architecture.
If the architecture of the DBMS is 2-tier, there must be some application that uses the DBMS. Programmers use 2-tier architecture where they access the DBMS by means of an application. Here the application tier is entirely independent of the database in terms of operation, design and programming.
3-tier architecture
The most widely used architecture is the 3-tier architecture. It separates its tiers from each other on the basis of users. It is described as follows:
Database (Data) Tier: Only the database resides at this tier. The database, along with its query processing languages, sits in layer 3 of the 3-tier architecture. It also contains all relations and their constraints.
Application (Middle) Tier: The application server and the programs that access the database reside at this tier. For a user, this application tier works as an abstracted view of the database; users are unaware of any existence of the database beyond the application. For the database tier, the application tier is its user; the database tier is not aware of any other user beyond the application tier. This tier works as a mediator between the two.
User (Presentation) Tier: An end user sits on this tier. From a user's perspective this tier is everything. He or she doesn't know about any existence or form of the database beyond this layer. At this layer multiple views of the database can be provided by the application. All views are generated by applications, which reside in the application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are independent and can be changed independently.
DBMS - Data Models
A data model tells how the logical structure of a database is modeled. Data models are the fundamental entities that introduce abstraction in a DBMS. They define how data is connected to each other and how it will be processed and stored inside the system.
The very first data models could be flat data models, where all the data was kept in the same plane. Because earlier data models were not so scientific, they were prone to introducing lots of duplication and update anomalies.
Entity-Relationship Model
The Entity-Relationship model is based on the notion of real-world entities and the relationships among them. While formulating a real-world scenario into a database model, the ER model creates entity sets, relationship sets, general attributes and constraints.
The ER model is best used for the conceptual design of a database.
The ER model is based on:
Entities and their attributes
Relationships among entities
These concepts are explained below.
Entity
An entity in the ER model is a real-world entity, which has some properties called attributes. Every attribute is defined by its set of values, called its domain.
For example, in a school database, a student is considered an entity. A student has various attributes such as name, age and class.
Relationship
The logical association among entities is called a relationship. Relationships are mapped with entities in various ways. Mapping cardinalities define the number of associations between two entities.
Mapping cardinalities:
one to one
one to many
many to one
many to many
The ER model is explained in detail in the "ER Model Basic Concepts" section.
Relational Model
The most popular data model in DBMS is the relational model. It is a more scientific model than the others. This model is based on first-order predicate logic and defines a table as an n-ary relation.
The main highlights of this model are:
Data is stored in tables called relations.
Relations can be normalized.
In normalized relations, the values saved are atomic values.
Each row in a relation is unique.
Each column in a relation contains values from the same domain.
The relational model is explained in detail in the "Relational Data Model" section.
DBMS - Data Schemas
Database schema
A database schema is the skeleton structure of the database and represents the logical view of the entire database. It tells how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied to the data in the relations that reside in the database.
A database schema defines its entities and the relationships among them. It is a descriptive detail of the database, which can be depicted by means of schema diagrams. All these activities are done by the database designer to help programmers understand all aspects of the database.
A database schema can be divided broadly into two categories:
Physical Database Schema: This schema pertains to the actual storage of data and its form of storage, such as files and indices. It defines how data will be stored in secondary storage.
Logical Database Schema: This defines all logical constraints that need to be applied to the stored data. It defines tables, views, integrity constraints and so on.
Database Instance
It is important that we distinguish these two terms individually. The database schema is the skeleton of the database. It is designed when the database doesn't exist at all, and it is very hard to make any changes once the database is operational. A database schema does not contain any data or information.
A database instance is the state of an operational database, with its data, at any given time. It is a snapshot of the database. Database instances tend to change with time. The DBMS ensures that every instance (state) is a valid state by upholding all the validations, constraints and conditions that the database designers have imposed or that are expected from the DBMS itself.
DBMS - Data Independence
If the database system is not multi-layered, it will be very hard to make any changes to it. Database systems are designed in multiple layers, as we learnt earlier.
Data Independence:
There is a lot of data in the whole database management system other than the user's data. A DBMS comprises three kinds of schemas, which are in turn data about data (metadata). Metadata is also stored along with the database and, once stored, is hard to modify. But as a DBMS expands, it needs to change over time to satisfy the requirements of its users. If all of this data were highly interdependent, that would become tedious and highly complex.
Data about data is itself divided into a layered architecture, so that when we change data at one layer it does not affect the data at a different level. This data is independent but mapped to each other.
Logical Data Independence
Logical data is data about the database, that is, it stores information about how data is managed inside. For example, a table (relation) stored in the database and all the constraints that are applied on that relation.
Logical data independence is a mechanism that decouples the logical data from the actual data stored on disk. If we make some changes to the table format, it should not change the data residing on disk.
Physical Data Independence
All schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is the power to change the physical data without impacting the schema or logical data.
For example, if we want to change or upgrade the storage system itself, say by using SSDs instead of hard disks, this should not have any impact on the logical data or schemas.
ER Model Basic Concepts
The entity-relationship model defines the conceptual view of a database. It works around real-world entities and the associations among them. At the view level, the ER model is considered well suited to designing databases.
Entity
An entity is a real-world thing, either animate or inanimate, that is easily identifiable and distinguishable. For example, in a school database, students, teachers, classes and courses offered can be considered entities. All entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities whose attributes share similar values. For example, a Students set may contain all the students of a school; likewise a Teachers set may contain all the teachers of the school from all faculties. Entity sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values. For example, a student entity may have name, class and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a student's name cannot be a numeric value; it has to be alphabetic. A student's age cannot be negative, and so on.
Types of attributes:
Simple attribute:
Simple attributes are atomic values, which cannot be divided further. For example, a student's phone number is an atomic value of 10 digits.
Composite attribute:
Composite attributes are made of more than one simple attribute. For example, a student's complete name may have first_name and last_name.
Derived attribute:
Derived attributes are attributes that do not exist physically in the database; their values are derived from other attributes present in the database. For example, average_salary in a department should not be saved in the database; instead it can be derived. As another example, age can be derived from date_of_birth.
Single-valued attribute:
Single-valued attributes contain a single value. For example: Social_Security_Number.
Multi-valued attribute:
Multi-valued attributes may contain more than one value. For example, a person can have more than one phone number, email address and so on.
These attribute types can come together in combinations such as:
simple single-valued attributes
simple multi-valued attributes
composite single-valued attributes
composite multi-valued attributes
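The derived attribute described above (age computed from date_of_birth rather than stored) can be sketched as follows. The Student class and its age method are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Student:
    # Stored attributes
    name: str
    date_of_birth: date

    def age(self, today: date) -> int:
        """Derived attribute: computed on demand, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (
            self.date_of_birth.month,
            self.date_of_birth.day,
        ):
            years -= 1
        return years
```

Because age is computed from the stored date, it cannot drift out of date the way a stored age column could.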
Entity Set and Keys
A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set.
For example, the roll_number of a student makes her or him identifiable among students.
Super Key: A set of attributes (one or more) that collectively identifies an entity in an entity set.
Candidate Key: A minimal super key is called a candidate key, that is, a super key of which no proper subset is also a super key. An entity set may have more than one candidate key.
Primary Key: This is one of the candidate keys, chosen by the database designer to uniquely identify the entities in the entity set.
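The difference between super keys and candidate keys can be explored on sample data. The sketch below is illustrative only: the helper names and the student rows are assumptions, and a key is really a property of the schema, so sample data can refute a key but never prove one.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Minimal superkeys, found by trying attribute sets smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample data
students = [
    {"roll_number": 1, "email": "a@example.org", "name": "Mira"},
    {"roll_number": 2, "email": "b@example.org", "name": "Ravi"},
    {"roll_number": 3, "email": "c@example.org", "name": "Mira"},
]
keys = candidate_keys(students, ["roll_number", "email", "name"])
```

Here both roll_number and email come out as candidate keys; the designer would pick one of them as the primary key.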
Relationship
The association among entities is called a relationship. For example, an employee entity has the relationship works_at with a department. Another example is a student who enrolls in some course. Here, Works_at and Enrolls are called relationships.
Relationship Set:
Relationships of a similar type are called a relationship set. Like entities, a relationship too can have attributes. These attributes are called descriptive attributes.
Degree of relationship
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
Mapping Cardinalities:
Cardinality defines the number of entities in one entity set that can be associated with entities of the other set via a relationship set.
One-to-one: One entity from entity set A can be associated with at most one entity of entity set B, and vice versa.
One-to-many: One entity from entity set A can be associated with more than one entity of entity set B, but one entity from entity set B can be associated with at most one entity of entity set A.
Many-to-one: More than one entity from entity set A can be associated with at most one entity of entity set B, but one entity from entity set B can be associated with more than one entity from entity set A.
Many-to-many: One entity from A can be associated with more than one entity from B, and vice versa.
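As a rough illustration of the four cases above, the following sketch classifies a set of (A, B) pairs by the associations actually observed. The function name and sample pairs are hypothetical, and observed data only shows what occurs in one instance, not what the schema permits.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify a relationship instance given as (a, b) pairs."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    # Does some A entity have several B partners, and the other way round?
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"
```

For instance, one department paired with two employees is reported as one-to-many, while the same pairs with the sides swapped are many-to-one.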
ER Diagram Representation
Now we shall learn how the ER model is represented by means of an ER diagram. Every object, such as an entity, the attributes of an entity, a relationship set and the attributes of a relationship set, can be represented with the tools of the ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.
Attributes
Attributes are properties of entities. Attributes are represented by means of ellipses. Every ellipse represents one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree-like structure. Every node is then connected to its attribute. That is, composite attributes are represented by ellipses that are connected with an ellipse.
Multivalued attributes are depicted by a double ellipse.
Derived attributes are depicted by a dashed ellipse.
Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written in the diamond box. All entities (rectangles) participating in the relationship are connected to it by a line.
Binary relationship and cardinality
A relationship in which two entities participate is called a binary relationship. Cardinality is the number of instances of an entity that can be associated with the relationship.
One-to-one
When only one instance of an entity is associated with the relationship, it is marked as '1'. In the diagram for this case, only one instance of each entity is associated with the relationship, which depicts a one-to-one relationship.
One-to-many
When more than one instance of an entity is associated with the relationship, it is marked as 'N'. In the diagram for this case, only one instance of the entity on the left and more than one instance of the entity on the right can be associated with the relationship, which depicts a one-to-many relationship.
Many-to-one
When more than one instance of an entity is associated with the relationship, it is marked as 'N'. In the diagram for this case, more than one instance of the entity on the left and only one instance of the entity on the right can be associated with the relationship, which depicts a many-to-one relationship.
Participation Constraints
Total participation: Each entity in the entity set is involved in the relationship. Total participation is represented by double lines.
Partial participation: Not all entities are involved in the relationship. Partial participation is represented by a single line.
Generalization, Aggregation
The ER model has the power to express database entities in a conceptual hierarchical manner such that, as the hierarchy goes up, it generalizes the view of entities, and as we go deeper in the hierarchy it gives us the detail of every entity included.
Going up in this structure is called generalization, where entities are clubbed together to represent a more generalized view. For example, a particular student named Mira can be generalized along with all the students, the entity being student, and further a student is a person. The reverse is called specialization, where a person is a student and that student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the properties common to all the entities being generalized, is called generalization. In generalization, a number of entities are brought together into one generalized entity based on their similar characteristics. For example, pigeon, house sparrow, crow and dove can all be generalized as Birds.
Specialization
Specialization is the process opposite to generalization, as mentioned above. In specialization, a group of entities is divided into sub-groups based on their characteristics. Take a group Person, for example. A person has a name, date of birth, gender and so on. These properties are common to all persons. But in a company, a person can be identified as an employee, employer, customer or vendor based on the role they play in the company.
Similarly, in a school database, a person can be specialized as teacher, student or staff, based on the role they play in the school as entities.
Inheritance
Lower-level entities inherit the attributes of their higher-level entities. For example, attributes of a person such as name, age and gender can be inherited by lower-level entities such as student and teacher.
DBMS Codd's Rules
Dr Edgar F. Codd did extensive research on the relational model of database systems and came up with twelve rules of his own which, according to him, a database must obey in order to be a true relational database.
These rules can be applied to a database system that is capable of managing its stored data using only its relational capabilities. This is the foundation rule, which provides a base on which the other rules rest.
Rule 1: Information rule
This rule states that all information (data) stored in the database must be a value of some table cell. Everything in a database must be stored in table format. This information can be user data or metadata.
Rule 2: Guaranteed access rule
This rule states that every single data element (value) is guaranteed to be logically accessible with a combination of table name, primary key (row value) and attribute name (column value). No other means, such as pointers, can be used to access data.
Rule 3: Systematic treatment of NULL values
This rule states that NULL values in the database must be given systematic treatment, as a NULL may have several meanings: data is missing, data is not known, data is not applicable, and so on.
Rule 4: Active online catalog
This rule states that the structure description of the whole database must be stored in an online catalog, i.e. a data dictionary, which can be accessed by authorized users. Users can use the same query language to access the catalog that they use to access the database itself.
Rule 5: Comprehensive data sub-language rule
This rule states that a database must support a language with a linear syntax that is capable of data definition, data manipulation and transaction management operations. The database can be accessed only by means of this language, either directly or through some application. If the database can be accessed or manipulated in some way without the help of this language, that is a violation.
Rule 6: View updating rule
This rule states that all views of the database that can theoretically be updated must also be updatable by the system.
Rule 7: High-level insert, update and delete rule
This rule states that the database must support high-level insert, update and delete operations. These must not be limited to a single row; that is, the database must also support union, intersection and minus operations to yield sets of data records.
Rule 8: Physical data independence
This rule states that the application should not have any concern about how the data is physically stored. Also, any change in its physical structure must not have any impact on the application.
Rule 9: Logical data independence
This rule states that the logical data must be independent of its user's view (application). Any change in the logical data must not imply any change in the application using it. For example, if two tables are merged or one is split into two different tables, the change should have no impact on the user application. This is one of the most difficult rules to apply.
Rule 10: Integrity independence
This rule states that the database must be independent of the application using it. All its integrity constraints can be independently modified without the need for any change in the application. This rule makes the database independent of the front-end application and its interface.
Rule 11: Distribution independence
This rule states that the end user must not be able to see that the data is distributed over various locations. The user must always get the impression that the data is located at one site only. This rule is regarded as the foundation of distributed database systems.
Rule 12: Non-subversion rule
This rule states that if a system has an interface that provides access to low-level records, this interface must not be able to subvert the system and bypass security and integrity constraints.
Relational Data Model
The relational data model is the primary data model, used widely around the world for data storage and processing. This model is simple and has all the properties and capabilities required to process data with storage efficiency.
Concepts
Tables: In the relational data model, relations are saved in the format of tables. This format stores the relation among entities. A table has rows and columns, where rows represent records and columns represent the attributes.
Tuple: A single row of a table, which contains a single record for that relation, is called a tuple.
Relation instance: A finite set of tuples in the relational database system represents a relation instance. Relation instances do not have duplicate tuples.
Relation schema: This describes the relation name (table name), its attributes and their names.
Relation key: Each row has one or more attributes that can identify the row in the relation (table) uniquely; these are called the relation key.
Attribute domain: Every attribute has some pre-defined value scope, known as its attribute domain.
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are called relational integrity constraints. There are three main integrity constraints:
Key constraints
Domain constraints
Referential integrity constraints
Key constraints:
There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely. This minimal subset of attributes is called the key for that relation. If there is more than one such minimal subset, these are called candidate keys.
Key constraints enforce that:
in a relation with a key attribute, no two tuples can have identical values for the key attributes.
a key attribute cannot have NULL values.
Key constraints are also referred to as entity constraints.
Domain constraints
Attributes have specific values in real-world scenarios. For example, age can only be a positive integer. The same constraints are applied to the attributes of a relation. Every attribute is bound to have a specific range of values. For example, age cannot be less than zero, and a telephone number cannot contain a character outside 0-9.
Referential integrity constraints
This integrity constraint works on the concept of a foreign key. A key attribute of a relation can be referred to in another relation, where it is called a foreign key.
The referential integrity constraint states that if a relation refers to a key attribute of a different or the same relation, that key element must exist.
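The referential integrity constraint described above can be seen in action with a small sketch. It assumes Python's built-in sqlite3 module and hypothetical department and employee tables; note that SQLite only enforces foreign keys once PRAGMA foreign_keys is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Hypothetical schema: every employee must point at an existing department.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 'Mira', 1)")  # key exists: accepted

try:
    # Department 99 does not exist, so this row would dangle.
    conn.execute("INSERT INTO employee VALUES (11, 'Ravi', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The second insert is refused because its foreign key value has no matching key in department, so only the first employee is stored.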
Relational Algebra
Relational database systems are expected to be equipped with a query language that assists users in querying the database instances. This way users can obtain the results they require. There are two kinds of query languages: relational algebra and relational calculus.
Relational algebra
Relational algebra is a procedural query language, which takes instances of relations as input and yields instances of relations as output. It uses operators to perform queries. An operator can be either unary or binary. Operators accept relations as their input and yield relations as their output. Relational algebra is performed recursively on a relation, and intermediate results are also considered relations.
Fundamental operations of relational algebra:
Select
Project
Union
Set difference
Cartesian product
Rename
These are defined briefly as follows:
Select Operation (σ)
Selects tuples that satisfy the given predicate from a relation.
Notation: σ p (r)
where p stands for the selection predicate and r stands for the relation. p is a propositional logic formula that may use connectors like and, or and not. Its terms may use relational operators like =, ≠, ≥, <, > and ≤.
For example:
σ subject = "database" (Books)
Output: Selects tuples from Books where the subject is 'database'.
σ subject = "database" and price = 450 (Books)
Output: Selects tuples from Books where the subject is 'database' and the price is 450.
σ subject = "database" and price < 450 or year > 2010 (Books)
Output: Selects tuples from Books where the subject is 'database' and the price is less than 450, or where the publication year is greater than 2010, that is, published after 2010.
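The select operation can be mimicked in a few lines. This is a minimal sketch, assuming a relation is modelled as a list of dictionaries; the select helper and the sample Books rows are hypothetical.

```python
def select(relation, predicate):
    """Return the tuples of the relation that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Hypothetical sample relation
books = [
    {"title": "DB Basics", "subject": "database", "price": 450, "year": 2008},
    {"title": "SQL Deep", "subject": "database", "price": 300, "year": 2012},
    {"title": "Nets", "subject": "networks", "price": 500, "year": 2011},
    {"title": "Old OS", "subject": "os", "price": 200, "year": 1999},
]

# subject = "database"
by_subject = select(books, lambda r: r["subject"] == "database")

# subject = "database" and price = 450
exact_price = select(
    books, lambda r: r["subject"] == "database" and r["price"] == 450
)

# (subject = "database" and price < 450) or year > 2010
mixed = select(
    books,
    lambda r: (r["subject"] == "database" and r["price"] < 450) or r["year"] > 2010,
)
```

The parentheses in the last predicate make explicit that "and" binds more tightly than "or", which is also how Python evaluates the unparenthesized form.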
Project Operation (∏)
Projects the specified column(s) of a relation.
Notation: ∏ A1, A2, ..., An (r)
where A1, A2, ..., An are attribute names of relation r.
Duplicate rows are automatically eliminated, as a relation is a set.
For example:
∏ subject, author (Books)
Projects the columns named subject and author from the relation Books.
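A projection can be sketched with a set comprehension, which also removes duplicate rows automatically. The project helper and the sample rows below are hypothetical.

```python
def project(relation, attributes):
    """Keep only the named attributes; the set drops duplicate rows."""
    return {tuple(row[a] for a in attributes) for row in relation}


# Hypothetical sample relation
books = [
    {"title": "DB Basics", "subject": "database", "author": "Asha"},
    {"title": "DB Advanced", "subject": "database", "author": "Asha"},
    {"title": "Networks 101", "subject": "networks", "author": "Ben"},
]

result = project(books, ["subject", "author"])
```

Three input rows yield only two output tuples, because the two database books by the same author collapse into one.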
Union Operation (∪)
The union operation performs a binary union between two given relations and is defined as:
r ∪ s = { t | t ∈ r or t ∈ s }
Notation: r ∪ s
where r and s are either database relations or relation result sets (temporary relations).
For a union operation to be valid, the following conditions must hold:
r and s must have the same number of attributes.
Attribute domains must be compatible.
Duplicate tuples are automatically eliminated.
For example:
∏ author (Books) ∪ ∏ author (Articles)
Output: Projects the names of authors who have written either a book or an article or both.
Set Difference (−)
The result of the set difference operation is the tuples that are present in one relation but not in the second relation.
Notation: r − s
Finds all tuples that are present in r but not in s.
For example:
∏ author (Books) − ∏ author (Articles)
Output: Yields the names of authors who have written books but not articles.
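Union and set difference map directly onto Python's set operators when a relation is modelled as a set of tuples. In this sketch the helper names and the author data are hypothetical, and only the "same number of attributes" condition is checked, not domain compatibility.

```python
def check_compatible(r, s):
    """Both relations must have the same number of attributes."""
    arities = {len(t) for t in r | s}
    if len(arities) > 1:
        raise ValueError("relations have different numbers of attributes")


def union(r, s):
    check_compatible(r, s)
    return r | s  # duplicates vanish because the result is a set


def difference(r, s):
    check_compatible(r, s)
    return r - s  # tuples in r that are not in s


# Hypothetical results of projecting author from Books and Articles
book_authors = {("Asha",), ("Ben",)}
article_authors = {("Ben",), ("Chen",)}

either = union(book_authors, article_authors)
books_only = difference(book_authors, article_authors)
```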
Cartesian Product (×)
Combines the information of two different relations into one.
Notation: r × s
where r and s are relations, and their output is defined as:
r × s = { q t | q ∈ r and t ∈ s }
For example:
σ author = 'tutorialspoint' (Books × Articles)
Output: Yields a relation that shows all books and articles written by tutorialspoint.
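The Cartesian product followed by a selection can be sketched with itertools.product. The relations below are hypothetical (title, author) tuples; since both relations carry an author column, the sketch assumes the intent is that both authors equal 'tutorialspoint'.

```python
from itertools import product


def cartesian(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return {q + t for q, t in product(r, s)}


# Hypothetical relations with columns (title, author)
books = {("DB Basics", "tutorialspoint"), ("Networks 101", "Ben")}
articles = {("Indexing", "tutorialspoint")}

pairs = cartesian(books, articles)

# Selection over the product: columns are
# (book_title, book_author, article_title, article_author).
by_tp = {
    row
    for row in pairs
    if row[1] == "tutorialspoint" and row[3] == "tutorialspoint"
}
```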
Rename Operation (ρ)
The results of relational algebra are also relations, but without any name. The rename operation allows us to name the output relation. It is denoted by the small Greek letter rho (ρ).
Notation: ρ x (E)
where the result of expression E is saved with the name x.
Additional operations are:
Set intersection
Assignment
Natural join
Relational Calculus
In contrast with relational algebra, relational calculus is a non-procedural query language, that is, it tells what to do but never explains how to do it.
Relational calculus exists in two forms:
Tuple relational calculus (TRC)
The filtering variable ranges over tuples.
Notation: { T | Condition }
Returns all tuples T that satisfy the condition.
For example:
{ T.name | Author(T) AND T.article = 'database' }
Output: Returns tuples with 'name' from Author for authors who have written an article on 'database'.
TRC can also be quantified. We can use existential (∃) and universal (∀) quantifiers.
For example:
{ R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
Output: This query yields the same result as the previous one.
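Python set comprehensions read much like tuple relational calculus, which makes the two example queries above easy to try out. The Author tuples here are hypothetical sample data.

```python
from collections import namedtuple

# Hypothetical Author relation with attributes (name, article)
Author = namedtuple("Author", ["name", "article"])
authors = {
    Author("Asha", "database"),
    Author("Ben", "networks"),
    Author("Chen", "database"),
}

# { T.name | Author(T) AND T.article = 'database' }
names = {t.name for t in authors if t.article == "database"}

# { R | exists T in Authors (T.article = 'database' AND R.name = T.name) }
# any(...) plays the role of the existential quantifier.
names_exists = {
    r.name
    for r in authors
    if any(t.article == "database" and r.name == t.name for t in authors)
}
```

Both comprehensions describe what the result must satisfy rather than the steps to compute it, which is the non-procedural flavour the text refers to.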
Domain relational calculus (DRC)
In DRC the filtering variable ranges over the domains of attributes instead of entire tuple values (as is done in TRC, mentioned above).
Notation:
{ a1, a2, a3, ..., an | P(a1, a2, a3, ..., an) }
where a1, a2, ..., an are attributes and P stands for a formula built from those attributes.
For example:
{ <article, page, subject> | <article, page, subject> ∈ TutorialsPoint ∧ subject = 'database' }
Output: Yields article, page and subject from the relation TutorialsPoint where the subject is database.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also involves relational operators.
The expressive power of tuple relational calculus and domain relational calculus is equivalent to that of relational algebra.
ER to Relational Model
The ER model, when conceptualized into diagrams, gives a good overview of entity relationships, which is easier to understand. ER diagrams can be mapped to a relational schema, that is, it is possible to create a relational schema using an ER diagram. Though we cannot import all the ER constraints into the relational model, an approximate schema can be generated.
There are several processes and algorithms available to convert ER diagrams into a relational schema. Some of them are automated and some are manual. We focus here on mapping the diagram contents to relational basics.
ER diagrams mainly comprise:
Entities and their attributes
Relationships, which are associations among entities
Mapping Entity
An entity is a real-world object with some attributes.
Mapping process (algorithm):
Create a table for each entity.
The entity's attributes should become fields of the table, with their respective data types.
Declare the primary key.
We use all the above features of the ER model in order to create classes of objects in object-oriented programming. This makes it easier for the programmer to concentrate on what she is programming. Details of entities are generally hidden from the user; this process is known as abstraction.
One of the important features of generalization and specialization is inheritance, that is, the attributes of higher-level entities are inherited by the lower-level entities.
Mapping Relationship
A relationship is an association among entities.
Mapping process (algorithm):
Create a table for the relationship.
Add the primary keys of all participating entities as fields of the table, with their respective data types.
If the relationship has any attributes, add each attribute as a field of the table.
Declare a primary key composed of all the primary keys of the participating entities.
Declare all foreign key constraints.
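The relationship-mapping steps can be sketched for an Enrolls relationship between students and courses. The table and column names are hypothetical, the enrolled_on column stands in for a descriptive attribute, and the example assumes Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # make SQLite enforce foreign keys
conn.executescript(
    """
    -- One table per participating entity
    CREATE TABLE student (roll_number INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Table for the relationship: primary keys of both entities,
    -- the relationship's own attribute, a composite primary key,
    -- and the foreign key constraints.
    CREATE TABLE enrolls (
        roll_number INTEGER NOT NULL REFERENCES student(roll_number),
        course_id INTEGER NOT NULL REFERENCES course(course_id),
        enrolled_on TEXT,
        PRIMARY KEY (roll_number, course_id)
    );

    INSERT INTO student VALUES (1, 'Mira');
    INSERT INTO course VALUES (10, 'Databases');
    INSERT INTO enrolls VALUES (1, 10, '2024-01-15');
    """
)


def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True
```

With this schema, enrolling the same student in the same course twice violates the composite primary key, and enrolling in a non-existent course violates the foreign key.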
Mapping Weak Entity Sets
A weak entity set is one that does not have a primary key of its own.
Mapping process (Algorithm):
Create a table for the weak entity set.
Add all its attributes to the table as fields.
Add the primary key of the identifying entity set.
Declare all foreign key constraints.
Mapping Hierarchical Entities
ER specialization or generalization comes in the form of hierarchical entity sets.
Mapping process (Algorithm):
Create tables for all higher-level entities.
Create tables for lower-level entities.
Add the primary keys of higher-level entities to the tables of lower-level entities.
In lower-level tables, add all other attributes of the lower-level entities.
Declare the primary key of the higher-level table as the primary key of the lower-level table.
Declare foreign key constraints.
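The mapping rules above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the entity, relationship and attribute names (student, project, works_on, dependent, graduate_student) are hypothetical and do not come from the text, and SQLite stands in for any RDBMS.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
-- Strong entities: one table each, attributes become columns, key declared.
CREATE TABLE student (
    stu_id   INTEGER PRIMARY KEY,
    stu_name TEXT NOT NULL
);
CREATE TABLE project (
    proj_id   INTEGER PRIMARY KEY,
    proj_name TEXT NOT NULL
);

-- Relationship: primary keys of both participants plus its own attribute.
CREATE TABLE works_on (
    stu_id  INTEGER NOT NULL REFERENCES student(stu_id),
    proj_id INTEGER NOT NULL REFERENCES project(proj_id),
    hours   INTEGER,
    PRIMARY KEY (stu_id, proj_id)
);

-- Weak entity set: its own attributes plus the key of the identifying entity.
CREATE TABLE dependent (
    stu_id   INTEGER NOT NULL REFERENCES student(stu_id),
    dep_name TEXT NOT NULL,
    PRIMARY KEY (stu_id, dep_name)
);

-- Lower-level (specialized) entity: shares the higher-level primary key.
CREATE TABLE graduate_student (
    stu_id       INTEGER PRIMARY KEY REFERENCES student(stu_id),
    thesis_title TEXT
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Alex')")
conn.execute("INSERT INTO project VALUES (10, 'Compiler')")
conn.execute("INSERT INTO works_on VALUES (1, 10, 5)")

def insert_orphan():
    """Try to reference a student that does not exist; return True if rejected."""
    try:
        conn.execute("INSERT INTO works_on VALUES (99, 10, 1)")
        return False
    except sqlite3.IntegrityError:
        return True

orphan_rejected = insert_orphan()
tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

The foreign key constraints are what tie the relationship, weak entity and lower-level tables back to the tables they depend on; the rejected insert shows the constraint doing its job.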
SQL Overview
SQL is a programming language for relational databases. It is designed over relational algebra and
tuple relational calculus. SQL comes as a package with all major distributions of RDBMS.
SQL comprises both data definition and data manipulation languages. Using the data definition
properties of SQL, one can design and modify a database schema, whereas the data manipulation
properties allow SQL to store and retrieve data from the database.
Data Definition Language
SQL uses the following set of commands to define a database schema:
CREATE
Creates new databases, tables and views in an RDBMS.
For example (column and query definitions omitted):
CREATE DATABASE tutorialspoint;
CREATE TABLE article;
CREATE VIEW for_students;
DROP
The DROP command deletes views, tables and databases from an RDBMS.
DROP object_type object_name;
For example:
DROP DATABASE tutorialspoint;
DROP TABLE article;
DROP VIEW for_students;
ALTER
Modifies the database schema.
ALTER object_type object_name parameters;
For example:
ALTER TABLE article ADD subject VARCHAR;
This command adds an attribute named subject, of string type, to the relation article.
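The statements above are abbreviated. A runnable sketch of the same CREATE, ALTER and DROP sequence follows, using Python's sqlite3 module. The column names id and title and the view's query are assumptions added so that the statements are complete. SQLite has no CREATE DATABASE statement (a database is simply the file or in-memory connection), so that step is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the connection itself plays the role of the database

# CREATE: a table with explicit (hypothetical) columns, and a view defined by a query.
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE VIEW for_students AS SELECT title FROM article")

# ALTER: add the attribute 'subject' of string type to relation article.
conn.execute("ALTER TABLE article ADD COLUMN subject VARCHAR(50)")
columns = [row[1] for row in conn.execute("PRAGMA table_info(article)")]

# DROP: remove the view first, then the table it depends on.
conn.execute("DROP VIEW for_students")
conn.execute("DROP TABLE article")
remaining = conn.execute("SELECT count(*) FROM sqlite_master").fetchone()[0]
```

After the ALTER step, columns holds the two original column names followed by subject; after the DROP steps the catalog is empty again.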
Data Manipulation Language
SQL is equipped with a data manipulation language (DML). DML modifies the database instance by
inserting, updating and deleting its data, and is responsible for all data modification in
databases. SQL contains the following set of commands in its DML section:
SELECT/FROM/WHERE
INSERT INTO/VALUES
UPDATE/SET/WHERE
DELETE FROM/WHERE
These basic constructs allow database programmers and users to enter data into the database and
retrieve it efficiently using a number of filter options.
SELECT/FROM/WHERE
SELECT
This is one of the fundamental query commands of SQL. It is similar to the projection operation of
relational algebra. It selects the listed attributes from the tuples that satisfy the condition
described by the WHERE clause.
FROM
This clause takes a relation name as an argument, from which attributes are to be selected/projected.
If more than one relation name is given, this clause corresponds to their Cartesian product.
WHERE
This clause defines the predicate or conditions that must be met for tuples to qualify for projection.
For example:
SELECT author_name
FROM book_author
WHERE age > 50;
This command projects the names of the authors in the book_author relation whose age is greater than 50.
INSERT INTO/VALUES
This command is used for inserting values into the rows of a table (relation).
Syntax is
INSERT INTO table (column1 [, column2, column3 ...]) VALUES (value1 [, value2, value3 ...])
Or
INSERT INTO table VALUES (value1 [, value2, ...])
For example:
INSERT INTO tutorialspoint (Author, Subject) VALUES ('anonymous', 'computers');
UPDATE/SET/WHERE
This command is used for updating or modifying the values of columns of a table (relation).
Syntax is
UPDATE table_name SET column_name = value [, column_name = value ...] [WHERE condition]
For example:
UPDATE tutorialspoint SET Author = 'webmaster' WHERE Author = 'anonymous';
DELETE FROM/WHERE
This command is used for removing one or more rows from a table (relation).
Syntax is
DELETE FROM table_name [WHERE condition];
For example:
DELETE FROM tutorialspoint
WHERE Author = 'unknown';
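The four DML commands can be tried end to end with Python's sqlite3 module. The sketch below reuses the tutorialspoint table and the Author and Subject columns from the examples above; the second inserted row and the use of ? parameter placeholders are additions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tutorialspoint (Author TEXT, Subject TEXT)")

# INSERT INTO ... VALUES, passing values as parameters rather than building strings.
conn.executemany(
    "INSERT INTO tutorialspoint (Author, Subject) VALUES (?, ?)",
    [("anonymous", "computers"), ("unknown", "physics")],
)

# UPDATE ... SET ... WHERE
conn.execute(
    "UPDATE tutorialspoint SET Author = ? WHERE Author = ?",
    ("webmaster", "anonymous"),
)

# DELETE FROM ... WHERE
conn.execute("DELETE FROM tutorialspoint WHERE Author = ?", ("unknown",))

# SELECT ... FROM ... WHERE
rows = conn.execute(
    "SELECT Author, Subject FROM tutorialspoint WHERE Subject = ?",
    ("computers",),
).fetchall()
```

Only the updated row survives, so rows contains the single tuple ('webmaster', 'computers').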
Database Normalization
Functional Dependency
A functional dependency (FD) is a constraint between two sets of attributes in a relation. A
functional dependency says that if two tuples have the same values for attributes A1, A2, ..., An,
then those two tuples must have the same values for attributes B1, B2, ..., Bn.
A functional dependency is represented by an arrow sign (→), that is, X → Y, where X functionally
determines Y. The left-hand side attributes determine the values of the attributes on the right-hand side.
Armstrong's Axioms
If F is a set of functional dependencies, then the closure of F, denoted F+, is the set of all
functional dependencies logically implied by F. Armstrong's axioms are a set of rules that, when
applied repeatedly, generate the closure of a set of functional dependencies.
Reflexive rule: If alpha is a set of attributes and beta is a subset of alpha, then alpha → beta holds.
Augmentation rule: If a → b holds and y is a set of attributes, then ay → by also holds. That is,
adding the same attributes to both sides of a dependency does not change the basic dependency.
Transitivity rule: As with the transitive rule in algebra, if a → b holds and b → c holds, then
a → c also holds. a → b is read as "a functionally determines b".
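Computing all of F+ is rarely needed in practice. A closely related and cheaper computation is the closure of a set of attributes: every attribute that the set functionally determines under F. The sketch below is a minimal illustration; the function name and the sample dependencies over attributes A to E are made up for the example.

```python
def attribute_closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    closure = set(attrs)          # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure already contains lhs, transitivity adds rhs.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure

# Hypothetical dependencies: A -> B, B -> C, CD -> E
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"E"})]
closure_a = attribute_closure({"A"}, fds)
closure_ad = attribute_closure({"A", "D"}, fds)
```

Here A alone determines A, B and C, while A together with D determines every attribute, which is the usual test for whether a set of attributes is a super-key.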
Trivial Functional Dependency
Trivial: If an FD X → Y holds where Y is a subset of X, it is called a trivial FD. Trivial FDs always hold.
Non-trivial: If an FD X → Y holds where Y is not a subset of X, it is called a non-trivial FD.
Completely non-trivial: If an FD X → Y holds where X ∩ Y = Φ, it is said to be a completely
non-trivial FD.
Normalization
If a database design is not perfect, it may contain anomalies, which are like a bad dream for the
database itself. Managing a database with anomalies is next to impossible.
Update anomalies: If data items are scattered and not linked to each other properly, there may be
instances when we try to update one data item that has copies scattered in several places; a few
instances get updated properly while a few are left with their old values. This leaves the database
in an inconsistent state.
Deletion anomalies: We try to delete a record, but parts of it are left undeleted because, unknown
to us, the data is also saved somewhere else.
Insert anomalies: We try to insert data into a record that does not exist at all.
Normalization is a method to remove all these anomalies and bring the database to a consistent state.
First Normal Form:
This is defined in the definition of relations (tables) itself. This rule requires that all the
attributes in a relation have atomic domains. Values in an atomic domain are indivisible units.
[Image: Unorganized relation]
We re-arrange the relation (table) as below, to convert it to First Normal Form.
[Image: Relation in 1NF]
Each attribute must contain only a single value from its pre-defined domain.
Second Normal Form:
Before we learn about Second Normal Form, we need to understand the following:
Prime attribute: An attribute that is part of the prime key (a candidate key) is a prime attribute.
Non-prime attribute: An attribute that is not part of the prime key is a non-prime attribute.
Second Normal Form says that every non-prime attribute should be fully functionally dependent on
the prime key. That is, if X → A holds, then there should not be any proper subset Y of X for
which Y → A also holds.
[Image: Relation not in 2NF]
We see here in the Student_Project relation that the prime key attributes are Stu_ID and Proj_ID.
According to the rule, the non-key attributes, i.e. Stu_Name and Proj_Name, must depend on both
together and not on either prime key attribute individually. But we find that Stu_Name can be
identified by Stu_ID alone and Proj_Name can be identified by Proj_ID alone. This is called partial
dependency, which is not allowed in Second Normal Form.
[Image: Relation in 2NF]
We broke the relation in two, as depicted in the above picture, so that no partial dependency exists.
Third Normal Form:
For a relation to be in Third Normal Form, it must be in Second Normal Form and the following must hold:
No non-prime attribute is transitively dependent on the prime key.
For any non-trivial functional dependency X → A, either
X is a super-key, or
A is a prime attribute.
[Image: Relation not in 3NF]
We find that in the Student_detail relation depicted above, Stu_ID is the key and the only prime
attribute. We find that City can be identified by Stu_ID as well as by Zip itself. Zip is not a
super-key, nor is City a prime attribute. Additionally, Stu_ID → Zip → City, so there exists a
transitive dependency.
[Image: Relation in 3NF]
We broke the relation into the two relations depicted above to bring it into 3NF.
Boyce-Codd Normal Form:
BCNF is a stricter extension of Third Normal Form. BCNF states that
for any non-trivial functional dependency X → A, X must be a super-key.
In the picture depicted above, Stu_ID is the super-key in the Student_Detail relation and Zip is the
super-key in the ZipCodes relation. So,
Stu_ID → Stu_Name, Zip
and
Zip → City
confirm that both relations are in BCNF.
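The BCNF condition can be checked mechanically. The sketch below is a simplified check, not a full normalization algorithm: it only considers the listed dependencies whose attributes all lie inside the relation, and the helper names closure and bcnf_violations are hypothetical. The relations and dependencies are the ones from the example above.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(relation, fds):
    """Non-trivial FDs inside `relation` whose left side is not a super-key."""
    violations = []
    for lhs, rhs in fds:
        applies = lhs <= relation and rhs <= relation
        trivial = rhs <= lhs
        is_superkey = relation <= closure(lhs, fds)
        if applies and not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations

fds = [({"Stu_ID"}, {"Stu_Name", "Zip"}), ({"Zip"}, {"City"})]

# Before decomposition: Zip -> City holds, but Zip is not a super-key.
before = bcnf_violations({"Stu_ID", "Stu_Name", "Zip", "City"}, fds)

# After decomposition into Student_Detail and ZipCodes: no violations remain.
student_detail = bcnf_violations({"Stu_ID", "Stu_Name", "Zip"}, fds)
zip_codes = bcnf_violations({"Zip", "City"}, fds)
```

The undecomposed relation reports Zip → City as the offending dependency, while both decomposed relations come back clean.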
Database Joins
We understand the benefits of the Cartesian product of two relations, which gives us all the possible
tuples paired together. But a Cartesian product might not be feasible for huge relations, where the
number of tuples is in the thousands and both relations have a considerably large number of attributes.
A join is a combination of a Cartesian product followed by a selection process. A join operation pairs
two tuples from different relations if and only if the given join condition is satisfied.
The following sections briefly describe the join types:
Theta (θ) join
θ in a theta join is the join condition. Theta joins combine tuples from different relations provided
they satisfy the theta condition.
Notation:
R1 ⋈θ R2
R1 and R2 are relations with attributes (A1, A2, ..., An) and (B1, B2, ..., Bn) such that no
attribute matches, that is, R1 ∩ R2 = Φ. Here θ is a condition in the form of a set of conditions C.
A theta join can use all kinds of comparison operators.
Student
SID   Name    Std
101   Alex    10
102   Maria   11
[Table: Student Relation]
Subjects
Class   Subject
10      Math
10      English
11      Music
11      Sports
[Table: Subjects Relation]
Student_Detail = STUDENT ⋈(Student.Std = Subject.Class) SUBJECT
Student_detail
SID   Name    Std   Class   Subject
101   Alex    10    10      Math
101   Alex    10    10      English
102   Maria   11    11      Music
102   Maria   11    11      Sports
[Table: Output of theta join]
Equi-Join
When a theta join uses only the equality comparison operator, it is said to be an equi-join. The
above example corresponds to an equi-join.
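A theta join can be sketched in a few lines of Python as a Cartesian product followed by a selection. The representation of relations as lists of dictionaries and the function name theta_join are choices made for this illustration; the data is the Student and Subjects example above.

```python
def theta_join(r1, r2, condition):
    """Cartesian product of r1 and r2, keeping only pairs that satisfy condition."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2 if condition(t1, t2)]

student = [
    {"SID": 101, "Name": "Alex", "Std": 10},
    {"SID": 102, "Name": "Maria", "Std": 11},
]
subjects = [
    {"Class": 10, "Subject": "Math"},
    {"Class": 10, "Subject": "English"},
    {"Class": 11, "Subject": "Music"},
    {"Class": 11, "Subject": "Sports"},
]

# Equality is the only comparison used, so this particular theta join is an equi-join.
student_detail = theta_join(student, subjects, lambda s, c: s["Std"] == c["Class"])
```

Passing a different predicate, for example one using < or >=, gives a general theta join with the same function.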
Natural Join ( ⋈ )
A natural join does not use any comparison operator. It does not concatenate the way a Cartesian
product does. Instead, a natural join can only be performed if at least one common attribute
exists between the relations. Those attributes must have the same name and domain.
A natural join acts on those matching attributes where the values of the attributes in both
relations are the same.
Courses
CID    Course        Dept
CS01   Database      CS
ME01   Mechanics     ME
EE01   Electronics   EE
[Table: Relation Courses]
HoD
Dept   Head
CS     Alex
ME     Maya
EE     Mira
[Table: Relation HoD]
Courses ⋈ HoD
Dept   CID    Course        Head
CS     CS01   Database      Alex
ME     ME01   Mechanics     Maya
EE     EE01   Electronics   Mira
[Table: Relation Courses ⋈ HoD]
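The natural join of Courses and HoD can be reproduced with a short Python sketch. Relations are again modelled as lists of dictionaries, which is an assumption of this example. Note that if the two relations shared no attribute name, this simple version would degrade to a plain Cartesian product, which is why the text requires at least one common attribute.

```python
def natural_join(r1, r2):
    """Join on every attribute name the two relations share, keeping one copy of each."""
    if not r1 or not r2:
        return []
    common = set(r1[0]) & set(r2[0])
    return [
        {**t1, **t2}
        for t1 in r1
        for t2 in r2
        if all(t1[a] == t2[a] for a in common)
    ]

courses = [
    {"CID": "CS01", "Course": "Database", "Dept": "CS"},
    {"CID": "ME01", "Course": "Mechanics", "Dept": "ME"},
    {"CID": "EE01", "Course": "Electronics", "Dept": "EE"},
]
hod = [
    {"Dept": "CS", "Head": "Alex"},
    {"Dept": "ME", "Head": "Maya"},
    {"Dept": "EE", "Head": "Mira"},
]

joined = natural_join(courses, hod)
heads = {t["Course"]: t["Head"] for t in joined}
```

Dept is the only shared attribute, so each course is paired with the head of its own department and Dept appears once in each result tuple.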
Outer Joins
All joins mentioned above, that is, theta join, equi-join and natural join, are called inner joins.
An inner join includes only tuples with matching attributes; the rest are discarded from the
resulting relation. There exist methods by which all tuples of a relation are included in the
resulting relation.
There are three kinds of outer joins:
Left outer join ( R ⟕ S )
All tuples of the left relation, R, are included in the resulting relation, and if there exist
tuples in R without any matching tuple in S, the S-attributes of the resulting relation are made NULL.
Left
A     B
100   Database
101   Mechanics
102   Electronics
[Table: Left Relation]
Right
A     B
100   Alex
102   Maya
104   Mira
[Table: Right Relation]
Courses ⟕ HoD
A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya
[Table: Left outer join output]
Right outer join ( R ⟖ S )
All tuples of the right relation, S, are included in the resulting relation, and if there exist
tuples in S without any matching tuple in R, the R-attributes of the resulting relation are made NULL.
Courses ⟖ HoD
A     B             C     D
100   Database      100   Alex
102   Electronics   102   Maya
---   ---           104   Mira
[Table: Right outer join output]
Full outer join ( R ⟗ S )
All tuples of both participating relations are included in the resulting relation, and if there are
no matching tuples for a tuple in either relation, the respective unmatched attributes are made NULL.
Courses ⟗ HoD
A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya
---   ---           104   Mira
[Table: Full outer join output]
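The three outer joins differ only in which unmatched tuples are kept. The sketch below is an illustration under stated assumptions: relations are lists of dictionaries, NULL is represented by Python's None, and the attributes of the Right relation are renamed to C and D (as in the output tables above) so that they do not collide with A and B.

```python
def outer_join(left, right, condition, how="full"):
    """Join two non-empty lists of dicts; unmatched tuples are padded with None (NULL)."""
    left_null = dict.fromkeys(left[0])     # every left attribute set to None
    right_null = dict.fromkeys(right[0])   # every right attribute set to None
    result, matched_right = [], set()
    for l in left:
        hit = False
        for i, r in enumerate(right):
            if condition(l, r):
                result.append({**l, **r})
                matched_right.add(i)
                hit = True
        if not hit and how in ("left", "full"):
            result.append({**l, **right_null})
    if how in ("right", "full"):
        for i, r in enumerate(right):
            if i not in matched_right:
                result.append({**left_null, **r})
    return result

left = [
    {"A": 100, "B": "Database"},
    {"A": 101, "B": "Mechanics"},
    {"A": 102, "B": "Electronics"},
]
right = [
    {"C": 100, "D": "Alex"},
    {"C": 102, "D": "Maya"},
    {"C": 104, "D": "Mira"},
]

def same_id(l, r):
    return l["A"] == r["C"]

left_rows = outer_join(left, right, same_id, how="left")
right_rows = outer_join(left, right, same_id, how="right")
full_rows = outer_join(left, right, same_id, how="full")
```

The left join keeps Mechanics with NULLs for C and D, the right join keeps Mira with NULLs for A and B, and the full join keeps both.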
DBMS - Storage System
Databases are stored in file formats, which contain records. At the physical level, the actual data
is stored in electromagnetic format on some device capable of storing it for a longer amount of
time. These storage devices can be broadly categorized into three types:
Primary Storage: The memory storage that is directly accessible by the CPU comes under this
category. The CPU's internal memory (registers), fast memory (cache) and main memory (RAM) are
directly accessible to the CPU, as they are all placed on the motherboard or CPU chipset. This
storage is typically very small, ultra fast and volatile. It needs a continuous power supply to
maintain its state, i.e. in case of power failure all data is lost.
Secondary Storage: The need to store data for a longer amount of time and to retain it even after
the power supply is interrupted gave birth to secondary data storage. All memory devices that are
not part of the CPU chipset or motherboard come under this category. Broadly, magnetic disks, all
optical disks (DVD, CD etc.), flash drives and magnetic tapes are not directly accessible by the
CPU. Hard disk drives, which contain the operating system and are generally not removed from the
computer, are considered secondary storage; all others are called tertiary storage.
Tertiary Storage: The third level in the memory hierarchy is called tertiary storage. It is used to
store huge amounts of data. Because this storage is external to the computer system, it is the
slowest in speed. These storage devices are mostly used to back up the entire system. Optical disks
and magnetic tapes are widely used as tertiary storage.
Memory Hierarchy
A computer system has a well-defined hierarchy of memory. The CPU has built-in registers, which hold
the data being operated on. The computer system has main memory, which is also directly accessible
by the CPU. Because the access time of main memory and the speed of the CPU differ greatly, cache
memory is introduced to minimize the loss. Cache memory contains the most recently used data and
data that may be referenced by the CPU in the near future.
The memory with the fastest access is the costliest, which is the very reason for a hierarchy of
memory. Larger storage is slower but less expensive, and can store huge amounts of data compared
to CPU registers or cache memory.
Magnetic Disks
Hard disk drives are the most common secondary storage devices in present-day computer systems.
They are called magnetic disks because they use the concept of magnetization to store information.
Hard disks consist of metal disks coated with magnetizable material. These disks are placed
vertically on a spindle. A read/write head moves between the disks and is used to magnetize or
de-magnetize the spot under it. A magnetized spot can be recognized as 0 (zero) or 1 (one).
Hard disks are formatted in a well-defined order to store data efficiently. A hard disk plate has
many concentric circles on it, called tracks. Every track is further divided into sectors. A sector
on a hard disk typically stores 512 bytes of data.
RAID
Exponential growth in technology created the need for larger secondary storage media. RAID was
introduced to meet this requirement. RAID stands for Redundant Array of Independent Disks, a
technology that connects multiple secondary storage devices and uses them as a single storage medium.
RAID consists of an array of disks in which multiple disks are connected together to achieve
different goals. RAID levels define the use of disk arrays.
RAID 0: In this level a striped array of disks is implemented. The data is broken down into blocks
and the blocks are distributed among all disks. Each disk receives a block of data to write/read
in parallel. This enhances the speed and performance of the storage device. There is no parity and
no backup in level 0.
RAID 1: This level uses mirroring techniques. When data is sent to the RAID controller, it sends a
copy of the data to all disks in the array. RAID level 1 is also called mirroring and provides 100%
redundancy in case of failure.
RAID 2: This level records an Error Correction Code (a Hamming code) for its data, which is striped
on different disks. As in level 0, the data is striped: each data bit in a word is recorded on a
separate disk, and the ECC codes of the data words are stored on a different set of disks. Because
of its complex structure and high cost, RAID 2 is not commercially available.
RAID 3: This level also stripes the data onto multiple disks in the array. The parity bit generated
for a data word is stored on a different disk. This technique allows the array to survive a single
disk failure, and a single disk failure does not impact the throughput.
RAID 4: In this level an entire block of data is written onto the data disks, and then the parity is
generated and stored on a different disk. The prime difference between levels 3 and 4 is that level
3 uses byte-level striping whereas level 4 uses block-level striping. Both levels 3 and 4 require
at least 3 disks to implement RAID.
RAID 5: This level also writes whole data blocks onto different disks, but the parity generated for
a data block stripe is not stored on a dedicated disk; it is distributed among all the data disks.
RAID 6: This level is an extension of level 5. In this level two independent parities are generated
and stored in a distributed fashion among the disks. Two parities provide additional fault
tolerance. This level requires at least 4 disk drives to be implemented.
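The parity used by levels 3 to 5 is commonly computed as a bitwise XOR across the blocks of a stripe, which is what makes a single lost block recoverable. The sketch below illustrates that idea only; the block contents and the helper name xor_blocks are made up, and a real controller works on disk sectors rather than Python byte strings.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe spread across three data disks (hypothetical contents).
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second disk and rebuilding it from the survivors plus parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR-ing a value with itself cancels out, combining the surviving blocks with the parity block leaves exactly the missing block.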
DBMS - File Structure
Related data and information is stored collectively in file formats. A file is a sequence of
records stored in binary format. A disk drive is formatted into several blocks, which are capable
of storing records. File records are mapped onto those disk blocks.
File Organization
The method of mapping file records to disk blocks defines file organization, i.e. how the file
records are organized. The following are the types of file organization:
Heap File Organization: When a file is created using the heap file organization mechanism, the
operating system allocates a memory area to that file without any further accounting details. File
records can be placed anywhere in that memory area. It is the responsibility of the software to
manage the records. A heap file does not support any ordering, sequencing or indexing on its own.
Sequential File Organization: Every file record contains a data field (attribute) to uniquely
identify that record. In the sequential file organization mechanism, records are placed in the file
in some sequential order based on the unique key field or search key. Practically, it is not
possible to store all the records sequentially in physical form.
Hash File Organization: This mechanism uses a hash function computation on some field of the
records. As we know, a file is a collection of records, which has to be mapped onto some blocks of
the disk space allocated to it. This mapping is defined by the hash computation. The output of the
hash determines the location of the disk block where the records may exist.
Clustered File Organization: Clustered file organization is not considered good for large
databases. In this mechanism, related records from one or more relations are kept in the same disk
block; that is, the ordering of records is not based on the primary key or search key. This
organization helps to retrieve data easily based on a particular join condition. All queries other
than the particular join condition on which the data is stored become more expensive.
File Operations
Operations on database files can be broadly classified into two categories:
Update Operations
Retrieval Operations
Update operations change the data values by insertion, deletion or update. Retrieval operations, on
the other hand, do not alter the data but retrieve it after optional conditional filtering. In
both types of operations, selection plays a significant role. Other than creation and deletion of a
file, there are several operations that can be done on files:
Open: A file can be opened in one of two modes, read mode or write mode. In read mode, the operating
system does not allow anyone to alter data; it is solely for reading purposes. Files opened in read
mode can be shared among several entities. The other mode is write mode, in which data modification
is allowed. Files opened in write mode can also be read but cannot be shared.
Locate: Every file has a file pointer, which tells the current position where the data is to be
read or written. This pointer can be adjusted accordingly. Using the find (seek) operation it can
be moved forward or backward.
Read: By default, when files are opened in read mode the file pointer points to the beginning of
the file. There are options where the user can tell the operating system where the file pointer
should be located at the time of opening the file. The data immediately following the file pointer
is read.
Write: The user can choose to open files in write mode, which enables them to edit the content of
the file. The edit can be a deletion, insertion or modification. The file pointer can be positioned
at the time of opening or can be changed dynamically if the operating system allows it.
Close: This is also a most important operation from the operating system's point of view. When a
request to close a file is generated, the operating system removes all the locks (if in shared
mode), saves the content of the data (if altered) to the secondary storage media, and releases all
the buffers and file handlers associated with the file.
The organization of data content inside the file plays a major role here. Seeking or locating the
file pointer to the desired record inside the file behaves differently depending on whether the
file has records arranged sequentially, clustered, and so on.
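The open, locate, read, write and close operations can be seen together in a small Python sketch that stores fixed-size records in a binary file. The record layout (a 4-byte id followed by a 16-byte name), the file name and the sample records are assumptions made for this example.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<i16s")  # hypothetical layout: 4-byte id + 16-byte name

def pack(rec_id, name):
    return RECORD.pack(rec_id, name.encode("ascii"))

def unpack(raw):
    rec_id, name = RECORD.unpack(raw)
    return rec_id, name.rstrip(b"\x00").decode("ascii")

path = os.path.join(tempfile.mkdtemp(), "students.dat")

# Open in write mode: create the file and write three fixed-size records.
with open(path, "wb") as f:
    for rec in [(101, "Alex"), (102, "Maria"), (103, "Mira")]:
        f.write(pack(*rec))

# Open in read mode: locate (seek) the third record directly, then read it.
with open(path, "rb") as f:
    f.seek(2 * RECORD.size)
    third = unpack(f.read(RECORD.size))

# Open in read/write mode: overwrite the second record in place, then read it back.
with open(path, "r+b") as f:
    f.seek(1 * RECORD.size)
    f.write(pack(102, "Maya"))
    f.seek(1 * RECORD.size)
    second = unpack(f.read(RECORD.size))
# Leaving each 'with' block closes the file, flushing buffers to storage.
```

Fixed-size records make the locate step a simple multiplication; with variable-size or clustered records the position would have to come from an index or a scan instead.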
DBMS - Indexing
We know that information in the DBMS files is stored in the form of records. Every record is
equipped with some key field, which helps it to be recognized uniquely.
Indexing is a data structure technique to efficiently retrieve records from database files based on
some attributes on which the indexing has been done. Indexing in database systems is similar to the
one we see in books.
Indexing is defined based on its indexing attributes. Indexing can be one of the following types:
Primary Index: If the index is built on the ordering key field of the file, it is called a primary
index. Generally it is the primary key of the relation.
Secondary Index: If the index is built on a non-ordering field of the file, it is called a secondary index.
Clustering Index: If the index is built on an ordering non-key field of the file, it is called a
clustering index.
The ordering field is the field on which the records of the file are ordered. It can be different
from the primary or candidate key of a file.
Ordered indexing is of two types:
Dense Index
Sparse Index
Dense Index
In a dense index, there is an index record for every search key value in the database. This makes
searching faster but requires more space to store the index records themselves. An index record
contains the search key value and a pointer to the actual record on the disk.
Sparse Index
In a sparse index, index records are not created for every search key. An index record here contains
a search key and an actual pointer to the data on the disk. To search for a record, we first follow
the index record and reach the actual location of the data. If the data we are looking for is not
where we directly arrive by following the index, the system starts a sequential search until the
desired data is found.
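The difference between the two index types can be sketched in Python. The data file is modelled as a list of sorted blocks, which is an assumption of this example, as are the keys and values; the sparse index keeps only the first key of each block and falls back to a sequential scan inside the block.

```python
from bisect import bisect_right

# Hypothetical data file: records sorted on the search key, grouped into blocks.
blocks = [
    [(10, "a"), (20, "b"), (30, "c")],
    [(40, "d"), (50, "e"), (60, "f")],
    [(70, "g"), (80, "h"), (90, "i")],
]

# Dense index: one entry per search-key value -> (block number, slot).
dense = {
    key: (b, s)
    for b, block in enumerate(blocks)
    for s, (key, _) in enumerate(block)
}

# Sparse index: one entry per block, holding the first key of that block.
sparse_keys = [block[0][0] for block in blocks]

def dense_lookup(key):
    pos = dense.get(key)
    if pos is None:
        return None
    b, s = pos
    return blocks[b][s][1]

def sparse_lookup(key):
    # Find the last index entry whose key is <= the search key ...
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    # ... then search that block sequentially.
    for k, value in blocks[b]:
        if k == key:
            return value
    return None
```

The dense index needs nine entries here and the sparse index only three, at the cost of a short scan inside the block on every lookup.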
Multilevel Index
Index records are comprised of search-key values and data pointers. The index itself is stored on
the disk along with the actual database files. As the size of the database grows, so does the size
of the indices. There is an immense need to keep the index records in main memory so that the
search can speed up. If a single-level index is used, a large index cannot be kept in memory as a
whole, and this leads to multiple disk accesses.
A multi-level index helps by breaking down the index into several smaller indices, in order to make
the outermost level so small that it can be saved in a single disk block, which can easily be
accommodated anywhere in main memory.
B+ Tree
A B+ tree is a multi-level index format, organized as a balanced search tree in which a node can
have many children rather than just two. As mentioned earlier, single-level index records become
large as the database size grows, which also degrades performance.
All leaf nodes of a B+ tree denote actual data pointers. A B+ tree ensures that all leaf nodes
remain at the same height, and is thus balanced. Additionally, all leaf nodes are linked using a
linked list, which lets a B+ tree support random access as well as sequential access.
Structure of B+ tree
Every leaf node is at an equal distance from the root node. A B+ tree is of order n, where n is
fixed for every B+ tree.
Internal nodes:
Internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node.
At most, internal nodes contain n pointers.
Leaf nodes:
Leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
At most, leaf nodes contain n record pointers and n key values.
Every leaf node contains one block pointer P to point to the next leaf node, forming a linked list.
B+ tree insertion
B+ trees are filled from the bottom, and each entry is inserted at a leaf node.
If a leaf node overflows:
  Split the node into two parts
  Partition at i = ⌊(m+1)/2⌋
  The first i entries are stored in one node
  The rest of the entries (i+1 onwards) are moved to a new node
  The i-th key is duplicated in the parent of the leaf
If a non-leaf node overflows:
  Split the node into two parts
  Partition the node at i = ⌈(m+1)/2⌉
  Entries up to i are kept in one node
  The rest of the entries are moved to a new node
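The leaf-split step can be sketched as a small Python function. The text does not define m, so this sketch assumes m is the number of entries in the overflowing leaf; the function name and the sample keys are likewise made up. The text's convention of duplicating the i-th key in the parent is followed here, although many textbook presentations copy the first key of the new right-hand node instead.

```python
def split_leaf(entries):
    """Split an overflowing leaf (a sorted list of keys) into two nodes.

    Assumption: m is the number of entries in the overflowing node.
    Returns (left, right, key_copied_to_parent).
    """
    m = len(entries)
    i = (m + 1) // 2              # partition point, floor((m + 1) / 2)
    left, right = entries[:i], entries[i:]
    return left, right, left[-1]  # the i-th key is duplicated in the parent

left, right, separator = split_leaf([5, 10, 15, 20, 25])
```

With five entries the partition point is 3, so the first three keys stay in the original node, the last two move to the new node, and key 15 is copied up as the separator.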
B+ tree deletion
B+ tree entries are deleted at the leaf nodes.
The target entry is searched for and deleted.
  If it is also in an internal node, delete it there and replace it with the entry from the left position.
After deletion, underflow is tested.
  If underflow occurs
    Distribute entries from the node to its left.
  If distribution from the left is not possible
    Distribute from the node to its right.
  If distribution from the left and right is not possible
    Merge the node with the nodes to its left and right.
DBMS - Hashing
For a huge database structure, it is sometimes not feasible to search the index through all its
levels and then reach the destination data block to retrieve the desired data. Hashing is an
effective technique to calculate the direct location of a data record on the disk without using an
index structure.
It uses a function, called a hash function, which generates an address when called with a search
key as its parameter. The hash function computes the location of the desired data on the disk.
Hash Organization
Bucket: A hash file stores data in bucket format. A bucket is considered a unit of storage. A bucket
typically stores one complete disk block, which in turn can store one or more records.
Hash Function: A hash function, h, is a mapping function that maps the set of all search keys K to
the addresses where the actual records are placed. It is a function from search keys to bucket addresses.
Static Hashing
In static hashing, when a search-key value is provided, the hash function always computes the same
address. For example, if a mod-4 hash function is used, it generates only 4 values (0 to 3). The
output address is always the same for a given key under that function. The number of buckets
provided remains the same at all times.
Operation:
Insertion: When a record is to be entered using static hashing, the hash function h computes the
bucket address for search key K, where the record will be stored.
Bucket address = h(K)
Search: When a record needs to be retrieved, the same hash function can be used to retrieve the
address of the bucket where the data is stored.
Delete: This is simply a search followed by a deletion operation.
Bucket Overflow:
The condition of bucket overflow is known as collision. This is a fatal state for any static hash
function. In this case overflow chaining can be used.
Overflow Chaining: When buckets are full, a new bucket is allocated for the same hash result and is
linked after the previous one. This mechanism is called closed hashing.
Linear Probing: When the hash function generates an address at which data is already stored, the
next free bucket is allocated to it. This mechanism is called open hashing.
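Static hashing with overflow chaining can be sketched as follows. The class name, the bucket capacity of two records and the integer keys are assumptions made for the illustration; the mod hash follows the mod-4 example mentioned earlier.

```python
class StaticHashFile:
    """Fixed number of primary buckets; a full bucket grows an overflow chain."""

    def __init__(self, n_buckets=4, capacity=2):
        self.n_buckets = n_buckets
        self.capacity = capacity                   # records per bucket (hypothetical)
        # Each address holds a chain: a list of fixed-capacity buckets.
        self.chains = [[[]] for _ in range(n_buckets)]

    def address(self, key):
        return key % self.n_buckets                # h(K), here a mod hash

    def insert(self, key, record):
        chain = self.chains[self.address(key)]
        if len(chain[-1]) >= self.capacity:        # last bucket full: overflow
            chain.append([])                       # link a new overflow bucket
        chain[-1].append((key, record))

    def search(self, key):
        for bucket in self.chains[self.address(key)]:
            for k, record in bucket:
                if k == key:
                    return record
        return None

    def delete(self, key):
        # A delete is a search followed by removal of the entry.
        for bucket in self.chains[self.address(key)]:
            for entry in bucket:
                if entry[0] == key:
                    bucket.remove(entry)
                    return True
        return False

hash_file = StaticHashFile()
for k in (1, 5, 9, 13):                            # all four keys hash to address 1
    hash_file.insert(k, f"rec{k}")
```

The number of primary buckets never changes; only the chain at address 1 grows, which is exactly the limitation that dynamic hashing addresses.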
For a hash function to work efficiently and effectively, the following must hold:
Distribution of records should be uniform
Distribution should be random instead of following any ordering
Dynamic Hashing
The problem with static hashing is that it does not expand or shrink dynamically as the size of the
database grows or shrinks. Dynamic hashing provides a mechanism in which data buckets are added and
removed dynamically and on demand. Dynamic hashing is also known as extended (extendible) hashing.
The hash function, in dynamic hashing, is made to produce a large number of values, of which only a
few are used initially.
Organization
The prefix of the entire hash value is taken as the hash index. Only a portion of the hash value is
used for computing bucket addresses. Every hash index has a depth value, which tells it how many
bits are used for computing the bucket address. If n bits are used, they can address 2^n buckets.
When all these bits are consumed, that is, all buckets are full, the depth value is increased by
one and twice as many buckets are allocated.
Operation
Querying: Look at the depth value of the hash index and use those bits to compute the bucket address.
Update: Perform a query as above and update the data.
Deletion: Perform a query to locate the desired data and delete it.
Insertion: Compute the address of the bucket.
  If the bucket is already full
    Add more buckets
    Add an additional bit to the hash value
    Re-compute the hash function
  Else
    Add data to the bucket
  If all buckets are full, perform the remedies of static hashing.
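The insertion procedure above can be sketched with a small extendible-hashing structure. This is an illustration under stated assumptions rather than the exact scheme in the text: it uses the low-order bits of the hash value instead of a prefix (which keeps directory doubling simple), it tracks a separate depth per bucket so that only the overflowing bucket is split, and the class names, capacity and integer keys are made up.

```python
class Bucket:
    def __init__(self, depth, capacity):
        self.depth = depth          # number of hash bits this bucket distinguishes
        self.capacity = capacity
        self.items = {}

class ExtendibleHash:
    def __init__(self, capacity=2):
        self.global_depth = 1       # bits currently used by the directory
        self.capacity = capacity
        self.directory = [Bucket(1, capacity), Bucket(1, capacity)]

    def _index(self, key):
        # Assumption: use the low-order global_depth bits of the hash value.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.directory[self._index(key)].items.get(key)

    def insert(self, key, value):
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)     # bucket full: add a bit, then retry

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Add one more bit: the directory doubles in size.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth, self.capacity)
        high_bit = 1 << (bucket.depth - 1)
        # Directory slots with the new bit set now point to the sibling bucket.
        for i, b in enumerate(self.directory):
            if b is bucket and i & high_bit:
                self.directory[i] = sibling
        # Re-compute the address of every record in the split bucket.
        for k in list(bucket.items):
            if hash(k) & high_bit:
                sibling.items[k] = bucket.items.pop(k)

table = ExtendibleHash(capacity=2)
for k in range(10):
    table.insert(k, k * k)
```

Starting from one bit and two buckets, inserting the keys 0 to 9 forces the directory to double twice, while every key stays reachable through the current number of bits.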
Hashing is not favorable when the data is organized in some ordering and queries require a range of
data. When data is discrete and random, hashing performs best.
Hashing algorithms and their implementation have higher complexity than indexing. All hash
operations are done in constant time.
DBMS - Transaction
A transaction can be defined as a group of tasks. A single task is the minimum processing unit of
work, which cannot be divided further.
An example of a transaction involves the bank accounts of two users, say A and B. When a bank
employee transfers an amount of Rs. 500 from A's account to B's account, a number of tasks are
executed behind the scenes. This very simple and small transaction includes several steps. To
decrease A's bank account by 500:
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)
In simple words, the transaction involves many tasks, such as opening the account of A, reading the
old balance, subtracting 500 from it, saving the new balance to the account of A and finally
closing it. To add the amount of 500 to B's account, the same sort of tasks need to be done:
Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)
A simple transaction of moving an amount of 500 from A to B involves many low-level tasks.
ACID Properties
A transaction may contain several low-level tasks, and a transaction is itself a very small unit of
a program. A transaction in a database system must maintain certain properties in order to ensure
its accuracy, completeness and data integrity. These properties are referred to as the ACID
properties and are described below:
Atomicity: Though a transaction involves several low-level operations, this property states that a
transaction must be treated as an atomic unit, that is, either all of its operations are executed
or none. There must be no state in the database where the transaction is left partially completed.
States should be defined either before the execution of the transaction or after the
execution/abortion/failure of the transaction.
Consistency: This property states that after the transaction is finished, the database must remain
in a consistent state. There must not be any possibility that some data is incorrectly affected by
the execution of the transaction. If the database was in a consistent state before the execution of
the transaction, it must remain in a consistent state after the execution of the transaction.
Durability: This property states that in any case all updates made to the database will persist,
even if the system fails and restarts. If a transaction writes or updates some data in the database
and commits, that data will always be there in the database. If the transaction commits but the data
is not written to the disk and the system fails, that data will be updated once the system comes up.
Isolation: In a database system where more than one transaction is being executed simultaneously
and in parallel, the property of isolation states that each transaction will be carried out and
executed as if it were the only transaction in the system. No transaction will affect the existence
of any other transaction.
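Atomicity can be observed with the transfer example, using Python's sqlite3 module. The table layout, the starting balances and the CHECK constraint that forbids negative balances are assumptions added so that the second transfer has a reason to fail part-way through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 800), ("B", 200)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates persist or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

first_ok = transfer(conn, "A", "B", 500)     # succeeds: A has 800
after_first = balances(conn)
second_ok = transfer(conn, "A", "B", 500)    # A would go negative: rolled back
after_second = balances(conn)
```

In the second transfer the credit to B has already been applied when the debit from A violates the constraint; the rollback undoes the credit as well, so the balances are unchanged.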
Serializability
When more than one transaction is executed by the operating system in a multiprogramming
environment, there is a possibility that the instructions of one transaction are interleaved with
those of some other transaction.
Schedule: A chronological execution sequence of transactions is called a schedule. A schedule can
have many transactions in it, each comprising a number of instructions/tasks.
Serial Schedule: A schedule in which transactions are aligned in such a way that one transaction is
executed first. When the first transaction completes its cycle, the next transaction is executed.
Transactions are ordered one after the other. This type of schedule is called a serial schedule, as
transactions are executed in a serial manner.
In a multi-transaction environment, serial schedules are considered a benchmark. The execution
sequence of the instructions within a transaction cannot be changed, but the instructions of two
transactions can be interleaved in an arbitrary fashion. This does no harm if the two transactions
are mutually independent and work on different segments of data, but if they work on the same data,
results may vary. This ever-varying result may leave the database in an inconsistent state.
To resolve the problem, we allow parallel execution of a transaction schedule if its transactions
are either serializable or have some equivalence relation between or among them.
Equivalence schedules: Schedules can be equivalent in the following ways:
Result Equivalence:
If two schedules produce the same results after execution, they are said to be result equivalent. They may yield
the same result for some values and different results for other values. That's why this
equivalence is not generally considered significant.
View Equivalence:
Two schedules are view equivalent if the transactions in both schedules perform similar actions in a similar
manner.
For example:
If T reads initial data in S1, then T also reads initial data in S2
If T reads a value written by J in S1, then T also reads the value written by J in S2
If T performs the final write on a data value in S1, then T also performs the final write on that data value in S2
Conflict Equivalence:
Two operations are said to be conflicting if they have the following properties:
Both belong to separate transactions
Both access the same data item
At least one of them is a "write" operation
Two schedules that have more than one transaction with conflicting operations are said to be conflict
equivalent if and only if:
Both schedules contain the same set of transactions
The order of conflicting pairs of operations is maintained in both schedules
A schedule that is view equivalent to a serial schedule is view serializable, and a schedule that is conflict
equivalent to a serial schedule is conflict serializable. All conflict serializable schedules are view serializable too.
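The conflict rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an operation is modelled as a tuple (transaction, action, item) with action "R" or "W", each operation appears at most once in a schedule, and the function names are made up for this example.

```python
from itertools import combinations


def conflicts(op_a, op_b):
    """True if two operations conflict: different transactions,
    same data item, and at least one of them is a write."""
    txn_a, act_a, item_a = op_a
    txn_b, act_b, item_b = op_b
    return txn_a != txn_b and item_a == item_b and "W" in (act_a, act_b)


def conflict_pairs(schedule):
    """Set of (earlier op, later op) pairs that conflict in a schedule.
    combinations() keeps the schedule order inside each pair."""
    return {(a, b) for a, b in combinations(schedule, 2) if conflicts(a, b)}


def conflict_equivalent(s1, s2):
    """Same operations, and every conflicting pair in the same order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)


# Two transactions touching different items: any interleaving is equivalent.
serial = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "Y"), ("T2", "W", "Y")]
interleaved = [("T1", "R", "X"), ("T2", "R", "Y"), ("T1", "W", "X"), ("T2", "W", "Y")]

# Two transactions touching the same item: this interleaving reorders a conflict.
serial_x = [("T1", "R", "X"), ("T1", "W", "X"), ("T2", "R", "X"), ("T2", "W", "X")]
mixed_x = [("T1", "R", "X"), ("T2", "R", "X"), ("T1", "W", "X"), ("T2", "W", "X")]
```

Here the first pair of schedules is conflict equivalent, while in the second pair T2 reads X before T1 writes it, so the order of a conflicting pair differs from the serial schedule.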
States of Transactions:
A transaction in a database can be in one of the following states:
Active: In this state the transaction is being executed. This is the initial state of every transaction.
Partially Committed: When a transaction executes its final operation, it is said to be in this state. After
execution of all operations, the database system performs some checks, e.g. the consistency state of the
database after applying the output of the transaction onto the database.
Failed: If any check made by the database recovery system fails, the transaction is said to be in the failed state,
from where it can no longer proceed.
Aborted: If any of the checks fails and the transaction has reached the failed state, the recovery manager rolls back
all its write operations on the database to bring the database back to the state it was in prior to the start of
execution of the transaction. Transactions in this state are called aborted. The database recovery module can
select one of two operations after a transaction aborts:
Re-start the transaction
Kill the transaction
Committed: If a transaction executes all its operations successfully, it is said to be committed. All its effects
are now permanently made on the database system.
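The states above form a small state machine, which can be sketched as a transition table. The state names and the advance() helper are illustrative only, and the edge from active straight to failed is an assumption taken from the usual textbook diagram rather than from the text above.

```python
# Allowed moves between the transaction states described above.
TRANSITIONS = {
    "active": {"partially_committed", "failed"},   # active -> failed is assumed
    "partially_committed": {"committed", "failed"},
    "failed": {"aborted"},
    "aborted": set(),     # terminal: the recovery module restarts or kills it
    "committed": set(),   # terminal: effects are now permanent
}


def advance(state, new_state):
    """Return new_state if the move is legal, otherwise raise ValueError."""
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state
```

For example, a transaction may go active, partially_committed, committed, but a committed transaction cannot move to aborted.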
DBMS - Concurrency Control
In a multiprogramming environment where more than one transaction can be executed concurrently,
protocols are needed to control the concurrency of transactions and to ensure the atomicity and
isolation properties of transactions.
Concurrency control protocols that ensure serializability of transactions are most desirable.
Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock based protocols
Database systems equipped with lock-based protocols use a mechanism by which no
transaction can read or write data until it acquires the appropriate lock on it. Locks are of two kinds:
Binary Locks: a lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive: this type of locking mechanism differentiates locks based on their use. If a lock is
acquired on a data item to perform a write operation, it is an exclusive lock, because allowing more than
one transaction to write on the same data item would lead the database into an inconsistent state. Read
locks are shared because no data value is being changed.
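A toy lock table makes the shared/exclusive distinction concrete. This is a minimal sketch, not how any particular DBMS implements locking: the class name, the "S"/"X" mode letters, and the choice to refuse a request instead of queuing it are all assumptions made for the example.

```python
class LockTable:
    """Toy shared/exclusive lock table. Requests are granted or refused
    immediately; a real lock manager would queue the refused ones."""

    def __init__(self):
        self.holders = {}  # item -> (mode, set of transaction ids)

    def acquire(self, txn, item, mode):
        """mode is 'S' (shared, read) or 'X' (exclusive, write)."""
        if item not in self.holders:
            self.holders[item] = (mode, {txn})
            return True
        held_mode, txns = self.holders[item]
        if mode == "S" and held_mode == "S":
            txns.add(txn)          # any number of readers may share
            return True
        if txns == {txn}:
            if mode == "X":        # sole holder may upgrade S -> X
                self.holders[item] = ("X", txns)
            return True
        return False               # conflicting lock held by someone else

    def release(self, txn, item):
        _mode, txns = self.holders.get(item, (None, set()))
        txns.discard(txn)
        if not txns:
            self.holders.pop(item, None)
```

Two readers can hold item A at the same time, but a writer is refused until it is the only holder.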
There are four types of lock protocols available:
Simplistic
Simplistic lock-based protocols require a transaction to obtain a lock on every object before a 'write' operation
is performed. As soon as the 'write' has been done, the transaction may unlock the data item.
Pre-claiming
In this protocol, a transaction evaluates its operations and creates a list of data items on which it
needs locks. Before starting execution, the transaction requests from the system all the locks it needs
beforehand. If all the locks are granted, the transaction executes and releases all the locks when all its
operations are over. If not all the locks are granted, the transaction rolls back and waits until all
locks are granted.
Two Phase Locking - 2PL
This locking protocol divides the transaction execution phase into three parts. In the first part, when the
transaction starts executing, it seeks grants for the locks it needs as it executes. The second part is
where the transaction has acquired all its locks and no other lock is required. The transaction keeps executing its
operations. As soon as the transaction releases its first lock, the third part starts. In this part the
transaction cannot demand any lock but only releases the acquired locks.
Two phase locking has two phases: one is growing, where all locks are being acquired by the transaction, and
the second one is shrinking, where locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade
it to an exclusive lock.
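The two-phase rule (no lock may be requested once the first lock has been released) can be checked mechanically. The sketch below assumes a transaction's lock activity is given as a list of ("lock" or "unlock", item) pairs; the function name is made up for this example.

```python
def follows_two_phase_locking(actions):
    """actions: list of ('lock' | 'unlock', item) issued by ONE transaction,
    in order. Returns True if no lock is requested after the first unlock."""
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True       # growing phase is over
        elif kind == "lock" and shrinking:
            return False           # lock request in the shrinking phase
    return True


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

The first sequence obeys 2PL; the second does not, because B is locked after A has already been released.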
Strict Two Phase Locking
The first phase of Strict-2PL is the same as in 2PL. After acquiring all locks in the first phase, the transaction
continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock as soon as it is no
longer required; it holds all locks until the commit state arrives. Strict-2PL releases all locks at once at the
commit point.
Time stamp based protocols
A commonly used concurrency protocol is the time-stamp based protocol. This protocol uses either
system time or a logical counter as a time-stamp.
Lock based protocols manage the order between conflicting pairs among transactions at the time of
execution, whereas time-stamp based protocols start working as soon as a transaction is created.
Every transaction has a time-stamp associated with it, and the ordering is determined by the age of the
transaction. A transaction created at clock time 0002 would be older than all other transactions that
come after it. For example, any transaction 'y' entering the system at 0004 is two seconds younger, and
priority may be given to the older one.
In addition, every data item is given the latest read and write time-stamp. This lets the system know
when the last read and write operations were made on the data item.
Time-stamp ordering protocol
The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and
write operations. It is the responsibility of the protocol system that the conflicting pair of tasks
be executed according to the timestamp values of the transactions.
Time-stamp of transaction Ti is denoted as TS(Ti).
Read time-stamp of data-item X is denoted by R-timestamp(X).
Write time-stamp of data-item X is denoted by W-timestamp(X).
Timestamp ordering protocol works as follows:
If a transaction Ti issues read(X) operation:
If TS(Ti) < W-timestamp(X)
Operation rejected.
If TS(Ti) >= W-timestamp(X)
Operation executed.
All data-item timestamps updated.
If a transaction Ti issues write(X) operation:
If TS(Ti) < R-timestamp(X)
Operation rejected.
If TS(Ti) < W-timestamp(X)
Operation rejected and Ti rolled back.
Otherwise, operation executed.
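The read and write rules above translate almost line for line into code. In this sketch the class and method names are invented, timestamps are plain integers, an item that was never read or written is treated as having timestamp 0, and a rejected operation simply returns "reject" (the caller would roll the transaction back).

```python
class TimestampOrdering:
    """Basic timestamp-ordering checks for read(X) and write(X)."""

    def __init__(self):
        self.r_ts = {}  # item -> R-timestamp(X), largest TS that read X
        self.w_ts = {}  # item -> W-timestamp(X), TS of the latest write

    def read(self, ts, item):
        if ts < self.w_ts.get(item, 0):
            return "reject"   # X was already overwritten by a younger transaction
        self.r_ts[item] = max(self.r_ts.get(item, 0), ts)
        return "ok"

    def write(self, ts, item):
        if ts < self.r_ts.get(item, 0) or ts < self.w_ts.get(item, 0):
            return "reject"   # a younger transaction already read or wrote X
        self.w_ts[item] = ts
        return "ok"
```

For instance, after a transaction with timestamp 2 writes X, a read by timestamp 1 is rejected, a read by timestamp 3 succeeds, and a later write by timestamp 2 is rejected because timestamp 3 has already read X.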
Thomas' Write rule:
Under the basic timestamp ordering rules, in the case of:
If TS(Ti) < W-timestamp(X)
the operation is rejected and Ti is rolled back. The timestamp ordering rules can be modified to make the schedule
view serializable: instead of rolling Ti back, the 'write' operation itself is ignored.
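As a rough sketch, the modified write check differs from the basic one only in its second branch. The function name, the dictionary arguments and the default timestamp of 0 for untouched items are assumptions made for this example.

```python
def write_with_thomas_rule(ts, item, r_ts, w_ts):
    """Write check with Thomas' Write rule.
    r_ts / w_ts map items to their read / write timestamps."""
    if ts < r_ts.get(item, 0):
        return "reject"   # a younger transaction already read X: roll back
    if ts < w_ts.get(item, 0):
        return "ignore"   # obsolete write: skip it, no rollback
    w_ts[item] = ts
    return "ok"
```

With W-timestamp(X) = 5, a write from a transaction with timestamp 3 is silently skipped and X keeps the newer value, where the basic protocol would have rolled that transaction back.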
DBMS - Deadlock
In a multi-process system, deadlock is a situation that arises in a shared resource environment where a
process indefinitely waits for a resource held by some other process, which in turn is waiting for a
resource held by some other process.
For example, assume a set of transactions {T0, T1, T2, ..., Tn}. T0 needs a resource X to complete its task.
Resource X is held by T1, and T1 is waiting for a resource Y, which is held by T2. T2 is waiting for resource
Z, which is held by T0. Thus, all processes wait for each other to release resources. In this situation, none
of the processes can finish its task. This situation is known as 'deadlock'.
Deadlock is not a good phenomenon for a healthy system. To keep the system deadlock free, a few methods
can be used. In case the system is stuck because of a deadlock, the transactions involved in the
deadlock are rolled back and restarted.
Deadlock Prevention
To prevent any deadlock situation in the system, the DBMS aggressively inspects all the operations
that transactions are about to execute. The DBMS inspects the operations and analyzes whether they can create a
deadlock situation. If it finds that a deadlock situation might occur, then that transaction is never allowed
to be executed.
There are deadlock prevention schemes that use the time-stamp ordering mechanism of transactions in
order to pre-decide a deadlock situation.
Wait-Die Scheme:
In this scheme, if a transaction requests to lock a resource (data item) that is already held with a
conflicting lock by some other transaction, one of two possibilities may occur:
If TS(Ti) < TS(Tj), that is Ti, which is requesting a conflicting lock, is older than Tj, Ti is allowed to wait
until the data-item is available.
If TS(Ti) > TS(Tj), that is Ti is younger than Tj, Ti dies. Ti is restarted later with a random delay but with the
same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme:
In this scheme, if a transaction requests to lock a resource (data item) that is already held with a
conflicting lock by some other transaction, one of two possibilities may occur:
If TS(Ti) < TS(Tj), that is Ti, which is requesting a conflicting lock, is older than Tj, Ti forces Tj to be rolled
back, that is Ti wounds Tj. Tj is restarted later with a random delay but with the same timestamp.
If TS(Ti) > TS(Tj), that is Ti is younger than Tj, Ti is forced to wait until the resource is available.
This scheme allows the younger transaction to wait, but when an older transaction requests an item held
by a younger one, the older transaction forces the younger one to abort and release the item.
In both cases, the transaction that entered the system later is aborted.
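Both schemes reduce to a single timestamp comparison between the requesting transaction Ti and the holder Tj. The two functions below are a minimal sketch with invented names and return strings; a smaller timestamp means an older transaction.

```python
def wait_die(ts_requester, ts_holder):
    """Wait-Die: an older requester waits, a younger requester dies."""
    return "wait" if ts_requester < ts_holder else "die"


def wound_wait(ts_requester, ts_holder):
    """Wound-Wait: an older requester wounds (aborts) the holder,
    a younger requester waits."""
    return "wound holder" if ts_requester < ts_holder else "wait"
```

With Ti at timestamp 2 and Tj at timestamp 4, Ti requesting Tj's item waits under Wait-Die and wounds Tj under Wound-Wait; in the reverse situation the younger requester dies under Wait-Die and waits under Wound-Wait. Either way, only the younger transaction is ever aborted.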
Deadlock Avoidance
Aborting a transaction is not always a practical approach. Instead, deadlock avoidance mechanisms can
be used to detect any deadlock situation in advance. Methods like the "wait-for graph" are available, but they suit
systems where transactions are lightweight and hold fewer instances of resources. In a
bulky system, deadlock prevention techniques may work well.
Wait-for Graph
This is a simple method available to track whether any deadlock situation may arise. For each transaction
entering the system, a node is created. When transaction Ti requests a lock on an item, say X, which
is held by some other transaction Tj, a directed edge is created from Ti to Tj. If Tj releases item X, the
edge between them is dropped and Ti locks the data item.
The system maintains this wait-for graph for every transaction waiting for some data items held by
others. The system keeps checking whether there is any cycle in the graph.
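Checking the wait-for graph for a cycle is an ordinary depth-first search. The sketch below assumes the graph is a dictionary mapping each transaction to the set of transactions it is waiting on; the function name and the three-colour marking are choices made for this example.

```python
def has_deadlock(wait_for):
    """wait_for maps each transaction to the set of transactions it waits on.
    Returns True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GREY
        for nxt in wait_for.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True        # edge back into the current path: a cycle
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(wait_for))


# T0 waits for T1, T1 waits for T2, T2 waits for T0.
cyclic = {"T0": {"T1"}, "T1": {"T2"}, "T2": {"T0"}}
acyclic = {"T0": {"T1"}, "T1": {"T2"}, "T2": set()}
```

The first graph is deadlocked; in the second, T2 waits on nobody, so it can finish and the chain unwinds.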
DBMS - Data Backup
Failure with loss of Volatile storage
What would happen if volatile storage like RAM abruptly crashes? All transactions that are
being executed are kept in main memory. All active logs, disk buffers and related data are stored in
volatile storage.
When storage like RAM fails, it takes away all the logs and the active copy of the database. This makes recovery
almost impossible, as everything needed to recover is also lost. The following techniques may be adopted in
case of loss of volatile storage.
A mechanism like checkpoint can be adopted, which saves the entire content of the database
periodically.
The state of the active database in volatile memory can be dumped onto stable storage periodically, which
may also contain logs, active transactions and buffer blocks.
<dump> can be marked on the log file whenever the database contents are dumped from volatile
memory to a stable one.
Recovery:
When the system recovers from failure, it can restore the latest dump.
It can maintain a redo-list and an undo-list as in checkpoints.
It can recover the system by consulting the undo-redo lists to restore the state of all transactions up to the last
checkpoint.
Database backup & recovery from catastrophic failure
So far we have not discovered any other planet in our solar system that may have life on it, and our
own earth is not that safe. In case of a catastrophic failure like an alien attack, the database administrator
may still be forced to recover the database.
Remote backup, described next, is one of the solutions to save life. Alternatively, whole database
backups can be taken on magnetic tapes and stored at a safer place. This backup can later be restored
on a freshly installed database, bringing it to the state it was in at the point of backup.
Grown-up databases are too large to be backed up frequently. Instead, we are aware of techniques
where we can restore a database just by looking at its logs. So backing up logs at a frequent rate is more
feasible than backing up the entire database. The database can be backed up once a week, and the logs, being very small,
can be backed up every day or as frequently as every hour.
Remote Backup
Remote backup provides a sense of security and safety in case the primary location where the database
is located gets destroyed. Remote backup can be offline, or real-time and online. If it is offline, it is
maintained manually.
Online backup systems are more real-time and are lifesavers for database administrators and investors. An online
backup system is a mechanism where every bit of real-time data is backed up simultaneously at two
distant places. One of them is directly connected to the system and the other is kept at a remote place as
backup.
As soon as the primary database storage fails, the backup system senses the failure and switches the user
system to the remote storage. Sometimes this is so instant that users cannot even notice a failure.
DBMS - Data Recovery
Crash Recovery
Though we are living in a highly technologically advanced era where hundreds of satellites monitor the
earth and billions of people are connected through information technology every second, failure is
expected but not always acceptable.
A DBMS is a highly complex system with hundreds of transactions being executed every second. The availability
of a DBMS depends on its complex architecture and the underlying hardware or system software. If it fails or
crashes amid transactions being executed, it is expected that the system would follow some sort of
algorithm or technique to recover from crashes or failures.
Failure Classification
To see where the problem has occurred, we generalize failures into various categories, as follows:
Transaction failure
When a transaction fails to execute, or reaches a point after which it cannot be completed
successfully, it has to abort. This is called transaction failure, where only a few transactions or processes are
hurt.
Reasons for transaction failure could be:
Logical errors: where a transaction cannot complete because it has some code error or an internal
error condition
System errors: where the database system itself terminates an active transaction because the DBMS is not
able to execute it, or it has to stop because of some system condition. For example, in case of deadlock
or resource unavailability, the system aborts an active transaction.
System crash
There are problems external to the system that may cause it to stop abruptly and
crash, for example an interruption in power supply, or a failure of the underlying hardware or
software.
Examples may include operating system errors.
Disk failure:
In the early days of technology evolution, it was a common problem that hard disk drives or storage drives
failed frequently.
Disk failures include formation of bad sectors, unreachability of the disk, a disk head crash or any other
failure that destroys all or part of disk storage.
Storage Structure
We have already described the storage system earlier. In brief, the storage structure can be divided into various
categories:
Volatile storage: As the name suggests, this storage does not survive system crashes and is mostly placed very
close to the CPU, often embedded on the chipset itself. Examples: main memory, cache memory.
They are fast but can store only a small amount of information.
Nonvolatile storage: These memories are made to survive system crashes. They are huge in data storage
capacity but slower in accessibility. Examples include hard disks, magnetic tapes, flash memory,
non-volatile (battery backed up) RAM.
Recovery and Atomicity
When a system crashes, it may have several transactions being executed and various files opened for
them to modify data items. As we know, transactions are made of various operations, which are
atomic in nature. But according to the ACID properties of DBMS, atomicity of transactions as a whole must
be maintained, that is, either all operations are executed or none.
When a DBMS recovers from a crash it should maintain the following:
It should check the states of all transactions that were being executed.
A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction
in this case.
It should check whether the transaction can be completed now or needs to be rolled back.
No transaction would be allowed to leave the DBMS in an inconsistent state.
There are two types of techniques that can help a DBMS in recovering as well as maintaining the
atomicity of transactions:
Maintaining the logs of each transaction, and writing them onto some stable storage before actually
modifying the database.
Maintaining shadow paging, where the changes are done on volatile memory and later the actual
database is updated.
Log-Based Recovery
A log is a sequence of records that maintains the records of actions performed by a transaction. It is
important that the logs are written prior to the actual modification and stored on a stable storage medium,
which is failsafe.
Log based recovery works as follows:
The log file is kept on a stable storage medium
When a transaction enters the system and starts execution, it writes a log about it
<Tn, Start>
When the transaction modifies an item X, it writes logs as follows:
<Tn, X, V1, V2>
This reads: Tn has changed the value of X from V1 to V2.
When the transaction finishes, it logs:
<Tn, commit>
The database can be modified using two approaches:
Deferred database modification: All logs are written onto the stable storage and the database is updated
when the transaction commits.
Immediate database modification: Each log follows an actual database modification. That is, the database is
modified immediately after every operation.
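The three log records can be sketched with a tiny in-memory model of immediate modification. Everything here is illustrative: the class name is invented, the Python list only stands in for a log file on stable storage, and records are tuples mirroring <Tn, Start>, <Tn, X, V1, V2> and <Tn, commit>.

```python
class LoggedDatabase:
    """Toy database that appends a log record before each change."""

    def __init__(self):
        self.data = {}   # item -> current value
        self.log = []    # stand-in for the log file on stable storage

    def start(self, txn):
        self.log.append((txn, "start"))

    def write(self, txn, item, new_value):
        old_value = self.data.get(item)
        # The log record is written first, then the item is modified.
        self.log.append((txn, item, old_value, new_value))
        self.data[item] = new_value

    def commit(self, txn):
        self.log.append((txn, "commit"))


db = LoggedDatabase()
db.data["X"] = 10
db.start("T1")
db.write("T1", "X", 20)
db.commit("T1")
```

After this run the log holds the start record, the record saying T1 changed X from 10 to 20, and the commit record, in that order. A deferred-modification variant would buffer the new values and apply them only inside commit().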
Recovery with concurrent transactions
When more than one transaction is being executed in parallel, the logs are interleaved. At the time of
recovery it would become hard for the recovery system to backtrack all logs and then start recovering. To
ease this situation most modern DBMSs use the concept of 'checkpoints'.
Checkpoint
Keeping and maintaining logs in real time and in a real environment may fill up all the memory space
available in the system. As time passes the log file may become too big to be handled at all. Checkpoint is a
mechanism where all the previous logs are removed from the system and stored permanently on a storage
disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions
were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following manner:
The recovery system reads the logs backwards from the end to the last checkpoint.
It maintains two lists, an undo-list and a redo-list.
If the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>, it puts the
transaction in the redo-list.
If the recovery system sees a log with <Tn, Start> but no commit or abort log, it puts the
transaction in the undo-list.
All transactions in the undo-list are then undone and their logs are removed. For all transactions in the redo-list,
their previous logs are removed, and then they are redone and their logs saved.
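The undo-list / redo-list procedure can be sketched as follows. This is a simplified illustration, not a full recovery algorithm: it assumes the log slice starts at the last checkpoint, that records are tuples of the form (txn, "start"), (txn, item, old, new) and (txn, "commit"), that no abort records are present, and that the database is a plain dictionary. The function name is invented.

```python
def recover(log, data):
    """log: records written after the last checkpoint, oldest first.
    data: the database state found after the crash (modified in place).
    Returns (redo_list, undo_list) as sets of transaction ids."""
    committed = {rec[0] for rec in log if rec[1:] == ("commit",)}
    seen = {rec[0] for rec in log}
    redo_list = committed
    undo_list = seen - committed   # started but never committed

    # Undo: walk the log backwards, restoring old values.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] in undo_list:
            data[rec[1]] = rec[2]

    # Redo: walk the log forwards, reapplying new values.
    for rec in log:
        if len(rec) == 4 and rec[0] in redo_list:
            data[rec[1]] = rec[3]

    return redo_list, undo_list


log = [
    ("T1", "start"), ("T1", "X", 10, 20),
    ("T2", "start"), ("T2", "Y", 1, 2),
    ("T1", "commit"),
]
data = {"X": 10, "Y": 2}   # crash state: T1's write lost, T2's write on disk
redo_list, undo_list = recover(log, data)
```

In this run T1 lands in the redo-list and T2 in the undo-list, so X is brought forward to 20 and Y is rolled back to 1.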

Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17
 
DiskStorage_BasicFileStructuresandHashing.pdf
DiskStorage_BasicFileStructuresandHashing.pdfDiskStorage_BasicFileStructuresandHashing.pdf
DiskStorage_BasicFileStructuresandHashing.pdf
 
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
Healthy Minds, Flourishing Lives: A Philosophical Approach to Mental Health a...
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
Objectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptxObjectives n learning outcoms - MD 20240404.pptx
Objectives n learning outcoms - MD 20240404.pptx
 
Comparative Literature in India by Amiya dev.pptx
Comparative Literature in India by Amiya dev.pptxComparative Literature in India by Amiya dev.pptx
Comparative Literature in India by Amiya dev.pptx
 
An Overview of the Calendar App in Odoo 17 ERP
An Overview of the Calendar App in Odoo 17 ERPAn Overview of the Calendar App in Odoo 17 ERP
An Overview of the Calendar App in Odoo 17 ERP
 
Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...Introduction to Research ,Need for research, Need for design of Experiments, ...
Introduction to Research ,Need for research, Need for design of Experiments, ...
 
ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6ICS 2208 Lecture Slide Notes for Topic 6
ICS 2208 Lecture Slide Notes for Topic 6
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 

DBMS FOR STUDENTS MUST DOWNLOAD AND READ

  • 1. Complete database
A database is a collection of data that is related by some aspect. Data is a collection of facts and figures that can be processed to produce information. The name of a student, her age, class and subjects can be counted as data for recording purposes.
Mostly, data represents recordable facts. Data aids in producing information, which is based on facts. For example, if we have data about the marks obtained by all students, we can then draw conclusions about toppers, average marks and so on.
A database management system (DBMS) stores data in such a way that it is easier to retrieve, manipulate and use to produce information.
Characteristics
Traditionally, data was organized in file formats. The DBMS was an entirely new concept then, and the research behind it was aimed at overcoming the deficiencies of the traditional style of data management. A modern DBMS has the following characteristics:
Real-world entity: A modern DBMS is more realistic and uses real-world entities to design its architecture. It uses their behavior and attributes too. For example, a school database may use student as an entity and age as an attribute.
Relation-based tables: A DBMS allows entities and the relations among them to form tables. This eases the concept of data saving. A user can understand the architecture of a database just by looking at the table names.
Isolation of data and application: A database system is entirely different from its data. The database is said to be the active entity, while data is said to be the passive one on which the database works and which it organizes. A DBMS also stores metadata, which is data about data, to ease its own processing.
Less redundancy: A DBMS follows the rules of normalization, which splits a relation when any of its attributes has redundancy in its values. Following normalization, which is itself a mathematically rich and scientific process, makes the entire database contain as little redundancy as possible.
Consistency: A DBMS always maintains a state of consistency, whereas earlier forms of data-storing applications such as file processing do not guarantee this. Consistency is a state where every relation in the database remains consistent. There exist methods and techniques that can detect an attempt to leave the database in an inconsistent state.
Query language: A DBMS is equipped with a query language, which makes it more efficient to retrieve and manipulate data. A user can apply as many and as varied filtering options as he or she wants. Traditionally this was not possible where a file-processing system was used.
ACID properties: A DBMS follows the concepts of the ACID properties, which stands for Atomicity, Consistency, Isolation and Durability. These concepts are applied to transactions, which manipulate data in the database. The ACID properties keep the database in a healthy state in a multi-transactional environment and in case of failure.
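As an illustration of atomicity, the following is a minimal sketch using Python's standard `sqlite3` module. The `account` table, the names `alice` and `bob`, and the helpers `transfer` and `balances` are hypothetical and not part of the original slides. The transfer credits one account and then debits the other; if the debit would violate the CHECK constraint, the whole transaction is rolled back, so the credit is undone as well.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (all or nothing)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a failing debit has something to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; nothing was kept.
        return False

def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))

# Hypothetical sample data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok_small = transfer(conn, "alice", "bob", 30)   # fits within alice's balance
after_small = balances(conn)
ok_large = transfer(conn, "alice", "bob", 500)  # would overdraw alice
after_large = balances(conn)
```

After the failed transfer the balances are the same as before it, which is the "all or nothing" behavior that atomicity describes. Isolation and durability are not exercised by this sketch.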
  • 2. Multiuser and concurrent access: A DBMS supports a multi-user environment and allows users to access and manipulate data in parallel. There are restrictions on transactions when they attempt to handle the same data item, but users are always unaware of them.
Multiple views: A DBMS offers multiple views for different users. A user in the sales department will have a different view of the database than a person working in the production department. This enables users to have a concentrated view of the database according to their requirements.
Security: Features like multiple views offer security to some extent, since users are unable to access the data of other users and departments. A DBMS offers methods to impose constraints while entering data into the database and while retrieving it at a later stage. A DBMS offers many different levels of security features, which enable multiple users to have different views with different features. For example, a user in the sales department cannot see the data of the purchase department; additionally, how much of the sales department's data he can see can also be managed. Because a DBMS does not save data on disk the way a traditional file system does, it is very hard for a thief to break the code.
Users
A DBMS is used by various users for various purposes. Some may be involved in retrieving data and some in backing it up. Some of them are described as follows:
Administrators: A group of users who maintain the DBMS and are responsible for administrating the database. They look after its usage and decide by whom it should be used. They create user access and apply limitations to maintain isolation and enforce security. Administrators also look after DBMS resources such as the system license, the software applications and tools required, and other hardware-related maintenance.
Designers: This is the group of people who actually work on the design of the database. The actual database starts with requirement analysis followed by a good design process. These people keep a close watch on what data should be kept and in what format. They identify and design the whole set of entities, relations, constraints and views.
End users: This group contains the persons who actually take advantage of the database system. End users can be just viewers who pay attention to logs or market rates, or they can be as sophisticated as business analysts who get the most out of it.
DBMS - Architecture
The design of a database management system depends highly on its architecture. It can be centralized, decentralized or hierarchical. DBMS architecture can be seen as single-tier or multi-tier. An n-tier architecture divides the whole system into n related but independent modules, which can be independently modified, altered, changed or replaced.
In 1-tier architecture, the DBMS is the only entity: the user sits directly on the DBMS and uses it. Any changes done here are done directly on the DBMS itself. It does not provide handy tools for end users, so it is mainly database designers and programmers who use single-tier architecture.
  • 3. If the architecture of the DBMS is 2-tier, then there must be some application that uses the DBMS. Programmers use 2-tier architecture where they access the DBMS by means of an application. Here the application tier is entirely independent of the database in terms of operation, design and programming.
3-tier architecture
The most widely used architecture is the 3-tier architecture. It separates its tiers from each other on the basis of users. It is described as follows:
Database (Data) Tier: At this tier, only the database resides. The database, along with its query processing languages, sits in layer 3 of the 3-tier architecture. It also contains all relations and their constraints.
Application (Middle) Tier: At this tier reside the application server and the programs that access the database. For a user, this application tier works as an abstracted view of the database. Users are unaware of any existence of the database beyond the application. For the database tier, the application tier is its user; the database tier is not aware of any other user beyond the application tier. This tier works as a mediator between the two.
User (Presentation) Tier: An end user sits on this tier. From a user's perspective, this tier is everything. He/she doesn't know about any existence or form of the database beyond this layer. At this layer, multiple views of the database can be provided by the application. All views are generated by applications, which reside in the application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are independent and can be changed independently.
DBMS - Data Models
A data model tells how the logical structure of a database is modeled. Data models are fundamental entities for introducing abstraction in a DBMS. Data models define how data is connected to each other and how it will be processed and stored inside the system.
The very first data models were flat data models, where all the data was kept in the same plane. Because earlier data models were not so scientific, they were prone to introducing lots of duplication and update anomalies.
Entity-Relationship Model
The Entity-Relationship model is based on the notion of real-world entities and the relationships among them.
While formulating a real-world scenario into a database model, the ER Model creates entity sets, relationship sets, general attributes and constraints.
The ER Model is best used for the conceptual design of a database.
The ER Model is based on:
Entities and their attributes
  • 4. Relationships among entities
These concepts are explained below.
Entity
An entity in the ER Model is a real-world entity, which has some properties called attributes. Every attribute is defined by its set of values, called its domain. For example, in a school database, a student is considered an entity. A student has various attributes like name, age and class.
Relationship
The logical association among entities is called a relationship. Relationships are mapped with entities in various ways. Mapping cardinalities define the number of associations between two entities.
Mapping cardinalities:
one to one
one to many
many to one
many to many
The ER Model is explained in more detail later in this document.
Relational Model
The most popular data model in DBMS is the Relational Model. It is a more scientific model than the others. This model is based on first-order predicate logic and defines a table as an n-ary relation.
The main highlights of this model are:
Data is stored in tables called relations.
  • 5. Relations can be normalized.
In normalized relations, the values saved are atomic values.
Each row in a relation contains a unique value.
Each column in a relation contains values from the same domain.
The Relational Model is explained in more detail later in this document.
DBMS - Data Schemas
Database schema
A database schema is the skeleton structure that represents the logical view of the entire database. It tells how the data is organized and how the relations among the data are associated. It formulates all the constraints that are to be applied to the data in the relations that reside in the database.
A database schema defines its entities and the relationships among them. A database schema is a descriptive detail of the database, which can be depicted by means of schema diagrams. All these activities are done by the database designer to help programmers understand all aspects of the database.
A database schema can be divided broadly into two categories:
Physical Database Schema: This schema pertains to the actual storage of data and its form of storage, such as files and indices. It defines how the data will be stored in secondary storage.
Logical Database Schema: This defines all the logical constraints that need to be applied to the stored data. It defines tables, views, integrity constraints and so on.
Database Instance
  • 6. It is important that we distinguish these two terms individually. The database schema is the skeleton of the database. It is designed when the database doesn't exist at all, and it is very hard to make any changes once the database is operational. A database schema does not contain any data or information.
A database instance is the state of an operational database with its data at any given time. It is a snapshot of the database. Database instances tend to change with time. The DBMS ensures that every instance (state) is a valid state by keeping up with all the validations, constraints and conditions that the database designers have imposed or that are expected from the DBMS itself.
DBMS - Data Independence
If the database system is not multi-layered, then it will be very hard to make any changes in the database system. Database systems are designed in multiple layers, as we learnt earlier.
Data Independence:
There is a lot of data in the whole database management system other than the user's data. A DBMS comprises three kinds of schemas, which are in turn data about data (metadata). Metadata is also stored along with the database and, once stored, is hard to modify. But as a DBMS expands, it needs to change over time to satisfy the requirements of its users. If the whole of the data were highly dependent, this would become tedious and highly complex.
Data about data is itself divided into a layered architecture, so that when we change data at one layer it does not affect the data at a different level. This data is independent, but the layers are mapped onto each other.
Logical Data Independence
Logical data is data about the database, that is, it stores information about how data is managed inside. For example, a table (relation) stored in the database and all the constraints that are applied to that relation.
Logical data independence is a kind of mechanism that liberalizes itself from the actual data stored on the disk. If we make some changes to the table format, it should not change the data residing on disk.
Physical Data Independence
All schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is the power to change the physical data without impacting the schema or logical data.
  • 7. For example, if we want to change or upgrade the storage system itself, say by using SSDs instead of hard disks, this should not have any impact on the logical data or schemas.
ER Model Basic Concepts
The entity-relationship model defines the conceptual view of a database. It works around real-world entities and the associations among them. At the view level, the ER model is considered well suited for designing databases.
Entity
An entity is a real-world thing, either animate or inanimate, that is easily identifiable and distinguishable. For example, in a school database, students, teachers, classes and courses offered can be considered entities. All entities have some attributes or properties that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities whose attributes share similar values. For example, a Students set may contain all the students of a school; likewise a Teachers set may contain all the teachers of the school from all faculties. Entity sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have values. For example, a student entity may have name, class and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a student's name cannot be a numeric value; it has to be alphabetic. A student's age cannot be negative, and so on.
Types of attributes:
Simple attribute:
Simple attributes are atomic values, which cannot be divided further. For example, a student's phone number is an atomic value of 10 digits.
Composite attribute:
Composite attributes are made of more than one simple attribute. For example, a student's complete name may have first_name and last_name.
Derived attribute:
Derived attributes are attributes that do not exist physically in the database; their values are derived from other attributes present in the database. For example, average_salary in a department should not be saved in the database; instead it can be derived. As another example, age can be derived from date_of_birth.
Single-valued attribute:
Single-valued attributes contain a single value. For example: Social_Security_Number.
Multi-value attribute:
  • 8. A multi-valued attribute may contain more than one value. For example, a person can have more than one phone number, email address and so on.
These attribute types can come together in ways like:
simple single-valued attributes
simple multi-valued attributes
composite single-valued attributes
composite multi-valued attributes
Entity set and Keys
A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set. For example, the roll_number of a student makes her/him identifiable among students.
Super Key: A set of attributes (one or more) that collectively identifies an entity in an entity set.
Candidate Key: A minimal super key is called a candidate key, that is, a super key of which no proper subset is a super key. An entity set may have more than one candidate key.
Primary Key: This is the one candidate key chosen by the database designer to uniquely identify the entities of the entity set.
Relationship
The association among entities is called a relationship. For example, an employee entity has the relation works_at with a department. Another example is a student who enrolls in some course. Here, Works_at and Enrolls are called relationships.
Relationship Set:
Relationships of a similar type are called a relationship set. Like entities, a relationship too can have attributes. These attributes are called descriptive attributes.
Degree of relationship
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
Mapping Cardinalities:
Cardinality defines the number of entities in one entity set that can be associated with entities of the other set via a relationship set.
  • 9. One-to-one: One entity from entity set A can be associated with at most one entity of entity set B, and vice versa.
One-to-many: One entity from entity set A can be associated with more than one entity of entity set B, but one entity from entity set B can be associated with at most one entity of entity set A.
Many-to-one: More than one entity from entity set A can be associated with at most one entity of entity set B, but one entity from entity set B can be associated with more than one entity from entity set A.
Many-to-many: One entity from A can be
  • 10. associated with more than one entity from B, and vice versa.
ER Diagram Representation
Now we shall learn how the ER Model is represented by means of an ER diagram. Every object, such as an entity, the attributes of an entity, a relationship set and the attributes of a relationship set, can be represented with the tools of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.
Attributes
Attributes are properties of entities. Attributes are represented by means of ellipses. Every ellipse represents one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree-like structure. Every node is then connected to its attribute. That is, composite attributes are represented by ellipses that are connected with an ellipse
  • 12. Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written in the diamond box. All entities (rectangles) participating in the relationship are connected to it by a line.
Binary relationship and cardinality
A relationship in which two entities participate is called a binary relationship. Cardinality is the number of instances of an entity from a relation that can be associated with the relation.
One-to-one
When only one instance of an entity is associated with the relationship, it is marked as '1'. The diagram for this case shows that only 1 instance of each entity should be associated with the relationship. It depicts a one-to-one relationship.
One-to-many
When more than one instance of an entity is associated with the relationship, it is marked as 'N'. The diagram for this case shows that only 1 instance of the entity on the left and more than one instance of the entity on the right can be associated with the relationship. It depicts a one-to-many relationship.
Many-to-one
When more than one instance of an entity is associated with the relationship, it is marked as 'N'. The diagram for this case shows that more than one instance of the entity on the left and only one instance of the entity on the right can be associated with the relationship. It depicts a many-to-one relationship.
Participation Constraints
  • 13. Total participation: Each entity in the entity set is involved in the relationship. Total participation is represented by double lines.
Partial participation: Not all entities are involved in the relationship. Partial participation is represented by a single line.
Generalization, Aggregation
The ER Model has the power to express database entities in a conceptual hierarchical manner, such that as we go up the hierarchy it generalizes the view of entities, and as we go deeper in the hierarchy it gives us the detail of every entity included.
Going up in this structure is called generalization, where entities are clubbed together to represent a more generalized view. For example, a particular student named Mira can be generalized along with all the students; the entity will be student, and further, a student is a person. The reverse is called specialization, where a person is a student, and that student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the properties of all the entities it generalizes, is called generalization. In generalization, a number of entities are brought together into one generalized entity based on their similar characteristics. For example, pigeon, house sparrow, crow and dove can all be generalized as Birds.
Specialization
Specialization is a process that is the opposite of generalization, as mentioned above. In specialization, a group of entities is divided into sub-groups based on their characteristics. Take a group Person, for example. A person has a name, date of birth, gender and so on. These properties are common to all persons. But in a company, a person can be identified as an employee, employer, customer or vendor, based on what role they play in the company.
  • 14. Similarly, in a school database, a person can be specialized as a teacher, student or staff member, based on what role they play in the school as entities.
Inheritance
For example, attributes of a person like name, age and gender can be inherited by lower-level entities such as student and teacher.
DBMS Codd's Rules
Dr Edgar F. Codd did extensive research on the Relational Model of database systems and came up with twelve rules of his own which, according to him, a database must obey in order to be a true relational database.
These rules can be applied to a database system that is capable of managing its stored data using only its relational capabilities. This is the foundation rule, which provides a base for all the other rules.
Rule 1: Information rule
  • 15. This rule states that all information (data) stored in the database must be a value of some table cell. Everything in a database must be stored in table format. This information can be user data or metadata.
Rule 2: Guaranteed access rule
This rule states that every single data element (value) is guaranteed to be logically accessible through a combination of table name, primary key (row value) and attribute name (column value). No other means, such as pointers, can be used to access data.
Rule 3: Systematic treatment of NULL values
This rule states that NULL values in the database must be given a systematic treatment, because a NULL may have several meanings: it can be interpreted as data is missing, data is not known, data is not applicable and so on.
Rule 4: Active online catalog
This rule states that the structure description of the whole database must be stored in an online catalog, i.e. a data dictionary, which can be accessed by authorized users. Users can use the same query language to access the catalog that they use to access the database itself.
Rule 5: Comprehensive data sub-language rule
This rule states that a database must support a language that has a linear syntax and is capable of data definition, data manipulation and transaction management operations. The database can be accessed by means of this language only, either directly or through some application. If the database can be accessed or manipulated in some way without the help of this language, it is a violation.
Rule 6: View updating rule
This rule states that all views of the database that can theoretically be updated must also be updatable by the system.
Rule 7: High-level insert, update and delete rule
This rule states that the database must support high-level insertion, update and deletion. This must not be limited to a single row, that is, it must also support union, intersection and minus operations to yield sets of data records.
Rule 8: Physical data independence
This rule states that the application should not have any concern about how the data is physically stored. Also, any change in its physical structure must not have any impact on the application.
Rule 9: Logical data independence
This rule states that the logical data must be independent of its user's view (application). Any change in the logical data must not imply any change in the application using it. For example, if two tables are merged or one is split into two different tables, the change should have no impact on the user application. This is one of the most difficult rules to apply.
Rule 10: Integrity independence
  • 16. This rule states that the database must be independent of the application using it. All its integrity constraints can be independently modified without the need for any change in the application. This rule makes the database independent of the front-end application and its interface.
Rule 11: Distribution independence
This rule states that the end user must not be able to see that the data is distributed over various locations. The user must always get the impression that the data is located at one site only. This rule has been regarded as the foundation of distributed database systems.
Rule 12: Non-subversion rule
This rule states that if a system has an interface that provides access to low-level records, this interface must not be able to subvert the system and bypass security and integrity constraints.
Relational Data Model
The relational data model is the primary data model, used widely around the world for data storage and processing. This model is simple and has all the properties and capabilities required to process data with storage efficiency.
Concepts
Tables: In the relational data model, relations are saved in the format of tables. This format stores the relation among entities. A table has rows and columns, where rows represent records and columns represent attributes.
Tuple: A single row of a table, which contains a single record for that relation, is called a tuple.
Relation instance: A finite set of tuples in the relational database system represents a relation instance. Relation instances do not have duplicate tuples.
Relation schema: This describes the relation name (table name), the attributes and their names.
Relation key: One or more attributes of a row that can identify the row in the relation (table) uniquely are called the relation key.
Attribute domain: Every attribute has some pre-defined value scope, known as the attribute domain.
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are called relational integrity constraints. There are three main integrity constraints:
Key constraints
Domain constraints
Referential integrity constraints
Key constraints:
  • 17. There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely. This minimal subset of attributes is called the key for that relation. If there is more than one such minimal subset, these are called candidate keys.
Key constraints enforce that:
in a relation with a key attribute, no two tuples can have identical values for the key attributes.
a key attribute cannot have NULL values.
Key constraints are also referred to as entity constraints.
Domain constraints
Attributes have specific values in a real-world scenario. For example, age can only be a positive integer. The same constraints are applied to the attributes of a relation. Every attribute is bound to have a specific range of values. For example, age cannot be less than zero, and a telephone number cannot contain characters outside the digits 0-9.
Referential integrity constraints
This integrity constraint works on the concept of a foreign key. A key attribute of a relation can be referred to in another relation, where it is called a foreign key.
The referential integrity constraint states that if a relation refers to a key attribute of a different or the same relation, that key element must exist.
Relational Algebra
Relational database systems are expected to be equipped with a query language that assists users in querying the database instances. This way the user is empowered to produce the results required. There are two kinds of query languages: relational algebra and relational calculus.
Relational algebra
Relational algebra is a procedural query language, which takes instances of relations as input and yields instances of relations as output. It uses operators to perform queries. An operator can be either unary or binary. Operators accept relations as their input and yield relations as their output. Relational algebra is performed recursively on a relation, and intermediate results are also considered relations.
Fundamental operations of relational algebra:
Select
Project
Union
Set difference
Cartesian product
  • 18. Rename
These are defined briefly as follows:
Select Operation (σ)
Selects the tuples of a relation that satisfy the given predicate.
Notation: σp(r)
where p stands for the selection predicate and r stands for the relation. p is a propositional logic formula that may use connectors like and, or and not. Its terms may use relational operators like =, ≠, ≥, <, >, ≤.
For example:
σ subject="database" (Books)
Output: Selects tuples from Books where subject is 'database'.
σ subject="database" and price="450" (Books)
Output: Selects tuples from Books where subject is 'database' and price is 450.
σ subject="database" and price < "450" or year > "2010" (Books)
Output: Selects tuples from Books where subject is 'database' and price is less than 450, or where the publication year is greater than 2010, that is, published after 2010.
Project Operation (∏)
Projects the listed column(s) of a relation.
Notation: ∏ A1, A2, An (r)
where A1, A2, An are attribute names of relation r. Duplicate rows are automatically eliminated, as a relation is a set.
For example:
∏ subject, author (Books)
Selects and projects the columns named subject and author from relation Books.
Union Operation (∪)
The union operation performs a binary union between two given relations and is defined as:
r ∪ s = { t | t ∈ r or t ∈ s }
  • 19. Notation: r ∪ s
where r and s are either database relations or relation result sets (temporary relations).
For a union operation to be valid, the following conditions must hold:
r and s must have the same number of attributes.
Attribute domains must be compatible.
Duplicate tuples are automatically eliminated.
∏ author (Books) ∪ ∏ author (Articles)
Output: Projects the names of authors who have written either a book or an article or both.
Set Difference (−)
The result of the set difference operation is the tuples that are present in one relation but not in the second relation.
Notation: r − s
Finds all tuples that are present in r but not in s.
∏ author (Books) − ∏ author (Articles)
Output: Yields the names of authors who have written books but not articles.
Cartesian Product (Χ)
Combines the information of two different relations into one.
Notation: r Χ s
where r and s are relations and their output is defined as:
r Χ s = { q t | q ∈ r and t ∈ s }
σ author = 'tutorialspoint' (Books Χ Articles)
Output: Yields a relation that shows all books and articles written by tutorialspoint.
Rename operation (ρ)
Results of relational algebra are also relations, but without any name. The rename operation allows us to name the output relation. The rename operation is denoted by the small Greek letter rho, ρ.
Notation: ρ x (E)
where the result of expression E is saved with the name x.
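The operations defined above can be sketched in a few lines of Python by modelling a relation as a set of rows. This is a minimal sketch under assumptions, not a query engine: the helper names (`row`, `select`, `project`, `union`, `difference`, `product`) and the sample Books, Articles and Shelves data are hypothetical. A row is stored as a sorted tuple of (attribute, value) pairs so that it is hashable, and the rename operation ρ x (E) corresponds here simply to binding a result to a Python variable named x.

```python
def row(**attrs):
    """Build a hashable row: a tuple of (attribute, value) pairs sorted by attribute."""
    return tuple(sorted(attrs.items()))

def select(pred, r):
    """Select: keep the rows of r for which pred(row as dict) is true."""
    return {t for t in r if pred(dict(t))}

def project(attrs, r):
    """Project: keep only the listed attributes; duplicates vanish because r is a set."""
    return {tuple((k, v) for k, v in t if k in attrs) for t in r}

def union(r, s):
    """Union: rows present in r or in s (both must share the same attributes)."""
    return r | s

def difference(r, s):
    """Set difference: rows present in r but not in s."""
    return r - s

def product(r, s):
    """Cartesian product; this sketch assumes r and s have disjoint attribute names."""
    return {tuple(sorted(q + t)) for q in r for t in s}

# Hypothetical sample relations.
books = {
    row(title="Intro to Databases", author="Rao", subject="database", price=450),
    row(title="Query Tuning", author="Sen", subject="database", price=600),
    row(title="Networks", author="Iyer", subject="networks", price=300),
}
articles = {
    row(author="Sen", topic="indexing"),
    row(author="Das", topic="caching"),
}
shelves = {row(shelf="A"), row(shelf="B")}

# Project author over a selection on subject = 'database'.
db_authors = project({"author"}, select(lambda t: t["subject"] == "database", books))

# Authors of a book or an article, and authors of books only.
either = union(project({"author"}, books), project({"author"}, articles))
books_only = difference(project({"author"}, books), project({"author"}, articles))

# Every book paired with every shelf.
placements = product(books, shelves)
```

Because every helper takes relations and returns a relation, the calls compose in the same way the algebraic expressions do, which is what it means for intermediate results to be relations too.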
  • 20. Additional operations are:
Set intersection
Assignment
Natural join

Relational Calculus
In contrast with Relational Algebra, Relational Calculus is a non-procedural query language, that is, it tells what to do but never explains how to do it.
Relational calculus exists in two forms:

Tuple relational calculus (TRC)
The filtering variable ranges over tuples.
Notation: { T | Condition }
Returns all tuples T that satisfy the condition.
For example:
{ T.name | Author(T) AND T.article = 'database' }
Output: Returns tuples with 'name' from Author who has written an article on 'database'.
TRC can also be quantified. We can use Existential (∃) and Universal (∀) quantifiers.
For example:
{ R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
Output: The query yields the same result as the previous one.

Domain relational calculus (DRC)
In DRC the filtering variable uses the domain of attributes instead of entire tuple values (as done in TRC, mentioned above).
Notation: { a1, a2, a3, ..., an | P (a1, a2, a3, ..., an) }
where a1, a2 are attributes and P stands for formulae built by inner attributes.
For example:
  • 21. { <article, page, subject> | <article, page, subject> ∈ TutorialsPoint ∧ subject = 'database' }
Output: Yields Article, Page and Subject from relation TutorialsPoint where Subject is database.
Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also involves relational operators.
The expressive power of tuple relational calculus and domain relational calculus is equivalent to that of Relational Algebra.

ER to Relational Model
The ER Model, when conceptualized into diagrams, gives a good overview of entity-relationships, which is easier to understand. ER diagrams can be mapped to a relational schema, that is, it is possible to create a relational schema using an ER diagram. Though we cannot import all the ER constraints into the relational model, an approximate schema can be generated.
There is more than one process and algorithm available to convert ER diagrams into a relational schema. Some of them are automated and some of them are manual. We focus here on mapping diagram contents to relational basics.
ER diagrams mainly comprise:
Entity and its attributes
Relationship, which is an association among entities

Mapping Entity
An entity is a real-world object with some attributes.
Mapping process (Algorithm):
Create a table for each entity.
The entity's attributes should become fields of the table with their respective data types.
Declare the primary key.

Mapping Relationship
A relationship is an association among entities.
Mapping process (Algorithm):
  • 22. Create a table for the relationship.
Add the primary keys of all participating entities as fields of the table with their respective data types.
If the relationship has any attribute, add each attribute as a field of the table.
Declare a primary key composed of all the primary keys of the participating entities.
Declare all foreign key constraints.

Mapping Weak Entity Sets
A weak entity set is one which does not have any primary key associated with it.
Mapping process (Algorithm):
Create a table for the weak entity set.
Add all its attributes to the table as fields.
Add the primary key of the identifying entity set.
Declare all foreign key constraints.

Mapping hierarchical entities
We use all the above features of the ER Model in order to create classes of objects in object-oriented programming. This makes it easier for the programmer to concentrate on what she is programming. Details of entities are generally hidden from the user; this process is known as abstraction.
One of the important features of Generalization and Specialization is inheritance, that is, the attributes of higher-level entities are inherited by the lower-level entities.
  • 23. ER specialization or generalization comes in the form of hierarchical entity sets.
Mapping process (Algorithm):
Create tables for all higher-level entities.
Create tables for lower-level entities.
Add the primary keys of higher-level entities to the tables of lower-level entities.
In lower-level tables, add all other attributes of the lower-level entities.
Declare the primary key of the higher-level table as the primary key of the lower-level table.
Declare foreign key constraints.

SQL Overview
SQL is a programming language for relational databases. It is designed over relational algebra and tuple relational calculus. SQL comes as a package with all major distributions of RDBMS.
SQL comprises both data definition and data manipulation languages. Using the data definition properties of SQL, one can design and modify the database schema, whereas the data manipulation properties allow SQL to store and retrieve data from the database.

Data Definition Language
SQL uses the following set of commands to define the database schema:

CREATE
  • 24. Creates new databases, tables and views in an RDBMS.
For example:
Create database tutorialspoint;
Create table article;
Create view for_students;
These are abbreviated forms: a real CREATE TABLE statement also needs column definitions, and a real CREATE VIEW statement needs a defining query.

DROP
The Drop command deletes views, tables and databases from an RDBMS.
Drop object_type object_name;
Drop database tutorialspoint;
Drop table article;
Drop view for_students;

ALTER
Modifies the database schema.
Alter object_type object_name parameters;
For example:
Alter table article add subject varchar;
This command adds an attribute named subject, of string type, to the relation article.

Data Manipulation Language
SQL is equipped with a data manipulation language. DML modifies the database instance by inserting, updating and deleting its data. DML is responsible for all data modification in databases. SQL contains the following set of commands in its DML section:
SELECT/FROM/WHERE
INSERT INTO/VALUES
UPDATE/SET/WHERE
DELETE FROM/WHERE
These basic constructs allow database programmers and users to enter data into the database and retrieve it efficiently using a number of filter options.

SELECT/FROM/WHERE
SELECT
This is one of the fundamental query commands of SQL. It is similar to the projection operation of relational algebra. It selects the attributes based on the condition described by the WHERE clause.
FROM
  • 25. This clause takes a relation name as an argument, from which attributes are to be selected/projected. In case more than one relation name is given, this clause corresponds to a Cartesian product.
WHERE
This clause defines the predicate or conditions which must match in order to qualify the attributes to be projected.
For example:
Select author_name
From book_author
Where age > 50;
This command will project the names of authors from the book_author relation whose age is greater than 50.

INSERT INTO/VALUES
This command is used for inserting values into rows of a table (relation).
Syntax is
INSERT INTO table (column1 [, column2, column3 ...]) VALUES (value1 [, value2, value3 ...])
Or
INSERT INTO table VALUES (value1, [value2, ...])
For example:
INSERT INTO tutorialspoint (Author, Subject) VALUES ('anonymous', 'computers');

UPDATE/SET/WHERE
This command is used for updating or modifying values of columns of a table (relation).
Syntax is
UPDATE table_name SET column_name = value [, column_name = value ...] [WHERE condition]
For example:
UPDATE tutorialspoint SET Author = 'webmaster' WHERE Author = 'anonymous';

DELETE/FROM/WHERE
This command is used for removing one or more rows from a table (relation).
Syntax is
DELETE FROM table_name [WHERE condition];
For example:
DELETE FROM tutorialspoint WHERE Author = 'unknown';
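The DDL and DML commands above can be exercised end to end with an embedded database. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table layout (`author_name`, `age`) and the sample rows are assumptions made for illustration, and SQLite's dialect may differ in details from other RDBMS products.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the schema (column list is an assumption for this sketch).
cur.execute("CREATE TABLE book_author (author_name TEXT, age INTEGER)")

# DML: INSERT INTO / VALUES
cur.executemany(
    "INSERT INTO book_author (author_name, age) VALUES (?, ?)",
    [("anonymous", 62), ("unknown", 45), ("maria", 55)],
)

# DML: UPDATE / SET / WHERE
cur.execute(
    "UPDATE book_author SET author_name = ? WHERE author_name = ?",
    ("webmaster", "anonymous"),
)

# DML: DELETE FROM / WHERE
cur.execute("DELETE FROM book_author WHERE author_name = ?", ("unknown",))

# DML: SELECT / FROM / WHERE
cur.execute(
    "SELECT author_name FROM book_author WHERE age > 50 ORDER BY author_name"
)
older_authors = [row[0] for row in cur.fetchall()]

conn.commit()
conn.close()
```

Values are passed through `?` placeholders instead of being pasted into the SQL string, which avoids quoting mistakes and SQL injection.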
  • 26. Database Normalization

Functional Dependency
A functional dependency (FD) is a constraint between two sets of attributes in a relation. A functional dependency says that if two tuples have the same values for attributes A1, A2, ..., An, then those two tuples must have the same values for attributes B1, B2, ..., Bn.
A functional dependency is represented by an arrow sign (→), that is X → Y, where X functionally determines Y. The left-hand side attributes determine the values of the attributes on the right-hand side.

Armstrong's Axioms
If F is a set of functional dependencies, then the closure of F, denoted F+, is the set of all functional dependencies logically implied by F. Armstrong's Axioms are a set of rules which, when applied repeatedly, generate the closure of functional dependencies.
Reflexive rule: If alpha is a set of attributes and beta is a subset of alpha, then alpha → beta holds.
Augmentation rule: If a → b holds and y is an attribute set, then ay → by also holds. That is, adding attributes to dependencies does not change the basic dependencies.
Transitivity rule: Same as the transitive rule in algebra: if a → b holds and b → c holds, then a → c also holds. a → b is read as a functionally determines b.

Trivial Functional Dependency
Trivial: If an FD X → Y holds where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.
Non-trivial: If an FD X → Y holds where Y is not a subset of X, then it is called a non-trivial FD.
Completely non-trivial: If an FD X → Y holds where X ∩ Y = Φ, it is said to be a completely non-trivial FD.

Normalization
If a database design is not perfect it may contain anomalies, which are like a bad dream for the database itself. Managing a database with anomalies is next to impossible.
Update anomalies: If data items are scattered and not linked to each other properly, there may be instances when we try to update one data item that has copies scattered at several places; a few instances get updated properly while a few are left with their old values. This leaves the database in an inconsistent state.
Deletion anomalies: We tried to delete a record, but parts of it were left undeleted because, unknown to us, the data was also saved somewhere else.
Insert anomalies: We tried to insert data in a record that does not exist at all.
Normalization is a method to remove all these anomalies and bring the database to a consistent state, free from any kind of anomaly.

First Normal Form:
  • 27. This is defined in the definition of relations (tables) itself. This rule defines that all the attributes in a relation must have atomic domains. Values in an atomic domain are indivisible units.
[Image: Unorganized relation]
We re-arrange the relation (table) as below, to convert it to First Normal Form.
[Image: Relation in 1NF]
Each attribute must contain only a single value from its pre-defined domain.

Second Normal Form:
Before we learn about second normal form, we need to understand the following:
Prime attribute: an attribute which is part of the prime key (a candidate key) is a prime attribute.
Non-prime attribute: an attribute which is not a part of the prime key is said to be a non-prime attribute.
Second normal form says that every non-prime attribute should be fully functionally dependent on the prime key attributes. That is, if X → A holds, then there should not be any proper subset Y of X for which Y → A also holds.
[Image: Relation not in 2NF]
We see here in the Student_Project relation that the prime key attributes are Stu_ID and Proj_ID. According to the rule, the non-key attributes, i.e. Stu_Name and Proj_Name, must be dependent upon both and not on any of the prime key attributes individually. But we find that Stu_Name can be identified by Stu_ID and Proj_Name can be identified by Proj_ID independently. This is called partial dependency, which is not allowed in Second Normal Form.
[Image: Relation in 2NF]
We broke the relation in two as depicted in the above picture. So there exists no partial dependency.

Third Normal Form:
For a relation to be in Third Normal Form, it must be in Second Normal Form and the following must hold:
No non-prime attribute is transitively dependent on a prime key attribute.
For any non-trivial functional dependency X → A, either X is a superkey or A is a prime attribute.
[Image: Relation not in 3NF]
We find that in the Student_detail relation depicted above, Stu_ID is the key and the only prime key attribute. We find that City can be identified by Stu_ID as well as by Zip itself. Zip is not a superkey, and City is not a prime attribute. Additionally, Stu_ID → Zip → City, so there exists a transitive dependency.
[Image: Relation in 3NF]
  • 28. We broke the relation into two relations, as depicted above, to bring it into 3NF.

Boyce-Codd Normal Form:
BCNF is an extension of Third Normal Form in a strict way. BCNF states that:
For any non-trivial functional dependency X → A, X must be a super-key.
In the picture depicted above, Stu_ID is the super-key in the Student_Detail relation and Zip is the super-key in the ZipCodes relation. So,
Stu_ID → Stu_Name, Zip
and
Zip → City
confirm that both relations are in BCNF.

Database Joins
We understand the benefits of the Cartesian product of two relations, which gives us all the possible tuples that are paired together. But a Cartesian product might not be feasible for huge relations, where the number of tuples is in the thousands and the attributes of both relations are considerably large.
A join is a combination of a Cartesian product followed by a selection process. The join operation pairs two tuples from different relations if and only if the given join condition is satisfied.
The following sections briefly describe the join types:

Theta (θ) join
θ in Theta join is the join condition. Theta joins combine tuples from different relations provided they satisfy the theta condition.
Notation: R1 ⋈θ R2
R1 and R2 are relations with attributes (A1, A2, .., An) and (B1, B2, .., Bn) such that no attribute matches, that is R1 ∩ R2 = Φ. Here θ is a condition in the form of a set of conditions C.
Theta join can use all kinds of comparison operators.

Student
SID   Name    Std
101   Alex    10
102   Maria   11
[Table: Student Relation]

Subjects
Class   Subject
10      Math
10      English
11      Music
11      Sports
[Table: Subjects Relation]

  • 29. Student_Detail = STUDENT ⋈ Student.Std = Subject.Class SUBJECT

Student_detail
SID   Name    Std   Class   Subject
101   Alex    10    10      Math
101   Alex    10    10      English
102   Maria   11    11      Music
102   Maria   11    11      Sports
[Table: Output of theta join]

Equi-Join
When a Theta join uses only the equality comparison operator, it is said to be an Equi-Join. The above example corresponds to an equi-join.

Natural Join (⋈)
Natural join does not use any comparison operator. It does not concatenate the way a Cartesian product does. Instead, a Natural Join can only be performed if there is at least one common attribute between the relations. Those attributes must have the same name and domain.
Natural join acts on those matching attributes where the values of the attributes in both relations are the same.

Courses
CID    Course        Dept
CS01   Database      CS
ME01   Mechanics     ME
EE01   Electronics   EE
[Table: Relation Courses]

HoD
Dept   Head
CS     Alex
ME     Maya
EE     Mira
[Table: Relation HoD]

Courses ⋈ HoD
Dept   CID    Course        Head
CS     CS01   Database      Alex
ME     ME01   Mechanics     Maya
EE     EE01   Electronics   Mira
[Table: Relation Courses ⋈ HoD]

Outer Joins
  • 30. All joins mentioned above, that is Theta Join, Equi-Join and Natural Join, are called inner joins. An inner join includes only tuples with matching attributes; the rest are discarded in the resulting relation. There exist methods by which all tuples of any relation are included in the resulting relation.
There are three kinds of outer joins:

Left outer join (R ⟕ S)
All tuples of the Left relation, R, are included in the resulting relation, and if there exist tuples in R without any matching tuple in S, then the S-attributes of the resulting relation are made NULL.

Left
A     B
100   Database
101   Mechanics
102   Electronics
[Table: Left Relation]

Right
A     B
100   Alex
102   Maya
104   Mira
[Table: Right Relation]

Courses ⟕ HoD
A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya
[Table: Left outer join output]

Right outer join (R ⟖ S)
All tuples of the Right relation, S, are included in the resulting relation, and if there exist tuples in S without any matching tuple in R, then the R-attributes of the resulting relation are made NULL.

Courses ⟖ HoD
A     B             C     D
100   Database      100   Alex
102   Electronics   102   Maya
---   ---           104   Mira
[Table: Right outer join output]

Full outer join (R ⟗ S)
All tuples of both participating relations are included in the resulting relation, and if there are no matching tuples for both relations, their respective unmatched attributes are made NULL.

Courses ⟗ HoD
A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya
---   ---           104   Mira
[Table: Full outer join output]
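The join types above can be reproduced with a few lines of code over the same sample tables. This is a minimal sketch using nested loops, assuming relations are lists of dictionaries; the function names (`theta_join`, `natural_join`, `left_outer_join`) are hypothetical, and real systems use far more efficient join algorithms.

```python
def theta_join(r, s, condition):
    """Cartesian product followed by selection on the join condition."""
    return [{**q, **t} for q in r for t in s if condition(q, t)]

def natural_join(r, s):
    """Join on all attributes that share a name; the shared column
    appears once. Assumes both relations are non-empty."""
    common = set(r[0]) & set(s[0])
    return [{**q, **t} for q in r for t in s
            if all(q[a] == t[a] for a in common)]

def left_outer_join(r, s, condition, s_attributes):
    """Keep every tuple of r; pad unmatched ones with None (NULL)."""
    result = []
    for q in r:
        matches = [{**q, **t} for t in s if condition(q, t)]
        if matches:
            result.extend(matches)
        else:
            result.append({**q, **{a: None for a in s_attributes}})
    return result

student = [
    {"SID": 101, "Name": "Alex", "Std": 10},
    {"SID": 102, "Name": "Maria", "Std": 11},
]
subjects = [
    {"Class": 10, "Subject": "Math"},
    {"Class": 10, "Subject": "English"},
    {"Class": 11, "Subject": "Music"},
    {"Class": 11, "Subject": "Sports"},
]
courses = [
    {"CID": "CS01", "Course": "Database", "Dept": "CS"},
    {"CID": "ME01", "Course": "Mechanics", "Dept": "ME"},
    {"CID": "EE01", "Course": "Electronics", "Dept": "EE"},
]
hod = [
    {"Dept": "CS", "Head": "Alex"},
    {"Dept": "ME", "Head": "Maya"},
    {"Dept": "EE", "Head": "Mira"},
]
left = [
    {"A": 100, "B": "Database"},
    {"A": 101, "B": "Mechanics"},
    {"A": 102, "B": "Electronics"},
]
right = [
    {"C": 100, "D": "Alex"},
    {"C": 102, "D": "Maya"},
    {"C": 104, "D": "Mira"},
]

student_detail = theta_join(student, subjects,
                            lambda q, t: q["Std"] == t["Class"])
courses_hod = natural_join(courses, hod)
left_outer = left_outer_join(left, right,
                             lambda q, t: q["A"] == t["C"], ["C", "D"])
```

A right outer join can be obtained from the same helper by swapping the two relations, and a full outer join by adding the unmatched tuples of the right relation to the left outer join result.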
  • 31. DBMS - Storage System
Databases are stored in file formats, which contain records. At the physical level, the actual data is stored in electromagnetic format on some device capable of storing it for a long time. These storage devices can be broadly categorized into three types:
Primary Storage: The memory storage which is directly accessible by the CPU comes under this category. The CPU's internal memory (registers), fast memory (cache) and main memory (RAM) are directly accessible to the CPU, as they are all placed on the motherboard or CPU chipset. This storage is typically very small, ultra fast and volatile. It needs a continuous power supply in order to maintain its state, i.e. in case of power failure all data is lost.
Secondary Storage: The need to store data for a longer time and to retain it even after the power supply is interrupted gave birth to secondary data storage. All memory devices which are not part of the CPU chipset or motherboard come under this category. Broadly, magnetic disks, all optical disks (DVD, CD etc.), flash drives and magnetic tapes are not directly accessible by the CPU. Hard disk drives, which contain the operating system and are generally not removed from the computer, are considered secondary storage, and all others are called tertiary storage.
Tertiary Storage: The third level in the memory hierarchy is called tertiary storage. This is used to store huge amounts of data. Because this storage is external to the computer system, it is the slowest in speed. These storage devices are mostly used to back up the entire system. Optical disks and magnetic tapes are widely used as tertiary storage.

Memory Hierarchy
A computer system has a well-defined hierarchy of memory. The CPU has inbuilt registers, which hold the data being operated on. The computer system has main memory, which is also directly accessible by the CPU. Because the access time of main memory and the CPU speed differ a lot, cache memory is introduced to minimize the loss. Cache memory contains the most recently used data and data which may be referred to by the CPU in the near future.
The memory with the fastest access is the costliest one, which is the very reason for the hierarchy of the memory system. Larger storage offers slower speed but can store a huge amount of data compared to CPU registers or cache memory, and is less expensive.

Magnetic Disks
  • 32. Hard disk drives are the most common secondary storage devices in present-day computer systems. They are called magnetic disks because they use the concept of magnetization to store information. Hard disks consist of metal disks coated with magnetizable material. These disks are placed vertically on a spindle. A read/write head moves in between the disks and is used to magnetize or de-magnetize the spot under it. A magnetized spot can be recognized as 0 (zero) or 1 (one).
Hard disks are formatted in a well-defined order to store data efficiently. A hard disk plate has many concentric circles on it, called tracks. Every track is further divided into sectors. A sector on a hard disk typically stores 512 bytes of data.

RAID
Exponential growth in technology gave rise to the need for larger secondary storage media. To meet this requirement, RAID was introduced. RAID stands for Redundant Array of Independent Disks, which is a technology to connect multiple secondary storage devices and make use of them as a single storage medium.
RAID consists of an array of disks in which multiple disks are connected together to achieve different goals. RAID levels define the use of disk arrays.
RAID 0: In this level a striped array of disks is implemented. The data is broken down into blocks and all blocks are distributed among all disks. Each disk receives a block of data to write/read in parallel. This enhances the speed and performance of the storage device. There is no parity and no backup in Level 0.
RAID 1: This level uses mirroring techniques. When data is sent to the RAID controller, it sends a copy of the data to all disks in the array. RAID level 1 is also called mirroring and provides 100% redundancy in case of failure.
RAID 2: This level records an Error Correction Code, using Hamming distance, for its data striped on different disks. Like level 0, each data bit in a word is recorded on a separate disk, and the ECC codes of the data words are stored on a different set of disks. Because of its complex structure and high cost, RAID 2 is not commercially available.
RAID 3: This level also stripes the data onto multiple disks in the array. The parity bit generated for each data word is stored on a different disk. This technique lets the array survive a single disk failure, and a single disk failure does not impact the throughput.
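Parity-based recovery can be illustrated with the XOR operation. The sketch below is only an illustration of the principle, under the assumption that parity is the byte-wise XOR of equally sized data blocks; the function name `parity_block` and the sample blocks are hypothetical, and a real RAID controller works on striped disk blocks in hardware or in the driver.

```python
from functools import reduce

def parity_block(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))

# Three data disks, each holding one block of the stripe.
data_disks = [b"DATA", b"BASE", b"SYST"]

# The parity block is written to a separate disk.
parity = parity_block(data_disks)

# Suppose the second data disk fails. XOR-ing the surviving
# blocks with the parity block reproduces the lost block.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = parity_block(survivors)
```

Because XOR is its own inverse, the same function serves both for computing parity and for rebuilding a single missing block; two simultaneous failures cannot be repaired with one parity block.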
  • 33. RAID 4: In this level an entire block of data is written onto data disks, and then the parity is generated and stored on a different disk. The prime difference between level 3 and level 4 is that level 3 uses byte-level striping whereas level 4 uses block-level striping. Both level 3 and level 4 require at least 3 disks to implement RAID.
RAID 5: This level also writes whole data blocks onto different disks, but the parity generated for a data block stripe is not stored on a separate dedicated disk; it is distributed among all the data disks.
RAID 6: This level is an extension of level 5. In this level two independent parities are generated and stored in distributed fashion among the disks. Two parities provide additional fault tolerance. This level requires at least 4 disk drives to be implemented.

DBMS - File Structure
  • 34. Related data and information is stored collectively in file formats. A file is a sequence of records stored in binary format. A disk drive is formatted into several blocks, which are capable of storing records. File records are mapped onto those disk blocks.

File Organization
The method of mapping file records to disk blocks defines the file organization, i.e. how the file records are organized. The following are the types of file organization:
Heap File Organization: When a file is created using the Heap File Organization mechanism, the operating system allocates a memory area to that file without any further accounting details. File records can be placed anywhere in that memory area. It is the responsibility of the software to manage the records. A heap file does not support any ordering, sequencing or indexing on its own.
Sequential File Organization: Every file record contains a data field (attribute) to uniquely identify that record. In the sequential file organization mechanism, records are placed in the file in some sequential order based on the unique key field or search key. Practically, it is not possible to store all the records sequentially in physical form.
Hash File Organization: This mechanism uses a hash function computation on some field of the records. As we know, a file is a collection of records, which have to be mapped onto some blocks of the disk space allocated to it. This mapping is defined by the hash computation. The output of the hash determines the location of the disk block where the records may exist.
Clustered File Organization: Clustered file organization is not considered good for large databases. In this mechanism, related records from one or more relations are kept in the same disk block, that is, the ordering of records is not based on the primary key or search key. This organization helps to retrieve data easily based on a particular join condition. For all queries other than the particular join condition on which the data is stored, retrieval becomes more expensive.

File Operations
Operations on database files can be broadly classified into two categories.
  • 35. Update Operations
Retrieval Operations
Update operations change the data values by insertion, deletion or update. Retrieval operations, on the other hand, do not alter the data but retrieve it after optional conditional filtering. In both types of operations, selection plays a significant role. Other than creation and deletion of a file, there are several operations which can be done on files.
Open: A file can be opened in one of two modes, read mode or write mode. In read mode, the operating system does not allow anyone to alter the data; it is solely for reading purposes. Files opened in read mode can be shared among several entities. The other mode is write mode, in which data modification is allowed. Files opened in write mode can also be read but cannot be shared.
Locate: Every file has a file pointer, which tells the current position where the data is to be read or written. This pointer can be adjusted accordingly. Using the find (seek) operation it can be moved forward or backward.
Read: By default, when files are opened in read mode the file pointer points to the beginning of the file. There are options where the user can tell the operating system where the file pointer should be located at the time of opening the file. The data immediately following the file pointer is read.
Write: The user can choose to open files in write mode, which enables them to edit the content of the file. It can be deletion, insertion or modification. The file pointer can be positioned at the time of opening or can be changed dynamically if the operating system allows doing so.
Close: This is also a most important operation from the operating system's point of view. When a request to close a file is generated, the operating system removes all the locks (if in shared mode), saves the content of the data (if altered) to the secondary storage media, and releases all the buffers and file handlers associated with the file.
The organization of data content inside the file plays a major role here. Seeking or locating the file pointer to the desired record inside the file behaves differently depending on whether the file has its records arranged sequentially, clustered, and so on.
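The open, locate, read, write and close operations map directly onto ordinary file APIs. The following sketch assumes a hypothetical fixed-length record layout (a 4-byte integer id followed by a 16-byte name) so that the position of record i is simply i times the record size; the file name and the sample records are made up for the example.

```python
import os
import struct
import tempfile

# Assumed layout: little-endian int id + 16-byte, NUL-padded name.
RECORD = struct.Struct("<i16s")

def pack(record_id, name):
    return RECORD.pack(record_id, name.encode("utf-8"))

def unpack(raw):
    record_id, name = RECORD.unpack(raw)
    return record_id, name.rstrip(b"\x00").decode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "students.dat")

# Open in write mode and append three records one after another.
with open(path, "wb") as f:
    for i, name in enumerate(["Alex", "Maria", "Mira"]):
        f.write(pack(100 + i, name))

# Open in read mode, locate (seek) the second record and read it.
with open(path, "rb") as f:
    f.seek(1 * RECORD.size)
    second = unpack(f.read(RECORD.size))

# Open for update, move the file pointer to the third record,
# overwrite it in place, then seek back and read it again.
with open(path, "r+b") as f:
    f.seek(2 * RECORD.size)
    f.write(pack(102, "Maya"))
    f.seek(2 * RECORD.size)
    third = unpack(f.read(RECORD.size))
```

Leaving each `with` block closes the file, which flushes buffered data to storage and releases the file handle, matching the Close operation described above.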
DBMS - Indexing
We know that information in the DBMS files is stored in the form of records. Every record is equipped with some key field, which helps it to be recognized uniquely.
Indexing is a data structure technique to efficiently retrieve records from database files based on some attributes on which the indexing has been done. Indexing in database systems is similar to the one we see in books.
Indexing is defined based on its indexing attributes. Indexing can be one of the following types:
  • 36. Primary Index: If the index is built on the ordering 'key field' of the file, it is called a Primary Index. Generally it is the primary key of the relation.
Secondary Index: If the index is built on a non-ordering field of the file, it is called a Secondary Index.
Clustering Index: If the index is built on an ordering non-key field of the file, it is called a Clustering Index.
The ordering field is the field on which the records of the file are ordered. It can be different from the primary or candidate key of a file.
Ordered Indexing is of two types:
Dense Index
Sparse Index

Dense Index
In a dense index, there is an index record for every search key value in the database. This makes searching faster but requires more space to store the index records themselves. An index record contains the search key value and a pointer to the actual record on the disk.

Sparse Index
In a sparse index, index records are not created for every search key. An index record here contains a search key and an actual pointer to the data on the disk. To search for a record, we first follow the index record and reach the actual location of the data. If the data we are looking for is not where we directly arrive by following the index, the system starts a sequential search until the desired data is found.

Multilevel Index
Index records are comprised of search-key values and data pointers. The index itself is stored on the disk along with the actual database files. As the size of the database grows, so does the size of the indices. There is an immense need to keep the index records in main memory so that searches can speed up. If a single-level index is used, then a large index cannot be kept in memory as a whole, and this leads to multiple disk accesses.
  • 37. A multi-level index helps by breaking down the index into several smaller indices, in order to make the outermost level so small that it can be saved in a single disk block which can easily be accommodated anywhere in main memory.

B+ Tree
A B+ tree is a multi-level index format organized as a balanced search tree (a multi-way tree rather than a binary one). As mentioned earlier, single-level index records become large as the database size grows, which also degrades performance.
All leaf nodes of a B+ tree hold actual data pointers. The B+ tree ensures that all leaf nodes remain at the same height, and is thus balanced. Additionally, all leaf nodes are linked using a linked list, which lets a B+ tree support random access as well as sequential access.

Structure of B+ tree
Every leaf node is at an equal distance from the root node. A B+ tree is of order n, where n is fixed for every B+ tree.
  • 38. Internal nodes:
Internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node.
At most, internal nodes contain n pointers.
Leaf nodes:
Leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
At most, leaf nodes contain n record pointers and n key values.
Every leaf node contains one block pointer P to point to the next leaf node, forming a linked list.

B+ tree insertion
B+ trees are filled from the bottom, and each entry is inserted at a leaf node.
If a leaf node overflows:
  Split the node into two parts.
  Partition at i = ⌊(m+1)/2⌋.
  The first i entries are stored in one node.
  The rest of the entries (i+1 onwards) are moved to a new node.
  The ith key is duplicated in the parent of the leaf.
If a non-leaf node overflows:
  Split the node into two parts.
  Partition the node at i = ⌈(m+1)/2⌉.
  Entries up to i are kept in one node.
  The rest of the entries are moved to a new node.

B+ tree deletion
B+ tree entries are deleted at the leaf nodes.
The target entry is searched for and deleted.
If it is in an internal node, delete it and replace it with the entry from the left position.
After deletion, underflow is tested.
  • 39. If underflow occurs:
  Distribute entries from the node to its left.
If distribution from the left is not possible:
  Distribute entries from the node to its right.
If distribution from the left and the right is not possible:
  Merge the node with the nodes to its left and right.

DBMS - Hashing
For a huge database structure it is sometimes not feasible to search the index through all its levels and then reach the destination data block to retrieve the desired data. Hashing is an effective technique to calculate the direct location of a data record on the disk without using an index structure.
It uses a function, called a hash function, which generates an address when called with the search key as its parameter. The hash function computes the location of the desired data on the disk.

Hash Organization
Bucket: A hash file stores data in bucket format. A bucket is considered a unit of storage. A bucket typically stores one complete disk block, which in turn can store one or more records.
Hash Function: A hash function h is a mapping function that maps the set of all search keys K to the addresses where the actual records are placed. It is a function from search keys to bucket addresses.

Static Hashing
In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if a mod-4 hash function is used, it generates only 4 values (0 to 3). The output address is always the same for a given key under that function. The number of buckets provided remains the same at all
  • 40. times.
Operation:
Insertion: When a record is required to be entered using static hashing, the hash function h computes the bucket address for search key K, where the record will be stored.
Bucket address = h(K)
Search: When a record needs to be retrieved, the same hash function can be used to retrieve the address of the bucket where the data is stored.
Delete: This is simply a search followed by a deletion operation.

Bucket Overflow:
The condition of bucket overflow is known as collision. This is a fatal state for any static hash function. In this case overflow chaining can be used.
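The static hashing operations can be sketched in a few lines. The example below is an in-memory illustration only, assuming integer search keys, a mod-4 hash function and buckets that hold at most two records; when a bucket is full, a further bucket is linked behind it for the same hash result. The class names `Bucket` and `StaticHashFile` are hypothetical.

```python
class Bucket:
    """A fixed-capacity bucket with a link to an overflow bucket."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []      # list of (key, value) pairs
        self.overflow = None   # next bucket for the same hash result

class StaticHashFile:
    def __init__(self, num_buckets=4, capacity=2):
        self.num_buckets = num_buckets   # never changes: static hashing
        self.capacity = capacity
        self.buckets = [Bucket(capacity) for _ in range(num_buckets)]

    def address(self, key):
        """Bucket address = h(K); here h is K mod num_buckets."""
        return key % self.num_buckets

    def insert(self, key, value):
        bucket = self.buckets[self.address(key)]
        # Walk to the first bucket with free space, linking a new
        # overflow bucket when the end of the chain is full.
        while len(bucket.records) >= bucket.capacity:
            if bucket.overflow is None:
                bucket.overflow = Bucket(self.capacity)
            bucket = bucket.overflow
        bucket.records.append((key, value))

    def search(self, key):
        bucket = self.buckets[self.address(key)]
        while bucket is not None:
            for k, v in bucket.records:
                if k == key:
                    return v
            bucket = bucket.overflow
        return None

    def delete(self, key):
        """Search followed by deletion; returns True if found."""
        bucket = self.buckets[self.address(key)]
        while bucket is not None:
            for i, (k, _) in enumerate(bucket.records):
                if k == key:
                    del bucket.records[i]
                    return True
            bucket = bucket.overflow
        return False

hash_file = StaticHashFile()
for key in (1, 5, 9, 13):          # all four keys hash to address 1
    hash_file.insert(key, f"rec{key}")
```

With four keys hashing to the same address and a capacity of two, keys 1 and 5 stay in the primary bucket while 9 and 13 land in the linked overflow bucket, so a search for them has to follow the chain.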
  • 41. Overflow Chaining: When buckets are full, a new bucket is allocated for the same hash result and is linked after the previous one. This mechanism is called Closed Hashing.
Linear Probing: When the hash function generates an address at which data is already stored, the next free bucket is allocated to it. This mechanism is called Open Hashing.
For a hash function to work efficiently and effectively, the following must hold:
Distribution of records should be uniform.
Distribution should be random instead of following any ordering.

Dynamic Hashing
The problem with static hashing is that it does not expand or shrink dynamically as the size of the database grows or shrinks. Dynamic hashing provides a mechanism in which data buckets are added and removed dynamically and on demand. Dynamic hashing is also known as extended hashing.
The hash function, in dynamic hashing, is made to produce a large number of values, and only a few are used initially.
  • 42. Organization
The prefix of the entire hash value is taken as the hash index. Only a portion of the hash value is used for computing bucket addresses. Every hash index has a depth value, which tells how many bits are used for computing the hash function. With n bits, 2^n buckets can be addressed. When all these bits are consumed, that is, all buckets are full, the depth value is increased linearly and twice the number of buckets is allocated.

Operation
Querying: Look at the depth value of the hash index and use those bits to compute the bucket address.
Update: Perform a query as above and update the data.
Deletion: Perform a query to locate the desired data and delete it.
Insertion: Compute the address of the bucket.
If the bucket is already full:
  Add more buckets.
  Add an additional bit to the hash value.
  Re-compute the hash function.
  • 43. Else:
  Add data to the bucket.
If all buckets are full, perform the remedies of static hashing.
Hashing is not favorable when the data is organized in some ordering and queries require a range of data. When data is discrete and random, hashing performs the best.
The hashing algorithm and its implementation are more complex than indexing. Hash operations take constant time on average.

DBMS - Transaction
A transaction can be defined as a group of tasks. A single task is the minimum processing unit of work, which cannot be divided further.
An example of a transaction can be the bank accounts of two users, say A and B. When a bank employee transfers an amount of Rs. 500 from A's account to B's account, a number of tasks are executed behind the screen. This very simple and small transaction includes several steps. To decrease A's bank account by 500:
Open_Account(A)
Old_Balance = A.balance
New_Balance = Old_Balance - 500
A.balance = New_Balance
Close_Account(A)
In simple words, the transaction involves many tasks, such as opening the account of A, reading the old balance, deducting 500 from it, saving the new balance to the account of A and finally closing it. To add the amount of 500 to B's account, the same sort of tasks need to be done:
Open_Account(B)
Old_Balance = B.balance
New_Balance = Old_Balance + 500
B.balance = New_Balance
Close_Account(B)
A simple transaction of moving an amount of 500 from A to B involves many low-level tasks.

ACID Properties
A transaction may contain several low-level tasks, and further, a transaction is a very small unit of any program. A transaction in a database system must maintain some properties in order to ensure the accuracy of its completeness and data integrity. These properties are referred to as the ACID properties and are described below:
Atomicity: Though a transaction involves several low-level operations, this property states that a transaction must be treated as an atomic unit, that is, either all of its operations are executed or none. There must be no state in the database where the transaction is left partially completed. States should be
  • 44. defined either before the execution of the transaction or after the execution, abortion or failure of the transaction.
Consistency: This property states that after the transaction is finished, the database must remain in a consistent state. There must not be any possibility that some data is incorrectly affected by the execution of the transaction. If the database was in a consistent state before the execution of the transaction, it must remain in a consistent state after the execution of the transaction.
Durability: This property states that in any case all updates made on the database will persist even if the system fails and restarts. If a transaction writes or updates some data in the database and commits, that data will always be there in the database. If the transaction commits but the data is not yet written to disk and the system fails, that data will be updated once the system comes back up.
Isolation: In a database system where more than one transaction is being executed simultaneously and in parallel, the property of isolation states that each transaction will be carried out and executed as if it were the only transaction in the system. No transaction will affect the existence of any other transaction.
Serializability
When more than one transaction is executed by the operating system in a multiprogramming environment, the instructions of one transaction may be interleaved with those of another transaction.
Schedule: A chronological execution sequence of transactions is called a schedule. A schedule can have many transactions in it, each comprising a number of instructions/tasks.
Serial Schedule: A schedule in which transactions are aligned in such a way that one transaction is executed first. When the first transaction completes its cycle, the next transaction is executed. Transactions are ordered one after the other. This type of schedule is called a serial schedule, as transactions are executed in a serial manner.
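As a rough illustration of the schedule definitions above, the sketch below represents a schedule as a list of (transaction, operation, item) steps and checks whether it is serial. The tuple layout and the names is_serial, serial and interleaved are assumptions made for this example; they do not come from any DBMS.

```python
from typing import List, Tuple

# A step is (transaction name, operation, data item); this layout is assumed.
Step = Tuple[str, str, str]

def is_serial(schedule: List[Step]) -> bool:
    """Return True if each transaction's steps form one contiguous run."""
    finished = set()   # transactions whose run has already ended
    current = None     # transaction whose run is in progress
    for txn, _op, _item in schedule:
        if txn != current:
            if txn in finished:
                return False  # txn resumes after another one ran: interleaved
            if current is not None:
                finished.add(current)
            current = txn
    return True

# T1 runs to completion before T2 starts.
serial = [("T1", "read", "A"), ("T1", "write", "A"),
          ("T2", "read", "A"), ("T2", "write", "A")]
# T2 reads A between T1's read and write.
interleaved = [("T1", "read", "A"), ("T2", "read", "A"),
               ("T1", "write", "A"), ("T2", "write", "A")]

print(is_serial(serial))       # True
print(is_serial(interleaved))  # False
```

In the interleaved case both transactions read A before either writes it, which is the kind of ordering that can produce results different from any serial execution.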
In a multi-transaction environment, serial schedules are considered the benchmark. The execution sequence of instructions within a transaction cannot be changed, but two transactions can have their instructions executed in random order. This does no harm if the two transactions are mutually independent and work on different segments of data, but if they work on the same data, results may vary. These varying results may leave the database in an inconsistent state.
To resolve the problem, we allow parallel execution of a transaction schedule if the transactions in it are either serializable or have some equivalence relation between or among them.
Equivalence schedules: Schedules can be equivalent in the following ways:
Result Equivalence:
If two schedules produce the same results after execution, they are said to be result equivalent. They may yield the same result for some values and different results for other values. That is why this equivalence is not generally considered significant.
View Equivalence:
  • 45. Two schedules are view equivalent if the transactions in both schedules perform similar actions in a similar manner. For example:
If T reads the initial data in S1, then T also reads the initial data in S2.
If T reads a value written by J in S1, then T also reads the value written by J in S2.
If T performs the final write on a data value in S1, then T also performs the final write on that data value in S2.
Conflict Equivalence:
Two operations are said to be conflicting if they have the following properties:
Both belong to separate transactions
Both access the same data item
At least one of them is a "write" operation
Two schedules with more than one transaction having conflicting operations are said to be conflict equivalent if and only if:
Both schedules contain the same set of transactions
The order of conflicting pairs of operations is maintained in both schedules
Schedules that are view equivalent to a serial schedule are view serializable, and schedules that are conflict equivalent to a serial schedule are conflict serializable. All conflict serializable schedules are view serializable too.
States of Transactions:
A transaction in a database can be in one of the following states:
  • 46. Active: In this state the transaction is being executed. This is the initial state of every transaction.
Partially Committed: When a transaction executes its final operation, it is said to be in this state. After execution of all operations, the database system performs some checks, e.g. the consistency of the database state after applying the output of the transaction to the database.
Failed: If any check made by the database recovery system fails, the transaction is said to be in the failed state, from where it can no longer proceed.
Aborted: If any of the checks fails and the transaction has reached the Failed state, the recovery manager rolls back all its write operations on the database to bring the database back to the state it was in prior to the start of the transaction. Transactions in this state are called aborted. The database recovery module can select one of two operations after a transaction aborts:
Re-start the transaction
Kill the transaction
Committed: If a transaction executes all its operations successfully, it is said to be committed. All its effects are now permanently made on the database system.
DBMS - Concurrency Control
In a multiprogramming environment where more than one transaction can be executed concurrently, there is a need for protocols to control the concurrency of transactions to ensure the atomicity and isolation properties of transactions. Concurrency control protocols that ensure serializability of transactions are most desirable. Concurrency control protocols can be broadly divided into two categories:
Lock based protocols
Time stamp based protocols
Lock based protocols
Database systems equipped with lock-based protocols use a mechanism by which a transaction cannot read or write data until it acquires the appropriate lock on it first. Locks are of two kinds:
Binary Locks: a lock on a data item can be in two states; it is either locked or unlocked.
Shared/exclusive: this type of locking mechanism differentiates locks based on their use. If a lock is acquired on a data item to perform a write operation, it is an exclusive lock, because allowing more than one transaction to write to the same data item would lead the database into an inconsistent state. Read locks are shared because no data value is being changed.
There are four types of lock protocols available:
Simplistic
  • 47. Simplistic lock-based protocols allow a transaction to obtain a lock on every object before a 'write' operation is performed. As soon as the 'write' has been done, the transaction may unlock the data item.
Pre-claiming
In this protocol, a transaction evaluates its operations and creates a list of data items on which it needs locks. Before starting execution, the transaction requests from the system all the locks it needs. If all the locks are granted, the transaction executes and releases all the locks when all its operations are over. If not all the locks are granted, the transaction rolls back and waits until all locks are granted.
Two Phase Locking - 2PL
This locking protocol divides the transaction execution phase into three parts. In the first part, when the transaction starts executing, it seeks grants for the locks it needs as it executes. The second part is where the transaction has acquired all its locks and no other lock is required; the transaction keeps executing its operations. As soon as the transaction releases its first lock, the third part starts. In this part the transaction cannot demand any lock but only releases the acquired locks.
Two phase locking has two phases: one is growing, where all locks are being acquired by the transaction, and the second is shrinking, where locks held by the transaction are being released.
To claim an exclusive (write) lock, a transaction must first acquire a shared (read) lock and then upgrade it to an exclusive lock.
Strict Two Phase Locking
The first phase of Strict-2PL is the same as 2PL. After acquiring all locks in the first phase, the transaction continues to execute normally. But in contrast to 2PL, Strict-2PL does not release a lock as soon as it is no
  • 48. more required, but holds all locks until the commit state arrives. Strict-2PL releases all locks at once at the commit point.
Time stamp based protocols
The most commonly used concurrency protocol is the time-stamp based protocol. This protocol uses either the system time or a logical counter as a time-stamp.
Lock based protocols manage the order between conflicting pairs among transactions at the time of execution, whereas time-stamp based protocols start working as soon as a transaction is created.
Every transaction has a time-stamp associated with it, and the ordering is determined by the age of the transaction. A transaction created at clock time 0002 would be older than all other transactions that come after it. For example, any transaction 'y' entering the system at 0004 is two seconds younger, and priority may be given to the older one.
In addition, every data item is given the latest read and write time-stamp. This lets the system know when the last read and write operations were made on the data item.
Time-stamp ordering protocol
The timestamp-ordering protocol ensures serializability among transactions in their conflicting read and write operations. It is the responsibility of the protocol system that conflicting pairs of tasks are executed according to the timestamp values of the transactions.
Time-stamp of transaction Ti is denoted as TS(Ti).
Read time-stamp of data-item X is denoted by R-timestamp(X).
Write time-stamp of data-item X is denoted by W-timestamp(X).
The timestamp ordering protocol works as follows:
If a transaction Ti issues a read(X) operation:
    If TS(Ti) < W-timestamp(X)
        Operation rejected.
    If TS(Ti) >= W-timestamp(X)
        Operation executed.
  • 49. All data-item timestamps updated.
If a transaction Ti issues a write(X) operation:
    If TS(Ti) < R-timestamp(X)
        Operation rejected.
    If TS(Ti) < W-timestamp(X)
        Operation rejected and Ti rolled back.
    Otherwise, operation executed.
Thomas' Write rule:
This rule applies to the case TS(Ti) < W-timestamp(X), in which the basic protocol rejects the operation and rolls Ti back. The timestamp ordering rules can be modified to make the schedule view serializable: instead of rolling Ti back, the 'write' operation itself is ignored.
DBMS - Deadlock
In a multi-process system, deadlock is a situation that arises in a shared resource environment where a process waits indefinitely for a resource held by some other process, which in turn is waiting for a resource held by yet another process.
For example, assume a set of transactions {T0, T1, T2, ..., Tn}. T0 needs a resource X to complete its task. Resource X is held by T1, and T1 is waiting for a resource Y, which is held by T2. T2 is waiting for resource Z, which is held by T0. Thus, all processes wait for each other to release resources. In this situation, none of the processes can finish its task. This situation is known as 'deadlock'.
Deadlock is not a good phenomenon for a healthy system. To keep the system deadlock free, a few methods can be used. In case the system is stuck because of a deadlock, the transactions involved in the deadlock are rolled back and restarted.
Deadlock Prevention
To prevent any deadlock situation in the system, the DBMS aggressively inspects all the operations which transactions are about to execute. The DBMS inspects operations and analyzes whether they can create a deadlock situation. If it finds that a deadlock situation might occur, that transaction is never allowed to be executed.
There are deadlock prevention schemes which use the time-stamp ordering mechanism of transactions in order to pre-decide a deadlock situation.
Wait-Die Scheme:
  • 50. In this scheme, if a transaction requests to lock a resource (data item) which is already held with a conflicting lock by some other transaction, one of two possibilities may occur:
If TS(Ti) < TS(Tj), that is, Ti, which is requesting a conflicting lock, is older than Tj, Ti is allowed to wait until the data item is available.
If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, Ti dies. Ti is restarted later with a random delay but with the same timestamp.
This scheme allows the older transaction to wait but kills the younger one.
Wound-Wait Scheme:
In this scheme, if a transaction requests to lock a resource (data item) which is already held with a conflicting lock by some other transaction, one of two possibilities may occur:
If TS(Ti) < TS(Tj), that is, Ti, which is requesting a conflicting lock, is older than Tj, Ti forces Tj to be rolled back, that is, Ti wounds Tj. Tj is restarted later with a random delay but with the same timestamp.
If TS(Ti) > TS(Tj), that is, Ti is younger than Tj, Ti is forced to wait until the resource is available.
This scheme allows the younger transaction to wait, but when an older transaction requests an item held by a younger one, the older transaction forces the younger one to abort and release the item.
In both cases, the transaction that entered the system later is aborted.
Deadlock Avoidance
Aborting a transaction is not always a practical approach. Instead, deadlock avoidance mechanisms can be used to detect any deadlock situation in advance. Methods like the "wait-for graph" are available, but they suit systems where transactions are lightweight and hold fewer instances of resources. In a bulky system, deadlock prevention techniques may work well.
Wait-for Graph
This is a simple method available to track whether any deadlock situation may arise. For each transaction entering the system, a node is created. When transaction Ti requests a lock on an item, say X, which is held by some other transaction Tj, a directed edge is created from Ti to Tj. If Tj releases item X, the edge between them is dropped and Ti locks the data item.
The system maintains this wait-for graph for every transaction waiting for some data items held by others. The system keeps checking whether there is any cycle in the graph.
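The wait-for graph and its cycle check can be sketched as follows. This is a minimal illustration, not how any particular DBMS implements it: the class name WaitForGraph and its methods add_wait, remove_wait and has_cycle are hypothetical, and the cycle check is a plain depth-first search.

```python
from typing import Dict, Set

class WaitForGraph:
    """Directed graph with an edge Ti -> Tj when Ti waits for a lock held by Tj."""

    def __init__(self) -> None:
        self.edges: Dict[str, Set[str]] = {}

    def add_wait(self, waiter: str, holder: str) -> None:
        # Create both nodes if needed, then record that waiter waits for holder.
        self.edges.setdefault(waiter, set()).add(holder)
        self.edges.setdefault(holder, set())

    def remove_wait(self, waiter: str, holder: str) -> None:
        # Drop the edge once the holder releases the item.
        self.edges.get(waiter, set()).discard(holder)

    def has_cycle(self) -> bool:
        WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, done
        colour = {node: WHITE for node in self.edges}

        def visit(node: str) -> bool:
            colour[node] = GREY
            for nxt in self.edges[node]:
                if colour[nxt] == GREY:
                    return True  # back edge: a cycle, hence a deadlock
                if colour[nxt] == WHITE and visit(nxt):
                    return True
            colour[node] = BLACK
            return False

        return any(colour[n] == WHITE and visit(n) for n in self.edges)

g = WaitForGraph()
g.add_wait("T0", "T1")   # T0 waits for X, held by T1
g.add_wait("T1", "T2")   # T1 waits for Y, held by T2
print(g.has_cycle())     # False
g.add_wait("T2", "T0")   # T2 waits for Z, held by T0
print(g.has_cycle())     # True
```

The usage mirrors the T0, T1, T2 example from the deadlock definition: the third edge closes the cycle, and removing any one edge of the cycle (for instance by rolling back one transaction) makes has_cycle return False again.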
  • 51. DBMS - Data Backup
Failure with loss of Non-Volatile storage
What would happen if non-volatile storage, such as a hard disk, abruptly crashes? All transactions which are being executed are kept in main memory. All active logs, disk buffers and related data are stored in non-volatile storage.
When such storage fails, it takes away all the logs and the active copy of the database. It makes recovery almost impossible, as everything needed to help recover is also lost. The following techniques may be adopted in case of loss of non-volatile storage.
A mechanism like checkpoints can be adopted, which makes the entire content of the database be saved periodically.
The state of the active database in non-volatile memory can be dumped onto stable storage periodically; the dump may also contain logs, active transactions and buffer blocks.
<dump> can be marked in the log file whenever the database contents are dumped from non-volatile memory to stable storage.
Recovery:
When the system recovers from failure, it can restore the latest dump.
It can maintain a redo-list and an undo-list as in checkpoints.
It can recover the system by consulting the undo and redo lists to restore the state of all transactions up to the last checkpoint.
Database backup & recovery from catastrophic failure
So far we have not discovered any other planet in our solar system which may have life on it, and our own earth is not that safe. In case of a catastrophic failure like an alien attack, the database administrator may still be forced to recover the database.
  • 52. Remote backup, described next, is one of the life-saving solutions. Alternatively, whole database backups can be taken on magnetic tapes and stored at a safer place. Such a backup can later be restored on a freshly installed database to bring it at least to its state at the point of backup.
Grown-up databases are too large to be backed up frequently. Instead, there are techniques by which a database can be restored just by looking at the logs, so backing up the logs at a frequent rate is more feasible than backing up the entire database. The database can be backed up once a week, and the logs, being very small, can be backed up every day or as frequently as every hour.
Remote Backup
Remote backup provides a sense of security and safety in case the primary location where the database is located gets destroyed. Remote backup can be offline, or real-time and online. In case it is offline, it is maintained manually.
Online backup systems are more real-time and are lifesavers for database administrators and investors. An online backup system is a mechanism where every bit of real-time data is backed up simultaneously at two distant places. One of them is directly connected to the system and the other is kept at a remote place as backup.
As soon as the primary database storage fails, the backup system senses the failure and switches the user system to the remote storage. Sometimes this is so instant that users cannot even notice a failure.
DBMS - Data Recovery
Crash Recovery
Though we are living in a highly technologically advanced era where hundreds of satellites monitor the earth and every second billions of people are connected through information technology, failure is expected but not always acceptable.
A DBMS is a highly complex system with hundreds of transactions being executed every second. Availability of a DBMS depends on its complex architecture and the underlying hardware or system software. If it fails or crashes while transactions are being executed, it is expected that the system would follow some sort of algorithm or technique to recover from crashes or failures.
  • 53. Failure Classification
To see where the problem has occurred, we generalize failures into various categories, as follows:
Transaction failure
When a transaction fails to execute or reaches a point after which it cannot be completed successfully, it has to abort. This is called transaction failure, where only a few transactions or processes are hurt. Reasons for transaction failure could be:
Logical errors: where a transaction cannot complete because it has some code error or an internal error condition.
System errors: where the database system itself terminates an active transaction because the DBMS is not able to execute it or it has to stop because of some system condition. For example, in case of deadlock or resource unavailability, the system aborts an active transaction.
System crash
There are problems, external to the system, which may cause the system to stop abruptly and crash, for example an interruption in power supply or a failure of the underlying hardware or software. Examples may include operating system errors.
Disk failure:
In the early days of technology evolution, it was a common problem that hard disk drives or storage drives failed frequently.
Disk failures include the formation of bad sectors, unreachability of the disk, disk head crashes or any other failure which destroys all or part of disk storage.
Storage Structure
We have already described the storage system. In brief, the storage structure can be divided into various categories:
Volatile storage: As the name suggests, this storage does not survive system crashes and is mostly placed very close to the CPU, often embedded onto the chipset itself. Examples: main memory, cache memory. They are fast but can store only a small amount of information.
Nonvolatile storage: These memories are made to survive system crashes. They are huge in data storage capacity but slower in accessibility. Examples may include hard disks, magnetic tapes, flash memory, non-volatile (battery backed up) RAM.
Recovery and Atomicity
When a system crashes, it may have several transactions being executed and various files opened for them to modify data items. As we know, transactions are made of various operations, which are
  • 54. atomic in nature. But according to the ACID properties of a DBMS, atomicity of transactions as a whole must be maintained, that is, either all operations are executed or none.
When a DBMS recovers from a crash, it should maintain the following:
It should check the states of all transactions which were being executed.
A transaction may be in the middle of some operation; the DBMS must ensure the atomicity of the transaction in this case.
It should check whether the transaction can be completed now or needs to be rolled back.
No transaction would be allowed to leave the DBMS in an inconsistent state.
There are two types of techniques which can help a DBMS in recovering as well as maintaining the atomicity of transactions:
Maintaining the logs of each transaction, and writing them onto some stable storage before actually modifying the database.
Maintaining shadow paging, where the changes are done on volatile memory and later the actual database is updated.
Log-Based Recovery
A log is a sequence of records which maintains the records of actions performed by a transaction. It is important that the logs are written prior to the actual modification and stored on stable storage media, which is failsafe.
Log based recovery works as follows:
The log file is kept on stable storage media.
When a transaction enters the system and starts execution, it writes a log about it:
    <Tn, Start>
When the transaction modifies an item X, it writes a log as follows:
    <Tn, X, V1, V2>
This reads: Tn has changed the value of X from V1 to V2.
When the transaction finishes, it logs:
    <Tn, commit>
The database can be modified using two approaches:
Deferred database modification: All logs are written onto stable storage and the database is updated when the transaction commits.
  • 55. Immediate database modification: Each log record is followed by the actual database modification. That is, the database is modified immediately after every operation.
Recovery with concurrent transactions
When more than one transaction is being executed in parallel, the logs are interleaved. At the time of recovery it would become hard for the recovery system to backtrack through all the logs and then start recovering. To ease this situation, most modern DBMSs use the concept of 'checkpoints'.
Checkpoint
Keeping and maintaining logs in real time and in a real environment may fill up all the memory space available in the system. As time passes, the log file may become too big to be handled at all. A checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently on a storage disk. A checkpoint declares a point before which the DBMS was in a consistent state and all the transactions were committed.
Recovery
When a system with concurrent transactions crashes and recovers, it behaves in the following manner:
The recovery system reads the logs backwards from the end to the last checkpoint.
It maintains two lists, an undo-list and a redo-list.
If the recovery system sees a log with <Tn, Start> and <Tn, Commit> or just <Tn, Commit>, it puts the transaction in the redo-list.
If the recovery system sees a log with <Tn, Start> but no commit or abort log is found, it puts the transaction in the undo-list.
All transactions in the undo-list are then undone and their logs are removed. For all transactions in the redo-list, their previous logs are removed, and then they are redone and their logs saved.
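The classification step of this recovery procedure can be sketched as below. This is only an illustration under stated assumptions: log records are modelled as Python tuples of the form (transaction, kind, ...) with kind being "start", "update", "commit" or "abort", a checkpoint is the one-element tuple ("checkpoint",), and the function name build_recovery_lists is hypothetical. Actually undoing and redoing the updates is left out.

```python
from typing import List, Tuple

def build_recovery_lists(log: List[Tuple]) -> Tuple[List[str], List[str]]:
    """Scan the log backwards to the last checkpoint; return (undo, redo)."""
    undo: List[str] = []
    redo: List[str] = []
    finished = set()  # transactions seen with a commit or abort record
    for record in reversed(log):
        if record[0] == "checkpoint":
            break  # everything before the checkpoint is assumed committed
        txn, kind = record[0], record[1]
        if kind == "commit":
            redo.append(txn)      # committed after the checkpoint: redo
            finished.add(txn)
        elif kind == "abort":
            finished.add(txn)     # already rolled back: nothing to do
        elif kind == "start" and txn not in finished:
            undo.append(txn)      # started but never committed or aborted
    return undo, redo

log = [
    ("T0", "start"), ("T0", "update", "X", 1, 2), ("T0", "commit"),
    ("checkpoint",),
    ("T1", "start"), ("T1", "update", "X", 2, 3), ("T1", "commit"),
    ("T2", "start"), ("T2", "update", "Y", 5, 6),
]
undo, redo = build_recovery_lists(log)
print(undo)  # ['T2']
print(redo)  # ['T1']
```

T0 committed before the checkpoint and is ignored, T1 committed after it and goes to the redo-list, and T2 was still running at the crash and goes to the undo-list.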