Database Management Systems

    Prof. Weining Zhang
    Dept. of Computer Science
    University of Texas at San Antonio

Overview
    We study the internals of DBMSs
        Principles of relational DBMSs
            • Emphasis on query & transaction processing techniques
        Advanced database systems & applications
            • OODBMS, XML databases, data warehousing, OLAP, data mining
    Course work includes
        Homework, 2 midterm exams, no final exam
        Programming assignments in Java


                                                                       W. Zhang                   Introduction                       2




Teaching Staff
    Instructor: Prof. Weining Zhang
        Office: SB 4.01.19
        Phone: 458-5557
        Email: wzhang@cs.utsa.edu
        Office hours: MW 5:00-6:00 pm, T 4:00-5:00 pm, and by appointment

Communication
    Web page: http://www.cs.utsa.edu/~wzhang/cs5443/home
        Contains everything about the course: syllabus, announcements, assignments, project, lecture notes, etc.
        You should check the course web pages regularly.
    Mailing list: 5443@cs.utsa.edu
        Include your CS email address; you may need to forward emails to your regular email address




Textbooks
    Required textbook:
        Database Management Systems, 3rd ed., by Ramakrishnan & Gehrke
    Recommended textbooks:
        Principles of Distributed Database Systems, by M. Ozsu & P. Valduriez
        Database Systems: The Complete Book, by Garcia-Molina, Ullman & Widom
        Database System Concepts, 5th ed., by Silberschatz, Korth & Sudarshan

Other Textbooks
    Fundamentals of Database Systems, 5th ed., by Elmasri & Navathe
    Other database books in the Main Library
Prerequisite
    CS3743 or equivalent, or extensive experience with databases & DB applications
    Strong Java programming skills
    Data structures, algorithms, OO programming, etc.
    Mathematics including logic, sets, algebra, …

Grading
    Programming assignments 20%
    Homework 20%
    Midterm I 25%
    Midterm II 25%
    Intangibles 10%




Programming Assignments
    Implement several components of a simple DBMS called Minibase (Java version), such as
        Buffer Manager
        Heap File
        Hash-based Index
        Relational operators
        Query processing
    Work in groups of 2
    Programming in Java, on Linux or Windows; the Eclipse IDE is recommended

Introduction to Database Systems
    A database system consists of
        Database management system: the software
        Databases: the data
    A DBMS needs to provide
        persistent data storage
        a declarative query language for efficient data retrieval
        shared access to data by different applications
        data security
        data integrity …




An RDBMS Architecture
    [Figure: block diagram of a DBMS. Web forms, application front ends, and the SQL interface send SQL commands to the query evaluation engine (parser, optimizer, plan executor, operator evaluator). Below it are the file & access methods and the buffer manager & disk manager, alongside concurrency control (transaction manager, lock manager) and the recovery manager. The DBMS stores index files, data files, and the system catalog.]

Storage Management
    Data is stored on disk and processed in main memory
    Since disk I/Os are costly, search structures, such as indexes, must be used to achieve efficient data access
    DBMS components that manage different types of storage include
        Disk Manager: manages pages on the disk drive
        Buffer Manager: manages pages in the main-memory buffer
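The Buffer Manager's role on the Storage Management slide can be sketched as a small page cache. This is only an illustration under stated assumptions, not the Minibase API: the class name `LruBufferPool`, its methods, the 4096-byte page size, and the choice of LRU replacement are all invented for this example, and the "disk" is simulated by a counter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a buffer pool with LRU replacement (not the Minibase API). */
final class LruBufferPool {
    private final int capacity;
    private int diskReads = 0; // counts simulated disk I/Os
    private final LinkedHashMap<Integer, byte[]> frames;

    LruBufferPool(int capacity) {
        this.capacity = capacity;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.frames = new LinkedHashMap<Integer, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, byte[]> eldest) {
                // Evict the least recently used page once the pool is over capacity.
                return size() > LruBufferPool.this.capacity;
            }
        };
    }

    /** Returns the page, touching the simulated disk only on a miss. */
    byte[] getPage(int pageId) {
        byte[] page = frames.get(pageId); // a hit also marks the page as recently used
        if (page == null) {
            diskReads++;
            page = new byte[4096]; // stand-in for a real disk read; page size is an assumption
            frames.put(pageId, page);
        }
        return page;
    }

    int diskReads() {
        return diskReads;
    }

    boolean isCached(int pageId) {
        return frames.containsKey(pageId); // does not change the LRU order
    }
}
```

A real buffer manager would also track pin counts and dirty pages so that pages in use are never evicted and modified pages are written back before their frames are reused; those details are left out here to keep the sketch short.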
File Organization
    Data records are logically organized in files and physically stored on disk pages
    File organization must consider the format and size of data records
    In addition to simple files of raw data, a DBMS also maintains search structures, such as
        Ordering
        Hashing
        Indexing
    to reduce access costs

Query Processing
    A DBMS evaluates declarative queries by executing an optimal query plan that is expressed using relational algebra operations.
    A DBMS must evaluate algebraic operations efficiently.
    The algorithms and the costs of relational algebra operations, such as selection and join, depend critically on
        types of query conditions
        specifics of file organizations
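To illustrate how the algorithm chosen for one algebraic operation affects its cost, here is a minimal in-memory sketch of an equi-join evaluated two ways. Everything in it is an assumption made for illustration: the names `JoinSketch`, `nestedLoopsJoin`, and `hashJoin` are invented, rows are `int[]` pairs with the join key in column 0, and a real DBMS would work on disk pages rather than Java lists.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: two ways to evaluate an equi-join on column 0 of each row. */
final class JoinSketch {
    /** Compares every pair of rows, so it performs |r| * |s| key comparisons. */
    static List<int[]> nestedLoopsJoin(List<int[]> r, List<int[]> s) {
        List<int[]> out = new ArrayList<>();
        for (int[] a : r) {
            for (int[] b : s) {
                if (a[0] == b[0]) {
                    out.add(new int[] {a[0], a[1], b[1]});
                }
            }
        }
        return out;
    }

    /** Builds a hash table on s once, then probes it once per row of r. */
    static List<int[]> hashJoin(List<int[]> r, List<int[]> s) {
        Map<Integer, List<int[]>> table = new HashMap<>();
        for (int[] b : s) {
            table.computeIfAbsent(b[0], k -> new ArrayList<>()).add(b);
        }
        List<int[]> out = new ArrayList<>();
        for (int[] a : r) {
            List<int[]> matches = table.getOrDefault(a[0], Collections.<int[]>emptyList());
            for (int[] b : matches) {
                out.add(new int[] {a[0], a[1], b[1]});
            }
        }
        return out;
    }
}
```

Both methods return the same rows; the hash join only applies to equality conditions, which is one way the type of query condition constrains the choice of algorithm.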




Query Optimization
    For ease of use, query languages are declarative.
    The system must figure out an efficient evaluation plan
    The goal is to answer a query with as few disk I/Os as possible
    The system uses statistics about the data & heuristics to decide how to process the query

Transaction Processing
    A transaction models the execution of a database application, which typically updates the data in databases.
    Transaction management must deal with concurrent transactions and possible system failures.
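As a worked illustration of counting disk I/Os when comparing plans, consider the page-oriented nested loops join, whose usual textbook cost is M + M * N page reads for an outer relation of M pages and an inner relation of N pages (ignoring the cost of writing the result). The page counts below are made-up statistics, and the class and method names are hypothetical.

```java
/** Hypothetical sketch: picking the cheaper join order by estimated page I/Os. */
final class JoinCostSketch {
    /** Page-oriented nested loops: read the outer once, scan the inner once per outer page. */
    static long pageNestedLoopsCost(long outerPages, long innerPages) {
        return outerPages + outerPages * innerPages;
    }

    /** Returns "R" or "S": the relation that is cheaper to use as the outer. */
    static String cheaperOuter(long pagesR, long pagesS) {
        long costWithROuter = pageNestedLoopsCost(pagesR, pagesS);
        long costWithSOuter = pageNestedLoopsCost(pagesS, pagesR);
        return costWithROuter <= costWithSOuter ? "R" : "S";
    }

    public static void main(String[] args) {
        // Assumed statistics: R occupies 1000 pages, S occupies 500 pages.
        System.out.println(pageNestedLoopsCost(1000, 500)); // 501000
        System.out.println(pageNestedLoopsCost(500, 1000)); // 500500
        System.out.println(cheaperOuter(1000, 500));        // S
    }
}
```

Under this cost model the smaller relation is the better outer, although the difference is small; an optimizer would also weigh other join algorithms and any available indexes.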




Recovery
    The recovery manager protects data integrity in case of a system crash.
    The system guarantees that either all operations of a transaction or none of them are performed, and that updates made by completed transactions are persistent.

Concurrency Control
    Concurrent execution of application programs is essential for good DBMS performance.
        Need to keep the CPU busy while performing I/O operations (frequent & relatively slow).
    Interleaving actions of different user programs can lead to inconsistency: e.g., a check is cleared while the account balance is being computed.
    The concurrency control subsystem ensures such problems don't arise: users can pretend they are using a single-user system.
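The interleaving problem from the Concurrency Control slide can be shown in miniature with two threads updating one balance. This is a sketch under stated assumptions: the class `LockedAccount` is invented, and Java's `synchronized` keyword stands in for the exclusive locks a DBMS lock manager would grant; it is not how any particular DBMS implements concurrency control.

```java
/** Hypothetical sketch: a lock makes each read-modify-write on the balance atomic. */
final class LockedAccount {
    private long balance = 0;

    /** synchronized plays the role of an exclusive lock held for the whole update. */
    synchronized void deposit(long amount) {
        long current = balance;     // read
        balance = current + amount; // write; no other thread can interleave in between
    }

    synchronized long balance() {
        return balance;
    }

    /** Starts the given number of threads, each depositing 1 unit perThread times. */
    static long runConcurrentDeposits(int threads, int perThread) {
        LockedAccount account = new LockedAccount();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    account.deposit(1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentDeposits(2, 100_000)); // 200000
    }
}
```

If `synchronized` were removed from `deposit`, two threads could read the same balance and one update could overwrite the other, so the final total could fall short of 200000.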
Advanced Hashing & Indexing
    Relational DBMSs support hashing & B+ tree indexing
    New DBMSs & DB applications need more sophisticated search structures
        Hashing with variable-size hash tables or multiple keys
        Indexes for spatial, multidimensional data (common in multimedia DBSs, data warehousing, OLAP, …)

Distributed DBMS
    Modern corporations have data, control, & applications distributed globally
    Multiple databases at geographically dispersed locations need to cooperate to answer queries with distributed data
    Concurrent transaction processing and recovery are still major issues




Parallel DBMS
    Both centralized and distributed databases may use multiple processors to evaluate queries
    Parallel system architecture requires new algorithms for query evaluation and optimization
    Performance concerns include
        Ability to scale up
        Ability to speed up

XML & Semistructured DBMS
    Data in RDB, OODB, & ORDB are structured (with rigid schemas)
    Data on the Web (and other applications) are semistructured
        HTML, XML, Text, …
    Need new concepts and techniques
        Data model, query language
        Query processing & optimization
        Storage management
        Update, transaction processing, CC, …
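The two performance concerns named on the Parallel DBMS slide are commonly measured as ratios of elapsed times. The sketch below uses the usual definitions: speed-up compares the same workload on a small and a larger system, and scale-up compares a small workload on a small system with a proportionally larger workload on a proportionally larger system. The class name, method names, and all timings are made up for illustration.

```java
/** Hypothetical sketch of the usual speed-up and scale-up ratios. */
final class ParallelMetrics {
    /** Same workload on both systems; ideal (linear) speed-up equals the processor ratio. */
    static double speedUp(double secondsOnSmallSystem, double secondsOnLargeSystem) {
        return secondsOnSmallSystem / secondsOnLargeSystem;
    }

    /** Workload grows with the system; ideal (linear) scale-up equals 1.0. */
    static double scaleUp(double secondsSmallJobSmallSystem, double secondsLargeJobLargeSystem) {
        return secondsSmallJobSmallSystem / secondsLargeJobLargeSystem;
    }

    public static void main(String[] args) {
        // Assumed timings: a query takes 120 s on 1 processor and 30 s on 4 processors.
        System.out.println(speedUp(120.0, 30.0)); // 4.0
        // Assumed timings: 100 s for the small job, 110 s for a 4x job on a 4x system.
        System.out.println(scaleUp(100.0, 110.0));
    }
}
```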




Data Warehousing & OLAP
    Corporations need to put all available data to use when making vital business decisions
    Need technology to integrate data from all sources and keep it up to date
    Need advanced tools to analyze, summarize, and view data in various ways
    Issues:
        Data cube model
        OLAP operations
        Query processing, indexing, views, …

Data Mining
    Data contains important patterns useful for making sound business decisions
    Databases need tools to discover knowledge embedded in data
        Associations
        Clusters
        Classifications
    Useful for business trend analysis, fraud detection, diagnosis, market prediction, …
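One of the OLAP operations mentioned on the Data Warehousing & OLAP slide, roll-up, can be sketched over a tiny two-dimensional cube of sales facts. The class `CubeSketch`, the `Sale` type, the dimension names, and the data are all hypothetical; a real OLAP engine would precompute and index such aggregates rather than scan a list.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical sketch: roll-up over a tiny (region, product) sales cube. */
final class CubeSketch {
    /** One fact: a sales amount stored at a (region, product) cell. */
    static final class Sale {
        final String region;
        final String product;
        final int amount;

        Sale(String region, String product, int amount) {
            this.region = region;
            this.product = product;
            this.amount = amount;
        }
    }

    /** Rolls up the product dimension: total sales per region. */
    static Map<String, Integer> totalByRegion(List<Sale> sales) {
        return sales.stream().collect(
                Collectors.groupingBy(s -> s.region, TreeMap::new,
                        Collectors.summingInt(s -> s.amount)));
    }

    /** Rolls up both dimensions: the grand total at the apex of the cube. */
    static int grandTotal(List<Sale> sales) {
        return sales.stream().mapToInt(s -> s.amount).sum();
    }
}
```

Rolling up a dimension corresponds to dropping it from the grouping; grouping by `product` instead of `region` would give the other one-dimensional view of the same cube.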
Topics
    Relational algebra and calculus
    Storage & File Management
        Disk manager, buffer manager
        Indexing, hashing
    Query Evaluation & Optimization
        Access methods, selection, joins, etc.
        Query optimization methods
    Transaction Processing
        Crash Recovery
        Concurrency Control

Topics (cont.)
    Distributed Database Systems
        Database design
        Query processing & optimization
        Concurrency control & recovery
    Parallel Database Systems
    XML databases
    Data Warehousing and OLAP
    Data Mining, …
