MAS.500 - Software Module - Rahul Bhargava 
Data 
Management 
2014.11.21
Topics 
❖ Regular Expressions (online quickstart) 
❖ Databases 
❖ History 
❖ Relational modeling 
❖ SQL (MySQL quickstart) 
❖ Keys/Indexes 
❖ NoSQL (CouchDB quickstart) 
❖ Behind the Scenes with Ed Platt 
❖ Homework
Regular Expressions
Regular Expressions 
(RegEx/grep) 
❖ Match a string of text by defining a pattern 
❖ Useful for cleaning up or identifying data 
❖ “Find” Demo on http://regexpal.com 
❖ “Find/Replace” Demo with 
http://www.sugarscript.com/findandreplace/index.php 
❖ Interested? Interactive tutorial on http://regexone.com
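The "Find" and "Find/Replace" demos can also be tried offline. Below is a minimal sketch using Python's built-in `re` module; the sample text and the phone-number pattern are made up for illustration and are not taken from the demos.

```python
import re

# Hypothetical messy input: phone numbers written in inconsistent styles.
raw = "Call 617-555-0123 or 617.555.0199; office: (617) 555 0142."

# Pattern: optional "(", 3 digits, optional ")", an optional separator,
# 3 digits, an optional separator, 4 digits. Parentheses capture each part.
phone = re.compile(r"\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})")

# "Find": list every match as a tuple of the captured groups.
found = phone.findall(raw)

# "Find/Replace": rewrite every match in one consistent format.
cleaned = phone.sub(r"\1-\2-\3", raw)

print(found)
print(cleaned)
```

The same pattern does both jobs: identifying data (`findall`) and cleaning it up (`sub`).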
Databases
Database History 
❖ List-based 
❖ Follow link from one record to another (linked-list) 
❖ File-system data stores 
❖ Based on file-naming conventions, limited by file I/O 
speeds 
❖ Generic data storage and management 
❖ Relational modeling of entities and relationships 
(ER)
Relational Modeling: In 
English 
❖ A Group has many People 
❖ A Person belongs to one Group 
❖ A Group has many Projects 
❖ A Project belongs to one Group 
❖ A Person has many Projects 
❖ A Project has many People
Relational Modeling: Diagram 
❖ Person (many) to Group (1) 
❖ Group (1) to Project (many) 
❖ Person (many) to Project (many)
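Relational Modeling: Tables 
❖ Group: id, name, url 
❖ Person: id, name, password, group_id 
❖ Project: id, name, url 
❖ Membership: person_id, project_id 
❖ Relationships: Person (many) to Group (1); Group (1) to Project (many); Person (many) to Project (many), via Membership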
Relational Modeling: Keys 
❖ Group: id (key), name, url 
❖ Person: id (key), name, password, group_id (foreign key) 
❖ Project: id (key), name, url 
❖ Membership: person_id (foreign key), project_id (foreign key)
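One way to write these tables and keys down as a schema is sketched below, using Python's built-in `sqlite3` module. The column types, the `NOT NULL` constraints, the composite key on `membership`, and the sample rows are assumptions added for illustration; the slides only name the columns. `group` is quoted because GROUP is an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE "group" (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    url  TEXT
);
CREATE TABLE person (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    password TEXT,
    group_id INTEGER REFERENCES "group"(id)   -- foreign key: a Person belongs to one Group
);
CREATE TABLE project (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    url  TEXT
);
CREATE TABLE membership (                      -- join table for the many-to-many link
    person_id  INTEGER REFERENCES person(id),
    project_id INTEGER REFERENCES project(id),
    PRIMARY KEY (person_id, project_id)
);
""")

# Hypothetical sample rows.
conn.execute('INSERT INTO "group" (id, name) VALUES (?, ?)', (1, "Group A"))
conn.execute("INSERT INTO person (name, group_id) VALUES (?, ?)", ("Ana", 1))

# A person pointing at a group that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO person (name, group_id) VALUES (?, ?)", ("Bob", 99))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```

The foreign key is what lets the database, rather than application code, guarantee that every `group_id` refers to a real group.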
Structured Query Language 
(SQL) 
❖ Works in lots of database servers 
❖ SQLite, MySQL, PostgreSQL, MS SQL Server 
❖ Standard way to: 
❖ Find subsets of data based on criteria 
❖ Merge data in separate tables 
❖ Compute aggregate info 
❖ Assumptions 
❖ Don’t duplicate data (“data normalization”) 
❖ Various parts of your data relate to each other 
❖ Your metadata/schema (tables/columns) doesn’t change often 
❖ Many frameworks will generate SQL for you 
❖ Ask about Database Abstraction Layers
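The three "standard way to" items map onto three common query shapes. The sketch below runs them against an in-memory SQLite database from Python; the table contents are hypothetical and the schema is cut down to two tables to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "group" (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, group_id INTEGER);
INSERT INTO "group" VALUES (1, 'Group A'), (2, 'Group B');
INSERT INTO person VALUES (1, 'Ana', 1), (2, 'Ben', 1), (3, 'Cy', 2);
""")

# Find a subset of data based on criteria (WHERE).
subset = conn.execute(
    "SELECT name FROM person WHERE group_id = ? ORDER BY name", (1,)
).fetchall()

# Merge data in separate tables (JOIN on the foreign key).
merged = conn.execute("""
    SELECT person.name, "group".name
    FROM person JOIN "group" ON person.group_id = "group".id
    ORDER BY person.id
""").fetchall()

# Compute aggregate info (COUNT with GROUP BY): people per group.
counts = conn.execute("""
    SELECT "group".name, COUNT(person.id)
    FROM "group" LEFT JOIN person ON person.group_id = "group".id
    GROUP BY "group".id
    ORDER BY "group".id
""").fetchall()

print(subset)
print(merged)
print(counts)
```

Because the group name is stored once and joined in when needed, renaming a group is a single-row update, which is the point of not duplicating data.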
NoSQL 
❖ Sometimes your data isn’t relational and the metadata 
changes often 
❖ Queuing, document storage, logging, real-time, low-latency, 
concurrency 
❖ Read this write up for more: 
❖ http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
Tangent: JavaScript Object Notation 
(JSON) 
❖ A human-readable data exchange format 
❖ CSV, XML, YAML are some others 
❖ Example: 
❖ http://media.mongodb.org/zips.json 
❖ http://mongohub.todayclose.com (for Mac)
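For a feel of what JSON looks like in practice, here is a small sketch using Python's built-in `json` module. The record is hypothetical and its field names are chosen only for illustration; it is not copied from the linked example file.

```python
import json

# Hypothetical record; the field names are illustrative assumptions.
text = '{"city": "EXAMPLEVILLE", "loc": [-71.0, 42.0], "pop": 1234, "state": "MA"}'

record = json.loads(text)        # JSON text -> Python dict
record["pop"] += 1               # work with it like any other dict
round_trip = json.dumps(record, sort_keys=True)   # Python dict -> JSON text

print(record["city"], record["loc"])
print(round_trip)
```

Objects become dicts and arrays become lists, so nested data survives the round trip without a fixed schema.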
❖ sudo mkdir -p /data/db (creates the default data directory that mongod expects)
MongoDB: Intro 
❖ Demo: 
❖ Command Line 
❖ MongoHub
Indexes 
❖ An index tracks keys 
❖ Convention: have an “id” column with an index on it 
❖ Why all these indexes? 
❖ Multiple ways to get at rows quickly 
❖ Creating indexes is tricky 
❖ Many frameworks include query logging to help you find 
slow queries that might need optimizing 
❖ Query optimization is a bit of an art 
❖ Use the “Explain” command
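The effect of an index can be seen by asking the database how it plans to run a query. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` from Python; other servers have their own `EXPLAIN` syntax and output. The table, the row counts, and the index name `idx_person_group` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, group_id INTEGER)")
conn.executemany(
    "INSERT INTO person (name, group_id) VALUES (?, ?)",
    [(f"person{i}", i % 10) for i in range(1000)],   # hypothetical rows
)

query = "SELECT name FROM person WHERE group_id = 3"

def plan(sql):
    # Each EXPLAIN QUERY PLAN row ends with a human-readable detail string.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

before = plan(query)   # no index on group_id yet: expect a full table scan
conn.execute("CREATE INDEX idx_person_group ON person (group_id)")
after = plan(query)    # the planner can now look rows up through the index

print(before)
print(after)
```

The exact wording of the plan varies between SQLite versions, so it is safer to look for a scan versus an index name than to rely on the full string.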
Map-Reduce Instead of SQL 
❖ Used to query large datasets 
❖ Example: Count words in a document 
❖ Map: select the data you need to operate on 
❖ “emit” one record for each word in a document, 
keyed by the word 
❖ Reduce: combine the mapped data 
❖ Sum up the uses of each word, “emitting” one 
record for each total
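The word-count example can be sketched in plain Python as below. This is a single-process illustration of the map and reduce steps only, not how any particular database runs them; the function names and sample documents are hypothetical.

```python
from collections import defaultdict

def map_words(document):
    """Map: emit one (word, 1) record per word, keyed by the word."""
    for word in document.lower().split():
        yield word, 1

def reduce_counts(word, counts):
    """Reduce: combine all records for one key into a single total."""
    return word, sum(counts)

def map_reduce(documents):
    # Group the emitted records by key, then reduce each group.
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_words(doc):
            grouped[key].append(value)
    return dict(reduce_counts(k, v) for k, v in grouped.items())

totals = map_reduce(["the cat sat", "the cat ran"])
print(totals)
```

Since each document is mapped independently and each key is reduced independently, both steps can be spread across many machines, which is why the pattern suits large datasets.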
Picking Data Storage 
Strategies 
❖ If you just need to dump data and pull it out by some id, use a NoSQL 
solution (MongoDB is simple) 
❖ flexible, easy to start with 
❖ If you are modeling an app, a relational database is usually the 
right answer (MySQL/PostgreSQL are standard) 
❖ Database modeling is REALLY important to get right at the 
start of your project, because it is a pain to change later 
❖ Names matter – choose your table names carefully 
❖ PS: we can try stuff out on Amazon’s cloud services for free
Homework 
❖ see course outline
