Eat Your Own Dog Food
Migrating MongoDB University From SQL to MongoDB
John Yu
What Is MongoDB University?
MOOC: Massive Open Online Courses (~2010)
Free MongoDB courses on the web
MongoDB certification (developer, DBA)
Developed by MongoDB Inc
Why was it using SQL?
History of MongoDB University
Started in 2012 with a fork of edX
MySQL DB, Python Django, XML
Django is designed for SQL databases
Future option to use MongoDB for course materials
Why should we move to MongoDB?
Maybe we shouldn’t
Site works fine
SQL is fine
A lot of work to move to MongoDB
MongoDB is not a great fit for Django
We don’t use many of MongoDB’s standout features (sharding)
Eat your own dog food
If we think MongoDB is good, then we should use it
Help test MongoDB products
MongoDB is good for University too
MongoDB is closer to application data
Arrays
Subclasses (flexible schema)
Ease of development (pymongo)
Integration with other MongoDB tools (Atlas, Compass, Charts)
"While you are attending PyCon, please visit the MongoDB booth to learn about PyMongo!"
#PyCon #Cleveland #MongoDB #Python
MongoDB:
{
  "text": "While you are attending PyCon, please visit the MongoDB booth to learn about PyMongo!",
  "tags": ["PyCon", "Cleveland", "MongoDB", "Python"]
}
SQL:
id | text
1  | "While you are attending PyCon..."

blog_id | tag
1       | "PyCon"
1       | "Cleveland"
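The contrast above can be sketched with the Python standard library alone. An in-memory SQLite database stands in for the SQL side and a plain dict stands in for the MongoDB document. The table names `posts` and `post_tags` are assumptions made for this illustration, not names from the talk.

```python
import sqlite3

# SQL side: tags live in a second table and must be joined back to the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, text TEXT)")
conn.execute("CREATE TABLE post_tags (blog_id INTEGER, tag TEXT)")
conn.execute("INSERT INTO posts VALUES (1, 'While you are attending PyCon...')")
conn.executemany(
    "INSERT INTO post_tags VALUES (?, ?)",
    [(1, "PyCon"), (1, "Cleveland"), (1, "MongoDB"), (1, "Python")],
)
rows = conn.execute(
    "SELECT t.tag FROM posts p JOIN post_tags t ON t.blog_id = p.id "
    "WHERE p.id = 1 ORDER BY t.rowid"
).fetchall()
sql_tags = [tag for (tag,) in rows]

# Document side: the tags are an array stored inside the post itself.
post = {
    "text": "While you are attending PyCon...",
    "tags": ["PyCon", "Cleveland", "MongoDB", "Python"],
}
doc_tags = post["tags"]
```

Both paths end with the same list of tags; the difference is that the document keeps the array next to the text it describes, so no join is needed to read it.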
PyMongo vs Python SQL connector
> user = database.collection.find_one({'user_id': 1})
> user
{
  "email": "john.yu@mongodb.com",
  "address": {
    "street": "1633 Broadway",
    "city": "New York",
    "state": "NY",
    "country": "United States"
  }
}
> user['address']['country']
"United States"
> user = connection.execute('SELECT * FROM people WHERE id=1').fetchone()
> user
('john.yu@mongodb.com', '1633 Broadway', 'New York', 'NY', 'United States')
> user[4]
"United States"
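A runnable version of this comparison, as a minimal sketch: the dict below has the shape PyMongo would return, and the standard-library `sqlite3` module stands in for "a Python SQL connector". The `people` table layout is an assumption chosen to match the tuple on the slide.

```python
import sqlite3

# Document shape: a nested dict, so fields are reached by name.
user_doc = {
    "email": "john.yu@mongodb.com",
    "address": {
        "street": "1633 Broadway",
        "city": "New York",
        "state": "NY",
        "country": "United States",
    },
}
country_from_doc = user_doc["address"]["country"]

# SQL shape: a flat tuple, so the caller must know the column position.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, email TEXT, street TEXT, "
    "city TEXT, state TEXT, country TEXT)"
)
conn.execute(
    "INSERT INTO people VALUES (1, ?, ?, ?, ?, ?)",
    ("john.yu@mongodb.com", "1633 Broadway", "New York", "NY", "United States"),
)
user_row = conn.execute(
    "SELECT email, street, city, state, country FROM people WHERE id = 1"
).fetchone()
country_from_row = user_row[4]
```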
Flexible Schema Within a Collection
Analogous to subclasses in programming languages
Example: Multiple-choice problem vs Text problem
{
type: "multiple-choice",
question: "Who was the first president of the US?",
choices: [
{
"text": "Barack Obama",
"is_correct": false
},
{
"text": "George Washington",
"is_correct": true
}
]
}
{
type: "text",
question: "Who was the president during the civil war?",
answer: "Abraham Lincoln"
}
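The subclass analogy can be sketched in Python: one function handles both document shapes by branching on the `type` field. The function name `is_correct`, the use of a choice index as the submission, and the case-insensitive text comparison are all assumptions for illustration; the talk does not say how grading is implemented.

```python
def is_correct(problem, submission):
    """Grade one submission; the document's 'type' field selects the branch."""
    if problem["type"] == "multiple-choice":
        # Assumption: the submission is the index of the chosen option.
        return problem["choices"][submission]["is_correct"]
    if problem["type"] == "text":
        # Assumption: compare ignoring case and surrounding whitespace.
        return submission.strip().lower() == problem["answer"].lower()
    raise ValueError(f"unknown problem type: {problem['type']!r}")


multiple_choice = {
    "type": "multiple-choice",
    "question": "Who was the first president of the US?",
    "choices": [
        {"text": "Barack Obama", "is_correct": False},
        {"text": "George Washington", "is_correct": True},
    ],
}
text_problem = {
    "type": "text",
    "question": "Who was the president during the civil war?",
    "answer": "Abraham Lincoln",
}
```

Both documents can live in the same collection; only the code that reads them needs to know which fields each type carries.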
Summary
MongoDB can be normalized like SQL, but you can also have arrays and embedded documents.
MongoDB gives you more options than a tabular DB.
How did we do it?
Flexible schema is great, but more decisions to be made.
Top-down design
How will the data be CRUDed?
- What operations are needed to render a web page?
Optimize for queries, since querying happens more often
than creating/updating/deleting
What is a course?
Course
A course has one or more chapters
A chapter has one or more lessons
A lesson has one or more units
- Lecture
- Problems (multiple-choice, text)
Student progress (progress, grades)
Did the student view a chapter/lesson/problem?
Student submissions for problems
- Problem ID
- Answer submitted
- Submitted date
Student grade
Many students per course
Old Way: Course
<course id="M101P" title="MongoDB for Python Developers" start="Aug 12 17:00 UTC 2013" end="Sep 21 17:00 UTC 2013">
  <chapter title="Week 1: Introduction" start="Aug 12 17:00 UTC 2013" end="Aug 19 17:00 UTC 2013">
    <lesson title="Welcome to M101P">
      <problem id="5d4340a7eba" title="Quiz: MongoDB vs SQL">
        Which DB should you use?
        <choice correct="false">MySQL</choice>
        <choice correct="false">Postgres</choice>
        <choice correct="true">MongoDB</choice>
      </problem>
    </lesson>
    <lesson> … </lesson>
  </chapter>
  <chapter title="Week 2: CRUD operations">…</chapter>
</course>
New Way: Course
{
start: 2018-08-01 17:00 UTC,
end: 2018-09-01 17:00 UTC,
title: "US History",
chapters: [
{
title: "Chapter 1: Introduction",
lessons: [
{
title: "First President",
video: youtube.com/123456,
problem: {
type: "multiple-choice",
question: "Who was the first president of
the US?",
choices: [
{
"text": "Barack Obama",
"is_correct": false
},
{
"text": "George Washington",
"is_correct": true
}
]
}
},
{
title: "Second president",
video: youtube.com/2334566,
problem: {
type: "text",
question: "Who was the president
during the civil war?",
answer: "Abraham Lincoln"
}
}
]
},
{
title: "Chapter 2: Wars",
lessons: [
]
}
]
}
Good:
- Courseware needs bits of the entire offering
- Can project just the fields we need
Bad:
- Offering can be a big document
Note: previously, offerings were in memory, not DB
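"Project just the fields we need" could look like the sketch below. It uses PyMongo's `find_one(filter, projection=...)` call with dotted field paths; the helper name `fetch_course_outline` and the idea of looking a course up by title are assumptions for illustration.

```python
def fetch_course_outline(courses, title):
    """Fetch only the course, chapter, and lesson titles of one course.

    `courses` is expected to be a PyMongo collection holding documents
    shaped like the "New Way: Course" example.
    """
    projection = {
        "_id": 0,
        "title": 1,
        "chapters.title": 1,
        "chapters.lessons.title": 1,
    }
    # Videos and problems are left out, so the returned document stays
    # small even when the stored course document is large.
    return courses.find_one({"title": title}, projection=projection)


# Hypothetical usage, given a connected database handle `db`:
#   outline = fetch_course_outline(db.courses, "US History")
```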
Old way: Student Progress
student_id | course_id         | problem_id     | state
71495      | "M101P/2019_July" | "5d4340a7eba"  | '{"answer": [0,1,2,3], "score": 1, "submit_date": 2019-07-21 09:15 UTC}'
13789      | "M101/2015_May"   | "21b172e26113" | '{"answer": "Barack Obama", "score": 0, "submit_date": 2015-05-12 10:15 UTC}'
Approach 1: Mechanically move SQL tables to MongoDB collections
{
  student_id: 71495,
  course_id: "M101P/2019_July",
  problem_id: "5d4340a7eba",
  state: {
    "answer": [0,1,2,3],
    "score": 1,
    "submit_date": 2019-07-21 09:15 UTC
  }
},
{
  student_id: 13789,
  course_id: "M101P/2015_May",
  lecture_id: "21b172e26113",
  state: {
    "last_viewed": 2015-05-12 10:15 UTC
  }
}
Good:
- Easy to migrate
- Already better than the previous table
Bad:
- Many queries required per page
Approach 2A: All progress for a course in 1 document
{
course_id: "M101P/2019_May",
students: [
{
user_id: 11111,
units: [
{
id: "Problem 1",
attempts: [
{
date: 2016-06-02 15:02 UTC,
index: 2
},
{
date: 2017-06-05 11:08 UTC,
index: 0
}
],
}
]
},
{
user_id: 22222,
units: [
{
id: "Problem 1",
attempts: [
{
date: 2016-06-02 15:02 UTC,
index: 2
},
{
date: 2017-06-05 11:08 UTC,
index: 0
}
],
}
]
}
]
}
Good:
- ?
Bad:
- Doesn't fit common use case
- Grows without bound
Approach 2B: All courses for a student in 1 document
{
  user_id: 11111,
  courses: [
    {
      course_id: "M101P",
      units: {
        "lecture_1": {
          last_viewed: 2016-06-01 10:10 UTC
        },
        "problem_1": {
          attempts: [
            {
              date: 2016-06-02 15:02 UTC,
              index: 2
            },
            {
              date: 2017-06-05 11:08 UTC,
              index: 0
            }
          ]
        }
      }
    },
    {
      course_id: "M101P",
      units: {
        "lecture_1": {
          last_viewed: 2016-06-01 10:10 UTC
        },
        "problem_1": {
          attempts: [
            {
              date: 2016-06-02 15:02 UTC,
              index: 2
            },
            {
              date: 2017-06-05 11:08 UTC,
              index: 0
            }
          ]
        }
      }
    }
  ]
}
Good:
- ?
Bad:
- Will probably grow larger than document size limit
Better Approach: Fit to use case (progress)
{
user_id: 71495,
course_id: "M101P/2019_July",
units: {
"lecture_1": {
last_viewed: 2016-06-01 10:10 UTC
},
"problem_1": {
attempts: [{
date: 2016-06-02 15:02 UTC, index: 2
},
{
date: 2017-06-05 11:08 UTC, index: 0
}
]
},
"problem_2": {
attempts: [{
date: 2017-06-05 11:08 UTC, text: "Barack Obama"
}
]
}
}
}
Good:
- Courseware often needs multiple units at a time
- Grade student's progress in one document
- Can still update just parts of the document
Bad:
- Document can grow without bound
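"Update just parts of the document" can be sketched with a dotted-path `$push`, which appends to one unit's `attempts` array without rewriting the rest of the document. The helper name `record_attempt` and the collection name `progress` are assumptions for illustration.

```python
from datetime import datetime, timezone


def record_attempt(user_id, course_id, unit_id, attempt):
    """Build the filter and update that append one attempt to one unit."""
    query = {"user_id": user_id, "course_id": course_id}
    # The dotted path targets a single unit inside the `units` sub-document.
    update = {"$push": {f"units.{unit_id}.attempts": attempt}}
    return query, update


query, update = record_attempt(
    71495,
    "M101P/2019_July",
    "problem_1",
    {"date": datetime(2017, 6, 5, 11, 8, tzinfo=timezone.utc), "index": 0},
)

# With PyMongo, the pair would be sent as (collection name is hypothetical):
#   db.progress.update_one(query, update, upsert=True)
```

With `upsert=True`, the first attempt in a course creates the progress document, and later attempts only touch the one array they belong to.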
ODM
We use PyModm (https://github.com/mongodb/pymodm)
We can use Python classes instead of dictionaries
- Application side schema validation
- Now there is MongoDB schema validation
- Type checking
- Convenience
Downsides:
- New querying language (but mimics Django ORM)
- Unclear when queries are actually being executed
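The idea of "Python classes instead of dictionaries" can be shown without any ODM at all. The sketch below uses a standard-library dataclass, not PyModm's API, to show what application-side validation and type checking buy you; the class name `Attempt` and its rules are assumptions based on the attempt sub-documents shown earlier.

```python
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass
class Attempt:
    """One multiple-choice attempt, validated before it reaches the DB."""

    date: datetime
    index: int

    def __post_init__(self):
        # Application-side checks that a plain dict would not enforce.
        if not isinstance(self.date, datetime):
            raise TypeError("date must be a datetime")
        if not isinstance(self.index, int) or self.index < 0:
            raise ValueError("index must be a non-negative int")


attempt = Attempt(date=datetime(2016, 6, 2, 15, 2), index=2)
document = asdict(attempt)  # plain dict, ready to embed in a progress document
```

An ODM adds querying and persistence on top of this, which is where the two downsides listed above come from.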
How about performance?
• Performance gains from data model
• Basic indexes on queries
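A "basic index on queries" for the progress documents might look like the sketch below. It assumes progress is looked up by `user_id` and `course_id`, as in the "Better Approach" document; the helper name and the choice of fields are illustrative, not taken from the talk. `create_index` is PyMongo's collection method and accepts a list of (field, direction) pairs.

```python
# Compound index matching the assumed lookup {user_id, course_id}.
PROGRESS_INDEX = [("user_id", 1), ("course_id", 1)]


def ensure_progress_index(progress):
    """Create the compound index if it is missing and return its name.

    `progress` is expected to be a PyMongo collection.
    """
    return progress.create_index(PROGRESS_INDEX)
```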
Timeline
- Certification Exams: Just pymongo
- Certification Exams v2: ODM (mongoengine)
- Courseware: ODM (mongoengine)
- Courseware v2: ODM (PyModm)
Summary
SQL is fine
But MongoDB is also good, and sometimes better
We moved to MongoDB because it is great for developers
Beware of pitfalls with document DBs
Future Plans
• Move the rest of the SQL tables to Mongo
• Try newer MongoDB features
• Schema validation
• Transactions
Thank You
