Pandoc:
The Deep Dive
All that is great
stands in the storm
● Universal markup converter == "the Swiss
army knife of text markup formats"
● ALL HASKELL
● Example:
pandoc -o myDoc.md myDoc.html
pandoc -f html -t latex hackage.org
pandoc myDoc.txt -o myDoc.pdf
What is Pandoc?
● Reads:
○ Markdown (GitHub, Strict, etc.), HTML, LaTeX,
Textile, reStructuredText, JSON
● Writes:
○ Markdown, reStructuredText, HTML, DocBook
XML, OpenDocument XML, ODT, RTF, groff
man, MediaWiki markup, GNU Texinfo, LaTeX,
ConTeXt, EPUB, Textile, Emacs org-mode, Slidy,
S5
● Extensions for LaTeX math, tables, etc.
● Note to self: Pandoc in the CLI
What is Pandoc? (pt. 2)
● Performance vis-à-vis scripting languages
● Type safety
● Text.Parsec library
● Hypermuscular list processing (a strength
of FP in general more than of Haskell in
particular)
Why Haskell?
● One possibility: functions devoted to each
type-to-type combination
○ markdownToHTML
○ HTMLtoEPUB
○ 12 × 31 possibilities (one function per reader/writer pair)
○ FUCK THAT
● Vastly better possibility?
Reader -->
Neutral Haskell data type -->
Writer -->
Converted document
Possible approaches
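The Reader --> neutral type --> Writer idea can be sketched in a few lines. This is only an illustration, not Pandoc's code: the `Doc` type, the two toy readers and the two toy writers are all hypothetical names invented here. The point is that two readers plus two writers already give four conversions through plain function composition.

```haskell
import Data.List (intercalate)

-- Hypothetical, deliberately tiny neutral type:
-- a document is just a list of paragraphs.
newtype Doc = Doc [String] deriving (Eq, Show)

type Reader = String -> Doc
type Writer = Doc -> String

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn sep s = case break (== sep) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn sep rest

-- Reader 1 (toy format): every non-empty line is a paragraph.
readLines :: Reader
readLines = Doc . filter (not . null) . lines

-- Reader 2 (toy format): paragraphs are separated by semicolons.
readSemicolons :: Reader
readSemicolons = Doc . filter (not . null) . splitOn ';'

-- Writer 1: HTML paragraphs (no escaping, to keep the sketch short).
writeHtml :: Writer
writeHtml (Doc ps) = concatMap (\p -> "<p>" ++ p ++ "</p>") ps

-- Writer 2: plain text with a blank line between paragraphs.
writePlain :: Writer
writePlain (Doc ps) = intercalate "\n\n" ps

-- Any reader composes with any writer.
convert :: Reader -> Writer -> String -> String
convert reader writer = writer . reader
```

In GHCi, `convert readLines writeHtml "a\nb"` and `convert readSemicolons writePlain "a;b"` exercise two of the four combinations. Adding a new format costs one reader or one writer, not one function per pair.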
● Semi-stateful, non-opinionated parser
combinator library (fills the role of a REGEX
machine)
○ Accumulative — return (x:xs)
○ getParserState
○ modifyState
● Core functions
○ parse
■ parse parser filePath input
■ parse numbers "" "a,b,2,3"
○ many
○ skipMany
○ manyAccum
● type Parser t s = Parsec t s
Text.Parsec
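A minimal sketch of the Parsec pieces named on this slide, assuming the `parsec` package that ships with GHC. The parser names (`cell`, `cells`, `parseCells`) are made up for illustration, and the user state is a plain `Int` counter rather than anything Pandoc uses. It uses `runParser` (which is `parse` plus an initial user state), `getState` and `modifyState`.

```haskell
import Text.Parsec

-- Parser over String input with an Int as user state
-- (a running count of the cells seen so far).
type CellParser = Parsec String Int

-- One comma-separated cell: optional leading spaces,
-- then one or more alphanumeric characters.
cell :: CellParser String
cell = do
  skipMany (char ' ')   -- skipMany runs a parser repeatedly and discards results
  c <- many1 alphaNum
  modifyState (+ 1)     -- bump the counter kept in the parser state
  return c

-- All cells on a line, plus the final value of the counter.
cells :: CellParser ([String], Int)
cells = do
  cs <- cell `sepBy` char ','
  eof
  n <- getState
  return (cs, n)

-- The "" argument is the source name used in error messages.
parseCells :: String -> Either ParseError ([String], Int)
parseCells = runParser cells 0 ""
```

In GHCi, `parseCells "a,b,2,3"` yields a `Right` holding the four cells and a count of 4; malformed input such as `"a,,b"` yields a `Left` with a position-tagged `ParseError`.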
● Neutral data types
○ Pandoc = Meta + [Block]
○ Block = built from [Inline] or nested [Block]
○ Inline
○ etc.
● Reader
○ Applies parsers to documents
○ Documents are treated as lists
● Writer
○ Converts neutral data type into document
○ Again, documents are just structured lists
Basic flow
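To make the "structured lists" point concrete, here is a cut-down version of the neutral types with a toy HTML writer over them. The constructor names echo ones found in Pandoc's own types, but this is a simplification written for this example: the real types carry metadata, attributes and many more constructors, and the real HTML writer escapes text and tracks state.

```haskell
-- Simplified stand-ins for the neutral types (not the real definitions).
data Inline
  = Str String
  | Space
  | Emph [Inline]
  deriving (Eq, Show)

data Block
  = Para [Inline]
  | Header Int [Inline]
  | BulletList [[Block]]   -- each list item is itself a list of blocks
  deriving (Eq, Show)

newtype Doc = Doc [Block] deriving (Eq, Show)

-- Writer side: fold the lists back into a string.
-- No HTML escaping here, to keep the sketch short.
inlineToHtml :: Inline -> String
inlineToHtml (Str s)   = s
inlineToHtml Space     = " "
inlineToHtml (Emph xs) = "<em>" ++ concatMap inlineToHtml xs ++ "</em>"

blockToHtml :: Block -> String
blockToHtml (Para xs) = "<p>" ++ concatMap inlineToHtml xs ++ "</p>"
blockToHtml (Header n xs) =
  "<h" ++ show n ++ ">" ++ concatMap inlineToHtml xs ++ "</h" ++ show n ++ ">"
blockToHtml (BulletList items) = "<ul>" ++ concatMap item items ++ "</ul>"
  where
    item bs = "<li>" ++ concatMap blockToHtml bs ++ "</li>"

writeHtml :: Doc -> String
writeHtml (Doc bs) = concatMap blockToHtml bs
```

Every level is a list: a document is a list of blocks, a paragraph is a list of inlines, and a bullet list is a list of lists of blocks, so the writer is mostly `concatMap`.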
● Readers/Markdown.hs
● Writers/HTML.hs
● Pandoc/Builder.hs
Markdown to HTML
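For the reader half of that path, a toy Markdown-like reader can be written with Parsec in under forty lines. This is not the code in Readers/Markdown.hs: it is a hedged sketch that only understands `#` headers, `*emphasis*` and one-line paragraphs, and all parser names and the cut-down types are assumptions made for this example.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Cut-down neutral types (hypothetical, not Pandoc's definitions).
data Inline = Str String | Space | Emph [Inline] deriving (Eq, Show)
data Block  = Header Int [Inline] | Para [Inline] deriving (Eq, Show)

-- A word: a run of characters other than space, '*' or newline.
str :: Parser Inline
str = Str <$> many1 (noneOf " *\n")

-- One or more spaces collapse into a single Space.
space' :: Parser Inline
space' = Space <$ many1 (char ' ')

-- *emphasised text*; an unmatched '*' is a parse error in this sketch.
emph :: Parser Inline
emph = Emph <$> between (char '*') (char '*') (many1 (str <|> space'))

inline :: Parser Inline
inline = emph <|> str <|> space'

-- "# Title" becomes Header 1, "## Title" becomes Header 2, and so on.
header :: Parser Block
header = do
  hashes <- many1 (char '#')
  skipMany1 (char ' ')
  content <- many1 inline
  return (Header (length hashes) content)

-- Any other non-empty line is a paragraph.
para :: Parser Block
para = Para <$> many1 inline

-- A block ends at one or more newlines or at end of input.
block :: Parser Block
block = (header <|> para) <* (skipMany1 newline <|> eof)

document :: Parser [Block]
document = skipMany newline *> many block <* eof

readMini :: String -> Either ParseError [Block]
readMini = parse document ""
```

In GHCi, `readMini "# Title\n\nHello *big* world\n"` returns the parsed block list, which a writer can then walk to produce HTML or any other output format.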
● When doing big, complex things with FP,
you're probably going to end up thinking in
terms of lists
● Lists are infinitely flexible
● Hard to escape state entirely
○ ReaderState
○ WriterState
● Don't give up
● Force yourself to give a presentation at
PDXFunc
General lessons
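The "hard to escape state entirely" lesson can be illustrated with a writer that numbers footnotes as it goes. This is a sketch under assumptions: it uses the `State` monad from the `mtl` package that ships with GHC, and the `WriterState` record, its `nextNote` field and the two-constructor `Inline` type are hypothetical stand-ins, not Pandoc's definitions.

```haskell
import Control.Monad.State (State, evalState, get, put)

-- Hypothetical writer state: just the number of the next footnote.
newtype WriterState = WriterState { nextNote :: Int }

-- Cut-down inline type for this example only.
data Inline = Str String | Note String deriving (Eq, Show)

-- Rendering a note needs to know how many notes came before it,
-- so the writer runs inside State instead of being a plain function.
inlineToText :: Inline -> State WriterState String
inlineToText (Str s) = return s
inlineToText (Note _) = do
  st <- get
  let n = nextNote st
  put st { nextNote = n + 1 }
  return ("[" ++ show n ++ "]")

-- Still list processing at heart: mapM walks the list while threading the state.
writeText :: [Inline] -> String
writeText xs = concat (evalState (mapM inlineToText xs) (WriterState 1))
```

In GHCi, `writeText [Str "a", Note "x", Str "b", Note "y"]` replaces each note with its running number. The list structure stays the same; the state only carries the counter from one element to the next.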

Pandoc: the deep dive (PDXFunc presentation)
