SlideShare une entreprise Scribd logo
1  sur  56
Pterodactyl
https://github.com/osfameron/pterodactyl/ (very early stage...)
References
http://www.worldofspectrum.org/ZXBasicManual/
https://ecc-comp.blogspot.co.uk/2015/05/a-brief-glance-at-how-5-text-editors.html
https://ecc-comp.blogspot.co.uk/2016/11/a-glance-into-web-tech-based-text.html
https://github.com/neovim/neovim/wiki/Architectural-musing-and-ideas
https://web.archive.org/web/20150409021025/http://curiousreef.com/class/learning-vim-from-the-inside/
http://www.finseth.com/craft/index.html
http://www.cs.unm.edu/~crowley/papers/sds/sds.html
https://github.com/yi-editor/yi-rope
https://en.wikipedia.org/wiki/Enfilade_(Xanadu)
http://www.catch22.net/tuts/piece-chains
https://en.wikipedia.org/wiki/Piece_table
http://java-sl.com/editor_kit_tutorial_document.html
http://history.dcs.ed.ac.uk/archive/apps/Whitfield-Thesis/thesis.html
Image Credits
Pterodactyl derived from CC-BY-SA 4.0 Matthew P. Martyniuk:
https://en.wikipedia.org/wiki/Pterodactylus#/media/File:Pterodactylus_holotype_fly_mmartyniuk.png

Contenu connexe

Tendances

A project report on chat application
A project report on chat applicationA project report on chat application
A project report on chat application
Kumar Gaurav
 
History of Computer Programming Languages.pptx
History of Computer Programming Languages.pptxHistory of Computer Programming Languages.pptx
History of Computer Programming Languages.pptx
AliAbbas906043
 
Generation of computer languages
Generation of computer languagesGeneration of computer languages
Generation of computer languages
kitturashmikittu
 

Tendances (20)

Introduction to Android ppt
Introduction to Android pptIntroduction to Android ppt
Introduction to Android ppt
 
Linux.ppt
Linux.ppt Linux.ppt
Linux.ppt
 
PROJECT REPORT
PROJECT REPORTPROJECT REPORT
PROJECT REPORT
 
Introduction to python for Beginners
Introduction to python for Beginners Introduction to python for Beginners
Introduction to python for Beginners
 
A project report on chat application
A project report on chat applicationA project report on chat application
A project report on chat application
 
IRJET- Prediction of Fine-Grained Air Quality for Pollution Control
IRJET- Prediction of Fine-Grained Air Quality for Pollution ControlIRJET- Prediction of Fine-Grained Air Quality for Pollution Control
IRJET- Prediction of Fine-Grained Air Quality for Pollution Control
 
Simple calulator using GUI tkinter.pptx
Simple calulator using GUI tkinter.pptxSimple calulator using GUI tkinter.pptx
Simple calulator using GUI tkinter.pptx
 
History of Computer Programming Languages.pptx
History of Computer Programming Languages.pptxHistory of Computer Programming Languages.pptx
History of Computer Programming Languages.pptx
 
unit 1 ppt.pptx
unit 1 ppt.pptxunit 1 ppt.pptx
unit 1 ppt.pptx
 
The Full Stack Web Development
The Full Stack Web DevelopmentThe Full Stack Web Development
The Full Stack Web Development
 
Project proposal android operating system
Project proposal android operating systemProject proposal android operating system
Project proposal android operating system
 
CSE Final Year Project Presentation on Android Application
CSE Final Year Project Presentation on Android ApplicationCSE Final Year Project Presentation on Android Application
CSE Final Year Project Presentation on Android Application
 
Language processor
Language processorLanguage processor
Language processor
 
Final_report
Final_reportFinal_report
Final_report
 
Basics of python
Basics of pythonBasics of python
Basics of python
 
Android Programming Seminar
Android Programming SeminarAndroid Programming Seminar
Android Programming Seminar
 
Generation of computer languages
Generation of computer languagesGeneration of computer languages
Generation of computer languages
 
Python introduction
Python introductionPython introduction
Python introduction
 
Staffing level estimation
Staffing level estimation Staffing level estimation
Staffing level estimation
 
android app development training report
android app development training reportandroid app development training report
android app development training report
 

Similaire à Data Structures for Text Editors

JavaScript as a First Class Language
JavaScript as a First Class LanguageJavaScript as a First Class Language
JavaScript as a First Class Language
fabiopereirame
 
Firefox 4 & web
Firefox 4 & webFirefox 4 & web
Firefox 4 & web
dynamis
 

Similaire à Data Structures for Text Editors (20)

WebDev References
WebDev ReferencesWebDev References
WebDev References
 
儲かるドキュメント
儲かるドキュメント儲かるドキュメント
儲かるドキュメント
 
JavaScript as a First Class Language
JavaScript as a First Class LanguageJavaScript as a First Class Language
JavaScript as a First Class Language
 
With a little help from my friends: Handy MongoDB Tools
With a little help from my friends: Handy MongoDB ToolsWith a little help from my friends: Handy MongoDB Tools
With a little help from my friends: Handy MongoDB Tools
 
Riding on rails3 with full stack of gems
Riding on rails3 with full stack of gemsRiding on rails3 with full stack of gems
Riding on rails3 with full stack of gems
 
Ferramentas de apoio ao desenvolvedor
Ferramentas de apoio ao desenvolvedorFerramentas de apoio ao desenvolvedor
Ferramentas de apoio ao desenvolvedor
 
IAC_PuppetCampLondon_2016
IAC_PuppetCampLondon_2016IAC_PuppetCampLondon_2016
IAC_PuppetCampLondon_2016
 
Erlang (GeekTalks)
Erlang (GeekTalks)Erlang (GeekTalks)
Erlang (GeekTalks)
 
Welcome to Erlang
Welcome to ErlangWelcome to Erlang
Welcome to Erlang
 
API standardization work in W3C groups
API standardization work in W3C groupsAPI standardization work in W3C groups
API standardization work in W3C groups
 
Common asp.net design patterns aspconf2012
Common asp.net design patterns aspconf2012Common asp.net design patterns aspconf2012
Common asp.net design patterns aspconf2012
 
Random Thoughts on Paper Implementations [KAIST 2018]
Random Thoughts on Paper Implementations [KAIST 2018]Random Thoughts on Paper Implementations [KAIST 2018]
Random Thoughts on Paper Implementations [KAIST 2018]
 
Ebook List Of Ebook Sites
Ebook   List Of Ebook SitesEbook   List Of Ebook Sites
Ebook List Of Ebook Sites
 
Firefox 4 & web
Firefox 4 & webFirefox 4 & web
Firefox 4 & web
 
Firefox 4 and Web
Firefox 4 and WebFirefox 4 and Web
Firefox 4 and Web
 
Web and browser evolution
Web and browser evolutionWeb and browser evolution
Web and browser evolution
 
DevoxxUK-2023 - Java Divergence.pdf
DevoxxUK-2023 - Java Divergence.pdfDevoxxUK-2023 - Java Divergence.pdf
DevoxxUK-2023 - Java Divergence.pdf
 
DevoxxUK-2023 - Java Divergence.pdf
DevoxxUK-2023 - Java Divergence.pdfDevoxxUK-2023 - Java Divergence.pdf
DevoxxUK-2023 - Java Divergence.pdf
 
Konvensyen Webmaster Negeri Sabah 2013
Konvensyen Webmaster Negeri Sabah 2013Konvensyen Webmaster Negeri Sabah 2013
Konvensyen Webmaster Negeri Sabah 2013
 
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
"Will Git Be Around Forever? A List of Possible Successors" at UtrechtJUG
 

Plus de osfameron

Plus de osfameron (15)

Writing a Tile-Matching Game - FP Style
Writing a Tile-Matching Game - FP StyleWriting a Tile-Matching Game - FP Style
Writing a Tile-Matching Game - FP Style
 
Is Haskell an acceptable Perl?
Is Haskell an acceptable Perl?Is Haskell an acceptable Perl?
Is Haskell an acceptable Perl?
 
Rewriting the Apocalypse
Rewriting the ApocalypseRewriting the Apocalypse
Rewriting the Apocalypse
 
Global Civic Hacking 101 (lightning talk)
Global Civic Hacking 101 (lightning talk)Global Civic Hacking 101 (lightning talk)
Global Civic Hacking 101 (lightning talk)
 
Functional pe(a)rls: Huey's zipper
Functional pe(a)rls: Huey's zipperFunctional pe(a)rls: Huey's zipper
Functional pe(a)rls: Huey's zipper
 
Adventures in civic hacking
Adventures in civic hackingAdventures in civic hacking
Adventures in civic hacking
 
Functional Pe(a)rls - the Purely Functional Datastructures edition
Functional Pe(a)rls - the Purely Functional Datastructures editionFunctional Pe(a)rls - the Purely Functional Datastructures edition
Functional Pe(a)rls - the Purely Functional Datastructures edition
 
Haskell in the Real World
Haskell in the Real WorldHaskell in the Real World
Haskell in the Real World
 
Oyster: an incubator for perls in the cloud
Oyster: an incubator for perls in the cloudOyster: an incubator for perls in the cloud
Oyster: an incubator for perls in the cloud
 
Semantic Pipes (London Perl Workshop 2009)
Semantic Pipes (London Perl Workshop 2009)Semantic Pipes (London Perl Workshop 2009)
Semantic Pipes (London Perl Workshop 2009)
 
Functional Pearls 4 (YAPC::EU::2009 remix)
Functional Pearls 4 (YAPC::EU::2009 remix)Functional Pearls 4 (YAPC::EU::2009 remix)
Functional Pearls 4 (YAPC::EU::2009 remix)
 
Functional Pe(a)rls version 2
Functional Pe(a)rls version 2Functional Pe(a)rls version 2
Functional Pe(a)rls version 2
 
Functional Pe(a)rls
Functional Pe(a)rlsFunctional Pe(a)rls
Functional Pe(a)rls
 
Readable Perl
Readable PerlReadable Perl
Readable Perl
 
Bigbadwolf
BigbadwolfBigbadwolf
Bigbadwolf
 

Dernier

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
chiefasafspells
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
masabamasaba
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
masabamasaba
 

Dernier (20)

Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?WSO2CON 2024 - Does Open Source Still Matter?
WSO2CON 2024 - Does Open Source Still Matter?
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
%in Rustenburg+277-882-255-28 abortion pills for sale in Rustenburg
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
%+27788225528 love spells in new york Psychic Readings, Attraction spells,Bri...
 

Data Structures for Text Editors

  • 1.
  • 2.
  • 3.
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51.
  • 52.
  • 53.
  • 54.
  • 55.
  • 56. Pterodactyl https://github.com/osfameron/pterodactyl/ (very early stage...) References http://www.worldofspectrum.org/ZXBasicManual/ https://ecc-comp.blogspot.co.uk/2015/05/a-brief-glance-at-how-5-text-editors.html https://ecc-comp.blogspot.co.uk/2016/11/a-glance-into-web-tech-based-text.html https://github.com/neovim/neovim/wiki/Architectural-musing-and-ideas https://web.archive.org/web/20150409021025/http://curiousreef.com/class/learning-vim-from-the-inside/ http://www.finseth.com/craft/index.html http://www.cs.unm.edu/~crowley/papers/sds/sds.html https://github.com/yi-editor/yi-rope https://en.wikipedia.org/wiki/Enfilade_(Xanadu) http://www.catch22.net/tuts/piece-chains https://en.wikipedia.org/wiki/Piece_table http://java-sl.com/editor_kit_tutorial_document.html http://history.dcs.ed.ac.uk/archive/apps/Whitfield-Thesis/thesis.html Image Credits Pterodactyl derived from CC-BY-SA 4.0 Matthew P. Martyniuk: https://en.wikipedia.org/wiki/Pterodactylus#/media/File:Pterodactylus_holotype_fly_mmartyniuk.png

Notes de l'éditeur

  1. Data structures for Text Editors: Hakim Cassimally @osfameron Lambda Lounge Manchester, Mon 16th Jan 2017 I don't normally draw slides, but this time I started to get grumpy about drawing boxes in Keynote and decided it would actually be easier to use pen and paper. I'm not sure if the result looks good, but it kept me amused and was mostly fun (taking photos of the resulting thing was a bit of a faff.) LambdaLounge is a meetup for functional programmers, and this talk has a focus on immutable data-structures. There are few code examples (it turns out that one (dis?)advantage of drawing slides is that you don't really feel like writing out large chunks of source code) but what little there is is in Clojure.
  2. My First Text Editor. (picture of Spectrum 48K) When I was 10 or so, my brother and I had a Spectrum 48K. My dad had an Amstrad PCW which he used for word-processing. I was obsessed for a while with the idea of writing what I called a "word-processor" on the Spectrum.
  3. Spectrum BASIC - a primer. To write the text editor, I used commands like these: DIM s$ (to preserve memory for the string) CLS (clears the screen) PRINT (print out the string) INKEY$ (get the input from keyboard) s$[1 to 5] (slice into the string to e.g. edit it) GOTO (repeat the main loop) It was very slow, and in fact, looking at photos of the computer, it seems the Spectrum doesn't have arrow keys, so I expect it wasn't very good at editing. I wrote a version later for the BBC Micro that did step through the string, left-and-right (up and down was too hard. We'll come back to this).
  4. Components: There doesn't seem to be a lot of good literature on text editors, but ones you keep coming across are the old theses by e.g. Finseth and Crowley. Finseth is rather old-fashioned, and has some odd hobby-horses about the specific implementation of Emacs, but is worth reading. He also looks at the various components that make up an editor. For example: the Input command language. Nowadays we expect shortcuts like Cmd-F to find a string, but of course earlier editors have interesting shortcut keys like Emacs (C-x C-c) to quit, and Vim's powerful but gnomic shortcuts like "ggVGc" which replaces all the text in the buffer ("gg" to go to the top, "V" to enter visual mode, "G" to go to the end, and "c" to replace the selection and start insert mode.) Then there's the intermediate layer of code that maps input to editor processes. This could be as simple and basic as Vim's "insert" mode, or it could be an editor mode for a particular language etc. Editor buffers: what Finseth calls "the internal sub-editor" are essentially the topic of this talk - the data-structure used to store text, and the API used to navigate and modify it. Finally, Output - whether to GUI, Console (perhaps using a curses based interface) or to HTML for JS-based editors, which is a whole topic in and of itself. I won't go into that, but it's interesting that I'm considering using a GUI widget like the JEditorPane for display only. Yet this widget encapsulates a text editor itself. There's an interesting opportunity and perhaps risk in terms of sharing the concerns here.
  5. Strings! "APRIL is the cruellest month, breeding\nLilacs out of the dead land, mixing\nMemory and desire..." I'm using the opening of the Waste Land by T. S. Eliot as the example text here. Though I used strings in my first editor on the Spectrum 48K, I suspect it might not be a great data-structure, for reasons of efficiency at least. I wondered if there were in fact *any* text editors that use this approach...
  6. pyvim! "We just have a Python string object... seems to work really well up to 100,00 lines" - Jonathan S. The lead on the pyvim project claims they simply use a string as the data-structure. It's interesting that of course processing power has advanced so much now that the optimal data-structure may not even be a real issue. Certainly, 100K lines is plenty for most purposes. But of course, if you're interested in functional programming, and therefore immutable data-structures, classic strings (essentially contiguous arrays of characters) have some features which make them suboptimal.
  7. (Mutable) Strings Here we have a mutable string. I wanted to delete the word "APRIL " at the beginning of the string. This should be highly efficient, as it can be done by simply updating the pointer of the string to be offset - e.g. point at "is the cruellest month".
  8. (Mutable) Strings (continued) But if we want to delete the word "cruellest", we have to copy all the characters to the right into their new position. Although this feels perhaps inelegant, it's such a common operation that it's generally optimized and is easily fast enough. From a functional programming perspective, it's unfortunate that you don't retain the previous version (e.g. for undo stacks etc.) but the memory usage is efficient.
  9. (Immutable) strings: The first case (removing the first part of the word) may well be optimized for immutable strings too.
  10. (Immutable) strings - continued. The deletion case is essentially handled by making a copy of the whole string, except for the changed piece. If you wanted to keep a log of all previous states (for undo etc.) then this could use a lot of memory.
  11. Gap Buffer: The gap buffer or "split vector" is conceptually just a string with a gap in it. The key insight is that when you're editing text, you tend to make lots of changes in a given area. So by filling up the gap, you don't need to do the shunting around of bytes in memory after every change. Obviously once you start editing in a different area, the buffer will move the gap to the new area, moving just the bytes between the old and new gaps. It's really very elegant. Emacs uses this approach, as does Scintilla. It's apparently blazingly fast for most purposes. Given how historically important and powerful the data-structure is, one measly slide doesn't really do the topic justice. But just like strings, it's not that interesting with an immutable data-structures hat on.
  12. Array of lines: ["APRIL is the cruellest month, breeding", "Lilacs out of the dead land, mixing", "Memory and desire, stirring", "Dull roots with spring rain."] Finseth and Crowley discuss this and suggest that it's inconvenient. Finseth writes "Implementations that use this model usually make a very sharp distinction between editing within a line and editing lines. For example, lines may have a maximum length or cut and paste operations may only operate on line boundaries." This may explain some of the oddities of e.g. the Vim interface. (And in an odd way, some of its power - the line-based style turns out to be mostly very powerful for programming).
  13. Editors using Arrays of Lines: Vi/Ex (1976), Vim, Neovim Gnu Moe (2005), Atom (!!!) Microsoft Visual Studio Code web? Cloud 9 Ace Eclipse Orion... Given how fiddly the literature suggested this would be to implement, I'm surprised at how many editors use this approach. It may be that it helps you get started very quickly, and the writers are happy to then iron out all the oddities with code and abstractions. And it's interesting that modern editors like Atom use it. Again, it doesn't seem that interesting with an FP hat on - and, though I love Vim, I find that the line-based approach can be frustrating for domains that aren't line-based (long-form prose, Lisp code, ...) so I won't go into it further.
  14. Array of Characters [ \A, \P, \R, \I, \L, \space, \i, \s, \space, ...] The array of characters has some of the disadvantages of the array of lines. But it may be appealing to think of your text as being a sequence of characters - even if the text is actually *stored* in a difference structure, this is a nice, consistent way of actually interacting with e.g. a cursor position in a text buffer.
  15. ... with metadata When you think of text as an array (or stream) of characters, you could then think of each character position as containing not just the character, but also the metadata, for example: {:char \I :colour :red :marks [:a] }
  16. Linked Lists: So arrays aren't really optimal as functional data-structures. Lists are often useful: So the string "APRIL" might be represented as: A -> P -> R -> I -> L -> ()
  17. Going with the flow: If you want to make changes to a list that go along the flow (the direction of the arrows in the diagram) then everything is hunky dory: e.g. changing APRIL to BOVRIL ends up with 2 lists, both of which share the "RIL" part.
  18. Going against the flow: But changing "APRIL" to "APRON" doesn't work as well...
  19. In a mutable structure, we could take the link from "R" and point it to the new list "ON", but that's a mutation... so we simply can't do that with a purely functional data structure.
  20. ... so we have to copy the head of the list too, ending up with 2 lists representing "APRON" and "APRIL".
  21. Doubly Linked Lists. When we write text, we tend not to be able to start at the beginning, type away, get to the end, and then finish. So we have to have a way to go backwards. In mutable structures, you might reach for a doubly linked list, where each node has a pointer to both the next node *and* the previous one. So the "A" in APRIL points to the "P", while "P" points both forward to "R" and backwards to "A". The problem now is how this works with immutable data structures. As we saw, we can only take advantage of structural sharing by going *with* the flow. But because doubly linked lists flow in *both* ways, we're always going to be going against one of the flows. So any change will always result in having to copy the whole list.
  22. Zippers! Luckily there is a nice way around the problem of doubly-linked lists in functional data-structures. Zippers are a key concept. You simply keep a list of the previous nodes as you traverse through the list. So when I start on the letter "A" there's nothing in my :prev list. When I navigate to "P", I take a pointer to that letter.
  23. Then when I navigate to "R" I push the letter "P" onto the :prev list (now consisting of [\P, \A]. And so on.
  24. You can describe this as "picking up" the element you're currently focused on, and that's how I've drawn it in these slides. When you want to traverse back, you simply pick up the next element off the *previous* list, and make that the new head. (You can see this in action by pressing PgUp and PgDown a few times in these slides :-) Zippers are fascinating. There's a rich body of very mathsy computer science papers that make a charming whooshing noise as they fly straight over my head. Luckily, they're really easy to use in practice. Clojure has a standard `clojure.zip` library, a perhaps improved `hara.zip`, or it's very easy to write your own, which is what I've done for now.
  25. Trees. Trees are perhaps the immutable data structure par excellence. Here we have a tree with some leaf nodes ["In ", "Xanadu ", "did", " Kublai ", "Khan"].
  26. Trees. If we mark the lengths of each of the leaves, we can then sum the lengths at each node up to the top. That's the length of the entire buffer. You can also see that if you wanted to access part of the text before character 10, you'd want to travel down the left-hand branch. As long as the tree is well-balanced, you can access any part of it efficiently - O(log n). Red-Black trees are a popular way of implementing balanced binary trees. I find them quite complicated and usually get stuck when it comes to handling deletions. But I've previously implemented the "AA tree" variant, which is quite a bit simpler, though slightly less efficient, in Perl.
  27. The structure I showed is very similar to the "Enfilade" used in the Xanadu project. This is from a famous, utopian, hippy project from 1970s California to write a text editor, micropayments, hypertext engine, and world-wide information store that has fascinated me for decades (though the code that was finally released to the public a few years ago was, perhaps unsurprisingly, disappointing given the inflated expectations.) The rope structure is very closely related. Yi's rope structure is a variant called the finger tree, which has some optimizations for adding text to the beginning and end of the string. I believe Adobe's web-based editor Brackets/CodeMirror uses a binary tree of lines, and JEditorPane uses a hierarchical tree of paragraphs/elements (this is perhaps more because it's really an HTML display widget rather than a text editor store.) By the way, if you wanted to write a text editor and don't care about writing your own data structure, just use a Rope already. It's a really good data structure, and someone has probably written a good library for them in your programming language. Essentially you can use it like a string, but it will be fast and memory efficient.
  28. Piece tables are a great semi-functional data-structure. You usually start off with a single buffer which might be the contents of a file. (So this might be mmap'd into memory.) This table is immutable. Here, we have a buffer containing "In Xanadu did Genghis Khan." and a "piece" structure pointing to that buffer, with the indices 0-27 (e.g. the whole contents).
  29. ... but of course I've confused my Mongol emperors. So if we wanted to replace Genghis with Kublai, we could simply generate a new set of pieces: Base buffer 0-13 =>"In Xanadu did " Append buffer 0-5 => "Kublai" Base buffer 22-26 => " Khan" So we've added the text "Kublai" to the append buffer. This is often represented in a semi-functional way - e.g. the text is mutated by being appended to, but is never otherwise changed.
  30. I really like Piece Tables, but oddly, they don't seem to be widely used. Microsoft Word 1.1a apparently does (perhaps later versions do too, but aren't open source) as does Abiword. Also the "highly influential" Bravo editor (about which I know nothing else). I'd guess that Piece Tables seem a bit rarefied and functional for imperative, mutable programmers, who would reach for a gap buffer or an array of lines instead. And functional programmers will go straight for trees.
  31. So, we started using Clojure at work for a few small projects, and now for large chunks of our latest microservice. Over Christmas I started to play with writing some text editor structures in Clojure. Now I just needed to make the important decision... the name! From PTE (Programmer's Text Editor) I came up with Pterodactyl, a text editor for winged dinosaurs. I'm amused by the idea of people typing "pterodactyl foo.txt" at the command line. The picture is copied from M Martyniuk's CC-licensed image on Wikipedia, thanks! (full credit in references)
  32. Pieces of immutable strings. So, ropes are great. But this is largely a learning and discovery exercise, so I want to implement my own data-structure. Creating a balanced binary tree, slightly hurts my head as I've mentioned, which is fine, but I don't think it's the most interesting part of this exercise - for me. Also, I really want to use zippers, and I'm unsure of the details of how they work against a self-balancing tree. So, for now at least, I'm going with a piece table. I don't think this is a terrible idea - piece tables are a great structure! Also, you can think of a zipper-on-a-rope as being largely an optimization detail over a zipper-on-a-piece-table (e.g. once I have a nice API for interacting with the zipper, that won't change at all if I change the backend data-structure. I'll just be able to take advantage of faster random access within the buffer.) Classic piece tables point to either the immutable buffer or the append buffer. As I'm using Clojure, a JVM language with immutable strings, I'll avoid that layer of indirection by simply using strings. Here I've created 3 pieces as follows: ["In Xanadu di Kublai Khan" 0-24, "a stately pleasure dome decree" 0-30, "where Ralph the naughty squirrel ran" 0-35 ] Oh noes! I think I got some words wrong though...
  33. Here, I show how to correct the words to "Alph" and "sacred river". Notice how the text "where Ralph the naughty squirrel ran" is now repeated 3 times though. This is memory efficient - they are all pointing to exactly the same string item.
  34. Zipper: There's a core library "clojure.zip", but I was unconvinced with it, and created my own. This zips over the pieces (in this diagram, each word is represented as a single piece) but also maintains some other state: - in the top left, you have the offset from the beginning of the buffer. e.g. the piece "Xanadu " starts at offset 3. - the zipper is pointing to pos 3 within its piece e.g. within "Xan><adu". You can add the piece offset and zipper offset together to get 6, the current offset of the cursor from the beginning of the buffer. As you zip left and right, the offsets are updated, and if we make any changes, we'll add and subtract the offsets, accumulating their values as we traverse to the right.
  35. Zipper of Zippers: The previous slide was my first, naive version. But we can do better than that! We can think of a zipper of zippers - here I'm in the Xanadu piece as before, but now I'm in a zipper containing just the characters [\X \a \n \a \d \u \space]. As you can see, every one of those characters has its own offset, so I have to add and subtract those as I traverse left and right.
  36. Accumulator: but it's not just the offset from the beginning of the buffer that's interesting! I could also keep the number of rows, or the current column position. I might want to track the number of words that I've counted so far etc.
  37. ... a really exciting thing to accumulate might be the lexer state of the syntax highlighter.
  38. ... and current nesting level: e.g. "( ( hello ) >< ..." is at nesting level 1.
  39. ... I could also keep a note of any marks I've seen. So if at position 7 I see the current character has {:marks [:a]}, I could update {:marks {:a 7}} in the accumulator.
  40. ... and as we have the syntax state, why not keep a note of indexes of symbols too? e.g. where variables and functions were declared.
  41. Finite State Automata You can think of these as finite state automata, that nibble away at a character, changing state as appropriate. For example, to update :row and :col we might check whether the current character is a newline or not, and whether the *previous* character was a newline (e.g. which would be stored in the accumulated state). I've been playing with ztellman/automat, which is really promising, if a bit confusing sometimes.
  42. Reductions. So... did you see me mention "accumulator" several times? I've been adding to the accumulator as I travel to the right, and subtracting as I travel left... and this is starting to remind me of "reduce" or "foldl". In this slide I show a vector [1 2 3 10 5 4 8] which could be sum'd as 33. Of course a sum is just what happens when you reduce or "fold" the operator (+) over the array.
  43. But there's also the concept of intermediate results: (reductions) in Clojure, or `scanl` in Haskell. Here you can see the intermediate sum at every step. There is nothing to stop us from zipping together those 2 lists, and then traversing over that new list with our zipper!
  44. So to be clear, each of these accumulators (here, the offset, like 0 or 3, or 10) is the reduction currently active at a given point. The FSA runs for every character, but the reductions are only stored per-piece (and per-character for *only* the piece that's currently being zipped over) so this shouldn't be too memory ineffiicent, when you take advantage of structural sharing. Note that you could asynchronously send a zipper to the very end of the buffer to update the *complete* reduction from time to time. This will give you the character count, word count, number of lines, and index of marks and functions, which you can then keep for future use.
  45. Moving up and down: So let's see how interacting with the zipper might look in practice! If you're at row 1/col 4, and you want to move up (presumably to row 0/col 4, assuming that exists) how would you do it?
  46. As we know we're in column 4, we know that we can move left 4 character to get to column 0.
  47. From there, we can move left 1, which we know will take us to the end of the previous line. At this point, we now see that we're in column 7. If we were in a column less than 4, we'd have to stop here! But as it is, we can now move left 3 to column 4.
  48. (Here I moved to the beginning of the buffer instead to show a bit more context.) Now... how do we move *down* instead? Again, we'll start from row 1, col 4. Previously, we simply moved to the beginning of the line, because we knew what column we're in. But we don't know what column the end of the line is in...
  49. So... we're going to have to search for a newline! We move right one character. It's an "n", no luck.
  50. We move right again... and we find a \newline! (I've represented that as "\n" in the diagram) We know we're at the end of this row.
  51. So we can now move right by one character, to get to col 0 of the next row.
  52. Now, we can move right 4 characters to get us to column 4! (In this case, there were indeed 4 or more characters, but if there hadn't been, we'd have had to backtrack, using the approach used in the previous slides) Note how moving up and moving down are not duals of each other in terms of implementation! (Also, note how if we were just using the "Array of lines" approach, the task of moving up and down by lines would have been trivial. The choice of data-structure will always make some things easy and some things harder. I'm hoping that my piece-table approach will make the right things easy...) I'll leave figuring out what follows "The cat sat on the m" as an exercise to the reader.
  53. So that's about it... I'm currently working for the BBC in MediaCityUK, Salford in the Programme Metadata team. We're hiring!
  54. Teams around us use all kinds of languages, even mutable, non-functional ones*... (* Although, have you read Higher Order Perl?)
  55. ...but also functional ones like Clojure, XQuery, and Scala. Our team is particularly interested in recruiting Clojure devs at the moment, so please do give me a shout if you'd like to know more.
  56. The (very early stage) repo for Pterodactyl is at https://github.com/osfameron/pterodactyl/