Pharo DataFrame: Past, Present, and Future

Larisa Safina, Oleksandr Zaitsev, Cyril Ferlicot-Delbecque,
Papa Ibrahima Sow
International Workshop on Smalltalk Technologies 2023, Lyon, France
Agenda
What data frames are and what they are good for
Evolution of Data Frames in Pharo
Data Frames outside of Pharo
Data Frames
Tabular structure
Columns with named headers
Column and row indices
Mixed data types in rows
Primitive data types
*The table is courtesy of Zaitsev et al. (ESUG22)
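The properties listed on this slide can be illustrated with a very small sketch in plain Python. This is not the Pharo DataFrame implementation or its API; the class `TinyFrame`, its methods, and the sample values are all hypothetical and exist only to show a table with named columns, an index of row labels, and mixed value types within a row.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TinyFrame:
    """Hypothetical minimal data frame: a table with named columns and named rows."""

    column_names: list[str]
    row_names: list[Any]
    rows: list[list[Any]]  # one inner list per row; values may have mixed types

    def row(self, name: Any) -> dict[str, Any]:
        """Look up one row by its index label; return it as column name -> value."""
        position = self.row_names.index(name)
        return dict(zip(self.column_names, self.rows[position]))

    def column(self, name: str) -> list[Any]:
        """Look up one column by its header; return its values in row order."""
        position = self.column_names.index(name)
        return [r[position] for r in self.rows]


# Made-up sample data, for illustration only.
stock = TinyFrame(
    column_names=["name", "quantity", "in_stock"],
    row_names=["a", "b", "c"],
    rows=[
        ["apple", 3, True],
        ["pear", 1.5, True],
        ["plum", 0, False],
    ],
)

quantities = stock.column("quantity")  # access by column header
second_row = stock.row("b")            # access by row label
```

Storing the table row by row keeps the sketch short. Whether a real library stores rows or columns internally is a design choice that affects performance, and nothing here should be read as a statement about how Pharo DataFrame does it.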
Data Frames API
Data Import/Export
Grouping and Aggregation
Missing Data Handling
Statistical Operations
Time Series Analysis
Visualization
*The table is courtesy of Zaitsev et al. (ESUG22)
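Three of the API areas named on this slide (data import, missing data handling, and grouping with aggregation) can be sketched with the Python standard library alone. This is an illustration of the operations, not the Pharo DataFrame API; the function names `read_rows`, `drop_missing`, and `group_mean`, and the CSV content, are assumptions made up for this example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up CSV input for illustration; the empty field is a missing value.
RAW_CSV = """group,value
a,1.0
a,3.0
b,
b,4.0
"""


def read_rows(text):
    """Data import: parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_missing(rows, column):
    """Missing data handling: keep only rows where `column` is non-empty."""
    return [row for row in rows if row[column] != ""]


def group_mean(rows, key, column):
    """Grouping and aggregation: mean of `column` within each `key` group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(float(row[column]))
    return {name: mean(values) for name, values in groups.items()}


rows = read_rows(RAW_CSV)
clean = drop_missing(rows, "value")
means = group_mean(clean, "group", "value")
```

Dropping rows is only one way to treat missing values; replacing them with a default or an estimate is another, and the better choice depends on the analysis.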
Pharo DataFrame. Past
Started as a GSoC 2017 project (by Oleks)
Focus of two follow-up GSoC projects
Contributions from external developers
Pharo DataFrame. Past
Problems:
- No stable maintenance
- Lack of functionality
- Low performance
- Incomplete consistency with Pharo collections
- Lack of detailed documentation
Pharo DataFrame. Present
Stable project with a permanent developer (Cyril)
Code optimization
- code quality
- speed and volume improvements
Adding new functionality
Awesome Data Frames
Kyle Mitchell (jcmkk3) and Uwe L. Korn (xhochy)
https://github.com/jcmkk3/awesome-dataframes
Pharo DataFrame vs Pandas
Pharo DataFrame. Future
✨ Better performance ✨
Functionality Enhancements
Better synchronisation with PolyMath and pharo-ai
Big Data Support
Evaluation ❤
Do you want to contribute?
https://github.com/PolyMathOrg/DataFrame
Student project → Mature project with engineers
Better performance
Functionality Enhancements
Evaluation