The pandas Library
Haim Michael
September 8th, 2020
All logos, trademarks and brand names used in this presentation belong
to their respective owners.
lifemichael
https://youtu.be/Go_6xXYEtkw
© 2008 Haim Michael 20150729
What is Pandas?
• Pandas is a fast, powerful, flexible and easy to use open
source data analysis and manipulation tool, built on top of
the Python programming language. (pandas.pydata.org)
Installing Pandas
• There are several ways to install Pandas. The
simplest is to use the pip utility.
pip install pandas
Checking Pandas Version
• You can check the version of the Pandas library you
already have installed by reading the __version__ attribute of the pandas module.
• The expected output is the version number of the installed release, as a string.
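The code for this slide did not survive in the transcript. A minimal sketch of one common way to do the check, assuming the usual pd alias, could look like this:

```python
import pandas as pd

# __version__ is a plain string; its exact value depends on the
# release that is installed on your machine.
version = pd.__version__
print(version)
```

The printed value differs from one installation to another, so no specific output is listed here.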
Importing Pandas
• In order to use Pandas we should first import it. It is
common practice to import it using the alias pd.
import pandas as pd
The DataFrame Class
• Instantiating the DataFrame class gives us an
object that represents a table.
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's values.
df = pd.DataFrame({
    "country": ["israel", "france", "germany"],
    "currency": ["ils", "euro", "euro"],
    "capital": ["jerusalem", "paris", "berlin"],
})
print(df)
The Series Class
• Each column of a DataFrame object is
represented by a Series object. A Series can also be created on its own.
import pandas as pd
marks = pd.Series([88,90,72,64], name="Mark")
print(marks)
The describe() Function
• We can invoke this method on both a Series object and a
DataFrame object. For numeric data it returns descriptive
statistics: count, mean, standard deviation, minimum, quartiles and maximum.
import pandas as pd
marks = pd.Series([88,90,72,64], name="Mark")
print(marks.describe())
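The slide shows describe() on a Series only. As a rough sketch of the DataFrame case, the following uses made-up sample data (the column names and values are assumptions, not taken from the slides):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["moshe", "daniel", "tal"],
    "average": [85, 90, 64],
    "absences": [2, 0, 5],
})

# With mixed column types, describe() summarizes only the numeric
# columns by default, so "name" does not appear in the result.
summary = df.describe()
print(summary)
```

The result is itself a DataFrame, with one column per numeric column of the original and one row per statistic.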
Reading & Writing Data
• Pandas can read data from files in various formats (CSV, Excel
and others) directly into a DataFrame object, and write the
data we already have in a DataFrame object directly to
files in those formats.
Reading & Writing Data
• The to_excel, to_csv, etc. methods are invoked
on a DataFrame object. They write
the data we already have organized in a DataFrame object
to a new file of a specific format.
• The read_excel, read_csv, etc. functions are
defined at the top level of the pandas module and return a DataFrame object.
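To make the difference between the two families concrete, here is a small round-trip sketch. It writes to a temporary directory, which is an assumption made so the example leaves no files behind. The index=False argument is also a choice of this sketch.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({
    "country": ["israel", "france", "germany"],
    "currency": ["ils", "euro", "euro"],
})

# The temporary directory is removed automatically at the end of the block.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "countries.csv")
    df.to_csv(path, index=False)  # method invoked on the DataFrame object
    loaded = pd.read_csv(path)    # function defined in the pandas module

print(loaded)
```

Passing index=False keeps the row labels out of the file, so the data read back has the same columns as the original.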
Writing to Excel Sample
import pandas as pd

df = pd.DataFrame({
    "country": ["israel", "france", "germany"],
    "currency": ["ils", "euro", "euro"],
    "capital": ["jerusalem", "paris", "berlin"],
})
# Writing .xlsx files requires an Excel engine such as openpyxl to be installed.
df.to_excel("countries.xlsx")
Reading from CSV
import pandas as pd
ob = pd.read_csv("countries.csv")
print(ob)
Selecting Column
• Selecting a column is done by placing the name of the
column of interest inside square brackets.
The selected column is returned as an object of the type
Series.
Selecting Column
import pandas as pd

df = pd.DataFrame({
    "country": ["israel", "france", "germany"],
    "currency": ["ils", "euro", "euro"],
    "capital": ["jerusalem", "paris", "berlin"],
})
# A single column name inside the brackets returns a Series.
countries = df["country"]
print(countries)
print(type(countries))
Selecting Columns
• Selecting multiple columns is done by placing a list of column
names inside the selection brackets. The result is a DataFrame.
Selecting Columns
import pandas as pd

df = pd.DataFrame({
    "country": ["israel", "france", "germany"],
    "currency": ["ils", "euro", "euro"],
    "capital": ["jerusalem", "paris", "berlin"],
})
# The inner brackets are a Python list of column names.
ob = df[["country", "capital"]]
print(ob)
print(type(ob))
Selecting Rows
• Selecting rows based on a condition is done by placing the
condition inside the selection brackets.
Selecting Rows
import pandas as pd

df = pd.DataFrame({
    "first name": ["moshe", "daniel", "tal"],
    "last name": ["israeli", "cohen", "lahat"],
    "id": ["234234", "645645", "678678"],
    "average": [85, 90, 64],
})
# Keep only the rows whose average is greater than 80.
beststudents = df[df["average"] > 80]
print(beststudents)
print(type(beststudents))
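It can help to look at the condition on its own. The sketch below, which uses a shortened version of the same table and a variable name (mask) that is not in the slides, evaluates the comparison first and then uses the result for the selection:

```python
import pandas as pd

df = pd.DataFrame({
    "first name": ["moshe", "daniel", "tal"],
    "average": [85, 90, 64],
})

# The comparison is evaluated element by element and yields a boolean Series.
mask = df["average"] > 80
print(mask)

# Indexing with the boolean Series keeps only the rows where it is True.
beststudents = df[mask]
print(beststudents)
```

Writing the condition directly inside the brackets, as on the slide, is the same operation without the intermediate variable.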
Selecting Rows
• We can also select the rows in which a specific column holds one of
several specific values, by combining conditions with the | (or) operator.
Each condition must be wrapped in parentheses.
Selecting Rows
import pandas as pd

df = pd.DataFrame({
    "first name": ["moshe", "daniel", "tal", "jane"],
    "last name": ["israeli", "cohen", "lahat", "lala"],
    "id": ["234234", "645645", "678678", "234234"],
    "class": ["1st", "1st", "2nd", "3rd"],
})
# Keep the rows whose class is either "1st" or "3rd".
premiumpassengers = df[(df["class"] == "1st") | (df["class"] == "3rd")]
print(premiumpassengers)
print(type(premiumpassengers))
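When the list of accepted values grows, chaining == comparisons becomes long. One alternative, not shown in the slides, is the isin() method of Series. A minimal sketch on a shortened version of the same table:

```python
import pandas as pd

df = pd.DataFrame({
    "first name": ["moshe", "daniel", "tal", "jane"],
    "class": ["1st", "1st", "2nd", "3rd"],
})

# isin() is True for each row whose value appears in the given list.
selected = df[df["class"].isin(["1st", "3rd"])]

# The chained form from the slide, kept here only for comparison.
chained = df[(df["class"] == "1st") | (df["class"] == "3rd")]

print(selected)
```

Both expressions select the same rows here. Which one reads better is mostly a matter of how many values are being tested.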
Selecting Multiple Rows & Cols
• In order to select multiple rows and cols, a subset of our
data, we can use the iloc indexer, which selects by integer position.
Selecting Multiple Rows & Cols
import pandas as pd

df = pd.DataFrame({
    "first name": ["moshe", "daniel", "tal"],
    "last name": ["israeli", "cohen", "lahat"],
    "id": ["234234", "645645", "678678"],
    "class": ["1st", "1st", "2nd"],
})
# Rows at positions 0 and 1, columns at positions 0 and 1.
ob = df.iloc[0:2, 0:2]
print(ob)
print(type(ob))
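A few more iloc forms may be worth trying on the same table. The sketch below is an illustration only, and the variable names subset, first_row and picked are not from the slides:

```python
import pandas as pd

df = pd.DataFrame({
    "first name": ["moshe", "daniel", "tal"],
    "last name": ["israeli", "cohen", "lahat"],
    "id": ["234234", "645645", "678678"],
    "class": ["1st", "1st", "2nd"],
})

# Slices are half-open: 0:2 covers positions 0 and 1, but not 2.
subset = df.iloc[0:2, 0:2]
print(subset.shape)

# A single integer row position returns that row as a Series.
first_row = df.iloc[0]
print(type(first_row))

# Lists of positions pick rows and columns that are not adjacent.
picked = df.iloc[[0, 2], [0, 3]]
print(picked)
```

In every form the numbers are positions counted from zero, not row labels or column names.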
