Selenium & Scrapy
Web UI testing and Web Scraping
About me
Arcangelo Saracino
IT student at Bari University
2016-2018 Web developer at Aryma
2018 - Feb 2019 Web developer at Enterprise Digital Solution
saracinoarcangelo@gmail.com github.com/Arkango
Selenium
Selenium is a portable framework for testing web applications.
Selenium provides a playback (formerly also recording) tool for authoring
functional tests without the need to learn a test scripting language (Selenium IDE).
It also provides a test domain-specific language (Selenese) to write tests in a
number of popular programming languages, including C#, Groovy, Java, Perl,
PHP, Python, Ruby and Scala.
The tests can then run against most modern web browsers.
Selenium deploys on Windows, Linux, and macOS platforms.
It is open-source software, released under the Apache 2.0 license: web
developers can download and use it without charge.
Source: Wikipedia
Selenium Components
● Selenium IDE
● Selenium Client API
● Selenium WebDriver
● Selenium Remote Control
● Selenium Grid
Selenium IDE
Selenium IDE is a complete integrated development environment (IDE) for Selenium tests.
It is implemented as a Firefox Add-On and as a Chrome Extension.
It allows for recording, editing, and debugging of functional tests. It was previously known
as Selenium Recorder.
Selenium-IDE was originally created by Shinya Kasatani and donated to the Selenium
project in 2006.
Selenium IDE was little maintained for some time, and active maintenance
resumed in 2018.
Scripts can be recorded automatically and edited manually, with autocompletion
support and the ability to move commands around quickly. Scripts are recorded in
Selenese, a special test scripting language for Selenium. Selenese provides commands
for performing actions in a browser (click a link, select an option), and for retrieving data
from the resulting pages.
Selenium Client API
As an alternative to writing tests in Selenese, tests can
also be written in various programming languages. These
tests then communicate with Selenium by calling methods
in the Selenium Client API. Selenium currently provides
client APIs for Java, C#, Ruby, JavaScript, R and Python.
With Selenium 2, a new Client API was introduced (with
WebDriver as its central component). However, the old API
(using class Selenium) is still supported.
Selenium WebDriver
Selenium WebDriver is the successor to Selenium RC.
Selenium WebDriver accepts commands (sent in Selenese, or
via a Client API) and sends them to a browser.
This is implemented through a browser-specific browser driver,
which sends commands to a browser and retrieves results.
Most browser drivers actually launch and access a browser
application (such as Firefox, Chrome, Internet Explorer, Safari,
or Microsoft Edge); there is also an HtmlUnit browser driver,
which simulates a browser using the headless browser
HtmlUnit.
Hands on code
● An example …..
Scrapy
Scrapy (/ˈskreɪpi/ SKRAY-pee) is a free and open-source web-crawling
framework written in Python. Originally designed for web scraping, it
can also be used to extract data using APIs or as a general-purpose
web crawler. It is currently maintained by Scrapinghub Ltd., a
web-scraping development and services company.
Scrapy project architecture is built around "spiders", which are
self-contained crawlers that are given a set of instructions. Following the
spirit of other don't repeat yourself frameworks, such as Django, it
makes it easier to build and scale large crawling projects by allowing
developers to reuse their code. Scrapy also provides a web-crawling
shell, which can be used by developers to test their assumptions on a
site’s behavior.
Scrapy: Basic Concept
● Command line tools
Scrapy is controlled through the scrapy command-line tool, referred to here as the “Scrapy tool” to
differentiate it from the sub-commands, which we just call “commands” or “Scrapy commands”.
● Spiders
Spiders are classes which define how a certain site (or a group of sites) will be scraped,
including how to perform the crawl (i.e. follow links) and how to extract structured data from
their pages (i.e. scraping items). In other words, Spiders are the place where you define the
custom behaviour for crawling and parsing pages for a particular site (or, in some cases, a
group of sites).
● Selectors
Extract the data from web pages using XPath or CSS expressions.
● Scrapy Shell
Test your extraction code in an interactive environment.
Scrapy: Basic Concept 2
● Items
Define the data you want to scrape.
● Item Loaders
Populate your items with the extracted data.
● Item Pipeline
Post-process and store your scraped data.
● Feed Exports
Output your scraped data using different formats and storages.
● Requests and Responses
Scrapy uses Request and Response objects for crawling web sites.
Scrapy: Basic Concept 3
● Link Extractors
Convenient classes to extract links to follow from pages.
● Settings
Learn how to configure Scrapy and see all available settings.
● Exceptions
See all available exceptions and their meaning.
Let’s code
● An example …..
Usages
● UI testing
● Web crawling
● Hacking
Sources
● Wikipedia.org
● https://www.seleniumhq.org/
● https://scrapy.org/
● Tutorials: https://selenium-python.readthedocs.io/, https://www.youtube.com/watch?v=XDn60jw68tM,
https://docs.scrapy.org/en/latest/intro/tutorial.html
Questions & Answers
Thank you
