Big Data made Small
http://promptcloud.com
© PromptCloud Technologies 2012, All rights reserved

Problem Identified
There’s a lot of data around in the form of reviews, blogs, social media, catalogs, etc., but there are only 24 hours in a day to aggregate all the “relevant” data, arrange it in a “format”, derive “insights” from it, and, hmph, realize that your focus was something else.

Big Data, very big data, very very big data...
Identify? Aggregate? Analyze?

Our Answer
We realize that Big Data = More Info = Bigger Opportunities, so we do the following for you.

• Crawl Web: We do deep data crawling and reach where search engines don’t!
• Extract Data: We extract data in the desired format from as many sources as needed.
• Normalize Data: We de-dupe data and join extracts across pages.

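To make the "Normalize Data" step concrete, here is a minimal sketch of de-duping records and joining extracts that come from different pages. It is an illustration only, not PromptCloud's implementation: the function name `normalize_extracts`, the `product_id` key, and the sample fields are all assumptions made for the example.

```python
def normalize_extracts(extracts, key):
    """De-dupe records and join partial extracts that share the same key.

    Hypothetical helper: records with the same key value are merged into
    one record; the first non-empty value seen for each field is kept.
    """
    merged = {}
    for record in extracts:
        target = merged.setdefault(record[key], {})
        for field, value in record.items():
            # Skip empty values and never overwrite a field already filled.
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(merged.values())


# Assumed sample data: one product seen on a listing page, a review page,
# and again (as an exact duplicate) on a second crawl of the listing page.
records = [
    {"product_id": "p1", "title": "Camera", "price": "199"},
    {"product_id": "p1", "rating": "4.5"},
    {"product_id": "p1", "title": "Camera", "price": "199"},
    {"product_id": "p2", "title": "Tripod"},
]
clean = normalize_extracts(records, key="product_id")
```

Keying on a single identifier keeps the sketch short; a real pipeline would likely need fuzzier matching when pages do not share a clean identifier.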
Underlying Magic
• Distributed Crawling - Hadoop
• Extraction
• Pattern Recognition - Parsing Agent
• Cloud Computing
• Cassandra/HBase
• Lucene
• Machine Learning

Business Model
Data as a Service (DaaS) Platform

• Custom data from deep crawl & incremental crawls
• Formats: XML, CSV, YAML
• PromptClouder → Happy Customer

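As a rough illustration of delivering the same crawl output in more than one of the listed formats, the sketch below serializes a list of records to CSV and XML using only the Python standard library. The function names, the `records`/`record` element names, and the sample fields are assumptions for the example, not the actual feed layout; YAML is left out because it would need a third-party library such as PyYAML.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(records, fields):
    """Serialize records to CSV text with a header row (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields are written as empty cells.
        writer.writerow({f: record.get(f, "") for f in fields})
    return buf.getvalue()


def to_xml(records, fields):
    """Serialize records to an XML string (element names are assumed)."""
    root = ET.Element("records")
    for record in records:
        item = ET.SubElement(root, "record")
        for f in fields:
            ET.SubElement(item, f).text = str(record.get(f, ""))
    return ET.tostring(root, encoding="unicode")


# Assumed sample record for illustration.
sample = [{"product_id": "p1", "title": "Camera"}]
csv_feed = to_csv(sample, ["product_id", "title"])
xml_feed = to_xml(sample, ["product_id", "title"])
```

Passing an explicit field list keeps the column order stable across deliveries, which matters when a customer's loader depends on a fixed layout.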
Features & Functions
• Unlimited Data: Unlimited data in terabytes, petabytes, or exabytes (YOU ask for it!) that directly converts into business.
• Vertical Search Engines: Vertical content based on topicality, media type, or genre of content, e.g. legal, medical, patent, travel, and automobile search engines.
• Social Media Content: Aggregated data from across social networks, viz. Twitter, LinkedIn, Google+, etc.
• Consumer Insights: Collection and analysis of reviews and ratings on products and services, providing direct insight into consumer preferences.
• Business Intelligence: Real-time information about your competitors and BD opportunities (open tenders, project announcements, etc.).

Customers Speak

“They have a state-of-art data platform. It was definitely a good decision to go for customized crawls than get our feet wet with just any other mass data crawler.”
- WisdomTap

“These guys at PromptCloud have done an excellent job. They have not only provided exhaustive data but also have done the same within stipulated SLAs. Their technology and methodology is excellent and they get closely involved with the business.”
- FunnelScope

Our Advantage
Making big data small to alleviate tech-aches

• Performance: Low ETAs, precision extraction, exhaustive data available as feed
• Price: Flexible pricing based on size and frequency of crawls
• Technology: Highly scalable, access to real-time data

Ask Us for Free Demo
We can provide you with customized sample data from 2-3 sites of your choice.

Contact Us
Email: info@promptcloud.com
Phone: +91-96 86 56 70 70
