AI-SDV 2021 - Tony Trippe - The Current State of Machine Learning for Patent Searching and Analytics: Practical Perspectives from ML4Patents.com
1. ©All rights reserved. Not for reproduction, distribution or sale.
Improving Patent Analytics Using
Semantic and Machine Learning
Technologies
Anthony Trippe
Managing Director
Patinformatics, LLC
ii-SDV
October 5th, 2021
2. PATENT ANALYTICS USING SEMANTIC AND MACHINE LEARNING TECHNOLOGIES
3. www.patinformatics.com
• Taken from: Guidelines for Preparing Patent Landscape Reports
• http://www.wipo.int/edocs/pubdocs/en/wipo_pub_946.pdf
• Building a collection
• Conducting a patent search
• Determining relevance
• TIDYing the collection
• Patent family reduction
• Deciding on type of year to use
• Standardizing key fields
• Creating categories
• Steps in preparing a patent landscape report
4. www.patinformatics.com
• Analyzing the collection
• Building the models
• Looking for trends
• Visualizing the collection
• Charts & Graphs
• Network diagrams
• Spatial concept maps
• Sharing conclusions
• Steps in preparing a patent landscape report
5. www.patinformatics.com
• When searching during the preparation of a PLR, information retrieval methods are usually evaluated on precision and recall together
• Even so, precision and recall normally work against each other: an increase in recall usually brings a corresponding drop in precision
• When generating collections for PLRs, it may be more productive to begin by creating sets with methods that produce high recall, without regard to precision
• Once an initial high-recall collection is built, different methods can be used to increase its precision by determining the relevance of the families
• From a practical perspective, if recall can be established at higher than 90% while precision is kept above 70%, then the likelihood of finding statistically relevant but conceptually irrelevant items in the subsequent analysis steps is reasonably small
• Patent Search/Determine Relevance
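The recall-first, precision-second workflow on this slide can be sketched with a small calculation. Everything here is illustrative: the family IDs and set sizes are invented, and `precision_recall` is a hypothetical helper, not a function from any tool named in the deck.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) of a retrieved set against a known relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Assumption: 100 truly relevant patent families, identified by integer IDs.
relevant = set(range(100))

# Broad, recall-first search: 95 relevant families plus 105 off-topic ones.
broad = set(range(95)) | set(range(1000, 1105))

# After a relevance-determination step: 70 off-topic families removed,
# 3 relevant families dropped by mistake.
refined = set(range(92)) | set(range(1000, 1035))

broad_p, broad_r = precision_recall(broad, relevant)
refined_p, refined_r = precision_recall(refined, relevant)

print(f"broad:   precision={broad_p:.3f} recall={broad_r:.3f}")
print(f"refined: precision={refined_p:.3f} recall={refined_r:.3f}")
```

In this made-up case the broad set has high recall but precision below 50%, while the refined set clears both the 90% recall and 70% precision marks mentioned above. In practice the relevant set is not known in advance, so both figures are usually estimated from a reviewed sample.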
6. www.patinformatics.com
• Improving recall
• Cosine similarity
• Latent semantic analysis
• More like this / Practical scoring function
• https://cloudblog-withgoogle-com.cdn.ampproject.org/c/s/cloudblog.withgoogle.com/products/data-analytics/expanding-your-patent-set-with-ml-and-bigquery/amp/
• Others?
• How does this really compare to Boolean/traditional patent queries? (To be discussed at EPOPIC)
• Determining relevance – https://github.com/swh/classification-gold-standard
• Binary classification
• Support vector machines / vector space models
• Neural networks
• Others?
• Semantic Tools for Patent Search/Determine Relevance
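As a rough illustration of the cosine similarity item above, the sketch below ranks candidate documents against a seed patent using raw term-frequency vectors. It is a simplification under stated assumptions: the texts are invented, `more_like_this` and the 0.3 threshold are hypothetical, and a production system would more likely use TF-IDF weighting, stop-word removal, or learned embeddings.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer that keeps alphabetic tokens only."""
    return [w for w in text.lower().split() if w.isalpha()]

def cosine_similarity(a, b):
    """Cosine of the angle between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def more_like_this(seed_text, candidates, threshold=0.3):
    """Return (score, doc_id) pairs at or above the threshold, best first."""
    seed = Counter(tokenize(seed_text))
    scored = [(cosine_similarity(seed, Counter(tokenize(text))), doc_id)
              for doc_id, text in candidates.items()]
    return sorted([pair for pair in scored if pair[0] >= threshold], reverse=True)

# Invented example texts; the document IDs are placeholders.
seed = "superconducting qubit circuit for quantum computing"
candidates = {
    "P1": "quantum computing device using a superconducting qubit",
    "P2": "circuit for coupling qubit elements in a quantum processor",
    "P3": "bicycle frame made of carbon fiber",
}

results = more_like_this(seed, candidates)
for score, doc_id in results:
    print(f"{doc_id}: {score:.2f}")
```

Documents that share vocabulary with the seed are pulled into the set even if a Boolean query would have missed them, which is how this kind of scoring can raise recall.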
7. www.patinformatics.com
• Field Cleanup
• Patent Assignee Standardization – fuzzy logic, rules-based disambiguation
• Family or Invention Reduction
• Reconciling Forward Citations
• Based on family reduction method
• Determining Reporting Year
• Using earliest publication year
• TIDYing the Collection
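Two of the cleanup steps above, assignee standardization and family reduction with the earliest publication year, can be sketched as follows. This is a minimal example under assumptions: the company names, record fields (`family_id`, `pub_year`, `assignee`), the suffix list, and the 0.85 threshold are all invented for illustration, and the fuzzy step uses Python's standard `difflib.SequenceMatcher` rather than any method named in the deck.

```python
import re
from difflib import SequenceMatcher

# Assumed rule set: legal-form suffixes stripped before names are compared.
LEGAL_SUFFIXES = re.compile(r"\b(inc|corp|corporation|co|ltd|llc|gmbh)\b\.?")

def clean_name(raw):
    """Rules-based step: lowercase, drop legal-form suffixes and punctuation."""
    name = LEGAL_SUFFIXES.sub(" ", raw.lower())
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    return " ".join(name.split())

def standardize(raw_names, threshold=0.85):
    """Fuzzy step: map each raw name to the first earlier name it closely matches."""
    canonical, mapping = [], {}
    for raw in raw_names:
        cleaned = clean_name(raw)
        match = next((c for c in canonical
                      if SequenceMatcher(None, cleaned, c).ratio() >= threshold), None)
        if match is None:
            canonical.append(cleaned)
            match = cleaned
        mapping[raw] = match
    return mapping

def reduce_to_families(records):
    """Keep one record per family: the member with the earliest publication year."""
    families = {}
    for rec in records:
        best = families.get(rec["family_id"])
        if best is None or rec["pub_year"] < best["pub_year"]:
            families[rec["family_id"]] = rec
    return families

# Invented records for illustration only.
records = [
    {"family_id": "F1", "pub_year": 2019, "assignee": "Acme Quantum Inc."},
    {"family_id": "F1", "pub_year": 2018, "assignee": "ACME Quantum, Inc"},
    {"family_id": "F2", "pub_year": 2020, "assignee": "AcmeQuantum"},
    {"family_id": "F3", "pub_year": 2020, "assignee": "Borealis Photonics GmbH"},
]

mapping = standardize(rec["assignee"] for rec in records)
families = reduce_to_families(records)
```

Here the three spellings of the first company collapse to one standardized name, and family F1 is reported under 2018. Fuzzy matching alone will not link abbreviations or renamed entities to their full names; those cases usually need an explicit alias table.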
8. www.patinformatics.com
• Binary classification provides a means for categorizing large collections of patent documents into
the references that are likely to be of highest interest to the information professional, and those
that are likely not related, but were still retrieved in a broad search
• A training set will be made up of references that are highly relevant to the interests of the analyst
• In training the classifier, the analyst will need to identify documents that are off-topic as well, so the
classifier can establish a hyperplane that will distinguish between the two categories
• Technology categories are sometimes identified using the patent data itself, for instance with classification codes, but ideally they should be generated with input from a subject-matter expert, following an industry-standard view of how approaches are categorized
• Using a market or industry-based approach to creating categories will make it easier for the clients
of the PLR to identify with the technology
• Creating Categories for the Collection
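To make the hyperplane idea above concrete, here is a small sketch of a linear binary classifier trained on on-topic and off-topic examples. Note the substitution: it uses a perceptron, a simpler linear method than a support vector machine, chosen only so the example needs no external libraries. The training texts and function names are invented for illustration.

```python
from collections import defaultdict

def features(text):
    """Bag-of-words features: every lowercase token present maps to 1.0."""
    return {tok: 1.0 for tok in text.lower().split()}

def train_perceptron(examples, epochs=20):
    """Learn weights and a bias so that the sign of (w.x + b) separates the labels.

    examples: list of (text, label), label +1 for on-topic, -1 for off-topic.
    """
    weights, bias = defaultdict(float), 0.0
    for _ in range(epochs):
        mistakes = 0
        for text, label in examples:
            x = features(text)
            score = sum(weights[t] * v for t, v in x.items()) + bias
            if label * score <= 0:           # misclassified: shift the hyperplane
                for t, v in x.items():
                    weights[t] += label * v
                bias += label
                mistakes += 1
        if mistakes == 0:                     # training set is fully separated
            break
    return weights, bias

def is_relevant(text, weights, bias):
    """Classify a document by which side of the hyperplane it falls on."""
    x = features(text)
    return sum(weights.get(t, 0.0) * v for t, v in x.items()) + bias > 0

# Invented training set: on-topic quantum computing vs. off-topic hits
# that a broad search on "quantum" might still retrieve.
training = [
    ("superconducting qubit coupling circuit", +1),
    ("quantum annealing processor with qubit array", +1),
    ("quantum dot display panel backlight", -1),
    ("bicycle frame carbon fiber", -1),
]

weights, bias = train_perceptron(training)
print(is_relevant("qubit readout circuit", weights, bias))
print(is_relevant("quantum dot backlight for display", weights, bias))
```

The off-topic examples matter as much as the on-topic ones: without the "quantum dot" document, the classifier would have no reason to discount the shared word "quantum".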
9. SAMPLE PATENT VISUALIZATIONS INVOLVING MACHINE LEARNING
10. www.patinformatics.com
• Spatial Concept Maps
• Patents that are similar to one another based on their language are organized close to one another on the map
• The relative distance between different technical subjects shows which concepts are related to one another
• Labels are added to identify the sub-sections within a technology field
• Colored dots are used to provide comparisons within the context of the map
• In this case, the dots provide a means to compare different companies in the space
• Colored dots can also be used for distributions over time
11. www.patinformatics.com
• Patent Citation Network Maps™
• When a patent is referenced in a later patent, the reference is called a forward citation
• Patents that have a large number of forward citations, especially from other organizations, can be considered influential
• These citations can be aggregated by the organizations that own the patents in question
• Using a network diagram, the connections between organizations can be found and the most influential groups identified
• In this case, the HP and Northrop portfolios are not as large as the ones for D-Wave and IBM, but they are very influential
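The aggregation step described above can be sketched as follows: family-level forward citations are rolled up into organization-to-organization edges, and each organization is scored by the citations it receives from others. The organizations, family IDs, and citation pairs are invented, and the function names are hypothetical.

```python
from collections import Counter

# Invented data: patent family -> owning organization.
owner = {"A1": "OrgA", "A2": "OrgA", "B1": "OrgB",
         "C1": "OrgC", "C2": "OrgC", "C3": "OrgC"}

# Invented (citing family, cited family) pairs.
citations = [("B1", "A1"), ("C1", "A1"), ("C2", "A1"),
             ("A2", "A1"), ("C3", "B1"), ("A2", "C1")]

def org_citation_edges(owner, citations):
    """Aggregate family-level citations into (citing org, cited org) edge weights."""
    edges = Counter()
    for citing, cited in citations:
        edges[(owner[citing], owner[cited])] += 1
    return edges

def external_influence(edges):
    """Count forward citations each organization receives from other organizations."""
    received = Counter()
    for (citing_org, cited_org), n in edges.items():
        if citing_org != cited_org:          # skip self-citations
            received[cited_org] += n
    return received

edges = org_citation_edges(owner, citations)
influence = external_influence(edges)
portfolio_size = Counter(owner.values())

for org, score in influence.most_common():
    print(f"{org}: {score} external citations, {portfolio_size[org]} families")
```

In this toy data OrgA holds fewer families than OrgC but receives the most citations from other organizations, the same pattern the slide describes for smaller but influential portfolios. The `edges` counter is the weighted edge list a network diagram tool would take as input.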
13. www.patinformatics.com
• Supporting and Additional Vendors
• Vendor pages
• 37 unique machine learning for IP tools from 31 vendors
• Detailed information provided on each tool
• Indexed by ten primary and 21 secondary categories for ease of data retrieval on the tools and vendors
14. www.patinformatics.com
• Blog Post Selection: Can an AI Get a Patent?
• From ML4Patents.com — 219 posts currently on the site
• When is There an “Actual Invention” Involving Computers?
• 2021: An AI Inventor Odyssey
• USPTO Releases Public Comments on AI
• The Time is Now: Opportunities to Advise the E.D. Va. or EPO as to Whether to Prohibit, Permit, or Require Listing an AI Algorithm as an Inventor
• AI Magazine: What Happens When Artificial Intelligence Invents: Is the Invention Patentable?
• Legal challenges have occurred in Australia, Europe, the United Kingdom and the US
• In all cases patentability requires a human inventor and thus an AI system can’t be an inventor
15. www.patinformatics.com
• Blog Post Selection: Is AI Patentable?
• From ML4Patents.com
• How to Safeguard AI Technology: Patents versus Trade Secrets
• Brazilian PTO publishes Guidelines for Examining Patent Applications Involving Computer-Implemented Inventions
• European Patent Office Artificial Intelligence Page
• Public Views on AI and IP Policy from USPTO
• AI Initiatives at the JPO
• Patenting AI in the EPO Guidelines - The Best Practice Podcast
• UK Government Mulls Legislative Changes to Safeguard AI Inventions
• Absolutely — most countries are making special provisions
• Software patents were once mostly forbidden as abstract, non-allowable subject matter, but as the importance of machine learning has increased, more and more countries are posting guidelines on how to protect these inventions
16. www.patinformatics.com
• Webinar Series
• The 4th episode is on Wednesday, October 6th, with Intellar from Perception Partners
• Learn more and register at: https://www.ml4patents.com/webinars/episode-4-w-perception-partners
• The first three episodes have been a resounding success
• Relativity: provided a collection of methods to choose from
• Cipher: deep indexing in selected verticals with rich hierarchies
• Amplified: provided an excellent primer on modern ML for text retrieval methods
• Archive available at: https://www.ml4patents.com/webinars