Type Profiler: An analysis to
guess type signatures
Yusuke Endoh (@mametter)
Cookpad Inc.
RubyKaigi 2018 (2018/06/01)
Yusuke Endoh (@mametter)
• A full-time MRI committer @ Cookpad
– w/ Koichi Sasada
Recent achievement for Ruby 2.6
• Endless range [Feature #12912]
(1..)
Endless!
Endless range
• Take an array without the first element
ary=["a","b","c"]
ary[1..-1] #=> ["b","c"]
ary.drop(1) #=> ["b","c"]
ary[1..] #=> ["b","c"]
Endless range
• Loop from 1 to infinity
i=1; loop { ……; i+=1 }
(1..Float::INFINITY).each {……}
1.step {|i|……}
(1..).each {|i|……}
Endless range
• each_with_index from index 1
i=1; ary.each { ……; i+=1 }
ary.each.with_index(1){|x,i|……}
ary.zip(1..) {|x,i|……}
Endless range
✓Already committed to trunk
✓Will be included in Ruby 2.6
• Stay tuned!
ary[1..]
(1..).each {……}
ary.zip(1..) {|x,i|……}
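The three idioms above can be tried together in one short script. This is a minimal sketch that assumes Ruby 2.6 or later; the variable names are chosen only for illustration.

```ruby
# Requires Ruby 2.6 or later (endless ranges).
ary = ["a", "b", "c"]

# Slice from index 1 to the end of the array.
tail = ary[1..]

# Take the first three values of an infinite sequence, lazily.
first_three = (1..).lazy.map { |i| i * 10 }.first(3)

# Pair each element with a 1-based index.
pairs = ary.zip(1..)

p tail        # ["b", "c"]
p first_three # [10, 20, 30]
p pairs       # [["a", 1], ["b", 2], ["c", 3]]
```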
Beginless range...?
• Implemented just yesterday
[Feature #14799]
(..1)
Beginless!
Today’s theme
• Ruby3's type.
• Some people held some meetings
to discuss Ruby3's type
– Matz, soutaro, akr, ko1, mame
– Main objective: clarify matz's hidden
requirements (and compromises) for
Ruby3's type
• (Not to decide everything behind closed doors)
• We'll explain the (current) requirements
Agenda
• A whirlwind tour of already-proposed
"type systems" for Ruby
• Type DB: A key concept of Ruby3's
type system
• A missing part: Type profiler
A whirlwind tour of
already-proposed
"type systems" for Ruby
Type-related systems for Ruby
• Steep
– Static type check
• RDL
– (Semi) static type check
• contracts.ruby
– Only dynamic check of arguments/return values
• dry-types
– Only dynamic checks of typed structs
• RubyTypeInference (by JetBrains)
– Type information extractor by dynamic analysis
• Sorbet (by Stripe)
RDL: Types for Ruby
• Most famous in academic world
– Jeff Foster at Univ. of Maryland
– Accepted in OOPSLA, PLDI, and POPL!
• The gem is available
– https://github.com/plum-umd/rdl
• We evaluated RDL
– by writing type annotations for
OptCarrot
Basics of RDL
# load RDL library
require "rdl"
class NES
# activate type annotations for RDL
extend RDL::Annotate
# type annotation before method definition
type "(?Array<String>) -> self", typecheck: :call
def initialize(conf = ARGV)
...
RDL type annotation
• Accepts one optional parameter typed
Array of String
• Returns self
– Always "self" for initialize method
type "(?Array<String>) -> self", typecheck: :call
def initialize(conf = ARGV)
...
RDL type annotation
• "typecheck" controls type check timing
– :call: when this method is called
– :now: when this method is defined
– :XXX: when "RDL.do_typecheck :XXX" is
done
– nil: no "static check" is done
• Used to type-check code that uses the method
• A "run-time check" is still done
type "(?Array<String>) -> self", typecheck: :call
def initialize(conf = ARGV)
...
Annotation for instance variables
• Needs type annotations for all
instance variables
class NES
# activate type annotations for RDL
extend RDL::Annotate
var_type :@cpu, "%any"
type "() -> %any", typecheck: :call
def reset
@cpu.reset
#=> receiver type %any not supported yet
...
Annotation for instance variables
• Needs type annotations for all
instance variables
class NES
# activate type annotations for RDL
extend RDL::Annotate
var_type :@cpu, "[reset: () -> %any]"
type "() -> %any", typecheck: :call
def reset
@cpu.reset
#=> receiver type [reset: () -> %any] not sup
...
Annotation for instance variables
• Needs type annotations for all
instance variables
class NES
# activate type annotations for RDL
extend RDL::Annotate
var_type :@cpu, "Optcarrot::CPU"
type "() -> %any", typecheck: :call
def reset
@cpu.reset
# error: no type information for
# instance method `Optcarrot::CPU#reset'
Annotation for instance variables
• Type check succeeded
class NES
# activate type annotations for RDL
extend RDL::Annotate
type "Optcarrot::CPU","reset","()->%any"
var_type :@cpu, "Optcarrot::CPU"
type "() -> %any", typecheck: :call
def reset
@cpu.reset
...
Requires many annotations...
type "() -> %bot", typecheck: :call
def reset
@cpu.reset
@apu.reset
@ppu.reset
@rom.reset
@pads.reset
@cpu.boot
@rom.load_battery
end
Requires many annotations...
type "() -> %bot", typecheck: nil
def reset
@cpu.reset
@apu.reset
@ppu.reset
@rom.reset
@pads.reset
@cpu.boot
@rom.load_battery
end
No static check
… still does not work
type "() -> %bot", typecheck: nil
def reset
...
@rom.load_battery #=> [65533]
end
# Optcarrot::CPU#reset: Return type error.…
# Method type:
# *() -> %bot
# Actual return type:
# Array
# Actual return value:
# [65533]
Why?
• typecheck: nil doesn't mean no check
– A dynamic check is still done
• %bot means "no-return"
– The method always raises an exception, exits the process, etc.
– But this method returns [65533]
– In short, this is my bug in the annotation
type "() -> %bot", typecheck: nil
def reset
...
@rom.load_battery #=> [65533]
end
Lessons: void type
• In Ruby, a lot of methods return
a meaningless value
– There is no intention to
let users
use the value
• What type should we use in this case?
– %any, or return nil explicitly?
• We need a "void" type
– %any for the method; it can return anything
– "don't use" for users of the method
def reset
LIBRARY_INTERNAL_ARRAY.
each { … }
end
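To make the problem concrete: `Array#each` returns its receiver, so a method that ends with `each` hands the internal array to the caller even though nobody intended that. In this small sketch the constant's contents are invented for illustration.

```ruby
# Hypothetical constant standing in for library-internal state.
LIBRARY_INTERNAL_ARRAY = [:cpu, :apu, :ppu].freeze

def reset
  # Array#each returns its receiver, so the internal array becomes the
  # (unintended) return value of reset.
  LIBRARY_INTERNAL_ARRAY.each { |unit| unit.to_s }
end

leaked = reset
p leaked                                # [:cpu, :apu, :ppu]
p leaked.equal?(LIBRARY_INTERNAL_ARRAY) # true
```

A "void" return type would let the method keep returning whatever it happens to return while telling callers not to rely on it.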
RDL's programmable annotation
• RDL supports meta-programming
symbols.each do |id|
attr_reader_type id, "String"
attr_reader id
end
RDL's programmable annotation
• RDL supports pre-condition check
– This can also be used to generate type
annotations automatically
• I like this feature, but matz doesn't
– He wants to avoid type annotations
embedded in the code
– He prefers a separate, non-Ruby type definition
language (as in Steep)
pre(:belongs_to) do |name|
……
type name, "() -> #{klass}"
end
Summary: RDL
• Semi-static type check
– The timing is configurable
• It checks the method body
– Not only dynamic check of
arguments/return values
• The implementation is mature
– Many features actually work, great!
• Needs type annotations
• Supports meta-programming
Steep
• Skipped: you already heard soutaro's talk
• Completely static type check
• Separated type definition language
– .rbi
– But also requires (minimal?) type
annotation embedded in .rb files
Digest: contracts.ruby
require 'contracts'
class Example
include Contracts::Core
include Contracts::Builtin
Contract Num => Num
def double(x)
x * 2
end
end
• RDL-like type annotation
– Run-time type check
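To show what "run-time type check" of arguments and return values means, here is a hand-rolled sketch in plain Ruby. It is not the contracts.ruby API: `RuntimeCheck` and `check_types` are hypothetical names, and only single-argument methods are handled.

```ruby
# A hand-rolled run-time check; this is NOT the contracts.ruby API.
module RuntimeCheck
  # Wraps an existing one-argument instance method so that its argument and
  # its return value are checked with is_a? on every call.
  def check_types(method_name, arg_class, return_class)
    wrapper = Module.new do
      define_method(method_name) do |arg|
        unless arg.is_a?(arg_class)
          raise TypeError, "#{method_name}: expected #{arg_class}, got #{arg.class}"
        end
        result = super(arg)
        unless result.is_a?(return_class)
          raise TypeError, "#{method_name}: expected to return #{return_class}, got #{result.class}"
        end
        result
      end
    end
    prepend wrapper
  end
end

class Example
  extend RuntimeCheck

  def double(x)
    x * 2
  end
  check_types :double, Numeric, Numeric
end

p Example.new.double(21) # 42

begin
  Example.new.double("21")
rescue TypeError => e
  puts e.message # double: expected Numeric, got String
end
```

Nothing is checked until `double` is actually called, which is the key difference from the static and semi-static checkers above.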
Digest: dry-types
require 'dry-types'
require 'dry-struct'
module Types
include Dry::Types.module
end
class User < Dry::Struct
attribute :name, Types::String
attribute :age, Types::Integer
end
• Can define structs with typed fields
– Run-time type check
– "type_struct" gem is similar
Digest: RubyTypeInference
• Type information extractor by dynamic
analysis
– Run test suites under monitoring of
TracePoint API
– Hooks method call/return events, logs
the passed values, and aggregates them
into type information
– Used by RubyMine IDE
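The hook-and-log idea can be sketched with the standard TracePoint API. This is not RubyTypeInference's implementation; `SIGNATURES`, `pending`, and the sample method `add` are names made up for this example, and `TracePoint#parameters` requires Ruby 2.6 or later.

```ruby
# Minimal sketch of TracePoint-based signature logging.
SIGNATURES = Hash.new { |h, k| h[k] = [] }

def add(x, y)
  x + y
end

pending = [] # stack of argument classes for calls still in flight

trace = TracePoint.new(:call, :return) do |tp|
  next unless tp.method_id == :add # only watch the sample method
  case tp.event
  when :call
    # Read each declared parameter from the callee's binding.
    args = tp.parameters.map { |_kind, name| tp.binding.local_variable_get(name).class }
    pending.push(args)
  when :return
    args = pending.pop
    SIGNATURES[tp.method_id] << [args, tp.return_value.class]
  end
end

trace.enable do
  add(1, 2)
  add("a", "b")
end

SIGNATURES.each do |name, sigs|
  sigs.uniq.each { |args, ret| puts "#{name}(#{args.join(', ')}) -> #{ret}" }
end
```

A real extractor would watch every method rather than one, and would merge the logged classes into union or generic types.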
Digest: RubyTypeInference
https://speakerdeck.com/valich/automated-type-contracts-generation-1
Summary of Type Systems
System             Objective                 Targets                      Annotations
Steep              Static type check         Method body                  Separated (mainly)
RDL                Semi-static type check    Method body                  Embedded in code
contracts.ruby     Dynamic type check        Arguments and return values  Embedded in code
dry-types          Typed structs             Only Dry::Struct classes     Embedded in code
RubyTypeInference  Extract type information  Arguments and return values  N/A
Type DB: A key concept of
Ruby3's Type System
Idea
• A separate type definition file is good
• But meta-programming like attr_* is
difficult to support
– Users will try to generate it programmatically
• We may want to keep code position
– To show the line number of the code in type error reports
– Hard to manually keep the correspondence
between type definition and code position
in .rbi file
– We may also want to keep other information
Type DB
[Diagram: the Type DB at the center, connected to: Steep type definition; Steep (typecheck); RDL/Sorbet type annotation; RDL (typecheck); Ruby interpreter (better error report); IDE]
How to create Type DB
[Diagram: ways to populate the Type DB. Labels: Steep type definition; Ruby code; write manually; compile; stdlib (already included); RubyTypeInference (automatically extract by dynamic analysis); Type Profiler]
A missing part: Type Profiler
Type Profiler
• Another way to extract type information
from Ruby code
– An alternative to "RubyTypeInference"
• It is not type inference
– Type inference of Ruby is hopeless
– Conservative static type inference can
extract little information
• Type profiler "guesses" type information
– It may extract wrong type information
– Assumes that user checks the result
Type Profilers
• There is no "one-for-all" type profiler
– Static type profiling cannot handle
ActiveRecord
– Dynamic type profiling cannot extract
syntactic features (like void type)
• We need a variety of type profilers
– For ActiveRecord by reading DB schema
– Extracting from RDoc/YARD
In this talk
• We prototyped three more generic
type profilers
– Static analysis 1 (SA1)
• Mainly for user-defined classes
– Static analysis 2 (SA2)
• Mainly for builtin classes
– Dynamic analysis (DA)
• Enhancement of "RubyTypeInference"
SA1: Idea
• Guess a type of formal parameters
based on called method names
class FooBar
def foo(...); ...; end
def bar(...); ...; end
end
def func(x) #=> x:FooBar
x.foo(1)
x.bar(2)
end
SA1: Prototyped algorithm
• Gather method
definitions in each
class/module
– FooBar={foo,bar}
• Gather method calls
for each parameter
– x={foo,bar}
– Remove general methods (like #[] and #+)
to reduce false positives
– Arity, parameter and return types aren't used
• Assign a class that defines all the called methods
class FooBar
def foo(...);...;end
def bar(...);...;end
end
def func(x)
x.foo(1)
x.bar(2)
end
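The matching step can be sketched in a few lines. This is not the prototype's code: the tables are hard-coded rather than gathered from parsed source, the class `Baz` is invented to show an ambiguous match, and the list of "general" methods is an assumption.

```ruby
require "set"

# Step 1: method definitions gathered per class (hard-coded here; a real
# profiler would collect them from the parsed source).
CLASS_METHODS = {
  "FooBar" => Set[:foo, :bar],
  "Baz"    => Set[:foo, :qux], # hypothetical second class
}

# Methods considered too general to count as evidence (assumed list).
GENERAL_METHODS = Set[:[], :+, :==, :to_s]

# Steps 2 and 3: given the method names called on a parameter, return the
# classes that define all of them.
def guess_classes(called_methods)
  evidence = called_methods.to_set - GENERAL_METHODS
  return [] if evidence.empty? # nothing to match on: no guess
  CLASS_METHODS.select { |_name, defined| evidence.subset?(defined) }.keys
end

p guess_classes([:foo, :bar]) # ["FooBar"]
p guess_classes([:foo])       # ["FooBar", "Baz"]
p guess_classes([:+])         # []
```

The second call shows why a guess can be incomplete: when several classes define every called method, the result is a union of candidates.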
SA1: Evaluation
• Experimented with SA1 on WEBrick
– As sample code that has many user-defined
classes
• Manually checked the guessed result
– Found some common guessing failures
• Wrong result / no-match result
– No quantitative evaluation yet
SA1: Problem 1
• A parameter is not used
• Many methods are affected
def do_GET(req, res)
raise HTTPStatus::NotFound, "not found."
end
DefaultFileHandler#do_GET(req:#{}, res:HTTPResponse)
FileHandler#do_GET(req:#{}, res:#{})
AbstractServlet#do_GET(req:#{}, res:#{})
ProcHandler#do_GET(request:#{}, response:#{})
ERBHandler#do_GET(req:#{}, res:HTTPResponse)
SA1: Problem 2
• Incomplete guessing
• Cause
– the method calls req.request_uri
– Both HTTPResponse and HTTPRequest
provide request_uri
HTTPProxyServer#perform_proxy_request(
req: HTTPResponse | HTTPRequest,
res: WEBrick::HTTPResponse,
req_class:#{new}, :nil)
(Arguable) solution?
• Exploit the name of parameter
– Create a mapping from parameter name
to type after profiling
• "req" → HTTPRequest
– Revise guessed types using the mapping
• Fixed!
DefaultFileHandler#do_GET(req:HTTPRequest, res:HTTPResponse)
FileHandler#do_GET(req:HTTPRequest, res:HTTPResponse)
AbstractServlet#do_GET(req:HTTPRequest, res:HTTPResponse)
ProcHandler#do_GET(request:#{}, response:#{})
ERBHandler#do_GET(req:HTTPRequest, res:HTTPResponse)
CGIHandler#do_GET(req:HTTPRequest, res:HTTPResponse)
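The name-based revision can be sketched as a vote per parameter name followed by a fill-in pass. The first-pass data below is illustrative only, loosely modeled on the WEBrick example rather than taken from actual profiler output; `nil` stands for "no guess" (printed as `#{}`).

```ruby
NO_GUESS = '#{}' # how a missing guess is printed

# Illustrative first-pass guesses (not real profiler output).
guessed = {
  "DefaultFileHandler#do_GET" => { "req" => nil, "res" => "HTTPResponse" },
  "FileHandler#do_GET"        => { "req" => nil, "res" => nil },
  "CGIHandler#do_GET"         => { "req" => "HTTPRequest", "res" => "HTTPResponse" },
  "ProcHandler#do_GET"        => { "request" => nil, "response" => nil },
}

# Build the name -> type mapping: the most frequent guess per parameter name.
votes = Hash.new { |h, k| h[k] = Hash.new(0) }
guessed.each_value do |params|
  params.each { |name, type| votes[name][type] += 1 if type }
end
name_to_type = votes.transform_values { |counts| counts.max_by { |_type, n| n }.first }

# Revise: fill in missing guesses from the mapping.
revised = guessed.transform_values do |params|
  params.to_h { |name, type| [name, type || name_to_type[name]] }
end

revised.each do |method, params|
  sig = params.map { |name, type| "#{name}:#{type || NO_GUESS}" }.join(", ")
  puts "#{method}(#{sig})"
end
```

`ProcHandler#do_GET` stays unresolved because no other method uses the names `request` and `response`, which matches the limitation visible in the output above.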
SA1: Problem 3
• Cannot guess return type
• Can guess in only limited cases
– Returns formal parameter
– Returns a literal or "Foo.new"
– Returns an expression whose type is already
included in the Type DB
• See actual usage of the method?
– Requires inter-procedural or
whole-program analysis!
SA1: Pros/Cons
• Pros
– No need to run tests
– Can guess void type
• Cons
– Hard when parameters are not used
• This is not a rare case
– Heuristics may work, but can cause wrong
guesses
SA2: Idea
• I believe this method expects Numeric!
def add_42(x) #=> (x:Num)=>Num
x + 42
end
SA2: Prototyped algorithm
• Limited type DB of stdlib
– Num#+(Num) → Num
– Str#+(Str) → Str, etc.
• "Unification-based type-inference"
inspired algorithm
– searches "α#+(Num) → β"
– Matches "Num#+(Num) → Num"
• Type substitution: α=Num, β=Num
x + 42
SA2: Prototyped algorithm (2)
• When multiple candidates are found
– matches:
• Num#<<(Num) → Num
• Str#<<(Num) → Str
• Array[α]#<<(α) → Array[α]
– Just take union types of them
• (Overloaded types might be better)
def push_42(x)
x << 42
end
#=> (x:(Num|Str|Array))=>(Num|Str|Array)
x << 42
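The lookup and the union step can be sketched with a tiny table. This is a simplification, not the prototype: the type DB has only five entries, type variables are reduced to a single wildcard marker `:elem`, and no real unification is performed.

```ruby
# A tiny type DB: [receiver, method, argument, return]. :elem marks a type
# variable that matches any argument, as in Array[α]#<<(α).
TYPE_DB = [
  ["Num",   :+,  "Num", "Num"],
  ["Str",   :+,  "Str", "Str"],
  ["Num",   :<<, "Num", "Num"],
  ["Str",   :<<, "Num", "Str"],
  ["Array", :<<, :elem, "Array"],
]

# Finds every entry matching "α#meth(arg_type) → β" and returns the union
# of candidate receiver types (α) and return types (β).
def guess(meth, arg_type)
  hits = TYPE_DB.select do |_recv, m, arg, _ret|
    m == meth && (arg == :elem || arg == arg_type)
  end
  { receiver: hits.map { |h| h[0] }.uniq, returns: hits.map { |h| h[3] }.uniq }
end

plus  = guess(:+, "Num")  # for `x + 42`
shift = guess(:<<, "Num") # for `x << 42`

puts "x + 42:  (x:#{plus[:receiver].join('|')}) => #{plus[:returns].join('|')}"
puts "x << 42: (x:#{shift[:receiver].join('|')}) => #{shift[:returns].join('|')}"
```

Taking the union loses the pairing between receiver and return type (Str in, Str out), which is why overloaded types might be the better representation.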
SA2: Evaluation
• Experimented with SA2 on OptCarrot
– As sample code that uses many builtin
types
• Manually checked the guessed result
– Found some common guessing failures
• Wrong result / no-match result
– No quantitative evaluation yet
SA2: Problem 1
• Surprising result
– Counterintuitive, but actually it works
with @fetch:Array[Num|Str]
def peek16(addr)
@fetch[addr] + (@fetch[addr + 1] << 8)
end
# Optcarrot::CPU#peek16(Num) => (Num|Str)
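The Str result looks wrong but is reachable, because `String#<<` accepts an Integer and appends the character with that codepoint. The sketch below uses a stand-in class, `FakeCPU`, invented for this example; only the body of `peek16` comes from the slide.

```ruby
# Stand-in for Optcarrot::CPU with just enough state to call peek16.
class FakeCPU
  def initialize(fetch)
    @fetch = fetch
  end

  def peek16(addr)
    @fetch[addr] + (@fetch[addr + 1] << 8)
  end
end

# Intended use: two bytes combined into a little-endian 16-bit value.
p FakeCPU.new([0x34, 0x12]).peek16(0)       # 4660 (0x1234)

# Also valid Ruby: String#<< appends codepoint 8 (backspace), then
# String#+ concatenates.
p FakeCPU.new(["a".dup, "b".dup]).peek16(0) # "ab\b"
```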
SA2: Problem 2
• Difficult to handle type parameters
– Requires constraint-based type-inference
@ary = [] # Array[α]
@ary[0] = 1 # unified to Array[Num]
@ary[1] = "str" # cannot unify Num and Str
SA2: Pros/Cons
• Pros
– No need to run tests
– Can guess void type
– Can guess parameters that are not used as a
receiver
• Cons
– Can cause wrong guesses
– Hard to handle type parameters (Array[α])
– Hard to scale
• The bigger the type DB is, the more wrong results
occur
DA: Idea
• Record actual inputs/outputs of
methods by using TracePoint API
– The same as RubyTypeInference
• Additional features
– Support block types
• Required enhancement of TracePoint API
– Support container types: Array[Int]
• By sampling elements
DA: Evaluation
• Evaluated with OptCarrot and WEBrick
• It is easy to get working, and robust
DA: Problem 1
• Very slow (in some cases)
– Recording OptCarrot may take hours
– Element-sampling for Array made it faster,
but it still takes a few minutes
• Without tracing, it runs in a few seconds
– It may depend on application
• Profiling WEBrick is not so slow
DA: Problem 2
• Cannot guess void type
– Many methods return garbage
– DA cannot distinguish garbage and
intended return value
• SA can guess void type by heuristic
– Integer#times, Array#each, etc.
– if statement that has no "else"
– while and until statements
– Multiple assignment
• (Steep scaffold now supports some of them)
DA: Problem 3
• Some tests confuse the result
– Need to ignore error-handling tests by
cooperating with the test framework
assert_raise(TypeError) { … }
DA: Pros/Cons
• Pros
– Easy to implement, and robust
– It can profile any programs
• Including meta-programming like
ActiveRecord
• Cons
– Need to run tests; it might be very slow
– Hard to handle void type
– TracePoint API is not enough yet
– Need to cooperate with test frameworks
Conclusion
• Reviewed already-proposed type
systems for Ruby
– Whose implementations are available
• Type DB: Ruby3's key concept
• Some prototypes and experiments of
type profilers
– Need more improvements / experiments!

Contenu connexe

Tendances

Ruby projects of interest for DevOps
Ruby projects of interest for DevOpsRuby projects of interest for DevOps
Ruby projects of interest for DevOps
Ricardo Sanchez
 
使用.NET构建轻量级分布式框架
使用.NET构建轻量级分布式框架使用.NET构建轻量级分布式框架
使用.NET构建轻量级分布式框架
jeffz
 
LINQ Inside
LINQ InsideLINQ Inside
LINQ Inside
jeffz
 
How to start using Scala
How to start using ScalaHow to start using Scala
How to start using Scala
Ngoc Dao
 

Tendances (20)

IDLs
IDLsIDLs
IDLs
 
Ruby projects of interest for DevOps
Ruby projects of interest for DevOpsRuby projects of interest for DevOps
Ruby projects of interest for DevOps
 
TypeScript Best Practices
TypeScript Best PracticesTypeScript Best Practices
TypeScript Best Practices
 
使用.NET构建轻量级分布式框架
使用.NET构建轻量级分布式框架使用.NET构建轻量级分布式框架
使用.NET构建轻量级分布式框架
 
A Recovering Java Developer Learns to Go
A Recovering Java Developer Learns to GoA Recovering Java Developer Learns to Go
A Recovering Java Developer Learns to Go
 
Advanced Reflection in Pharo
Advanced Reflection in PharoAdvanced Reflection in Pharo
Advanced Reflection in Pharo
 
Clojure in real life 17.10.2014
Clojure in real life 17.10.2014Clojure in real life 17.10.2014
Clojure in real life 17.10.2014
 
Oslo.versioned objects - Deep Dive
Oslo.versioned objects - Deep DiveOslo.versioned objects - Deep Dive
Oslo.versioned objects - Deep Dive
 
LINQ Inside
LINQ InsideLINQ Inside
LINQ Inside
 
Ruby, the language of devops
Ruby, the language of devopsRuby, the language of devops
Ruby, the language of devops
 
Why scala is not my ideal language and what I can do with this
Why scala is not my ideal language and what I can do with thisWhy scala is not my ideal language and what I can do with this
Why scala is not my ideal language and what I can do with this
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of Ruby
 
Maccro Strikes Back
Maccro Strikes BackMaccro Strikes Back
Maccro Strikes Back
 
Implementing a JavaScript Engine
Implementing a JavaScript EngineImplementing a JavaScript Engine
Implementing a JavaScript Engine
 
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
2013.02.02 지앤선 테크니컬 세미나 - Xcode를 활용한 디버깅 팁(OSXDEV)
 
Introduction to Kotlin Language and its application to Android platform
Introduction to Kotlin Language and its application to Android platformIntroduction to Kotlin Language and its application to Android platform
Introduction to Kotlin Language and its application to Android platform
 
Eval4j @ JVMLS 2014
Eval4j @ JVMLS 2014Eval4j @ JVMLS 2014
Eval4j @ JVMLS 2014
 
Ruby Presentation
Ruby Presentation Ruby Presentation
Ruby Presentation
 
Crystal internals (part 1)
Crystal internals (part 1)Crystal internals (part 1)
Crystal internals (part 1)
 
How to start using Scala
How to start using ScalaHow to start using Scala
How to start using Scala
 

Similaire à Type Profiler: An Analysis to guess type signatures

Connecting C++ and JavaScript on the Web with Embind
Connecting C++ and JavaScript on the Web with EmbindConnecting C++ and JavaScript on the Web with Embind
Connecting C++ and JavaScript on the Web with Embind
Chad Austin
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
NIKHIL NAWATHE
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
Sebastiano Panichella
 

Similaire à Type Profiler: An Analysis to guess type signatures (20)

Type Profiler: Ambitious Type Inference for Ruby 3
Type Profiler: Ambitious Type Inference for Ruby 3Type Profiler: Ambitious Type Inference for Ruby 3
Type Profiler: Ambitious Type Inference for Ruby 3
 
Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2Code for Startup MVP (Ruby on Rails) Session 2
Code for Startup MVP (Ruby on Rails) Session 2
 
Bringing nullability into existing code - dammit is not the answer.pptx
Bringing nullability into existing code - dammit is not the answer.pptxBringing nullability into existing code - dammit is not the answer.pptx
Bringing nullability into existing code - dammit is not the answer.pptx
 
Connecting C++ and JavaScript on the Web with Embind
Connecting C++ and JavaScript on the Web with EmbindConnecting C++ and JavaScript on the Web with Embind
Connecting C++ and JavaScript on the Web with Embind
 
TypeScript: Basic Features and Compilation Guide
TypeScript: Basic Features and Compilation GuideTypeScript: Basic Features and Compilation Guide
TypeScript: Basic Features and Compilation Guide
 
Building source code level profiler for C++.pdf
Building source code level profiler for C++.pdfBuilding source code level profiler for C++.pdf
Building source code level profiler for C++.pdf
 
Angular2
Angular2Angular2
Angular2
 
First Class Variables as AST Annotations
 First Class Variables as AST Annotations First Class Variables as AST Annotations
First Class Variables as AST Annotations
 
First Class Variables as AST Annotations
First Class Variables as AST AnnotationsFirst Class Variables as AST Annotations
First Class Variables as AST Annotations
 
Lua pitfalls
Lua pitfallsLua pitfalls
Lua pitfalls
 
Functions, List and String methods
Functions, List and String methodsFunctions, List and String methods
Functions, List and String methods
 
Triton and symbolic execution on gdb
Triton and symbolic execution on gdbTriton and symbolic execution on gdb
Triton and symbolic execution on gdb
 
The operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzerThe operation principles of PVS-Studio static code analyzer
The operation principles of PVS-Studio static code analyzer
 
Code Analysis-run time error prediction
Code Analysis-run time error predictionCode Analysis-run time error prediction
Code Analysis-run time error prediction
 
3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)
 
Introduction to C++
Introduction to C++Introduction to C++
Introduction to C++
 
Rapid Application Development using Ruby on Rails
Rapid Application Development using Ruby on RailsRapid Application Development using Ruby on Rails
Rapid Application Development using Ruby on Rails
 
Reducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code AnalysisReducing Redundancies in Multi-Revision Code Analysis
Reducing Redundancies in Multi-Revision Code Analysis
 
TypeScript . the JavaScript developer best friend!
TypeScript . the JavaScript developer best friend!TypeScript . the JavaScript developer best friend!
TypeScript . the JavaScript developer best friend!
 
Meta Object Protocols
Meta Object ProtocolsMeta Object Protocols
Meta Object Protocols
 

Plus de mametter

クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料
クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料
クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料
mametter
 
Cookpad Hackarade #04: Create Your Own Interpreter
Cookpad Hackarade #04: Create Your Own InterpreterCookpad Hackarade #04: Create Your Own Interpreter
Cookpad Hackarade #04: Create Your Own Interpreter
mametter
 

Plus de mametter (20)

error_highlight: User-friendly Error Diagnostics
error_highlight: User-friendly Error Diagnosticserror_highlight: User-friendly Error Diagnostics
error_highlight: User-friendly Error Diagnostics
 
TRICK 2022 Results
TRICK 2022 ResultsTRICK 2022 Results
TRICK 2022 Results
 
クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料
クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料
クックパッド春の超絶技巧パンまつり 超絶技巧プログラミング編 資料
 
Ruby 3の型解析に向けた計画
Ruby 3の型解析に向けた計画Ruby 3の型解析に向けた計画
Ruby 3の型解析に向けた計画
 
emruby: ブラウザで動くRuby
emruby: ブラウザで動くRubyemruby: ブラウザで動くRuby
emruby: ブラウザで動くRuby
 
型プロファイラ:抽象解釈に基づくRuby 3の静的解析
型プロファイラ:抽象解釈に基づくRuby 3の静的解析型プロファイラ:抽象解釈に基づくRuby 3の静的解析
型プロファイラ:抽象解釈に基づくRuby 3の静的解析
 
Ruby 3の型推論やってます
Ruby 3の型推論やってますRuby 3の型推論やってます
Ruby 3の型推論やってます
 
マニアックなRuby 2.7新機能紹介
マニアックなRuby 2.7新機能紹介マニアックなRuby 2.7新機能紹介
マニアックなRuby 2.7新機能紹介
 
Ruby 3 の型解析に向けた計画
Ruby 3 の型解析に向けた計画Ruby 3 の型解析に向けた計画
Ruby 3 の型解析に向けた計画
 
本番環境で使える実行コード記録機能
本番環境で使える実行コード記録機能本番環境で使える実行コード記録機能
本番環境で使える実行コード記録機能
 
Transcendental Programming in Ruby
Transcendental Programming in RubyTranscendental Programming in Ruby
Transcendental Programming in Ruby
 
Cookpad Hackarade #04: Create Your Own Interpreter
Cookpad Hackarade #04: Create Your Own InterpreterCookpad Hackarade #04: Create Your Own Interpreter
Cookpad Hackarade #04: Create Your Own Interpreter
 
Ruby 3のキーワード引数について考える
Ruby 3のキーワード引数について考えるRuby 3のキーワード引数について考える
Ruby 3のキーワード引数について考える
 
TRICK 2018 results
TRICK 2018 resultsTRICK 2018 results
TRICK 2018 results
 
Esoteric, Obfuscated, Artistic Programming in Ruby
Esoteric, Obfuscated, Artistic Programming in RubyEsoteric, Obfuscated, Artistic Programming in Ruby
Esoteric, Obfuscated, Artistic Programming in Ruby
 
Cookpad Spring 1day internship 2018 超絶技巧プログラミングコース資料
Cookpad Spring 1day internship 2018 超絶技巧プログラミングコース資料Cookpad Spring 1day internship 2018 超絶技巧プログラミングコース資料
Cookpad Spring 1day internship 2018 超絶技巧プログラミングコース資料
 
Esoteric, Obfuscated, Artistic Programming in Ruby
Esoteric, Obfuscated, Artistic Programming in RubyEsoteric, Obfuscated, Artistic Programming in Ruby
Esoteric, Obfuscated, Artistic Programming in Ruby
 
An introduction and future of Ruby coverage library
An introduction and future of Ruby coverage libraryAn introduction and future of Ruby coverage library
An introduction and future of Ruby coverage library
 
Ruby でつくる型付き Ruby
Ruby でつくる型付き RubyRuby でつくる型付き Ruby
Ruby でつくる型付き Ruby
 
Ruby で高速なプログラムを書く
Ruby で高速なプログラムを書くRuby で高速なプログラムを書く
Ruby で高速なプログラムを書く
 

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 

Type Profiler: An Analysis to guess type signatures

  • 1. Type Profiler: An analysis to guess type signatures Yusuke Endoh (@mametter) Cookpad Inc. RubyKaigi 2018 (2018/06/01)
  • 2. Yusuke Endoh (@mametter) • A full-time MRI committer @ Cookpad – w/ Koichi Sasada
  • 3. Recent achievement for Ruby 2.6 • Endless range [Feature #12912] (1..) Endless!
  • 4. Endless range • Take an array without the first element ary=["a","b","c"] ary[1..-1] #=> ["b","c"] ary.drop(1) #=> ["b","c"] ary[1..] #=> ["b","c"]
  • 5. Endless range • Loop from 1 to infinity i=1; loop { ……; i+=1 } (1..Float::INFINITY).each {……} 1.step {|i|……} (1..).each {|i|……}
  • 6. Endless range • each_with_index from index 1 i=1; ary.each { ……; i+=1 } ary.each.with_index(1){|x,i|……} ary.zip(1..) {|x,i|……}
  • 7. Endless range ✓Has been already committed in trunk ✓Will be included in Ruby 2.6 • Stay tuned! ary[1..] (1..).each {……} ary.zip(1..) {|x,i|……}
  • 8. Beginless range...? • Just have implemented yesterday [Feature #14799] (..1) Beginless!
  • 9. Today’s theme • Ruby3's type. • Some people held some meetings to discuss Ruby3's type – Matz, soutaro, akr, ko1, mame – Main objective: clarify matz's hidden requirements (and compromises) for Ruby3's type • (Not to decide everything behind closed door) • We'll explain the (current) requirements
  • 10. Agenda • A whirlwind tour of already-proposed "type systems" for Ruby • Type DB: A key concept of Ruby3's type system • A missing part: Type profiler
  • 11. A whirlwind tour of already-proposed "type systems" for Ruby
  • 12. Type-related systems for Ruby • Steep – Static type check • RDL – (Semi) static type check • contracts.ruby – Only dynamic check of arguments/return values • dry-types – Only dynamic checks of typed structs • RubyTypeInference (by JetBrains) – Type information extractor by dynamic analysis • Sorbet (by Stripe)
  • 13. RDL: Types for Ruby • Most famous in academic world – Jeff Foster at Univ. of Maryland – Accepted in OOPSLA, PLDI, and POPL! • The gem is available – https://github.com/plum-umd/rdl • We evaluated RDL – through writing type annotations for OptCarrot
  • 14. Basis for RDL # load RDL library require "rdl" class NES # activate type annotations for RDL extend RDL::Annotate # type annotation before method definition type "(?Array<String>) -> self", typecheck: :call def initialize(conf = ARGV) ...
  • 15. RDL type annotation • Accepts one optional parameter typed Array of String • Returns self – Always "self" for initialize method type "(?Array<String>) -> self", typecheck: :call def initialize(conf = ARGV) ...
  • 16. RDL type annotation • "typecheck" controls type check timing – :call: when this method is called – :now: when this method is defined – :XXX: when "RDL.do_typecheck :XXX" is done – nil: no "static check" is done • Used to type-check code that uses the method • Still "run-time check" is done type "(?Array<String>) -> self", typecheck: :call def initialize(conf = ARGV) ...
  • 17. Annotation for instance variables • Needs type annotations for all instance variables class NES # activate type annotations for RDL extend RDL::Annotate var_type :@cpu, "%any" type "() -> %any", typecheck: :call def reset @cpu.reset #=> receiver type %any not supported yet ...
  • 18. Annotation for instance variables • Needs type annotations for all instance variables class NES # activate type annotations for RDL extend RDL::Annotate var_type :@cpu, "[reset: () -> %any]" type "() -> %any", typecheck: :call def reset @cpu.reset #=> receiver type [reset: () -> %any] not sup ...
  • 19. Annotation for instance variables • Needs type annotations for all instance variables class NES # activate type annotations for RDL extend RDL::Annotate var_type :@cpu, "Optcarrot::CPU" type "() -> %any", typecheck: :call def reset @cpu.reset # error: no type information for # instance method `Optcarrot::CPU#reset'
  • 20. Annotation for instance variables • Type check succeeded class NES # activate type annotations for RDL extend RDL::Annotate type "Optcarrot::CPU","reset","()->%any" var_type :@cpu, "Optcarrot::CPU" type "() -> %any", typecheck: :call def reset @cpu.reset ...
  • 21. Requires many annotations... type "() -> %bot", typecheck: :call def reset @cpu.reset @apu.reset @ppu.reset @rom.reset @pads.reset @cpu.boot @rom.load_battery end
  • 22. Requires many annotations... type "() -> %bot", typecheck: nil def reset @cpu.reset @apu.reset @ppu.reset @rom.reset @pads.reset @cpu.boot @rom.load_battery end No static check
  • 23. … still does not work type "() -> %bot", typecheck: nil def reset ... @rom.load_battery #=> [65533] end # Optcarrot::CPU#reset: Return type error.… # Method type: # *() -> %bot # Actual return type: # Array # Actual return value: # [65533]
  • 24. Why? • typecheck:nil doesn't mean no check – Still dynamic check is done • %bot means "no-return" – Always raises exception, process exit, etc. – But this method returns [65533] – In short, this is my bug in the annotation type "() -> %bot", typecheck: nil def reset ... @rom.load_battery #=> [65533] end
  • 25. Lessons: void type • In Ruby, a lot of methods return a meaningless value – No intention to allow users to use the value • What type should we use in this case? – %any, or return nil explicitly? • We need a "void" type – %any for the method; it can return anything – "don't use" for users of the method def reset LIBRARY_INTERNAL_ARRAY. each { … } end
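The point of slide 25 can be checked directly: a method whose last expression is an `each` call returns the receiver of that call, even though no caller is meant to use it. The sketch below is only an illustration; the array contents are hypothetical stand-ins for a library-internal value.

```ruby
# Hypothetical stand-in for a library-internal array.
LIBRARY_INTERNAL_ARRAY = [:cpu, :apu, :ppu].freeze

def reset
  # Array#each returns its receiver, so the internal array leaks out
  # as the return value even though it carries no meaning for callers.
  LIBRARY_INTERNAL_ARRAY.each { |unit| unit.to_s }
end

result = reset
p result.equal?(LIBRARY_INTERNAL_ARRAY)
```

A dynamic observer would record Array as the return type here, which is the case a "void" type is meant to cover.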
  • 26. RDL's programmable annotation • RDL supports meta-programming symbols.each do |id| attr_reader_type id, "String" attr_reader id end
  • 27. RDL's programmable annotation • RDL supports pre-condition check – This can be also used to make type annotation automatically • I like this feature, but matz doesn't – He wants to avoid type annotations embedded in the code – He likes separated, non-Ruby type definition language (as Steep) pre(:belongs_to) do |name| …… type name, "() -> #{klass}" end
  • 28. Summary: RDL • Semi-static type check – The timing is configurable • It checks the method body – Not only dynamic check of arguments/return values • The implementation is mature – Many features actually work, great! • Needs type annotations • Supports meta-programming
  • 29. Steep • Snip: You did listen to soutaro's talk • Completely static type check • Separated type definition language – .rbi – But also requires (minimal?) type annotation embedded in .rb files
  • 30. Digest: contracts.ruby require 'contracts' class Example include Contracts::Core include Contracts::Builtin Contract Num => Num def double(x) x * 2 end end • RDL-like type annotation – Run-time type check
  • 31. Digest: dry-types require 'dry-types' require 'dry-struct' module Types include Dry::Types.module end class User < Dry::Struct attribute :name, Types::String attribute :age, Types::Integer end • Can define structs with typed fields – Run-time type check – "type_struct" gem is similar
  • 32. Digest: RubyTypeInference • Type information extractor by dynamic analysis – Run test suites under monitoring of TracePoint API – Hooks method call/return events, logs the passed values, and aggregates them to type information – Used by RubyMine IDE
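The call/return hooking described on slide 32 can be sketched with Ruby's built-in TracePoint class. This is not RubyTypeInference's actual code; it is a minimal illustration that assumes Ruby 2.6 or later (for `TracePoint#parameters`), and the `double` method and the `observed` hash are hypothetical names.

```ruby
# Collect observed argument and return classes per method with TracePoint.
observed = Hash.new { |h, k| h[k] = { args: [], ret: [] } }

tracer = TracePoint.new(:call, :return) do |tp|
  key = "#{tp.defined_class}##{tp.method_id}"
  if tp.event == :call
    # Read each declared parameter's value from the method's binding.
    arg_classes = tp.parameters.map do |_kind, name|
      tp.binding.local_variable_get(name).class
    end
    observed[key][:args] |= [arg_classes]
  else
    observed[key][:ret] |= [tp.return_value.class]
  end
end

# Hypothetical method under observation.
def double(x)
  x * 2
end

# Only calls made inside this block are traced.
tracer.enable do
  double(21)
  double("ab")
end

observed.each do |name, sig|
  puts "#{name}: args=#{sig[:args].inspect} ret=#{sig[:ret].inspect}"
end
```

Each distinct combination of argument classes is kept once, so the log stays small even when a method is called many times with the same kinds of values.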
  • 34. Summary of Type Systems (columns: Objective / Targets / Annotations)
      – Steep: Static type check / Method body / Separated (mainly)
      – RDL: Semi-static type check / Method body / Embedded in code
      – contracts.ruby: Dynamic type check / Arguments and return values / Embedded in code
      – dry-types: Typed structs / Only Dry::Struct classes / Embedded in code
      – RubyTypeInference: Extract type information / Arguments and return values / N/A
  • 35. Type DB: A key concept of Ruby3's Type System
  • 36. Idea • Separated type definition file is good • But meta-programming like attr_* is difficult to support – Users will try to generate it programmatically • We may want to keep code position – To show lineno of code in type error report – Hard to manually keep the correspondence between type definition and code position in .rbi file – We may also want to keep other information
  • 37. Type DB (diagram; labels in order: Type DB, Steep type definition, typecheck, Steep, RDL/Sorbet type annotation, RDL, typecheck, better error report, Ruby interpreter, IDE)
  • 38. How to create Type DB (diagram; labels in order: Type DB, Steep type definition, Ruby code, write manually, compile, stdlib, Already included, RubyTypeInference, automatically extract by dynamic analysis, Type Profiler)
  • 39. A missing part: Type Profiler
  • 40. Type Profiler • Another way to extract type information from Ruby code – An alternative to "RubyTypeInference" • Is not a type inference – Type inference of Ruby is hopeless – Conservative static type inference can extract little information • Type profiler "guesses" type information – It may extract wrong type information – Assumes that user checks the result
  • 41. Type Profilers • There is no "one-for-all" type profiler – Static type profiling cannot handle ActiveRecord – Dynamic type profiling cannot extract syntactic features (like void type) • We need a variety of type profilers – For ActiveRecord by reading DB schema – Extracting from RDoc/YARD
  • 42. In this talk • We prototyped three more generic type profilers – Static analysis 1 (SA1) • Mainly for user-defined classes – Static analysis 2 (SA2) • Mainly for builtin classes – Dynamic analysis (DA) • Enhancement of "RubyTypeInference"
  • 43. SA1: Idea • Guess a type of formal parameters based on called method names class FooBar def foo(...); ...; end def bar(...); ...; end end def func(x) #=> x:FooBar x.foo(1) x.bar(2) end
  • 44. SA1: Prototyped algorithm • Gather method definitions in each class/module – FooBar={foo,bar} • Gather method calls for each parameter – x={foo,bar} – Remove general methods (like #[] and #+) to reduce false positives – Arity, parameter and return types aren't used • Assign a class that all methods match class FooBar def foo(...);...;end def bar(...);...;end end def func(x) x.foo(1) x.bar(2) end
  • 45. SA1: Evaluation • Experimented with SA1 on WEBrick – As a sample code that has many user-defined classes • Manually checked the guessed result – Found some common guessing failures • Wrong result / no-match result – No quantitative evaluation yet
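The matching step of slide 44 can be sketched as a set comparison. This is not the speaker's prototype: it assumes the method definitions and the calls on each parameter have already been gathered from the source, and the function name, the input hash, and the list of "general" methods are all illustrative.

```ruby
require "set"

# Guess a parameter's class from the method names called on it.
# class_methods: { class name => Set of method names the class defines }
# calls:         method names called on the parameter
# general:       names ignored because too many classes define them
def guess_classes(class_methods, calls, general: Set.new([:[], :+, :to_s, :==]))
  specific = calls.to_set - general
  return [] if specific.empty? # nothing to go on, e.g. an unused parameter

  # Keep every class that defines all of the remaining method names.
  class_methods.select { |_name, defined| specific.subset?(defined) }.keys
end

class_methods = {
  "FooBar" => Set[:foo, :bar],
  "Foo"    => Set[:foo],
  "Baz"    => Set[:baz, :to_s],
}

p guess_classes(class_methods, [:foo, :bar]) # x.foo(1); x.bar(2)
p guess_classes(class_methods, [:foo])       # several classes may match
p guess_classes(class_methods, [:+])         # only a general method was called
```

An empty result corresponds to the `#{}` entries in the talk's output, and a result with several classes corresponds to a union such as `HTTPResponse | HTTPRequest`.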
  • 46. SA1: Problem 1 • A parameter is not used • Many methods are affected def do_GET(req, res) raise HTTPStatus::NotFound, "not found." end DefaultFileHandler#do_GET(req:#{}, res:HTTPResponse) FileHandler#do_GET(req:#{}, res:#{}) AbstractServlet#do_GET(req:#{}, res:#{}) ProcHandler#do_GET(request:#{}, response:#{}) ERBHandler#do_GET(req:#{}, res:HTTPResponse)
  • 47. SA1: Problem 2 • Incomplete guessing • Cause – the method calls req.request_uri – Both HTTPResponse and HTTPRequest provide request_uri HTTPProxyServer#perform_proxy_request( req: HTTPResponse | HTTPRequest, res: WEBrick::HTTPResponse, req_class:#{new}, :nil)
  • 48. (Arguable) solution? • Exploit the name of parameter – Create a mapping from parameter name to type after profiling • "req" → HTTPRequest – Revise guessed types using the mapping • Fixed! DefaultFileHandler#do_GET(req:HTTPRequest, res:HTTPResponse) FileHandler#do_GET(req:HTTPRequest, res:HTTPResponse) AbstractServlet#do_GET(req:HTTPRequest, res:HTTPResponse) ProcHandler#do_GET(request:#{}, response:#{}) ERBHandler#do_GET(req:HTTPRequest, res:HTTPResponse) CGIHandler#do_GET(req:HTTPRequest, res:HTTPResponse)
  • 49. SA1: Problem 3 • Cannot guess return type • Can guess in only limited cases – Returns formal parameter – Returns a literal or "Foo.new" – Returns an expression which is already included in Type DB • See actual usage of the method? – Requires inter-procedural or whole-program analysis!
  • 50. SA1: Pros/Cons • Pros – No need to run tests – Can guess void type • Cons – Hard when parameters are not used • This is not a rare case – Heuristics may work, but can cause wrong guesses
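The parameter-name heuristic of slide 48 can be sketched in two passes. The talk does not say how the name-to-type mapping is built, so the majority vote over unambiguous guesses below is an assumption, as are the function name and the input data (which is made up for illustration and is not the talk's actual result).

```ruby
# guesses: { "Class#method" => { parameter name => [candidate type names] } }
def revise_by_param_name(guesses)
  # Pass 1: for each parameter name, tally the unambiguous guesses.
  votes = Hash.new { |h, k| h[k] = Hash.new(0) }
  guesses.each_value do |params|
    params.each do |name, types|
      votes[name][types.first] += 1 if types.size == 1
    end
  end
  mapping = votes.transform_values { |tally| tally.max_by { |_type, n| n }.first }

  # Pass 2: replace empty or ambiguous guesses when the name is in the mapping.
  guesses.transform_values do |params|
    params.to_h do |name, types|
      keep = (types.size == 1) || !mapping.key?(name)
      [name, keep ? types : [mapping[name]]]
    end
  end
end

# Illustrative input only.
guesses = {
  "HTTPServer#service"     => { "req" => ["HTTPRequest"], "res" => ["HTTPResponse"] },
  "AbstractServlet#do_GET" => { "req" => [], "res" => [] },
  "HTTPProxyServer#perform_proxy_request" =>
    { "req" => ["HTTPResponse", "HTTPRequest"], "res" => ["HTTPResponse"] },
  "ProcHandler#do_GET"     => { "request" => [], "response" => [] },
}

revise_by_param_name(guesses).each { |meth, params| puts "#{meth}: #{params}" }
```

As on the slide, methods whose parameters use a different name (`request`, `response`) stay unresolved, because no mapping entry exists for those names.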
  • 51. SA2: Idea • I believe this method expects Numeric! def add_42(x) #=> (x:Num)=>Num x + 42 end
  • 52. SA2: Prototyped algorithm • Limited type DB of stdlib – Num#+(Num) → Num – Str#+(Str) → Str, etc. • "Unification-based type-inference" inspired algorithm – searches "α#+(Num) → β" – Matches "Num#+(Num) → Num" • Type substitution: α=Num, β=Num x + 42
  • 53. SA2: Prototyped algorithm (2) • When multiple candidates are found – matches: • Num#<<(Num) → Num • Str#<<(Num) → Str • Array[α]#<<(α) → Array[α] – Just take union types of them • (Overloaded types might be better) def push_42(x) x << 42 end #=> (x:(Num|Str|Array))=>(Num|Str|Array) x << 42
  • 54. SA2: Evaluation • Experimented with SA2 on OptCarrot – As a sample code that uses many builtin types • Manually checked the guessed result – Found some common guessing failures • Wrong result / no-match result – No quantitative evaluation yet
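The lookup on slides 52 and 53 can be approximated by searching a small type DB for every entry whose method name and argument type fit the call, then taking the union of the matches. This is a simplified sketch rather than real unification: the DB format, the `:elem` marker for Array's element type variable, and the function name are all assumptions.

```ruby
# type_db rows are [receiver, method, argument, return].
# :elem marks a type variable that matches any argument type.
def guess_call(type_db, method, arg_type)
  matches = type_db.select do |_recv, meth, param, _ret|
    meth == method && (param == arg_type || param == :elem)
  end
  {
    receiver: matches.map { |recv, *| recv }.uniq, # union of receiver types
    returns:  matches.map { |*, ret| ret }.uniq,   # union of return types
  }
end

type_db = [
  ["Num",   :+,  "Num", "Num"],
  ["Str",   :+,  "Str", "Str"],
  ["Num",   :<<, "Num", "Num"],
  ["Str",   :<<, "Num", "Str"],
  ["Array", :<<, :elem, "Array"],
]

p guess_call(type_db, :+, "Num")  # x + 42
p guess_call(type_db, :<<, "Num") # x << 42
```

The second call matches three entries, which gives the union Num, Str, Array for both the receiver and the return type, as in the `push_42` example.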
  • 55. SA2: Problem 1 • Surprising result – Counterintuitive, but actually it works with @fetch:Array[Num|Str] def peek16(addr) @fetch[addr] + (@fetch[addr + 1] << 8) end # Optcarrot::CPU#peek16(Num) => (Num|Str)
  • 56. SA2: Problem 2 • Difficult to handle type parameters – Requires constraint-based type-inference @ary = [] # Array[α] @ary[0] = 1 # unified to Array[Num] @ary[1] = "str" # cannot unify Num and Str
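The failure on slide 56 can be reproduced with a toy unifier that tracks a single element type. This is an illustrative sketch only; using `nil` for the unbound type variable α and the `unify` helper itself are assumptions, not part of the prototype.

```ruby
# Unify the element type known so far with the type of a newly stored value.
# nil stands for the still-unbound type variable α.
def unify(current, incoming)
  return incoming if current.nil?
  return current if current == incoming

  raise TypeError, "cannot unify #{current} and #{incoming}"
end

elem = nil                  # @ary = []        (Array[α])
elem = unify(elem, "Num")   # @ary[0] = 1      (α bound to Num)
begin
  elem = unify(elem, "Str") # @ary[1] = "str"  (conflicts with Num)
rescue TypeError => e
  puts e.message
end
```

A constraint-based approach would instead collect both observations and resolve them afterwards, for example into a union type.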
  • 57. SA2: Pros/Cons • Pros – No need to run tests – Can guess void type – Can guess parameters that are not used as a receiver • Cons – Can cause wrong guesses – Hard to handle type parameters (Array[α]) – Hard to scale • The bigger the type DB is, the more wrong results will happen
  • 58. DA: Idea • Recording actual inputs/output of methods by using TracePoint API – The same as RubyTypeInference • Additional features – Support block types • Required enhancement of TracePoint API – Support container types: Array[Int] • By sampling elements
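The container-type support on slide 58 can be sketched as a function that describes a value's type while looking at only a few array elements. The talk does not say how elements are sampled, so taking the first few is an assumption, as are the function name and the `sample` parameter. Ruby's class name is Integer where the slide writes Int.

```ruby
# Describe a value's type, inspecting at most `sample` elements of an array
# instead of visiting all of them.
def type_of(value, sample: 3)
  return value.class.name unless value.is_a?(Array)
  return "Array[]" if value.empty?

  elems = value.first(sample).map { |v| type_of(v, sample: sample) }.uniq
  "Array[#{elems.join('|')}]"
end

puts type_of([65533])
puts type_of([1, "a"])
puts type_of([[1, 2], [3]])

# Sampling trades accuracy for speed: the String at the end is never seen.
puts type_of(Array.new(100_000, 0) + ["x"])
```

The last call shows the cost of sampling: the result stays cheap to compute for large arrays, but elements outside the sample do not contribute to the guessed type.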
  • 59. DA: Evaluation • Evaluated with OptCarrot and WEBrick • It works easily and robustly
  • 60. DA: Problem 1 • Very slow (in some cases) – Recording OptCarrot may take hours – Element-sampling for Array made it faster, but it still takes a few minutes • Without tracing, it runs in a few seconds – It may depend on application • Profiling WEBrick is not so slow
  • 61. DA: Problem 2 • Cannot guess void type – Many methods return garbage – DA cannot distinguish garbage and intended return value • SA can guess void type by heuristic – Integer#times, Array#each, etc. – if statement that has no "else" – while and until statements – Multiple assignment • (Steep scaffold now supports some of them)
  • 62. DA: Problem 3 • Some tests confuse the result – Need to ignore error-handling tests by cooperating with the test framework assert_raise(TypeError) { … }
  • 63. DA: Pros/Cons • Pros – Easy to implement, and robust – It can profile any program • Including meta-programming like ActiveRecord • Cons – Need to run tests; it might be very slow – Hard to handle void type – TracePoint API is not enough yet – Need to cooperate with test frameworks
  • 64. Conclusion • Reviewed already-proposed type systems for Ruby – Whose implementations are available • Type DB: Ruby3's key concept • Some prototypes and experiments of type profilers – Need more improvements / experiments!