Towards hasktorch 1.0
1. Towards Hasktorch 1.0
Junji Hashimoto, Sam Stites, and Austin Huang
on behalf of the Hasktorch Contributor Team
2. Contributor Team
● Junji Hashimoto
● Austin Huang
● Adam Paszke
● Sam Stites
Google Summer of Code
● Atreya Ghosal
● Jesse Segal
3. Outline
● What is hasktorch?
● Codegen
○ Architecture (easy to maintain)
○ How to connect the C++ library
○ Reference counting and garbage collection
● Pure functions and ML models
○ High-level API, but pure
○ Easy to write, numpy-like
○ ML models (spec, data model, and computational model)
● Examples
○ Gaussian process -- not using gradients
○ Gated recurrent unit -- using gradients
● Results
● Next steps
4. What is hasktorch?
● A general machine learning framework
● Type safe
● Differentiable programming
● Brings PyTorch's power into Haskell
● Easy to use (based on pure functions)

Hasktorch = Haskell + PyTorch
5. Codegen
● PyTorch has thousands of functions in its API.
● The APIs change constantly (a stable release ships every three months).
● The base library is written in C++.

Hasktorch therefore generates the Haskell bindings from PyTorch's spec. The key questions:
● What is generated?
● How do we connect to C++?
● How do we manage memory?
● Is the result efficient for computation?
6. Architecture
Codegen reads libtorch's spec and generates Haskell code (about 1,500 functions). The layers, from bottom to top:
● libtorch (C++): at::Tensor
● Raw layer without GC: Ptr Tensor
● Managed layer with GC: ForeignPtr Tensor
● High-level API with pure functions: Tensor
● High-level API with static types (under development): Tensor Float '[3,3]
7. How to connect C++?
● inline-c-cpp
○ Write Haskell and generate the C++ glue code at compile time.

Haskell (Template Haskell quasiquote):

  main = do
    let x = 1
    print [C.pure| double { cos($(double x)) } |]

Generated Haskell FFI binding:

  foreign import ccall "cos" c_cos :: Double -> Double
  main = do
    let x = 1
    print (c_cos x)

Generated C++:

  extern "C" {
    double c_cos (double x) {
      return cos(x);
    }
  }
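The generated-binding half of the example above is already compilable on its own; a minimal self-contained version (assuming only GHC and the C math library, which GHC links by default):

```haskell
-- Bind the C library's cos directly, as the generated FFI code does.
foreign import ccall "cos" c_cos :: Double -> Double

main :: IO ()
main = do
  let x = 1
  print (c_cos x)
```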
8. How to connect objects of C++?
● Connect via the object's pointer.
● Namespaces (at::Tensor) and templates (std::vector<at::Tensor>) are supported.

  tensor_cos :: Ptr Tensor -> Ptr Tensor
  tensor_cos x =
    [C.pure| at::Tensor* {
      return new at::Tensor(cos(*$(at::Tensor* x)))
    } |]

On the Haskell side a Ptr Tensor goes in and a Ptr Tensor comes out; on the C++ side the pointer is dereferenced, cos is applied, and a new at::Tensor is allocated for the result.
9. How to manage memory?
● Use GHC's garbage collection via ForeignPtr.
● A generic cast converts a Ptr-based function into a ForeignPtr-based one with GC attached.

  tensor_cos :: Ptr Tensor -> IO (Ptr Tensor)
  tensor_cos' :: ForeignPtr Tensor -> IO (ForeignPtr Tensor)
  tensor_cos' = cast tensor_cos

Does GHC know a Tensor's memory usage? It does not: the data lives in libtorch, outside GHC's heap. So:
● A memory monitor and a memory-exception checker watch both CPU and GPU memory on the libtorch side.
● When memory runs low, they call performGC so the GHC runtime releases unreferenced tensors.
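The Ptr-to-ForeignPtr step can be sketched without libtorch: attach a finalizer to a raw pointer so GHC's GC frees it, then force a collection with performGC. (mkManaged below is a hypothetical stand-in for Hasktorch's generic cast, using malloc/free in place of libtorch allocation.)

```haskell
import Foreign.Ptr (Ptr)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (mallocBytes, finalizerFree)
import Foreign.Storable (peek, poke)
import System.Mem (performGC)

-- Hypothetical stand-in for the generic cast: wrap a malloc'd Ptr so the
-- GHC garbage collector frees it via the attached finalizer.
mkManaged :: Ptr a -> IO (ForeignPtr a)
mkManaged = newForeignPtr finalizerFree

main :: IO ()
main = do
  p <- mallocBytes 8            -- raw allocation, like a tensor from libtorch
  poke p (42 :: Double)
  fp <- mkManaged p             -- now owned by the GC
  withForeignPtr fp peek >>= print
  performGC                     -- what Hasktorch triggers when libtorch memory runs low
```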
10. How to connect objects of C++?
● Is it OK to call new at::Tensor for every result?
● Yes: an at::Tensor is just a reference counter, so the cost of creating one is low.

Each ForeignPtr Tensor owns its own at::Tensor (reference counter), while several at::Tensor values can share one c10::TensorImpl (the numerical array).
11. Outline
● What is hasktorch?
● Codegen
○ Architecture (easy to maintain)
○ How to connect the C++ library
○ Reference counting and garbage collection
● Pure functions and ML models
○ High-level API, but pure
○ Easy to write, numpy-like
○ ML models (spec, data model, and computational model)
● Examples
○ Gaussian process -- not using gradients
○ Gated recurrent unit -- using gradients
● Results
● Next steps
12. High-level API: Pure functions and ML models
(Adam Paszke)
● The high-level API (without typed dimensions) consists of pure functions: a function such as sin simply maps a Tensor to a Tensor.
● Design pattern for ML models: Spec, data model, and computational model.

LinearSpec
- input size : 10
- output size : 1

Linear
- weight : Tensor [10,1]
- bias : Tensor [1]

  linear :: Linear -> Tensor -> Tensor
  linear Linear{..} input =
    (input `matmul` weight) + bias
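The three-part pattern can be sketched without tensors, using plain lists in place of Tensor (the names mirror the slide; the list-based types, zero initialisation, and initLinear are illustrative only, not Hasktorch's API):

```haskell
import Data.List (transpose)

-- Spec: hyperparameters only.
data LinearSpec = LinearSpec { inputSize :: Int, outputSize :: Int }

-- Data model: the learnable parameters, here nested lists instead of Tensor.
data Linear = Linear { weight :: [[Float]], bias :: [Float] }

-- Init: the spec builds a data model (zero-initialised for simplicity;
-- Hasktorch would sample random weights here).
initLinear :: LinearSpec -> Linear
initLinear (LinearSpec i o) =
  Linear { weight = replicate i (replicate o 0), bias = replicate o 0 }

-- Computational model: a pure function from parameters and input to output.
linear :: Linear -> [Float] -> [Float]
linear (Linear w b) input =
  zipWith (+) b [ sum (zipWith (*) input col) | col <- transpose w ]

main :: IO ()
main = print (linear (initLinear (LinearSpec 10 1)) (replicate 10 1))
```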
13. Pure functions
● Almost all of the high-level API consists of pure functions.
● Even autograd is exposed as a pure function.

  newtype IndependentTensor = IndependentTensor Tensor

  grad :: Tensor -> [IndependentTensor] -> [Tensor]
  grad y inputs = unsafePerformIO …

  main = do
    xi <- makeIndependent $ ones' [2,2]
    yi <- makeIndependent $ ones' [2,2]
    let x = toDependent xi
        y = toDependent yi
        z = x * x * y
    print (grad z [xi])

makeIndependent allocates a fresh memory area for each input (xi and yi); z is computed from them, and grad z [xi] differentiates z with respect to xi.
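What grad computes can be cross-checked with a scalar finite-difference sketch (plain Haskell, no autograd; numGrad is an illustrative helper, not Hasktorch code). For z = x·x·y, the derivative in x is 2xy, so at x = y = 1 each element's gradient is analytically 2:

```haskell
-- Central finite difference: a numerical stand-in for autograd on scalars.
numGrad :: (Double -> Double) -> Double -> Double
numGrad f x = (f (x + h) - f (x - h)) / (2 * h)
  where h = 1e-5

main :: IO ()
main = do
  let y = 1
      z x = x * x * y          -- the same z = x * x * y as above, per element
  print (numGrad z 1)          -- analytically 2 * x * y = 2
```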
14. Pure functions
● List <-> 1-d Tensor
● List of lists <-> 2-d Tensor

  > t = asTensor ([0..3] :: [Float])
  > t2 = asTensor ([0..3] :: [Float])
  > t + t2
  Tensor Float [4] [ 0.0000, 2.0000, 4.0000, 6.0000]
  > asValue (t + t2) :: [Float]
  [0.0,2.0,4.0,6.0]

  > t = asTensor ([[0,1],[2,3]] :: [[Float]])
  > t2 = asTensor ([[3,2],[2,1]] :: [[Float]])
  > matmul t t2
  Tensor Float [2,2] [[  2.0000, 1.0000],
                      [ 12.0000, 7.0000]]
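The list round-trip above has a direct plain-Haskell reading; a sketch of the same elementwise add and 2×2 matmul on nested lists (addV and matmulL are illustrative helpers, not the hasktorch API):

```haskell
import Data.List (transpose)

-- Elementwise addition of 1-d "tensors" represented as lists.
addV :: [Float] -> [Float] -> [Float]
addV = zipWith (+)

-- Matrix product of 2-d "tensors" represented as lists of rows.
matmulL :: [[Float]] -> [[Float]] -> [[Float]]
matmulL a b = [ [ sum (zipWith (*) row col) | col <- transpose b ] | row <- a ]

main :: IO ()
main = do
  print (addV [0..3] [0..3])                      -- [0.0,2.0,4.0,6.0]
  print (matmulL [[0,1],[2,3]] [[3,2],[2,1]])     -- [[2.0,1.0],[12.0,7.0]]
```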
15. Implementation Pattern for Machine Learning Models
● A model consists of a spec, a data model, and a computational model.
● The spec initializes the data model, carrying dimensions and some functions.
○ This matters for tensors without typed dimensions, where sizes are not visible in the types.
● The data model holds the parameters being learned.
● The computational model computes tensor values from the parameters and the input.

Init takes the Spec to a Data Model; the Compute Model reads it; Autograd produces gradients, which replace the parameters in the Data Model.

LinearSpec
- input size : 10
- output size : 1

Linear
- weight : Tensor [10,1]
- bias : Tensor [1]

  linear :: Linear -> Tensor -> Tensor
  linear Linear{..} input =
    (input `matmul` weight) + bias
19. Training Loop
Init builds the data model (the state) from the spec; the compute model (model) runs on it; the parameters are flattened and handed to autograd (grad); the resulting gradients replace the parameters in the state, and the loop repeats.
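The loop (init → compute → grad → replace parameters) can be sketched on a one-parameter model in plain Haskell, with the analytic gradient standing in for autograd; the learning rate, data point, and the step helper are all illustrative:

```haskell
-- One gradient-descent step for fitting y = w * x by squared error.
-- loss(w) = (w*x - y)^2, so dloss/dw = 2 * (w*x - y) * x  (autograd stand-in).
step :: Double -> (Double, Double) -> Double -> Double
step lr (x, y) w = w - lr * 2 * (w * x - y) * x   -- "replace parameters"

main :: IO ()
main = do
  let trainingData = (2, 6)                        -- true solution: w = 3
      w' = iterate (step 0.1 trainingData) 0 !! 100
  print w'                                         -- converges toward 3
```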
20. Outline
● What is hasktorch?
● Codegen
○ Architecture (easy to maintain)
○ How to connect the C++ library
○ Reference counting and garbage collection
● Pure functions and ML models
○ High-level API, but pure
○ Easy to write, numpy-like
○ ML models (spec, data model, and computational model)
● Examples
○ Gaussian process -- not using gradients
○ Gated recurrent unit -- using gradients
● Results
● Next steps
21. Gaussian Process
Figure from "A Visual Exploration of Gaussian Processes" by Jochen Görtler, Rebecca Kehlbeck, and Oliver Deussen,
https://distill.pub/2019/visual-exploration-gaussian-processes/
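A Gaussian process needs only a kernel and linear algebra, no gradients; a minimal sketch in plain Haskell using the standard squared-exponential (RBF) kernel (a textbook choice, not code from this talk; rbf and covMatrix are illustrative names):

```haskell
-- Squared-exponential (RBF) kernel, the usual default in GP introductions.
rbf :: Double -> Double -> Double -> Double
rbf lengthScale x x' = exp (negate ((x - x') ^ 2) / (2 * lengthScale ^ 2))

-- Covariance (Gram) matrix of the GP prior over a set of input points.
covMatrix :: Double -> [Double] -> [[Double]]
covMatrix l xs = [ [ rbf l x x' | x' <- xs ] | x <- xs ]

main :: IO ()
main = mapM_ print (covMatrix 1.0 [0, 1, 2])   -- 1.0 on the diagonal
```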
27. Results
● Hasktorch = Haskell + PyTorch
● Codegen generates thousands of APIs.
● Garbage collection manages both CPU and GPU memory.
● High-level API == pure functions and the ML-model pattern
● Examples of practical ML models are available:
○ Gaussian process
○ recurrent networks (GRU, LSTM, and Elman)
28. Next Steps
● Refine the high-level interfaces, including typed dimensions
○ Broadcast operators; a high-level LSTM with Naperian functors (Jesse Segal)
● More examples as test cases (e.g. MNIST)
● Optimizers beyond SGD
● Second major release around Q4 2019!
Contributors Welcome!
https://github.com/hasktorch/hasktorch/