NumPy and SciPy: History and Ideas for the Future
Travis E. Oliphant

Travis E. Oliphant, "NumPy and SciPy: History and Ideas for the Future"

Travis E. Oliphant gave an invited talk titled "NumPy and SciPy: History and Ideas for the Future" at Tokyo.SciPy#3 on Mar. 18th, 2012.

  1. NumPy and SciPy: History and Ideas for the Future. Travis E. Oliphant. SciPy Tokyo, March 18, 2012. Saturday, March 17, 12
  2. My Roots
  3. My Roots. Images from BYU Mers Lab
  4. Science led to Python
      $\rho_0 (2\pi f)^2 U_i(a, f) = [C_{ijkl}(a, f)\, U_{k,l}(a, f)]_{,j}$
      Raja Muthupillai, Richard Ehman, Armando Manduca. 1997
  5. Finding derivatives of 5-d data
      $\Xi = \nabla \times U$
  6. Scientist at heart
  7. Python origins. http://python-history.blogspot.com/2009/01/brief-timeline-of-python.html
      Version   Date
      0.9.0     Feb. 1991
      0.9.4     Dec. 1991
      0.9.6     Apr. 1992
      0.9.8     Jan. 1993
      1.0.0     Jan. 1994
      1.2       Apr. 1995
      1.4       Oct. 1996
      1.5.2     Apr. 1999
  9. Brief History
      Person                                       Package                   Year
      Jim Fulton                                   Matrix Object in Python   1994
      Jim Hugunin                                  Numeric                   1995
      Perry Greenfield, Rick White, Todd Miller    Numarray                  2001
      Travis Oliphant                              NumPy                     2005
  10. Found Python and Numeric in 1997
      I was a fairly proficient MATLAB user, but it was not memory efficient enough.
      • Loved the expressive syntax of Python
      • Loved the fact that slicing didn’t make copies
      • Loved the existing multiple data-types
      • Loved how much more flexible it was to extend than MATLAB was
      • Loved that I could read the source code and extend it
  11. First problem: Efficient Data Input. “It’s All About the Data”
      • TableIO, Michael A. Miller, April 1998
      • Reference Counting Essay (http://www.python.org/doc/essays/refcnt/), Guido van Rossum, May 1998
      • NumPyIO, June 1998
  13. Early pieces of SciPy
      • fftw wrappers, June 1998
      • cephesmodule, November 1998
      • stats.py, December 1998, Gary Strangman
  14. 1999: Early SciPy emerges
      Discussions on the matrix-sig from 1997 to 1999 wanting a complete data analysis environment: Paul Barrett, Joe Harrington, Perry Greenfield, Paul Dubois, Konrad Hinsen, and others. Activity in 1998 led to increased interest in 1999. In response, on 15 Jan 1999 I posted to matrix-sig a list of routines I felt needed to be present and began wrapping / writing in earnest. On 6 April 1999, I announced I would be creating this uber-package, which eventually became SciPy.
      Item                                          Date
      Gaussian quadrature                           5 Jan 1999
      cephes 1.0                                    30 Jan 1999
      sigtools 0.40                                 23 Feb 1999
      Numeric docs                                  March 1999
      cephes 1.1                                    9 Mar 1999
      multipack 0.3                                 13 Apr 1999
      Helper routines                               14 Apr 1999
      multipack 0.6 (leastsq, ode, fsolve, quad)    29 Apr 1999
      sparse plan described                         30 May 1999
      multipack 0.7                                 14 Jun 1999
      SparsePy 0.1                                  5 Nov 1999
      cephes 1.2 (vectorize)                        29 Dec 1999
      Helping with f2py
      Plotting??: Gist, XPLOT, DISLIN, Gnuplot
  15. Early 2000: Numeric needs
      • Memory mapped arrays
      • Rank-0 arrays or scalars
      • Handling indirect indexing: a[[10,5,7]]
      • Handling masked indexing: a[[True, False, False]]
      • More attributes to N-d arrays
      • “Record arrays”
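Most items on this wish list have a counterpart in present-day NumPy. The sketch below illustrates them with the current NumPy API, not with Numeric as it existed in 2000; the array contents, field names, and temporary file are assumptions chosen only for illustration.

```python
import os
import tempfile

import numpy as np

a = np.arange(0, 200, 10)          # illustrative data: 0, 10, ..., 190

# Indirect ("fancy") indexing with a list of integer positions.
picked = a[[10, 5, 7]]

# Masked indexing with a boolean array of the same length.
mask = a > 150
large = a[mask]

# Rank-0 array versus array scalar.
rank0 = np.array(3.5)              # ndim == 0, shape == ()
scalar = rank0[()]                 # indexing with () yields a NumPy scalar

# Record (structured) array with named fields (hypothetical field names).
rec = np.zeros(2, dtype=[("name", "S8"), ("value", "f8")])
rec["value"] = [1.5, 2.5]

# Memory-mapped array backed by a temporary file.
path = os.path.join(tempfile.mkdtemp(), "demo.dat")
mm = np.memmap(path, dtype="f8", mode="w+", shape=(4,))
mm[:] = [1.0, 2.0, 3.0, 4.0]
mm.flush()
readback = np.memmap(path, dtype="f8", mode="r", shape=(4,))
```

The memory-mapped array is read back through a second `np.memmap` object to show that the data lives in the file rather than in the first object.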
  16. Early contributors in 1999
      • Pearu Peterson
        – Hosting of first Multipack CVS repository (June 1999)
        – Amazing makefiles
        – Interface to FITPACK
        – Wrote f2py as he watched my brute-force approach (July 1999)
      • Janko Hauser
        – (IPP) Early IPython interactive environment (27 Apr 1999)
        – Matlab file reader (24 Apr 1999)
      • Robert Kern
        – Created Windows binaries of multipack, cephesmodule, fftw, and signaltools (June 1999 while still in high school!)
  21. 2000-2001
  22. SciPy 2001 Travis Oliphant optimize sparse interpolate integrate special signal stats Founded in 2001 with Travis Vaught fftpack misc Eric Jones weave cluster Pearu Peterson GA* linalg interpolate f2py
  23. Finally... Plotting. John Hunter, 2001
  24. IPython (building on IPP). Fernando Perez, Dec 10, 2001
  25. STScI leads out with Numarray: Perry Greenfield, J. Todd Miller, Rick White, Paul Barrett
  26. Numarray released in 2003
      • too slow for small arrays
      • incomplete implementation for ufuncs
      • minimal Numeric code re-use
      • lots of very nice things, though (e.g. memory maps, fast code for large arrays, better sorting algorithms)
  27. Split in the community: Numeric SciPy Numarray ndimage others
  28. ndimage
  29. NumPy: started January 2005, Version 1.0 October 2006. Key contributions from: Numarray, Numeric, Chuck Harris, Robert Kern, David Cooke, Pierre GM
  30. NumPy in Python
      Q: When is NumPy going to be part of the core Python?
      A: Data structure is in Python as PEP 3118
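PEP 3118 is the revised buffer protocol, which lets objects describe an N-dimensional block of memory (shape, strides, item size) to each other. As a minimal sketch, assuming a current Python and NumPy, the standard-library `memoryview` can read a NumPy array's layout, and NumPy can wrap another exporter's memory without copying. The variable names are illustrative.

```python
import array

import numpy as np

# A NumPy array exports its layout through the buffer protocol.
a = np.arange(6, dtype=np.float64).reshape(2, 3)
view = memoryview(a)
layout = (view.ndim, view.shape, view.strides, view.itemsize)

# Any other exporter (here a standard-library array) can be wrapped
# by NumPy without copying the data.
raw = array.array("d", [1.0, 2.0, 3.0])
wrapped = np.frombuffer(raw, dtype=np.float64)
raw[0] = 10.0                      # change made through the original object
```

Because `wrapped` shares memory with `raw`, the assignment on the last line is visible through the NumPy array as well.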
  32. Left Academia for Industry
  33. Community effort (many, many others --- forgive me!)
      • Chuck Harris
      • Pauli Virtanen
      • David Cournapeau
      • Stefan van der Walt
      • Jarrod Millman
      • Josef Perktold
      • Anne Archibald
      • Dag Sverre Seljebotn
      • Robert Kern
      • Matthew Brett
      • Warren Weckesser
      • Ralf Gommers
      • Joe Harrington --- Documentation effort
      • Andrew Straw --- www.scipy.org
  34. NumPy Array [figure: shape]
  35. NumPy Essentials
      • Data: the array object
        – slicing
        – shapes and strides
        – data-type generality
      • Fast Math:
        – vectorization
        – broadcasting
        – aggregations
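Each item on this slide can be shown in a line or two. The following is a small sketch using the current NumPy API; the array values are arbitrary and chosen only so the results are easy to check by hand.

```python
import numpy as np

x = np.arange(12, dtype=np.int64).reshape(3, 4)

# Slicing returns a view: same memory, different shape and strides.
cols = x[:, ::2]                   # every other column
shares = np.shares_memory(x, cols)
strides = (x.strides, cols.strides)

# Data-type generality: the same container holds other element types.
c = x.astype(np.complex128)

# Vectorization: one expression applied to every element.
squared = x * x

# Broadcasting: a (3, 1) column combines with a (4,) row into (3, 4).
col = np.array([[0], [10], [20]])
row = np.array([1, 2, 3, 4])
table = col + row

# Aggregations reduce along an axis.
col_sums = x.sum(axis=0)
```

The strides of `cols` differ from those of `x` only in the second axis, which is how a strided view skips columns without copying any data.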
  36. Zen of NumPy
      • strided is better than scattered
      • contiguous is better than strided
      • descriptive is better than imperative
      • array-oriented is better than object-oriented
      • broadcasting is a great idea
      • vectorized is better than an explicit loop
      • unless it’s too complicated --- then use Cython / weave
      • think in higher dimensions
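Two of these maxims are easy to illustrate: "vectorized is better than an explicit loop" and "contiguous is better than strided". The sketch below assumes a current NumPy; the function names `row_norms_loop` and `row_norms_vectorized` are hypothetical and exist only for this comparison.

```python
import numpy as np

rng = np.random.default_rng(0)     # fixed seed, illustrative data only
a = rng.random((50, 40))


def row_norms_loop(m):
    """Euclidean norm of each row, written as explicit Python loops."""
    out = np.empty(m.shape[0])
    for i in range(m.shape[0]):
        total = 0.0
        for v in m[i]:
            total += v * v
        out[i] = total ** 0.5
    return out


def row_norms_vectorized(m):
    """Same computation expressed as whole-array operations."""
    return np.sqrt((m * m).sum(axis=1))


loop_result = row_norms_loop(a)
vec_result = row_norms_vectorized(a)

# Contiguity: a transpose is a strided view; ascontiguousarray packs it.
transposed = a.T
packed = np.ascontiguousarray(transposed)
```

Both functions compute the same values; the vectorized one pushes the inner loops into compiled code, which is the point of the maxim.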
  37. Now What?
  38. Experiences Consulting
      • Basically, I was a consultant and manager of consultants for 4 years.
      • It kept the lights on at home, but gave me minimal time for NumPy.
      • Silver lining was that I learned exactly what “big-data” problems big companies have and saw first-hand that numerous “little” improvements need to be made to NumPy which will allow it to have a larger impact on helping to manage the world’s data.
  39. Putting Science back in CS
      • Much of the software stack is for systems programming --- C++, Java, .NET, ObjC, web
        - Complex numbers?
        - Vectorization primitives?
      • Array-based programming has been supplanted by Object-oriented programming
      • Software stack for scientists is not as helpful as it should be
      • Fortran is still where many scientists end up
  40. APL: the first array-oriented language
      • Appeared in 1964
      • Originated by Ken Iverson
      • Direct descendants (J, K, Matlab) are still used heavily and people pay a lot of money for them
      • NumPy is a descendant
      [diagram labels: APL, J, K, Matlab, Numeric, NumPy]
  41. Conway’s game of Life
      • Dead cell with exactly 3 live neighbors will come to life
      • A live cell with 2 or 3 neighbors will survive
      • With too few or too many neighbors, the cell dies
  42. Interesting Patterns emerge
  43. Conway’s Game of Life. APL. NumPy: Initialization, Update Step
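The code shown on the slide did not survive extraction, so it is not reproduced here. As a stand-in, here is one possible NumPy version of the initialization and update step that follows the three rules listed on slide 41. It assumes wrap-around (toroidal) edges via `np.roll`, which is a choice made for this sketch and not necessarily what the talk used; `life_step` is a hypothetical name.

```python
import numpy as np


def life_step(grid):
    """Advance a boolean grid by one generation (wrap-around edges assumed)."""
    g = grid.astype(np.uint8)
    # Count the eight neighbors of every cell by summing shifted copies.
    neighbors = sum(
        np.roll(np.roll(g, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    born = ~grid & (neighbors == 3)                       # dead cell, exactly 3
    survives = grid & ((neighbors == 2) | (neighbors == 3))  # live cell, 2 or 3
    return born | survives                                # everything else dies


# Initialization: a horizontal "blinker" in the middle of a 5x5 grid.
grid = np.zeros((5, 5), dtype=bool)
grid[2, 1:4] = True

# Update step, applied twice.
after_one = life_step(grid)
after_two = life_step(after_one)
```

The blinker is a period-2 oscillator, so it turns vertical after one step and returns to its starting state after two.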
  44. Improvements needed
      • NDArray improvements
        • Indexes (esp. for Structured arrays)
        • SQL front-end
        • Multi-level, hierarchical labels
        • selection via mappings (labeled arrays)
        • Memory spaces (array made up of regions)
        • Distributed arrays (global array)
        • Compressed arrays
        • Standard distributed persistence
        • fancy indexing as view and optimizations
        • streaming arrays
  45. Improvements needed
      • Dtype improvements
        • Enumerated types (including dynamic enumeration)
        • Derived fields
        • Specification as a class (or JSON)
        • Pointer dtype (i.e. C++ object, or varchar)
        • Finishing datetime
        • Missing data with both bit-patterns and mask
        • Parameterized field names
  46. Example of Object Dtype

          @np.dtype
          class Stock(np.DType):
              symbol = np.Str(4)
              open = np.Int(2)
              close = np.Int(2)
              high = np.Int(2)
              low = np.Int(2)

              @np.Int(2)
              def mid(self):
                  return (self.high + self.low) / 2.0
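The class-based syntax on this slide is a proposal: `np.DType`, `np.Str`, and `np.Int` as used there are not part of the released NumPy API. For comparison, a rough equivalent using today's structured dtypes might look like the sketch below. It assumes that the arguments `4` and `2` on the slide are byte widths, and it models the derived field `mid` as a plain function; the sample records are made up.

```python
import numpy as np

# Structured dtype with the same fields as the proposed Stock class.
# Assumption: Str(4) -> 4-byte string, Int(2) -> 2-byte integer.
stock_dtype = np.dtype([
    ("symbol", "S4"),
    ("open", "i2"),
    ("close", "i2"),
    ("high", "i2"),
    ("low", "i2"),
])


def mid(records):
    """Derived field: midpoint of high and low, computed on demand."""
    return (records["high"] + records["low"]) / 2.0


# Made-up sample data for illustration only.
stocks = np.array(
    [(b"AAA", 10, 12, 13, 9), (b"BBB", 20, 19, 22, 18)],
    dtype=stock_dtype,
)
mids = mid(stocks)
```

Unlike the proposal, the derived value here is not stored in or attached to the dtype; it is recomputed from the `high` and `low` fields each time `mid` is called.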
  47. Improvements needed
      • Ufunc improvements
        • Generalized ufuncs support more than just contiguous arrays
        • Specification of ufuncs in Python
        • Move most dtype “array functions” to ufuncs
        • Unify error-handling for all computations
        • Allow lazy-evaluation and remote computation --- streaming and generator data
        • Structured and string dtype ufuncs
        • Multi-core and GPU optimized ufuncs
        • Group-by reduction
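Some of these ideas can be approximated with features that exist in NumPy today. The sketch below is not the design proposed in the talk; it only shows existing calls (`np.add.at`, `np.frompyfunc`, `np.vectorize` with a `signature`) that touch on group-by reduction, ufuncs specified in Python, and generalized signatures. The data and the helper `clipped_diff` are hypothetical.

```python
import numpy as np

# Group-by reduction: sum values per integer key.
keys = np.array([0, 1, 0, 2, 1, 0])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
group_sums = np.zeros(keys.max() + 1)
np.add.at(group_sums, keys, values)      # unbuffered in-place accumulation


# A ufunc specified in Python: frompyfunc wraps a scalar function.
def clipped_diff(x, y):
    """Difference x - y, clipped below at zero."""
    return max(x - y, 0)


clipped = np.frompyfunc(clipped_diff, 2, 1)   # 2 inputs, 1 output
# frompyfunc returns an object array, so convert back to integers.
result = clipped(np.array([5, 1, 7]), np.array([2, 4, 7])).astype(np.int64)

# Generalized signature: apply a vector-by-vector operation row by row.
row_dot = np.vectorize(np.dot, signature="(n),(n)->()")
dots = row_dot(np.ones((3, 4)), np.arange(4.0))
```

Both `np.frompyfunc` and `np.vectorize` still call the Python function once per element or per row, so they provide ufunc-style broadcasting but not compiled speed.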
  48. Improvements needed
      • Miscellaneous improvements
        • ABI-management
        • Eventual Move to C++ library (NDLib)
        • NDLib could serve as base for Javascript and other high-level languages
        • Integration with LLVM
        • Possible dtype / shape / stride unification into a “dimension protocol”
        • Remote computation
        • Fast I/O for CSV and Excel
  49. NumPy Users
      • Want to be able to write Python to get fast code that works on arrays and scalars
      • Need access to a boat-load of C-extensions (NumPy is just the beginning)
      PyPy doesn’t cut it for us!
  50. Ufuncs
      [diagram labels: Ufuncs, Generalized UFuncs, Window Kernel Funcs, Function-based Indexing, Memory Filters, I/O Filters, Reduction Filters, Computed Columns; Python Function, Dynamic Compilation, function pointer, NumPy Runtime]
  51. SciPy needs a Python compiler: optimize, integrate, special, ode. Writing more of SciPy at high-level
  52. Numba -- a Python compiler
      • Replays byte-code on a stack with simple type-inference
      • Translates to LLVM (using LLVM-py)
      • Uses LLVM for code-gen
      • Resulting C-level function-pointer can be inserted into NumPy run-time
      • Understands NumPy arrays
      • Is NumPy / SciPy aware
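From the user's side, the idea is to write an ordinary Python loop over a NumPy array and have it compiled. The sketch below uses the `njit` decorator from present-day Numba, which is not necessarily the interface the 2012 prototype exposed. So that it runs where Numba is not installed, it falls back to an identity decorator; that fallback and the function `sum_of_squares` are assumptions made for this example.

```python
import numpy as np

try:
    from numba import njit          # present-day Numba entry point
except ImportError:                 # fallback so the sketch runs without Numba
    def njit(func):
        return func


@njit
def sum_of_squares(x):
    """Explicit loop over a 1-d array; compiled when Numba is available."""
    total = 0.0
    for i in range(x.shape[0]):
        total += x[i] * x[i]
    return total


data = np.arange(5, dtype=np.float64)
result = sum_of_squares(data)
```

With Numba present, the first call triggers type inference and compilation for a 1-d `float64` array; later calls with the same argument types reuse the compiled code.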
  53. NumPy + Mamba = Numba
      [diagram labels: Python Function, Machine Code, LLVM-PY, LLVM 3.1, ISPC, OpenCL, OpenMP, CUDA, CLANG, Intel, AMD, Nvidia, Apple]
  54. Examples
  55. Examples
  56. Software Stack Future? Plateaus of Code re-use + DSLs
      [diagram labels: SQL, R, TDPL, Matlab, Python, OBJC, C, FORTRAN, C++, LLVM]
  57. Seeking Developers! https://github.com/ContinuumIO/numba
  58. How to pay for all this?
  59. Dual strategy. NumPy 2.0
  60. NumFOCUS: Num(Py) Foundation for Open Code for Usable Science
  61. NumFOCUS
      • Mission
        • To initiate and support educational programs furthering the use of open source software in science.
        • To promote the use of high-level languages and open source in science, engineering, and math research.
        • To encourage reproducible scientific research.
  62. NumFOCUS
      • Activities
        • Sponsor sprints and conferences
        • Provide scholarships and grants
        • Pay for documentation development and basic course development
        • Work with domain-specific organizations
        • Raise funds from industries using Python and NumPy
  63. NumFOCUS
      • Directors
        • Jarrod Millman
        • Fernando Perez
        • Travis Oliphant
        • Perry Greenfield
        • John Hunter
      • Members
        • Basically people who donate for now. In time, a body that elects directors.