SlideShare une entreprise Scribd logo
1  sur  4
Télécharger pour lire hors ligne
I NVITED S ECTION : T HE F UTURE OF R                                                                                5



Facets of R
Special invited paper on “The Future of R”                      2. interactive, hands-on in real time;

by John M. Chambers                                             3. functional in its model of programming;
                                                                4. object-oriented, “everything is an object”;
We are seeing today a widespread, and welcome,
tendency for non-computer-specialists among statis-             5. modular, built from standardized pieces; and,
ticians and others to write collections of R functions
                                                                6. collaborative, a world-wide, open-source effort.
that organize and communicate their work. Along
with the flood of software sometimes comes an atti-           None of these facets is original to R, but while many
tude that one need only learn, or teach, a sort of basic     systems emphasize one or two of them, R continues
how-to-write-the-function level of R programming,            to reflect them all, resulting in a programming model
beyond which most of the detail is unimportant or            that is indeed rich but sometimes messy.
can be absorbed without much discussion. As delu-                The thousands of R packages available from
sions go, this one is not very objectionable if it en-       CRAN, BioConductor, R-Forge, and other reposi-
courages participation. Nevertheless, a delusion it is.      tories, the uncounted other software contributions
In fact, functions are only one of a variety of impor-       from groups and individuals, and the many citations
tant facets that R has acquired by intent or circum-         in the scientific literature all testify to the useful com-
stance during the three-plus decades of the history of       putations created within the R model. Understand-
the software and of its predecessor S. To create valu-       ing how the various facets arose and how they can
able and trustworthy software using R often requires         work together may help us improve and extend that
an understanding of some of these facets and their           software.
interrelations. This paper identifies six facets, dis-            We introduce the facets more or less chronolog-
cussing where they came from, how they support or            ically, in three pairs that entered into the software
conflict with each other, and what implications they          during successive intervals of roughly a decade. To
have for the future of programming with R.                   provide some context, here is a brief chronology,
                                                             with some references. The S project began in our
                                                             statistics research group at Bell Labs in 1976, evolved
Facets                                                       into a generally licensed system through the 1980s
                                                             and continues in the S+ software, currently owned by
Any software system that has endured and retained            TIBCO Software Inc. The history of S is summarized
a reasonably happy user community will likely                in the Appendix to Chambers (2008); the standard
have some distinguishing characteristics that form           books introducing still-relevant versions of S include
its style. The characteristics of different systems are      Becker et al. (1988) (no longer in print), Chambers
usually not totally unrelated, at least among systems        and Hastie (1992), and Chambers (1998). R was an-
serving roughly similar goals—in computing as in             nounced to the world in Ihaka and Gentleman (1996)
other fields, the set of truly distinct concepts is not       and evolved fairly soon to be developed and man-
that large. But in the mix of different characteris-         aged by the R-core group and other contributors. Its
tics and in the details of how they work together lies       history is harder to pin down, partly because the
much of the flavor of a particular system.                    story is very much still happening and partly be-
     Understanding such characteristics, including a         cause, as a collaborative open-source project, indi-
bit of the historical background, can be helpful in          vidual responsibility is sometimes hard to attribute;
making better use of the software. It can also guide         in addition, not all the important new ideas have
thinking about directions for future improvements of         been described separately by their authors. The
the system itself.                                           http://www.r-project.org site points to the on-line
     The R software and the S software that preceded         manuals, as well as a list of over 70 related books and
it reflect a rather large range of characteristics, result-   other material. The manuals are invaluable as a tech-
ing in software that might be termed rich or messy           nical resource, but short on background information.
according to one’s taste. Since R has become a very          Articles in R News, the predecessor of this journal,
widely used environment for applications and re-             give descriptions of some of the key contributions,
search in data analysis, understanding something of          such as namespaces (Tierney, 2003), and internation-
these characteristics may be helpful to the commu-           alization (Ripley, 2005).
nity.
     This paper considers six characteristics, which we
will call facets. They characterize R as:                    An interactive interface
   1. an interface to computational procedures of            The goals for the first version of S in 1976 already
      many kinds;                                            combined two facets of computing usually kept


The R Journal Vol. 1/1, May 2009                                                                     ISSN 2073-4859
6                                                                           I NVITED S ECTION : T HE F UTURE OF R


apart, interactive computing and the development of         Functional and object-oriented pro-
procedural software for scientific applications.
                                                            gramming
    One goal was a better interface to new computa-
tional procedures for scientific data analysis, usually      During the 1980s the topics of functional programming
implemented then as Fortran subroutines. Bell Labs          and object-oriented programming stimulated growing
statistics research had collected and written a variety     interest in the computer science literature. At the
of these while at the same time numerical libraries         same time, I was exploring possibilities for the next
and published algorithm collections were establish-         S, perhaps “after S”. These ideas were eventually
ing many of the essential computational techniques.         blended with other work to produce the “new S”,
These procedures were the essence of the actual data        later called Version 3 of S, or S3, (Becker et al., 1988).
analysis; the initial goal was essentially a better in-         As before, user expressions were still interpreted,
terface to procedural computation. A graphic from           but were now parsed into objects representing the
the first plans for S (Chambers, 2008, Appendix, page        language elements, mainly function calls. The func-
476) illustrated this.                                      tions were also objects, the result of parsing expres-
                                                            sions in the language that defined the function. As
    But that interface was to be interactive, specifically   the motto expressed it, “Everything is an object”. The
via a language used by the data analyst communicat-         procedural interface facet was not abandoned, how-
ing in real time with the software. At this time, a few     ever, but from the user’s perspective it was now de-
systems had pointed the way for interactive scientific       fined by functions that provided an interface to a C
computing, but most of them were highly specialized         or Fortran subroutine.
with limited programming facilities; that is, a limited         The new programming model was functional in
ability for the user to express novel computations.         two senses. It largely followed the functional pro-
The most striking exception was APL, created by             gramming paradigm, which defined programming
Kenneth Iverson (1962), an operator-based interac-          as the evaluation of function calls, with the value
tive language with heavy emphasis on general mul-           uniquely determined by the objects supplied as ar-
tiway arrays. APL introduced conventions that now           guments and with no external side-effects. Not all
seem obvious; for example, the result of evaluating         functions followed this strictly, however.
a user’s expression was either an assignment or a               The new version was functional also in the sense
printed display of the computed object (rather than         that nearly everything that happened in evaluation
expecting users to type a print command). But APL           was a function call. Evaluation in R is even more
at the time had no notion of an interface to proce-         functional in that language components such as as-
dures, which along with a limited, if exotic, syntax        signment and loops are formally function calls inter-
caused us to reject it, although S adopted a number         nally, reflecting in part the influence of Lisp on the
of features from it, such as its approach to general        initial implementation of R.
multidimensional arrays.                                        Subsequently, the programing model was ex-
                                                            tended to include classes of objects and methods
    From its first implementation, the S software            that specialized functions according to the classes
combined these two facets: users carried out data           of their arguments. That extension itself came in
analysis interactively by writing expressions in the        two stages. First, a thin layer of additional code
S language, but much of the programming to extend           implemented classes and methods by two simple
the system was via new Fortran subroutines accom-           mechanisms: objects could have an attribute defin-
panied by interfaces to call them from S. The inter-        ing their class as a character string or a vector of
faces were written in a special language that was           multiple strings; and an explicitly invoked method
compiled into Fortran.                                      dispatch function would match the class of the func-
   The result was already a mixed system that had           tion’s first argument against methods, which were
much of the tension between human and machine               simply saved function objects with a class string ap-
efficiency that still stimulates debate among R users        pended to the object’s name. In the second stage, the
and programmers. At the same time, the flexibility           version of the system known as S4 provided formal
from that same mixture helped the software to grow          class definitions, stored as a special metadata object,
and adapt with a growing user community.                    and similarly formal method definitions associated
                                                            with generic functions, also defined via object classes
    Later versions of the system replaced the inter-        (Chambers, 2008, Chapters 9 and 10, for the R ver-
face language with individual functions providing           sion).
interfaces to C or Fortran, but the combination of              The functional and object-oriented facets were
an interactive language and an interface to com-            combined in S and R, but they appear more fre-
piled procedures remains a key feature of R. Increas-       quently in separate, incompatible languages. Typi-
ingly, interfaces to other languages and systems have       cal object-oriented languages such as C++ or Java are
been added as well, for example to database systems,        not functional; instead, their programming model is
spreadsheets and user interface software.                   of methods associated with a class rather than with a


The R Journal Vol. 1/1, May 2009                                                                    ISSN 2073-4859
I NVITED S ECTION : T HE F UTURE OF R                                                                             7


function and invoked on an object. Because the ob-        essential tools provided in R to create, develop, test,
ject is treated as a reference, the methods can usu-      and distribute packages. Among many benefits for
ally modify the object, again quite a different model     users, these tools help programmers create software
from that leading to R. The functional use of meth-       that can be installed and used on the three main op-
ods is more complicated, but richer and indeed nec-       erating systems for computing with data—Windows,
essary to support the essential role of functions. A      Linux, and Mac OS X.
few other languages do combine functional structure
with methods, such as Dylan, (Shalit, 1996, chapters          The sixth facet for our discussion is that R is a col-
5 and 6), and comparisons have proved useful for R,       laborative enterprise, which a large group of people
but came after the initial design was in place.           share in and contribute to. Central to the collabo-
                                                          rative facet is, of course, that R is a freely available,
                                                          open-source system, both the core R software and
Modular design and collaborative                          most of the packages built upon it. As most read-
                                                          ers of this journal will be aware, the combination of
support                                                   the collaborative efforts and the package mechanism
                                                          have enormously increased the availability of tech-
Our third pair of facets arrives with R. The paper by
                                                          niques for data analysis, especially the results of new
Ihaka and Gentleman (1996) introduced a statistical
                                                          research in statistics and its applications. The pack-
software system, characterized later as “not unlike
                                                          age management tools, in fact, illustrate the essential
S”, but distinguished by some ideas intended to im-
                                                          role of collaboration. Another highly collaborative
prove on the earlier system. A facet of R related to
                                                          contribution has facilitated use of R internationally
some of those ideas and to important later develop-
                                                          (Ripley, 2005), both by the use of locales in compu-
ments is its approach to modularity. In a number of
                                                          tations and by the contribution of translations for R
fields, including architecture and computer science,
                                                          documentation and messages.
modules are units designed to be useful in them-
selves and to fit together easily to create a desired          While the collaborative facet of R is the least tech-
result, such as a building or a software system. R has    nical, it may eventually have the most profound im-
two very important levels of modules: functions and       plications. The growth of the collaborative commu-
packages.                                                 nity involved is unprecedented. For example, a plot
    Functions are an obvious modular unit, espe-          in Fox (2008) shows that the number of packages
cially functions as objects. R extended the modu-         in the CRAN repository exhibited literal exponen-
larity of functions by providing them with an envi-       tial growth from 2001 to 2008. The current size of
ronment, usually the environment in which the func-       this and other repositories implies at a conservative
tion object was created. Objects assigned in this en-     estimate that several hundreds of contributors have
vironment can be accessed by name in a call to the        mastered quite a bit of technical expertise and satis-
function, and even modified, by stepping outside           fied some considerable formal requirements to offer
the strict functional programming model. Functions        their efforts freely to the worldwide scientific com-
sharing an environment can thus be used together          munity.
as a unit. The programming technique resulting is
usually called programming with closures in R; for            This facet does have some technical implications.
a discussion and comparison with other techniques,        One is that, being an open-source system, R is gener-
see (Chambers, 2008, section 5.4). Because the eval-      ally able to incorporate other open-source software
uator searches for external objects in the function’s     and does indeed do so in many areas. In contrast,
environment, dependencies on external objects can         proprietary systems are more likely to build exten-
be controlled through that environment. The names-        sions internally, either to maintain competitive ad-
pace mechanism (Tierney, 2003), an important con-         vantage or to avoid the free distribution require-
tribution to trustworthy programming with R, uses         ments of some open-source licenses. Looking for
this technique to help ensure consistent behavior of      quality open-source software to extend the system
functions in a package.                                   should continue to be a favored strategy for R de-
    The larger module introduced by R is probably         velopment.
the most important for the system’s growing use, the
package. As it has evolved, and continues to evolve,          On the downside, a large collaborative enterprise
an R package is a module combining in most cases          with a general practice of making collective decisions
R functions, their documentation, and possibly data       has a natural tendency towards conservatism. Rad-
objects and/or code in other languages.                   ical changes do threaten to break currently working
    While S always had the idea of collections of soft-   features. The future benefits they might bring will of-
ware for distribution, somewhat formalized in S4          ten not be sufficiently persuasive. The very success
as chapters, the R package greatly extends and im-        of R, combined with its collaborative facet, poses a
proves the ability to share computing results as ef-      challenge to cultivate the “next big step” in software
fective modules. The mechanism benefits from some          for data analysis.


The R Journal Vol. 1/1, May 2009                                                                  ISSN 2073-4859
8                                                                         I NVITED S ECTION : T HE F UTURE OF R


Implications                                               “workarounds” are a good thing or not. One may
                                                           feel that the total effort devoted to multiple approx-
A number of specific implications of the facets of R,       imate solutions would have been more productive if
and of their interaction, have been noted in previous      coordinated and applied to improving the core im-
sections. Further implications are suggested in con-       plementation. In practice, however, the nature of the
sidering the current state of R and its future possibil-   collaborative effort supporting R makes it much eas-
ities.                                                     ier to obtain a modest effort from one or a few indi-
    Many reasons can be suggested for R’s increasing       viduals than to organize and execute a larger project.
popularity. Certainly part of the picture is a produc-     Perhaps the growth of R will encourage the com-
tive interaction among the interface facet, the modu-      munity to push for such projects, with a new view
larity of packages, and the collaborative, open-source     of funding and organization. Meanwhile, we can
approach, all in an interactive mode. From the very        hope that the growing communication facilities in
beginning, S was not viewed as a set of procedures         the community will assess and improve on the ex-
(the model for SAS, for example) but as an interactive     isting efforts. Particularly helpful facilities in this
system to support the implementation of new pro-           context include the CRAN task views (http://cran.
cedures. The growth in packages noted previously           r-project.org/web/views/), the mailing lists and
reflects the same view.                                     publications such as this journal.
    The emphasis on interfaces has never diminished
and the range of possible procedures across the in-
terface has expanded greatly. That R’s growth has          Bibliography
been especially strong in academic and other re-
search environments is at least partly due to the abil-    R. A. Becker, J. M. Chambers, and A. R. Wilks. The
ity to write interfaces to existing and new software         New S Language. Chapman & Hall, London, 1988.
of many forms. The evolution of the package as             J. M. Chambers. Programming with Data. Springer,
an effective module and of the repositories for con-          New York, 1998. URL http://cm.bell-labs.
tributed software from the community have resulted            com/cm/ms/departments/sia/Sbook/. ISBN 0-387-
in many new interfaces.                                       98503-4.
    The facets also have implications for the possible
future of R. The hope is that R can continue to sup-       J. M. Chambers. Software for Data Analysis: Program-
port the implementation and communication of im-              ming with R. Springer, New York, 2008. ISBN 978-
portant techniques for data analysis, contributed and         0-387-75935-7.
used by a growing community. At the same time,
                                                           J. M. Chambers and T. J. Hastie. Statistical Mod-
those interested in new directions for statistical com-
                                                              els in S. Chapman & Hall, London, 1992. ISBN
puting will hope that R or some future alternative
                                                              9780412830402.
engages a variety of challenges for computing with
data. The collaborative community seems the only           J. Fox. Editorial. R News, 8(2):1–2, October 2008. URL
likely generator of the new features required but, as         http://CRAN.R-project.org/doc/Rnews/.
noted in the previous section, combining the new
with maintenance of a popular current collaborative        R. Ihaka and R. Gentleman. R: A language for data
system will not be easy.                                     analysis and graphics. Journal of Computational and
    The encouragement for new packages generally             Graphical Statistics, 5:299–314, 1996.
and for interfaces to other software in particular are     K. E. Iverson. A Programming Language. Wiley, 1962.
relevant for this question also. Changes that might
be difficult if applied internally to the core R im-        B. D. Ripley. Internationalization features of R 2.1.0.
plementation can often be supplied, or at least ap-          R News, 5(1):2–7, May 2005. URL http://CRAN.
proximated, by packages. Two examples of enhance-            R-project.org/doc/Rnews/.
ments considered in discussions are an optional ob-        A. Shalit. The Dylan Reference Manual. Addison-
ject model using mutable references and computa-             Wesley Developers Press, Reading, Mass., 1996.
tions using multiple threads. In both cases, it can          ISBN 0-201-44211-6.
be argued that adding the facility to R could extend
its applicability and performance. But a substantial       L. Tierney. Name space management for R. R News,
programming effort and knowledge of the current               3(1):2–6, June 2003. URL http://CRAN.R-project.
implementation would be required to modify the in-            org/doc/Rnews/.
ternal object model or to make the core code thread-
safe. Issues of back compatibility are likely to arise
                                                           John M. Chambers
as well. Meanwhile, less ambitious approaches in
                                                           Department of Statistics
the form of packages that approximate the desired
                                                           Stanford University
behavior have been written. For the second exam-
                                                           USA
ple especially, these may interface to other software
                                                           jmc@r-project.org
designed to provide this facility.
    Opinions may reasonably differ on whether these
The R Journal Vol. 1/1, May 2009                                                                 ISSN 2073-4859

Contenu connexe

Similaire à R Journal 2009 1 Chambers

Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali TirmiziFinancial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali TirmiziDr. Muhammad Ali Tirmizi., Ph.D.
 
Switching from lispstat to r
Switching from lispstat to rSwitching from lispstat to r
Switching from lispstat to rAjay Ohri
 
GoOpen 2010: Roger Bivand
GoOpen 2010: Roger BivandGoOpen 2010: Roger Bivand
GoOpen 2010: Roger BivandFriprogsenteret
 
OBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docxOBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docxAleKi2
 
statistical computation using R- report
statistical computation using R- reportstatistical computation using R- report
statistical computation using R- reportKamarudheen KV
 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning rNetaji Gandi
 
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...Sundar B N
 
Realization of natural language interfaces using
Realization of natural language interfaces usingRealization of natural language interfaces using
Realization of natural language interfaces usingunyil96
 
R programming for psychometrics
R programming for psychometricsR programming for psychometrics
R programming for psychometricsDiane Talley
 
The Statistical Significance of "R"
The Statistical Significance of "R"The Statistical Significance of "R"
The Statistical Significance of "R"ppvora
 
Learn a language : LISP
Learn a language : LISPLearn a language : LISP
Learn a language : LISPDevnology
 
R vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageR vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageStat Analytica
 
R programming language
R programming languageR programming language
R programming languageKeerti Verma
 

Similaire à R Journal 2009 1 Chambers (20)

Modest Formalization of Software Design Patterns
Modest Formalization of Software Design PatternsModest Formalization of Software Design Patterns
Modest Formalization of Software Design Patterns
 
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali TirmiziFinancial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
Financial Risk Mgt - Lec 4 by Dr. Syed Muhammad Ali Tirmizi
 
Switching from lispstat to r
Switching from lispstat to rSwitching from lispstat to r
Switching from lispstat to r
 
R_L1-Aug-2022.pptx
R_L1-Aug-2022.pptxR_L1-Aug-2022.pptx
R_L1-Aug-2022.pptx
 
effectivegraphsmro1
effectivegraphsmro1effectivegraphsmro1
effectivegraphsmro1
 
GoOpen 2010: Roger Bivand
GoOpen 2010: Roger BivandGoOpen 2010: Roger Bivand
GoOpen 2010: Roger Bivand
 
OBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docxOBJECT ORIENTED PROGRAMMING.docx
OBJECT ORIENTED PROGRAMMING.docx
 
Analysis Report
 Analysis Report  Analysis Report
Analysis Report
 
statistical computation using R- report
statistical computation using R- reportstatistical computation using R- report
statistical computation using R- report
 
2 it unit-1 start learning r
2 it   unit-1 start learning r2 it   unit-1 start learning r
2 it unit-1 start learning r
 
UNIT-1 Start Learning R.pdf
UNIT-1 Start Learning R.pdfUNIT-1 Start Learning R.pdf
UNIT-1 Start Learning R.pdf
 
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
Statistical Packages SPSS, R, Python - Business Statistics & Research Methods...
 
R for data analytics
R for data analyticsR for data analytics
R for data analytics
 
Realization of natural language interfaces using
Realization of natural language interfaces usingRealization of natural language interfaces using
Realization of natural language interfaces using
 
History Of C Essay
History Of C EssayHistory Of C Essay
History Of C Essay
 
R programming for psychometrics
R programming for psychometricsR programming for psychometrics
R programming for psychometrics
 
The Statistical Significance of "R"
The Statistical Significance of "R"The Statistical Significance of "R"
The Statistical Significance of "R"
 
Learn a language : LISP
Learn a language : LISPLearn a language : LISP
Learn a language : LISP
 
R vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical LanguageR vs SPSS: Which One is The Best Statistical Language
R vs SPSS: Which One is The Best Statistical Language
 
R programming language
R programming languageR programming language
R programming language
 

Plus de Ajay Ohri

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay OhriAjay Ohri
 
Introduction to R
Introduction to RIntroduction to R
Introduction to RAjay Ohri
 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionAjay Ohri
 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for freeAjay Ohri
 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10Ajay Ohri
 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri ResumeAjay Ohri
 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientistsAjay Ohri
 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...Ajay Ohri
 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data scienceAjay Ohri
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data ScienceAjay Ohri
 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data ScientistsAjay Ohri
 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in PythonAjay Ohri
 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen OomsAjay Ohri
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsAjay Ohri
 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha Ajay Ohri
 
Analyze this
Analyze thisAnalyze this
Analyze thisAjay Ohri
 
Summer school python in spanish
Summer school python in spanishSummer school python in spanish
Summer school python in spanishAjay Ohri
 

Plus de Ajay Ohri (20)

Introduction to R ajay Ohri
Introduction to R ajay OhriIntroduction to R ajay Ohri
Introduction to R ajay Ohri
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Social Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 ElectionSocial Media and Fake News in the 2016 Election
Social Media and Fake News in the 2016 Election
 
Pyspark
PysparkPyspark
Pyspark
 
Download Python for R Users pdf for free
Download Python for R Users pdf for freeDownload Python for R Users pdf for free
Download Python for R Users pdf for free
 
Install spark on_windows10
Install spark on_windows10Install spark on_windows10
Install spark on_windows10
 
Ajay ohri Resume
Ajay ohri ResumeAjay ohri Resume
Ajay ohri Resume
 
Statistics for data scientists
Statistics for  data scientistsStatistics for  data scientists
Statistics for data scientists
 
National seminar on emergence of internet of things (io t) trends and challe...
National seminar on emergence of internet of things (io t)  trends and challe...National seminar on emergence of internet of things (io t)  trends and challe...
National seminar on emergence of internet of things (io t) trends and challe...
 
Tools and techniques for data science
Tools and techniques for data scienceTools and techniques for data science
Tools and techniques for data science
 
Training in Analytics and Data Science
Training in Analytics and Data ScienceTraining in Analytics and Data Science
Training in Analytics and Data Science
 
Tradecraft
Tradecraft   Tradecraft
Tradecraft
 
Software Testing for Data Scientists
Software Testing for Data ScientistsSoftware Testing for Data Scientists
Software Testing for Data Scientists
 
Craps
CrapsCraps
Craps
 
A Data Science Tutorial in Python
A Data Science Tutorial in PythonA Data Science Tutorial in Python
A Data Science Tutorial in Python
 
How does cryptography work? by Jeroen Ooms
How does cryptography work?  by Jeroen OomsHow does cryptography work?  by Jeroen Ooms
How does cryptography work? by Jeroen Ooms
 
Using R for Social Media and Sports Analytics
Using R for Social Media and Sports AnalyticsUsing R for Social Media and Sports Analytics
Using R for Social Media and Sports Analytics
 
Kush stats alpha
Kush stats alpha Kush stats alpha
Kush stats alpha
 
Analyze this
Analyze thisAnalyze this
Analyze this
 
Summer school python in spanish
Summer school python in spanishSummer school python in spanish
Summer school python in spanish
 

Dernier

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 

Dernier (20)

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 

R Journal 2009 1 Chambers

  • 1. I NVITED S ECTION : T HE F UTURE OF R 5 Facets of R Special invited paper on “The Future of R” 2. interactive, hands-on in real time; by John M. Chambers 3. functional in its model of programming; 4. object-oriented, “everything is an object”; We are seeing today a widespread, and welcome, tendency for non-computer-specialists among statis- 5. modular, built from standardized pieces; and, ticians and others to write collections of R functions 6. collaborative, a world-wide, open-source effort. that organize and communicate their work. Along with the flood of software sometimes comes an atti- None of these facets is original to R, but while many tude that one need only learn, or teach, a sort of basic systems emphasize one or two of them, R continues how-to-write-the-function level of R programming, to reflect them all, resulting in a programming model beyond which most of the detail is unimportant or that is indeed rich but sometimes messy. can be absorbed without much discussion. As delu- The thousands of R packages available from sions go, this one is not very objectionable if it en- CRAN, BioConductor, R-Forge, and other reposi- courages participation. Nevertheless, a delusion it is. tories, the uncounted other software contributions In fact, functions are only one of a variety of impor- from groups and individuals, and the many citations tant facets that R has acquired by intent or circum- in the scientific literature all testify to the useful com- stance during the three-plus decades of the history of putations created within the R model. Understand- the software and of its predecessor S. To create valu- ing how the various facets arose and how they can able and trustworthy software using R often requires work together may help us improve and extend that an understanding of some of these facets and their software. interrelations. This paper identifies six facets, dis- We introduce the facets more or less chronolog- cussing where they came from, how they support or ically, in three pairs that entered into the software conflict with each other, and what implications they during successive intervals of roughly a decade. To have for the future of programming with R. provide some context, here is a brief chronology, with some references. The S project began in our statistics research group at Bell Labs in 1976, evolved Facets into a generally licensed system through the 1980s and continues in the S+ software, currently owned by Any software system that has endured and retained TIBCO Software Inc. The history of S is summarized a reasonably happy user community will likely in the Appendix to Chambers (2008); the standard have some distinguishing characteristics that form books introducing still-relevant versions of S include its style. The characteristics of different systems are Becker et al. (1988) (no longer in print), Chambers usually not totally unrelated, at least among systems and Hastie (1992), and Chambers (1998). R was an- serving roughly similar goals—in computing as in nounced to the world in Ihaka and Gentleman (1996) other fields, the set of truly distinct concepts is not and evolved fairly soon to be developed and man- that large. But in the mix of different characteris- aged by the R-core group and other contributors. Its tics and in the details of how they work together lies history is harder to pin down, partly because the much of the flavor of a particular system. story is very much still happening and partly be- Understanding such characteristics, including a cause, as a collaborative open-source project, indi- bit of the historical background, can be helpful in vidual responsibility is sometimes hard to attribute; making better use of the software. It can also guide in addition, not all the important new ideas have thinking about directions for future improvements of been described separately by their authors. The the system itself. http://www.r-project.org site points to the on-line The R software and the S software that preceded manuals, as well as a list of over 70 related books and it reflect a rather large range of characteristics, result- other material. The manuals are invaluable as a tech- ing in software that might be termed rich or messy nical resource, but short on background information. according to one’s taste. Since R has become a very Articles in R News, the predecessor of this journal, widely used environment for applications and re- give descriptions of some of the key contributions, search in data analysis, understanding something of such as namespaces (Tierney, 2003), and internation- these characteristics may be helpful to the commu- alization (Ripley, 2005). nity. This paper considers six characteristics, which we will call facets. They characterize R as: An interactive interface 1. an interface to computational procedures of The goals for the first version of S in 1976 already many kinds; combined two facets of computing usually kept The R Journal Vol. 1/1, May 2009 ISSN 2073-4859
  • 2. 6 I NVITED S ECTION : T HE F UTURE OF R apart, interactive computing and the development of Functional and object-oriented pro- procedural software for scientific applications. gramming One goal was a better interface to new computa- tional procedures for scientific data analysis, usually During the 1980s the topics of functional programming implemented then as Fortran subroutines. Bell Labs and object-oriented programming stimulated growing statistics research had collected and written a variety interest in the computer science literature. At the of these while at the same time numerical libraries same time, I was exploring possibilities for the next and published algorithm collections were establish- S, perhaps “after S”. These ideas were eventually ing many of the essential computational techniques. blended with other work to produce the “new S”, These procedures were the essence of the actual data later called Version 3 of S, or S3, (Becker et al., 1988). analysis; the initial goal was essentially a better in- As before, user expressions were still interpreted, terface to procedural computation. A graphic from but were now parsed into objects representing the the first plans for S (Chambers, 2008, Appendix, page language elements, mainly function calls. The func- 476) illustrated this. tions were also objects, the result of parsing expres- sions in the language that defined the function. As But that interface was to be interactive, specifically the motto expressed it, “Everything is an object”. The via a language used by the data analyst communicat- procedural interface facet was not abandoned, how- ing in real time with the software. At this time, a few ever, but from the user’s perspective it was now de- systems had pointed the way for interactive scientific fined by functions that provided an interface to a C computing, but most of them were highly specialized or Fortran subroutine. with limited programming facilities; that is, a limited The new programming model was functional in ability for the user to express novel computations. two senses. It largely followed the functional pro- The most striking exception was APL, created by gramming paradigm, which defined programming Kenneth Iverson (1962), an operator-based interac- as the evaluation of function calls, with the value tive language with heavy emphasis on general mul- uniquely determined by the objects supplied as ar- tiway arrays. APL introduced conventions that now guments and with no external side-effects. Not all seem obvious; for example, the result of evaluating functions followed this strictly, however. a user’s expression was either an assignment or a The new version was functional also in the sense printed display of the computed object (rather than that nearly everything that happened in evaluation expecting users to type a print command). But APL was a function call. Evaluation in R is even more at the time had no notion of an interface to proce- functional in that language components such as as- dures, which along with a limited, if exotic, syntax signment and loops are formally function calls inter- caused us to reject it, although S adopted a number nally, reflecting in part the influence of Lisp on the of features from it, such as its approach to general initial implementation of R. multidimensional arrays. Subsequently, the programing model was ex- tended to include classes of objects and methods From its first implementation, the S software that specialized functions according to the classes combined these two facets: users carried out data of their arguments. That extension itself came in analysis interactively by writing expressions in the two stages. First, a thin layer of additional code S language, but much of the programming to extend implemented classes and methods by two simple the system was via new Fortran subroutines accom- mechanisms: objects could have an attribute defin- panied by interfaces to call them from S. The inter- ing their class as a character string or a vector of faces were written in a special language that was multiple strings; and an explicitly invoked method compiled into Fortran. dispatch function would match the class of the func- The result was already a mixed system that had tion’s first argument against methods, which were much of the tension between human and machine simply saved function objects with a class string ap- efficiency that still stimulates debate among R users pended to the object’s name. In the second stage, the and programmers. At the same time, the flexibility version of the system known as S4 provided formal from that same mixture helped the software to grow class definitions, stored as a special metadata object, and adapt with a growing user community. and similarly formal method definitions associated with generic functions, also defined via object classes Later versions of the system replaced the inter- (Chambers, 2008, Chapters 9 and 10, for the R ver- face language with individual functions providing sion). interfaces to C or Fortran, but the combination of The functional and object-oriented facets were an interactive language and an interface to com- combined in S and R, but they appear more fre- piled procedures remains a key feature of R. Increas- quently in separate, incompatible languages. Typi- ingly, interfaces to other languages and systems have cal object-oriented languages such as C++ or Java are been added as well, for example to database systems, not functional; instead, their programming model is spreadsheets and user interface software. of methods associated with a class rather than with a The R Journal Vol. 1/1, May 2009 ISSN 2073-4859
  • 3. I NVITED S ECTION : T HE F UTURE OF R 7 function and invoked on an object. Because the ob- essential tools provided in R to create, develop, test, ject is treated as a reference, the methods can usu- and distribute packages. Among many benefits for ally modify the object, again quite a different model users, these tools help programmers create software from that leading to R. The functional use of meth- that can be installed and used on the three main op- ods is more complicated, but richer and indeed nec- erating systems for computing with data—Windows, essary to support the essential role of functions. A Linux, and Mac OS X. few other languages do combine functional structure with methods, such as Dylan, (Shalit, 1996, chapters The sixth facet for our discussion is that R is a col- 5 and 6), and comparisons have proved useful for R, laborative enterprise, which a large group of people but came after the initial design was in place. share in and contribute to. Central to the collabo- rative facet is, of course, that R is a freely available, open-source system, both the core R software and Modular design and collaborative most of the packages built upon it. As most read- ers of this journal will be aware, the combination of support the collaborative efforts and the package mechanism have enormously increased the availability of tech- Our third pair of facets arrives with R. The paper by niques for data analysis, especially the results of new Ihaka and Gentleman (1996) introduced a statistical research in statistics and its applications. The pack- software system, characterized later as “not unlike age management tools, in fact, illustrate the essential S”, but distinguished by some ideas intended to im- role of collaboration. Another highly collaborative prove on the earlier system. A facet of R related to contribution has facilitated use of R internationally some of those ideas and to important later develop- (Ripley, 2005), both by the use of locales in compu- ments is its approach to modularity. In a number of tations and by the contribution of translations for R fields, including architecture and computer science, documentation and messages. modules are units designed to be useful in them- selves and to fit together easily to create a desired While the collaborative facet of R is the least tech- result, such as a building or a software system. R has nical, it may eventually have the most profound im- two very important levels of modules: functions and plications. The growth of the collaborative commu- packages. nity involved is unprecedented. For example, a plot Functions are an obvious modular unit, espe- in Fox (2008) shows that the number of packages cially functions as objects. R extended the modu- in the CRAN repository exhibited literal exponen- larity of functions by providing them with an envi- tial growth from 2001 to 2008. The current size of ronment, usually the environment in which the func- this and other repositories implies at a conservative tion object was created. Objects assigned in this en- estimate that several hundreds of contributors have vironment can be accessed by name in a call to the mastered quite a bit of technical expertise and satis- function, and even modified, by stepping outside fied some considerable formal requirements to offer the strict functional programming model. Functions their efforts freely to the worldwide scientific com- sharing an environment can thus be used together munity. as a unit. The programming technique resulting is usually called programming with closures in R; for This facet does have some technical implications. a discussion and comparison with other techniques, One is that, being an open-source system, R is gener- see (Chambers, 2008, section 5.4). Because the eval- ally able to incorporate other open-source software uator searches for external objects in the function’s and does indeed do so in many areas. In contrast, environment, dependencies on external objects can proprietary systems are more likely to build exten- be controlled through that environment. The names- sions internally, either to maintain competitive ad- pace mechanism (Tierney, 2003), an important con- vantage or to avoid the free distribution require- tribution to trustworthy programming with R, uses ments of some open-source licenses. Looking for this technique to help ensure consistent behavior of quality open-source software to extend the system functions in a package. should continue to be a favored strategy for R de- The larger module introduced by R is probably velopment. the most important for the system’s growing use, the package. As it has evolved, and continues to evolve, On the downside, a large collaborative enterprise an R package is a module combining in most cases with a general practice of making collective decisions R functions, their documentation, and possibly data has a natural tendency towards conservatism. Rad- objects and/or code in other languages. ical changes do threaten to break currently working While S always had the idea of collections of soft- features. The future benefits they might bring will of- ware for distribution, somewhat formalized in S4 ten not be sufficiently persuasive. The very success as chapters, the R package greatly extends and im- of R, combined with its collaborative facet, poses a proves the ability to share computing results as ef- challenge to cultivate the “next big step” in software fective modules. The mechanism benefits from some for data analysis. The R Journal Vol. 1/1, May 2009 ISSN 2073-4859
  • 4. 8 I NVITED S ECTION : T HE F UTURE OF R Implications “workarounds” are a good thing or not. One may feel that the total effort devoted to multiple approx- A number of specific implications of the facets of R, imate solutions would have been more productive if and of their interaction, have been noted in previous coordinated and applied to improving the core im- sections. Further implications are suggested in con- plementation. In practice, however, the nature of the sidering the current state of R and its future possibil- collaborative effort supporting R makes it much eas- ities. ier to obtain a modest effort from one or a few indi- Many reasons can be suggested for R’s increasing viduals than to organize and execute a larger project. popularity. Certainly part of the picture is a produc- Perhaps the growth of R will encourage the com- tive interaction among the interface facet, the modu- munity to push for such projects, with a new view larity of packages, and the collaborative, open-source of funding and organization. Meanwhile, we can approach, all in an interactive mode. From the very hope that the growing communication facilities in beginning, S was not viewed as a set of procedures the community will assess and improve on the ex- (the model for SAS, for example) but as an interactive isting efforts. Particularly helpful facilities in this system to support the implementation of new pro- context include the CRAN task views (http://cran. cedures. The growth in packages noted previously r-project.org/web/views/), the mailing lists and reflects the same view. publications such as this journal. The emphasis on interfaces has never diminished and the range of possible procedures across the in- terface has expanded greatly. That R’s growth has Bibliography been especially strong in academic and other re- search environments is at least partly due to the abil- R. A. Becker, J. M. Chambers, and A. R. Wilks. The ity to write interfaces to existing and new software New S Language. Chapman & Hall, London, 1988. of many forms. The evolution of the package as J. M. Chambers. Programming with Data. Springer, an effective module and of the repositories for con- New York, 1998. URL http://cm.bell-labs. tributed software from the community have resulted com/cm/ms/departments/sia/Sbook/. ISBN 0-387- in many new interfaces. 98503-4. The facets also have implications for the possible future of R. The hope is that R can continue to sup- J. M. Chambers. Software for Data Analysis: Program- port the implementation and communication of im- ming with R. Springer, New York, 2008. ISBN 978- portant techniques for data analysis, contributed and 0-387-75935-7. used by a growing community. At the same time, J. M. Chambers and T. J. Hastie. Statistical Mod- those interested in new directions for statistical com- els in S. Chapman & Hall, London, 1992. ISBN puting will hope that R or some future alternative 9780412830402. engages a variety of challenges for computing with data. The collaborative community seems the only J. Fox. Editorial. R News, 8(2):1–2, October 2008. URL likely generator of the new features required but, as http://CRAN.R-project.org/doc/Rnews/. noted in the previous section, combining the new with maintenance of a popular current collaborative R. Ihaka and R. Gentleman. R: A language for data system will not be easy. analysis and graphics. Journal of Computational and The encouragement for new packages generally Graphical Statistics, 5:299–314, 1996. and for interfaces to other software in particular are K. E. Iverson. A Programming Language. Wiley, 1962. relevant for this question also. Changes that might be difficult if applied internally to the core R im- B. D. Ripley. Internationalization features of R 2.1.0. plementation can often be supplied, or at least ap- R News, 5(1):2–7, May 2005. URL http://CRAN. proximated, by packages. Two examples of enhance- R-project.org/doc/Rnews/. ments considered in discussions are an optional ob- A. Shalit. The Dylan Reference Manual. Addison- ject model using mutable references and computa- Wesley Developers Press, Reading, Mass., 1996. tions using multiple threads. In both cases, it can ISBN 0-201-44211-6. be argued that adding the facility to R could extend its applicability and performance. But a substantial L. Tierney. Name space management for R. R News, programming effort and knowledge of the current 3(1):2–6, June 2003. URL http://CRAN.R-project. implementation would be required to modify the in- org/doc/Rnews/. ternal object model or to make the core code thread- safe. Issues of back compatibility are likely to arise John M. Chambers as well. Meanwhile, less ambitious approaches in Department of Statistics the form of packages that approximate the desired Stanford University behavior have been written. For the second exam- USA ple especially, these may interface to other software jmc@r-project.org designed to provide this facility. Opinions may reasonably differ on whether these The R Journal Vol. 1/1, May 2009 ISSN 2073-4859