SlideShare une entreprise Scribd logo
1  sur  36
Télécharger pour lire hors ligne
Stat405             Development


                           Hadley Wickham
Monday, 30 November 2009
1. Floating point math
                  2. Optimisation

                  3. Continuing

                     education
                  4. Feedback


Monday, 30 November 2009
Your turn
                  Perform the following calculations in R.
                  Are the answers what you expect?
                   seq(0.1, 0.9, by = 0.1) - 1:9 / 10
                   sqrt(2)^2 - 2
                  What is the property of these numbers
                  that might cause the problem?



Monday, 30 November 2009
# Each number must be stored in a finite amount of space
        # => each number can only have a finite number of digits
        # => floating point math does not work like normal math


        (1e-16 + 1) == 1
        (1e-16 + 1) * 10 == 1e-16 * 10 + 1 * 10


        1e9 + 2 - 0.1 - 1e9
        1e10 + 2 - 0.1 - 1e10
        1e11 + 2 - 0.1 - 1e11
        1e12 + 2 - 0.1 - 1e12
        1e13 + 2 - 0.1 - 1e13
        1e14 + 2 - 0.1 - 1e14



Monday, 30 November 2009
# By default R only shows 7 significant digits
        # If the trailing digits are zero, the number will be rounded
        (1 / 237)
        (1 / 237) * 237
        (1 / 237) * 237 - 1


        seq(0.1, 0.9, by = 0.1)
        seq(0.1, 0.9, by = 0.1) - 1:9 / 10


        # Tricky to get to print exactly:
        formatC((1 / 237) * 237, digits = 20)
        formatC(seq(0.1, 0.9, by = 0.1), digits = 20)




Monday, 30 November 2009
# When working with floating point numbers (numeric)
        # (but not integers, this is the one place where the
        # difference is important) never test for equality with ==


        a <- seq(0.1, 0.9, by = 0.1)
        b <- 1:9 / 10


        all(a == b)
        all.equal(a, b)
        all(abs(a - b) < 1e-6)


        # Similarly, need to be careful with < and > etc




Monday, 30 November 2009
#     Places where this matters:
     #
     #     * sums
     #     * calculating the standard deviation
     #     * inverting a matrix (condition)
     #       * linear models!
     #     * maximum likelihood estimation




Monday, 30 November 2009
Optimisation
                  If, and only if, your code is too slow
                  First use system.time() to figure out
                  exactly how long things are taking: you
                  need this so you can check your changes
                  actually speed things up
                  Then see what is taking the longest
                  amount of time with the profr package


Monday, 30 November 2009
General advice
                  • Start with the slowest part of your code
                  • Use built-in R functions, where possible
                  • Use vectorised functions, where
                    possible
                  • Think through your basic algorithm
                           • Knowledge of basic CS algorithms
                             and data structures v. helpful


Monday, 30 November 2009
Monday, 30 November 2009
Continuing education

                  Learn more about R.
                  Learn more about your other tools.
                  Professional development




Monday, 30 November 2009
Mailing list
                  Sign up to R-help: https://stat.ethz.ch/
                  mailman/listinfo/r-help
                  Make sure to set up filters
                  Skim interesting subjects and read them
                  Don’t be afraid to post
                  (use a pseudonym if necessary)


Monday, 30 November 2009
Read books
                  Phil Spector. Data Manipulation with R.
                  William N. Venables and Brian D. Ripley.
                  Modern Applied Statistics with S.
                  Frank E. Harrell. Regression Modelling
                  Strategies.
                  Jose C. Pinheiro and Douglas M. Bates.
                  Mixed-Effects Models in S and S-Plus.


Monday, 30 November 2009
Read papers

                  The R Journal: http://journal.r-project.org/
                  The Journal of Statistical Software: http://
                  www.jstatsoft.org/




Monday, 30 November 2009
Learn your tools
                  • Touch typing
                  • Text editor
                  • Command line
                  • Caffeine
                  • Email



Monday, 30 November 2009
Professional
                           development
                  The aspects of being a statistician, apart
                  from knowing statistics.
                  Principally communication: written,
                  spoken, visual and electronic.
                  Take every opportunity you can to
                  practice these skills.



Monday, 30 November 2009
Visual       Electronic

                    Written
                                       Posters      Email
                       Papers
                                       Graphics     Website
                       Vita/Resume
                                                    Blog
                       Bibliography
                       Reviews                      Code


                    Spoken
                                       Oral exam    Video
                       Teaching
                                                    Slidecast
                       Short talk
                       Long talk


Monday, 30 November 2009
Written
                  Particularly important if you want to be an
                  academic, or if you‘re PhD student, or
                  want to become one.
                  “Style: Toward Clarity and Grace”
                  Sign up for the thesis writing workshops
                  when they come around.
                  Develop a regular habit.


Monday, 30 November 2009
My habit
                  • Roll out of bed at 7am
                  • Boil water
                  • Make tea
                  • Drink tea
                  • Write for an hour



Monday, 30 November 2009
Spoken

                  Seize every opportunity to practice.
                  Make use of Tracy Volz - tmvolz@rice.edu.
                  She is a fantastic resource - if you had to
                  pay for her, you wouldn’t be able to afford
                  it.




Monday, 30 November 2009
Email



Monday, 30 November 2009
1200


         1000


         800
 value




         600                                           unread
                                                       read
         400


         200


            0

                2007       2008   2009          2010



                                         265,000 emails
                                         134,000 unread!
Monday, 30 November 2009
1200


         1000


         800
 value




         600                                    unread
                                                read
         400


         200


            0

                2007       2008   2009   2010




Monday, 30 November 2009
1.0



              0.8



              0.6
   read/all




              0.4



              0.2



              0.0

                    2007   2008          2009   2010
                                  from




Monday, 30 November 2009
350


         300


         250
 value




         200                                    direct
                                                sent
         150


         100


         50


               2007        2008   2009   2010




Monday, 30 November 2009
350


         300


         250
 value




         200                                    direct
                                                sent
         150


         100


         50


               2007        2008   2009   2010




Monday, 30 November 2009
Inbox Zero
                           http://www.43folders.com/izero

                                    Merlin Mann

         There is no way you will ever be able to respond to — let alone read in
       exquisite detail — every email you ever receive for the rest of your life. If
       you take issue with this, just wait six months, because, believe me, we’re
        all getting a lot more email (and other sundry demands on our attention)
        every day. What seems like a doddle today is going to get progressively
       more difficult — even insurmountable — unless you put a realistic system
                                       in place now.




Monday, 30 November 2009
Your time is priceless
                            (and wildly limited)

               You need an agnostic system for
             dealing with mail that isn’t based on
                nonces, exceptions, and guilt.

        [The] ultimate goal is for you to spend
         less time playing with your email and
                 more time doing stuff.
Monday, 30 November 2009
Key concepts
                  Regularly empty your inbox
                  Minimal response
                  Delete, delete, delete
                  Filters
                  Email dashes



Monday, 30 November 2009
Response does not need to be
                               proportional to request

                                   “Do you still need this?”

                                        “I don’t know”

                           “Good idea. I’ll add it to my to do list.”

                       “Here’s a link that might be what you’re
                                     looking for…”
                                           [Delete]


    http://www.43folders.com/2006/03/13/email-cheats
Monday, 30 November 2009
Delete!
                  Most minimal response is none.
                  “Just remember that every email you
                  read, re-read, and re-re-re-re-re-read as it
                  sits in that big dumb pile is actually
                  incurring mental debt on your behalf.”
                  Be brutally honest - if you’re not going to
                  do anything with the email delete it now.


Monday, 30 November 2009
Filters grey mail
                  “noisy, frequent, and non-urgent items
                  which can be dealt with all at a pass and
                  later.”
                  facebook, comments, university/
                  department memos, newsletters, mailing
                  lists
                  Good catch all: contains unsubscribe

            http://www.43folders.com/2006/03/13/filters
Monday, 30 November 2009
13
                                                     00
        bannerpcard@rice.edu, carlyn@rice.edu, (5/d /35
                                                     ay 00
        cchat@rice.edu, cmtcomment@rice.edu,           !)
        giving@rice.edu, payroll@rice.edu,
        registrar@rice.edu, sandra@rice.edu,
        sallie@rice.edu subject:(weekly message),
        alldepts@rice.edu, list:"k2i-members.rice.edu",
        list:"mailman.rice.edu"
        allfaculty@stat.rice.edu, faculty@stat.rice.edu,
        statdept@stat.rice.edu, colloquium@stat.rice.edu,
        undergrad@stat.rice.edu
        from:(statements@wageworks.com)
        from:(TIAA-CREF_eDelivery@tiaa-cref.org)
Monday, 30 November 2009
Patricia Wallace, a techno-psychologist,
      believes part of the allure of e-mail—
      for adults as well as teens—is similar to
      that of a slot machine. “You have
      intermittent, variable reinforcement,”
      she explains. You are not sure you are
      going to get a reward every time or
      how often you will, so you keep pulling
      that handle.”
Monday, 30 November 2009
Email dashes
                  Don’t have your email open all day.
                  Schedule times when you respond to
                  emails.
                  You can tackle emails a lot faster when
                  you batch them up.
                  Lack self control (like me)? Try an internet
                  blocker: http://macfreedom.com/

      http://www.43folders.com/2006/03/15/email-dash
Monday, 30 November 2009
Feedback
                  http://hadley.wufoo.com/
               forms/stat405-final-feedback/



Monday, 30 November 2009

Contenu connexe

Similaire à 26 Development

Where User Experience And Software Engineering Meet
Where User Experience And Software Engineering MeetWhere User Experience And Software Engineering Meet
Where User Experience And Software Engineering MeetAmy Ko
 
Agile Development with PHP in Practice
Agile Development with PHP in PracticeAgile Development with PHP in Practice
Agile Development with PHP in PracticeLars Jankowfsky
 
Non-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value StoresNon-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value StoresJoël Perras
 
Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...
Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...
Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...eCommConf
 
Intro Agile Software Development with Scrum for Campus Party 2009
Intro Agile Software Development with Scrum for Campus Party 2009Intro Agile Software Development with Scrum for Campus Party 2009
Intro Agile Software Development with Scrum for Campus Party 2009Antonio Silveira
 
Hardware Presentation for Samasource
Hardware Presentation for SamasourceHardware Presentation for Samasource
Hardware Presentation for SamasourceNick Pinkston
 
Beautiful Web Typography (#4)
Beautiful Web Typography (#4)Beautiful Web Typography (#4)
Beautiful Web Typography (#4)Pascal Klein
 
Strategies Tech It Up
Strategies Tech It UpStrategies Tech It Up
Strategies Tech It UpLisa Read
 

Similaire à 26 Development (9)

Where User Experience And Software Engineering Meet
Where User Experience And Software Engineering MeetWhere User Experience And Software Engineering Meet
Where User Experience And Software Engineering Meet
 
Agile Development with PHP in Practice
Agile Development with PHP in PracticeAgile Development with PHP in Practice
Agile Development with PHP in Practice
 
27 development
27 development27 development
27 development
 
Non-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value StoresNon-Relational Databases & Key/Value Stores
Non-Relational Databases & Key/Value Stores
 
Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...
Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...
Rj Auburn's Presentation at Emerging Communication Conference & Awards 2009 E...
 
Intro Agile Software Development with Scrum for Campus Party 2009
Intro Agile Software Development with Scrum for Campus Party 2009Intro Agile Software Development with Scrum for Campus Party 2009
Intro Agile Software Development with Scrum for Campus Party 2009
 
Hardware Presentation for Samasource
Hardware Presentation for SamasourceHardware Presentation for Samasource
Hardware Presentation for Samasource
 
Beautiful Web Typography (#4)
Beautiful Web Typography (#4)Beautiful Web Typography (#4)
Beautiful Web Typography (#4)
 
Strategies Tech It Up
Strategies Tech It UpStrategies Tech It Up
Strategies Tech It Up
 

Plus de Hadley Wickham (20)

27 development
27 development27 development
27 development
 
24 modelling
24 modelling24 modelling
24 modelling
 
23 data-structures
23 data-structures23 data-structures
23 data-structures
 
Graphical inference
Graphical inferenceGraphical inference
Graphical inference
 
R packages
R packagesR packages
R packages
 
22 spam
22 spam22 spam
22 spam
 
21 spam
21 spam21 spam
21 spam
 
20 date-times
20 date-times20 date-times
20 date-times
 
19 tables
19 tables19 tables
19 tables
 
18 cleaning
18 cleaning18 cleaning
18 cleaning
 
17 polishing
17 polishing17 polishing
17 polishing
 
16 critique
16 critique16 critique
16 critique
 
15 time-space
15 time-space15 time-space
15 time-space
 
14 case-study
14 case-study14 case-study
14 case-study
 
13 case-study
13 case-study13 case-study
13 case-study
 
12 adv-manip
12 adv-manip12 adv-manip
12 adv-manip
 
11 adv-manip
11 adv-manip11 adv-manip
11 adv-manip
 
11 adv-manip
11 adv-manip11 adv-manip
11 adv-manip
 
10 simulation
10 simulation10 simulation
10 simulation
 
10 simulation
10 simulation10 simulation
10 simulation
 

Dernier

Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxYounusS2
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...DianaGray10
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdfJamie (Taka) Wang
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemAsko Soukka
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7DianaGray10
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAshyamraj55
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 

Dernier (20)

Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
Connector Corner: Extending LLM automation use cases with UiPath GenAI connec...
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Bird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystemBird eye's view on Camunda open source ecosystem
Bird eye's view on Camunda open source ecosystem
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7UiPath Studio Web workshop series - Day 7
UiPath Studio Web workshop series - Day 7
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPAAnypoint Code Builder , Google Pub sub connector and MuleSoft RPA
Anypoint Code Builder , Google Pub sub connector and MuleSoft RPA
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 

26 Development

  • 1. Stat405 Development Hadley Wickham Monday, 30 November 2009
  • 2. 1. Floating point math 2. Optimisation 3. Continuing education 4. Feedback Monday, 30 November 2009
  • 3. Your turn Perform the following calculations in R. Are the answers what you expect? seq(0.1, 0.9, by = 0.1) - 1:9 / 10 sqrt(2)^2 - 2 What is the property of these numbers that might cause the problem? Monday, 30 November 2009
  • 4. # Each number must be stored in a finite amount of space # => each number can only have a finite number of digits # => floating point math does not work like normal math (1e-16 + 1) == 1 (1e-16 + 1) * 10 == 1e-16 * 10 + 1 * 10 1e9 + 2 - 0.1 - 1e9 1e10 + 2 - 0.1 - 1e10 1e11 + 2 - 0.1 - 1e11 1e12 + 2 - 0.1 - 1e12 1e13 + 2 - 0.1 - 1e13 1e14 + 2 - 0.1 - 1e14 Monday, 30 November 2009
  • 5. # By default R only shows 7 significant digits # If the trailing digits are zero, the number will be rounded (1 / 237) (1 / 237) * 237 (1 / 237) * 237 - 1 seq(0.1, 0.9, by = 0.1) seq(0.1, 0.9, by = 0.1) - 1:9 / 10 # Tricky to get to print exactly: formatC((1 / 237) * 237, digits = 20) formatC(seq(0.1, 0.9, by = 0.1), digits = 20) Monday, 30 November 2009
  • 6. # When working with floating point numbers (numeric) # (but not integers, this is the one place where the # difference is important) never test for equality with == a <- seq(0.1, 0.9, by = 0.1) b <- 1:9 / 10 all(a == b) all.equal(a, b) all(abs(a - b) < 1e-6) # Similarly, need to be careful with < and > etc Monday, 30 November 2009
  • 7. # Places where this matters: # # * sums # * calculating the standard deviation # * inverting a matrix (condition) # * linear models! # * maximum likelihood estimation Monday, 30 November 2009
  • 8. Optimisation If, and only if, your code is too slow First use system.time() to figure out exactly how long things are taking: you need this so you can check your changes actually speed things up Then see what is taking the longest amount of time with the profr package Monday, 30 November 2009
  • 9. General advice • Start with the slowest part of your code • Use built-in R functions, where possible • Use vectorised functions, where possible • Think through your basic algorithm • Knowledge of basic CS algorithms and data structures v. helpful Monday, 30 November 2009
  • 11. Continuing education Learn more about R. Learn more about your other tools. Professional development Monday, 30 November 2009
  • 12. Mailing list Sign up to R-help: https://stat.ethz.ch/ mailman/listinfo/r-help Make sure to set up filters Skim interesting subjects and read them Don’t be afraid to post (use a pseudonym if necessary) Monday, 30 November 2009
  • 13. Read books Phil Spector. Data Manipulation with R. William N. Venables and Brian D. Ripley. Modern Applied Statistics with S. Frank E. Harrell. Regression Modelling Strategies. Jose C. Pinheiro and Douglas M. Bates. Mixed-Effects Models in S and S-Plus. Monday, 30 November 2009
  • 14. Read papers The R Journal: http://journal.r-project.org/ The Journal of Statistical Software: http:// www.jstatsoft.org/ Monday, 30 November 2009
  • 15. Learn your tools • Touch typing • Text editor • Command line • Caffeine • Email Monday, 30 November 2009
  • 16. Professional development The aspects of being a statistician, apart from knowing statistics. Principally communication: written, spoken, visual and electronic. Take every opportunity you can to practice these skills. Monday, 30 November 2009
  • 17. Visual Electronic Written Posters Email Papers Graphics Website Vita/Resume Blog Bibliography Reviews Code Spoken Oral exam Video Teaching Slidecast Short talk Long talk Monday, 30 November 2009
  • 18. Written Particularly important if you want to be an academic, or if you‘re PhD student, or want to become one. “Style: Toward Clarity and Grace” Sign up for the thesis writing workshops when they come around. Develop a regular habit. Monday, 30 November 2009
  • 19. My habit • Roll out of bed at 7am • Boil water • Make tea • Drink tea • Write for an hour Monday, 30 November 2009
  • 20. Spoken Seize every opportunity to practice. Make use of Tracy Volz - tmvolz@rice.edu. She is a fantastic resource - if you had to pay for her, you wouldn’t be able to afford it. Monday, 30 November 2009
  • 22. 1200 1000 800 value 600 unread read 400 200 0 2007 2008 2009 2010 265,000 emails 134,000 unread! Monday, 30 November 2009
  • 23. 1200 1000 800 value 600 unread read 400 200 0 2007 2008 2009 2010 Monday, 30 November 2009
  • 24. 1.0 0.8 0.6 read/all 0.4 0.2 0.0 2007 2008 2009 2010 from Monday, 30 November 2009
  • 25. 350 300 250 value 200 direct sent 150 100 50 2007 2008 2009 2010 Monday, 30 November 2009
  • 26. 350 300 250 value 200 direct sent 150 100 50 2007 2008 2009 2010 Monday, 30 November 2009
  • 27. Inbox Zero http://www.43folders.com/izero Merlin Mann There is no way you will ever be able to respond to — let alone read in exquisite detail — every email you ever receive for the rest of your life. If you take issue with this, just wait six months, because, believe me, we’re all getting a lot more email (and other sundry demands on our attention) every day. What seems like a doddle today is going to get progressively more difficult — even insurmountable — unless you put a realistic system in place now. Monday, 30 November 2009
  • 28. Your time is priceless (and wildly limited) You need an agnostic system for dealing with mail that isn’t based on nonces, exceptions, and guilt. [The] ultimate goal is for you to spend less time playing with your email and more time doing stuff. Monday, 30 November 2009
  • 29. Key concepts Regularly empty your inbox Minimal response Delete, delete, delete Filters Email dashes Monday, 30 November 2009
  • 30. Response does not need to be proportional to request “Do you still need this?” “I don’t know” “Good idea. I’ll add it to my to do list.” “Here’s a link that might be what you’re looking for…” [Delete] http://www.43folders.com/2006/03/13/email-cheats Monday, 30 November 2009
  • 31. Delete! Most minimal response is none. “Just remember that every email you read, re-read, and re-re-re-re-re-read as it sits in that big dumb pile is actually incurring mental debt on your behalf.” Be brutally honest - if you’re not going to do anything with the email delete it now. Monday, 30 November 2009
  • 32. Filters grey mail “noisy, frequent, and non-urgent items which can be dealt with all at a pass and later.” facebook, comments, university/ department memos, newsletters, mailing lists Good catch all: contains unsubscribe http://www.43folders.com/2006/03/13/filters Monday, 30 November 2009
  • 33. 13 00 bannerpcard@rice.edu, carlyn@rice.edu, (5/d /35 ay 00 cchat@rice.edu, cmtcomment@rice.edu, !) giving@rice.edu, payroll@rice.edu, registrar@rice.edu, sandra@rice.edu, sallie@rice.edu subject:(weekly message), alldepts@rice.edu, list:"k2i-members.rice.edu", list:"mailman.rice.edu" allfaculty@stat.rice.edu, faculty@stat.rice.edu, statdept@stat.rice.edu, colloquium@stat.rice.edu, undergrad@stat.rice.edu from:(statements@wageworks.com) from:(TIAA-CREF_eDelivery@tiaa-cref.org) Monday, 30 November 2009
  • 34. Patricia Wallace, a techno-psychologist, believes part of the allure of e-mail— for adults as well as teens—is similar to that of a slot machine. “You have intermittent, variable reinforcement,” she explains. You are not sure you are going to get a reward every time or how often you will, so you keep pulling that handle.” Monday, 30 November 2009
  • 35. Email dashes Don’t have your email open all day. Schedule times when you respond to emails. You can tackle emails a lot faster when you batch them up. Lack self control (like me)? Try an internet blocker: http://macfreedom.com/ http://www.43folders.com/2006/03/15/email-dash Monday, 30 November 2009
  • 36. Feedback http://hadley.wufoo.com/ forms/stat405-final-feedback/ Monday, 30 November 2009