Interesting near galaxy sources

 • identified by TCP in the last 2 days
 • (last epoch observed 1 week ago)
 • Classification triggered by latest epoch
    added to the source
PI: Josh Bloom
Overview

•   TCP Software & Data Architecture
•   Classifiers & Cutting out “Junk”

•   Continuing work...

      PTF spectroscopically confirmed SN,
      subsequently classified by TCP as SN
Transients Classification Pipeline
Parallelized source correlation and classification

•   Difference objects are retrieved from LBL

•   Each difference object is passed to an IPython client

•   Each parallel IPython client performs:
     •   Source creation or correlation with existing sources

     •   “Feature” generation (or re-generation) for that source

     •   Classification of that source

[Diagram: source generation → feature generation → source classification]
Parallelized source correlation and classification

•   Realtime TCP runs on 22 dedicated cores

•   LCOGT’s 96 core Beowulf
     •   non run-time tasks

     •   Classifier generation

•   Additional resources
     •   To be used for future timeseries classification work

     •   Yahoo’s 4000 core Hadoop academic cluster

     •   Amazon EC2 cluster
Warehouse of light-curves

•   Need representative light-curves for all science

•   With these we can model each science class

•   We’ve built a warehouse of example light-curves




     TCP-TUTOR: internal interface
     DotAstro.org: public interface
Confusion Matrix


    Different ways of quantifying efficiencies:
     - using the original good training set, training and evaluating efficiencies via folding
     - using “noisified”, simulated sources matching the survey schedule, cadences, and limits
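As a rough sketch of the folding approach, the code below splits a labelled training set into k folds, trains on k-1 of them, and accumulates a confusion matrix over the held-out fold. The one-feature nearest-centroid classifier and all function names are assumptions chosen to keep the example short; they are not the classifiers used by TCP.

```python
from collections import Counter, defaultdict


def kfold_indices(n, k):
    """Yield (train, test) index lists; item i is held out in fold i % k."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test


def train_centroids(X, y):
    """Learn the mean feature value of each class."""
    sums, counts = defaultdict(float), Counter()
    for x, label in zip(X, y):
        sums[label] += x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in counts}


def predict(centroids, x):
    """Assign x to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def folded_confusion(X, y, k=5):
    """Confusion matrix as a Counter keyed by (true class, predicted class)."""
    confusion = Counter()
    for train, test in kfold_indices(len(X), k):
        model = train_centroids([X[i] for i in train], [y[i] for i in train])
        for i in test:
            confusion[(y[i], predict(model, X[i]))] += 1
    return confusion


def efficiency(confusion, label):
    """Fraction of sources of class `label` that were classified correctly."""
    total = sum(n for (true, _), n in confusion.items() if true == label)
    return confusion[(label, label)] / total if total else 0.0
```

The same `folded_confusion` call could be run on noisified sources instead of the original training set to compare the two ways of quantifying efficiency.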
“Noisification”
                        (resampling light-curves)



•   For PTF, the Noisification code references:
     •   1000s of PTF pointing and survey observing plans

     •   This allows simulation of PTF cadenced light-curves

     •   Occasionally PTF observes using a faster cadence:

           •    7.5 minutes between revisiting an RA, Dec

           •    This requires a separate set of noisified light-curves and classifiers.




•   Other pointing and observing plans could be used.
     •   This means we can easily generate noisified light-curves for any survey.

     •   Thus we can generate science classifiers for any survey.
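A minimal sketch of the resampling idea: evaluate a model light curve at the epochs of an observing plan, add noise, and drop epochs fainter than the limiting magnitude. The sinusoidal model, the Gaussian noise, the fixed limiting magnitude, and both example observing plans are assumptions for illustration; the real code draws on actual PTF pointing data.

```python
import math
import random


def sinusoid(period, mean_mag, amplitude):
    """Return a hypothetical model light curve: magnitude as a function of time (days)."""
    return lambda t: mean_mag + amplitude * math.sin(2 * math.pi * t / period)


def noisify(model_mag, obs_times, limiting_mag, sigma=0.05, seed=0):
    """Resample model_mag at obs_times, add Gaussian noise of width sigma,
    and keep only epochs at least as bright as limiting_mag."""
    rng = random.Random(seed)
    curve = []
    for t in obs_times:
        mag = model_mag(t) + rng.gauss(0.0, sigma)
        if mag <= limiting_mag:   # brighter sources have numerically smaller magnitudes
            curve.append((t, mag))
    return curve


# Two hypothetical observing plans, in days: one visit per night,
# and a faster cadence with 7.5 minutes between revisits.
nightly = [float(d) for d in range(20)]
fast = [i * 7.5 / (60 * 24) for i in range(20)]
```

Swapping in the observing plan of another survey changes only `obs_times` and `limiting_mag`, which is why the same approach can produce training sets for any survey.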
Constructing Light Curves from subtractions ain’t easy

[Figure: magnitude vs. time for a source, showing the true magnitude, the reference magnitude (assumes the template doesn’t update), the 3σ limiting magnitude at each epoch, and a 5σ exclusion band. For each epoch: is the source detected in pos_sub? In neg_sub?]
Constructing Light Curves from subtractions ain’t easy

For some source at RA, Dec and time ti, determine the best ref_mag at t = ti, then:

•   Detection in the positive sub?
     •   yes: total mag = TM+ [detection]
     •   no: limit_mag fainter than ref_mag?
          •    no: total mag = limit_mag [upper limit]
          •    yes: detection in the negative sub?
               •    no: total mag = ref_mag [detection]
               •    yes: mag in negative sub < limit_mag - ref_mag?
                    •    yes: total mag = TM- [detection]
                    •    no: total mag = limit_mag [upper limit]

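\[
\begin{aligned}
TM_{+} &= -2.5 \log_{10}\!\left( f_{aper} \times 10^{-0.4\,(sub\_zp - ref\_zp)} + flux_{aper} \right) + ub1\_ref\_zp \\
TM_{-} &= -2.5 \log_{10}\!\left( -f_{aper} \times 10^{-0.4\,(sub\_zp - ref\_zp)} + flux_{aper} \right) + ub1\_ref\_zp
\end{aligned}
\]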
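The decision flow for one epoch can be sketched as a single function. The function names and argument layout are assumptions; the comparisons follow the slide literally, "fainter" is taken to mean a numerically larger magnitude, and `combined_mag` assumes the standard convention m = -2.5 log10(flux) + zeropoint.

```python
import math


def combined_mag(f_aper, flux_aper, sub_zp, ref_zp, ub1_ref_zp, sign=+1):
    """TM+ (sign=+1) or TM- (sign=-1): rescale the subtraction flux to the
    reference zeropoint, add the reference flux, and convert to a magnitude."""
    scaled = sign * f_aper * 10 ** (-0.4 * (sub_zp - ref_zp))
    return -2.5 * math.log10(scaled + flux_aper) + ub1_ref_zp


def total_mag(ref_mag, limit_mag, tm_plus=None, tm_minus=None, neg_sub_mag=None):
    """Return (magnitude, kind) for one epoch.

    tm_plus / tm_minus are the precomputed TM+ / TM- values, or None when the
    source is not detected in the positive / negative subtraction.
    """
    if tm_plus is not None:              # detection in the positive sub
        return tm_plus, "detection"
    if not limit_mag > ref_mag:          # limit is not fainter than the reference
        return limit_mag, "upper limit"
    if tm_minus is None:                 # nothing in the negative sub either
        return ref_mag, "detection"
    if neg_sub_mag < limit_mag - ref_mag:
        return tm_minus, "detection"
    return limit_mag, "upper limit"
```

For example, `total_mag(18.0, 20.0)` falls through to the reference magnitude: the image was deep enough to see the source, and neither subtraction shows a change.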
Classifiers
•   General Classifier
     •   Filter out: poorly subtracted sources

     •   Filter out: minor planets / rocks

     •   Filter out: long-time sampled (periodic & nonperiodic)

     •   Identify interesting sources near known galaxies

     •   Identify periodic variable science class when confidence is high


•   Timeseries Classifier
     •   Weighted combination of machine learning classifiers

     •   Astronomer crafted classifiers for specific science types

          •    Microlensing, supernova
General Classification

•      Three general classification groups.

•      Periodic variables are contained within the “uninteresting” group, although more specific sub-classifications are known.

[Diagram: a (Source) is sorted into one of three groups:
   Interesting with nearby galaxy context: SN, AGN of various quality classes
   Interesting without context information: nicely subtracted, non-galaxy, non-periodic variable classes
   Uninteresting: poor subtraction (JUNK class), Rock class, (general) periodic variable class]
General Classification

•     Applied to ~80 spectroscopically confirmed, user-classified (SN, AGN, galaxy) sources.

•     An SN lightcurve classifier is needed when galaxy context is not available, and to improve confidence in SN classification.

[Diagram: where the (Source) sample landed:
   Interesting with nearby galaxy context: SN, AGN, galaxy (58 SN)
   Interesting without context information
   Uninteresting: faint, poorly subtracted (11 SN)]
General Classifier: components & cuts
•   Crowd source modeled “RealBogus” metric
     • Cut on: average RealBogus, derivatives of RB components
     • Cut on: % epochs in source with good RealBogus
•   PSF statistics
     • Cuts on: PSF symmetry, eccentricity (averages)
•   Neighboring object comparisons
     • Cuts on significance of above metrics when compared to neighboring pixels
•   Minor Planet check
     • Does an epoch intersect a Minor Planet? (PyMPChecker, PyEphem)

•   Well sampled source
      •    Cuts on: well sampled periodic & nonperiodic sources
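A sketch of how cuts like these might be chained for one source. The field names, the thresholds, and the order of the checks are all assumptions made up for illustration; the slide does not give the actual values, and the neighbor-comparison and well-sampled cuts are left out.

```python
from statistics import mean


def general_cuts(epochs, min_avg_rb=0.3, good_rb=0.5, min_good_frac=0.5, max_ecc=0.6):
    """Label a source "junk", "rock", or "candidate" from its per-epoch metrics.

    epochs: list of dicts with hypothetical keys
        "realbogus"          RealBogus score of the epoch (higher = more real)
        "psf_eccentricity"   PSF eccentricity of the detection
        "near_minor_planet"  True if the epoch coincides with a known minor planet
    """
    rb = [e["realbogus"] for e in epochs]
    good_frac = sum(r >= good_rb for r in rb) / len(rb)

    # Cut on average RealBogus and on the fraction of epochs with a good score.
    if mean(rb) < min_avg_rb or good_frac < min_good_frac:
        return "junk"

    # Cut on averaged PSF shape.
    if mean(e["psf_eccentricity"] for e in epochs) > max_ecc:
        return "junk"

    # Minor planet check: any epoch matching a known object marks a rock.
    if any(e["near_minor_planet"] for e in epochs):
        return "rock"

    return "candidate"
```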
Evaluating and Combining Classifiers
The “Netflix Prize” was won using a combination of ~1000 different classifiers.

 •    Issues when using multiple classifiers:
        •   How to combine Classifiers using weights or tree-hierarchy

        •   How to generate final classification “probabilities” when using:

             •    Widely varying types of classifiers

             •    Each classifier may contain sub-classifications with their own class
                  probabilities.


 •    Evaluate the final combination of classifiers
        •   We classify PTF09xxx user classified sources

        •   We display success / failure cases for each general class


 •    Update classifier weights & cuts, try again.

 •    OR: Iteratively & algorithmically find best weights.
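One simple way to turn several classifiers' outputs into a single set of class "probabilities" is a normalized weighted sum, sketched below. The function name, the dict layout, and the example weights are assumptions; this is the flat weighting option, not the tree-hierarchy one.

```python
def combine(predictions, weights):
    """Weighted combination of per-classifier class probabilities.

    predictions: {classifier name: {class name: probability}}
    weights:     {classifier name: weight}; missing classifiers get weight 0
    Returns {class name: combined probability}, normalized to sum to 1.
    """
    totals = {}
    for name, probs in predictions.items():
        w = weights.get(name, 0.0)
        for cls, p in probs.items():
            totals[cls] = totals.get(cls, 0.0) + w * p
    norm = sum(totals.values())
    return {cls: v / norm for cls, v in totals.items()} if norm else totals


# Hypothetical example: a trusted classifier outweighs a weaker one.
predictions = {
    "context": {"SN": 0.8, "AGN": 0.2},
    "lightcurve": {"SN": 0.4, "AGN": 0.6},
}
combined = combine(predictions, {"context": 3.0, "lightcurve": 1.0})
best_class = max(combined, key=combined.get)
```

Tuning then amounts to adjusting the weights, re-classifying the user-classified sources, and checking the success and failure cases again.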
Periodic variable classifiers
•     Currently, science classes are determined by combining the weighted probabilities generated by different classification models, for a source.

•     Each machine-learned classification model is trained using “noisified” lightcurves which were generated using different parameters.

[Screenshots: clicking on a class for one of dozens of ML models shows the highest classification probability sources for that model::class. Panels:
   ~0.14 day period RR Lyrae using 20 epoch noisification
   ~0.4 day period RR Lyrae using 10 epoch noisification
   0.1 - 0.17 day period RR Lyrae using 15 epoch noisification
 Annotations: “Overplotting of period-folded model still needs work”; “period-fold plotting probably failed here”]
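For reference, period folding itself is a small computation: each observation time is mapped to a phase in [0, 1) for a trial period. The sketch below uses a hypothetical function name and an optional reference epoch; it is not the plotting code shown in the screenshots.

```python
def fold(times, mags, period, epoch=0.0):
    """Return (phase, mag) pairs sorted by phase, with phase in [0, 1).

    times and period share the same unit (e.g. days); epoch is the time
    assigned to phase 0.
    """
    folded = [(((t - epoch) / period) % 1.0, m) for t, m in zip(times, mags)]
    return sorted(folded)
```

If the trial period is wrong, the folded points scatter instead of tracing a single curve, which is one way a period-fold plot can fail.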
Continuing Work


•   Test, improve general classifier cuts

•   Push general classifications to Followup
    Marshal

•   Push specific variable science class
    identified sources to Followup Marshal

•   Explore other timeseries classifiers for
    periodic variable classification.
TCP Explorer

Caltech 20090903 Talk on T.C.P. for LSST/PTF workshop

  • 1. Interesting near galaxy sources • identified by TCP in the last 2 days • (last epoch observed 1 week ago) • Classification triggered by latest epoch added to the source
  • 3. Overview • TCP Software & Data Architecture • Classifiers & Cutting out “Junk” • Continuing work... PTF spectroscopically confirmed SN, subsequently classified by TCP as SN
  • 5. Parallelized source correlation and classification • Difference objects are retrieved from LBL • Each difference-object is passed to an IPython client • Each parallel IPython client performs: • Source creation or correlation with existing sources • “Feature” generation (or re-generation) for that source source • Classification of that source generation feature generation source classification
  • 6. Parallelized source correlation and classification • Realtime TCP runs on 22 dedicated cores • LCOGT’s 96 core beowulf • non run-time tasks • Classifier generation • Additional resources • To be used for future timeseries classification work source generation • Yahoo’s 4000 core Hadoop academic cluster • Amazon EC2 cluster feature generation source classification
  • 7. Warehouse of light-curves • Need representative light-curves for all science • With these we can model each science class • We’ve built a warehouse of example light-curves TCP-TUTOR DotAstro.org internal interface public interface
  • 8.
  • 9.
  • 10.
  • 11. Confusion Matrix different ways of quantifying effeciencies - using original good training set, and train/evaluate efficencies via folding - using “noisified”, simulated sources matching sur vey shedule, cadences, limits • C
  • 12. “Noisification” (resampling light-curves) • For PTF, the Noisification code references: • 1000s of PTF pointing and survey observing plans • This allows simulation of PTF cadenced light-curves • Occasionally PTF observes using a faster cadence: • 7.5 minutes between revisiting an RA, Dec • This requires a separate set of noisified light-curves and classifiers. • Other pointing and observing plans could be used. • This means we can easily generate noisified light-curves for any survey. • Thus we can generate science classifiers for any survey.
  • 13.
  • 14-18. Constructing Light Curves from subtractions ain’t easy [Figure, built up over five slides: magnitude vs. time, showing the true mag, the reference mag (assumes the template doesn’t update), the 3σ limiting mag, whether each epoch is detected in pos_sub or neg_sub, and a 5σ exclusion band]
  • 19. Constructing Light Curves from subtractions ain’t easy • For some source at RA, Dec and time t_i, determine the best ref_mag at t = t_i, then: • Detection in positive sub? yes: total mag = TM+ [detection]; no: go to the next test • limit_mag fainter than ref_mag? no: total mag = limit_mag [upper limit]; yes: go to the next test • Detection in the negative sub? no: total mag = ref_mag [detection]; yes: go to the next test • Mag in negative sub < limit_mag - ref_mag? yes: total mag = TM- [detection]; no: total mag = limit_mag [upper limit] • TM+ = -2.5 log10( f_aper × 10^(-0.4 (sub_zp - ref_zp)) + flux_aper ) + ub1_ref_zp • TM- = -2.5 log10( -f_aper × 10^(-0.4 (sub_zp - ref_zp)) + flux_aper ) + ub1_ref_zp
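The decision flow on this slide can be written as a small function. This is a sketch under stated assumptions rather than TCP's implementation: it assumes the standard convention m = zp - 2.5 log10(flux), reads f_aper as the aperture flux in the subtraction and flux_aper as the aperture flux in the reference, takes "fainter" to mean a numerically larger magnitude, and uses hypothetical function and argument names.

```python
import math

def total_mag(f_aper, flux_aper, sub_zp, ref_zp, ub1_ref_zp, sign=+1):
    """TM+ (sign=+1) or TM- (sign=-1): reference flux plus or minus the
    subtraction flux rescaled to the reference zero point."""
    scaled_sub_flux = f_aper * 10 ** (-0.4 * (sub_zp - ref_zp))
    return -2.5 * math.log10(sign * scaled_sub_flux + flux_aper) + ub1_ref_zp

def epoch_total_mag(ref_mag, limit_mag, pos_detected, neg_detected,
                    tm_plus=None, tm_minus=None, neg_sub_mag=None):
    """Return (total magnitude, kind) for one epoch of one source."""
    if pos_detected:
        return tm_plus, "detection"
    if limit_mag <= ref_mag:  # limit is not fainter than the reference
        return limit_mag, "upper limit"
    if not neg_detected:
        return ref_mag, "detection"
    if neg_sub_mag < limit_mag - ref_mag:  # test as written on the slide
        return tm_minus, "detection"
    return limit_mag, "upper limit"

# Made-up numbers: the source doubled in flux relative to the reference.
tm_plus = total_mag(f_aper=100.0, flux_aper=100.0,
                    sub_zp=25.0, ref_zp=25.0, ub1_ref_zp=25.0)
result = epoch_total_mag(ref_mag=20.0, limit_mag=21.0,
                         pos_detected=True, neg_detected=False,
                         tm_plus=tm_plus)
```

Note that TM- is undefined when the rescaled negative-subtraction flux is at least as large as the reference flux, since the logarithm's argument is then zero or negative; a real pipeline would need to guard that case.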
  • 20. Classifiers • General Classifier • Filter out: poorly subtracted sources • Filter out: minor planets / rocks • Filter out: long-time sampled (periodic & nonperiodic) • Identify interesting sources near known galaxies • Identify periodic variable science class when confidence is high • Timeseries Classifier • Weighted combination of machine learning classifiers • Astronomer-crafted classifiers for specific science types • Microlens, Supernova
  • 21. (Source) General Classification • Three general classification groups. • Periodic variables are contained within the “uninteresting” group, although more specific sub-classifications are known. [Diagram: Interesting with nearby galaxy context: SN, AGN of various quality classes • Uninteresting: JUNK class (poor subtraction), Rock class, (general) Periodic variable class • Interesting without context information: nicely subtracted, non-galaxy, non-periodic variable classes]
  • 22. (Source) General Classification • Applied to ~80 spectroscopically confirmed, user-classified (SN, AGN, galaxy) sources. • An SN lightcurve classifier is needed when galaxy context is not available, and to improve confidence in SN classification. [Diagram labels: Interesting with nearby galaxy context: SN, AGN, galaxy • Uninteresting: faint, poorly subtracted • Interesting without context information • Counts shown: 58 SN, 11 SN]
  • 23. General Classifier: components & cuts • Crowd-source modeled “RealBogus” metric • Cut on: average RealBogus, derivatives of RB components • Cut on: % of epochs in source with good RealBogus • PSF statistics • Cuts on: PSF symmetry, eccentricity (averages) • Neighboring object comparisons • Cuts on significance of the above metrics when compared to neighboring pixels • Minor planet check (PyEphem, PyMPChecker) • Does an epoch intersect a minor planet? • Well-sampled source • Cuts on: well-sampled periodic & nonperiodic sources
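A minimal sketch of how cuts like these might be chained for one source. The threshold values, argument names, and the choice of which statistics to combine are hypothetical; the slide names the metrics but not the numbers, and the neighbor and minor-planet checks are left out.

```python
def passes_general_cuts(realbogus_scores, psf_eccentricities,
                        rb_good=0.3, min_mean_rb=0.3,
                        min_good_fraction=0.5, max_mean_ecc=0.6):
    """Return True if a source survives the (hypothetical) junk cuts.

    realbogus_scores: one RealBogus value per epoch of the source.
    psf_eccentricities: one PSF eccentricity per epoch of the source.
    """
    if not realbogus_scores or not psf_eccentricities:
        return False  # nothing to evaluate, treat as junk
    n = len(realbogus_scores)
    mean_rb = sum(realbogus_scores) / n
    good_fraction = sum(1 for rb in realbogus_scores if rb >= rb_good) / n
    mean_ecc = sum(psf_eccentricities) / len(psf_eccentricities)
    return (mean_rb >= min_mean_rb
            and good_fraction >= min_good_fraction
            and mean_ecc <= max_mean_ecc)

clean_source = passes_general_cuts([0.8, 0.9, 0.7], [0.1, 0.2, 0.15])
junk_source = passes_general_cuts([0.05, 0.1, 0.9], [0.1, 0.1, 0.1])
```

The second source has an acceptable average RealBogus but only one good epoch in three, so it fails the per-epoch fraction cut; this is why the slide lists both an average cut and a percentage-of-epochs cut.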
  • 24. Evaluating and Combining Classifiers • The “Netflix Prize” was won using a combination of ~1000 different classifiers. • Issues when using multiple classifiers: • How to combine classifiers: using weights or a tree hierarchy • How to generate final classification “probabilities” when using: • Widely varying types of classifiers • Classifiers that may each contain sub-classifications with their own class probabilities • Evaluate the final combination of classifiers • We classify PTF09xxx user-classified sources • We display success / failure cases for each general class • Update classifier weights & cuts, try again. • OR: Iteratively & algorithmically find the best weights.
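One simple way to combine classifiers with weights is a weighted average of their class probabilities, renormalized so the result sums to one. This is a sketch of that generic technique, not necessarily the scheme TCP uses; the classifier names, weights, and probabilities below are made up.

```python
def combine_classifiers(predictions, weights):
    """Weighted average of per-classifier class probabilities, renormalized.

    predictions: {classifier name: {class name: probability}}
    weights:     {classifier name: non-negative weight}
    """
    combined = {}
    for name, class_probs in predictions.items():
        w = weights[name]
        for cls, p in class_probs.items():
            combined[cls] = combined.get(cls, 0.0) + w * p
    total = sum(combined.values())
    if total == 0:
        raise ValueError("all weighted probabilities are zero")
    return {cls: value / total for cls, value in combined.items()}

# Hypothetical outputs of two ML models and one astronomer-crafted classifier.
predictions = {
    "ml_model_a": {"SN": 0.6, "AGN": 0.3, "RR Lyrae": 0.1},
    "ml_model_b": {"SN": 0.2, "AGN": 0.2, "RR Lyrae": 0.6},
    "sn_crafted": {"SN": 0.9, "AGN": 0.1},
}
weights = {"ml_model_a": 1.0, "ml_model_b": 0.5, "sn_crafted": 2.0}
final = combine_classifiers(predictions, weights)
best = max(final, key=final.get)
```

A classifier that does not know a class (here `sn_crafted` has no "RR Lyrae" entry) simply contributes nothing to it, which is one way to handle classifiers with differing class lists. The weights themselves are what the slide proposes tuning by hand or algorithmically.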
  • 25. Periodic variable classifiers • Currently, science classes for a source are determined by combining the weighted probabilities generated by different classification models. • Each machine-learned classification model is trained using “noisified” lightcurves, which were generated using different parameters. • Clicking on a class for one of dozens of ML models shows the highest classification probability sources for that model::class. [Figure panels: ~0.4 day period RR Lyrae using 10 epoch noisification; ~0.14 day period RR Lyrae using 20 epoch noisification; 0.1 - 0.17 day period RR Lyrae using 15 epoch noisification. Annotations: overplotting of period-folded model still needs work; period-fold plotting probably failed here]
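The period-folded plots mentioned on this slide rely on mapping each observation time to a phase within the period. A minimal sketch of that step, with hypothetical function names and made-up observation times and magnitudes:

```python
def fold(times, period, t0=0.0):
    """Phase of each observation: fractional part of (t - t0) / period."""
    return [((t - t0) / period) % 1.0 for t in times]

def folded_light_curve(times, mags, period, t0=0.0):
    """Return (phase, mag) pairs sorted by phase, ready for plotting."""
    return sorted(zip(fold(times, period, t0), mags))

# Made-up epochs (days) of a source folded on a 0.4 day trial period.
times = [0.0, 0.5, 1.0, 1.3]
mags = [17.0, 17.3, 17.1, 17.3]
phases = fold(times, period=0.4)
curve = folded_light_curve(times, mags, period=0.4)
```

With the right period, epochs taken many cycles apart (here 0.5 and 1.3 days) land at the same phase and the folded curve traces a single cycle; with a wrong period the points scatter, which is one way a fold plot can fail.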
  • 26. Continuing Work • Test, improve general classifier cuts • Push general classifications to Followup Marshal • Push specific variable science class identified sources to Followup Marshal • Explore other timeseries classifiers for periodic variable classification.