MPI_MPROBE: It’s Good for You Jeff Squyres
Regular MPI_PROBE: checks to see if a matching message has arrived. Incoming message queue (along the time axis): (tag 3, source 14, comm. ID 32), (tag 9, source 94, comm. ID 17), (tag 9, source 67, comm. ID 17).
Regular MPI_PROBE: MPI_PROBE is looking for tag 9, ANY_SOURCE, comm. ID 17.
Regular MPI_PROBE: the probe matches the message with tag 9, source 94, comm. ID 17.
MPI_PROBE succeeded: now issue a receive to actually get the message, MPI_RECV(…, tag=9, src=94,comm=17, …). The message (tag 9, source 94, comm. ID 17) is removed from the incoming queue.
Race condition: …but what if another MPI thread issues the receive first? The other thread calls MPI_RECV(…, tag=9, src=ANY_SOURCE,comm=17) while the message (tag 9, source 94, comm. ID 17) is still in the incoming queue.
Race condition: in this case, your receive will end up unexpectedly blocking (!). Your MPI_RECV(…, tag=9, src=94,comm=17, …) is blocked waiting for a matching message; the queue now holds only (tag 3, source 14, comm. ID 32) and (tag 9, source 67, comm. ID 17).
Race condition: if / when the receive finally completes, it’s not the message you probed. MPI_RECV(…, tag=9, src=94,comm=17, …) matches a later message (tag 9, source 94, comm. ID 17) that arrived after (tag 9, source 67, comm. ID 17).
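The interleaving above can be replayed deterministically with a toy model of the incoming queue. This is plain C, not MPI: every `toy_*` name is hypothetical, array order stands in for arrival order (an assumption about the slides' time axis), and the two "threads" are just sequential calls made in the problematic order.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ANY_SOURCE (-1)

typedef struct {
    int tag, source, comm;
    int queued;   /* 1 while the message is still in the incoming queue */
} toy_msg;

/* Index of the first queued message matching (tag, source, comm), or -1. */
static int toy_find(const toy_msg *q, size_t n, int tag, int source, int comm)
{
    for (size_t i = 0; i < n; i++) {
        if (q[i].queued && q[i].tag == tag && q[i].comm == comm &&
            (source == TOY_ANY_SOURCE || q[i].source == source))
            return (int)i;
    }
    return -1;
}

/* Like MPI_PROBE: reports the match but leaves it in the queue. */
int toy_probe(const toy_msg *q, size_t n, int tag, int source, int comm)
{
    return toy_find(q, n, tag, source, comm);
}

/* Like MPI_RECV: removes the match. -1 means a real receive would block. */
int toy_recv(toy_msg *q, size_t n, int tag, int source, int comm)
{
    int i = toy_find(q, n, tag, source, comm);
    if (i >= 0)
        q[i].queued = 0;
    return i;
}

/* Replays the slides' interleaving. Returns 1 if thread A's receive
 * would block, 0 if it gets the message it probed. */
int toy_replay_race(void)
{
    toy_msg q[] = {
        { 3, 14, 32, 1 },
        { 9, 94, 17, 1 },
        { 9, 67, 17, 1 },
    };
    size_t n = sizeof q / sizeof q[0];

    int probed = toy_probe(q, n, 9, TOY_ANY_SOURCE, 17);  /* thread A probes */
    assert(probed == 1);                                   /* source 94 */

    int stolen = toy_recv(q, n, 9, TOY_ANY_SOURCE, 17);    /* thread B receives */
    assert(stolen == probed);                              /* same message */

    int got = toy_recv(q, n, 9, q[probed].source, 17);     /* thread A receives */
    return got < 0;
}
```

In this model `toy_replay_race()` returns 1: thread A's receive from source 94 finds nothing to match, which corresponds to the unexpected blocking on the slide.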
MPI_MPROBE: MPROBE = match + probe. Fixes this race condition. When a message is successfully probed, it is removed from the matching queue.
MPI_MPROBE: MPI_MPROBE is looking for tag 9, ANY_SOURCE, comm. ID 17. Incoming message queue (along the time axis): (tag 3, source 14, comm. ID 32), (tag 9, source 94, comm. ID 17), (tag 9, source 67, comm. ID 17).
MPI_MPROBE: when the match occurs, the message is removed from the incoming queue.
MPI_MPROBE: other probes / receives will no longer match this message, for example another thread's MPI_RECV(…, tag=9, src=ANY_SOURCE,comm=17).
MPI_MRECV: a “matched” receive is used to receive a message that was mprobed, MPI_MRECV(…, match_handle, …). Guarantees that you get exactly the message you mprobed.
Another useful case: probe to find the size of an incoming message. MPI_PROBE(…);
Another useful case: but malloc takes some time to complete. MPI_PROBE(…); buf = malloc(incoming_size);
Another useful case: the time malloc takes is a vulnerable race condition window. MPI_PROBE(…); buf = malloc(incoming_size);
Another useful case: the message could be stolen during that window! MPI_PROBE(…); buf = malloc(incoming_size); MPI_RECV(…)
Another useful case: delays between MPROBE and MRECV do not matter. MPI_MPROBE(…); buf = malloc(incoming_size); MPI_MRECV(…) Malloc takes some time.
Another useful case: delays between MPROBE and MRECV do not matter. MPI_MPROBE(…); buf = malloc(incoming_size); MPI_MRECV(…) Malloc takes some time, but the message cannot be stolen.
Summary: MPI_MPROBE eliminates the race condition between a probe and its corresponding receive. Good for: event-based applications; multi-threaded MPI applications; cases where message lengths are unknown (strings, serialized objects, etc.), e.g., bindings for Perl, Python, Boost.MPI.
