04 방응준
- 2. Trend
Multi-Core Processors Change the Rules
Quad Core
Dual Core
Single Core
www.intel.com/go/parallel
Copyright © 2009, Intel Corporation. All rights reserved. 2
*Other brands and names are the property of their respective owners
- 3. Paradigm Shift:
More Cores, Not a Faster Clock
• Power and thermal issues limit clock frequency
• Performance increases now come from parallelism
Multi-core Needs Parallel Applications
[Chart: PERFORMANCE over TIME, from the GHz Era to the Multi-core Era]
- 4. Non-parallel (serial) is old and slow.
Unnecessary bottlenecks.
- 5. Parallelism is the key to performance.
Unnecessary bottlenecks.
Parallel programming lets work proceed when ready.
- 6. Parallelism is the key to performance.
Dead end. Not the future.
The future is here.
We are ready.
- 7. Targeting the "Mass Market" of Parallelism
MAINSTREAM DEVELOPERS
- 8. Targeting the "Mass Market" of Parallelism
MAINSTREAM DEVELOPERS
- 9. The Mainstream Developers
- 10. The Mainstream Developers
- 11. Intel® Parallel Studio
Microsoft Visual Studio* plug-in for Parallelism
+
The Perfect Combination for Fast & Reliable Code
DESIGN
CODE & DEBUG
VERIFY
TUNE
- 12. New software tools drive adoption of multi-core
For Microsoft Visual Studio* C++ architects, developers, and software
innovators creating parallel Windows* applications.
• Advisor
• Composer
• Inspector
• Amplifier
- 13. Parallel Programming development lifecycle
DESIGN
Gain insight on where parallelism will most benefit existing source code
CODE & DEBUG
Develop effective applications with a C/C++ compiler and comprehensive threaded libraries
VERIFY
Ensure application reliability with proactive parallel memory and threading error checking
TUNE
Enhance applications with an easy-to-use performance analyzer and tuner
- 14. Intel® Parallel Advisor (Available in Q3/2010)
Gain insight on where parallelism will most benefit existing source code
DESIGN PHASE
• First and only threading advisor
• See where parallelism will most benefit Windows* apps
• Step-by-step threading guidance
• Make better design decisions
• Shorter learning curve for parallelism
- 15. Intel® Parallel Advisor Workflow
- 16. Mark Insert Annotate
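    ANNOTATE_SITE_BEGIN(site1);           // candidate parallel region (site)
    for (i = 0; i < N; i++) {
        ANNOTATE_TASK_BEGIN(task1);       // each iteration is a candidate task
        func1(i);
        ANNOTATE_LOCK_ACQUIRE(0);         // shared update that needs protection
        glob_variable++;
        ANNOTATE_LOCK_RELEASE(0);
        func2(i);
        ANNOTATE_TASK_END(task1);
    }
    ANNOTATE_SITE_END(site1);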
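As a rough illustration of what the annotations above model: the site is the loop, each iteration is a task, and the lock annotation guards the update of glob_variable. The following is a minimal sketch in standard C++ threads, not the Advisor or TBB API. The bodies of func1 and func2, the function name run_site, and the one-slice-per-thread split are assumptions made only for this example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-ins for func1/func2 from the slide. Each call touches
// only its own slot, so iterations are independent apart from the counter.
static void func1(std::vector<int>& data, int i) { data[i] = i; }
static void func2(std::vector<int>& data, int i) { data[i] *= 2; }

// Runs the annotated loop with real threads and returns the final value of
// the shared counter (glob_variable in the slide).
int run_site(std::vector<int>& data, unsigned num_threads) {
    assert(num_threads > 0);
    const int n = static_cast<int>(data.size());
    const int stride = static_cast<int>(num_threads);
    int glob_variable = 0;
    std::mutex glob_mutex;  // plays the role of ANNOTATE_LOCK_ACQUIRE/RELEASE(0)

    std::vector<std::thread> workers;
    for (int t = 0; t < stride; ++t) {
        workers.emplace_back([&, t] {
            // Each worker executes every stride-th iteration (task).
            for (int i = t; i < n; i += stride) {
                func1(data, i);
                {
                    std::lock_guard<std::mutex> guard(glob_mutex);
                    ++glob_variable;
                }
                func2(data, i);
            }
        });
    }
    for (std::thread& w : workers) w.join();  // end of the parallel site
    return glob_variable;
}
```

Calling run_site on a vector of 100 elements with 4 threads should return 100, because every iteration increments the counter exactly once under the lock.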
- 17. Views of Intel® Parallel Advisor
- 18. Intel® Parallel Composer
Develop effective applications with a C/C++ compiler and comprehensive threaded libraries
CODE & DEBUG PHASE
• Easier, faster parallelism for Windows* apps
• C/C++ compiler and advanced threaded libraries
• Built-in parallel debugger
• Supports OpenMP*
• Save time and increase productivity
• Code Coverage
- 19. Intel® Threading Building Blocks - today
• Extends C++ for parallelism
– Solves C++ challenges in multiple areas
– Portable to any C++ compiler, processor, O.S., already ported to a wide variety!
– Coordinated with Visual Studio® 2010’s PPL and Concurrency Runtime
• Open source project started by Intel
– http://threadingbuildingblocks.org
• Most used abstraction for parallelism
• Flattered
- 20. Generic Parallel Algorithms
All these make up TBB:
• Generic Parallel Algorithms: parallel_for(range); parallel_reduce; parallel_for_each(begin, end); parallel_do; parallel_invoke; pipeline; parallel_sort; parallel_scan
• Concurrent Containers: concurrent_hash_map; concurrent_queue; concurrent_bounded_queue; concurrent_vector
• Thread Local Storage: enumerable_thread_specific; combinable
• Task scheduler: task_group; structured_task_group; task_scheduler_init; task_scheduler_observer
• Synchronization Primitives: atomic; mutex; recursive_mutex; spin_mutex; spin_rw_mutex; queuing_mutex; queuing_rw_mutex; null_mutex; null_rw_mutex
• Miscellaneous: tick_count
• Threads: tbb_thread
• Memory Allocation: tbb_allocator; cache_aligned_allocator; scalable_allocator; zero_allocator
- 21. Classical parallel algorithm usage example
    #include "tbb/blocked_range.h"
    #include "tbb/parallel_for.h"
    #include "tbb/task_scheduler_init.h"
    using namespace tbb;

    // Foo and N are assumed to be defined elsewhere; the slide does not show them.

    // The ChangeArray class defines a for-loop body for parallel_for.
    class ChangeArray {
        int* array;
    public:
        ChangeArray(int* a) : array(a) {}
        // As usual with C++ function objects, the main work is done inside operator().
        // blocked_range is a TBB template representing a 1D iteration space.
        void operator()(const blocked_range<int>& r) const {
            for (int i = r.begin(); i != r.end(); i++) {
                Foo(array[i]);
            }
        }
    };

    void ChangeArrayParallel(int* a, int n)
    {
        // A call to the template function parallel_for<Range, Body>
        // with arguments Range = blocked_range and Body = ChangeArray.
        parallel_for(blocked_range<int>(0, n), ChangeArray(a), auto_partitioner());
    }

    int main() {
        task_scheduler_init init;
        int A[N];
        // initialize array here…
        ChangeArrayParallel(A, N);
        return 0;
    }
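To make the Range/Body split concrete, here is a minimal sketch of the same pattern in standard C++ only. It is a deliberately simplified stand-in, not TBB's implementation: TBB splits ranges recursively and schedules them on a task scheduler, whereas this sketch just cuts the range into fixed chunks and starts one thread per chunk. SimpleRange, simple_parallel_for, DoubleArray and double_array_parallel are hypothetical names, and doubling each element stands in for Foo.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>
#include <vector>

// Simplified stand-in for tbb::blocked_range<int>: a half-open range [first, last).
struct SimpleRange {
    int first;
    int last;
    int begin() const { return first; }
    int end() const { return last; }
};

// Simplified stand-in for tbb::parallel_for: cuts the range into at most
// num_chunks contiguous sub-ranges and runs body on each in its own thread.
template <typename Body>
void simple_parallel_for(const SimpleRange& range, const Body& body, int num_chunks) {
    assert(num_chunks > 0);
    const int total = range.end() - range.begin();
    if (total <= 0) return;
    const int chunk = (total + num_chunks - 1) / num_chunks;  // ceiling division
    std::vector<std::thread> workers;
    for (int start = range.begin(); start < range.end(); start += chunk) {
        const SimpleRange sub{start, std::min(start + chunk, range.end())};
        workers.emplace_back([&body, sub] { body(sub); });
    }
    for (std::thread& w : workers) w.join();  // body outlives all workers
}

// Loop body in the style of ChangeArray; doubling stands in for Foo.
class DoubleArray {
    int* array;
public:
    explicit DoubleArray(int* a) : array(a) {}
    void operator()(const SimpleRange& r) const {
        for (int i = r.begin(); i != r.end(); ++i) array[i] *= 2;
    }
};

void double_array_parallel(int* a, int n) {
    simple_parallel_for(SimpleRange{0, n}, DoubleArray(a), 4);
}
```

Because each sub-range is disjoint, the body needs no locking. That is the same property the ChangeArray example relies on.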
- 22. C++0x lambda functions support
The parallel_for example transforms into:

    #include "tbb/blocked_range.h"
    #include "tbb/parallel_for.h"
    using namespace tbb;

    void ChangeArrayParallel(int* a, int n)
    {
        // parallel_for has an overload that takes start, stop and step
        // arguments and constructs blocked_range internally.
        parallel_for(0, n, 1,
            // The lambda implements MyBody::operator() right inside the call
            // to parallel_for(). [=] captures variables by value from the
            // surrounding scope to completely mimic the non-lambda
            // implementation; [&] could be used to capture by reference.
            [=](int i) {
                Foo(a[i]);
            }
            /*, auto_partitioner*/);   // auto_partitioner is used by default
    }

    int main() {
        // Explicit task_scheduler_init creation is now optional.
        //task_scheduler_init init;
        int A[N];
        // initialize array here…
        ChangeArrayParallel(A, N);
        return 0;
    }
- 23. Intel® Threading Building Blocks (TBB) and Microsoft* Visual Studio* 2010 Parallel Patterns Library
Identical semantics shared for a core set of concurrent containers and algorithm classes
parallel_for(first,last,step,f)
parallel_for_each
parallel_invoke
task_handle
task_group_status
task_group
structured_task_group
is_current_task_group_cancelling
missing_wait
concurrent_vector*
concurrent_queue*
*These are based on Intel’s implementation used by Threading Building Blocks
- 24. Built-in Parallel Debugger Extension
- 25. Intel® Parallel Inspector
Ensure application reliability with proactive parallel memory and threading error checking
VERIFY PHASE
• Find threading errors faster
• Parallel memory and threading error checking
• Rapid analysis of code
• Help ensure Windows* application reliability
• Ship apps that run error-free
- 26. Memory Errors Analysis
• Memory Leaks
• Invalid Memory Accesses
• Invalid Partial Memory Accesses
• Mismatched Memory Allocation / Deallocation
• Missing Allocations
• Uninitialized Memory Accesses
• Uninitialized Partial Memory Accesses
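A small sketch of how three of the error classes above (leaks, mismatched allocation/deallocation, uninitialized accesses) typically arise and how ownership types avoid them. This is plain standard C++ for illustration and says nothing about how Inspector detects the errors; sum_of_squares is a hypothetical function invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>

// Error-prone pattern, shown only as a comment because it is undefined behaviour:
//     int* p = new int[n];
//     long s = p[0];       // uninitialized memory access
//     delete p;            // mismatched: new[] must be paired with delete[]
//     (or no delete at all, which is a memory leak)

// Safer version: the "()" value-initializes the buffer to zero, and
// std::unique_ptr<int[]> always releases it with delete[] when it goes out
// of scope, so there is no leak and no mismatch.
long sum_of_squares(std::size_t n) {
    std::unique_ptr<int[]> buffer(new int[n]());
    for (std::size_t i = 0; i < n; ++i) {
        buffer[i] = static_cast<int>(i * i);
    }
    return std::accumulate(buffer.get(), buffer.get() + n, 0L);
}
```

For n = 4 the buffer holds 0, 1, 4, 9, so the function returns 14.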
- 27. Intel® Parallel Inspector-Memory Errors
- 28. Intel® Parallel Inspector-Memory Errors
- 29. Threading Error Analysis
• Potential Threading Errors Detected
• Data Races
• Deadlock
• Potential Privacy Infringement
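The deadlock case can be illustrated with the classic two-lock example: two threads that each need both locks but acquire them in opposite order can block each other forever. A minimal sketch in standard C++, assuming a hypothetical Account type and transfer function, shows one common way to avoid it: std::lock acquires both mutexes using a deadlock-avoidance algorithm.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical type for the example: a balance protected by its own mutex.
struct Account {
    std::mutex m;
    long balance;
    explicit Account(long b) : balance(b) {}
};

// Locking from.m and then to.m one after the other can deadlock when two
// threads transfer in opposite directions. std::lock takes both mutexes
// without deadlocking; the guards then adopt them so they are released on return.
void transfer(Account& from, Account& to, long amount) {
    assert(&from != &to);
    std::lock(from.m, to.m);
    std::lock_guard<std::mutex> g1(from.m, std::adopt_lock);
    std::lock_guard<std::mutex> g2(to.m, std::adopt_lock);
    from.balance -= amount;
    to.balance += amount;
}

// Two threads transfer in opposite directions `rounds` times each.
// Returns the final difference a.balance - b.balance.
long run_transfers(int rounds) {
    Account a(1000), b(1000);
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 2); });
    t1.join();
    t2.join();
    return a.balance - b.balance;
}
```

Each round moves a net amount of 1 from b to a, so after `rounds` rounds the difference is 2 * rounds.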
- 30. Intel® Parallel Inspector-Threading Errors
Data Racing
- 31. Race Conditions
• Threads “race” against each other for resources
• Execution order is assumed but cannot be guaranteed
• Storage conflict is most common
• Concurrent access of same memory location by multiple threads
– At least one thread is writing
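The storage-conflict case above can be sketched with a shared counter. With a plain int, several threads incrementing the same location would be a data race, and updates could be lost. A minimal standard C++ sketch follows; the function name count_with_atomic is an assumption for this example. It makes each increment indivisible with std::atomic.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each of num_threads threads increments a shared counter per_thread times.
// std::atomic<int> makes every increment a single indivisible read-modify-write,
// so no update is lost regardless of how the threads interleave.
int count_with_atomic(int num_threads, int per_thread) {
    assert(num_threads >= 0 && per_thread >= 0);
    std::atomic<int> counter(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (std::thread& w : workers) w.join();  // join orders all increments before the read
    return counter.load();
}
```

The execution order of the increments is still not guaranteed, but the final total no longer depends on it.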
- 32. Intel® Parallel Amplifier
Quickly find bottlenecks and tune parallel applications for scalable multi-core performance
TUNE PHASE
• Quickly find bottlenecks
• Tune Windows* apps faster
• Optimize app performance
• Scale apps for multi-core
• Designed for parallel apps
• Performance Analysis
• Performance Scalability Analysis
• Locks & Waits Analysis
- 33. Where to parallelize…
- 34. Thank you!
Time for Questions now.
Naver Cafe: cafe.naver.com/intelsw
Twitter: IntelSDP