This document proposes a parallel region growing algorithm for image segmentation using GPU architecture. It summarizes that image segmentation partitions images into segments and is important for medical analysis. It describes GPUs and CUDA programming for parallel processing. The goal is to evaluate performance of serial vs parallel region growing algorithms on a GPU. The approach develops a parallel algorithm that assigns each pixel to a thread. Performance analysis on a brain MRI shows the parallel GPU implementation takes much less execution time than the serial CPU implementation. In conclusion, the parallel approach exploits GPU capabilities for fine-grained parallelism and improved performance.
A New Approach for Parallel Region Growing Algorithm in Image Segmentation using MATLAB on GPU Architecture
1. A New Approach for Parallel Region Growing Algorithm in Image Segmentation using MATLAB on GPU Architecture
KRISHNA KATRAGADDA
MANEESH BODDU
2. IMAGE SEGMENTATION
Image segmentation is the process of partitioning a digital image into multiple segments.
Segmentation plays a key role in image analysis.
It is widely used in the medical field to locate tumors and other pathologies.
4. GRAPHICS PROCESSING UNIT (GPU)
A GPU is a programmable processor specialized for display functions.
It renders images, animations and video for the computer's screen.
It has extremely high floating-point processing performance.
A GPU can accelerate the segmentation process.
A GPU performs mathematical calculations quickly and frees the CPU to do other work. Whereas a CPU has a few cores
optimized for sequential processing, a GPU has thousands of smaller cores designed to run many tasks at once.
5. CUDA
CUDA is a parallel computing platform and programming model developed by Nvidia for general-purpose computing on its own GPUs.
The CUDA model is a collection of threads running in parallel.
At the instruction level, 32 consecutive threads in a thread block make up the minimum unit of execution, which is called a warp.
Threads in a single block communicate through shared memory.
CUDA consists of a set of C language extensions and a runtime library that provides APIs to control the GPU.
Thus, the CUDA programming model allows programmers to better exploit the parallel power of the GPU for general-purpose computing.
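The warp grouping described on this slide comes down to simple integer arithmetic. The following is a minimal sketch in plain C, not CUDA code; the function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Warp size as stated on the slide: 32 consecutive threads. */
#define WARP_SIZE 32

/* Number of warps needed to hold threads_per_block threads.
   A partially filled warp still counts as a whole warp. */
int warps_per_block(int threads_per_block)
{
    assert(threads_per_block > 0);
    return (threads_per_block + WARP_SIZE - 1) / WARP_SIZE;
}

/* Warp that a thread belongs to, given its index inside the block. */
int warp_of_thread(int thread_in_block)
{
    return thread_in_block / WARP_SIZE;
}

/* Position (lane) of the thread inside its warp. */
int lane_of_thread(int thread_in_block)
{
    return thread_in_block % WARP_SIZE;
}
```

For example, a block of 256 threads is executed as 8 warps, and a block of 33 threads still needs 2 warps, the second of which is mostly idle.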
6. MATLAB
MATLAB (matrix laboratory) is a numerical computing environment.
It is also widely used for image processing tasks.
MATLAB also provides a way to run parallel programs by integrating with a parallel programming model.
Here we use CUDA as the programming model and attempt to integrate MATLAB with the CUDA environment.
7. GOAL
To evaluate and compare the performance of a serial and a parallel region growing
segmentation algorithm, where the parallel version takes advantage of the highly parallel architecture of the GPU.
This work presents a serial and a parallel implementation of a region growing algorithm for
GPUs.
The results suggest that parallel processing improves performance compared with
serial processing.
8. APPROACH
We propose a different parallelization scheme that takes advantage of the highly
parallel architecture of the GPU:
Each pixel is processed by a different thread.
Two new attributes are used to calculate spatial heterogeneity, to maximize
computational efficiency.
The algorithm is implemented in C and CUDA.
9. GPU ARCHITECTURE
GPUs are parallel processors that support fine-grained threads.
Each multiprocessor contains processor cores, a multi-threaded instruction unit, a number of registers
and shared memory.
CUDA is a C-based development environment for GPUs.
Threads are organized into thread blocks and executed in groups called warps.
11. REGION GROWING ALGORITHM
The image matrix is converted to the range 0 to 1 rather than 0 to 255.
We then calculate the region mean of the image.
We calculate the homogeneity among pixels and form segments (region by region).
Homogeneity is decided against the given threshold value.
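The three steps on this slide can be sketched in C as follows. The slides do not give the exact homogeneity formula, so this sketch assumes a common choice: a pixel is homogeneous with a region when the absolute difference between its intensity and the region mean is at most the threshold. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Rescale 8-bit intensities (0 to 255) to floats in the range 0 to 1. */
void normalize_image(const unsigned char *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (float)in[i] / 255.0f;
}

/* Assumed criterion: |pixel - region_mean| <= threshold. */
int is_homogeneous(float pixel, float region_mean, float threshold)
{
    assert(threshold >= 0.0f);
    float d = pixel - region_mean;
    if (d < 0.0f)
        d = -d;
    return d <= threshold;
}

/* Region mean after adding one pixel to a region that already
   holds `count` pixels with mean `mean`. */
float update_region_mean(float mean, size_t count, float pixel)
{
    return (mean * (float)count + pixel) / (float)(count + 1);
}
```

Working in the 0 to 1 range means the threshold can be stated as a fraction of the full intensity range, independent of the bit depth of the input image.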
12. Homogeneity between pixels
The initial seed point is compared with its neighboring pixels.
Pixels that are homogeneous are merged into segments.
A list of the dissimilar pixels, which are candidates for forming a different region, is created.
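A serial version of this seed-and-neighbor comparison might look like the sketch below. It is an illustration under stated assumptions, not the authors' code: it uses 4-connected neighbors, a running region mean, and the absolute-difference criterion, none of which the slides specify. It grows a single region and omits the list of dissimilar candidate pixels. The name `region_grow` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Grow one region from (seed_x, seed_y) over a row-major image of
   floats in 0 to 1. label[i] becomes 1 for region pixels, 0 otherwise.
   Returns the region size, or -1 if memory allocation fails. */
int region_grow(const float *img, int width, int height,
                int seed_x, int seed_y, float threshold,
                unsigned char *label)
{
    assert(width > 0 && height > 0);
    assert(seed_x >= 0 && seed_x < width && seed_y >= 0 && seed_y < height);

    int n = width * height;
    int *queue = malloc((size_t)n * sizeof *queue);
    if (!queue)
        return -1;
    for (int i = 0; i < n; ++i)
        label[i] = 0;

    int head = 0, tail = 0, count = 1;
    int seed = seed_y * width + seed_x;
    float mean = img[seed];
    label[seed] = 1;
    queue[tail++] = seed;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};

    while (head < tail) {
        int p = queue[head++];
        int px = p % width, py = p / width;
        for (int k = 0; k < 4; ++k) {
            int nx = px + dx[k], ny = py + dy[k];
            if (nx < 0 || nx >= width || ny < 0 || ny >= height)
                continue;
            int q = ny * width + nx;
            if (label[q])
                continue;
            float d = img[q] - mean;
            if (d < 0.0f)
                d = -d;
            if (d <= threshold) {
                /* Homogeneous neighbor: merge it and update the mean. */
                label[q] = 1;
                mean = (mean * (float)count + img[q]) / (float)(count + 1);
                ++count;
                queue[tail++] = q;
            }
        }
    }
    free(queue);
    return count;
}
```

Each pixel enters the queue at most once, so a queue of `width * height` entries is always large enough.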
13. PARALLEL ALGORITHM
The parallel algorithm assigns the looping part of the serial code to multiple threads that work independently.
To achieve parallel processing, we attempt to integrate MATLAB with CUDA on GPUs.
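Replacing a loop over pixels with independent threads requires deciding how many thread blocks to launch and which pixel each thread handles. The sketch below shows that arithmetic in plain C, assuming a one-dimensional launch; the index formula mirrors the usual CUDA expression `blockIdx.x * blockDim.x + threadIdx.x`. The function names are hypothetical.

```c
#include <assert.h>

/* Number of thread blocks needed so that every pixel gets its own thread. */
int blocks_for_pixels(int num_pixels, int threads_per_block)
{
    assert(num_pixels > 0 && threads_per_block > 0);
    return (num_pixels + threads_per_block - 1) / threads_per_block;
}

/* Flat pixel index handled by a given thread.
   Returns -1 for surplus threads in the last block, which must do nothing. */
int pixel_for_thread(int block_idx, int block_dim, int thread_idx, int num_pixels)
{
    int i = block_idx * block_dim + thread_idx;
    return i < num_pixels ? i : -1;
}
```

With 1000 pixels and 256 threads per block, 4 blocks are launched and the last 24 threads of the fourth block have no pixel to process, which is why a bounds check is needed.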
14. Steps:
As in the serial implementation, the matrix range is converted to 0 to 1 rather than 0 to 255.
Define the GPU kernel launch with <<<number of blocks, threads per block, dynamic shared memory per block, associated stream>>> using CUDA.
Calculate the region mean.
The GPU creates two shared variables to store the region mean and the homogeneity of pixels.
Pixels are segmented based on homogeneity and the results are stored in shared memory.
Finally, the segmented image is displayed using MATLAB functions.
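One way to picture the per-pixel work in these steps is sketched below in plain C. This is an assumption-laden illustration, not the authors' kernel: `pixel_step` stands in for what a single GPU thread would do, and the inner loop in `grow_by_passes` stands in for the kernel launch. The sketch assumes 4-connected neighbors, a region mean recomputed between passes, and separate input and output label buffers so that the per-pixel calls stay independent of each other. Both function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Work for one pixel (one thread in the parallel scheme).
   Returns the new label of pixel i: 1 if it is in the region. */
int pixel_step(const float *img, const unsigned char *label_in,
               int width, int height, int i, float mean, float threshold)
{
    if (label_in[i])
        return 1;                       /* already in the region */
    int x = i % width, y = i / width;
    int touches = (x > 0 && label_in[i - 1]) ||
                  (x < width - 1 && label_in[i + 1]) ||
                  (y > 0 && label_in[i - width]) ||
                  (y < height - 1 && label_in[i + width]);
    float d = img[i] - mean;
    if (d < 0.0f)
        d = -d;
    return touches && d <= threshold;
}

/* Host-side driver. On entry, label holds 1 at the seed and 0 elsewhere;
   scratch is a work buffer of the same size. Returns the region size. */
int grow_by_passes(const float *img, unsigned char *label, unsigned char *scratch,
                   int width, int height, float threshold)
{
    assert(width > 0 && height > 0);
    int n = width * height;
    int changed = 1, count = 0;
    while (changed) {
        /* Recompute the region mean from the current labels. */
        float sum = 0.0f;
        count = 0;
        for (int i = 0; i < n; ++i)
            if (label[i]) { sum += img[i]; ++count; }
        if (count == 0)
            return 0;
        float mean = sum / (float)count;

        /* On a GPU, each iteration of this loop would be a separate thread. */
        changed = 0;
        for (int i = 0; i < n; ++i) {
            scratch[i] = (unsigned char)pixel_step(img, label, width, height,
                                                   i, mean, threshold);
            if (scratch[i] != label[i])
                changed = 1;
        }
        memcpy(label, scratch, (size_t)n);
    }
    return count;
}
```

Because labels only ever change from 0 to 1, the passes are guaranteed to stop after at most one pass per pixel, and usually far sooner.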
16. The graph above shows the performance of the CPU and the GPU in terms of
execution time for different choices of seed point for image segmentation.
The GPU takes far less execution time than the CPU.
17. CONCLUSION
The parallel algorithm essentially assigns a particular thread to each image pixel so as to
exploit the GPU support of fine-grain threads and the large number of processing elements
available.
It should also be noted that these performance gains can be obtained with low
investment in hardware, as GPUs with increasing processing power are currently
available on the market at declining prices.
Results could be improved further by using a newer NVIDIA graphics card and by selecting
seed points automatically rather than manually.