I'm already parallelized
Team members: Jiahua Huang (jiahuah), Xinna Guo (xinnag). Webpage: https://flyingsheldon.github.io/ParallelProject/
We are going to implement several image processing algorithms, including sharpening and highlight/shadow adjustment, using Halide and CUDA on the GPU.
Sharpening increases the contrast between pixels and enhances line structure and other details in the image. A naive approach introduces "halos" around edges, so we plan to use a bilateral/trilateral filter or other recent techniques.
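To make the halo problem concrete, here is a minimal C++ sketch of naive sharpening. It assumes a single-channel float image and the common 3x3 kernel `[0 -1 0; -1 5 -1; 0 -1 0]`; the `Image` struct and the `sharpen_naive` name are illustrative and not taken from the project code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grayscale image, row-major, values nominally in [0, 1].
struct Image {
    int width;
    int height;
    std::vector<float> pixels;

    // Read a pixel, clamping the coordinates to the image border.
    float at(int x, int y) const {
        x = std::min(std::max(x, 0), width - 1);
        y = std::min(std::max(y, 0), height - 1);
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Naive sharpening with the 3x3 kernel [0 -1 0; -1 5 -1; 0 -1 0].
// The output is deliberately not clamped so that overshoot stays visible.
Image sharpen_naive(const Image& in) {
    assert(in.width > 0 && in.height > 0);
    Image out{in.width, in.height, std::vector<float>(in.pixels.size())};
    for (int y = 0; y < in.height; ++y) {
        for (int x = 0; x < in.width; ++x) {
            float v = 5.0f * in.at(x, y)
                    - in.at(x - 1, y) - in.at(x + 1, y)
                    - in.at(x, y - 1) - in.at(x, y + 1);
            out.pixels[static_cast<std::size_t>(y) * in.width + x] = v;
        }
    }
    return out;
}
```

On a vertical step edge from 0.25 to 0.75, flat regions are left unchanged, but the two columns next to the edge come out as -0.25 and 1.25. That undershoot and overshoot is the halo that an edge-aware filter is meant to suppress.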
For highlight and shadow adjustment, the target areas must first be identified so that the corresponding operation can be applied to them. The filter weight for each target area is then determined by its neighbouring pixels.
Both algorithms should benefit from parallel execution, since most of the work is repeated across many pixels.
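One possible reading of this two-step scheme, sketched in C++ for the shadow case only. Everything specific here is an assumption made for illustration: the luminance threshold, the 3x3 neighbourhood, the "fraction of dark neighbours" weight, and the names `shadow_weights` and `lift_shadows`. The project may well end up using a different rule.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed weighting rule: the weight of a pixel is the fraction of its 3x3
// neighbourhood (clamped at the border) whose luminance is below `threshold`.
std::vector<float> shadow_weights(const std::vector<float>& lum,
                                  int width, int height, float threshold) {
    assert(width > 0 && height > 0);
    assert(lum.size() == static_cast<std::size_t>(width) * height);
    std::vector<float> weights(lum.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int dark = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    int nx = std::min(std::max(x + dx, 0), width - 1);
                    int ny = std::min(std::max(y + dy, 0), height - 1);
                    if (lum[static_cast<std::size_t>(ny) * width + nx] < threshold) ++dark;
                }
            }
            weights[static_cast<std::size_t>(y) * width + x] = dark / 9.0f;
        }
    }
    return weights;
}

// Assumed adjustment: move each pixel toward white in proportion to its weight.
std::vector<float> lift_shadows(const std::vector<float>& lum,
                                const std::vector<float>& weights, float amount) {
    assert(lum.size() == weights.size());
    std::vector<float> out(lum.size());
    for (std::size_t i = 0; i < lum.size(); ++i) {
        out[i] = lum[i] + amount * weights[i] * (1.0f - lum[i]);
    }
    return out;
}
```

Because the weight comes from a neighbourhood rather than a hard per-pixel threshold, the adjustment fades out gradually near the boundary of a shadow region instead of switching on and off abruptly.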
Decomposition and communication for sharpening: Decomposing the problem is a challenge. We also need to find a good way to assign work to threads so that communication between threads, which arises from each pixel's dependency on its neighbours, is minimized.
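A common way to handle the neighbour dependency is to split the image into tiles and let each worker read its tile plus a border of `radius` pixels, so that no data has to be exchanged while the filter runs. The sketch below only computes the tile geometry; `Rect`, `Tile`, `make_tiles`, and the parameter names are hypothetical, and the tile size that works best on a given GPU is not something this sketch can answer.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Half-open rectangle [x0, x1) x [y0, y1) in pixel coordinates.
struct Rect {
    int x0, y0, x1, y1;
};

struct Tile {
    Rect output;  // pixels this tile is responsible for writing
    Rect input;   // pixels it must read: output grown by the filter radius
};

// Split a width x height image into tiles of at most tile_size x tile_size
// output pixels. The input region is clipped to the image bounds.
std::vector<Tile> make_tiles(int width, int height, int tile_size, int radius) {
    assert(width > 0 && height > 0 && tile_size > 0 && radius >= 0);
    std::vector<Tile> tiles;
    for (int y = 0; y < height; y += tile_size) {
        for (int x = 0; x < width; x += tile_size) {
            Rect out{x, y,
                     std::min(x + tile_size, width),
                     std::min(y + tile_size, height)};
            Rect in{std::max(out.x0 - radius, 0),
                    std::max(out.y0 - radius, 0),
                    std::min(out.x1 + radius, width),
                    std::min(out.y1 + radius, height)};
            tiles.push_back(Tile{out, in});
        }
    }
    return tiles;
}
```

The overlap between neighbouring input regions is the cost of this scheme: larger tiles read proportionally fewer redundant border pixels but need more fast memory per worker.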
Pixel dependency for highlight & shadow adjustment: In the first step of highlight & shadow adjustment, we need to determine which category each pixel belongs to. This requires the pixel distribution of the whole image. This statistical stage makes the program hard to parallelize because the classification of one pixel depends on all the other pixels. Contention will also occur if every thread tries to add its result to the shared address space at the same time.
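One standard way to reduce this contention is privatization: each worker counts into its own histogram over a chunk of the image, and the partial histograms are merged at the end. The C++ sketch below assumes 8-bit luminance values and runs the workers one after another to stay portable; since each iteration writes only to its own slot, the iterations could be handed to separate threads or CUDA blocks. The names `partial_histogram` and `histogram_privatized` are illustrative.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Histogram = std::array<std::uint32_t, 256>;

// Count the 8-bit values in pixels[begin, end) into a private histogram.
Histogram partial_histogram(const std::vector<std::uint8_t>& pixels,
                            std::size_t begin, std::size_t end) {
    assert(begin <= end && end <= pixels.size());
    Histogram h{};  // zero-initialized
    for (std::size_t i = begin; i < end; ++i) {
        ++h[pixels[i]];
    }
    return h;
}

// Build the full histogram from num_workers private histograms.
Histogram histogram_privatized(const std::vector<std::uint8_t>& pixels,
                               std::size_t num_workers) {
    assert(num_workers > 0);
    std::vector<Histogram> partials(num_workers);
    const std::size_t chunk = (pixels.size() + num_workers - 1) / num_workers;

    // Each iteration writes only to partials[w], so there is no shared
    // counter to contend for. Run sequentially here for simplicity.
    for (std::size_t w = 0; w < num_workers; ++w) {
        std::size_t begin = std::min(w * chunk, pixels.size());
        std::size_t end = std::min(begin + chunk, pixels.size());
        partials[w] = partial_histogram(pixels, begin, end);
    }

    // Merge step: the only place where results from different workers meet.
    Histogram total{};
    for (const Histogram& p : partials) {
        for (std::size_t b = 0; b < total.size(); ++b) {
            total[b] += p[b];
        }
    }
    return total;
}
```

The merge costs 256 additions per worker regardless of image size, so for large images it is small compared with the counting pass.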
Halide domain specific language: Halide is a programming language designed to make image and array processing easier and faster. Neither team member has experience with Halide, so we will need to learn it from scratch.
The basic deliverables are a working CUDA implementation and a working Halide implementation of image sharpening. We will also analyse how the CUDA and Halide implementations compare.
During implementation, we noticed that the user interface is not friendly, so we plan to build a GUI for this project.
If we get ahead of schedule, we plan to tackle highlight/shadow adjustment, which will take considerable time because we need to figure out how to parallelize the first step of categorizing the pixels.
For the programming language and tools, we plan to use both Halide and CUDA. One programmer will implement the algorithms in C++ and then manually parallelize them using CUDA, while the other will implement them in Halide.
GPUs are designed for image-processing workloads, and Halide is designed to make it easier to write high-performance image and array processing code on modern machines. The advantages of domain-specific languages will be analyzed as well.
Week | Time | Work |
---|---|---|
Week 1 | 11/4 - 11/11 | Waiting for feedback from instructors. |
Week 2 | 11/11 - 11/18 | Research sharpening algorithms. Set up workspace and environment. |
Week 3-1 | 11/18 - 11/21 | Complete a serial version of image sharpening in C++. Finish the milestone report. |
Week 3-2 | 11/21 - 11/25 | Start working on Halide (Xinna) and CUDA (Jiahua) implementations. |
Week 4-1 | 11/25 - 11/28 | Finish naive Halide (Xinna) and CUDA (Jiahua) implementations. |
Week 4-2 | 11/28 - 12/2 | Try to optimize Halide (Xinna) and CUDA (Jiahua) implementations. |
Week 5-1 | 12/2 - 12/5 | Finish optimizing Halide (Xinna) and CUDA (Jiahua) implementations. |
Week 5-2 | 12/5 - 12/9 | Report the performance gain and analyze the difference between C++ and the domain-specific language. If time permits, explore highlight/shadow adjustment. |
The project milestone report can be found here.
The project final report can be found here.