Workplace: Department of ISE, Advanced Computing Research, Nitte Meenakshi Institute of Technology, Bangalore-560064, India
E-mail: chandrashekar.bn@nmit.ac.in
Website:
Research Interests: Computer Systems and Computational Processes, Computer Architecture and Organization, Distributed Computing, Parallel Computing
Biography
B. N. Chandrashekhar is an associate professor at Nitte Meenakshi Institute of Technology. He received the B.E. degree in computer science and engineering from Visvesvaraya Technological University, India, in 2004 and the M.Tech degree in computer science and engineering from the same university in 2010. He is currently pursuing a PhD at the Advanced Computing Research Center, Department of Information Science and Engineering, Nitte Meenakshi Institute of Technology, Bangalore. His research interests include hybrid (CPU-GPU) computing, parallel and distributed systems, and performance modeling of parallel HPC applications. He has published papers in peer-reviewed journals and conference proceedings.
By Chandrashekhar B. N., Sanjay H. A.
DOI: https://doi.org/10.5815/ijigsp.2019.08.03, Pub. Date: 8 Aug. 2019
In scientific fields, solving large and complex computational problems using central processing units (CPUs) alone is not enough to meet the computational requirements. In this work we consider a homogeneous cluster in which each node has a CPU and a graphics processing unit (GPU) of the same capability. Normally the CPU is used only to control the GPU and to transfer data from the CPU to the GPU. Here we combine the computational power of the CPU with that of the GPU to execute high-performance computing (HPC) applications. The framework adopts a pinned-memory technique to overcome the overhead of data transfer between CPU and GPU. To exploit the homogeneous platform, we adopt a hybrid programming model combining the Message Passing Interface (MPI), OpenMP (Open Multi-Processing), and the Compute Unified Device Architecture (CUDA). The key challenge on the homogeneous platform is the allocation of workload among CPU and GPU cores. To address this challenge we propose a novel analytical workload-division strategy that predicts an effective division of work between the CPU and the GPU. Using our hybrid programming model and workload-division strategy, we observe average performance improvements of 76.06% and 84.11% in giga floating-point operations per second (GFLOPS) on NVIDIA Tesla M2075 and NVIDIA Quadro K2000 nodes of a cluster, respectively, for N-dynamic vector addition when compared with the performance models of Simplice Donfack et al. [5]. Also, using the pinned-memory technique with the hybrid programming model, average performance improvements of 33.83% and 39.00% are observed on the NVIDIA Tesla M2075 and NVIDIA Quadro K2000, respectively, for the SAXPY application when compared with the pageable-memory technique.
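The pinned-memory technique referred to in the abstract can be illustrated with a minimal CUDA sketch (not code from the paper; the kernel, array size, and launch parameters below are illustrative assumptions). Host buffers allocated with cudaMallocHost are page-locked, so cudaMemcpy transfers between host and device avoid the extra staging copy incurred by ordinary pageable allocations, which is the source of the transfer-overhead reduction reported for SAXPY.

#include <cuda_runtime.h>
#include <stdio.h>

/* Illustrative SAXPY kernel: y = a*x + y. */
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main(void) {
    const int n = 1 << 20;               /* assumed problem size */
    size_t bytes = n * sizeof(float);

    /* Pinned (page-locked) host buffers: the GPU can DMA directly from
       them, avoiding the intermediate copy needed for pageable memory. */
    float *h_x, *h_y;
    cudaMallocHost((void **)&h_x, bytes);
    cudaMallocHost((void **)&h_y, bytes);
    for (int i = 0; i < n; ++i) { h_x[i] = 1.0f; h_y[i] = 2.0f; }

    /* Device buffers and host-to-device transfers. */
    float *d_x, *d_y;
    cudaMalloc((void **)&d_x, bytes);
    cudaMalloc((void **)&d_y, bytes);
    cudaMemcpy(d_x, h_x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, h_y, bytes, cudaMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);
    cudaMemcpy(h_y, d_y, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", h_y[0]);       /* expect 4.0 */

    cudaFree(d_x);
    cudaFree(d_y);
    cudaFreeHost(h_x);
    cudaFreeHost(h_y);
    return 0;
}

Replacing cudaMallocHost/cudaFreeHost with malloc/free gives the pageable-memory baseline against which the pinned-memory speedups are measured.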