• GHand: A GPU algorithm for realtime hand pose estimation using depth camera
    To appear in Computer Graphics Forum (Eurographics 2015)

  • Estimate Hand Poses Efficiently from Single Depth Images
    To appear in International Journal of Computer Vision (IJCV)

  • Hand Pose Estimation Demo Booth
    Best Booth Award, A*STAR Scientific Conference (ASC) 2014

  • Efficient hand pose estimation from single depth images
    X-periment!, Singapore Science Festival, 2014
    Poster 1 • Poster 2

  • Real-time hand pose estimation from depth camera using GPU
    GPU Technology Conference 2014 (South East Asia)

  • Delaunay mesh generation using the GPU
    Merit Award, NVIDIA Poster Contest,
    GPU Technology Conference 2014 (South East Asia)

  • A GPU accelerated algorithm for 3D Delaunay triangulation
    ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (I3D), 2014

  • Instant GLEW
    Packt Publishing, 2013

  • gHull: A GPU algorithm for 3D Convex Hull
    ACM Transactions on Mathematical Software (TOMS), 2013

  • Delaunay triangulation in R³ on the GPU
    PhD Thesis, National University of Singapore, 2012
    Thesis • Code [1, 2] • BibTeX


Source code from my research projects and my PhD thesis is available on GitHub.


The gStar4D algorithm computes the 3D Delaunay triangulation on the GPU.

The gStar4D algorithm uses neighbourhood information in the 3D digital Voronoi diagram as an approximation of the 3D Delaunay triangulation. From this approximation it creates, in a massively parallel fashion, the star of each input point lifted to 4D, and then applies a unique star splaying approach to splay these 4D stars in parallel and make them consistent. The result is the 3D Delaunay triangulation of the input, constructed entirely on the GPU.
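
The lifting to 4D mentioned above rests on a standard fact: a point lies inside the circumsphere of a tetrahedron exactly when its lifted image (x, y, z, x² + y² + z²) lies below the hyperplane through the four lifted vertices. The sketch below is only a sequential illustration of that test in Python, not the gStar4D code; the function names are hypothetical and integer coordinates are assumed so that the arithmetic stays exact.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for 3x3 and 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def lift(p):
    """Lift a 3D point onto the paraboloid w = x^2 + y^2 + z^2 in 4D."""
    x, y, z = p
    return (x, y, z, x * x + y * y + z * z)


def orient3d(a, b, c, d):
    """Signed volume test for tetrahedron abcd (zero if the points are coplanar)."""
    return det([[p[i] - d[i] for i in range(3)] for p in (a, b, c)])


def orient4d(a, b, c, d, e):
    """Side of the 4D point e relative to the hyperplane through a, b, c, d."""
    return det([[p[i] - e[i] for i in range(4)] for p in (a, b, c, d)])


def in_circumsphere(a, b, c, d, e):
    """True if e lies strictly inside the circumsphere of tetrahedron abcd.

    Multiplying by orient3d makes the answer independent of vertex order.
    Returns False for a degenerate (flat) tetrahedron.
    """
    side = orient4d(lift(a), lift(b), lift(c), lift(d), lift(e))
    return orient3d(a, b, c, d) * side > 0


# Tetrahedron whose circumsphere has centre (1, 1, 1) and squared radius 3.
tet = ((0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2))
inside = in_circumsphere(*tet, (1, 1, 1))   # the centre itself
outside = in_circumsphere(*tet, (3, 3, 3))  # squared distance 12 from the centre
on_sphere = in_circumsphere(*tet, (2, 2, 0))
```

A GPU implementation would evaluate such predicates for many stars at once and would need robust arithmetic for floating-point input; neither aspect is shown here.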

Our CUDA implementation of gStar4D is robust and achieves a speedup of up to 5 times over the 3D Delaunay triangulator of CGAL.


The gDel3D algorithm constructs the Delaunay Triangulation of a set of points in 3D using the GPU.

The algorithm combines incremental insertion, flipping and star splaying. The code is written using NVIDIA's CUDA programming model.


The gReg3D algorithm computes the 3D regular (weighted Delaunay) triangulation on the GPU.

The gReg3D algorithm extends the star splaying concepts of the gStar4D and gDel3D algorithms to construct the 3D regular (weighted Delaunay) triangulation on the GPU. It allows stars to die, finds their death certificates, and propagates this information efficiently to other stars. The result is the 3D regular triangulation of the input, computed entirely on the GPU.
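
For readers unfamiliar with regular triangulations, the following Python sketch shows the weighted form of the lifting test: a point p with weight w is lifted to (x, y, z, |p|² - w), and it conflicts with a tetrahedron when its lifted image lies below the hyperplane through the tetrahedron's lifted vertices. Unlike the unweighted case, a weighted point may end up absent from the final triangulation, which is plausibly the situation in which a star dies. This is a sequential illustration with hypothetical names and integer input, not the gReg3D implementation.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for 3x3 and 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, v in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * v * det(minor)
    return total


def weighted_lift(p, w):
    """Lift a weighted 3D point to 4D: the weight lowers the fourth coordinate."""
    x, y, z = p
    return (x, y, z, x * x + y * y + z * z - w)


def conflicts(tet, e):
    """True if weighted point e violates the regularity of tetrahedron tet.

    tet is a sequence of four (point, weight) pairs and e is one such pair.
    Returns False for a degenerate (flat) tetrahedron.
    """
    pts = [p for p, _ in tet]
    # 3D orientation of the tetrahedron, used to normalise the sign below.
    orient = det([[pts[k][i] - pts[3][i] for i in range(3)] for k in range(3)])
    le = weighted_lift(*e)
    lifted = [weighted_lift(p, w) for p, w in tet]
    side = det([[q[i] - le[i] for i in range(4)] for q in lifted])
    return orient * side > 0


# Zero-weight tetrahedron: its orthosphere is the circumsphere,
# centre (1, 1, 1), squared radius 3.
tet = [((0, 0, 0), 0), ((2, 0, 0), 0), ((0, 2, 0), 0), ((0, 0, 2), 0)]
centre_unweighted = conflicts(tet, ((1, 1, 1), 0))    # inside the sphere
centre_light = conflicts(tet, ((1, 1, 1), -4))        # weight too small to matter
far_unweighted = conflicts(tet, ((3, 3, 3), 0))       # outside the sphere
far_heavy = conflicts(tet, ((3, 3, 3), 10))           # large weight pulls it in
```

With all weights set to zero the test reduces to the ordinary in-circumsphere check used for Delaunay triangulations.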

Our CUDA implementation of gReg3D is robust and achieves a speedup of up to 4 times over the 3D regular triangulator of CGAL.

Coursera Heterogeneous

I created this library of code to work offline on the assignments of the Heterogeneous Parallel Programming course offered by Coursera. Many folks have chipped in and turned it into an easy-to-use library for the course.


I blog at Chooru.Code. It started in Sep 2009 as a journal for short entries about the computing problems, solutions and discoveries I encountered every day. Since then it has grown into a collection of more than 1600 posts. I have been truly overwhelmed by the 2.4 million visitors who have viewed these posts.