Reader Comments

GPU performance comparison

Posted by wjpalenstijn on 02 May 2013 at 08:51 GMT

Dear authors,

This paper describes very impressive multi-core CPU performance improvements, achieved through a combination of single-core code optimization, vector processing, multithreading, and efficient disk I/O. In Table 5 you compare these results with other current GPU and CPU implementations of SIRT.

In our paper "Performance improvements for iterative electron tomography reconstruction using graphics processing units (GPUs)", which you cite as [26], we describe an alternative GPU design and implementation, and we report benchmarks for the same dataset dimensions as used in your Table 5.

If we include our reported figures in this table, it looks as follows:

GPU/CPU           -- Dataset A -- Dataset B -- Dataset C
GTX 280 [29]      --      2.10 --     10.66 --     58.47
Tesla C1060 [28]  --      0.53 --      4.80 --     51.28
GTX 285 [28]      --      0.46 --      3.90 --     42.43
Tesla C2050 [28]  --      0.57 --      3.70 --     37.91
Q9550 (4T)        --      0.48 --      4.03 --     32.17
E5405 (8T)        --      0.35 --      2.94 --     24.55
GTX 280 [26]      --      0.45 --      2.0  --     11.13
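
For readers who want the relative differences rather than the raw figures, the ratios can be worked out with a few lines of Python. This is only an illustrative sketch: it assumes the figures above are timings (so lower is faster), and the names `timings`, `REFERENCE`, and `ratios_to_reference` are hypothetical helpers introduced here, not part of any toolbox.

```python
# Figures copied verbatim from the table above (Dataset A, B, C).
# Assumption: these are timings, so a smaller number means a faster run.
timings = {
    "GTX 280 [29]": (2.10, 10.66, 58.47),
    "Tesla C1060 [28]": (0.53, 4.80, 51.28),
    "GTX 285 [28]": (0.46, 3.90, 42.43),
    "Tesla C2050 [28]": (0.57, 3.70, 37.91),
    "Q9550 (4T)": (0.48, 4.03, 32.17),
    "E5405 (8T)": (0.35, 2.94, 24.55),
    "GTX 280 [26]": (0.45, 2.0, 11.13),
}

# Row used as the baseline for the comparison (hypothetical choice).
REFERENCE = "GTX 280 [26]"


def ratios_to_reference(table, reference):
    """Return, per row, each dataset's figure divided by the reference row's figure."""
    ref = table[reference]
    return {
        name: tuple(value / ref_value for value, ref_value in zip(values, ref))
        for name, values in table.items()
    }


if __name__ == "__main__":
    for name, ratios in ratios_to_reference(timings, REFERENCE).items():
        formatted = "  ".join(f"{r:6.2f}" for r in ratios)
        print(f"{name:<18}{formatted}")
```

A ratio above 1 means the row took longer than the reference row on that dataset; a ratio below 1 means it took less time.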


These figures are taken from Table 3 of our paper [26], where they are given in ms per volume slice. They can be independently verified using the open-source ASTRA Tomography Toolbox at http://visielab.ua.ac.be/... , which contains our GPU implementation.

While this does not in any way detract from your results on CPU optimization, we feel that Table 5, its caption, and the conclusions drawn from it do not represent the state of the art in GPU tomographic reconstruction at the time of publication.


With best regards,

Willem Jan Palenstijn,
Joost Batenburg,
Jan Sijbers

No competing interests declared.