Improving Particle Filter Performance Using SSE Instructions (2009)
Robotics researchers often face real-time constraints, so both algorithmic and implementation-level optimizations can dramatically increase the overall performance of a robot. In this paper we illustrate how a substantial run-time gain can be achieved by taking advantage of the extended instruction sets found in modern processors, in particular the SSE1 and SSE2 instruction sets. We present an SSE version of Monte Carlo Localization that achieves a 9x speedup over an optimized scalar implementation. In the process, we discuss SSE implementations of atan, atan2 and exp that achieve up to a 4x speedup in these mathematical operations alone.
In Proceedings of IROS 2009: 2009 IEEE/RSJ International Conference on Intelligent RObots and Systems, October 2009.
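As a rough illustration of the kind of data-parallel arithmetic the abstract refers to, the following is a minimal sketch in C using SSE1 intrinsics. It is not the paper's implementation. The function name `squared_distances_sse`, the structure-of-arrays layout (`xs`, `ys`), and the landmark parameters (`lx`, `ly`) are assumptions made for this example. It computes the squared distance from each particle to a landmark, handling four particles per instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE1 intrinsics */

/* Hypothetical helper (not from the paper): squared distance from each
 * particle (xs[i], ys[i]) to a landmark at (lx, ly).
 * Assumes particles are stored as separate x and y arrays and that
 * n is a multiple of 4, so every iteration fills a full SSE register. */
void squared_distances_sse(const float *xs, const float *ys, size_t n,
                           float lx, float ly, float *out)
{
    assert(n % 4 == 0);

    /* Broadcast the landmark coordinates into all four lanes. */
    const __m128 vlx = _mm_set1_ps(lx);
    const __m128 vly = _mm_set1_ps(ly);

    for (size_t i = 0; i < n; i += 4) {
        /* Unaligned loads, so the caller need not align the arrays. */
        __m128 dx = _mm_sub_ps(_mm_loadu_ps(xs + i), vlx);
        __m128 dy = _mm_sub_ps(_mm_loadu_ps(ys + i), vly);

        /* d2 = dx*dx + dy*dy for four particles at once. */
        __m128 d2 = _mm_add_ps(_mm_mul_ps(dx, dx), _mm_mul_ps(dy, dy));

        _mm_storeu_ps(out + i, d2);
    }
}
```

Storing each particle field in its own contiguous array is what lets a single load fetch the same field for four particles. An array-of-structs layout would need extra shuffling before the arithmetic could be vectorized this way.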

Michael Quinlan, Formerly affiliated Research Scientist, mquinlan [at] cs utexas edu
Peter Stone, Faculty, pstone [at] cs utexas edu