Alex Stepanov, 6 Oct 2016, collected papers, here.
Alexander A. Stepanov, Anil R. Gangolli, Daniel E. Rose, Ryan J. Ernst, and Paramjit S. Oberoi: SIMD-Based Decoding of Posting Lists. ACM Conference on Information and Knowledge Management (CIKM 2011), October 24–28, 2011, Glasgow, Scotland, UK.
Alexander A. Stepanov, Anil R. Gangolli, Daniel E. Rose, Ryan J. Ernst, and Paramjit S. Oberoi: SIMD-Based Decoding of Posting Lists. A9 Technical Report A9TR-2011-01, revision 2, June 2014, 30 pages. Appendix includes C++ code. PDF
Plaisance, Kurz, Lemire, June 2015, 1st International Symposium on Web Algorithms, Vectorized VByte Decoding, here.
We consider the ubiquitous technique of VByte compression, which represents each integer as a variable length sequence of bytes. The low 7 bits of each byte encode a portion of the integer, and the high bit of each byte is reserved as a continuation flag. This flag is set to 1 for all bytes except the last, and the decoding of each integer is complete when a byte with a high bit of 0 is encountered. VByte decoding can be a performance bottleneck especially when the unpredictable lengths of the encoded integers cause frequent branch mispredictions. Previous attempts to accelerate VByte decoding using SIMD vector instructions have been disappointing, prodding search engines such as Google to use more complicated but faster-to-decode formats for performance-critical code. Our decoder (MASKED VBYTE) is 2 to 4 times faster than a conventional scalar VByte decoder, making the format once again competitive with regard to speed.
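The 7-bit-group format described in the abstract is easy to sketch. Below is a minimal scalar reference encoder/decoder in Python (my own illustration, not the paper's SIMD MASKED VBYTE implementation); the per-byte branch on the continuation bit is exactly the source of the mispredictions the paper is attacking:

```python
def vbyte_encode(values):
    """Encode non-negative integers: 7 payload bits per byte,
    high bit = 1 on every byte except the last of each integer."""
    out = bytearray()
    for v in values:
        while v >= 0x80:
            out.append((v & 0x7F) | 0x80)  # continuation byte
            v >>= 7
        out.append(v)                      # final byte, high bit 0
    return bytes(out)

def vbyte_decode(data):
    """Scalar decode: accumulate 7-bit groups until a byte
    with a clear high bit terminates the current integer."""
    values, v, shift = [], 0, 0
    for b in data:
        v |= (b & 0x7F) << shift
        if b & 0x80:
            shift += 7                     # more bytes follow
        else:
            values.append(v)               # integer complete
            v, shift = 0, 0
    return values
```

For example, `vbyte_encode([5, 300])` yields three bytes (5 fits in one byte; 300 needs two), and decoding round-trips the original list. The data-dependent `if b & 0x80` branch is what a vectorized decoder replaces with mask lookups over 16 bytes at a time.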
Daniel Lemire and Leonid Boytsov, 10 Sep 2012, arXiv, Decoding billions of integers per second through vectorization, here.
In many important applications — such as search engines and relational database systems — data is stored in the form of arrays of integers. Encoding and, most importantly, decoding of these arrays consumes considerable CPU time. Therefore, substantial effort has been made to reduce costs associated with compression and decompression. In particular, researchers have exploited the superscalar nature of modern processors and SIMD instructions. Nevertheless, we introduce a novel vectorized scheme called SIMD-BP128 that improves over previously proposed vectorized approaches. It is nearly twice as fast as the previously fastest schemes on desktop processors (varint-G8IU and PFOR). At the same time, SIMD-BP128 saves up to 2 bits per integer. For even better compression, we propose another new vectorized scheme (SIMD-FastPFOR) that has a compression ratio within 10% of a state-of-the-art scheme (Simple-8b) while being two times faster during decoding.
Peter Carr, Oct 2006, Bloomberg, Harvard’s Financial Scientist, here. Merton had a model at IDL for buy-and-hold portfolio optimization for retirees. The problem was that retirees don’t have enough money to optimize, so IDL didn’t fly. Merton’s papers are here.
One major difference between quantitative finance today and the field 30 or 35 years ago, according to Merton, is that rapidly increasing computing power has taken away much of the importance of intuition. Merton recalls his graduate students needing to spend considerable time looking up possible solutions to functions in books he kept in his office.