We show that any <i>n</i> × <i>n</i> matrix <i>A</i> over any finite semiring can be preprocessed in <i>O</i>(<i>n</i><sup>2+ε</sup>) time, such that all subsequent vector multiplications with <i>A</i> can be performed in <i>O</i>(<i>n</i><sup>2</sup>/(ε log <i>n</i>)<sup>2</sup>) time, for all ε > 0. The approach is combinatorial and can be implemented on a pointer machine or a (log <i>n</i>)-word RAM. Some applications are described.
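To make the preprocess-then-multiply idea concrete, the following is a minimal sketch of a simpler table-lookup scheme over the Boolean semiring (OR as addition, AND as multiplication). It is not the construction referred to above and is not claimed to achieve the stated bounds; it only illustrates how precomputed tables can replace per-entry work at query time. The function names `preprocess` and `multiply` and the block-width parameter `b` are assumptions introduced for illustration; choosing `b` close to ε log <i>n</i> keeps each table at roughly <i>n</i><sup>ε</sup> entries.

```python
def preprocess(A, b):
    """Build lookup tables for a square 0/1 matrix A (list of rows).

    Columns are split into blocks of width b (hypothetical parameter).
    For each block and each 0/1 pattern p on that block, the table stores
    the Boolean product of the block's columns with p, packed as an
    integer bitset (bit i corresponds to row i).
    """
    n = len(A)
    tables = []
    for start in range(0, n, b):
        width = min(b, n - start)
        # col_masks[k]: bitset of rows i with A[i][start + k] == 1
        col_masks = []
        for k in range(width):
            mask = 0
            for i in range(n):
                if A[i][start + k]:
                    mask |= 1 << i
            col_masks.append(mask)
        # Fill the table incrementally: each pattern reuses the entry
        # for the same pattern with its lowest set bit cleared.
        table = [0] * (1 << width)
        for p in range(1, 1 << width):
            low = p & -p                  # lowest set bit of p
            k = low.bit_length() - 1      # index of that bit
            table[p] = table[p ^ low] | col_masks[k]
        tables.append((start, width, table))
    return tables


def multiply(tables, v):
    """Compute the Boolean product A v using one table lookup per block."""
    result = 0
    for start, width, table in tables:
        p = 0
        for k in range(width):
            if v[start + k]:
                p |= 1 << k
        result |= table[p]
    return [(result >> i) & 1 for i in range(len(v))]
```

In this sketch a query performs one lookup and one bitset OR per column block, so the per-query work is governed by the number of blocks rather than by the number of matrix entries; the tables are built once and reused for every subsequent vector.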
