In fact, bit shifting is always faster than anything. At the hardware level, it's just wiring; there is no logic involved in shifting the bits, so there is no propagation delay involved, unlike even the single logic gate operations like &, |, ~, etc.
It's so much faster that many compilers will optimize integer multiplications by converting them to shifts and adds. Integer division, however, usually just involves a lookup table.
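To illustrate the multiplication case: a compiler can decompose a constant multiplier into powers of two and emit shifts plus adds. A minimal sketch in C (the function name `mul10` is just for illustration, not anything a compiler emits):

```c
#include <stdint.h>

/* Illustrative strength reduction: 10 = 8 + 2, so
   x * 10 == (x << 3) + (x << 1).  A compiler seeing a
   multiply by the constant 10 may emit exactly this. */
static uint32_t mul10(uint32_t x) {
    return (x << 3) + (x << 1);
}
```

Whether this actually beats a hardware multiply depends on the target; on modern CPUs with fast multipliers, compilers only do this when the shift/add sequence is short.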
> In fact, bit shifting is always faster than anything.
This is overly simplistic. A fixed bitwise mapping is simple wiring and about as fast as anything can be, yes. Extending that, the "fastest" 32-bit shifter might be to put down one fixed-wiring circuit for each of the 32 possible shift amounts. But that still requires a mux tree to select between the different circuits, which already goes beyond simple wiring.
While that might be the minimum-delay implementation, it's also very area inefficient. A more area-efficient design would be to cascade lg(32) = 5 muxed shifters, one for each bit in the shift amount. For example, x << 19 is the same as ((x << 16) << 2) << 1. If you read a VLSI design textbook, you'll see that there are all kinds of low-level tricks for designing barrel shifters, especially when it comes to layout.
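The cascaded design above can be sketched in C, modeling each stage as a 2-to-1 mux controlled by one bit of the shift amount (the function name `barrel_shl` is hypothetical; this models the hardware structure, not how you'd write a shift in software):

```c
#include <stdint.h>

/* Logarithmic barrel shifter: five mux stages, one per bit of
   the 5-bit shift amount.  Each stage either passes its input
   through or shifts by a fixed power of two, so e.g. a shift
   by 19 = 16 + 2 + 1 enables stages 4, 1, and 0. */
static uint32_t barrel_shl(uint32_t x, unsigned amt) {
    x = (amt & 16) ? x << 16 : x;  /* stage for bit 4 of amt */
    x = (amt &  8) ? x <<  8 : x;  /* stage for bit 3 */
    x = (amt &  4) ? x <<  4 : x;  /* stage for bit 2 */
    x = (amt &  2) ? x <<  2 : x;  /* stage for bit 1 */
    x = (amt &  1) ? x <<  1 : x;  /* stage for bit 0 */
    return x;
}
```

This trades the single wide mux of the 32-circuit design for five narrow ones in series: less area, but the signal now passes through five mux delays instead of one.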
Finally, the practical reality for programmers is that variable-amount shifts and rotates are relatively expensive on quite a few processors.
As well as the practical details of barrel shifter design that psykotic points out, older CPUs couldn't afford the area for a barrel shifter and implemented shifts as a sequence of one-bit shifts. The 8086, for instance, took 8 clock cycles plus another 4 cycles per bit shifted, so on that kind of CPU a shift was definitely not as fast as an addition.
Incidentally there is a standard trick for integer division by constant which allows you to convert it into a series of 3 to 5 shift and add/subtract operations; any decent C compiler will implement this. _Hacker's Delight_ has the explanation of the algorithm, I think.
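On machines with a fast multiplier, the same family of tricks reduces to multiplying by a rounded reciprocal and shifting. A sketch for unsigned division by 5, using the standard "magic number" from _Hacker's Delight_ (the function name `div5` is just for illustration):

```c
#include <stdint.h>

/* Division by the constant 5 via reciprocal multiplication:
   0xCCCCCCCD is ceil(2^34 / 5), so for any 32-bit x,
   (x * 0xCCCCCCCD) >> 34 equals x / 5 exactly.  Compilers
   emit this as a multiply-high plus a small shift. */
static uint32_t div5(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 34);
}
```

The magic constant and shift count depend on the divisor; the derivation (and the shift/add variants for machines without a fast multiplier) is what the book works through.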