High hardware design and mask production costs dictate the need to reuse an architectural platform for as many applications as possible. Embedded portable multimedia devices must execute a wide variety of algorithms in real time, ranging from audio and image processing to channel coding, video games, and Java virtual machines. Dynamically reconfigurable architectures are an effective means of meeting both requirements. However, their effective and efficient use is currently hindered by a lack of methodologies and tools for extensively exploring the hardware/software (HW/SW) design space without requiring software developers to have deep knowledge of the underlying architecture. This paper describes one such methodology, which extends the software programming model to the design flow for a reconfigurable processor. Its effectiveness is shown with the case study of a turbo decoder for the Universal Mobile Telecommunications System (UMTS), in which a remarkable 11X speed-up and a 4X reduction in energy requirements with respect to a pure software implementation were obtained by mapping the more computation-intensive kernels to the reconfigurable hardware.