This paper focuses on the inverse transforms defined in the High Efficiency Video Coding (HEVC) standard. The transform stage is one of the innovations introduced by HEVC: compared with previous standards, it supports the largest number of transform sizes (four) and the largest transform size (up to 32×32). The inverse DCT (IDCT) is performed by both the video encoder and the decoder. This paper presents an efficient hardware design for the 32×32 HEVC IDCT based on the separability principle. The design targets real-time processing (at least 30 frames per second) of high-resolution video by exploiting a high level of parallelism (32 samples consumed per clock cycle). To achieve low latency and low cost, the architecture was designed in a purely combinational way using a multiplierless approach. The design was synthesized for an Altera Stratix IV FPGA. The synthesis results show that the architecture is capable of processing more than 30 QFHD frames (3840×2160 pixels) per second, with a latency of 33 clock cycles.
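The two principles named above, separability and multiplierless arithmetic, can be illustrated with a minimal software sketch. It is not the architecture proposed in this paper: it assumes the much smaller 4×4 HEVC core transform for brevity, omits the intermediate scaling and rounding shifts that the standard specifies, and all function names (mul64, mul83, mul36, idct4_1d, idct4_2d_separable, idct4_2d_direct) are hypothetical. Each constant multiplication is replaced by shifts and additions, and the 2-D inverse transform is computed as a 1-D pass over the columns followed by a 1-D pass over the rows.

```python
# Illustrative sketch only (assumptions: 4x4 size, no scaling/rounding stages).

# 4x4 HEVC core transform matrix; each row is one basis vector.
T4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

# Multiplierless constant multiplications: shifts and additions only.
def mul64(x):
    return x << 6                               # 64 = 2^6

def mul83(x):
    return (x << 6) + (x << 4) + (x << 1) + x   # 83 = 64 + 16 + 2 + 1

def mul36(x):
    return (x << 5) + (x << 2)                  # 36 = 32 + 4

def idct4_1d(c):
    """1-D inverse transform: out[n] = sum_k T4[k][n] * c[k]."""
    # Even part uses coefficients 0 and 2, odd part uses coefficients 1 and 3.
    e0 = mul64(c[0]) + mul64(c[2])
    e1 = mul64(c[0]) - mul64(c[2])
    o0 = mul83(c[1]) + mul36(c[3])
    o1 = mul36(c[1]) - mul83(c[3])
    return [e0 + o0, e1 + o1, e1 - o1, e0 - o0]

def idct4_2d_separable(block):
    """2-D inverse transform as two 1-D passes (columns, then rows)."""
    n = 4
    # First pass: transform every column of the coefficient block.
    cols = [idct4_1d([block[k][l] for k in range(n)]) for l in range(n)]
    tmp = [[cols[l][m] for l in range(n)] for m in range(n)]
    # Second pass: transform every row of the intermediate result.
    return [idct4_1d(tmp[m]) for m in range(n)]

def idct4_2d_direct(block):
    """Reference: direct 2-D sum with ordinary multiplications."""
    n = 4
    return [
        [
            sum(T4[k][m] * block[k][l] * T4[l][j]
                for k in range(n) for l in range(n))
            for j in range(n)
        ]
        for m in range(n)
    ]
```

The separable form needs two passes of N one-dimensional transforms instead of a full two-dimensional sum per output sample, which is what makes a row/column organization attractive in hardware. In this sketch both functions return identical integer results for any integer input block, since no rounding is applied.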