The Discrete Cosine Transform (DCT) is of great interest in image compression because of its high energy compaction. It has become the core of many international standards such as JPEG, H.26x and the MPEG family [1-3]. Many fast algorithms have been developed to speed up the computation of the DCT in both software and hardware implementations. A 2-D DCT can easily be computed by repeated use of a 1-D DCT computing scheme, whereas a direct implementation of the 2-D DCT generally requires more effort. Most DCT computations require floating-point multiplications, which are slow and cumbersome. Floating-point arithmetic can be avoided by using integer implementations, which are usually based on distributed arithmetic. These still have drawbacks, however, since the fixed-point multiplications need a rather wide data bus (32 bits, for instance), which limits their use in low-power applications such as handheld devices.

Based on Chen's factorisation of the DCT matrix, Tran et al. [6-7] have proposed an approximate computation of the DCT by introducing the lifting scheme. The basis multiplications are approximated by rationals of the form k/2^m, which can be implemented efficiently by binary shifts and additions. This multiplierless DCT is also known as the binary DCT or binDCT. Both the forward and the inverse transforms can be implemented in a similar manner. The implementation can be made simpler and more regular by making use of the scaled DCT and in-place computation.

Our work concerns the implementation of a 1-D DCT based on Liang and Tran's work. Targeting very low bit-rate applications with fairly high compression gain, we base our design on a bit-serial computation scheme. The resulting design is compact and consumes little power. In section 2, we outline the fast DCT techniques and review the multiplierless binDCT approximation in detail. Our approximation is detailed in section 3.
The effects of different computation word lengths are also investigated. In section 4, we demonstrate the use of a bit-serial architecture in the implementation of such a binDCT. Simulation results for both software and hardware are given.
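
To make the idea of a multiplierless lifting step concrete, the following is a minimal Python sketch of a single lifting step whose coefficient has the dyadic form k/2^m. It is an illustration only and not the binDCT structure itself: the function names (`dyadic_mul`, `lift_forward`, `lift_inverse`) and the coefficient 3/8 are assumptions chosen for the example. The point it shows is that the inverse step subtracts exactly the quantity the forward step added, so the pair is exactly reversible on integers despite the truncation introduced by the shift.

```python
def dyadic_mul(x, k, m):
    """Return floor(x * k / 2**m) using only shifts and additions (k >= 0).

    Hypothetical helper for illustration: the product x * k is built by
    shift-and-add over the set bits of k, then scaled by a right shift.
    """
    acc = 0
    bit = 0
    while k:
        if k & 1:
            acc += x << bit   # add x * 2**bit for each set bit of k
        k >>= 1
        bit += 1
    return acc >> m           # arithmetic shift: floor division by 2**m


def lift_forward(a, b, k, m):
    """One lifting step: a is unchanged, b is updated with (k / 2**m) * a."""
    return a, b + dyadic_mul(a, k, m)


def lift_inverse(a, b, k, m):
    """Undo lift_forward by subtracting the same truncated product."""
    return a, b - dyadic_mul(a, k, m)


# Illustrative coefficient 3/8 (k = 3, m = 3), assumed for this example.
a, b = lift_forward(10, 4, 3, 3)      # 10 * 3 = 30, 30 >> 3 = 3, so b = 7
restored = lift_inverse(a, b, 3, 3)   # recovers (10, 4)
```

Because the truncated product depends only on the unchanged input `a`, the rounding error cancels in the inverse step. This is what allows the forward and inverse transforms to share the same shift-and-add structure.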