We explore the area efficiency of a class of stream-based dataflow architectures as a function of the grain-size, for a given set of applications. We believe the grain-size is a key parameter in balancing flexibility and efficiency of this class of architectures. We apply a clustering approach on a well-defined set of applications to derive a set of processing elements of varying grain-sizes. The resulting architectures are compared with respect to their silicon area. For a set of twenty-one industrially relevant video algorithms, we determined architectures with various grain-sizes. The results of this exercise indicate an improvement factor of two for the silicon area, while changing the grain-size from fine-grain to coarser-grain.