Abstract:
We report on a Winograd-based implementation of the Number Theoretic Transform (NTT). It uses fewer multiplications than the better-known Cooley-Tukey alternative. This saving matters most for finite fields of very high order, where each multiplication is expensive. Unfortunately, the Winograd scheme is difficult to generalize to arbitrary sizes and is known only for small transform sizes. We open-source our hardware implementation for size 32, based on [1].
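
For orientation, the transform and the Cooley-Tukey baseline mentioned above can be sketched in a few lines of Python. This is only a reference sketch under stated assumptions: it does not implement the Winograd scheme or the hardware design, and the modulus 97, the input vector, and all function names are hypothetical choices made for illustration.

```python
# Illustrative parameters (assumptions, not taken from the paper).
P = 97   # small prime chosen so that 32 divides P - 1
N = 32   # transform size discussed in the abstract


def find_root_of_unity(n, p):
    """Return an element of multiplicative order exactly n modulo the prime p.

    Assumes n is a power of two dividing p - 1.
    """
    assert (p - 1) % n == 0
    for g in range(2, p):
        w = pow(g, (p - 1) // n, p)      # w**n == 1 by Fermat's little theorem
        if pow(w, n // 2, p) != 1:       # order is exactly n, not a proper divisor
            return w
    raise ValueError("no root of unity found")


def ntt_naive(a, w, p):
    """Direct evaluation of the definition: A[k] = sum_j a[j] * w**(j*k) mod p."""
    n = len(a)
    return [sum(a[j] * pow(w, j * k, p) for j in range(n)) % p for k in range(n)]


def ntt_radix2(a, w, p):
    """Recursive radix-2 decimation-in-time Cooley-Tukey NTT (baseline only)."""
    n = len(a)
    if n == 1:
        return list(a)
    w2 = w * w % p
    even = ntt_radix2(a[0::2], w2, p)
    odd = ntt_radix2(a[1::2], w2, p)
    out = [0] * n
    t = 1                                 # running twiddle factor w**k
    for k in range(n // 2):
        u = t * odd[k] % p                # one butterfly multiplication
        out[k] = (even[k] + u) % p
        out[k + n // 2] = (even[k] - u) % p   # uses w**(n/2) == -1 mod p
        t = t * w % p
    return out


w = find_root_of_unity(N, P)
a = list(range(N))                        # arbitrary test vector
assert ntt_naive(a, w, P) == ntt_radix2(a, w, P)
```

In this sketch, for N = 32 the radix-2 recursion performs (N/2) log2 N = 80 butterfly multiplications (counting trivial ones by 1), against N^2 = 1024 products in the direct sum. A Winograd-style scheme aims to reduce the multiplication count further, at the cost of a structure that is tied to a specific size.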