The vector processor architecture consists of a scalar RISC-V-based CPU and a massively parallel array of vector processing elements. The RISC-V processor system serves as the global controller and general-purpose processor and also handles all flow control for the running programs. The vector processing array, in contrast, handles the computation-intensive tasks. It consists of a configurable number of vector units (VU), each of which contains a configurable number of vector lanes (VL), which are the actual vertical data processing units. The lanes of one unit are connected via chaining so that they can exchange processing results directly. Each VU contains a configurable local memory, which is accessible by all of its lanes and serves as fast scratchpad memory. Each VU also includes scheduling logic and a FIFO to buffer incoming vector operations and distribute them to the different lanes in the unit. The vector operations are issued from the instruction decode stage of the RISC-V processor system and are executed in parallel.
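The buffering and distribution step inside one vector unit can be illustrated with a small behavioral sketch. This is a toy model, not the actual hardware or its programming interface: the class name `VectorUnit`, the methods `enqueue` and `step`, and the policy of handing one buffered operation to each lane per step are all assumptions made for illustration. The local memory and lane chaining are left out to keep the sketch short.

```python
from collections import deque
from dataclasses import dataclass, field
import operator


@dataclass
class VectorUnit:
    """Hypothetical model of one vector unit (VU) with a FIFO and several lanes."""
    num_lanes: int
    fifo: deque = field(default_factory=deque)

    def enqueue(self, op, a, b):
        # The scalar controller pushes a decoded vector operation into the FIFO.
        self.fifo.append((op, a, b))

    def step(self):
        # Assumed scheduling policy: each lane takes at most one buffered
        # operation per step and processes its vector element by element.
        results = []
        for lane in range(self.num_lanes):
            if not self.fifo:
                break
            op, a, b = self.fifo.popleft()
            results.append((lane, [op(x, y) for x, y in zip(a, b)]))
        return results


# Three operations sent to a unit with two lanes: two are executed in the
# first step, the third stays buffered until the second step.
vu = VectorUnit(num_lanes=2)
vu.enqueue(operator.add, [1, 2, 3], [10, 20, 30])
vu.enqueue(operator.mul, [1, 2, 3], [4, 5, 6])
vu.enqueue(operator.sub, [9, 8], [1, 2])
first = vu.step()
second = vu.step()
```

In this model the FIFO decouples the controller from the lanes: the controller can keep issuing operations while earlier ones are still waiting for a free lane.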
Project | Video | DATE23 Presentation
The NanoController is a programmable processor architecture with a compact 4-bit ISA. It is designed for minimal silicon area and power consumption and is intended to be used as an independent system state controller for smart devices (home and building automation, portable and intelligent medical sensors, etc.). In these embedded systems, the NanoController can perform the simple control and system management tasks that occur most of the time (background operations). The main processor of the system, typically a large, full-featured general-purpose RISC microcontroller core, then needs to be active only for the infrequent handling of events that require complex computations, e.g., encrypted wireless communication, and can be powered down completely for long time intervals. This mechanism helps minimize the average power consumption of embedded systems, which is key to increasing the energy efficiency and extending the limited operating lifetime of battery-less devices powered by energy harvesting. Because it is programmable, the architecture provides run-time flexibility for advanced system management. It can also be extended with additional functional units for specific systems, e.g., basic digital signal processing blocks.
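The effect of powering down the main processor on average power can be made concrete with a simple duty-cycle estimate. The sketch below is only a back-of-the-envelope model: the function name `average_power` and all power values are hypothetical placeholders chosen for illustration, not measured figures for the NanoController or any real core.

```python
def average_power(p_main_active, p_main_off, p_controller, duty_cycle):
    """Average system power for a main core that is active only a fraction
    `duty_cycle` of the time, plus an always-on state controller."""
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    main = duty_cycle * p_main_active + (1.0 - duty_cycle) * p_main_off
    return main + p_controller


# Hypothetical values in milliwatts (assumptions, not measurements).
P_MAIN_ACTIVE = 10.0   # main core while handling a complex event
P_MAIN_OFF = 0.0       # main core powered down completely
P_CONTROLLER = 0.01    # always-on state controller

always_on = average_power(P_MAIN_ACTIVE, P_MAIN_OFF, 0.0, duty_cycle=1.0)
duty_cycled = average_power(P_MAIN_ACTIVE, P_MAIN_OFF, P_CONTROLLER, duty_cycle=0.01)
print(f"always on: {always_on:.2f} mW, duty cycled: {duty_cycled:.2f} mW")
```

Under these assumed numbers, the average is dominated by how rarely the main core has to wake up, which is why offloading background tasks to a very small controller pays off.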