Coolness matters: An equation for low-power mobile SoCs
Sponsored content [Friday 26 July 2013]
Having acquired MIPS in December 2012, Imagination Technologies now holds all four key silicon intellectual property (IP) technologies for mobile platforms (CPU, GPU, VPU, and RPU), which opens up opportunities for the heterogeneous platform integration of tomorrow.
The four silicon intellectual property technologies that control SoC networking
Amit Rohatgi, vice president of mobile solutions for Imagination Technologies, stated that all networked SoC devices require four key components: the application processor (CPU), the graphics processor (GPU), the radio communications processor (RPU), and the video processor (VPU). Past semiconductor development focused primarily on performance and overlooked the serious problem of transistor gate leakage in powered-down states. Imagination holds the silicon intellectual property for all four components and collaborates closely with upstream EDA vendors, IP providers, SoC designers, and foundries to design power-optimized chips through hardware and software innovation. This allows the industry to develop low-power, always-connected mobile devices that offer a rich variety of media content.
Efficiency upgrade with multi-threaded CPU cores
Amit Rohatgi indicated that the CPUs used in current mobile devices running the Linux operating system are prone to cache misses, branch mispredictions, data dependencies, memory latency, and other sources of CPU idle time. These can cause pipeline stalls of 50-200 machine cycles and reduce overall execution efficiency to less than 50%. Multithreading (MT) allows a single-core CPU to run multiple threads simultaneously, which makes better use of the CPU's execution units, significantly reduces the total number of execution cycles and stall cycles, and improves application throughput, so that the same work can be done at lower clock speeds and lower system cost.
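The effect of multithreading on stalled cycles can be illustrated with a rough model. This is a toy sketch, not Imagination's data: it assumes each hardware thread is independently ready to issue in a given cycle with probability equal to the single-thread efficiency, and that the core does useful work whenever at least one thread is ready. The function name and the thread counts are illustrative assumptions.

```python
def pipeline_utilization(single_thread_efficiency: float, threads: int) -> float:
    """Fraction of cycles in which at least one thread can issue.

    Toy model (assumption): each thread stalls independently with
    probability (1 - single_thread_efficiency); the core sits idle
    only when every thread is stalled at the same time.
    """
    if not 0.0 <= single_thread_efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    stall_probability = 1.0 - single_thread_efficiency
    return 1.0 - stall_probability ** threads


# 50% single-thread efficiency, as quoted in the article
for n in (1, 2, 4):
    print(f"{n} thread(s): {pipeline_utilization(0.5, n):.1%} of cycles busy")
```

Under these assumptions, a core that is busy only half the time with one thread would be busy about three quarters of the time with two. Real workloads share caches and execution units, so actual gains depend on the software, as the 37-57% range quoted for the modem example suggests.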
Amit Rohatgi offered an example. If a MIPS multi-threaded CPU is used as the modem chip for a mobile device, the multi-threaded core, together with a multi-thread-optimized RTOS and LTE baseband software stack, can improve video/data throughput by 37-57% compared to a traditional single-threaded CPU core. When the comparable high-end proAptiv and ARM Cortex A15 cores are compared (both produced on TSMC's 28nm High-K metal gate process [28nm HKMG]), the former clocks at 1 GHz to 2 GHz and the latter at 1.8 GHz. However, the per-clock performance of the proAptiv reaches 4.5 CoreMark/MHz, while the Cortex A15 reaches only 3.5 CoreMark/MHz. Meanwhile, the silicon area of a single proAptiv core is 1.85 mm², which is less than 60% of the Cortex A15's 3.25 mm².
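The comparison above can be checked with simple arithmetic. The sketch below uses only the CoreMark/MHz and area figures quoted in the article; the dictionary layout and the "CoreMark/MHz per mm²" density metric are illustrative assumptions, not figures published by either vendor.

```python
# Figures as quoted in the article (28nm HKMG, single core)
PROAPTIV = {"coremark_per_mhz": 4.5, "area_mm2": 1.85}
CORTEX_A15 = {"coremark_per_mhz": 3.5, "area_mm2": 3.25}


def area_ratio(a: dict, b: dict) -> float:
    """Silicon area of core a as a fraction of core b."""
    return a["area_mm2"] / b["area_mm2"]


def per_mhz_ratio(a: dict, b: dict) -> float:
    """Per-clock performance of core a relative to core b."""
    return a["coremark_per_mhz"] / b["coremark_per_mhz"]


def density(core: dict) -> float:
    """CoreMark/MHz delivered per mm² (a derived, hypothetical metric)."""
    return core["coremark_per_mhz"] / core["area_mm2"]


print(f"Area ratio:          {area_ratio(PROAPTIV, CORTEX_A15):.1%}")
print(f"Per-MHz performance: {per_mhz_ratio(PROAPTIV, CORTEX_A15):.2f}x")
print(f"Density ratio:       {density(PROAPTIV) / density(CORTEX_A15):.2f}x")
```

The area ratio works out to roughly 57%, consistent with the "less than 60%" claim, and the per-clock advantage is roughly 1.29x.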
The proAptiv supports dynamic voltage and frequency scaling (DVFS) and can save 50% in power consumption by switching from 1.3V to 0.9V in standby mode, save 33% by switching between idle and operational modes, and save 10-20% even at full-speed operation. It also comes with the PowerArtist power analysis tool to help system designers reduce overall power consumption.
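The roughly 50% saving from lowering the supply voltage is in line with the standard first-order model of CMOS dynamic power, in which power scales with the square of the voltage times the clock frequency. The sketch below applies that textbook relation. It assumes switched capacitance stays constant and ignores leakage, so it is an approximation rather than the vendor's measurement method. The function name is illustrative.

```python
def dynamic_power_ratio(v_new: float, v_old: float,
                        f_new: float = 1.0, f_old: float = 1.0) -> float:
    """Ratio of dynamic power after/before a DVFS step.

    First-order CMOS model: P_dyn is proportional to C * V^2 * f,
    with the switched capacitance C assumed unchanged.
    Leakage power is not modeled.
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


# Voltage step quoted in the article, frequency held constant
saving = 1.0 - dynamic_power_ratio(v_new=0.9, v_old=1.3)
print(f"Estimated dynamic power saving: {saving:.1%}")
```

With the voltage alone dropping from 1.3V to 0.9V, the model gives a saving of about 52%. Lowering the clock at the same time would reduce dynamic power further.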
Achieving optimal GPU and VPU unit power efficiency through architecture
Amit Rohatgi mentioned that Imagination's PowerVR SGX uses a tile-based rendering architecture that removes the hidden surfaces of 3D objects before the remaining data is compressed and transmitted. This reduces memory bandwidth and optimizes power consumption without sacrificing graphics performance, which is why this GPU was adopted in Apple's iPad 4. He cited power consumption tests of the PowerVR SGX, ARM Mali, and Nvidia Tegra3 conducted by a third party. The results showed that ARM Mali's power consumption swung widely, between 0.5W and 4W. The Nvidia Tegra3 averaged 1.7-1.8W and the PowerVR SGX averaged 0.5-1W, and both of these consumption ranges were relatively flat.
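The idea behind tile-based rendering with hidden surface removal can be sketched in a few lines. This is a heavily simplified illustration under stated assumptions, not PowerVR's actual pipeline: fragments arrive as (x, y, depth, color) tuples, a smaller depth means nearer to the viewer, and the 4-pixel tile size and function name are hypothetical.

```python
from collections import defaultdict

TILE = 4  # hypothetical tile edge length in pixels


def render_tiled(fragments, tile=TILE):
    """Bin fragments into tiles, resolve visibility per tile, then shade.

    fragments: iterable of (x, y, depth, color); smaller depth is nearer.
    Returns (framebuffer, shaded), where framebuffer maps (x, y) to the
    visible color and shaded counts fragments that were actually shaded.
    """
    # Pass 1: sort fragments into screen-space tiles
    bins = defaultdict(list)
    for frag in fragments:
        x, y, _, _ = frag
        bins[(x // tile, y // tile)].append(frag)

    framebuffer = {}
    shaded = 0
    for tile_frags in bins.values():
        # Per-tile depth buffer: small enough to stay in on-chip memory
        nearest = {}
        for x, y, depth, color in tile_frags:
            if (x, y) not in nearest or depth < nearest[(x, y)][0]:
                nearest[(x, y)] = (depth, color)
        # Pass 2: shade only the visible fragment of each pixel and
        # write the finished tile to the framebuffer once
        for pixel, (_, color) in nearest.items():
            framebuffer[pixel] = color
            shaded += 1
    return framebuffer, shaded


fragments = [
    (0, 0, 0.9, "red"), (0, 0, 0.2, "blue"),   # two surfaces, same pixel
    (5, 1, 0.5, "green"), (5, 1, 0.7, "red"),  # same again, another tile
]
framebuffer, shaded = render_tiled(fragments)
print(framebuffer, f"shaded {shaded} of {len(fragments)} fragments")
```

In this example only two of the four submitted fragments are shaded and written out. Skipped work of this kind is where the bandwidth and power savings come from.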
For the VPU, standard codec circuits are used, with additional deblocking, discrete cosine transform (DCT), and motion estimation/compensation circuits, and each circuit block is automatically turned off when it is not in use. The SoC's embedded DRAM also provides dynamic clock gating that can save 60% in power consumption. He cited a comparison between Microsoft's Surface tablet and Acer's W510 tablet, which uses the Imagination VPU: during 1080 HD video playback, the Acer W510's total/CPU/GPU/memory power consumption figures of 3.5W, 0.17W, 0.37W, and 0.45W were all lower than the Microsoft Surface's 4.21W, 0.35W, 0.51W, and 0.58W, respectively.
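The gap between the two tablets is easier to judge as a percentage. The short calculation below uses only the wattage figures quoted in the article; the dictionary keys and function name are illustrative.

```python
# Power during 1080 HD video playback, in watts, as quoted in the article
W510 = {"total": 3.5, "cpu": 0.17, "gpu": 0.37, "memory": 0.45}
SURFACE = {"total": 4.21, "cpu": 0.35, "gpu": 0.51, "memory": 0.58}


def percent_saving(lower: dict, higher: dict) -> dict:
    """Percentage by which each figure in `lower` undercuts `higher`."""
    return {k: 100.0 * (higher[k] - lower[k]) / higher[k] for k in higher}


savings = percent_saving(W510, SURFACE)
for component, pct in savings.items():
    print(f"{component:>6}: {pct:.1f}% lower on the W510")
```

By these figures the W510 draws about 17% less power in total, with the largest relative difference (about 51%) in the CPU.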
Regarding the RPU, Amit Rohatgi indicated that radio signal transmission must take into account where and when the user switches between channels, as well as the signal blocking and attenuation caused by complex multipath propagation. Intelligent signal acquisition and tracking circuits act like multi-phase antennas and can improve the signal-to-noise ratio by 10dB while reducing power consumption.
Imagination's solution and heterogeneous SoC integration technology
Amit Rohatgi mentioned that Imagination offers technologies for all four major components (CPU, GPU, VPU, and RPU). The MIPS proAptiv is designed for low power consumption and multi-threading. The PowerVR SGX graphics core can improve performance per unit of power by 50% in graphics rendering. The PowerVR Series4 VPU provides high-efficiency video coding (HEVC, H.265) with real-time HD encoding/decoding and supports 4K x 2K resolution while consuming less than 10mW of power in operation. The Ensigma Series4 RPU provides 802.11ac 2x2 MIMO connectivity, supports all Wi-Fi protocols as well as Bluetooth, TV, and radio, and can maintain low power levels.
Imagination is currently developing an image signal processor. Today's mobile devices are equipped with multiple cameras and must execute complex software instructions. The image signal processor can offload the digital signal processing burden from the CPU and further reduce power consumption.
Finally, Amit Rohatgi indicated that Imagination is currently researching and developing a heterogeneous compute system that combines CPU, GPU, VPU, and RPU. MCM and SiP/PoP system-level packaging would initially be offered at 28nm, and the four-in-one CPU, GPU, VPU, and RPU single chip is expected to be implemented on the 20nm process to further exploit heterogeneous computing, for example by providing the dynamic voltage/clock switching between CPU and GPU required to reach a maximum of 1 TFLOPS of floating point performance.
Amit Rohatgi, VP of mobile solutions for Imagination Technologies