Discrete graphics in the face of Sandy Bridge

Intel's Tick-Tock "pendulum" cadence continues, and 2011 will be the launch year of its new architecture. In a little over a month, the second-generation Intel Core processor, Sandy Bridge, will be released. The supporting chipsets, motherboards, heat sinks and so on will be updated along with the processor. The chipset is the 6 series, with P67, H67 and H61 as the main desktop models.

From a high-level perspective, Sandy Bridge can be called an evolution, but judged by how many transistors have changed since Nehalem/Westmere, it is definitely a big leap forward. The loop stream detector logic is fast, and a micro-op cache has been added to hold instructions temporarily as they are decoded. It is easy to see that processors are more and more focused on energy saving: while using far less energy, Sandy Bridge delivers 10-30% more performance than current products. These are not the main point, however. What is really exciting is the seamless integration of the core graphics, whose performance has improved several times over. This is definitely bad news for low-end discrete graphics.

Seamless fusion: core graphics performance peaks again

The Core series released in early 2010 also comes with a graphics core, but it shares the package with the CPU as two separate dies: a 32nm CPU and a 45nm GPU. That GPU improved its performance through the 45nm process, more shader hardware, higher clocks and so on. Sandy Bridge's new design manufactures the GPU on the same 32nm process as the CPU cores, which improves its performance significantly further. Most importantly, the GPU and the CPU are integrated, and the GPU becomes a standard part of the new generation of CPUs. This is why such a GPU is called core graphics.

Everyone is familiar with Turbo Boost on the current Core i5 and Core i7. Sandy Bridge's core graphics has its own power island and clock domain and also supports Turbo Boost, so it can speed up or clock down in coordination with the CPU's compute cores. The greatest benefit of this on-demand frequency scaling is that it saves a lot of energy and significantly reduces heat when the system is not busy, and quickly raises the speed when it is needed.

The biggest difference between Sandy Bridge's GPU and an ordinary integrated graphics core is that the graphics driver controls access to the processor's level-3 cache. The GPU and the CPU compute cores sit in the same place and access the shared cache, which is one of the origins of the name "core graphics". Core graphics is not the same as integrated graphics, because an integrated graphics core can only put graphics data in system memory. System memory is much slower than the level-3 cache and also more "distant", and the access time alone is a major constraint on performance. As such, putting graphics data in the cache has a multiplier effect on improving performance and reducing power consumption.
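
The effect of keeping graphics data in the shared cache can be illustrated with a simple average-access-time calculation. This is only a sketch: the latencies and the hit rate are made-up placeholders rather than Intel figures, and the function name is hypothetical.

```python
def average_access_ns(hit_rate, cache_ns, memory_ns):
    """Average access time when a fraction hit_rate of requests hit the cache."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


# Assumed, illustrative latencies (not measured values).
CACHE_NS = 10.0
MEMORY_NS = 60.0

# Integrated graphics core: every access goes to system memory.
memory_only = average_access_ns(0.0, CACHE_NS, MEMORY_NS)
# Core graphics: assume half of the accesses hit the shared cache.
with_cache = average_access_ns(0.5, CACHE_NS, MEMORY_NS)

print(f"memory only: {memory_only:.1f} ns, with shared cache: {with_cache:.1f} ns")
```

Whatever the real numbers are, every access served from the cache avoids the longer trip to memory, so the average falls as the hit rate rises.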

Sandy Bridge's core graphics makes extensive use of fixed-function hardware, with each function implemented by a dedicated hardware unit. The benefit is that performance, power consumption and die area are all greatly optimized. The programmable side of the performance improvement lies in the units variously known as shaders, cores or execution units, which Intel calls EUs. Each EU can dual-issue, picking instructions from multiple threads.

In previous graphics architectures, the register file was repartitioned on the fly. If one thread needed fewer registers, the ones it freed were allocated to other threads. The biggest benefit of doing so was a smaller die area, but in some cases a thread could find no registers available, which greatly reduced its performance. Earlier integrated graphics cores averaged 64 registers per thread, and the Westmere Core GPU raised that to an average of 80. Sandy Bridge's core graphics fixes the number at 120 per thread. Seen this way, a significant increase in performance is the inevitable result.
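
The difference between the two allocation schemes can be sketched in a few lines. This is a toy model, not Intel's allocator: the thread count, the per-thread demands and both function names are hypothetical, and only the figures of 80 and 120 registers come from the text.

```python
def dynamic_allocate(pool_size, demands):
    """Hand out registers from one shared pool in request order.

    Returns the number of registers each thread actually receives.
    """
    remaining = pool_size
    granted = []
    for need in demands:
        got = min(need, remaining)
        granted.append(got)
        remaining -= got
    return granted


def fixed_allocate(per_thread, demands):
    """Give every thread its own reserved block of per_thread registers."""
    return [min(need, per_thread) for need in demands]


def starved_threads(granted, demands):
    """Indices of threads that received fewer registers than they asked for."""
    return [i for i, (g, n) in enumerate(zip(granted, demands)) if g < n]


# Hypothetical workload: four threads, several of them register-hungry.
demands = [100, 110, 40, 90]

# Shared pool sized for an average of 80 registers per thread.
dynamic = dynamic_allocate(4 * 80, demands)
# Fixed blocks of 120 registers per thread.
fixed = fixed_allocate(120, demands)

print("dynamic:", dynamic, "starved:", starved_threads(dynamic, demands))
print("fixed:  ", fixed, "starved:", starved_threads(fixed, demands))
```

In this toy workload the shared pool runs dry before the last thread is served, while the fixed blocks satisfy every request. The price is the larger register file that the text mentions.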

The internal ISA maps almost one-to-one onto DX10 API instructions, and the architecture is very CISC-like. The result is an effectively wider EU and significantly higher IPC. Transcendental math is also handled by hardware inside the EU, and its performance has improved accordingly. According to Intel, sine and cosine operations are several orders of magnitude faster than on the current Core HD graphics. Taken together, these improvements give each EU in Sandy Bridge's core graphics double the throughput of an EU in the current Core HD graphics.

Of course, Sandy Bridge's core graphics is also deliberately segmented by performance into two versions that differ in the number of EUs: 12 or 6. Even with six EUs the performance is quite satisfactory, because each EU has twice the throughput, runs at a higher frequency and shares the level-3 cache. What's more, the 12-EU version can directly threaten the low-end discrete graphics card market.
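
A rough way to compare the two versions is to multiply the EU count by the per-EU gain and the clock ratio. The sketch below is a back-of-the-envelope model only. The doubling per EU comes from the text, while the baseline EU count and the clock ratio are assumptions chosen for illustration, and the model ignores the shared cache entirely.

```python
def relative_throughput(eu_count, per_eu_factor, clock_ratio, baseline_eus):
    """Shader throughput relative to a baseline GPU with baseline_eus units."""
    return (eu_count / baseline_eus) * per_eu_factor * clock_ratio


PER_EU_FACTOR = 2.0  # from the text: each EU has double the throughput
BASELINE_EUS = 12    # assumption: EU count of the baseline GPU
CLOCK_RATIO = 1.0    # assumption: same clock as the baseline (conservative)

six_eu = relative_throughput(6, PER_EU_FACTOR, CLOCK_RATIO, BASELINE_EUS)
twelve_eu = relative_throughput(12, PER_EU_FACTOR, CLOCK_RATIO, BASELINE_EUS)

print(f"6 EUs:  {six_eu:.1f}x the baseline")
print(f"12 EUs: {twelve_eu:.1f}x the baseline")
```

Under these assumptions the six-EU version only matches the baseline on raw shader throughput, so any gain it shows would have to come from the higher frequency and the shared cache, which the model leaves out.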

If improving 3D performance and image quality is only half of what Sandy Bridge's core graphics is for, the other half is digital media processing. High-definition video has become very popular, and many discrete graphics cards are marketed as HD video cards. The current Core i3 can already play high-definition video smoothly, so basic smooth playback is no longer a technical hurdle, but Intel will keep developing video and its derivative functions.

Sandy Bridge contains a media processor that is responsible for video decoding and encoding. In the new hardware-accelerated decoding engine, the entire video pipeline is decoded by fixed-function units. The biggest advantage of this is that power consumption during video playback can be cut in half.

Although Intel has not published the details of the new video encoding engine, a demo at IDF revealed how powerful it is. Converting a 3-minute 1080P 30Mbps HD video to a 640x360 iPhone format may take a few minutes on an average user's computer, but Sandy Bridge needs only 14 seconds, with a conversion speed of up to 400FPS, and all of this takes place in about 3 square millimeters of die area. We can also expect Intel to work closely with more software vendors, so this video transcoding technology should soon be widely supported.
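
The figures from the demo can be cross-checked with simple arithmetic. The clip length and the 14-second conversion time come from the text. The source frame rate is not stated, so the 30 frames per second used below is an assumption, and the function name is hypothetical.

```python
def transcode_stats(clip_seconds, source_fps, transcode_seconds):
    """Return (frames converted per second, speed relative to real time)."""
    total_frames = clip_seconds * source_fps
    frames_per_second = total_frames / transcode_seconds
    realtime_factor = clip_seconds / transcode_seconds
    return frames_per_second, realtime_factor


CLIP_SECONDS = 3 * 60    # from the text: a 3-minute clip
TRANSCODE_SECONDS = 14   # from the text: conversion time in the demo
ASSUMED_SOURCE_FPS = 30  # assumption: the frame rate is not given in the text

fps, factor = transcode_stats(CLIP_SECONDS, ASSUMED_SOURCE_FPS, TRANSCODE_SECONDS)
print(f"average speed: {fps:.0f} frames per second")
print(f"faster than real time by a factor of {factor:.1f}")
```

With a 30 frames per second source, the average works out to roughly 386 frames per second, which is consistent with a quoted peak of up to 400FPS.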

Convergence brings innovation, innovation wins the future

At IDF in mid-September, less than two months ago, Intel demonstrated the Sandy Bridge processor, which directly threatens the market for discrete graphics cards. To show its performance, Intel CEO Paul Otellini ran the hugely popular game "StarCraft 2: Wings of Freedom" on it side by side with a system using a discrete graphics card. Viewers could not tell which one was running on the discrete graphics card and which on the core graphics of the Sandy Bridge processor, which shows that the Sandy Bridge processor can completely replace low-end products. Market research firm iSuppli forecasts that by 2014 about 80% of the PCs shipped in a year will be equipped with such integrated processors.

True integration of CPU and GPU on a 32nm process, several times the 3D performance of previous products, better video and audio playback, higher processor performance, and better control of power consumption and heat: this is the second-generation Intel Core processor that Sandy Bridge brings us.
