Why a GPU mines faster than a CPU – Bitcoin Wiki

Some Bitcoin users might wonder why there is such a big disparity between the mining output of a CPU and that of a GPU.

First, just to clarify, the CPU, or central processing unit, is the part of the computer that performs the will of the software loaded on the computer. It’s the main executive for the entire machine. It is the master that tells all the parts of the computer what to do, in accordance with the program code of the software and, hopefully, the will of the user.

Most computers have multi-core CPUs nowadays (which is almost the same thing as having multiple CPUs in a single physical package), and some computers even have multiple CPUs.

The CPU is usually a removable component that plugs into the computer’s main circuit board, or motherboard, and sits underneath a large metallic heat sink, which usually has a fan. A few are cooled by water.

The GPU, or graphics processing unit, is a part of the video rendering system of a computer. The typical function of a GPU is to assist with the rendering of 3D graphics and visual effects so that the CPU doesn’t have to.

Servers usually have very limited or no GPU facilities, as they are mostly managed over a text-based remote interface. Most mainstream computers have IGPs (Integrated Graphics Processors), which are also GPUs but are integrated directly into the chipset and soldered onto the motherboard. IGPs are much slower, but cheaper and less power-hungry, than the more powerful separate GPUs found on AGP or PCIe cards. Powerful GPUs are needed mostly for graphics-intensive tasks such as gaming or video editing. Examples include the translucent windows in Windows 7, or technologies like Mac OS X’s Quartz, which powers the Aqua desktop and its water-like graphical effects and animations, such as the Dock bulging smoothly when the mouse is moved to the lower edge of the screen, or windows being “sucked” into the Dock when they are minimized. These are all powered by GPUs.

A GPU is like a CPU, but there are important internal differences that make each suited to its own special tasks. These are the differences that make Bitcoin mining far more favorable on a GPU.


Short answer

A CPU core can execute 4 32-bit instructions per clock (using a 128-bit SSE instruction) or 8 via AVX (256-bit), whereas a GPU like the Radeon HD 5970 can execute 3200 32-bit instructions per clock (using its 3200 ALUs or shaders). This is a difference of 800 (or 400 in the case of AVX) times more instructions per clock. As of 2011, the fastest CPUs have up to 6, 8, or 12 cores and a somewhat higher clock frequency (2000-3000 MHz vs. 725 MHz for the Radeon HD 5970), but one HD 5970 is still more than five times faster than four 12-core CPUs at 2.3 GHz (which would also set you back about $4700 rather than $350 for the HD 5970).
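The figures in this short answer can be checked with a few lines of arithmetic. The sketch below is only a peak-throughput estimate built from the numbers quoted above; it assumes the four 12-core CPUs use 128-bit SSE (4 operations per clock per core) and that every ALU is kept busy on every clock, which real workloads rarely achieve. The variable names are made up for illustration.

```python
# Peak 32-bit instruction throughput, using only the figures quoted in the text.
SSE_OPS_PER_CLOCK = 4      # one 128-bit SSE instruction = four 32-bit operations
AVX_OPS_PER_CLOCK = 8      # one 256-bit AVX instruction = eight 32-bit operations
GPU_ALUS = 3200            # Radeon HD 5970 ALUs (shaders)
GPU_CLOCK_MHZ = 725
CPU_COUNT = 4              # four CPUs ...
CORES_PER_CPU = 12         # ... with 12 cores each
CPU_CLOCK_MHZ = 2300       # 2.3 GHz

# Per-clock advantage of the GPU over a single CPU core.
per_clock_ratio_sse = GPU_ALUS // SSE_OPS_PER_CLOCK
per_clock_ratio_avx = GPU_ALUS // AVX_OPS_PER_CLOCK

# Peak throughput in millions of 32-bit instructions per second.
# Assumption: the CPUs use SSE, and every ALU is busy on every clock.
gpu_mops = GPU_ALUS * GPU_CLOCK_MHZ
cpu_mops = CPU_COUNT * CORES_PER_CPU * SSE_OPS_PER_CLOCK * CPU_CLOCK_MHZ

speedup = gpu_mops / cpu_mops

print(f"per clock: {per_clock_ratio_sse}x (SSE), {per_clock_ratio_avx}x (AVX)")
print(f"one HD 5970 vs four 12-core CPUs: {speedup:.2f}x")
```

Under these assumptions the ratio comes out slightly above five, which agrees with the "more than five times faster" claim.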

A CPU is an executive

A CPU is designed primarily to be an executive and make decisions, as directed by the software. For example, if you type a document and save it, it is the CPU’s job to turn your document into the appropriate file type and direct the hard disk to write it as a file. CPUs can also do all kinds of math, as inside every CPU there are one or more “Arithmetic/Logic Units” (ALUs). CPUs are also very capable of following instructions of the form “if this, do that, otherwise do something else”. A large share of the structures inside a CPU are concerned with making sure that the CPU is ready to switch to a different task at a moment’s notice when needed.

CPUs also have to deal with quite a few other things which add complexity, including:

  • enforcing privilege levels and the boundaries between user programs and the operating system
  • creating the illusion of “virtual memory” to programs
  • for the most popular processors, being backwards compatible with legacy code

A GPU is a laborer

A GPU is very different. Yes, a GPU can do math, and can also do “this” and “that” based on specific conditions. However, GPUs have been designed to be very good at video processing, and less good at executive work.

Video processing is a lot of repetitive work, since the processor is constantly being told to do the same thing to large groups of pixels on the screen. In order to make this run efficiently, video processors are far stronger at doing repetitive work than at rapidly switching tasks.

GPUs have large numbers of ALUs, far more than CPUs. As a result, they can do much larger amounts of bulk mathematical work than CPUs.

Analogy

One way to visualize it is that a CPU works like a small group of very smart people who can quickly do any task given to them. A GPU is a large group of relatively dumb people who aren’t individually very fast or smart, but who can be trained to do repetitive tasks, and collectively can be more productive just due to the sheer number of people.

It’s not that a CPU is fat, spoiled, or lazy. Both CPUs and GPUs are made from billions of microscopic transistors crammed onto a small piece of silicon. On silicon chips, size is expensive. The structures that make CPUs good at what they do take up lots of space. When those structures are omitted, that leaves plenty of room for many “dumb” ALUs, which individually are very small.

The ALUs of a GPU are partitioned into groups, and each group of ALUs shares control logic, so members of the group cannot be made to work on separate tasks. They can either all work on almost identical variations of one single task, in perfect sync with one another, or do nothing at all. Trying different hashes repeatedly, which is the process behind Bitcoin mining, is a very repetitive task suitable for a GPU, with each attempt varying only by the changing of one number (called a “nonce”) in the data being hashed.
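The nonce search described above can be sketched in a few lines. This is a simplified illustration, not a real miner: the header bytes, the `mine` function, and the easy target are made up for the example, and the loop runs serially on a CPU, whereas a GPU would try many nonces at once. The double SHA-256 and the comparison against a target do follow how Bitcoin checks a block hash.

```python
import hashlib
import struct

def double_sha256(data: bytes) -> bytes:
    """Bitcoin hashes block headers with SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash, read as an integer, is below target.

    header_prefix is a hypothetical stand-in for the fixed part of a block
    header; only the 4-byte nonce appended to it changes between attempts.
    """
    for nonce in range(max_nonce):
        candidate = header_prefix + struct.pack("<I", nonce)  # little-endian nonce
        digest = double_sha256(candidate)
        # The only "decision" in the whole loop: is this a valid block or not?
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None

# Assumed toy difficulty: roughly 1 in 4096 hashes succeeds.
easy_target = 1 << 244
prefix = b"hypothetical block header bytes"
result = mine(prefix, easy_target, max_nonce=1_000_000)
if result is not None:
    found_nonce, found_digest = result
    print("nonce:", found_nonce, "hash:", found_digest[::-1].hex())
```

Every iteration performs the same arithmetic on almost the same input, which is why the work maps well onto groups of ALUs running in lockstep.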

The ATI Radeon 5970 is a popular video card for Bitcoin mining and, to date, offers the best known performance of any video card for this purpose.

This particular card has 3,200 “Stream Processors”, which can be thought of as 3,200 very dumb execution units that can be trained to all do the same repetitive task, just so long as they don’t have to make any decisions that interrupt their flow. Those execution units are contained in blocks. The 5970 uses a VLIW-5 architecture, which means the 3,200 Stream Processors are actually 640 “cores”, each able to process 5 instructions per clock cycle. Nvidia would call these cores “CUDA Cores”, but as mentioned in this article, Nvidia’s cores are not VLIW, meaning they cannot do as much work per cycle. This is why comparing graphics cards by core count alone is not an accurate method of determining performance, and this is also why Nvidia lags so far behind ATI in SHA-256 hashing.

Since ALUs are what do all the work of Bitcoin mining, the number of available ALUs has a direct effect on the hash output. Compare that to a 4-core CPU that can switch tasks on a dime, but has ALUs in some small multiple of four, if not just four ALUs alone. Trying a single SHA-256 hash in the context of Bitcoin mining requires around 1,000 simple mathematical steps that must be performed entirely by ALUs.

That, in a nutshell, is why GPUs can mine Bitcoins so much faster than CPUs. Bitcoin mining requires no decision making; it is repetitive mathematical work for a computer. The only decision that must be made in Bitcoin mining is “do I have a valid block” or “do I not”. That’s an excellent workload to run on a GPU.

Why are AMD GPUs faster than Nvidia GPUs?

Firstly, AMD designs GPUs with many simple ALUs/shaders (VLIW design) that run at a relatively low clock frequency (typically 1120-3200 ALUs at 625-900 MHz), whereas Nvidia’s microarchitecture consists of fewer, more complex ALUs and tries to compensate with a higher shader clock (typically 448-1024 ALUs at 1150-1544 MHz). Because of this VLIW vs. non-VLIW difference, Nvidia uses up more square millimeters of die space per ALU, hence can pack fewer of them per chip, and they hit the frequency wall sooner than AMD, which prevents them from increasing the clock high enough to match or surpass AMD’s performance. This translates to a raw ALU performance advantage for AMD:

  • AMD Radeon HD 6990: 3072 ALUs x 830 MHz = 2550 billion 32-bit instructions per second
  • Nvidia GTX 590: 1024 ALUs x 1214 MHz = 1243 billion 32-bit instructions per second

This approximate 2x-3x performance difference exists across the entire range of AMD and Nvidia GPUs. It is very visible in all ALU-bound GPGPU workloads such as Bitcoin mining, password bruteforcers, etc.
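The two products listed above are simply ALU count times clock frequency. As a quick check, assuming one 32-bit instruction per ALU per clock (the helper name `peak_ginstr_per_s` is made up for this sketch):

```python
def peak_ginstr_per_s(alus: int, clock_mhz: int) -> float:
    """Peak billions of 32-bit instructions per second.

    Assumes each ALU retires one instruction on every clock cycle.
    """
    return alus * clock_mhz / 1000  # MHz x ALUs = millions/s; /1000 gives billions/s

amd_hd6990 = peak_ginstr_per_s(3072, 830)
nvidia_gtx590 = peak_ginstr_per_s(1024, 1214)
raw_ratio = amd_hd6990 / nvidia_gtx590

print(f"AMD Radeon HD 6990: {amd_hd6990:.0f} billion instructions/s")
print(f"Nvidia GTX 590:     {nvidia_gtx590:.0f} billion instructions/s")
print(f"raw ALU advantage:  {raw_ratio:.2f}x")
```

For this particular pair of cards the raw ratio sits at the low end of the 2x-3x range.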

Secondly, another difference favoring Bitcoin mining on AMD GPUs instead of Nvidia’s is that the mining algorithm is based on SHA-256, which makes heavy use of the 32-bit integer right rotate operation. This operation can be implemented as a single hardware instruction on AMD GPUs (BIT_ALIGN_INT), but requires three separate hardware instructions to be emulated on Nvidia GPUs (2 shifts + 1 add). This alone gives AMD another 1.7x performance advantage (about 1900 instructions instead of about 3250 to execute the SHA-256 compression function).

Combined, these two factors make AMD GPUs overall 3x-5x faster when mining Bitcoins.
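To make the rotate argument concrete, here is a minimal sketch of a 32-bit right rotate and of the "two shifts plus one add" emulation described above. It is plain Python for illustration only, not GPU code; the function names are made up, and `big_sigma0` is one of the SHA-256 functions that relies on rotates.

```python
MASK32 = 0xFFFFFFFF

def rotr32(x: int, n: int) -> int:
    """Rotate the 32-bit value x right by n bits (0 <= n < 32).

    On hardware with a native rotate this is a single instruction.
    """
    return ((x >> n) | (x << (32 - n))) & MASK32

def rotr32_emulated(x: int, n: int) -> int:
    """The same rotation built from two shifts and one add.

    The two shifted halves never share a set bit, so adding them
    gives the same result as OR-ing them.
    """
    low = x >> n                      # shift 1: bits that stay at the bottom
    high = (x << (32 - n)) & MASK32   # shift 2: bits that wrap around to the top
    return (low + high) & MASK32      # add: combine the two halves

def big_sigma0(x: int) -> int:
    """SHA-256 'big sigma 0': three rotates XOR-ed together."""
    return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22)

# Instruction counts quoted in the text for one SHA-256 compression.
instruction_ratio = 3250 / 1900

print(hex(rotr32(0x12345678, 8)))
print(f"instruction-count advantage: {instruction_ratio:.2f}x")
```

Because rotates appear in every round of the compression function, turning each one from one instruction into three adds up across the whole hash.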

NVIDIA Releases New Generations of GPU Cards

NVIDIA’s new flagship card, the “GeForce GTX 690”, is now more powerful than its junior sibling, the GTX 590. EVGA has also decided to use the same chipset on its flagship card, the “EVGA GeForce GTX 690 Signature”. But what are the comparative figures for the AMD and new NVIDIA GPUs? See some performance specs below:

GeForce GTX 690 (4,096 MB):

GPU Clock 915 (1,019) MHz, 5,621 GFLOPS Single Precision, Double Precision figures unavailable, 3072 ALUs (the manufacturer refers to these as CUDA Cores)

AMD Radeon HD 6990:

830 MHz Engine Clock, 5,100 GFLOPS Single Precision, 1,270 GFLOPS Double Precision, 3072 ALUs
