[Concept] A brave fantasy: GPU + CPU + HBM on a single package

pavel

Caliper Novice
Original poster
Sep 1, 2018
Recently, I've been entertaining myself with the idea of what it would take to downsize the current generation "top tier desktop" to the lowest practical limit with current technology.

My line of thinking was the following: for the hardware internals, there are two limiting factors - planar size limits (boards) and volume limits (cooling systems, power supply parts).

ICs take up fairly little volume, and all the ICs in your PC would still take up close to nothing if stacked, but that is usually not practical for thermal, signal, and power reasons. The same is true for planar sizes: most of the board surface is taken up by signal routing, and the ICs themselves hardly occupy more than 10%. Putting everything on a single piece of silicon is only possible with a lot of compromises; you are not going far beyond a smartphone SoC level of performance even if you soup up its constituent parts - bottlenecks appear very fast. On thermals, we can't go further than 100 W per cm², that's the limitation of the silicon; on I/O it's around 300 contacts per cm². Silicon interposers or Intel's EMIB substrate allow you to increase I/O density tenfold, improve thermals a bit, and reduce the chip size (less per-pad area needed), but they are rather costly.
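To put rough numbers on those limits, here's a quick Python sketch (the flux and contact densities are the figures above; the 6 cm² die area is just my assumption for a big GPU-class die):

```python
# Quick arithmetic behind the quoted limits: ~100 W/cm^2 of heat flux and
# ~300 contacts/cm^2 of I/O density. The die area is an assumed figure.
die_area = 6.0          # cm^2, assumption for a large GPU-class die
max_flux = 100.0        # W/cm^2, thermal limit from the post
pad_density = 300.0     # contacts/cm^2, conventional packaging

print(f"max dissipation: {die_area * max_flux:.0f} W")            # 600 W
print(f"max contacts:    {die_area * pad_density:.0f}")           # 1800
# an interposer/EMIB raising I/O density ~10x:
print(f"with interposer: {die_area * pad_density * 10:.0f} contacts")
```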

What if we put every major IC on an interposer along with HBM2? To provide enough bandwidth for a GPU, a 4096-bit memory bus will be a requirement - that's 4 HBM2 stacks. A CPU's memory bandwidth can easily be saturated even by a single stack, but you will really need 2 of them just for memory capacity. There is a huge incentive to somehow make the CPU and GPU share a single memory controller, but to keep things simple, let's just assume 6 stacks.
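A quick back-of-the-envelope behind the stack counts, assuming the commonly quoted HBM2 figures of a 1024-bit interface per stack at ~2 Gb/s per pin:

```python
# HBM2 bandwidth arithmetic: 1024-bit interface per stack, ~2 Gb/s per pin.
pins_per_stack = 1024
gbps_per_pin = 2.0                                  # HBM2 signalling rate
gbs_per_stack = pins_per_stack * gbps_per_pin / 8   # -> 256 GB/s per stack

for stacks in (1, 2, 4, 6):
    print(f"{stacks} stack(s): {stacks * pins_per_stack}-bit bus, "
          f"{stacks * gbs_per_stack:.0f} GB/s")
```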

With all the hot, high-frequency parts on a single package, the rest of the motherboard should be easy. No need for 32-layer PCBs to route memory, nor micrometre-level precision. The whole package should not take much more area than an AMD Threadripper, though you will have to place decoupling caps very densely or bury them in the PCB.



Having the routing area for memory reduced dramatically, and doing away with a separate PCB for graphics and with SDRAM DIMMs, should help a bit with cost, but more importantly lets us say goodbye to out-of-plane PCBs and their height. The few discrete parts remaining on the motherboard are ICs for I/O, ports, and power.

Power will still be a pain: all the "fast and hot" silicon will be running at around 1 V, can easily consume half a kilowatt, and will need current with very, very low ripple. I'll follow up on how to deal with this in a later post.
 

pavel

Caliper Novice
Original poster
Sep 1, 2018
Once it is settled that all the "fast and hot" ICs are placed on a single big package, just like in the picture above, the remaining parts on the board are the I/O and power.

Power will be an issue: feeding a package consuming up to half a kilowatt will not be easy. As voltages will be around 1 V, we get massive ohmic losses and a welding-machine-like current of 500 amps.
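To show why 500 A at ~1 V is scary, here's a minimal sketch of the I²R loss in just a short run of copper plane (the plane dimensions are assumptions, not a real layout):

```python
# I^2*R loss in a short copper run between the VRM and the package.
# Plane dimensions below are illustrative assumptions.
RHO_CU = 1.68e-8  # resistivity of copper, ohm*m

def plane_resistance(length_m, width_m, thickness_m):
    """DC resistance of a rectangular copper plane."""
    return RHO_CU * length_m / (width_m * thickness_m)

current = 500.0                           # amps at ~1 V for a 500 W package
r = plane_resistance(0.02, 0.05, 70e-6)   # 2 cm long, 5 cm wide, 2 oz copper
print(f"resistance: {r * 1e6:.0f} uOhm, loss: {current**2 * r:.0f} W")
```

Roughly 24 W burned in two centimetres of a single 2 oz plane, which is why the conversion has to happen as close to the package as possible.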

One good thing is that you eliminate lots of extraneous voltage rails from the motherboard. With the most power-hungry things in a single module and "power domain," what remains on the PCB is the ~1 V Vcore, 5 V for USB, and 3.3 V for M.2 expansion boards. The 5 V rail might have to be moderately beefy, but it is still nothing in comparison to Vcore.

With buck converters, we wouldn't be able to down-convert the 5 V or 3.3 V rail to Vcore: bucks working from such a low input voltage are not efficient and would vaporise at such current, and in addition any low-voltage rail the current is drawn from would still have to be almost as beefy as the 1 V Vcore itself. We will have to use three separate power rails, as 12 V is no longer on the board.

Now we have the following arrangement:
  • ?V to 1V
  • ?V to 3.3V
  • ?V to 5V
It will be very, very desirable to avoid an intermediate 12 V conversion, simply because the 12 V to Vcore stage would have to be almost as beefy as the main DC-DC, and would be lossy on "both sides."

What if we do 48 V to 1 V directly, and use low-power POLs for the 3.3 V and 5 V? Doing an efficient and compact 48-to-1 conversion will be a monumental task, but it is still doable with current technology. Very recently, high-amperage GaN FETs became available on the open market ( https://eu.mouser.com/ProductDetail...20-1-P-E01-MR?qs=%2bEew9%2b0nqrB2GFgqM79EvQ== ) at around $10 per pop. With 4 such switches, you should be able to pump close to 480 amps at 1 V, and the high switching speeds of GaN parts should allow for smaller passives. At GaN switching speeds, it should also be possible to replace electrolytic caps with polymer caps or high-capacity MLCCs like the recently announced 1000 µF MLCC from Taiyo Yuden.
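A feel for why a direct 48-to-1 is a monumental task: the ideal buck duty cycle at that ratio is about 2%, which leaves vanishingly short on-times at GaN-class switching frequencies. Pure arithmetic, no part data:

```python
# Duty cycle and on-time for an idealised 48 V -> 1 V buck stage.
v_in, v_out = 48.0, 1.0
duty = v_out / v_in                      # ideal buck duty cycle, ~2.1 %

for f_sw in (500e3, 1e6, 2e6):           # plausible GaN switching frequencies
    t_on = duty / f_sw                   # high-side on-time per cycle
    print(f"{f_sw / 1e6:.1f} MHz: duty {duty:.1%}, on-time {t_on * 1e9:.0f} ns")
```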

 

drunker

Trash Compacter
Apr 25, 2017
12 V rails are usually available at the power supply. If the PSU spits out 48 V and it is turned into 1 V using MOSFETs on the PCB, an enormous amount of heat would be created, which is inefficient, whereas using high-frequency transformers in the PSU to convert 300+ V to 12 V would be much more efficient and compatible with a lot of components in the system.
 

pavel

Caliper Novice
Original poster
Sep 1, 2018
drunker said: "12 V rails are usually available at the power supply. [...]"
The idea is that we throw out the dedicated ATX PSU and do the DC-DC on the main PCB, using very high-end parts to minimise switching losses.

GaN FETs can remain 92-93% efficient even in the 1-2 megahertz region. With the size of the passives minimised accordingly, losses in the passives will go down as well.
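On the smaller-passives point: for a fixed current-ripple target, the required buck inductance scales inversely with switching frequency. A sketch with assumed per-phase numbers:

```python
# Required per-phase inductance vs switching frequency for a fixed current
# ripple. The 60 A phase current and 30 % ripple target are assumptions.
v_in, v_out = 48.0, 1.0
i_phase = 60.0
d_i = 0.3 * i_phase          # allowed peak-to-peak ripple current
duty = v_out / v_in

for f_sw in (300e3, 1e6, 2e6):
    # standard buck ripple relation: d_i = v_out * (1 - duty) / (L * f_sw)
    L = v_out * (1 - duty) / (d_i * f_sw)
    print(f"{f_sw / 1e6:.1f} MHz -> {L * 1e9:.0f} nH per phase")
```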
 

drunker

Trash Compacter
Apr 25, 2017
pavel said: "The idea is that we throw out the dedicated ATX PSU and do the DC-DC on the main PCB [...]"
This would theoretically work, but the amount of heat from such FETs would be tremendous. I don't see a datasheet for the part you posted, so I can't work out the efficiency curve. This is a nightmare to cool, potentially producing more heat than the core itself.
 

pavel

Caliper Novice
Original poster
Sep 1, 2018
drunker said: "This would theoretically work, but the amount of heat from such FETs would be tremendous. [...]"
Certainly not: at a 90% worst-case efficiency, you will only be dissipating around 50 W of heat for a 500 W load. That's close to the efficiency of existing Vcore VRMs, and the net efficiency goes up because the separate PSU conversion stage is gone.
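The arithmetic, for clarity (the exact figure depends on whether the 500 W is counted at the converter's input or its output):

```python
# Converter dissipation at 90 % efficiency around a 500 W figure.
p, eta = 500.0, 0.90
print(f"if 500 W is delivered to the load: {p * (1 / eta - 1):.0f} W of heat")  # ~56 W
print(f"if 500 W is drawn from the input:  {p * (1 - eta):.0f} W of heat")      # 50 W
```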

Here are some studies from a vendor:

https://epc-co.com/epc/GaNTalk/Post/14229/48V-to-1V-Conversion-the-Rebirth-of-Direct-to-Chip-Power
http://epc-co.com/epc/Portals/0/epc/documents/presentations/PwrSOC10032016.pdf
 

drunker

Trash Compacter
Apr 25, 2017
pavel said: "Certainly not: at a 90% worst-case efficiency, you will only be dissipating around 50 W of heat for a 500 W load. [...]"
Interesting. This theoretically will work; we just need to see what direction the big chipmakers take. One thing I would like to point out is that the studies never show how clean the output is.
 

pavel

Caliper Novice
Original poster
Sep 1, 2018
If we want to be even more adventurous, there is the "power-on-package" approach, where the POL is placed right on the package: http://www.vicorpower.com/industries-computing/power-on-package-technology.

That minimises "last centimetre" losses even further, but I suppose the switching element itself will be more lossy given that it is silicon-based. Whether any of the big chipmakers go this route will almost wholly depend on whether they are OK with being held hostage by a sole solution supplier.

Another great brief on 48-to-1: https://www.st.com/content/ccc/reso...ations/en.APEC_2017_48V_Direct_Conversion.pdf
 

pavel

Caliper Novice
Original poster
Sep 1, 2018
drunker said: "Interesting. This theoretically will work [...] the studies never show how clean the output is."
If we are determined to go further down the rabbit hole, we can think about other soft-switching topologies. The main advantage of multiphase bucks is that they are easy to control and deal with current spikes more or less gracefully, while being very simple.

All kinds of resonant converters that switch at zero current are more efficient by default, but the real-world BOM for them shoots through the roof.
 

drunker

Trash Compacter
Apr 25, 2017
You know, I stumbled upon this topic just days after I replied to the thread, and I started digging into it (I'm not an electrical engineer btw, just a hobbyist). From the information I have gathered in a couple of days, this is really revolutionary stuff, and the world is ready to adopt it in a short period of time: from data centres to electronics appliances and your mobile devices. I do hope we can have further discussions about other topics too. But we don't have to keep on replying to the thread. This will bother other people. Just PM me whenever you like.
 

SashaLag

SFF Lingo Aficionado
Jun 10, 2018
drunker said: "But we don't have to keep on replying to the thread. This will bother other people. Just PM me whenever you like."
I have been lurking on this thread since it started... please don't! :)
I'm really interested in this topic, even if (or maybe because) I don't know much about it...
If you are willing to share studies about your findings, that would be awesome too!

The only recent document I have found so far is this one, which I find really interesting: Electrical, Electromagnetic, and Thermal Measurements of 2-D and 3-D Integrated DC/DC Converters.
 

pavel

Caliper Novice
Original poster
Sep 1, 2018
SashaLag said: "I have been lurking on this thread since it started... please don't! :) [...]"
I'll introduce you to the theory.

So, the main thing in every DC-DC converter is a switch, which can be any type of transistor but is usually a MOSFET. Secondary to it come the capacitors and inductors that store energy to feed the output while the switch is off.

Theory of operation: the switch turns on until the voltage on the output reaches the needed level, then it turns off until the energy stored in the capacitors and inductors runs out and the output drops below the lower end of the needed level, at which point it turns on again. The distance between the upper and lower limits of the output voltage is called ripple. Normally you want to make it as low as possible.
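Here's a toy simulation of exactly this on/off regulation loop, just to show the ripple band emerging; all the step sizes are arbitrary illustrative values:

```python
# Bang-bang regulation toy model: charge until the upper threshold, coast
# down until the lower one. Step sizes are arbitrary illustrative values.
v_target, band = 1.0, 0.02               # 1 V target, +/-20 mV thresholds
charge, discharge = 0.005, 0.002         # per-step voltage change
v, on, trace = 0.0, True, []

for _ in range(2000):
    v += charge if on else -discharge
    if v >= v_target + band:
        on = False                       # upper limit reached: switch off
    elif v <= v_target - band:
        on = True                        # lower limit reached: switch on
    trace.append(v)

steady = trace[500:]                     # skip the start-up ramp
print(f"ripple: {(max(steady) - min(steady)) * 1000:.0f} mV peak-to-peak")
```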

There are many schemes for DC-DC converters, called topologies, but their fundamental principle of operation is the same. There are two main classes: hard switching, where the switching element turns on and off with current flowing through it, and soft switching, where the switching element switches at zero or at least non-peak current.

In almost all cases, the main lossy element in a DC-DC is that very switching element, because it takes energy to switch the transistor on and off. Secondary to that are losses from energy escaping the inductors and capacitors. Lastly, there are ohmic losses - heating from the electric current in all parts of the circuit.

There are two main routes: first, reducing the loss in the switching element by using a more complicated topology that allows it to switch softly, not under load; second, simply using a better semiconductor material for the switch and running it at a higher frequency, which also lets you use smaller capacitors and inductors because less energy needs to be stored in each cycle.
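The loss the first route attacks can be sketched with the classic first-order approximation: hard-switching loss grows linearly with frequency, which is what zero-current switching removes and what faster GaN edges shrink (all numbers below are assumptions):

```python
# First-order hard-switching loss: P_sw ~ 0.5 * V * I * t_transition * f_sw.
# Switched voltage/current and transition time are assumed figures.
v_sw, i_sw = 48.0, 60.0
t_tr = 5e-9                              # combined rise + fall time, 5 ns

for f_sw in (300e3, 1e6, 2e6):
    p_sw = 0.5 * v_sw * i_sw * t_tr * f_sw
    print(f"{f_sw / 1e6:.1f} MHz: ~{p_sw:.1f} W of switching loss per switch")
```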

The industry-standard DC-DC topology is the buck converter, a hard-switching topology - the simplest device possible, and the "lowest common denominator": moderate current-spike tolerance, bad power density, excellent BOM for low-power designs, quite noisy, ripple that has to be suppressed by passives, efficiency around 85%.

What motherboard and GPU VRMs actually are is a lot of simple buck converters working in parallel, and the number of phases keeps growing because the switching elements have hit their frequency and power ceiling. Here the low power density of buck converters rears its head. This is why there is so much interest in switching to something better: better switches, or a new topology altogether.
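Why parallelling helps, in one line of arithmetic (the 480 A total and 1 mΩ on-resistance are assumed figures): conduction loss scales with I², so splitting the current across N phases cuts the total to 1/N:

```python
# Conduction loss vs phase count for a fixed total current and per-switch
# on-resistance. The 480 A and 1 mOhm figures are assumptions.
total_i, rds_on = 480.0, 1e-3

for phases in (1, 4, 8, 16):
    i_ph = total_i / phases
    p_cond = phases * i_ph**2 * rds_on   # per-phase I^2*R, summed
    print(f"{phases:2d} phase(s): {i_ph:.0f} A each, {p_cond:.0f} W conduction loss")
```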

With a better switching element like a GaN FET, manufacturers can keep their old designs by simply replacing the most lossy part in them. Being a drop-in replacement is great, but it doesn't address any of the buck converter's inherent deficiencies.

Proponents of better topologies think that this is a great moment to commit the rest of the industry to "fixing the issue properly" this time. Besides better power density, soft switching has the benefits of lower noise, lower inherent ripple in some topologies, and up to 98% efficiency in some extremely over-engineered examples.

But going the route of complex topologies requires a huge effort that I guarantee the average motherboard maker will not commit to. Some of the issues with soft-switching topologies (a minimal sketch of the resonant idea follows the list):
  1. It is hard to make them efficient under a variable load.
  2. Unlike bucks, where you control the lower and upper voltage through the switching frequency and the capacitance, some soft-switching topologies are non-linear in that regard.
  3. Most designs require a complex active control mechanism to regulate the power output; with bucks you simply raise the switching frequency.
  4. Some soft-switching topologies are great at handling power spikes, but some simply can't. It is a matter of complex trade-offs.
  5. They by definition require more parts and board space, but they do have an advantage over bucks once bucks get bottlenecked by power density.
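On the non-linearity point, here is a minimal sketch of the resonant idea: the converter wants to switch at its LC tank's resonant frequency so that current crosses zero at the switching instants, and regulating the output means detuning from that sweet spot (component values are arbitrary):

```python
# Resonant frequency of an LC tank, the heart of a soft-switching converter.
# Component values are arbitrary, chosen only to land near 1-2 MHz.
import math

L_r, C_r = 100e-9, 100e-9                # 100 nH, 100 nF, assumed
f_r = 1 / (2 * math.pi * math.sqrt(L_r * C_r))
print(f"resonant frequency: {f_r / 1e6:.2f} MHz")
# Switching at f_r gives zero-current transitions; regulating the output
# means detuning from f_r, which is why the control law is non-linear.
```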
 

SashaLag

SFF Lingo Aficionado
Jun 10, 2018
Thank you :)

I already have a small background in standard DC-DC regulators, as I am using some (like the TI TPS54334) in the two boards I am designing for my thesis... but those are very conventional non-GaN DC-DC regulators, with a smaller gap between the input and output range... so what you are talking about is new to me, as I am much more at home in digital design than in analogue/power design!

Anyway, this is an area I have always wanted to investigate more, so your conversation gives me a lot of keywords to start with... It's unlikely I will be able to make big contributions here in the near future... but everybody has to start somewhere, and your contributions here are highly appreciated!
 

drunker

Trash Compacter
Apr 25, 2017
One of the things keeping GaN FETs from trickling down to consumer products is the lack of infrastructure around current power stages. There are only a handful of integrated GaN FET power stages with the driver IC and the necessary features built into one chip. There is always the route of going with external components, but that is not as convenient.

Even though there are not many parts available, lots of cool products can still be made with a couple of added components. If someone were willing to design a power supply, it could be a third or less of the size of what is currently on the market with the same ratings. I am certain people on this forum would really like to see that happen.

This is no longer fantasy!!!
 