For some workloads, yes. I don’t think that the personal computer is going to go away.
But it also makes a lot of economic and technical sense for some of those workloads.
Historically — like, think up to about the late 1970s — useful computing hardware was very expensive. And most people didn’t have a requirement to keep computing hardware constantly loaded. In that kind of environment, we built datacenters and it was typical to time-share them. You’d use something like a teletype or some other kind of thin client to access a “real” computer to do your work.
What happened at the end of the 1970s was that prices came down enough and there was enough capability to do useful work to start putting personal computers in front of everyone. You had enough useful capability to do real computing work locally. They were still quite expensive compared to the great majority of today’s personal computers:
The original retail price of the computer was US$1,298 (equivalent to $6,700 in 2024)[18][19] with 4 KB of RAM and US$2,638 (equivalent to $13,700 in 2024) with the maximum 48 KB of RAM.
But they were getting down to the point where they weren’t an unreasonable expense for people who had a use for them.
At the time, telecommunications infrastructure was much more limited than it was today, so using a “real” computer remotely from many locations was a pain, which also made the PC make sense.
From about the late 1970s to today, the workloads that have dominated most software packages have been more-or-less serial computation. While “big iron” computers could do faster serial compute than personal computers, it wasn’t radically faster. Video games with dedicated 3D hardware were a notable exception, but those were latency sensitive and bandwidth intensive, especially relative to the available telecommunication infrastructure, so time-sharing remote “big iron” hardware just didn’t make a lot of sense.
And while we could — and to some extent, did — ramp up serial computational capacity by using more power, there were limits on the returns we could get.
However, what AI stuff represents has notable differences in workload characteristics. AI requires parallel processing. AI uses expensive hardware. We can throw a lot of power at things to get meaningful, useful increases in compute capability.
Just like in the 1970s, the hardware to do competitive AI stuff for many things that we want to do is expensive. Some of that is just short term, like the fact that we don’t have the memory manufacturing capacity in 2026 to meet need, so prices will rise to price out sufficient people that the available chips go to whoever the highest bidders are. That’ll resolve itself one way or another, like via buildout in memory capacity. But some of it is also that the quantities of memory are still pretty expensive. Even at pre-AI-boom prices, if you want the kind of memory that it’s useful to have available — hundreds of gigabytes — you’re going to be significantly increasing the price of a PC, and that’s before whatever the cost of the computation hardware is.
Power. Currently, we can usefully scale out parallel compute by using a lot more power. Under current regulations, a laptop that can go on an airline in the US can have an 100 Wh battery and a 100 Wh spare, separate battery. If you pull 100W on a sustained basis, you blow through a battery like that in an hour. A desktop can go further, but is limited by heat and cooling and is going to start running into a limit for US household circuits at something like 1800 W, and is going to be emitting a very considerable amount of heat dumped into a house at that point. Current NVidia hardware pulls over 1kW. A phone can’t do anything like any of the above. The power and cooling demands range from totally unreasonable to at least somewhat problematic. So even if we work out the cost issues, I think that it’s very likely that the power and cooling issues will be a fundamental bound.
In those conditions, it makes sense for many users to stick the hardware in a datacenter with strong cooling capability and time-share it.
Now, I personally really favor having local compute capability. I have a dedicated computer, a Framework Desktop, to do AI compute, and also have a 24GB GPU that I bought in significant part to do that. I’m not at all opposed to doing local compute. But at current prices, unless that kind of hardware can provide a lot more benefit than it currently does to most, most people are probably not going to buy local hardware.
If your workload keeps hardware active 1% of the time — and maybe use as a chatbot might do that — then it is something like a hundred times cheaper in terms of the hardware cost to have the hardware timeshared. If the hardware is expensive — and current Nvidia hardware runs tens of thousands of dollars, too rich for most people’s taste unless they’re getting Real Work done with the stuff — it looks a lot more appealing to time-share it.
There are some workloads for which there might be constant load, like maybe constantly analyzing speech, doing speech recognition. For those, then yeah, local hardware might make sense. But…if weaker hardware can sufficiently solve that problem, then we’re still back to the “expensive hardware in the datacenter” thing.
Now, a lot of Nvidia’s costs are going to be fixed, not variable. And assuming that AMD and so forth catch up, in a competitive market, will come down — with scale, one can spread fixed costs out, and only the variable costs will place a floor on hardware costs. So I can maybe buy that, if we hit limits that mean that buying a ton of memory isn’t very interesting, price will come down. But I am not at all sure that the “more electrical power provides more capability” aspect will change. And as long as that holds, it’s likely going to make a lot of sense to use “big iron” hardware remotely.
What you might see is a computer on the order of, say, a 2022 computer on everyone’s desk…but that a lot of parallel compute workloads are farmed out to datacenters, which have computers more-capable of doing parallel compute there.
Cloud gaming is a thing. I’m not at all sure that there the cloud will dominate, even though it can leverage parallel compute. There, latency and bandwidth are real issues. You’d have to put enough datacenters close enough to people to make that viable and run enough fiber. And I’m not sure that we’ll ever reach the point where it makes sense to do remote compute for cloud gaming for everyone. Maybe.
But for AI-type parallel compute workloads, where the bandwidth and latency requirements are a lot less severe, and the useful returns from throwing a lot of electricity at the thing significant…then it might make a lot more sense.
I’d also point out that my guess is that AI probably will not be the only major parallel-compute application moving forward. Unless we can find some new properties in physics or something like that, we just aren’t advancing serial compute very rapidly any more; things have slowed down for over 20 years now. If you want more performance, as a software developer, there will be ever-greater relative returns from parallelizing problems and running them on parallel hardware.
I don’t think that, a few years down the road, building a computer comparable to the one you might in 2024 is going to cost more than it did in 2024. I think that people will have PCs.
But those PCs might running software that will be doing an increasing amount of parallel compute in the cloud, as the years go by.
For some workloads, yes. I don’t think that the personal computer is going to go away.
But it also makes a lot of economic and technical sense for some of those workloads.
Historically — like, think up to about the late 1970s — useful computing hardware was very expensive. And most people didn’t have a requirement to keep computing hardware constantly loaded. In that kind of environment, we built datacenters and it was typical to time-share them. You’d use something like a teletype or some other kind of thin client to access a “real” computer to do your work.
What happened at the end of the 1970s was that prices came down enough and there was enough capability to do useful work to start putting personal computers in front of everyone. You had enough useful capability to do real computing work locally. They were still quite expensive compared to the great majority of today’s personal computers:
https://en.wikipedia.org/wiki/Apple_II
But they were getting down to the point where they weren’t an unreasonable expense for people who had a use for them.
At the time, telecommunications infrastructure was much more limited than it was today, so using a “real” computer remotely from many locations was a pain, which also made the PC make sense.
From about the late 1970s to today, the workloads that have dominated most software packages have been more-or-less serial computation. While “big iron” computers could do faster serial compute than personal computers, it wasn’t radically faster. Video games with dedicated 3D hardware were a notable exception, but those were latency sensitive and bandwidth intensive, especially relative to the available telecommunication infrastructure, so time-sharing remote “big iron” hardware just didn’t make a lot of sense.
And while we could — and to some extent, did — ramp up serial computational capacity by using more power, there were limits on the returns we could get.
However, what AI stuff represents has notable differences in workload characteristics. AI requires parallel processing. AI uses expensive hardware. We can throw a lot of power at things to get meaningful, useful increases in compute capability.
Just like in the 1970s, the hardware to do competitive AI stuff for many things that we want to do is expensive. Some of that is just short term, like the fact that we don’t have the memory manufacturing capacity in 2026 to meet need, so prices will rise to price out sufficient people that the available chips go to whoever the highest bidders are. That’ll resolve itself one way or another, like via buildout in memory capacity. But some of it is also that the quantities of memory are still pretty expensive. Even at pre-AI-boom prices, if you want the kind of memory that it’s useful to have available — hundreds of gigabytes — you’re going to be significantly increasing the price of a PC, and that’s before whatever the cost of the computation hardware is.
Power. Currently, we can usefully scale out parallel compute by using a lot more power. Under current regulations, a laptop that can go on an airline in the US can have an 100 Wh battery and a 100 Wh spare, separate battery. If you pull 100W on a sustained basis, you blow through a battery like that in an hour. A desktop can go further, but is limited by heat and cooling and is going to start running into a limit for US household circuits at something like 1800 W, and is going to be emitting a very considerable amount of heat dumped into a house at that point. Current NVidia hardware pulls over 1kW. A phone can’t do anything like any of the above. The power and cooling demands range from totally unreasonable to at least somewhat problematic. So even if we work out the cost issues, I think that it’s very likely that the power and cooling issues will be a fundamental bound.
In those conditions, it makes sense for many users to stick the hardware in a datacenter with strong cooling capability and time-share it.
Now, I personally really favor having local compute capability. I have a dedicated computer, a Framework Desktop, to do AI compute, and also have a 24GB GPU that I bought in significant part to do that. I’m not at all opposed to doing local compute. But at current prices, unless that kind of hardware can provide a lot more benefit than it currently does to most, most people are probably not going to buy local hardware.
If your workload keeps hardware active 1% of the time — and maybe use as a chatbot might do that — then it is something like a hundred times cheaper in terms of the hardware cost to have the hardware timeshared. If the hardware is expensive — and current Nvidia hardware runs tens of thousands of dollars, too rich for most people’s taste unless they’re getting Real Work done with the stuff — it looks a lot more appealing to time-share it.
There are some workloads for which there might be constant load, like maybe constantly analyzing speech, doing speech recognition. For those, then yeah, local hardware might make sense. But…if weaker hardware can sufficiently solve that problem, then we’re still back to the “expensive hardware in the datacenter” thing.
Now, a lot of Nvidia’s costs are going to be fixed, not variable. And assuming that AMD and so forth catch up, in a competitive market, will come down — with scale, one can spread fixed costs out, and only the variable costs will place a floor on hardware costs. So I can maybe buy that, if we hit limits that mean that buying a ton of memory isn’t very interesting, price will come down. But I am not at all sure that the “more electrical power provides more capability” aspect will change. And as long as that holds, it’s likely going to make a lot of sense to use “big iron” hardware remotely.
What you might see is a computer on the order of, say, a 2022 computer on everyone’s desk…but that a lot of parallel compute workloads are farmed out to datacenters, which have computers more-capable of doing parallel compute there.
Cloud gaming is a thing. I’m not at all sure that there the cloud will dominate, even though it can leverage parallel compute. There, latency and bandwidth are real issues. You’d have to put enough datacenters close enough to people to make that viable and run enough fiber. And I’m not sure that we’ll ever reach the point where it makes sense to do remote compute for cloud gaming for everyone. Maybe.
But for AI-type parallel compute workloads, where the bandwidth and latency requirements are a lot less severe, and the useful returns from throwing a lot of electricity at the thing significant…then it might make a lot more sense.
I’d also point out that my guess is that AI probably will not be the only major parallel-compute application moving forward. Unless we can find some new properties in physics or something like that, we just aren’t advancing serial compute very rapidly any more; things have slowed down for over 20 years now. If you want more performance, as a software developer, there will be ever-greater relative returns from parallelizing problems and running them on parallel hardware.
I don’t think that, a few years down the road, building a computer comparable to the one you might in 2024 is going to cost more than it did in 2024. I think that people will have PCs.
But those PCs might running software that will be doing an increasing amount of parallel compute in the cloud, as the years go by.