Just write your code so that you (almost) don’t need comments (i.e. keep it simple to read). The problems with (most) comments won’t be solved by AI.
(In-code) comments only make sense to me in roughly these scenarios:
a complex piece of math, possibly from a paper (e.g. a ref to the paper where it is explained)
function docs; here AI may really help at some point (right now I’d rather write them myself; AI represents this post very well, in an even more verbose and literate way than the typical junior dev…)
The problems with comments are explained well IMHO here: https://www.youtube.com/watch?v=Bf7vDBBOBUA
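A minimal Rust sketch of the two scenarios where comments earn their place (an illustration, not code from the linked video). The function name `kahan_sum` and its variable names are made up for this example. The doc comment says what the function does; the in-code comments point to the paper and explain the one line that looks wrong at first glance.

```rust
/// Sums `values` using compensated (Kahan) summation.
///
/// Returns `0.0` for an empty slice. The accumulated rounding error stays
/// small even for long inputs, unlike a naive running sum.
pub fn kahan_sum(values: &[f64]) -> f64 {
    let mut sum = 0.0;
    // Running compensation for lost low-order bits; see W. Kahan,
    // "Further remarks on reducing truncation errors",
    // Communications of the ACM, 1965.
    let mut compensation = 0.0;
    for &value in values {
        let corrected = value - compensation;
        let new_sum = sum + corrected;
        // Algebraically this is zero; in floating point it recovers the
        // part of `corrected` that was rounded away in `new_sum`.
        compensation = (new_sum - sum) - corrected;
        sum = new_sum;
    }
    sum
}
```

Everything else in the function reads fine without comments; only the math needs the pointer.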
This couples intentions to the code which in my example would be dynamic.
That’s going to be a bad time.
My point is that the conventions that have been good for the past 50 years of development are likely going to change as tooling does.
Programming is effectively about managing complexity.
Yes, the abstraction of a development language being the layer at which you encode intention rather than in comments is better when humans are reading and writing the code itself.
But how many projects have historically run into problems when a decade earlier they chose a language that years later is stagnating in tooling or integrations versus another pick?
Imagine if the development work had been done exclusively in pseudocode and comments guiding generative AI writing in language A. How much easier might porting everything to language B end up being?
Language-agnostic development may be quite viable within a year or so.
And while you could write software in binary, letting a compiler do that and working at an abstracted layer is more valuable in time and cost.
I’m saying that the language is becoming something which software can effectively abstract, so moving the focus yet another layer up will likely be more valuable than clinging to increasingly obsolete paradigms.
I doubt that very much. GPT4 (to my knowledge still the best LLM) is far from being there. Now that (my) initial hype has worn off, I have basically stopped using it, because I have to “help” it too much (and it got really worse over time…), so I spend more time getting any usable results from it than I would just writing the goddamn code myself. It will take a very large step forward before this is anywhere near feasible (though maybe it already is for some “boilerplate” React UI code). Keep in mind that you still have to review all the code, which takes a good chunk of the time (especially if it’s full of issues, as it is with LLMs). Often I go over it and think, yes, this is OK, and then I check it in more detail and find a lot of issues that cost me more time than writing the code myself would have in the first place.
I have actually fed GPT4 a lot of natural language instructions to write code, and it was kind of a disaster. I have to try that again with more code in the instructions, as I think it’s better to just give an LLM the code directly; if it really gets smart enough, it will understand the intentions of the code without comments (as it has seen a lot of code).
Context size is also a big issue: the LLM just doesn’t have as much of an overview of the code and the relevant details (I need to try the 32k GPT4 model, though, and feed it more of the architecture’s code; this may help, but is obviously a lot of work…).
The same goes for humans: if your code is really too complex, you can likely simplify it so that humans can read it without comments. If not, it falls for me into the first category I’ve listed (complex math or similar). And then of course comments make sense for a complex piece of code that may need more context. Otherwise I would only add comments for edge cases and ideas (e.g. TODO).
For the rest, a good API doc (javadoc, rustdoc etc.) is more than enough (if it’s clear what a function should do and the function is written in a modular way, it should be easy to read the code IMHO).
Really, if you need comments, think about the code first: is it the simplest approach? Can you make it more readable? I feel like I have written a lot of “unreadable” (or too complex) code in my junior years…
What otherwise makes sense to me is a high-level description of the architecture.
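A small sketch of what “make it more readable instead of commenting it” can look like in Rust. All names here (`check`, `Customer`, `MINIMUM_AGE`, `can_check_out`) are hypothetical and only serve the illustration. The first function needs its comment to be understood; the second version carries the same intent in its names.

```rust
// Before: the comment has to explain what the condition means.
pub fn check(a: u32, n: usize, b: bool) -> bool {
    // user is an adult, cart is not empty, user is not banned
    a >= 18 && n > 0 && !b
}

// After: the names carry the intent, so the comment becomes redundant.
pub const MINIMUM_AGE: u32 = 18;

pub struct Customer {
    pub age: u32,
    pub cart_items: usize,
    pub is_banned: bool,
}

impl Customer {
    pub fn is_adult(&self) -> bool {
        self.age >= MINIMUM_AGE
    }

    pub fn has_items_in_cart(&self) -> bool {
        self.cart_items > 0
    }

    pub fn can_check_out(&self) -> bool {
        self.is_adult() && self.has_items_in_cart() && !self.is_banned
    }
}
```

Both versions compute the same thing; the difference is only in how much the reader has to be told separately.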
How were you feeding it?
There’s a world of difference between using ChatGPT and something like Copilot within a mature codebase.
Once a few of the Copilot roadmap features are added, I suspect you’ll be seeing yet another leap forward.
Too many people commenting on this subject focus on where the tech is today without appropriately considering the jump from where it was a year ago, and what that means for next year or the year after.
I’m mostly using ChatGPT4, because I don’t use VS Code (I use Helix), and from what I could see from colleagues, the current Copilot(X) is not helpful at all…
I describe the problem (context etc.), maybe paste some code, and hope that it gets what I mean. When it doesn’t (which seems to be rather often), I try to help it with the context it hasn’t gotten, but it very often fails unless the code is rather simple (i.e. boilerplaty). But even if I want GPT4 to generate a bunch of boilerplate, it inserts something like
// repeat this 20 times
in between the code that it should actually generate, and even if I tell it multiple times that it should generate the exact code, it fails pretty much all the time, even with increased context size via the API, where it should actually be able to do it in one go. The gpt4-0314 model (via the API) seems to be a bit better here.
I’m absolutely interested in where this leads, and I’m the first to monitor all the changes, but right now it slows me down rather than really helping me. Copilot may be interesting in the future, but right now it’s dumb as fu… I’m not writing boilerplaty code, it’s rather complex stuff, and it fails catastrophically there; I don’t see that changing in the near future. GPT4 got dumber over the course of the last half year; it was certainly better at the beginning. I can remember being rather impressed by it, but now, meh…
It’s good for natural language stuff though, but not really for novel creative stuff in code (I’m doing most stuff in Rust btw.).
But GPT5 will be interesting. I doubt that I’ll really profit from it for code-related stuff (maybe GPT6 then or so), but we’ll see… All the other developments in that space are also quite interesting. So when it’s actually viable to train or constrain your own LLM on your own bigger codebase, such that it really gets the details and gives actually helpful suggestions (e.g. something like the recent CodeLlama release), this stuff may become more interesting for actual coding.
I’m not even letting it generate comments (e.g. above functions), because currently it’s kinda like this (figuratively; fancier but wordy, and not really helpful):
// this variable is of type int
let a = 8;
I can’t disagree with your colleagues more, and suppose that perhaps they are reporting experiences in a fresh codebase or early on in its release.
With a mature codebase, it feeds a lot of that in as context, and so suggestions match your naming conventions, style, etc.
It could definitely use integration with a linter so it doesn’t generate subtle bugs around generative naming mismatching actual methods/variables, but it’s become remarkably good, particularly in the past few weeks.
BTW, if you want more mileage out of ChatGPT, I would actually encourage it to be extremely verbose with comments. You can always strip them out later, but the way generative models work, the things it generates along the way impact where it ends up. There’s a whole technique around having it work through problems in detailed thoughts called “chain of thought prompting”, and you’ll probably get much better results by instructing it to work through what needs to be done in a comment before it writes the code than by just having it write the code.
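As a rough sketch of that “plan in a comment first” idea, here is a tiny Rust helper that only builds the prompt text. The function name `plan_first_prompt` and the exact wording of the instructions are assumptions made for illustration, not a tested recipe, and no model API is called.

```rust
/// Builds a prompt that asks the model to write its plan as a comment
/// block before it writes any code.
pub fn plan_first_prompt(task: &str, language: &str) -> String {
    // Hypothetical wording; adjust to whatever works for your model.
    format!(
        "Task: {task}\n\
         Language: {language}\n\
         First, write a comment block that walks through the steps you will take.\n\
         Then write the complete code below that comment. Do not abbreviate any part."
    )
}
```

Calling `plan_first_prompt("parse a CSV line", "Rust")` gives a string you can paste into the chat; the plan comment can be stripped from the answer afterwards.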
And yes, I’m particularly excited to see where the Llama models go, especially as edge hardware is increasingly tailored for AI workloads over the next few years.
Maybe I should try Copilot again. I doubt, though, that it would really help me: I’m a fast typist, and I don’t like being interrupted all the time by something wrong (or not really useful) when I’m in a creative phase (a good LSP like rust-analyzer seems to be the sweet spot, I think). And something like Copilot seems to just confuse me all the time, either by showing plain wrong stuff, or something like: what does it want? Ahh, makes sense -> but why this way, that way is better (and then I write it how I would’ve done it instead). So I’ll just skip it, for more complex stuff at least.
But it would be interesting to see how it does with code that’s a little less exotic/living on the edge of the language, like typical frontend or backend stuff.
In what context are you using it such that it gives good results?
As for encouraging it to be extremely verbose with comments: yeah, I don’t know. I’m not writing the code to feed it to an LLM; I like to write it for humans, with good function docs (for humans). I hope that an LLM will be smart enough some day to get the context. That may be soon enough, but until then I don’t see a real benefit of LLMs for code (other than as (imprecise) boilerplate generators).