Just ask AI to read the codebase and tell you.
Oh it will definitely tell you something. It will be complete bullshit, but it’s something.
Of all the things AI does well, this is one of them
Spouting bullshit? If so, I agree.
Codebases in the 100k+ to 1M+ SLOC range can be very difficult for an LLM (or a human) to answer detailed questions about. An LLM might be able to point you in the right direction, but it doesn’t have enough context to fit the code, let alone the capability to actually analyze it. Summarize? Sure, but it can only summarize what it has in context.
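To put rough numbers on the context problem, here's a back-of-the-envelope sketch in Python. The 10 tokens per line figure and the 200k-token window in the usage note are assumptions for illustration, not measurements, and `count_lines` and `fits_in_context` are made-up helper names.

```python
import os

# Assumption: a rough average of tokens per source line.
# Real values vary a lot by language and tokenizer.
TOKENS_PER_LINE = 10


def count_lines(root, extensions=(".py", ".c", ".h", ".java", ".go", ".rs", ".ts")):
    """Count lines in source files under root, skipping unreadable files."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                try:
                    path = os.path.join(dirpath, name)
                    with open(path, "r", errors="ignore") as fh:
                        total += sum(1 for _ in fh)
                except OSError:
                    continue
    return total


def fits_in_context(line_count, context_tokens, tokens_per_line=TOKENS_PER_LINE):
    """Return (estimated_tokens, fits) for a codebase of line_count lines."""
    estimated = line_count * tokens_per_line
    return estimated, estimated <= context_tokens
```

Under those assumptions, `fits_in_context(100_000, 200_000)` estimates a million tokens for a 100k-line codebase, so even the low end of the range wouldn't fit in one pass.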
No, I mean summarizing things. It doesn’t need 100k-1M lines in context to give you an accurate picture of what’s going on. It certainly doesn’t need to invent things to tell you. It could totally give you incorrect assessments based on limited information, but it’s bound to be way better than what you could do in a short period of time.
My experience, having actually tried this on a huge codebase: my time was better spent looking at file names and reading source code myself to answer specific questions about the code.
Using it to read a single file or a few of them might go better. If you can find the right files first, you might get decent output.
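For the "find the right files first" part, something like this is one way to go about it: a minimal Python sketch that ranks files by how many of your keywords show up in their paths. `rank_files` is a hypothetical helper, and matching on path names alone is an assumption; it won't catch files whose names say nothing about what's in them.

```python
import os


def rank_files(root, keywords, limit=10):
    """Return up to `limit` relative paths, ranked by keyword hits in the path."""
    lowered = [k.lower() for k in keywords]
    scored = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Score is the number of distinct keywords found in the path.
            score = sum(1 for k in lowered if k in rel.lower())
            if score > 0:
                scored.append((score, rel))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [rel for _score, rel in scored[:limit]]
```

You'd call it as `rank_files("path/to/repo", ["auth", "login"])` and hand only the top few results to the model instead of the whole tree.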