If you’re using an LLM, you should constrain the output with a grammar to a structured format like JSON, JSONL, or CSV, so you can load it into scripts and validate that the generated data matches the source data. Though at that point you might as well just parse the raw data yourself. If I were you, I’d honestly use something like pandas/polars or even Excel to get it done reliably, without people bashing you for using the forbidden technology even when you can 100% confirm that the data is real and not hallucinated.
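To make the "validate that the generated data matches the source data" part concrete, here’s a rough sketch using only the Python standard library. It assumes the LLM was asked to emit one JSON object per line (JSONL) and that every source row has a unique `id` field; the function name `validate_rows` and the `id` key are made up for illustration, not part of any library.

```python
import json


def validate_rows(source_rows, llm_output_jsonl, key="id"):
    """Compare JSONL text produced by an LLM against the source rows.

    Returns a list of human-readable problems; an empty list means every
    output record matches a source row exactly and no source row is missing.
    """
    source_by_key = {row[key]: row for row in source_rows}
    problems = []
    seen = set()

    for line_no, line in enumerate(llm_output_jsonl.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {line_no}: not valid JSON ({exc.msg})")
            continue
        # The record must be an object carrying a simple (hashable) key value.
        if not isinstance(record, dict) or not isinstance(record.get(key), (str, int)):
            problems.append(f"line {line_no}: missing or invalid {key!r} field")
            continue
        rec_key = record[key]
        if rec_key not in source_by_key:
            # A record that exists nowhere in the source is a likely hallucination.
            problems.append(f"line {line_no}: {key}={rec_key!r} not in source")
            continue
        seen.add(rec_key)
        if record != source_by_key[rec_key]:
            problems.append(f"line {line_no}: {key}={rec_key!r} differs from source")

    # Report source rows the model silently dropped.
    for missing in sorted(source_by_key.keys() - seen, key=repr):
        problems.append(f"{key}={missing!r} missing from output")
    return problems
```

If this returns anything other than an empty list, reject the output and retry or fall back to parsing the data directly. The exact-equality check is deliberately strict; loosen it only if the model is supposed to transform fields rather than copy them.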
I also wouldn’t use any cloud LLM solution like OpenAI, Gemini, Grok, etc., since those can change, are really hard to validate, and give you little to no control over the model. I’d recommend running an open-weight model like Mistral Nemo 2407 Instruct locally using llama.cpp or vLLM, since the entire setup will not change unless you manually go in and change something. We use a custom fine-tuned version of Mixtral 8x7B Instruct at work in a research setting and it works very well for our purposes (translation and summarization), despite what critics think.
Tl;dr Use pandas/polars if you want 100% reliable results (human error not accounted for). LLMs require lots of work to get reliable output from.
Edit: There’s a lot of misunderstanding about LLMs. You’re not supposed to use the bare LLM for any tasks except extremely basic ones that could be done better by hand. You need to build a system around it for your specific purpose. Using a raw LLM without a Retrieval Augmented Generation (RAG) system and complaining about hallucinations is like using the bare-ass Linux kernel and complaining that you can’t use it as a desktop OS. Of course an LLM will hallucinate like crazy if you give it no data. If someone told you that you had to write a 3-page paper on the eating habits of 14th century monarchs in Europe and locked you in a room with absolutely nothing except 3 sheets of paper and a pencil, you’d probably write something not completely accurate. However, if you had access to the internet and a few databases, you could write something really good and accurate. LLMs are exceptionally good at summarization and translation. You have to give them data to work with first.
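Here’s a toy sketch of what "give them data to work with first" looks like in practice. It is not a real RAG stack: retrieval here is plain word overlap instead of embeddings, there’s no model call, and the names `retrieve` and `build_prompt` are invented for illustration. The point is only the shape of the system: find relevant sources, put them in the prompt, and refuse to ask the model when nothing relevant was found.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Return up to top_k documents sharing the most words with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(doc)), i, doc) for i, doc in enumerate(documents)]
    # Drop documents with no overlap at all, then rank by overlap (ties keep input order).
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc for _, _, doc in scored[:top_k]]


def build_prompt(question, documents, top_k=2):
    """Build a prompt grounded in retrieved sources, or None if nothing matched."""
    context = retrieve(question, documents, top_k)
    if not context:
        return None  # nothing to ground the answer in, so don't ask the model
    sources = "\n".join(f"[{n}] {doc}" for n, doc in enumerate(context, start=1))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
```

The string returned by `build_prompt` is what you’d send to whatever model you run locally. A real system would swap the word-overlap scoring for a proper search index or embedding store, but the "no sources, no answer" rule is the part that cuts down on hallucinations.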
Gonna defend Gen Z a bit here. Unlike older generations, Gen Z was raised in large part on locked-down touchscreen devices like smartphones and tablets. These devices are designed not to be tampered with, and are streamlined to “just work” for certain tasks without any hassle.
If you only have a smartphone or tablet, how are you supposed to learn how to use a desktop OS? How are you supposed to learn how to use a file system? How are you supposed to learn how to install programs outside of a central app store? How are you supposed to learn to type on a physical keyboard if you do not own one?
I worked as a public school technician for a while and we used Chromebooks in my school system. Chromebooks are just as locked down as a smartphone, if not more so, due to school restrictions imposed via Google’s management interface. Sure, they have a physical keyboard and “files”, but many interfaces nowadays are point and click rather than typing. The filesystem (at least on the ones I worked with) was restricted to just the Downloads, Documents, Pictures, etc. directories, with everything else inaccessible.
Schools (at least the ones I went to and worked at) don’t teach typing classes anymore. They don’t teach cursive classes. They don’t teach any classes on how to use technology outside of a few Microsoft certification programs that students have to choose to be in (and that are awfully dull and will put you to sleep).
Gen Z does not have these technology skills because they largely do not have access to anything that they can use to learn these skills and they aren’t taught them by anyone. Gen Z is just expected to know these skills from being exposed to technology but that’s not how it works in the real world.
These people aren’t dumb as rocks either, like so many older people say they are. It’s a bell curve: you’ll have the people dumb as rocks, the average person, and the Albert Einsteins. Most people here on Lemmy fall closer to the “Albert Einstein” end of the tech-savvy curve, so there’s a lot of bias here. But I’ve had so many cases where I’ve met Boomers, Gen Xers, and Millennials who just can’t grasp technology at all.
Also, before someone says “they can just look it up on the internet”, they have no reason to. What’s the point of looking up these skills if they cannot practice them anywhere? Sure, you’ll have a few that are curious and interested in it but a vast majority of people have interests that lie outside of tech skills.
Tl;dr Gen Z is just expected to know technology, so they aren’t taught how to use it and often don’t even have access to non-locked-down devices.