• pixxelkick@lemmy.world
    5 days ago

    If I were to design this, and I do stuff like this for a living, I would have the AI only compose the query, not handle the results. My API itself would perform the query and return the results.

    This would ensure the AI cannot “muck up” the results with fake data. Its only job is to compose the query and confirm it works.

    So I would construct a set of MCP tools it can use to:

    1. Get the schema of the DB so it can compose a query
    2. Test run the same query against the DB
    3. Review the results and confirm it’s good, and get feedback if there are errors
    4. Once happy, the LLM would invoke a final MCP tool with the SQL query, which the backend would then actually run, returning those results to the user. If it errors out, that same tool would fire the error back to the LLM, in case it invoked the tool wrong. The user would only get their query results back once the query is valid and works
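
    A minimal sketch of the four tools above, assuming SQLite via Python's standard `sqlite3` module and plain functions standing in for the MCP tool handlers. The function names, the `PREVIEW_ROWS` cap, and the `to_llm` / `to_user` split are all hypothetical, made up for illustration; the real wiring into an MCP server is left out.

```python
import sqlite3

PREVIEW_ROWS = 5  # hypothetical cap on how many rows the model sees in a test run


def get_schema(conn):
    """Tool 1: return the CREATE TABLE statements so the model can compose a query."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return [row[0] for row in rows]


def dry_run_query(conn, sql):
    """Tools 2 and 3: test run the query and hand back a small preview or the error."""
    try:
        cur = conn.execute(sql)
        columns = [d[0] for d in cur.description or []]
        return {"ok": True, "columns": columns, "preview": cur.fetchmany(PREVIEW_ROWS)}
    except sqlite3.Error as exc:
        return {"ok": False, "error": str(exc)}


def submit_final_query(conn, sql):
    """Tool 4: the backend runs the query itself.

    The model only hears success or the error; the rows go straight to the
    user, so the model never gets a chance to rewrite them.
    """
    try:
        cur = conn.execute(sql)
        columns = [d[0] for d in cur.description or []]
        rows = cur.fetchall()
    except sqlite3.Error as exc:
        return {"to_llm": {"ok": False, "error": str(exc)}, "to_user": None}
    return {
        "to_llm": {"ok": True, "row_count": len(rows)},
        "to_user": {"columns": columns, "rows": rows},
    }
```

    In this sketch you would also want the connection itself to be unable to write, for example by running `PRAGMA query_only = ON` on it (SQLite-specific) or by using a read-only database user, so a bad query can fail but cannot change anything.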

    This actually would not be terribly hard to implement, maybe 1 week of work if I’m just making an internal “to be used by our own people” type of tool that doesn’t have to be super pretty. Just a simple dashboard where they punch in their prompt, which then gets put in a queue, and then they get notified when the LLM has finished and returned the results, which they can then download as a CSV or some shit.
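
    The queue-then-CSV flow could look roughly like this. It is only a sketch using the standard library: `enqueue_prompt`, `process_next`, and the `run_llm_pipeline` and `notify` callbacks are hypothetical names, and a real version would swap the in-process queue for whatever job system is already in use.

```python
import csv
import io
import queue

jobs = queue.Queue()  # in-process stand-in for a real job queue


def enqueue_prompt(user, prompt):
    """Dashboard handler: park the prompt until a worker picks it up."""
    jobs.put({"user": user, "prompt": prompt})


def rows_to_csv(columns, rows):
    """Render the backend's query results as CSV text for download."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    writer.writerows(rows)
    return buf.getvalue()


def process_next(run_llm_pipeline, notify):
    """Worker step: run one job, then notify the user with the CSV.

    run_llm_pipeline(prompt) is assumed to return (columns, rows) produced
    by the backend, not by the model. notify(user, csv_text) is assumed to
    deliver the result however the dashboard does notifications.
    """
    job = jobs.get()
    try:
        columns, rows = run_llm_pipeline(job["prompt"])
        notify(job["user"], rows_to_csv(columns, rows))
    finally:
        jobs.task_done()
```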

    Easy peasy and an example of actually using these tools in a sane way.

    I would never have something like this be outward, public “client” facing though; this stuff would be reserved for internal use.

    • vrek@programming.dev
      5 days ago

      Yup, that’s similar to what I do sometimes. My general idea was always: write a simplified example, prove it works, ask AI to add in whatever complexity was needed based on my example, prove that works, release for internal use.