• fluxx@mander.xyz
    17 hours ago

    Are you running an MLX model? If not, try that. My M4 MacBook runs qwen3.6-35b-a3b lightning fast. It has its issues, but it's fast nonetheless.

      • fluxx@mander.xyz
        4 hours ago

        I have a model with 64GB of RAM. I've limited the context to 16k in an effort to make it more stable, but tbh it is rather unreliable no matter what I do. With my setup (mlx_lm and webui), it frequently collapses or loops, regardless of the settings. I have done a lot of debugging and have concluded it is probably inherent model behavior.
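
        For anyone who wants to check for the looping themselves, here is a rough sketch of how it could be detected in generated text. The helper `find_repeated_tail` and its thresholds are hypothetical and not part of mlx_lm or any web UI; it only checks whether the output ends with the same chunk repeated back to back.

```python
from typing import Optional


def find_repeated_tail(text: str, min_len: int = 20, min_repeats: int = 3) -> Optional[str]:
    """Return the repeating chunk if `text` ends with it repeated, else None.

    Hypothetical debugging helper: min_len and min_repeats are arbitrary
    thresholds chosen for illustration, not tuned values.
    """
    n = len(text)
    # Try candidate chunk lengths from shortest to longest.
    for period in range(min_len, n // min_repeats + 1):
        unit = text[n - period:]
        repeats = 1
        pos = n - 2 * period
        # Walk backwards while the preceding chunk matches the tail chunk.
        while pos >= 0 and text[pos:pos + period] == unit:
            repeats += 1
            pos -= period
        if repeats >= min_repeats:
            return unit
    return None


if __name__ == "__main__":
    looping = "Hello. " + "The answer is 42. " * 4
    print(repr(find_repeated_tail(looping, min_len=5)))
    print(repr(find_repeated_tail("A normal, non-repeating reply.", min_len=5)))
```

        Running something like this over saved outputs would at least show how often the loops happen at a given setting, though it says nothing about why the model does it.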