There are voice-to-text apps that run a model on your phone. With a few more cores on our devices, or some more optimisations to the models, we could run an LLM. The problem is battery life and heat.
I once ran some models on my phone through Termux.
I tried Llama 3.2 with 1B and 3B parameters and they ran pretty well; I tried 8B and it was slow.
I tried deepseek-r1 too: 1.5B ran well, 7B was slow.
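A rough back-of-envelope sketch of why the bigger models crawl: the weights alone take roughly parameters × bits per weight / 8 bytes, and generating each token generally means reading most of those weights from memory. The 4 bits per weight below is an assumption (a common quantization level, not something measured on this phone), the parameter counts are the nominal ones from the model names, and the function names are made up for illustration.

```python
# Back-of-envelope estimate of memory needed for model weights only.
# Assumptions (not from the original post): 4-bit quantized weights,
# nominal parameter counts, no KV cache or runtime overhead included.

def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal sizes of the models mentioned above, in billions of parameters.
SIZES_B = [1, 1.5, 3, 7, 8]


def estimate_table(bits_per_weight: float = 4.0) -> dict:
    """Map each nominal model size to its estimated weight memory in GiB."""
    return {
        size: round(weight_memory_gib(size, bits_per_weight), 2)
        for size in SIZES_B
    }


if __name__ == "__main__":
    for size, gib in estimate_table().items():
        print(f"{size}B at 4 bits/weight: ~{gib} GiB")
```

Under these assumptions the 7B and 8B models need several times more memory traffic per token than the 1B to 3B ones, which fits with them feeling slow, though the real numbers depend on the quantization and runtime actually used.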
For text prediction, Llama 1B may be enough.
Now, this is on a 300-400 € phone (Honor Magic 6 Lite).
I wonder what a keyboard with that kind of enhanced autocomplete would be like to use… provided, of course, that the autocomplete runs locally and the app is open source.