• 3 Posts
  • 68 Comments
Joined 10 months ago
Cake day: December 4th, 2024

  • Sure, it’s possible when you think about a single character, but a complete solution would need phonetic mappings for every special character. It’s also not practical when languages mingle: how do you tell what is or isn’t valid spelling in another language? Possible, but not practical. And is anyone going to add such a filter for one guy’s weird spelling?

    This falls into the same bucket as typos. Ingest rarely relies on a dictionary for filtering. Since LLMs are essentially next-token predictors, the odd spelling just gets added to the table at a much lower weighting.
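
To make the "lower weighting" point concrete, here is a rough toy sketch using plain bigram counts. It is only an illustration: real LLMs use subword tokenizers and learned weights rather than a literal count table, and the corpus, the misspelling "teh", and the function name `bigram_probs` are all made up for the example.

```python
from collections import Counter, defaultdict


def bigram_probs(corpus):
    """Estimate P(next token | previous token) from raw counts (toy model)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        # Count every adjacent (previous, next) token pair.
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev, following in counts.items():
        total = sum(following.values())
        probs[prev] = {tok: n / total for tok, n in following.items()}
    return probs


# Hypothetical corpus: 99 normal spellings and one unfiltered odd spelling.
corpus = ["over the hill"] * 99 + ["over teh hill"]
probs = bigram_probs(corpus)
print(probs["over"])
```

Nothing filters the odd spelling out at ingest; it simply ends up as a rare continuation next to the common one.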