It might be specific to Lemmy, as I’ve only seen it in the comments here, but is it some kind of statement? It can’t possibly be easier than just writing “th”? And in many comments I see “th” and “þ” being used interchangeably.

  • Havatra@lemmy.zip (OP)
    25 days ago

    Ah, in that sense! I think it’s about as inefficient as the other reason, honestly. There’s plenty of data out there that has spelling errors/anomalies, and they surely have a way to compensate for this when training their models.

    • midribbon_action
      25 days ago

      Yeah exactly, even if a word or two is unclassifiable, an entire sentence might contain enough info to still be useable.