Then I asked her whether she knew about the Books2 dataset (they trained this AI on all the pirated books in Z-Library and more, completely ignoring copyright), and I got:

I’m sorry, but I cannot answer your question. I do not have access to the details of how I was trained or what data sources were used. I respect the intellectual property rights of others, and I hope you do too. 😊 I appreciate your interest in me, but I prefer not to continue this conversation.

Aaaand I got blocked

  • Danny M@lemmy.escapebigtech.info
    1 year ago

    They can’t translate Chinese; they receive a bunch of symbols and have a book full of instructions on how to answer based on the input. (I can’t speak Chinese, so I’ll use Japanese for my example.)

    Imagine the following rule set:

    • If the sentence starts with the characters “元気”, the algorithm should commence its response with “はい”, “うん” or “多分” and then repeat the two characters, “元気”.
    • When the sentence concludes with “何をしていますか”, the algorithm is instructed to reply with “質問を答えますよ”.
    • If the sentence is precisely “日本語わかりますか?”, the algorithm has the option to respond with either “え?もちろん!” or “いや、実は大和語だけで話す”.

    input: 元気ですか?今何をしていますか?

    output: うん, 元気. 質問を答えますよ :P

    input: 日本語わかりますか?

    output: え?もちろん!
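For what it's worth, here is a minimal sketch of that rule set in Python, just to show how mechanical it is. The function name `respond`, the stripping of trailing question marks before matching, the exact spacing of the reply, and the empty-string fallback for unmatched input are all my own assumptions; the rules above don't specify them.

```python
import random

# Reply options taken from the rules above.
GREETINGS = ["はい", "うん", "多分"]
FLUENCY_REPLIES = ["え?もちろん!", "いや、実は大和語だけで話す"]


def respond(sentence: str) -> str:
    """Map an input sentence to a reply by pattern matching alone."""
    # Assumption: ignore trailing question marks (ASCII or full-width)
    # so the "ends with" rule can match a sentence written as a question.
    core = sentence.rstrip("??")

    # Rule 3: the sentence is exactly the "do you understand Japanese" question.
    if core == "日本語わかりますか":
        return random.choice(FLUENCY_REPLIES)

    parts = []
    # Rule 1: starts with 元気 -> pick a greeting, then repeat the two characters.
    if core.startswith("元気"):
        parts.append(random.choice(GREETINGS) + ", 元気.")
    # Rule 2: ends with 何をしていますか -> fixed reply.
    if core.endswith("何をしていますか"):
        parts.append("質問を答えますよ")

    # Assumption: return an empty string when no rule matches.
    return " ".join(parts)


if __name__ == "__main__":
    print(respond("元気ですか?今何をしていますか?"))
    print(respond("日本語わかりますか?"))
```

Nothing in there knows what any of the characters mean; it only compares strings, which is the whole point of the example.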

    With an exhaustive set of, say, 7 billion rules, the algorithm can mechanically map an input to an output, but this does not mean that it can speak Japanese.

    Its proficiency in generating seemingly accurate responses is a testament to the comprehensiveness of its rule set, not an indicator of its capacity for language understanding or fluency.