Google’s C4 data set, which has been used to instruct high-profile AIs including Google’s T5 and Facebook’s LLaMA, includes Russian propaganda site RT, anti-immigration site VDARE, white supremacist site Stormfront, anti-trans site Kiwifarms and 4chan https://t.co/1yeKUPJwyA
— Aaron Schaffer (@aaronjschaffer) April 19, 2023
I don’t think this is why Elmo is afraid of AI.
It took me only a few minutes to find that the website of a U.S. sanctioned company, Ansar Bank, was used in the data set (https://t.co/GvY3zqLslQ). Try plugging in website names for yourself here: https://t.co/1yeKUPJwyA
— Aaron Schaffer (@aaronjschaffer) April 19, 2023