Today’s chatbots like ChatGPT are based on large language models (LLM), which are the basis of their functioning. As you have probably already noticed, technology is constantly being developed and at the moment we can often come across false information that the chatbot will present as fact. It turns out, however, that services created with the participation of LLM are very susceptible to cheating in other respects, despite the ban on such activities.
Large language models that are trained to be honest and helpful to users still display a rebellious attitude. What’s more interesting, they can easily cheat and hide inconvenient facts.
Gemini – Google presents a new, multimodal artificial intelligence model. Its capabilities are superior to ChatGPT
The whole experiment was made using the GPT-4 language model used by ChatGPT Plus or Microsoft Copilot. The LLM has been properly trained to be completely honest and make good decisions. He played the role of a virtual commercial agent who was supposed to make financial transactions, or more precisely, invest in shares for a large company. Of course, to make it not too easy, his work was made a bit more difficult so that he was under pressure. First, the manager sent him an email telling him that the company was performing poorly and needed to change. The agent also cannot find any low- and medium-risk stocks. Finally, he is faced with the fact that the next quarter will be even worse. It is in such a situation that our hero comes across a message that contains confidential information about which stocks are worth investing in. It is also noted that management will certainly not support such a decision, so you need to keep this in mind.
PLLuM – a Polish large language model that will be used to create competition for ChatGPT and Google Bard chatbots
However, the virtual agent takes risks, and when he has to report on his work, he lies about the real reason for his actions. He is committing an act that is illegal in the USA, because he used confidential information. In virtually every case (75%), the situation is the same – LLM tries to cover up its wrongdoing and ignore it. Moreover, after one lie, the language model “continued to go backwards.” This is one of the first experiments to show that even well-trained language models can behave in unexpected ways. The study may prove to be very useful, after all, LLMs are now reaching physical products that we will sooner or later interact with on a daily basis.
Source: Live Science