AI models require additional standards and tests, the researchers say

As the use of artificial intelligence – benign and competition – increases at the rate of rupture, more cases of potentially harmful answers are revealed.

PixDeluxe | E+ | Gets the image

As the use of artificial intelligence – benign and competition – increases at the rate of rupture, more cases of potentially harmful answers are revealed. Here are included Right hatred. copyright violations or Sexual content.

The appearance of these undesirable behavior is deteriorating the lack of rules and insufficient testing of AI models, CNBC researchers said.

Getting machine learning models to behave the way it was intended for this is also a high order, said Javier Rand, AI researcher.

“The answer, after almost 15 years of research, is that we do not know how to do it, and it doesn’t look like it gets better,” said CNBC Rand, which focuses on competitive machine training.

However, there are some ways to evaluate the risks in II, eg Red union. Practice includes people testing and checking artificial intelligence systems to reveal and detect any potential damage – a mode of operation common in cybersecurity circles.

Shane Longpr, II Researcher and Policy and Lead Initiative by OriginHe noted that there are not enough red teams in red teams.

While AI startups are now using first -person appraisers or contracting on the other hand to check their models, opening up to third parties such as ordinary users, journalists, researchers and ethical hackers, according to data, according to data A document published by LongPre and researchers.

“Some disadvantages in the systems that people found the necessary lawyers, doctors actually a veterinarian, actual scientists who are specialized specialists on issues to find out whether it was a shortcoming or not, because the ordinary person probably could not or did not have sufficient knowledge,” Longpr said.

Adoption of standardized reports on AI deficiency, incentives and ways of disseminating these “deficiencies” in AI systems are some recommendations outlined in the document.

With this practice, it was successfully accepted in other sectors, such as software safety, “we need it in the II now,” LongPre added.

To marry this practice focused on users with management, politics and other tools, provide a better understanding of the risks that cause AI tools and users, Rand said.

We are running the II development path that is very harmful to many people, says Karen Hao

Already not moonsh

The Moonshot project is one of such approaches that combines technical solutions with politics mechanisms. Project Moonshot, launched Singapore’s Infocomm Media Development Computer robot.

A set of tools brings together comparison, red association and test lines. There is also an assessment mechanism that allows the AI ​​startups to ensure that their models can be trusted and harmful to users, Anup Kumar, Customer Technology Manager and AI in IBM Asia Pacific, CNBC reported.

Assessment – this a continuous process This should be done as before and after the models deploy, said Kumar, who noted that the response to the instrument was ambiguous.

“A lot of startups perceived it as a platform because it was with open source, And they started using it. But I think you know we can do much more. “

By moving forward, Project Moonshot seeks to enable setup for specific industries and enable multilingual and multicultural red association.

Higher standards

Pierre Alker, Professor Statistics in Essec Business School, Asia-Pacific Region, said technology companies are currently Hurrying to release Their latest AI models without proper assessment.

“When a pharmaceutical company develops a new drug, they need months of tests and very serious proof that it is useful and not harmful before they are approved by the government,” he said, adding that a similar process is in the aviation sector.

AI models must meet a strict set of conditions before they are approved, added alker. The transition from the wide tools of AI to the development that is designed for more specific tasks, to ease the anticipation and control of their misuse, said Alker.

“LLM can do too many things, but they are not targeted at tasks that are quite specific,” he said. As a result, “the number of possible abuses is too high to the developers foresee them all.”

Such broad models make the definition of what is considered safe and safe, according to A study that is rend was involved in.

Thus, technology companies should avoid overcoming that “their protection is better than they are,” Rand said.

Source link