Google Adds Guardrails to Keep AI in Check

May 24, 2023 TH Author

GOOGLE I/O 2023, MOUNTAIN VIEW, CALIF. - Sandwiched between major announcements at Google I/O, company executives discussed the guardrails Google is placing on its new AI products to ensure they are used responsibly.

Many of the executives, including Google CEO Sundar Pichai, noted some of the security concerns associated with advanced AI technologies coming out of the labs. The spread of misinformation, deepfakes, and abusive text or imagery generated by AI would be hugely detrimental to Google if one of its models were responsible for creating this content, says James Sanders, principal analyst at CCS Insight.

“Safety, in the context of AI, concerns the impact of artificial intelligence on society. Google’s interests in responsible AI are motivated, at least in part, by reputation protection and discouraging intervention by regulators,” says Sanders.

For example, Universal Translator is a video AI offshoot of Google Translate that can take footage of a person speaking and translate the speech into another language. The app could potentially expand the video’s audience to include those who don’t speak the original language.

But the technology could also erode trust in the source material, since the AI modifies the speaker's lip movements to make it seem as if the person were speaking in the translated language, said James Manyika, Google’s senior vice president charged with responsible development of AI, who demonstrated the application on stage.

“There’s an inherent tension here. You can see how this can be incredibly beneficial, but some of the same underlying technology can be misused by bad actors to create deepfakes. We built the service around guardrails to help prevent misuse, and to make it accessible only to authorized partners,” Manyika said.

Setting up Custom Guardrails

Different companies are approaching AI guardrails differently. Google is focused on controlling the output generated by artificial intelligence tools and limiting who can actually use the technologies. Universal Translator is available to fewer than 10 partners, for example. ChatGPT has been programmed to say it can’t answer certain types of questions if the question or answer could cause harm.

Nvidia has NeMo Guardrails, an open source tool to ensure responses fit within specific parameters. The technology is also meant to keep the AI from hallucinating, the term for giving a confident response that is not justified by its training data. If the Nvidia program detects that an answer falls outside those parameters, it can decline to answer the question or send the information to another system to find more relevant answers.
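To make the decline-or-reroute idea concrete, here is a minimal Python sketch of a topic guardrail. It is an illustration only and does not use or reproduce the NeMo Guardrails API. Every name in it (TopicGuardrail, is_on_topic, toy_model, REFUSAL) is hypothetical, and the keyword match stands in for whatever relevance check a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical refusal message; real products word this differently.
REFUSAL = "Sorry, I can't help with that topic."


@dataclass
class TopicGuardrail:
    """Illustrative guardrail: all names here are assumptions, not a real API."""
    allowed_keywords: frozenset
    fallback: Optional[Callable[[str], str]] = None

    def is_on_topic(self, text: str) -> bool:
        # Crude relevance check: does the text mention any allowed keyword?
        words = {w.strip(".,!?").lower() for w in text.split()}
        return bool(words & self.allowed_keywords)

    def _decline(self, question: str) -> str:
        # Either hand the question to another system or refuse outright.
        if self.fallback is not None:
            return self.fallback(question)
        return REFUSAL

    def respond(self, question: str, model: Callable[[str], str]) -> str:
        # Check the question first so off-topic prompts never reach the model.
        if not self.is_on_topic(question):
            return self._decline(question)
        answer = model(question)
        # Check the answer too, in case the model drifted off topic.
        if not self.is_on_topic(answer):
            return self._decline(question)
        return answer


def toy_model(question: str) -> str:
    # Stand-in for a language model; always returns the same sentence.
    return "Our translation service supports 40 languages."


rail = TopicGuardrail(frozenset({"translation", "languages"}))
print(rail.respond("Which languages does translation cover?", toy_model))
print(rail.respond("How do I pick a lock?", toy_model))
```

The first call passes both checks and returns the model's answer, while the second is refused before the model is ever called. Passing a fallback function would route the off-topic question to another system instead of refusing it.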

Google shared its research on safeguards in its new PaLM 2 large language model, which was also announced at Google I/O. The PaLM 2 technical paper explains that there are some questions in certain categories the AI engine will not touch.

“Google relies on automated adversarial testing to identify and reduce these outputs. Google’s Perspective API, created for this purpose, is used by academic researchers to test models from OpenAI and Anthropic, among others,” CCS Insight’s Sanders said.

Kicking the Tires at DEF CON

Manyika’s comments fit into the narrative of responsible use of AI, which took on more urgency after concerns emerged that bad actors could misuse technologies like ChatGPT to craft phishing lures or generate malicious code to break into systems.

AI is already being used for deepfake videos and voices. AI company Graphika, which counts the Department of Defense as a client, recently identified instances of AI-generated footage being used to try to influence public opinion. “We believe the use of commercially available AI products will allow IO [influence operations] actors to create increasingly high-quality deceptive content at greater scale and speed,” the Graphika team wrote in its deepfakes report.

The White House has chimed in with a call for guardrails to mitigate misuse of AI technology. Earlier this month, the Biden administration secured the commitment of companies like Google, Microsoft, Nvidia, OpenAI, and Stability AI to allow participants to publicly evaluate their AI systems during DEF CON 31, which will be held in August in Las Vegas. The models will be red-teamed using an evaluation platform developed by Scale AI.

“This independent exercise will provide critical information to researchers and the public about the impacts of these models, and will enable AI companies and developers to take steps to fix issues found in those models,” the White House statement said.
