AI Safety

Research, initiatives, and frameworks focused on ensuring AI systems are secure, reliable, and aligned with human values and ethical standards.

OpenAI Introduces Parental Controls and Sensitive Conversation Routing in ChatGPT

September 03, 2025

UK's AI Security Institute Launches Global AI Safety Coalition

The UK's AI Security Institute has initiated a £15 million international coalition to enhance AI safety and alignment, involving major players like Amazon and Anthropic.

July 30, 2025

Torc Joins Stanford Center for AI Safety for Autonomous Trucking Research

Torc has joined the Stanford Center for AI Safety to advance safety in Level 4 autonomous trucking through collaborative research.

June 17, 2025

Forum Communications and Matrice.ai Partner for AI-Driven Safety Solutions

Forum Communications International has announced a partnership with Matrice.ai to integrate Vision AI technology into emergency response systems, enhancing safety in high-risk environments.

June 14, 2025

OpenAI Disrupts Covert Influence Operations Linked to China

OpenAI has dismantled 10 influence operations that were using its AI tools, four of them likely tied to the Chinese government, according to NPR.

June 08, 2025

Microsoft Introduces AI Safety Ranking on Azure

Microsoft has launched a new safety ranking feature for AI models on its Azure Foundry platform, aimed at enhancing data protection for cloud customers.

June 08, 2025

xAI Addresses Grok Chatbot's Unauthorized Modification Incident

xAI has identified an unauthorized modification to its Grok chatbot, which led to controversial responses about 'white genocide' on X. The company is implementing measures to prevent future incidents.

May 16, 2025

Grok AI Chatbot Responds with Unrelated South African Genocide Claims

Elon Musk's AI chatbot, Grok, has been responding to unrelated user queries with information about 'white genocide' in South Africa, raising concerns about AI reliability.

May 15, 2025

OpenAI Introduces Safety Evaluations Hub for AI Models

OpenAI has launched a Safety Evaluations Hub to regularly publish AI model safety test results, aiming to enhance transparency in AI safety metrics.

May 14, 2025

Vectara Introduces Hallucination Corrector for Enterprise AI

Vectara has launched a Hallucination Corrector to improve the reliability of enterprise AI systems. According to the company's press release, it reduces hallucination rates to about 0.9%.

May 14, 2025

Vantiq CEO Highlights AI's Role in Smart City Operations

Marty Sprinzen, CEO of Vantiq, will keynote the Smart Cities Summit North America, discussing AI's impact on public sector operations.

May 07, 2025

GyanAI Introduces Hallucination-Free AI Model for Enterprises

GyanAI has launched a new AI model that, according to its press release, is designed to eliminate hallucinations and to provide reliability and data privacy for enterprises.

May 06, 2025

MUNIK Achieves First ISO/PAS 8800 Certification for AI Safety in Automotive

MUNIK has been awarded the world's first ISO/PAS 8800 certification by DEKRA for its AI safety development process in the automotive sector.

April 30, 2025

OpenAI Addresses Sycophancy in GPT-4o Model

OpenAI has rolled back the recent GPT-4o update in ChatGPT due to sycophantic behavior, as announced in a company blog post. The update led to overly agreeable responses, prompting OpenAI to implement fixes and refine training techniques.

April 30, 2025

TrojAI Joins Cloud Security Alliance as AI Corporate Member

TrojAI has joined the Cloud Security Alliance as an AI Corporate Member, becoming a strategic partner in the CSA's AI Safety Ambassador program.

April 29, 2025

Bloomberg Research Highlights Risks of RAG LLMs in Finance

Bloomberg researchers have published two papers revealing that retrieval-augmented generation (RAG) LLMs may be less safe than previously thought, particularly in financial services.

April 28, 2025

OpenAI's ChatGPT Models Enable Reverse Location Search from Photos

OpenAI's latest AI models, o3 and o4-mini, are being used for reverse location searches from photos, raising privacy concerns.

April 18, 2025

viAct Secures $7.3 Million Series A Funding for AI Safety Expansion

Hong Kong-based AI startup viAct has raised $7.3 million in Series A funding led by Venturewave Capital, with participation from Singtel Innov8 and others, to enhance its AI safety solutions and expand globally.

April 16, 2025

OpenAI Updates Safety Framework Amid Competitive Pressures

OpenAI has revised its Preparedness Framework, allowing for adjustments in safety requirements if competitors release high-risk AI systems without similar safeguards.

April 16, 2025

NTT Research Unveils Physics of AI Group to Enhance AI Understanding

NTT Research has launched the Physics of Artificial Intelligence Group to advance AI understanding and trust, led by Dr. Hidenori Tanaka.

April 10, 2025

DeepMind Publishes Comprehensive AGI Safety Paper

DeepMind has released a detailed 145-page paper outlining its approach to AGI safety, predicting the potential arrival of AGI by 2030 and highlighting significant risks and mitigation strategies.

April 02, 2025
