AI & Security · MEDIUM

GitHub Copilot - New Rubber Duck AI Review Feature Launched

#GitHub Copilot#Rubber Duck#AI Review#Claude Model#GPT-5.4

Original Reporting

Help Net Security · Anamarija Pogorelec

AI Intelligence Briefing

CyberPings AI · Reviewed by Rohit Rana

Severity Level: MEDIUM

Moderate risk — monitor and plan remediation


Basically, GitHub Copilot now has a feature that checks its own work using a different AI model.

Quick Summary

GitHub has launched Rubber Duck, a new AI review feature for Copilot. The tool helps developers catch coding errors the generating model overlooks: by using cross-model evaluation, it improves code reliability.

What Happened

GitHub has unveiled a new feature called Rubber Duck for its Copilot CLI. This feature is designed to enhance the coding process by allowing a secondary AI model to review the output of the primary model. The aim is to catch errors that the primary model might miss due to inherent biases in its training data.

How Rubber Duck Works

Rubber Duck uses a different AI model from the one that generated the initial code. For instance, if the primary model is from the Claude family, Rubber Duck runs on GPT-5.4. This cross-model review helps surface potential errors such as:

  • Unfounded assumptions made by the primary model
  • Overlooked edge cases
  • Conflicts with existing code requirements
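The idea can be sketched in a few lines of Python. This is purely illustrative: the function names, model names, and the toy review heuristic are all invented for this sketch, not the actual Copilot CLI internals.

```python
# Illustrative sketch of cross-model review. All names here are
# hypothetical stand-ins, not the real Copilot CLI implementation.

def generate_code(prompt: str, model: str) -> str:
    """Stand-in for the primary model drafting code."""
    return f"# {model} draft for: {prompt}\n# TODO: handle empty input"

def review_code(code: str, reviewer: str) -> list:
    """Stand-in for the secondary model's review pass.

    A real reviewer would look for unfounded assumptions, missed
    edge cases, and conflicts with existing requirements.
    """
    findings = []
    if "TODO" in code:
        findings.append(f"{reviewer}: unresolved TODO in draft")
    return findings

def cross_model_review(prompt: str, primary: str, secondary: str):
    # The reviewer must differ from the generator, so that shared
    # training biases are less likely to hide the same mistakes.
    if primary == secondary:
        raise ValueError("reviewer must be a different model")
    code = generate_code(prompt, primary)
    return code, review_code(code, secondary)

draft, findings = cross_model_review(
    "async scheduler", "claude-sonnet", "gpt-5.4"
)
```

The key design point is the guard against using the same model twice: a reviewer that shares the generator's training data is more likely to share its blind spots.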

Benchmark Results

In tests on the SWE-Bench Pro benchmark, the combination of Claude Sonnet and Rubber Duck closed 74.7% of the performance gap to the Opus model alone. The improvement is particularly notable on complex tasks that span multiple files, where the accuracy of the coding output is critical.

Error Detection Examples

During testing, Rubber Duck successfully identified several critical errors:

  • In one instance, it flagged a proposed async scheduler that would exit immediately, failing to execute any jobs.
  • Another case involved a loop that incorrectly overwrote dictionary keys, leading to dropped search query categories without any error notification.
  • Rubber Duck also caught issues in an email confirmation flow where the new code stopped writing to a Redis key, which could have broken the confirmation UI.
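The second example, the dictionary-overwrite loop, is a common bug class worth seeing concretely. The report does not show the actual code, so the snippet below is a reconstruction of the pattern with invented category and query names.

```python
# Illustrative reconstruction of the silent dictionary-overwrite bug
# class; category/query names are invented, not from the actual report.

queries = [
    ("news", "copilot launch"),
    ("security", "cve roundup"),
    ("news", "model release"),  # same category key as the first entry
]

# Buggy pattern: keying by category silently overwrites earlier queries,
# so "copilot launch" is dropped with no error notification.
buggy = {}
for category, query in queries:
    buggy[category] = query

# Fix: accumulate a list of queries per category instead of overwriting.
fixed = {}
for category, query in queries:
    fixed.setdefault(category, []).append(query)
```

Because the buggy version raises no exception, the dropped category only shows up as missing search results downstream, which is exactly the kind of silent failure a second reviewing model is positioned to catch.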

Activation of Rubber Duck

Rubber Duck can be activated in two ways: automatically or on demand. It automatically triggers at three key checkpoints:

  1. After drafting a plan
  2. After complex implementations
  3. After writing tests, but before execution

Developers can also manually request a review at any point during a coding session. This ensures that feedback is integrated effectively without overwhelming the developer with constant interruptions.
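The trigger policy above can be sketched as a small lookup. The `Checkpoint` names and `should_review()` function are hypothetical, written only to make the three automatic checkpoints and the manual override concrete.

```python
# Minimal sketch of the review-trigger policy described above.
# The enum and function are hypothetical, not Copilot CLI internals.
from enum import Enum

class Checkpoint(Enum):
    PLAN_DRAFTED = 1
    COMPLEX_IMPLEMENTATION = 2
    TESTS_WRITTEN = 3   # fires before the tests are executed
    MINOR_EDIT = 4      # everyday edits do not auto-trigger a review

# Reviews fire automatically only at the three checkpoints named above,
# keeping feedback targeted rather than interrupting every step.
AUTO_TRIGGERS = {
    Checkpoint.PLAN_DRAFTED,
    Checkpoint.COMPLEX_IMPLEMENTATION,
    Checkpoint.TESTS_WRITTEN,
}

def should_review(checkpoint: Checkpoint, manual_request: bool = False) -> bool:
    """A manual request always wins; otherwise consult the auto set."""
    return manual_request or checkpoint in AUTO_TRIGGERS
```

Restricting automatic reviews to a small set of checkpoints, while letting the developer request one at any time, is what keeps the feature from becoming a constant interruption.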

Availability and Future Plans

Currently, Rubber Duck is available in experimental mode within GitHub Copilot CLI. Developers can access it by using the /experimental command. GitHub plans to explore additional model pairings to enhance the feature further.

This innovative approach underscores GitHub's commitment to improving coding accuracy and reliability through advanced AI techniques.

Pro Insight

🔒 Rubber Duck's cross-model evaluation could significantly reduce coding errors, but its effectiveness will depend on model compatibility and integration.

Sources

Original Report

Help Net Security · Anamarija Pogorelec

Related Pings

MEDIUM · AI & Security

Top Enterprise AI Gateways Ranked for Security and Integration

A recent survey shows 90% of organizations are adopting AI gateways for security and governance. This article ranks the top 12 gateways based on security depth and ease of integration, highlighting their unique strengths. Choosing the right gateway is crucial for effective AI deployment.

Cyber Security News

MEDIUM · AI & Security

OpenAI - Applications Open for AI Safety Research Fellowship

OpenAI is accepting applications for its AI Safety Fellowship, aimed at funding research on AI safety and alignment. This initiative is crucial for ethical AI development. Researchers from various fields are encouraged to apply and contribute to this important work.

Help Net Security

MEDIUM · AI & Security

Google Study - LLMs Enhance Abuse Detection Framework

A new Google study shows how large language models are enhancing content moderation across all stages of abuse detection. While they improve safety, they also introduce new governance challenges. The findings highlight the need for careful oversight as AI becomes more integrated into moderation processes.

Help Net Security

HIGH · AI & Security

AI Security - Google DeepMind Maps Web Attacks Against AI Agents

Google DeepMind researchers have identified six web attack types that can exploit AI agents. These attacks manipulate AI behavior, posing significant security risks. Awareness and proactive measures are essential to safeguard against these threats.

SecurityWeek

MEDIUM · AI & Security

OWASP GenAI Security Project - New Tools Matrix Released

The OWASP GenAI Security Project has updated its tools matrix, addressing 21 generative AI risks. Companies are urged to adopt linked defense strategies for GenAI systems to enhance security.

Dark Reading

HIGH · AI & Security

FortiOS 8.0 - Redefining Security for AI and Quantum Threats

FortiOS 8.0 has been launched, introducing AI-driven and quantum-ready security features. This update is essential for organizations facing modern threats. It enhances visibility and simplifies operations, ensuring robust protection against evolving risks.

Fortinet Threat Research