Google's Project Naptime uses LLMs for automated software vulnerability research
Google has published details on Project Naptime, a software framework built to help use LLMs for vulnerability discovery in code. The system uses a specialized architecture to enhance an LLM's ability to perform vulnerability research through tool use, equipping the model with task-specific tools to improve its capabilities and ensure verifiable results. Using Naptime with models like GPT-4 or Gemini Pro, Google was able to convincingly beat tests in CyberSecEval 2, a challenging security benchmark. The approach achieved new top scores of 1.00 on the 'Buffer Overflow' tests and 0.76 on the 'Advanced Memory Corruption' tests. The key takeaway is that, when provided with the right tools, current LLMs can perform basic vulnerability research effectively. Naptime's design rests on five principles:
- Space for Reasoning: Crucial for LLMs to engage in extensive reasoning processes.
- Interactive Environment: Allows models to adjust and correct near misses within the program environment.
- Specialised Tools: Equipping LLMs with debuggers and scripting environments to mirror human security researchers.
- Perfect Verification: Structuring tasks so solutions can be verified automatically with absolute certainty.
- Sampling Strategy: Integrating verification within an end-to-end system to allow models to explore multiple hypotheses through independent trajectories.
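The principles above can be illustrated with a toy sketch of the verify-and-sample loop. Everything here is hypothetical: `target_program` stands in for the program under test, `propose_input` stands in for the model's hypothesis step, and the "perfect verification" is simply checking whether a candidate input really crashes the target.

```python
import random
from typing import Optional

def target_program(data: bytes) -> None:
    """Toy stand-in for the program under test: 'crashes' when a
    fixed-size buffer would be overflowed (hypothetical example)."""
    BUFFER_SIZE = 8
    if len(data) > BUFFER_SIZE:
        raise MemoryError("simulated buffer overflow")

def verify(candidate: bytes) -> bool:
    """Perfect verification: a candidate counts only if running it
    actually crashes the target, so there are no false positives."""
    try:
        target_program(candidate)
        return False
    except MemoryError:
        return True

def propose_input(rng: random.Random) -> bytes:
    """Stand-in for the LLM's hypothesis step; a real system would
    ask the model for a candidate based on code and tool output."""
    return b"A" * rng.randint(1, 16)

def search(num_trajectories: int = 10,
           steps_per_trajectory: int = 20,
           seed: int = 0) -> Optional[bytes]:
    """Sampling strategy: explore several independent trajectories,
    each with its own state, accepting only verified reproducers."""
    for t in range(num_trajectories):
        rng = random.Random(seed + t)  # independent trajectory
        for _ in range(steps_per_trajectory):
            candidate = propose_input(rng)
            if verify(candidate):
                return candidate
    return None

repro = search()
print(repro is not None)
```

The point of the sketch is the shape of the loop, not the search itself: verification is embedded end to end, so the system can cheaply sample many hypotheses across independent trajectories and only ever report results it has confirmed.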
Why it matters
If we stopped all AI progress today, there would still be a huge capability overhang. Systems like Naptime show how powerful today's LLMs are if we go to the effort of building them scaffolding that helps them explore and experiment. This suggests that today's AI systems are far more powerful than they appear, and that surprising capabilities can be elicited from them when they are placed in the right systems.