Adversarial Robustness and Attack Vectors
Research on improving LLM resilience against adversarial attack vectors and on understanding the underlying vulnerabilities

Bypassing AI Defenses with No Prior Knowledge
Using CLIP as a surrogate model for no-box adversarial attacks

GazeCLIP: The Future of Gaze Tracking
Enhancing accuracy through text-guided multimodal learning

LLM-Powered Phishing: A New Threat Landscape
Comparing AI-generated vs. human-crafted lateral phishing attacks

Secure Enterprise LLM Platform
Making customized language models accessible while maintaining security

Safer Robot Decision-Making
Using LLM Uncertainty to Enhance Robot Safety and Reliability

WildfireGPT: Intelligent Multi-Agent System for Natural Hazards
Enhancing disaster response with specialized RAG-based LLM systems

Enhanced Security Through Smarter Models
Leveraging Finetuned LLMs as Powerful OOD Detectors

Defending LLMs Against Feedback Manipulation
Robust algorithms for protecting AI systems from adversarial feedback

Cross-Lingual Backdoor Attacks in LLMs
Revealing Critical Security Vulnerabilities Across Languages

Flatter Models, Stronger Defense
Linking Loss Surface Geometry to Adversarial Robustness

Securing LLMs Against Adversarial Attacks
Novel defense strategy using residual stream activation analysis

Discovering Hidden LLM Vulnerabilities
A new approach to identifying realistic toxic prompts that bypass AI safety systems

Fingerprinting LLMs: A New Security Challenge
Identifying specific LLMs with just 8 carefully crafted queries

Defeating Adversarial Phishing Attacks
Evaluating and improving ML-based detection systems against sophisticated threats

Exposing LLM Vulnerabilities
A Novel Approach to Red-teaming for Toxic Content Generation

Defending AI Against Harmful Fine-tuning
Introducing Booster: A Novel Defense for LLM Safety

Advancing Model Extraction Attacks on LLMs
Locality Reinforced Distillation improves attack effectiveness by 11-25%

Securing DNA Language Models Against Attacks
First Comprehensive Assessment of Adversarial Robustness in DNA Classification

Backdoor Threats to Vision-Language Models
Identifying security risks with out-of-distribution data

Bypassing AI Defenses: Smarter Adversarial Attacks
New semantically-consistent approach achieves 96.5% attack success rate

Securing LLM-based Agents
A new benchmark for agent security vulnerabilities and defenses

Exposing VLM Vulnerabilities
Self-supervised adversarial attacks on vision-language models

Security Vulnerabilities in State Space Models (SSMs)
Clean-Label Poisoning Can Undermine Generalization

Hidden Dangers in LLM Alignment
Advanced Backdoor Attacks That Evade Detection

The Dark Side of Web-Connected AI
Emerging security threats from LLMs with internet access

Enhancing YOLO with Contextual Intelligence
How Retriever-Dictionary modules expand object detection beyond single images

The Multilingual Vulnerability Gap
How fine-tuning attacks exploit language diversity in LLMs

Hidden Costs of Faster AI
How acceleration techniques affect bias in LLMs

Unmasking Backdoor Attacks in LLMs
Using AI-generated explanations to detect and understand security vulnerabilities

Vulnerabilities in AI-Powered Robots
Critical security risks in Vision-Language-Action robotics systems

Targeted Bit-Flip Attacks on LLMs
How evolutionary optimization can compromise model security with minimal effort

Fortifying Visual AI Against Attacks
Novel Adversarial Prompt Distillation for Stronger Vision-Language Models

Secure AI Collaboration at the Edge
Building Resilient Multi-Task Language Models Against Adversarial Threats

Hidden Threats in Code Comprehension
How code manipulations imperceptible to humans can deceive AI models

Exposing Weaknesses in Time Series LLMs
Uncovering critical security vulnerabilities in forecasting models

Exploiting the Reasoning Vulnerability of LLMs
How the SEED attack compromises LLM safety through subtle error injection

Exposing LLM Vulnerabilities: The AutoDoS Attack
A new black-box approach to force resource exhaustion in language models

Defending LLMs Against Input Attacks
Making Prompt Engineering Robust to Real-World Text Imperfections

Hidden Threats in Language Models
Cross-lingual backdoor attacks that evade detection

The Engorgio Attack: A New LLM Security Threat
How malicious prompts can overwhelm language models

Adaptive Security for LLMs
A New Framework That Balances Security and Usability

Fortifying Vision-Language Models Against Attacks
A Two-Stage Defense Strategy for Visual AI Security

Defending Against LLM Jailbreaking
A Novel Defense Mechanism for Safer AI Systems

The Deception Risk in AI Search Systems
How content injection attacks manipulate search results and AI judges

Boosting LLM Defense Without Retraining
How more compute time creates stronger shields against adversarial attacks

Exposing LLM Vulnerabilities
Why current defenses fail under worst-case attacks

Strategic Information Handling in LLMs
How LLMs reveal, conceal and infer information in competitive scenarios

Rethinking LLM Security Evaluations
Current assessments fail to capture real-world cybersecurity risks

Securing Federated Learning Against Attacks
A Communication-Efficient Approach for Byzantine-Resilient Optimization

Measuring Safety Depth in LLMs
A mathematical framework for robust AI safety guardrails

Exploiting Safety Vulnerabilities in DeepSeek LLM
How fine-tuning attacks can bypass safety mechanisms in Chain-of-Thought models

Exploiting Human Biases in AI Recommendations
How cognitive biases create security vulnerabilities in LLM recommenders

Backdoor Vulnerabilities in AI Vision Systems
Detecting poisoned samples in CLIP models with 98% accuracy

The Distraction Problem in AI
How irrelevant context compromises LLM security

Protecting Medical AI from Theft
Novel attacks expose vulnerabilities in medical imaging models

Navigating the LLM Security Battlefield
Comprehensive Analysis of Adversarial Attacks on Large Language Models

Synthetic Data for Better AI Security
Using LLMs to Generate OOD Data for Robust Classification

Confidence Elicitation: A New LLM Vulnerability
How attackers can extract sensitive information without model access

Hidden Dangers in LLMs
Mapping the Growing Backdoor Threat Landscape

Visual Illusion: The New Frontier in CAPTCHA Security
Combating LLM-powered attacks with human visual perception advantages

Bypassing AI Safety Guardrails
How simple activation shifting can compromise LLM alignment

Rethinking Adversarial Alignment for LLMs
Why current approaches to LLM security fall short

Breaking the Fortress of Language Models
A novel backdoor attack targeting o1-like LLMs' reasoning capabilities

UniGuardian: Unified Defense for LLM Security
A novel approach to detect and prevent multiple types of prompt-based attacks

Defending Against LLM Permutation Attacks
How reordering demonstrations can compromise model security

EigenShield: Fortifying Vision-Language Models
A Novel Defense Against Adversarial Attacks Using Random Matrix Theory

Defending AI Models from Poisoned Training Data
A novel adversarial training approach to counter label poisoning attacks

Breaking LLM Guardrails: Advanced Adversarial Attacks
New semantic objective approach improves jailbreak success by 16%

Emoji-Based Attacks on Language Models
Invisible Vulnerabilities in Modern NLP Systems

Defending AI Against Adversarial Attacks
A robust zero-shot classification approach using CLIP purification

Combating DoS Attacks in LLMs
Detecting and preventing harmful recursive loops in language models

Strengthening LLM Security Through Robustness Testing
New framework detects vulnerabilities in LLM-based NLP applications

Strengthening LLM Robustness Against Prompt Variations
A latent adversarial framework that improves resilience to paraphrased prompts

Testing LLMs Against Adversarial Defenses
Evaluating AI's ability to autonomously exploit security measures

Hijacking LLM Agent Reasoning
A Novel Framework for Comprehensive Security Testing of AI Agents

LLM Safety and Output Length
How longer responses affect model security under adversarial attacks

Security Vulnerabilities in RLHF Platforms
How adversaries can misalign language models through manipulation of reinforcement learning systems

Securing the Gatekeepers: LLM Router Vulnerabilities
First comprehensive security analysis of LLM routing systems across their entire lifecycle

The Repeated Token Vulnerability in LLMs
Understanding and resolving a critical security flaw in language models

Breaking Black-Box AI Models
A simple attack approach achieving over 90% success rate against GPT-4.5/4o/o1

Defeating Face-Morphing Attacks with AI
Zero-Shot Detection Using Multi-Modal LLMs and Vision Models

Boosting Vision-Language Model Security
Evolution-based Adversarial Prompts for Robust AI Systems

Defending LLMs Against Manipulative Attacks
A Temporal Context Awareness Framework for Multi-Turn Security

Autonomous Defense: Next-Gen LLM Security Testing
AI that continuously evolves to find LLM vulnerabilities

The Achilles' Heel of AI Reasoning
How Manipulated Endings Can Override Correct Reasoning in LLMs

Teleporting Security Across Language Models
Zero-shot mitigation of Trojans in LLMs without model-specific alignment data

The Hidden Fragility of LLMs
Understanding and mitigating performance collapse during deployment

Guiding AI Reasoning Through Intervention
A novel approach for controlling LLM behavior during the reasoning process

Defending AI Systems Against Adversarial Attacks
A Universal Detection Framework Using Pre-trained Encoders

Advanced Multi-Turn Red Teaming for LLM Security
Emulating sophisticated adversarial attacks through dual-level learning

Defending Recommender Systems from Attacks
A robust retrieval-augmented framework to combat LLM vulnerabilities

Uncovering LLM Vulnerabilities
New methods to identify and address stability issues in language models

Advancing Face Anti-Spoofing Security
Novel Content-Aware Composite Prompt Engineering for Cross-Domain Protection

VRAG: Smart Defense Against Visual Attacks
Training-Free Detection of Visual Adversarial Patches

Hidden Threats in LLM Recommendation Systems
How adversaries can manipulate rankings while evading detection

Fortifying AI Reward Systems Against Attacks
Adversarial training for more robust AI alignment

Defending Against LLM-Powered Attacks on Rumor Detection
A novel approach to secure social media analysis from AI-generated manipulation

Securing LLMs Against Hidden Threats
Using Influence Functions to Detect Poisoned Fine-tuning Data

LLM Vulnerabilities in Spam Detection
Security weaknesses in AI-powered spam filters

Evaluating LLM-Powered Security Attacks
A critical assessment of benchmarking practices in offensive security

Backdoor Vulnerabilities in LLM Recommendations
Exposing and defending against security threats in LLM-powered recommendation systems
