Security of Synthetic Data Generation
Research on security implications, auditing, and tracing of synthetic data generated by LLMs for downstream applications

Detecting AI-Generated Data in the Wild
New techniques to identify and audit synthetic data in downstream applications

Generating Private Synthetic Data via LLM APIs
Leveraging third-party LLMs without access to model weights

Unlocking Private Medical Text Generation
Using LLM prompts to create synthetic data while preserving patient privacy

Stealing AI Art Prompts: A New Security Threat
How attackers can reverse-engineer valuable text-to-image prompts from only a few samples

Synthetic Clinical Data Generation for Privacy-Sensitive Applications
Using LLMs to create annotated training data for de-identification systems

Synthetic Data Privacy Risks
Detecting information leakage in LLM-generated text

Revolutionizing Cyberbullying Detection
Using LLM-generated data to address dataset scarcity

Privacy-Preserving Synthetic Text
Training better LLMs with less real data

AI's Secret Languages
Preventing machines from developing communication schemes beyond human understanding

Synthetic Tabular Data with Real-World Logic
Preserving inter-column relationships in generated data

Privacy-Preserving Synthetic Text Data
Generating high-quality text data without compromising privacy

Domain-Driven AI: The ELTEX Framework
Enhancing LLM specialization through explicit domain knowledge injection

Generating Paired Facial Images with LLMs
Leveraging AI to create thermal-visible facial image pairs for security applications
