Security of Synthetic Data Generation

Research on security implications, auditing, and tracing of synthetic data generated by LLMs for downstream applications

Research on Large Language Models and the Security of Synthetic Data Generation

Detecting AI-Generated Data in the Wild

New techniques to identify and audit synthetic data in downstream applications

Generating Private Synthetic Data via LLM APIs

Leveraging third-party LLMs without access to model weights

Unlocking Private Medical Text Generation

Using LLM Prompts to Create Synthetic Data While Preserving Patient Privacy

Stealing AI Art Prompts: A New Security Threat

How attackers can reverse-engineer valuable text-to-image prompts with minimal samples

Synthetic Clinical Data Generation for Privacy-Sensitive Applications

Using LLMs to create annotated training data for de-identification systems

Synthetic Data Privacy Risks

Detecting Information Leakage in LLM-Generated Text

Revolutionizing Cyberbullying Detection

Using LLM-Generated Data to Address Dataset Scarcity

Privacy-Preserving Synthetic Text

Training Better LLMs with Less Real Data

AI's Secret Languages

Preventing machines from developing communications beyond human understanding

Synthetic Tabular Data with Real-World Logic

Preserving Inter-column Relationships in Generated Data

Privacy-Preserving Synthetic Text Data

Generating high-quality text data without compromising privacy

Domain-Driven AI: The ELTEX Framework

Enhancing LLM specialization through explicit domain knowledge injection

Generating Paired Facial Images with LLMs

Leveraging AI to create thermal-visible facial image pairs for security applications

Fighting Fake News with Synthetic Data

Using LLMs to Generate and Detect Manipulated Facts

Key Takeaways

Summary of Research on Security of Synthetic Data Generation