Cloudflare Docs

Unsafe and custom topic detection

AI Security for Apps (formerly Firewall for AI) can detect when an LLM prompt touches on unsafe or unwanted subjects. There are two layers of topic detection:

  • Default unsafe topics — A built-in set of safety categories that detect harmful content such as violent crimes, hate speech, and sexual content.
  • Custom topics — Topics you define to match your organization's specific policies, such as "competitors" or "financial advice".

Default unsafe topics

When AI Security for Apps is enabled, it automatically evaluates prompts against a set of default unsafe topic categories and populates two fields: cf.llm.prompt.unsafe_topic_detected, a boolean indicating whether any unsafe topic was detected, and cf.llm.prompt.unsafe_topic_categories, the list of matched category codes.

Default unsafe topic categories

Category   Description
S1         Violent crimes
S2         Non-violent crimes
S3         Sex-related crimes
S4         Child sexual exploitation
S5         Defamation
S6         Specialized advice
S7         Privacy
S8         Intellectual property
S9         Indiscriminate weapons
S10        Hate
S11        Suicide and self-harm
S12        Sexual content
S13        Elections
S14        Code interpreter abuse
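When reviewing logged values of the categories field, it can help to translate the raw codes back into readable names. A minimal Python helper (the mapping simply mirrors the table above; the function name is ours):

```python
# Default unsafe topic category codes and their meanings, per the table above.
UNSAFE_CATEGORIES = {
    "S1": "Violent crimes",
    "S2": "Non-violent crimes",
    "S3": "Sex-related crimes",
    "S4": "Child sexual exploitation",
    "S5": "Defamation",
    "S6": "Specialized advice",
    "S7": "Privacy",
    "S8": "Intellectual property",
    "S9": "Indiscriminate weapons",
    "S10": "Hate",
    "S11": "Suicide and self-harm",
    "S12": "Sexual content",
    "S13": "Elections",
    "S14": "Code interpreter abuse",
}

def describe_categories(codes: list[str]) -> list[str]:
    """Translate category codes (e.g. logged field values) into readable labels."""
    return [f"{code}: {UNSAFE_CATEGORIES.get(code, 'Unknown category')}" for code in codes]
```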

Example rules — default unsafe topics

Block any prompt with unsafe content

  • When incoming requests match:

    Field                        Operator   Value
    LLM Unsafe topic detected    equals     True

    Expression when using the editor:
    (cf.llm.prompt.unsafe_topic_detected)

  • Action: Block

Block only specific unsafe categories

  • When incoming requests match:

    Field                          Operator   Value
    LLM Unsafe topic categories    is in      S1: Violent Crimes, S10: Hate

    Expression when using the editor:
    (any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S10"}))

  • Action: Block
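The `any(... in {...})` expression matches when at least one detected category appears in the listed set. As a rough sketch of that matching logic in Python (illustrative only, not Cloudflare's implementation):

```python
def any_category_in(detected: list[str], blocked: set[str]) -> bool:
    """Roughly mirrors any(cf.llm.prompt.unsafe_topic_categories[*] in {"S1" "S10"}):
    True if any detected category code is in the blocked set."""
    return any(category in blocked for category in detected)
```

For example, a prompt tagged with S6 and S10 matches the rule above because S10 is in the blocked set, while a prompt tagged only with S2 does not.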


Custom topics

Custom topic detection lets you define your own topics and AI Security for Apps will score each prompt against them. You can then use these scores in custom rules or rate limiting rules to block, challenge, or log matching requests.

This capability uses a zero-shot classification model that evaluates prompts at runtime. No model training is required.

How custom topics work

  1. You define a list of up to 20 custom topics via the dashboard or API. Each topic consists of:
    • A label — Used in rule expressions and analytics
    • A topic string — The descriptive text the model uses to classify prompts
  2. When a request arrives at a cf-llm labeled endpoint, the model evaluates the prompt against all defined topic strings and returns a relevance score for each.
  3. Scores are written to the cf.llm.prompt.custom_topic_categories map field, keyed by label. You use labels — not topic strings — in rule expressions and analytics.

Scores follow the same convention as other AI Security for Apps scores, where lower values indicate higher relevance (1 = highly relevant, 99 = not relevant).
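For intuition, here is one plausible mapping from a model's relevance probability onto this 1-99 convention. Cloudflare does not document the exact formula, so treat this purely as an illustration of "lower score = more relevant":

```python
def to_score(relevance: float) -> int:
    """Map a relevance probability (1.0 = highly relevant) onto the 1-99 scale,
    where 1 means highly relevant and 99 means not relevant.
    Hypothetical formula, for intuition only."""
    relevance = max(0.0, min(1.0, relevance))  # clamp to [0, 1]
    return 1 + round((1.0 - relevance) * 98)
```

Under this sketch, a fully relevant prompt scores 1, an irrelevant one scores 99, and a borderline one lands near the middle of the range.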

Define custom topics

  1. In the Cloudflare dashboard, go to the Security Settings page.
  2. Under AI Security for Apps, find the Custom Topics section and select Manage topics.

  3. Add a topic by providing:

    • Label: A short identifier used in rule expressions (for example, competitors).
    • Topic: A descriptive English text string the model uses for classification (for example, asking about Acme Corp products and pricing).
  4. Select Save.

Constraints

Parameter                   Limit
Maximum number of topics    20
Topic string length         2–50 printable ASCII characters
Label length                2–20 characters
Label format                Lowercase letters, numbers, and hyphens (-) only
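If you provision topics through the API or automation, it can help to check candidates against these constraints locally before submitting. A sketch in Python (the limits come from the table above; the function name is ours):

```python
import re

# 2-20 chars: lowercase letters, numbers, and hyphens only
LABEL_RE = re.compile(r"^[a-z0-9-]{2,20}$")

def validate_topic(label: str, topic: str, existing_count: int) -> list[str]:
    """Return a list of constraint violations (empty list means valid)."""
    errors = []
    if not LABEL_RE.fullmatch(label):
        errors.append("label must be 2-20 lowercase letters, numbers, or hyphens")
    # printable ASCII = code points 32-126
    if not (2 <= len(topic) <= 50) or not all(32 <= ord(c) <= 126 for c in topic):
        errors.append("topic must be 2-50 printable ASCII characters")
    if existing_count >= 20:
        errors.append("maximum of 20 topics already defined")
    return errors
```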

Example rules — custom topics

Block prompts highly relevant to competitors

  • When incoming requests match:

    Enter the following expression in the editor:
    (cf.llm.prompt.custom_topic_categories["competitors"] lt 30)

  • Action: Block

Log prompts relevant to financial advice

  • When incoming requests match:

    Enter the following expression in the editor:
    (cf.llm.prompt.custom_topic_categories["finance"] lt 40)

  • Action: Log

Combine custom topics with other detections

Example expression:
(cf.llm.prompt.custom_topic_categories["competitors"] lt 30 or cf.llm.prompt.pii_detected)


Best practices for defining custom topics

The quality of custom topic detection depends on how you write your topic strings. The underlying model is a zero-shot classifier — it compares the semantic meaning of the prompt against your topic string.

Be specific and avoid vague topics

Overly broad topics match too many prompts (high false positives). Overly narrow topics miss relevant prompts (high false negatives).

Quality      Topic string                                         Why
Good         Acme Corp products and pricing                       Names a specific competitor — catches prompts discussing that company's offerings.
Good         securities trading and investment recommendations    Targets a well-defined intersection of two concepts.
Too narrow   Acme Corp pricing page URL                           So specific that only near-exact mentions will score highly.
Too broad    technology                                           Will match almost any technical prompt.
Too broad    bad things                                           Semantically vague — the model cannot determine what you consider bad.

Use descriptive phrases instead of single keywords

A topic string like finance is less effective than securities trading and investment recommendations. More descriptive phrases give the model better signal and help prevent false positives.

Avoid semantically overlapping topics

If you define topics that mean nearly the same thing — for example, financial advice and investment guidance — both will score similarly on the same prompts, consuming two of your 20-topic budget without adding detection value. Consolidate overlapping concepts into a single topic.

Think about intent and not just keywords

The model performs semantic classification, not keyword matching. A topic string of Acme Corp products and pricing will detect requests that discuss that competitor's offerings even if they do not mention the company by name — for example, a prompt like "How does your pricing compare to the leading alternative?" can still score highly.

This also means you should phrase topics as action-oriented verb phrases that capture what the user is doing, not just the subject they mention. Descriptions that capture intent are significantly more discriminating — especially on borderline or ambiguous text.

For example, compare these two topic strings against two very different prompts:

Topic string                  "I read an article about tax deductions"   "What stocks should I buy to retire in 10 years?"
financial advice              Medium relevance (false positive)          High relevance
asking for financial advice   No relevance (correct)                     High relevance

The noun-phrase version (financial advice) returns a false positive on the passive text because the prompt merely mentions the subject. The verb-phrase version (asking for financial advice) correctly ignores passive mentions and only matches when the user is actively seeking advice.

Recommended phrasing styles:

Style                        Example
Noun phrase                  investment advice
Verb phrase (recommended)    asking for investment advice
Sentence-like                a user seeking financial guidance

For most use cases, a 3–6 word verb phrase is the best trade-off between precision and coverage.

Test and iterate

After defining your topics, send test prompts and review the scores in Security Analytics. There are two ways to tune detection behavior:

  • Adjust the topic string. If a topic is matching too broadly, make the topic string more specific. If it is not matching requests you expect it to catch, broaden or rephrase the topic string.
  • Adjust the score threshold in your rule. A lower threshold (for example, lt 20) is stricter and only matches highly relevant requests. A higher threshold (for example, lt 50) is more permissive and catches a wider range of related requests. Start with a moderate threshold and refine based on what you observe in logs.
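The threshold behavior described above can be sketched as follows. `rule_matches` mimics the `lt` comparison against the score map; the names and sample scores are ours, for illustration only:

```python
def rule_matches(scores: dict[str, int], label: str, threshold: int) -> bool:
    """Mimics `cf.llm.prompt.custom_topic_categories["<label>"] lt <threshold>`.
    Lower scores mean higher relevance, so a lower threshold is stricter."""
    score = scores.get(label)
    return score is not None and score < threshold

# Hypothetical scores returned for one prompt.
scores = {"competitors": 12, "finance": 65}

strict = rule_matches(scores, "competitors", 20)  # matches: 12 < 20
loose = rule_matches(scores, "finance", 50)       # no match: 65 >= 50
```

Raising the threshold from 50 to 70 would make the finance rule match this prompt as well, which is the kind of adjustment you make after reviewing scores in your logs.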

Example custom topics

Label           Topic string                                                       Use case
competitors     asking about Acme Corp products and pricing                        Prevent your chatbot from discussing a specific rival's offerings
legal-advice    asking for legal counsel or regulatory compliance guidance         Block prompts that solicit legal advice from your AI
student-data    requesting student personal information or academic records        EdTech — prevent discussion of individual student data
exec-internal   discussing internal executive decisions or leadership changes      Prevent discussion of sensitive internal matters
crypto-advice   asking for cryptocurrency trading or investment recommendations    FinTech — block prompts seeking crypto investment tips