PolicyAI API Integration Guide
Overview
The PolicyAI API is simple to integrate into your app wherever you need to check for policy violations. PolicyAI examines the content you provide using our moderation LLM, determines whether it violates your policy, and returns details about the violation so you can take action on the content. Common actions include taking the content down or flagging the associated user account for review by AiMod or a moderator.
With the API you can:
- Create and edit versioned policies
- Evaluate content against a policy version
We also provide a UI where you can:
- Sign up for an account
- Create, edit, manage, and test your policies
- Access your account details
PolicyAI URLs
We provide both a production and a development instance of PolicyAI.
Production
UI: https://policyai.musubilabs.ai
API: https://api.musubilabs.ai/policyai/docs
Development
UI: https://policyai.dev.musubilabs.ai
API: https://api.dev.musubilabs.ai/policyai/docs
Warning
We do not guarantee how long data will be retained in the development PolicyAI.
Info
For the rest of this guide, any links taking you to the API or UI will navigate to the development PolicyAI.
Sign up for an account
To create an account for development, navigate to the dev UI where you’ll be prompted to sign up!
After you've created an account, you can view your account details on the Account page of the UI, including your organization memberships.
Organizations and Organization Settings
When you create an account, your own personal organization is created for you in our system. Organizations are used by PolicyAI to determine who has access to what. Any policies or evaluation results that you create using your personal organization ID are accessible only to you!
PolicyAI maintains per-organization settings, such as the default model to use for that organization. Whenever an operation runs on the API, the organization's settings are checked and applied where relevant. You can check your organization settings using the manage settings endpoints of the API.
API Authentication
When you use the API directly, you can authenticate using your API key like this:
GET https://api.musubilabs.ai/policyai/api/version
accept: application/json
Musubi-Api-Key: [your API key here]
You can grab your API key from the API Settings page of the UI.
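As a rough illustration, the same request could be built in Python using only the standard library. The production base URL is taken from the request above; the development base URL is an assumption inferred from the development docs URL, and the helper names (`build_headers`, `version_request`, `fetch_version`) are hypothetical.

```python
import urllib.request

# The production base comes from the request shown above. The development
# base is an ASSUMPTION inferred from the development docs URL.
API_BASES = {
    "production": "https://api.musubilabs.ai/policyai",
    "development": "https://api.dev.musubilabs.ai/policyai",
}


def build_headers(api_key: str) -> dict:
    """Return the headers used to authenticate a PolicyAI API call."""
    return {"accept": "application/json", "Musubi-Api-Key": api_key}


def version_request(api_key: str, environment: str = "production") -> urllib.request.Request:
    """Build (but do not send) a GET request for the version endpoint."""
    url = f"{API_BASES[environment]}/api/version"
    return urllib.request.Request(url, headers=build_headers(api_key), method="GET")


def fetch_version(api_key: str, environment: str = "production") -> str:
    """Send the request and return the raw response body as text."""
    request = version_request(api_key, environment)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_version("your-api-key")` with the key from the API Settings page would perform the request; `version_request` is split out so the URL and headers can be inspected without touching the network.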
Create your Policy
You can create and edit policies and policy versions on the Manage Policies page of the UI, or by using the manage policies endpoints of the API. Policy data is passed to the API as JSON, which in the UI is presented as structured markdown for a better editing experience.
Each policy has one or more versions, and for each version you can configure the policy text, policy examples, and policy settings.
Tip
Any edits that you make on a policy in the UI are auto-saved!
Policy text
When you write your policy text, organize it into categories, and under each category list the specific characteristics that would constitute a violation of that category.
Example policy text defining three categories:
# Selling
- actively selling any service or product
- mentioning services available
- stating a price for a service or product with intent to sell it
# Scamming
- tricking someone into sending money or crypto or signing up for a scam service
- inviting someone to chat off the app is not a violation unless the contact info provided is suspicious
# Drugs
- selling or seeking regulated or controlled substances
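If you keep your categories in code or configuration, the layout above can be generated rather than typed by hand. The following is a minimal sketch: `render_policy_text` is a hypothetical helper, and it produces only the structured text shown above, not the JSON shape that the manage policies endpoints expect.

```python
def render_policy_text(categories: dict) -> str:
    """Render categories and their violation criteria as policy text."""
    lines = []
    for name, criteria in categories.items():
        # Each category becomes a heading followed by one bullet per criterion.
        lines.append(f"# {name}")
        lines.extend(f"- {criterion}" for criterion in criteria)
    return "\n".join(lines)


policy_text = render_policy_text(
    {
        "Selling": [
            "actively selling any service or product",
            "stating a price for a service or product with intent to sell it",
        ],
        "Drugs": ["selling or seeking regulated or controlled substances"],
    }
)
print(policy_text)
```

Keeping the criteria in a dictionary makes it easier to review changes to a single category before creating a new policy version.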
Policy examples
You can provide reference examples for each category in your policy to give the AI a solid idea of what should and should not be marked as violations. For example:
# Scamming
Unsafe: "I need you to urgently send me money. It's an emergency."
Unsafe: "I can only talk on telegram. It's more secure than this app."
Safe: "I don't feel comfortable sharing personal info here. Here's my number."
# Drugs
Unsafe: "I've got that good stuff."
Unsafe: "💊❄️🧊🍁🍚"
Safe: "420 friendly."
# Selling
Unsafe: "My rates are [price] for [time duration]. Interested?"
Safe: "I sell handmade jewelry."
Safe: "I'm shopping for a $20 dress right now"
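Before pasting a long list of examples into a policy, it can help to check that every line follows the layout above. The parser below is a sketch under the assumption that examples are written exactly as shown (a `# Category` heading followed by `Safe:` or `Unsafe:` lines); `parse_policy_examples` is a hypothetical helper and not part of PolicyAI.

```python
def parse_policy_examples(text: str) -> list:
    """Parse example text into (category, label, content) tuples."""
    examples = []
    category = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("# "):
            # A heading starts a new category.
            category = line[2:].strip()
            continue
        # Split on the first colon only, so colons inside the content survive.
        label, separator, content = line.partition(":")
        if category is None or not separator or label not in ("Safe", "Unsafe"):
            raise ValueError(f"Unrecognized example line: {raw_line!r}")
        examples.append((category, label, content.strip().strip('"')))
    return examples


sample = '''# Drugs
Unsafe: "I've got that good stuff."
Safe: "420 friendly."
'''
for category, label, content in parse_policy_examples(sample):
    print(category, label, content)
```

The resulting tuples could also be used to confirm that each category has both safe and unsafe examples.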
Policy settings
PolicyAI supports customizing each policy in several ways.
Model
Choose the model that the policy will use to check for violations. We'll provide guidance on which model we expect to work best for you for a given application, and you can experiment on your datasets as well.
A policy can be used on any type of content that the selected model supports. Available models can be viewed in the UI under policy settings, or by using the models endpoints of the API.
Content types
PolicyAI currently supports these kinds of content:
- Text
- Image
- Text + Image combined
Video and audio support are coming shortly as well. You can configure a policy to apply to only certain content types, which is useful when a policy has been constructed and tested for one particular content type. For example, you may define one policy for user profile images and another for post text.
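On the calling side, one way to use per-content-type policies is to pick the policy from the shape of the content before evaluating it. This is only a sketch: the policy identifiers are placeholders, and `ContentType`, `detect_content_type`, and `select_policy` are hypothetical names rather than anything defined by the API.

```python
from enum import Enum
from typing import Optional


class ContentType(Enum):
    TEXT = "text"
    IMAGE = "image"
    TEXT_AND_IMAGE = "text+image"


# Placeholder policy identifiers; substitute the IDs of your own policies.
POLICY_BY_CONTENT_TYPE = {
    ContentType.TEXT: "post-text-policy",
    ContentType.IMAGE: "profile-image-policy",
}


def detect_content_type(text: Optional[str], image: Optional[bytes]) -> ContentType:
    """Classify content by which parts are present."""
    if text and image:
        return ContentType.TEXT_AND_IMAGE
    if image:
        return ContentType.IMAGE
    if text:
        return ContentType.TEXT
    raise ValueError("content must include text, an image, or both")


def select_policy(text: Optional[str] = None, image: Optional[bytes] = None) -> Optional[str]:
    """Return the policy configured for this content, or None if there is none."""
    return POLICY_BY_CONTENT_TYPE.get(detect_content_type(text, image))
```

Here `select_policy(text="hello")` returns the text policy, while combined text and image content returns `None` because no policy is mapped to it in this example.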
Testing your policy
Once you've put together a policy, test individual text or image content against it to make sure that it surfaces violations as you would expect.
To test your policy against a curated dataset, you can upload a test dataset on the Manage Datasets page of the UI, and run the policy against that dataset on the Test Text Policies page of the UI.
Apply your policy
Once you have your policy ready, you can use it to evaluate content using the API. The response for each evaluation contains the following:
- The assessment, either SAFE or UNSAFE. UNSAFE means that a violation was detected.
- The severity level. The levels are:
  - 0: Safe (no violation)
  - 1: Low
  - 2: Medium
  - 3: High
- The category that was violated, if the assessment is UNSAFE.
- The reason for the evaluation result in plain text.
Note
Evaluated content is saved for 30 days by default.
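To show how these fields might drive a decision in your system, here is a minimal sketch. The JSON key names (`assessment`, `severity`, `category`, `reason`) are assumptions based on the fields listed above, so check the API docs for the actual response shape. The action thresholds are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional

# Severity levels as listed in this guide.
SEVERITY_LABELS = {0: "Safe", 1: "Low", 2: "Medium", 3: "High"}


@dataclass
class EvaluationResult:
    assessment: str          # "SAFE" or "UNSAFE"
    severity: int            # 0 to 3
    category: Optional[str]  # expected only when the assessment is UNSAFE
    reason: str


def parse_result(payload: dict) -> EvaluationResult:
    """Build an EvaluationResult from a decoded JSON object (assumed keys)."""
    result = EvaluationResult(
        assessment=payload["assessment"],
        severity=int(payload["severity"]),
        category=payload.get("category"),
        reason=payload.get("reason", ""),
    )
    if result.assessment not in ("SAFE", "UNSAFE"):
        raise ValueError(f"unexpected assessment: {result.assessment!r}")
    if result.severity not in SEVERITY_LABELS:
        raise ValueError(f"unexpected severity: {result.severity!r}")
    return result


def decide_action(result: EvaluationResult) -> str:
    """Map a result to an action; the thresholds are illustrative only."""
    if result.assessment == "SAFE":
        return "allow"
    if result.severity >= 3:
        return "take_down"
    return "flag_for_review"


example = parse_result(
    {
        "assessment": "UNSAFE",
        "severity": 2,
        "category": "Scamming",
        "reason": "Asks the recipient to send money urgently.",
    }
)
print(decide_action(example), SEVERITY_LABELS[example.severity])
```

Validating the assessment and severity up front means an unexpected response fails loudly instead of silently allowing content through.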
At this point, you can incorporate these results into your system as needed 🥳