1. Introduction
Prophecy provides AI-powered tools to help you build data pipelines using natural language prompts. Whether you’re starting from scratch, migrating existing SQL or SAS code, or iterating on an existing pipeline, effective prompting is the key to getting the best results from Prophecy’s AI capabilities.
This guide covers best practices for working with two primary AI tools in Prophecy:
- AI Agent (Chat) — Available in SQL projects for conversational pipeline development
- Data Copilot — Available in Spark/PySpark projects for suggestions and code generation
2. General Prompting Principles
2.1 Be Specific and Explicit
The more precise your prompt, the better the result. Avoid vague instructions and instead describe exactly what you need.
Instead of:
Clean the data
Try:
Remove duplicate customer records based on customer_id, and fill null values in the email column with 'unknown@example.com'
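To see why the sharper prompt works, it helps to note that it maps directly onto a small SQL query. Below is a sketch using Python's built-in sqlite3 with an invented two-column customers schema (customer_id, email); the table contents are illustrative only, not Prophecy's:

```python
import sqlite3

# Hypothetical customers table, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER, email TEXT);
INSERT INTO customers VALUES (1, 'a@x.com'), (1, 'a@x.com'), (2, NULL);
""")

# Deduplicate on customer_id and replace NULL emails with the
# placeholder address, mirroring the "Try" prompt above.
rows = conn.execute("""
    SELECT customer_id,
           COALESCE(email, 'unknown@example.com') AS email
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id) AS rn
        FROM customers
    )
    WHERE rn = 1
    ORDER BY customer_id
""").fetchall()
print(rows)  # [(1, 'a@x.com'), (2, 'unknown@example.com')]
```

Every concrete detail in the prompt (the key column, the target column, the fill value) becomes a concrete clause in the query, which is why specificity pays off.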
Instead of:
Join the tables
Try:
Inner join the orders table with the customers table on customer_id, keeping all columns from orders and only customer_name and email from customers
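The join prompt is similarly mechanical to translate. A minimal sqlite3 sketch, with invented orders/customers schemas standing in for your warehouse tables:

```python
import sqlite3

# Toy schemas invented for this example; your real tables will differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL);
CREATE TABLE customers (customer_id INTEGER, customer_name TEXT,
                        email TEXT, city TEXT);
INSERT INTO orders VALUES (10, 1, 99.0), (11, 2, 25.0), (12, 3, 5.0);
INSERT INTO customers VALUES (1, 'Ada', 'ada@x.com', 'London'),
                             (2, 'Bo', 'bo@x.com', 'Oslo');
""")

# Inner join keeping all columns from orders, plus only customer_name
# and email from customers -- exactly what the sharper prompt asks for.
rows = conn.execute("""
    SELECT o.*, c.customer_name, c.email
    FROM orders o
    INNER JOIN customers c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

Order 12 has no matching customer and is dropped, which is the inner-join behavior the prompt explicitly requested; a vaguer prompt leaves that choice to the AI.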
2.2 Use Step-by-Step Instructions
Break complex transformations into smaller, sequential requests rather than trying to do everything in one prompt. This approach:
- Improves the reliability of the generated pipeline
- Makes it easier to debug or adjust individual steps
- Lets you inspect intermediate results at each stage
Example workflow:
Read the notes table and the note_templates table
Join notes with note_templates on NoteTemplateId
Filter to keep only rows where NoteTemplateName equals 'Request for Information'
Group by ServiceAuthId and count the number of notes per authorization
2.3 Reference Tables and Gems with @ Mentions
Use the @ symbol to reference specific tables in your warehouse or gems already on your pipeline canvas. This removes ambiguity and helps the AI locate the exact objects you mean.
Examples:
How many records are in the @transactions table?
Aggregate the @orders_cleaned gem to show sum of order quantity per month
Add a filter after @last_aggregation_gem to show only records from 2024
As you type @, Prophecy will suggest matching table names and gem labels.
2.4 Use Descriptive Column Names
Prophecy’s AI suggestions work best with meaningful column names. Descriptive names give the AI semantic cues to generate accurate expressions.
Not recommended: col1, field_a, x
Recommended: customer_email, order_date, total_amount
If your source data has ambiguous column names, consider renaming them early in your pipeline to improve AI suggestions downstream.
3. Building Pipelines with the AI Agent (SQL Projects)
3.1 Start with Data Exploration
Before building transformations, explore your available data:
Find datasets with information about service authorizations
Show me sample data from @notes where NoteTypeKey = 'DECR'
What columns are available in the @serviceauth table?
This helps you understand your data before writing transformation logic.
3.2 Build Incrementally
Rather than prompting for an entire complex pipeline at once, build it step by step:
Step 1 — Add sources: Add the notes and note_templates tables to the pipeline
Step 2 — Join: Join @notes with @note_templates on NoteTemplateId
Step 3 — Filter: Filter to keep only rows where NoteTemplateName is 'Request for Information' and NoteTypeKey is 'DECR'
Step 4 — Transform: Add a column called NoteReviewer_Name using COALESCE(DisplayName, CONCAT_WS(' ', FirstName, LastName))
Step 5 — Aggregate: Group by ServiceAuthId and count the number of notes
Step 6 — Save: Save the final output as a table
After each step, use Inspect to review the generated gems and verify the output before continuing.
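The incremental steps above compose into one logical query, which is useful to keep in mind when verifying each gem. A sqlite3 sketch with invented sample rows (note that `||` concatenation stands in for Databricks' CONCAT_WS, which SQLite lacks in older versions):

```python
import sqlite3

# Minimal stand-ins for the notes and note_templates tables; all
# schemas and sample values here are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE notes (ServiceAuthId INTEGER, NoteTemplateId INTEGER,
                    NoteTypeKey TEXT, DisplayName TEXT,
                    FirstName TEXT, LastName TEXT);
CREATE TABLE note_templates (NoteTemplateId INTEGER, NoteTemplateName TEXT);
INSERT INTO notes VALUES
  (100, 1, 'DECR', NULL, 'Ada', 'Lovelace'),
  (100, 1, 'DECR', 'G. Hopper', NULL, NULL),
  (200, 2, 'DECR', NULL, 'Alan', 'Turing');
INSERT INTO note_templates VALUES (1, 'Request for Information'), (2, 'Other');
""")

# Steps 2-5 in one query: join, filter, derive NoteReviewer_Name,
# then group and count notes per authorization.
rows = conn.execute("""
    SELECT ServiceAuthId, COUNT(*) AS note_count
    FROM (
        SELECT n.ServiceAuthId,
               COALESCE(n.DisplayName,
                        n.FirstName || ' ' || n.LastName) AS NoteReviewer_Name
        FROM notes n
        JOIN note_templates t ON n.NoteTemplateId = t.NoteTemplateId
        WHERE t.NoteTemplateName = 'Request for Information'
          AND n.NoteTypeKey = 'DECR'
    )
    GROUP BY ServiceAuthId
    ORDER BY ServiceAuthId
""").fetchall()
print(rows)  # [(100, 2)]
```

When an intermediate gem's output diverges from what this kind of mental model predicts, that is the step to re-prompt or adjust.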
3.3 Inspect and Verify
After each transformation:
1. Click Inspect to review the configuration of modified gems (highlighted in yellow)
2. Use Previous/Next to navigate through all modified gems
3. Examine both input and output data to confirm the transformation produces expected results
4. If something looks wrong, click Restore or Rollback to revert to a previous state and try a different prompt
3.4 Use Specific Phrases to Generate Visuals & Dashboards
If you’d like to see charts of your data, you can prompt the AI to create visuals using specific phrases. If you have a specific chart in mind, include it in your prompt.
Examples:
- Data exploration: Create an analysis dashboard of electronic goods sales from 2023 from the @ecommerce data. This creates a new analysis tab with charts the AI considers helpful, built from the full dataset.
- Specific visuals: Create a pie chart of total revenue by gender. This creates a new pie chart and adds it to the existing analysis dashboard.
- Filtering: Add a dynamic filter to the electronics_sales_2023_dashboard for sale_month. This adds a filter that applies to all visuals on the dashboard; to see the results, enter preview mode, change the filter, and run the analysis again.
Once the analysis dashboards are created, click on the edit button to inspect the columns and aggregations used. You can also change chart types, colors, and manually add more charts.
4. Prompting for Code Migration Use Cases
When migrating existing SAS, SQL, or other code to Prophecy pipelines, follow these strategies:
4.1 Provide Business Requirements as Context
The most effective prompts for migration include a business requirements document that describes the pipeline logic in plain English. This helps the AI understand the intent behind the code, not just the syntax.
A good business requirements prompt document should include:
- Source Tables: List all tables with their full paths and friendly names
- Step-by-Step Logic: Describe each transformation step in plain English, including join conditions, filter criteria, and column derivations
- Column Definitions: Specify how derived columns should be calculated, including any SQL expressions or business rules
- Expected Output: Describe the final output table structure and what each row represents
4.2 Include Reference SQL
Alongside business requirements, providing the original SQL or SAS code as a reference helps the AI generate accurate transformations. You can include:
- The original source code (SAS, SQL, etc.) for the AI to translate
- A simplified Databricks SQL version that serves as a more direct mapping
- Both a high-level summary and a detailed version of the requirements
4.3 Use a Structured Prompt Format
For complex migration prompts, organize your prompt document into clear sections:
Section 1 — Source Tables: List each table with its path, alias, and purpose
Section 2 — Transformation Steps: Number each step and describe the logic
Section 3 — Join Conditions: Specify join types, keys, and any special conditions
Section 4 — Filter Criteria: List all WHERE clause conditions in plain English
Section 5 — Column Derivations: Define calculated columns with formulas
Section 6 — Output Specification: Describe the target table schema and write destination
4.4 Iterate and Refine
Migration is rarely a one-shot process. Plan to:
1. Generate an initial pipeline from the prompt
2. Run the pipeline against sample data
3. Compare output with the expected results from the legacy system
4. Refine the prompt or manually adjust gems as needed
5. Repeat until output matches expectations
5. Sample Prompts for Common Operations
Data Exploration:
Find datasets containing customer information
Show me 5 random records from @sales_data
Visualize number of patients per city in @patients_raw_data
Filtering:
Filter to only include orders from 2024
Remove rows where email is null
Keep only rows where SATypeKey is 'IP', 'AS', or 'MO'
Joins:
Join the orders and customers tables on customer_id
Left join @notes with @note_templates on NoteTemplateId
Aggregations:
Group by region and calculate average sales
Count the number of records per ServiceAuthId
Transformations:
Calculate total revenue as quantity * price
Rename cust_id to customer_id and amt to amount
Convert the date from UTC to Eastern Time and format as MM/dd/yyyy
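The timezone prompt above is the kind of transformation worth double-checking with Inspect, since date boundaries shift. A standard-library sketch of the same logic (Prophecy would instead generate the equivalent SQL or Spark expression; the function name here is invented):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def utc_to_eastern(ts_utc: datetime) -> str:
    """Convert an aware UTC timestamp to Eastern Time, formatted MM/dd/yyyy."""
    eastern = ts_utc.astimezone(ZoneInfo("America/New_York"))
    return eastern.strftime("%m/%d/%Y")

# 2024-01-01 02:30 UTC is still 2023-12-31 in New York (UTC-5 in winter),
# so the formatted date changes -- exactly the edge case to verify.
stamp = datetime(2024, 1, 1, 2, 30, tzinfo=timezone.utc)
print(utc_to_eastern(stamp))  # 12/31/2023
```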
Output:
Save the final output as a table
Show me and save the final output of the pipeline
6. Troubleshooting Tips
Agent can’t find tables: Reindex your Databricks connection so the Knowledge Graph includes up-to-date table metadata.
Conversation feels stuck: Click the “Reset” option in the chat interface to start a fresh session with a clean context.
Unexpected results: Use Inspect to review each gem’s configuration. Check input and output data at each step to find where the logic diverges from expectations.
Complex expressions not generated correctly: Break the expression into simpler parts. Generate each part separately, then combine them manually in the gem configuration.
7. Summary of Key Tips
1. Be specific — Describe exactly what columns, tables, filters, and transformations you need
2. Build incrementally — One step at a time, inspecting results along the way
3. Use @ mentions — Reference tables and gems by name to reduce ambiguity
4. Use descriptive names — Meaningful column and gem names improve AI suggestions
5. Provide context — For migrations, include business requirements and reference SQL
6. Inspect everything — Review generated gems and data before proceeding
7. Iterate — Refine your prompts based on results; migration is a multi-pass process
8. Combine AI with manual editing — Use the AI for scaffolding, then fine-tune in the visual editor
For more information, visit the Prophecy documentation at docs.prophecy.ai

