Heap Blog

Building Target Personas with SQL, Salesforce, and Clearbit

Last week Todd, Heap’s Head of New Business, wrote a blog post describing how our new Salesforce → Heap integration is a powerful tool for defining strategy around the entire customer life cycle. I wanted to dive deeper into the difficulty of defining a target persona, and share a bit of our process before and after we built the Salesforce → Heap integration.

The Problem: Defining a Target Persona

As a company, our goal is to know exactly who to sell to, who to design the product for, and who to have distinct messaging for to maximize our value add, growth, and revenue. To define our target persona, our first step was trying to answer some seemingly simple questions:

  1. Who signs up for Heap?
  2. What’s their job title?
  3. What’s their industry?
  4. How much are they paying?

Once we know who we can answer questions like:

  1. Which iteration of our landing page leads to the most conversions of the best buyers?
  2. What channels bring in these buyers?
  3. What campaigns are most effective?

Defining a target persona is an iterative process that takes a lot of fine-tuning, and changes as the company evolves. Whoever is responsible for doing the analysis on the target persona should have moderate SQL expertise or dedicated engineering resources to handle the ETL process and analysis.

Defining Our Target Persona

We started defining our target persona using three data sets – Heap for our user and attribution data, Salesforce for our business logic (such as ARR), and Clearbit for our user enrichment.

Heap does a great job ensuring that all user interaction data is collected – we have a complete, retroactive data set to see who is signing up, who is visiting our blog, our marketing page,  who is running queries, what queries our users are running, etc. Heap also captures every referrer, UTM parameter, and every landing page for every user across the entirety of their usage.

Salesforce has all of the contract information, revenue, key contacts on each account, etc. We enrich each Contact, Lead, and Account using Clearbit’s Salesforce Integration, so Salesforce also holds the user/account information like industry and role.

Using Mode, a SQL based BI tool, we have to join our signup event with our users with our Contacts (or leads )  with our Accounts (or Opportunities depending on where revenue is stored within your Salesforce instance)  with our Clearbit Objects. The query looks something like this:

SELECT signup.TIME :: DATE, 
       contact.email, 
       cbit.cbit___employment_role___c, 
       cbit.cbit___employment_title___c, 
       cbit.cbit___employment_seniority___c, 
       cbit.cbit___company_category_industry_group___c, 
       cbit.cbit___company_category_industry___c, 
       cbit.cbit___company_category_sub_industry___c, 
       cbit.cbit___company_category_sector___c, 
       cbit.cbit___company_tags___c, 
       account.mrr___c * 12 AS ARR 
FROM   main_production.users u 
       join main_production.sign_up signup 
         ON u.user_id = signup.user_id 
       join salesforce._contact contact 
         ON u."identity" = contact.email 
       join salesforce._cbit___clearbit___c cbit 
         ON contact.cbit___clearbit___c = cbit.id 
       join salesforce._account account 
         ON contact.account_id = account.id 
WHERE  account.TYPE = 'Paying Customer' 
ORDER  BY 1 

This gives us a table with all the information I need. The next step is to aggregate the data.

Before we can start to group our data meaningfully, we have to understand what the possible result values are, and how to categorize them with a query such as:

SELECT cbit.cbit___employment_title___c, 
       Count(DISTINCT contact.email) 
FROM   main_production.users u 
       JOIN main_production.sign_up signup 
         ON u.user_id = signup.user_id 
       JOIN salesforce._contact contact 
         ON u."identity" = contact.email 
       JOIN salesforce._cbit___clearbit___c cbit 
         ON contact.cbit___clearbit___c = cbit.id 
       JOIN salesforce._account account 
         ON contact.account_id = account.id 
WHERE  account.type = 'Paying Customer' 
GROUP  BY 1 
ORDER  BY 2 DESC 

If title is too specific for your situation, Clearbit includes broader categories like Industry, and Role so it’s easier to group without having to use case when statements to categorize specific titles or group similar industries. Using Clearbit’s categories, we ended up with the query below to calculate the most common roles, industries, company size and the ARR associated with each bucket of users.

SELECT ROLE, 
       industry, 
       company_size, 
       SUM(arr) AS total_rev, 
       Avg(arr) AS average_rev, 
       Count(*) AS total_contacts 
FROM   (SELECT signup.TIME :: DATE, 
               contact.email, 
               cbit.cbit___employment_role___c           AS ROLE, 
               cbit.cbit___company_category_industry___c AS industry, 
               cbit.cbit___company_metrics_employees___c AS company_size, 
               Round(account.mrr___c * 12, 2)            AS arr 
        FROM   main_production.users u 
               join main_production.sign_up signup 
                 ON u.user_id = signup.user_id 
               join salesforce._contact contact 
                 ON u."identity" = contact.email 
               join salesforce._cbit___clearbit___c cbit 
                 ON contact.cbit___clearbit___c = cbit.id 
               join salesforce._account account 
                 ON contact.account_id = account.id 
        WHERE  account.TYPE = 'Paying Customer' 
        ORDER  BY 1) 
GROUP  BY 1, 
          2 

Analyzing Attribution

At this point we are ready to figure out what marketing campaigns are associated with each user, and how much revenue they are associated with overall. We look at two different touches, the first touch – what first brought a user to Heap, and the last touch – what ultimately was responsible for the conversion. (Although there are many ways to model attribution data, we’re just going to focus on these two in this post).

In order to find the UTM campaign associated with the conversion, we use the following SQL:

SELECT utm_campaign, 
       Sum(account.mrr___c * 12)     AS total_arr, 
       Count(DISTINCT users.user_id) AS total_users 
FROM   main_production.sign_up_submit_signup_form last_touch 
       JOIN main_production.users users 
         ON users.user_id = last_touch.user_id 
       JOIN salesforce._contact contact 
         ON users."identity" = contact.email 
       JOIN salesforce._cbit___clearbit___c cbit 
         ON contact.cbit___clearbit___c = cbit.id 
       JOIN salesforce._account account 
         ON contact.account_id = account.id 
WHERE  account.type = 'Paying Customer' 
AND (cbit.cbit___employment_role___c = 'product' OR cbit.cbit___employment_role___c = 'marketing')
AND (cbit.cbit___company_category_industry___c = 'Software' OR cbit.cbit___company_category_industry___c = 'E-commerce' )
AND cbit.cbit___company_metrics_employees___c > 20
GROUP BY utm_campaign

Essentially, we found all the attribution information associated with our target buyers, and then looked at the total revenue associated with each of those campaigns, and users converting from those campaigns to each of those campaigns. Then we update the query to analyze the number of sessions and sign ups associated with each of those campaigns. This data allows us to allocate our spend, and refine our campaigns to align with those driving the most revenue.

Heap and Salesforce Data should Co-Exist:

Cleaning, organizing, and analyzing this data requires a lot of resources. In order to enable marketing, sales, and customer success team to answer their own questions, we decided to build a Heap + SFDC integration. This integration automatically captures touch points, account updates, contacts created, etc. so our business logic exists in Heap as well.

We no longer have to worry about holes in the data, because our integration updates Heap each time a Salesforce contact, lead, task, opportunity, or account is updated. Downstream, our integration also transforms Salesforce’s convoluted data model into a straightforward user – event model. The current state of the user’s salesforce contact (or lead), account, any opportunity data is stored as properties on the user, and every time that information changes, Heap captures an event.

This simplifies our queries immensely making it easier to iterate on them. We can perform the same analysis to define our target persona with a simplified query:

SELECT users.sfdc_account_cbit_company_industry, 
       users.sfdc_contact_cbit_role, 
       Sum(users.sfdc_account_arr) AS Total_Rev, 
       Avg(users.sfdc_account_arr) AS AVG_Rev 
FROM   main_production.users users 
       JOIN main_production.sign_up signup 
         ON users.user_id = signup.user_id 
WHERE  users.sfdc_account_account_type = 'Paying Customer' 
GROUP  BY 1, 
          2 

Similarly, the query to look at attribution for these users looks more like this:

SELECT utm_campaign, 
       Sum(sfdc_account_arr), 
       Count(DISTINCT users.user_id) 
FROM   main_production.sign_up_submit_signup_form signup 
JOIN   main_production.users users 
ON     signup.user_id = users.user_id 
WHERE  sfdc_account_account_type = 'Paying Customer' 
AND    ( 
              sfdc_contact_cbit_role = 'Product' 
       OR     sfdc_contact_cbit_role = 'Data') 
AND    ( 
              sfdc_account_cbit_company_industry = 'E-commerce' 
       OR     sfdc_account_cbit_company_industry = 'SaaS' 
       OR     sfdc_account_cbit_company_industry = 'Fintech') 
AND    sfdc_account_cbit_company_size >= 20 GROUP BY utm_campaign

Using this Data in Heap

Not only is the query simplified, but having both data sets within Heap’s UI empowers our end users (in this case marketing and sales) to do the vast majority of this analysis without a data team or a BI tool. Within Heap’s UI you can perform very similar analysis. This graph shows us the title and industry of each user who signed up for Heap and became a paying customer, as well as the ARR associated with each grouping of categories.

target persona

Now that we have data dictating where we should allocate spend and which campaigns were most effective, our marketing team can perform ad-hoc analysis to rapidly iterate on campaigns and monitor their effectiveness.

What does this mean for us?

The Salesforce → Heap Integration solved a huge pain point and a blocker in our outbound and marketing efforts. Our go-to-market team now has the data they need, in an interface everyone can use, making us smarter as a company. Heap’s flexibility allows us to constantly analyze, explore, and iterate easily. As our product develops and our customer base grows, the new data is automatically included in our reports, allowing us to adapt accordingly.

Taylor Udell

Your Header Sidebar area is currently empty. Hurry up and add some widgets.