Setting up a custom enrichment to extract Convertro parameters from the URL [tutorial]

The Snowplow JavaScript script enrichment lets you run a custom JavaScript function that returns a derived context, which is attached to the final enriched event. In this tutorial, we will show you how to use it to extract Convertro tracking parameters from the event's page_url field.

What are Convertro tracking parameters?

Convertro is an attribution platform that offers dashboards based on data collected through URL parameters (similar to GA’s utm_ parameters).

The three main Convertro parameters are the Source tracking parameters:

cvosrc=source1.source2.source3

Source 1 and Source 2 are used for the most general tracking. In most cases they are static and remain consistent across a given campaign or batch of adverts.

Source 1 is the channel of media (eg, display, affiliate, ppc, etc). This is similar to utm_medium.

Source 2 is the source of the media (eg, Bing, Pepperjam, Commission Junction, etc). This is similar to utm_source.

Source 3 is the most granular source in the general views of the Convertro dashboard. This is similar to utm_campaign, utm_content or utm_term, all bundled into one.
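
To make the structure concrete, here is a minimal sketch of how a cvosrc value breaks down into its three sources. The helper name splitCvosrc and the sample value are illustrative only, and the sketch assumes the individual sources do not themselves contain dots.

```javascript
// Hypothetical helper: split a cvosrc value into its three dot-separated sources.
function splitCvosrc(cvosrc) {
  var parts = cvosrc.split(".");
  return {
    source1: parts[0], // channel of media, similar to utm_medium
    source2: parts[1], // source of the media, similar to utm_source
    source3: parts[2]  // most granular source
  };
}

// Sample value, assumed for illustration
var sources = splitCvosrc("affiliate.bing.tshirts");
console.log(sources);
```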

As well as these three sources, Convertro also provides a set of cvo_ parameters for various levels of granularity:

  • Creative (cvo_crid= and cvo_creative= for the creative ID and name);
  • Campaign (cvo_cid= and cvo_campaign= for the campaign ID and name);
  • Placement (cvo_pid= and cvo_placement= for the placement ID and name).

Additionally, users can add further cvo_ parameters, such as cvo_country, though they won’t show up in the general dashboard.

Why is a custom enrichment necessary?

With utm_-style parameters, it is often possible to configure the Snowplow Campaign attribution enrichment so that those can be captured in the mkt_ fields of the atomic.events table. Indeed, the campaign attribution enrichment can be used to extract the value of cvosrc and populate one of the mkt_ fields in atomic.events – whichever we choose to map it to. However, the whole value (eg affiliate.bing.tshirts) will be inserted in that mkt_ field. There is no way to map source1 to mkt_medium, source2 to mkt_source and source3 to mkt_term, say.

This is where the JS script enrichment comes in. It allows us to parse the page URL, extract the value of cvosrc, split it into the three component sources and have each source populate its own column.

Context schema

The JS script enrichment will produce a new context, so we are going to need a schema for it, just as with any other custom context. The Convertro tracking parameters schema has been added to Iglu Central with the latest Snowplow release, as iglu:com.convertro/tracking_parameters/jsonschema/1-0-0.

It supports all the ‘default’ tracking parameters (so no cvo_country etc).

JavaScript function

There are a few conditions for the JS function to be used in the enrichment.

Your JavaScript must include a function, process(event), which:

  • Takes a Snowplow enriched event POJO (Plain Old Java Object) as its sole argument. (That is, it receives the event as an object whose fields are read through getters such as getPage_url(), not as a raw event line.)
  • Returns a JavaScript array of valid self-describing JSONs, which will be added to the derived_contexts field in the enriched event.
  • Returns [] or null if there are no contexts to add to this event.
  • Can throw exceptions, but note that throwing an exception will cause the entire enriched event to end up in the Bad Bucket or Bad Stream.

You can also include other top-level functions and variables in your JavaScript script - but you must include a process(event) function somewhere in your script.
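
The conditions above can be sketched as a minimal skeleton. The schema URI iglu:com.acme/example_context/jsonschema/1-0-0, the hasQueryString field and the mock event objects are all hypothetical; they are only there to show the shape of the return value and to let the function be tried outside the pipeline.

```javascript
// Minimal skeleton of a script for the JS enrichment (a sketch, not production code).
function process(event) {
  // Fields of the enriched event are read through getters
  var url = event.getPage_url();

  // No contexts to add: return an empty array (null would also be accepted)
  if (url == null) {
    return [];
  }

  // Otherwise return an array of self-describing JSONs
  return [{
    schema: "iglu:com.acme/example_context/jsonschema/1-0-0", // hypothetical schema
    data: {
      hasQueryString: String(url).indexOf("?") !== -1         // hypothetical field
    }
  }];
}

// Local smoke test with mock objects standing in for the event POJO
var mockEvent = { getPage_url: function () { return "http://example.com/?a=1"; } };
var emptyEvent = { getPage_url: function () { return null; } };
console.log(JSON.stringify(process(mockEvent)));
console.log(process(emptyEvent).length);
```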

Here is the process(event) function for the Convertro enrichment:

function process(event) {
  try {

    // Get the url from the page_url field of the event
    var url = event.getPage_url();

    if (url != null) {
      // Function to extract parameter values from the url
      function getURLParameter(url, param) {
        return decodeURIComponent((new RegExp('[?|&]' + param + '=' + '([^&;]+?)(&|#|;|$)').exec(url) || [null, ''])[1].replace(/\+/g, '%20')) || null;
      }

      // Extract cvosrc
      var cvosrc = getURLParameter(url, 'cvosrc');

        // Extract source_1
        var source_1 = cvosrc.split(".")[0];

        // Extract source_2
        var source_2 = cvosrc.split(".")[1];

        // Extract source_3
        var source_3 = cvosrc.split(".")[2];

      // Extract creative_id
      var cvo_crid = getURLParameter(url, 'cvo_crid');

      // Extract creative_name
      var cvo_creative = getURLParameter(url, 'cvo_creative');

      // Extract campaign_id
      var cvo_cid = getURLParameter(url, 'cvo_cid');

      // Extract campaign_name
      var cvo_campaign = getURLParameter(url, 'cvo_campaign');

      // Extract placement_id
      var cvo_pid = getURLParameter(url, 'cvo_pid');

      // Extract placement_name
      var cvo_placement = getURLParameter(url, 'cvo_placement');
    }

    return [{
      schema: "iglu:com.convertro/tracking_parameters/jsonschema/1-0-0",
      data: {
        source1: source_1,
        source2: source_2,
        source3: source_3,
        creativeId: cvo_crid,
        creativeName: cvo_creative,
        campaignId: cvo_cid,
        campaignName: cvo_campaign,
        placementId: cvo_pid,
        placementName: cvo_placement
      }
    }];

  } catch(err) {};

  return [];
}
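
One thing to be aware of in the function above: when the URL has no cvosrc parameter, cvosrc.split(".") throws, the catch block swallows the error, and no context is attached even if cvo_ parameters are present. A more defensive variant could look like the sketch below. It assumes that a context should be attached whenever at least one parameter is found, and that the schema accepts a context with absent parameters left out; both are assumptions, so check them against the schema before using it. The put helper is hypothetical.

```javascript
// Extract one query-string parameter, or null when it is absent, empty or malformed.
function getURLParameter(url, param) {
  var match = new RegExp('[?&]' + param + '=([^&;#]*)').exec(url);
  if (match === null || match[1] === '') {
    return null;
  }
  try {
    return decodeURIComponent(match[1].replace(/\+/g, '%20'));
  } catch (err) {
    return null; // malformed percent-encoding
  }
}

function process(event) {
  var url = event.getPage_url();
  if (url == null) {
    return [];
  }
  url = String(url);

  var data = {};
  var found = false;

  // Hypothetical helper: record a field only when a value was found
  function put(key, value) {
    if (value != null) {
      data[key] = value;
      found = true;
    }
  }

  // Split cvosrc only when it is present
  var cvosrc = getURLParameter(url, 'cvosrc');
  if (cvosrc !== null) {
    var sources = cvosrc.split('.');
    put('source1', sources[0]);
    put('source2', sources[1]);
    put('source3', sources[2]);
  }

  put('creativeId', getURLParameter(url, 'cvo_crid'));
  put('creativeName', getURLParameter(url, 'cvo_creative'));
  put('campaignId', getURLParameter(url, 'cvo_cid'));
  put('campaignName', getURLParameter(url, 'cvo_campaign'));
  put('placementId', getURLParameter(url, 'cvo_pid'));
  put('placementName', getURLParameter(url, 'cvo_placement'));

  // No Convertro parameters at all: attach nothing
  if (!found) {
    return [];
  }

  return [{
    schema: "iglu:com.convertro/tracking_parameters/jsonschema/1-0-0",
    data: data
  }];
}

// Local check with a mock event (the URL is made up for illustration)
var mock = { getPage_url: function () { return "http://example.com/?cvosrc=affiliate.bing.tshirts&cvo_cid=42"; } };
console.log(JSON.stringify(process(mock)));
```

Because this script differs from the original, its base64 encoding differs too.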

JSON config file

The enrichment needs to be enabled with a JSON config file. That file switches the enrichment on or off, and provides the JS script to be run (as a base64-encoded string).

This is what the JSON config file for the Convertro enrichment should look like:

{
	"schema": "iglu:com.snowplowanalytics.snowplow/javascript_script_config/jsonschema/1-0-0",
	"data": {
		"vendor": "com.snowplowanalytics.snowplow",
		"name": "javascript_script_config",
		"enabled": true,
		"parameters": {
			"script": "ZnVuY3Rpb24gcHJvY2VzcyhldmVudCkgewogIHRyeSB7CgogICAgLy8gR2V0IHRoZSB1cmwgZnJvbSB0aGUgcGFnZV91cmwgZmllbGQgb2YgdGhlIGV2ZW50CiAgICB2YXIgdXJsID0gZXZlbnQuZ2V0UGFnZV91cmwoKTsKCiAgICBpZiAodXJsICE9IG51bGwpIHsKICAgICAgLy8gRnVuY3Rpb24gdG8gZXh0cmFjdCBwYXJhbWV0ZXIgdmFsdWVzIGZyb20gdGhlIHVybAogICAgICBmdW5jdGlvbiBnZXRVUkxQYXJhbWV0ZXIodXJsLCBwYXJhbSkgewogICAgICAgIHJldHVybiBkZWNvZGVVUklDb21wb25lbnQoKG5ldyBSZWdFeHAoJ1s/fCZdJyArIHBhcmFtICsgJz0nICsgJyhbXiY7XSs/KSgmfCN8O3wkKScpLmV4ZWModXJsKSB8fCBbbnVsbCwgJyddKVsxXS5yZXBsYWNlKC9cKy9nLCAnJTIwJykpIHx8IG51bGw7CiAgICAgIH0KCiAgICAgIC8vIEV4dHJhY3QgY3Zvc3JjCiAgICAgIHZhciBjdm9zcmMgPSBnZXRVUkxQYXJhbWV0ZXIodXJsLCAnY3Zvc3JjJyk7CgogICAgICAgIC8vIEV4dHJhY3Qgc291cmNlXzEKICAgICAgICB2YXIgc291cmNlXzEgPSBjdm9zcmMuc3BsaXQoIi4iKVswXTsKCiAgICAgICAgLy8gRXh0cmFjdCBzb3VyY2VfMgogICAgICAgIHZhciBzb3VyY2VfMiA9IGN2b3NyYy5zcGxpdCgiLiIpWzFdOwoKICAgICAgICAvLyBFeHRyYWN0IHNvdXJjZV8zCiAgICAgICAgdmFyIHNvdXJjZV8zID0gY3Zvc3JjLnNwbGl0KCIuIilbMl07CgogICAgICAvLyBFeHRyYWN0IGNyZWF0aXZlX2lkCiAgICAgIHZhciBjdm9fY3JpZCA9IGdldFVSTFBhcmFtZXRlcih1cmwsICdjdm9fY3JpZCcpOwoKICAgICAgLy8gRXh0cmFjdCBjcmVhdGl2ZV9uYW1lCiAgICAgIHZhciBjdm9fY3JlYXRpdmUgPSBnZXRVUkxQYXJhbWV0ZXIodXJsLCAnY3ZvX2NyZWF0aXZlJyk7CgogICAgICAvLyBFeHRyYWN0IGNhbXBhaWduX2lkCiAgICAgIHZhciBjdm9fY2lkID0gZ2V0VVJMUGFyYW1ldGVyKHVybCwgJ2N2b19jaWQnKTsKCiAgICAgIC8vIEV4dHJhY3QgY2FtcGFpZ25fbmFtZQogICAgICB2YXIgY3ZvX2NhbXBhaWduID0gZ2V0VVJMUGFyYW1ldGVyKHVybCwgJ2N2b19jYW1wYWlnbicpOwoKICAgICAgLy8gRXh0cmFjdCBwbGFjZW1lbnRfaWQKICAgICAgdmFyIGN2b19waWQgPSBnZXRVUkxQYXJhbWV0ZXIodXJsLCAnY3ZvX3BpZCcpOwoKICAgICAgLy8gRXh0cmFjdCBwbGFjZW1lbnRfbmFtZQogICAgICB2YXIgY3ZvX3BsYWNlbWVudCA9IGdldFVSTFBhcmFtZXRlcih1cmwsICdjdm9fcGxhY2VtZW50Jyk7CiAgICB9CgogICAgcmV0dXJuIFt7CiAgICAgIHNjaGVtYTogImlnbHU6Y29tLmNvbnZlcnRyby90cmFja2luZ19wYXJhbWV0ZXJzL2pzb25zY2hlbWEvMS0wLTAiLAogICAgICBkYXRhOiB7CiAgICAgICAgc291cmNlMTogc291cmNlXzEsCiAgICAgICAgc291cmNlMjogc291cmNlXzIsCiAgICAgICAgc291cmNlMzogc291cmNlXzMsCiAgICAgICAgY3JlYXRpdmVJZDogY3ZvX2NyaWQsCiAgICAgICAgY3JlYXRpdmVOYW1lOiBjdm9fY3JlYXRpdmUsCiAgICAgICAgY2FtcGFpZ25JZDogY3ZvX2NpZCwKICAgICAgICBjYW1wYWlnbk5hbWU6IGN2b19jYW1wYWlnbiwKICAgICAgICBwbGFjZW1lbnRJZDogY3ZvX3BpZCwKICAgICAgICBwbGFjZW1lbnROYW1lOiBjdm9fcGxhY2VtZW50CiAgICAgIH0KICAgIH1dOwoKICB9IGNhdGNoKGVycikge307CgogIHJldHVybiBbXTsKfQ=="
		}
	}
}
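
If you change the script, the value of the script field needs to be regenerated. A minimal sketch of doing that with Node.js (an assumption; any base64 encoder will do) is shown below. The short script string is a placeholder standing in for your own process(event) source.

```javascript
// Node.js sketch: base64-encode an enrichment script for the "script" field.
// Placeholder script, standing in for the real process(event) source.
var script = 'function process(event) {\n  return [];\n}';

// Encode the UTF-8 source as base64
var encoded = Buffer.from(script, 'utf8').toString('base64');

// Decode it again to confirm the round trip
var decoded = Buffer.from(encoded, 'base64').toString('utf8');

console.log(encoded);
console.log(decoded === script);
```

In practice the script would more likely be read from a file than written inline; the encoding step stays the same.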

Redshift

Once the enrichment is live, it will populate the atomic.com_convertro_tracking_parameters_1 table in Redshift. Make sure to deploy the DDL for that table in Redshift before switching on the enrichment. Otherwise, the pipeline will try to write to that table and fail when it can’t find it.

Hi @dilyan. Thanks for your introduction. I ran into a similar problem recently. I want to extract parameters from the URL, do some more complicated processing, and finally overwrite the existing mkt_medium, mkt_source, mkt_term… fields.
But I am not sure how to return the result. I see that you return a new schema (iglu:com.convertro/tracking_parameters/jsonschema/1-0-0), but I cannot find a schema for these mkt_ fields. Is there a way to do this? Or must I create a new schema for the returned data? Thanks!

Hi @phxtorise, yes, you’ll need to create a new schema that describes the data that will be added to the event. This will be added to the derived_contexts field of the enriched event. If you’re loading into Redshift, it will be split into its own table; and if you’re loading into BigQuery, Snowflake or Databricks, it will be in its own column.
