Thanks, Ihor, that was helpful. Using --target works for me to load data into Redshift. The only thing is that I spent a couple of hours rerunning EMR, and it kept failing with a missing table exception, so I had to create a bunch of tables.
The list is here:
https://github.com/snowplow/iglu-central/tree/a65bd9574c3bb34f1699afda5a22c3e717df3f78/sql/com.snowplowanalytics.monitoring.batch
Also, the problem with the long-running EMR job is solved. I chose a stronger machine for the cluster, and now it takes around one minute instead of ~30 min.
Questions
How do I know which tables I should create? I understand it depends on what is being tracked, but is there documentation on which tables are needed for which activity?
Another question: how do I properly track link clicks with enableLinkClickTracking? (Should I create a new post for this question?)
;(function(p,l,o,w,i,n,g){if(!p[i]){p.GlobalSnowplowNamespace=p.GlobalSnowplowNamespace||[];
p.GlobalSnowplowNamespace.push(i);p[i]=function(){(p[i].q=p[i].q||[]).push(arguments)
};p[i].q=p[i].q||[];n=l.createElement(o);g=l.getElementsByTagName(o)[0];n.async=1;
n.src=w;g.parentNode.insertBefore(n,g)}}(window,document,"script","//**********.cloudfront.net/2.9.2/sp.js","snowplow"));
window.snowplow('newTracker', 'cf', '*********.cloudfront.net', { // Initialise a tracker - point to cloudfront that serves S3 bucket w/ pixel
  appId: 'web',
  cookieDomain: null,
  gaCookies: true
});
window.snowplow('enableActivityTracking', 30, 10);
window.snowplow('enableLinkClickTracking', null, true, true);
window.snowplow('trackPageView');
In Redshift, all entries are page_ping. How should I configure the tracker above to track page views and link clicks?
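To rule out a problem with the order of the calls themselves, here is a minimal sketch of what I understand the loader stub above to do before sp.js loads: it just queues each call with its arguments. makeStub and the collector host are hypothetical stand-ins I made up for illustration, not part of Snowplow, and nothing is sent anywhere.

```javascript
// Hypothetical stand-in for the loader stub: it only records calls,
// the way p[i].q collects arguments until sp.js has loaded.
function makeStub() {
  const stub = function () {
    stub.q.push(Array.prototype.slice.call(arguments));
  };
  stub.q = [];
  return stub;
}

const snowplow = makeStub();

// Same calls as in my page; the collector host is a placeholder.
snowplow('newTracker', 'cf', 'collector.example.invalid', { appId: 'web' });
snowplow('enableActivityTracking', 30, 10);
snowplow('enableLinkClickTracking', null, true, true);
snowplow('trackPageView');

// List the queued method names in the order they would be replayed.
const queuedMethods = snowplow.q.map(function (call) { return call[0]; });
console.log(queuedMethods.join(', '));
// prints "newTracker, enableActivityTracking, enableLinkClickTracking, trackPageView"
```

So the page view and link click calls are queued after the tracker is created, which is why I would expect to see more than page_ping events.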
Thanks
Oleg.