Detecting abandoned shopping carts with Snowplow

magaton · April 10, 2017, 10:08am

Hello,
What would be the recommended approach to tracking shopping cart abandonment with the JavaScript tracker?

Thanks,
Milan

zivbaram · April 12, 2017, 8:30am

Hi Magaton, can you provide a bit more information? Is it for your own shop? What platform are you using for the shop?
Thanks

Ziv

magaton · April 12, 2017, 8:34am

Hello, it is for our clients’ e-shops. We are now evaluating options for the clickstream part of our reco engine product.
The e-shops are very different: Magento, Hybris, WooCommerce, OFBiz, so we are looking for a very general and simple way to do that.

magaton · April 18, 2017, 2:16pm

I’ve come across this series of blog posts:
http://snowplowanalytics.com/blog/2014/07/31/using-graph-databases-to-perform-pathing-analysis-initial-experimentation-with-neo4j/ which opened up a completely new perspective for me.

Inserting page pings as event nodes for the checkout page seems like a straightforward way to answer the question in the subject.

Before I do a POC with Snowplow, Kafka and Neo4j, can somebody tell me why this approach is not mainstream, and whether it is something that makes sense to the Snowplow dev community?

alex · April 18, 2017, 7:55pm

Hey @magaton - using a graph database for abandoned shopping cart detection is a really interesting idea.

However, given the fairly simple rules around defining and detecting abandoned carts, you can probably get away with something simpler. My Event Streams in Action book has an example abandoned shopping cart detector written for Kafka using Samza. It doesn’t process Snowplow events but you could certainly adapt it to work with Snowplow. The code is here:

https://github.com/alexanderdean/Unified-Log-Processing/tree/master/ch04/4.4/nile

magaton · April 19, 2017, 10:43am

Thanks @alex, I think I understand. I hope that by “interesting” you don’t mean a crazy idea.
I usually think in Cypher query terms, and once the data is in a graph everything is easy since you can ask anything, but getting there and keeping it to a reasonable size is a different topic.
Do you see any particular reason to use Samza instead of Kafka Streams?
We have Kafka, Neo4j and ES in our infrastructure, which is already hard to manage, so we are not really keen to add a new beast.

alex · April 19, 2017, 11:08am

Hi @magaton - no, don’t worry, by “interesting” I just meant interesting, not crazy.

This is the classic analytics-on-write versus analytics-on-read debate:

Analytics-on-read

Do all the pathing in a relatively general way in Neo4j
Ask any question you like
Experiment with different abandoned cart definitions
Some challenges around latency and scalability
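To make the analytics-on-read idea concrete, here is a minimal Python sketch (an illustration, not code from this thread or from the linked blog posts). It assumes events are already stored as simple tuples and that "abandoned" means a cart with no checkout and no add-to-cart activity for `idle_seconds`. The event kinds `add_to_cart` and `checkout`, the tuple layout, and the function name are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical stored event: (user_id, kind, timestamp in seconds).
StoredEvent = Tuple[str, str, int]


def abandoned_users(events: Iterable[StoredEvent], idle_seconds: int, as_of: int) -> Set[str]:
    """Recompute abandoned carts from stored events for a given definition.

    Assumed definition: the user's latest add_to_cart is at least
    `idle_seconds` old at time `as_of` and no checkout followed it.
    """
    last_add: Dict[str, int] = {}
    for user_id, kind, ts in sorted(events, key=lambda e: e[2]):
        if ts > as_of:
            continue  # ignore events after the point in time we are asking about
        if kind == "add_to_cart":
            last_add[user_id] = ts
        elif kind == "checkout":
            last_add.pop(user_id, None)  # the cart was converted, not abandoned
    return {u for u, ts in last_add.items() if as_of - ts >= idle_seconds}
```

Because the raw events are kept, the same data can be re-queried with a different `idle_seconds` to try out another definition, which is the flexibility this approach trades latency for.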

Analytics-on-write

Decide on an abandoned cart definition
Write a Samza or Kafka Streams job to run the algorithm in-stream (an AWS user would use Lambda + DynamoDB)
Much less flexible than analytics-on-read
But low latency and super-scalable
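As a rough illustration of the analytics-on-write side, here is a minimal Python sketch of an in-stream detector. It is not the Samza code from the book and does not use any Kafka or Snowplow API. It assumes a fixed definition (30 minutes without activity after an add-to-cart, with no checkout), and the event kinds, field names, and class name are hypothetical. In a real Samza or Kafka Streams job the dictionary would live in the framework's state store and `sweep` would run on a periodic timer.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ABANDON_AFTER_SECONDS = 30 * 60  # assumed definition, fixed up front


@dataclass
class Event:
    user_id: str
    kind: str       # hypothetical kinds: "add_to_cart", "checkout"
    timestamp: int  # seconds since epoch


@dataclass
class AbandonedCartDetector:
    # Last add-to-cart time per user with an open cart.
    open_carts: Dict[str, int] = field(default_factory=dict)

    def process(self, event: Event) -> None:
        """Update state for one incoming event."""
        if event.kind == "add_to_cart":
            self.open_carts[event.user_id] = event.timestamp
        elif event.kind == "checkout":
            self.open_carts.pop(event.user_id, None)

    def sweep(self, now: int) -> List[str]:
        """Return users whose carts are idle past the threshold and forget them."""
        abandoned = [u for u, ts in self.open_carts.items()
                     if now - ts >= ABANDON_AFTER_SECONDS]
        for user_id in abandoned:
            del self.open_carts[user_id]
        return abandoned
```

The state is small (one timestamp per open cart), which is why this style scales well, but changing the definition means changing and redeploying the job.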

alexmc6 · July 26, 2017, 4:51pm

@magaton - would you be willing to discuss this by email? I think we have similar needs.
alexmc6 on github or alex.mclintock at gmail dot com

Thanks

farhah · January 19, 2018, 10:02am

Have you got it working? Do you mind sharing?
