Data modeling in real time

Thank you for your quick response!

I can start by giving you a bit of detail on my use case. I’m looking to collect web analytics and display them within my product. Specifically, I have content creators who create web pages and publish them. Visitors open these pages, and we collect the time they spend on each page and on each section of a page (using pings modeled on activity tracking, cf. this thread). I then want to show content creators graphs of the time each visitor spent on each page/section.
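To make the calculation concrete, here is a minimal sketch of how I imagine deriving time spent from pings. The field names (`event_name`, `visitor_id`, `page_url`, `timestamp`) and the 10-second heartbeat are assumptions for illustration only, not the actual event schema or my real tracker configuration.

```python
from collections import defaultdict

HEARTBEAT_SECONDS = 10  # assumed ping interval; would need to match the tracker's setting


def time_spent(events, heartbeat=HEARTBEAT_SECONDS):
    """Estimate seconds spent per (visitor, page) by counting distinct ping windows."""
    buckets = defaultdict(set)
    for e in events:
        if e["event_name"] != "page_ping":
            continue
        # Pings that fall into the same heartbeat window are counted once.
        window = int(e["timestamp"] // heartbeat)
        buckets[(e["visitor_id"], e["page_url"])].add(window)
    return {key: len(windows) * heartbeat for key, windows in buckets.items()}


# Hypothetical events; timestamps are seconds since the page was opened.
events = [
    {"event_name": "page_view", "visitor_id": "v1", "page_url": "/a", "timestamp": 0},
    {"event_name": "page_ping", "visitor_id": "v1", "page_url": "/a", "timestamp": 10},
    {"event_name": "page_ping", "visitor_id": "v1", "page_url": "/a", "timestamp": 20},
    {"event_name": "page_ping", "visitor_id": "v1", "page_url": "/a", "timestamp": 21},
    {"event_name": "page_ping", "visitor_id": "v2", "page_url": "/a", "timestamp": 10},
]
print(time_spent(events))
```

Per-section time would presumably work the same way, with a section identifier added to the grouping key.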

Indeed, I don’t think my use case really requires session data. I admit that I used Snowplow’s dbt package because I understood it to be the recommended approach. I also assumed, among other things, that the package takes care of edge cases such as event de-duplication.
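As an example of what I would otherwise have to handle by hand, here is a rough sketch of de-duplication as I understand it: keep one row per `event_id`, preferring the earliest `collector_tstamp`. This is my own assumption about a reasonable rule, not a description of what the dbt package actually does.

```python
def deduplicate(events):
    """Keep one row per event_id, preferring the earliest collector_tstamp."""
    first_seen = {}
    for e in events:
        kept = first_seen.get(e["event_id"])
        if kept is None or e["collector_tstamp"] < kept["collector_tstamp"]:
            first_seen[e["event_id"]] = e
    return list(first_seen.values())


# Hypothetical rows; event "a" was delivered twice.
rows = [
    {"event_id": "a", "collector_tstamp": 105},
    {"event_id": "b", "collector_tstamp": 101},
    {"event_id": "a", "collector_tstamp": 100},
]
unique_rows = deduplicate(rows)
print(unique_rows)
```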

So, would you recommend that I skip the dbt package and use the event table directly? If so, what are the edge cases handled by the package that I should be careful to manage manually?