Customise column names for Event and Entities

I think there are definitely some advantages to customisation. However, having this standard, which is applied globally across all events and entities, lets us build robust data models (like the recently released unified model) that rely on these conventions being in place. If we allowed more flexible renaming in the destination, we would need to add complexity to each data model, and in turn for each data warehouse, to account for variability in these column names.

Other changes, like storing the version as part of the column name rather than within the column, often allow us to query warehouses more efficiently. In BigQuery, for example, the user can scan fewer bytes by selecting only the values they need. Snowflake is similar: you reduce the amount of computation required by reading fewer columns and by not needing to reference a field inside the VARIANT. Many query engines either do not calculate statistics on properties within these columns or provide only coarse, block-level statistics, so the query planner isn’t able to optimise scans as much as it otherwise could.
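
To make the point about fixed conventions concrete, here is a rough sketch of how a data model could derive a column name from a schema reference without any per-customer configuration. The `contexts_` / `unstruct_event_` prefixes, the snake-casing rules, and the choice to keep only the major version are assumptions for illustration, and `column_name` is a hypothetical helper, not part of any loader. The exact format your loader produces may differ.

```python
import re

# Assumed shape of a schema reference: iglu:<vendor>/<name>/jsonschema/<model>-<revision>-<addition>
SCHEMA_URI = re.compile(
    r"^iglu:(?P<vendor>[^/]+)/(?P<name>[^/]+)/jsonschema/"
    r"(?P<model>\d+)-(?P<revision>\d+)-(?P<addition>\d+)$"
)

# Assumed prefixes for the two kinds of self-describing data.
PREFIXES = {"entity": "contexts", "event": "unstruct_event"}


def _snake(value: str) -> str:
    """Convert camelCase or dotted names to lower snake_case."""
    value = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", value)
    return re.sub(r"[^A-Za-z0-9]+", "_", value).lower()


def column_name(schema_uri: str, kind: str) -> str:
    """Hypothetical helper: build a deterministic column name with the
    major version in the name rather than inside the column's data."""
    match = SCHEMA_URI.match(schema_uri)
    if match is None:
        raise ValueError(f"not a recognised schema reference: {schema_uri!r}")
    if kind not in PREFIXES:
        raise ValueError(f"kind must be one of {sorted(PREFIXES)}")
    vendor = _snake(match["vendor"])
    name = _snake(match["name"])
    return f"{PREFIXES[kind]}_{vendor}_{name}_{match['model']}"


if __name__ == "__main__":
    # com.acme and checkout_step are made-up example values.
    print(column_name("iglu:com.acme/checkout_step/jsonschema/1-0-2", "entity"))
```

Because the name is a pure function of the schema reference, a model can select exactly the versioned column it needs in every warehouse, which is what makes the narrower scans described above possible.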
