Snowflake loader error

Hi,

We’re trying to get the snowplow / snowflake integration working. We’re at the point were where able to ingest events and they seem to be transformed / enriched correctly, I see the files in the s3 bucket. The snowflake loader terraform ran successfully, but we’re not seeing the events loaded into the snowflake events table.

In Snowflake I do see the files in the SP_TRANSFORMED_STAGE

The only error I’m seeing is in the sp-snowflake-loader-server logs:


2023-07-21T10:04:14.043+02:00	INFO Loader: Total 85 messages received, 0 loaded; Loader is in Idle state; Last state update at 2023-07-21 08:00:30.592

2023-07-21T10:09:14.044+02:00	INFO Loader: Total 85 messages received, 0 loaded; Loader is in Idle state; Last state update at 2023-07-21 08:00:30.592

2023-07-21T10:10:00.767+02:00	INFO DataDiscovery: Received a new message

2023-07-21T10:10:00.767+02:00	INFO DataDiscovery: Total 86 messages received, 0 loaded

2023-07-21T10:10:00.768+02:00	INFO DataDiscovery: New data discovery at run=2023-07-21-08-00-00 with following shredded types:

2023-07-21T10:10:00.768+02:00	* iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-*-* WIDEROW

2023-07-21T10:10:00.768+02:00	* iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-*-* WIDEROW

2023-07-21T10:10:00.768+02:00	* iglu:nl.basjes/yauaa_context/jsonschema/1-*-* WIDEROW

2023-07-21T10:10:30.784+02:00	ERROR Loader: Loading of s3://xxxxxxx/website/transformed/good/run=2023-07-21-08-00-00/ has failed. Not adding into retry queue. HikariPool-1 - Connection is not available, request timed out after 30000ms.

I’m not sure what this error means or how to debug the issue further. Any help would be appreciated.

Maybe some additional info: Snowplow is in AWS. Our Snowflake runs in Azure.

Further progress. Had to explicitly add the IP of the NAT Gateway to Snowflake’s security policies. Now getting the following error:

2023-07-21T14:00:00.786+02:00	INFO DataDiscovery: Total 109 messages received, 0 loaded

2023-07-21T14:00:00.789+02:00	INFO DataDiscovery: New data discovery at run=2023-07-21-11-50-00 with following shredded types:

2023-07-21T14:00:00.789+02:00	* iglu:com.snowplowanalytics.snowplow/web_page/jsonschema/1-*-* WIDEROW

2023-07-21T14:00:00.789+02:00	* iglu:com.snowplowanalytics.snowplow/ua_parser_context/jsonschema/1-*-* WIDEROW

2023-07-21T14:00:00.789+02:00	* iglu:nl.basjes/yauaa_context/jsonschema/1-*-* WIDEROW

2023-07-21T14:00:06.248+02:00

Copy
List(IP_NETSPEED, NAME_TRACKER, EVENT_ID, PAGE_TITLE, REFR_TERM, MKT_CLICKID, TI_PRICE_BASE, DVCE_SCREENHEIGHT, TR_SHIPPING, IP_ORGANIZATION, BR_COLORDEPTH, OS_MANUFACTURER, EVENT_VENDOR, SE_CATEGORY, EVENT_VERSION, PAGE_URLFRAGMENT, TI_NAME, PP_YOFFSET_MAX, SE_LABEL, PP_XOFFSET_MAX, TXN_ID, EVENT, EVENT_FINGERPRINT, DOMAIN_SESSIONID, DERIVED_TSTAMP, BR_FEATURES_WINDOWSMEDIA, BR_FEATURES_PDF, TR_SHIPPING_BASE, PP_XOFFSET_MIN, TRUE_TSTAMP, BR_FEATURES_GEARS, MKT_MEDIUM, DOC_CHARSET, REFR_URLPATH, BR_FEATURES_DIRECTOR, USER_ID, REFR_SOURCE, DOMAIN_SESSIONIDX, BR_VIEWHEIGHT, TI_SKU, TR_COUNTRY, TI_PRICE, BR_NAME, TR_STATE, DVCE_TYPE, DVCE_SCREENWIDTH, DVCE_CREATED_TSTAMP, PAGE_URLPATH, MKT_TERM, GEO_REGION_NAME, MKT_CONTENT, REFR_URLFRAGMENT, USER_IPADDRESS, DVCE_ISMOBILE, ETL_TSTAMP, TR_ORDERID, BR_LANG, PLATFORM, TI_ORDERID, MKT_CAMPAIGN, TR_TAX, REFR_DVCE_TSTAMP, BASE_CURRENCY, BR_FEATURES_SILVERLIGHT, TR_TOTAL, USER_FINGERPRINT, REFR_MEDIUM, IP_DOMAIN, PAGE_URLSCHEME, REFR_URLPORT, EVENT_NAME, BR_FAMILY, MKT_SOURCE, BR_COOKIES, V_ETL, BR_VERSION, IP_ISP, NETWORK_USERID, BR_TYPE, GEO_ZIPCODE, DOC_HEIGHT, SE_VALUE, BR_FEATURES_REALPLAYER, TR_AFFILIATION, BR_RENDERENGINE, TR_TAX_BASE, OS_NAME, TI_CURRENCY, TR_CURRENCY, BR_VIEWWIDTH, V_TRACKER, GEO_LONGITUDE, MKT_NETWORK, BR_FEATURES_JAVA, GEO_CITY, APP_ID, REFR_DOMAIN_USERID, SE_ACTION, REFR_URLHOST, PAGE_URLHOST, GEO_LATITUDE, GEO_COUNTRY, ETL_TAGS, BR_FEATURES_FLASH, PAGE_URLQUERY, REFR_URLQUERY, PAGE_URL, DOC_WIDTH, OS_TIMEZONE, COLLECTOR_TSTAMP, DOMAIN_USERID, PP_YOFFSET_MIN, USERAGENT, SE_PROPERTY, LOAD_TSTAMP, TI_QUANTITY, OS_FAMILY, GEO_REGION, V_COLLECTOR, PAGE_URLPORT, DVCE_SENT_TSTAMP, TI_CATEGORY, BR_FEATURES_QUICKTIME, TR_TOTAL_BASE, GEO_TIMEZONE, EVENT_FORMAT, TR_CITY, PAGE_REFERRER, REFR_URLSCHEME)
List(IP_NETSPEED, NAME_TRACKER, EVENT_ID, PAGE_TITLE, REFR_TERM, MKT_CLICKID, TI_PRICE_BASE, DVCE_SCREENHEIGHT, TR_SHIPPING, IP_ORGANIZATION, BR_COLORDEPTH, OS_MANUFACTURER, EVENT_VENDOR, SE_CATEGORY, EVENT_VERSION, PAGE_URLFRAGMENT, TI_NAME, PP_YOFFSET_MAX, SE_LABEL, PP_XOFFSET_MAX, TXN_ID, EVENT, EVENT_FINGERPRINT, DOMAIN_SESSIONID, DERIVED_TSTAMP, BR_FEATURES_WINDOWSMEDIA, BR_FEATURES_PDF, TR_SHIPPING_BASE, PP_XOFFSET_MIN, TRUE_TSTAMP, BR_FEATURES_GEARS, MKT_MEDIUM, DOC_CHARSET, REFR_URLPATH, BR_FEATURES_DIRECTOR, USER_ID, REFR_SOURCE, DOMAIN_SESSIONIDX, BR_VIEWHEIGHT, TI_SKU, TR_COUNTRY, TI_PRICE, BR_NAME, TR_STATE, DVCE_TYPE, DVCE_SCREENWIDTH, DVCE_CREATED_TSTAMP, PAGE_URLPATH, MKT_TERM, GEO_REGION_NAME, MKT_CONTENT, REFR_URLFRAGMENT, USER_IPADDRESS, DVCE_ISMOBILE, ETL_TSTAMP, TR_ORDERID, BR_LANG, PLATFORM, TI_ORDERID, MKT_CAMPAIGN, TR_TAX, REFR_DVCE_TSTAMP, BASE_CURRENCY, BR_FEATURES_SILVERLIGHT, TR_TOTAL, USER_FINGERPRINT, REFR_MEDIUM, IP_DOMAIN, PAGE_URLSCHEME, REFR_URLPORT, EVENT_NAME, BR_FAMILY, MKT_SOURCE, BR_COOKIES, V_ETL, BR_VERSION, IP_ISP, NETWORK_USERID, BR_TYPE, GEO_ZIPCODE, DOC_HEIGHT, SE_VALUE, BR_FEATURES_REALPLAYER, TR_AFFILIATION, BR_RENDERENGINE, TR_TAX_BASE, OS_NAME, TI_CURRENCY, TR_CURRENCY, BR_VIEWWIDTH, V_TRACKER, GEO_LONGITUDE, MKT_NETWORK, BR_FEATURES_JAVA, GEO_CITY, APP_ID, REFR_DOMAIN_USERID, SE_ACTION, REFR_URLHOST, PAGE_URLHOST, GEO_LATITUDE, GEO_COUNTRY, ETL_TAGS, BR_FEATURES_FLASH, PAGE_URLQUERY, REFR_URLQUERY, PAGE_URL, DOC_WIDTH, OS_TIMEZONE, COLLECTOR_TSTAMP, DOMAIN_USERID, PP_YOFFSET_MIN, USERAGENT, SE_PROPERTY, LOAD_TSTAMP, TI_QUANTITY, OS_FAMILY, GEO_REGION, V_COLLECTOR, PAGE_URLPORT, DVCE_SENT_TSTAMP, TI_CATEGORY, BR_FEATURES_QUICKTIME, TR_TOTAL_BASE, GEO_TIMEZONE, EVENT_FORMAT, TR_CITY, PAGE_REFERRER, REFR_URLSCHEME)

2023-07-21T14:00:06.248+02:00

Copy
contexts_com_snowplowanalytics_snowplow_web_page_1
contexts_com_snowplowanalytics_snowplow_web_page_1

2023-07-21T14:00:06.249+02:00

Copy
True for Info(s3://xxxxx/website/transformed/good/run=2023-07-21-11-50-00/,com.snowplowanalytics.snowplow,web_page,1,Context)
True for Info(s3://xxxxx/website/transformed/good/run=2023-07-21-11-50-00/,com.snowplowanalytics.snowplow,web_page,1,Context)

2023-07-21T14:00:06.253+02:00

Copy
List(IP_NETSPEED, NAME_TRACKER, EVENT_ID, PAGE_TITLE, REFR_TERM, MKT_CLICKID, TI_PRICE_BASE, DVCE_SCREENHEIGHT, TR_SHIPPING, IP_ORGANIZATION, BR_COLORDEPTH, OS_MANUFACTURER, EVENT_VENDOR, SE_CATEGORY, EVENT_VERSION, PAGE_URLFRAGMENT, TI_NAME, PP_YOFFSET_MAX, SE_LABEL, PP_XOFFSET_MAX, TXN_ID, EVENT, EVENT_FINGERPRINT, DOMAIN_SESSIONID, DERIVED_TSTAMP, BR_FEATURES_WINDOWSMEDIA, BR_FEATURES_PDF, TR_SHIPPING_BASE, PP_XOFFSET_MIN, TRUE_TSTAMP, BR_FEATURES_GEARS, MKT_MEDIUM, DOC_CHARSET, REFR_URLPATH, BR_FEATURES_DIRECTOR, USER_ID, REFR_SOURCE, DOMAIN_SESSIONIDX, BR_VIEWHEIGHT, TI_SKU, TR_COUNTRY, TI_PRICE, BR_NAME, TR_STATE, DVCE_TYPE, DVCE_SCREENWIDTH, DVCE_CREATED_TSTAMP, PAGE_URLPATH, MKT_TERM, GEO_REGION_NAME, MKT_CONTENT, REFR_URLFRAGMENT, USER_IPADDRESS, DVCE_ISMOBILE, ETL_TSTAMP, TR_ORDERID, BR_LANG, PLATFORM, TI_ORDERID, MKT_CAMPAIGN, TR_TAX, REFR_DVCE_TSTAMP, BASE_CURRENCY, BR_FEATURES_SILVERLIGHT, TR_TOTAL, USER_FINGERPRINT, REFR_MEDIUM, IP_DOMAIN, PAGE_URLSCHEME, REFR_URLPORT, EVENT_NAME, BR_FAMILY, MKT_SOURCE, BR_COOKIES, V_ETL, BR_VERSION, IP_ISP, NETWORK_USERID, BR_TYPE, GEO_ZIPCODE, DOC_HEIGHT, SE_VALUE, BR_FEATURES_REALPLAYER, TR_AFFILIATION, BR_RENDERENGINE, TR_TAX_BASE, OS_NAME, TI_CURRENCY, TR_CURRENCY, BR_VIEWWIDTH, V_TRACKER, GEO_LONGITUDE, MKT_NETWORK, BR_FEATURES_JAVA, GEO_CITY, APP_ID, REFR_DOMAIN_USERID, SE_ACTION, REFR_URLHOST, PAGE_URLHOST, GEO_LATITUDE, GEO_COUNTRY, ETL_TAGS, BR_FEATURES_FLASH, PAGE_URLQUERY, REFR_URLQUERY, PAGE_URL, DOC_WIDTH, OS_TIMEZONE, COLLECTOR_TSTAMP, DOMAIN_USERID, PP_YOFFSET_MIN, USERAGENT, SE_PROPERTY, LOAD_TSTAMP, TI_QUANTITY, OS_FAMILY, GEO_REGION, V_COLLECTOR, PAGE_URLPORT, DVCE_SENT_TSTAMP, TI_CATEGORY, BR_FEATURES_QUICKTIME, TR_TOTAL_BASE, GEO_TIMEZONE, EVENT_FORMAT, TR_CITY, PAGE_REFERRER, REFR_URLSCHEME)
List(IP_NETSPEED, NAME_TRACKER, EVENT_ID, PAGE_TITLE, REFR_TERM, MKT_CLICKID, TI_PRICE_BASE, DVCE_SCREENHEIGHT, TR_SHIPPING, IP_ORGANIZATION, BR_COLORDEPTH, OS_MANUFACTURER, EVENT_VENDOR, SE_CATEGORY, EVENT_VERSION, PAGE_URLFRAGMENT, TI_NAME, PP_YOFFSET_MAX, SE_LABEL, PP_XOFFSET_MAX, TXN_ID, EVENT, EVENT_FINGERPRINT, DOMAIN_SESSIONID, DERIVED_TSTAMP, BR_FEATURES_WINDOWSMEDIA, BR_FEATURES_PDF, TR_SHIPPING_BASE, PP_XOFFSET_MIN, TRUE_TSTAMP, BR_FEATURES_GEARS, MKT_MEDIUM, DOC_CHARSET, REFR_URLPATH, BR_FEATURES_DIRECTOR, USER_ID, REFR_SOURCE, DOMAIN_SESSIONIDX, BR_VIEWHEIGHT, TI_SKU, TR_COUNTRY, TI_PRICE, BR_NAME, TR_STATE, DVCE_TYPE, DVCE_SCREENWIDTH, DVCE_CREATED_TSTAMP, PAGE_URLPATH, MKT_TERM, GEO_REGION_NAME, MKT_CONTENT, REFR_URLFRAGMENT, USER_IPADDRESS, DVCE_ISMOBILE, ETL_TSTAMP, TR_ORDERID, BR_LANG, PLATFORM, TI_ORDERID, MKT_CAMPAIGN, TR_TAX, REFR_DVCE_TSTAMP, BASE_CURRENCY, BR_FEATURES_SILVERLIGHT, TR_TOTAL, USER_FINGERPRINT, REFR_MEDIUM, IP_DOMAIN, PAGE_URLSCHEME, REFR_URLPORT, EVENT_NAME, BR_FAMILY, MKT_SOURCE, BR_COOKIES, V_ETL, BR_VERSION, IP_ISP, NETWORK_USERID, BR_TYPE, GEO_ZIPCODE, DOC_HEIGHT, SE_VALUE, BR_FEATURES_REALPLAYER, TR_AFFILIATION, BR_RENDERENGINE, TR_TAX_BASE, OS_NAME, TI_CURRENCY, TR_CURRENCY, BR_VIEWWIDTH, V_TRACKER, GEO_LONGITUDE, MKT_NETWORK, BR_FEATURES_JAVA, GEO_CITY, APP_ID, REFR_DOMAIN_USERID, SE_ACTION, REFR_URLHOST, PAGE_URLHOST, GEO_LATITUDE, GEO_COUNTRY, ETL_TAGS, BR_FEATURES_FLASH, PAGE_URLQUERY, REFR_URLQUERY, PAGE_URL, DOC_WIDTH, OS_TIMEZONE, COLLECTOR_TSTAMP, DOMAIN_USERID, PP_YOFFSET_MIN, USERAGENT, SE_PROPERTY, LOAD_TSTAMP, TI_QUANTITY, OS_FAMILY, GEO_REGION, V_COLLECTOR, PAGE_URLPORT, DVCE_SENT_TSTAMP, TI_CATEGORY, BR_FEATURES_QUICKTIME, TR_TOTAL_BASE, GEO_TIMEZONE, EVENT_FORMAT, TR_CITY, PAGE_REFERRER, REFR_URLSCHEME)

2023-07-21T14:00:06.253+02:00	contexts_com_snowplowanalytics_snowplow_ua_parser_context_1

2023-07-21T14:00:06.253+02:00	True for Info(s3://xxxxx/website/transformed/good/run=2023-07-21-11-50-00/,com.snowplowanalytics.snowplow,ua_parser_context,1,Context)

2023-07-21T14:00:06.254+02:00	List(IP_NETSPEED, NAME_TRACKER, EVENT_ID, PAGE_TITLE, REFR_TERM, MKT_CLICKID, TI_PRICE_BASE, DVCE_SCREENHEIGHT, TR_SHIPPING, IP_ORGANIZATION, BR_COLORDEPTH, OS_MANUFACTURER, EVENT_VENDOR, SE_CATEGORY, EVENT_VERSION, PAGE_URLFRAGMENT, TI_NAME, PP_YOFFSET_MAX, SE_LABEL, PP_XOFFSET_MAX, TXN_ID, EVENT, EVENT_FINGERPRINT, DOMAIN_SESSIONID, DERIVED_TSTAMP, BR_FEATURES_WINDOWSMEDIA, BR_FEATURES_PDF, TR_SHIPPING_BASE, PP_XOFFSET_MIN, TRUE_TSTAMP, BR_FEATURES_GEARS, MKT_MEDIUM, DOC_CHARSET, REFR_URLPATH, BR_FEATURES_DIRECTOR, USER_ID, REFR_SOURCE, DOMAIN_SESSIONIDX, BR_VIEWHEIGHT, TI_SKU, TR_COUNTRY, TI_PRICE, BR_NAME, TR_STATE, DVCE_TYPE, DVCE_SCREENWIDTH, DVCE_CREATED_TSTAMP, PAGE_URLPATH, MKT_TERM, GEO_REGION_NAME, MKT_CONTENT, REFR_URLFRAGMENT, USER_IPADDRESS, DVCE_ISMOBILE, ETL_TSTAMP, TR_ORDERID, BR_LANG, PLATFORM, TI_ORDERID, MKT_CAMPAIGN, TR_TAX, REFR_DVCE_TSTAMP, BASE_CURRENCY, BR_FEATURES_SILVERLIGHT, TR_TOTAL, USER_FINGERPRINT, REFR_MEDIUM, IP_DOMAIN, PAGE_URLSCHEME, REFR_URLPORT, EVENT_NAME, BR_FAMILY, MKT_SOURCE, BR_COOKIES, V_ETL, BR_VERSION, IP_ISP, NETWORK_USERID, BR_TYPE, GEO_ZIPCODE, DOC_HEIGHT, SE_VALUE, BR_FEATURES_REALPLAYER, TR_AFFILIATION, BR_RENDERENGINE, TR_TAX_BASE, OS_NAME, TI_CURRENCY, TR_CURRENCY, BR_VIEWWIDTH, V_TRACKER, GEO_LONGITUDE, MKT_NETWORK, BR_FEATURES_JAVA, GEO_CITY, APP_ID, REFR_DOMAIN_USERID, SE_ACTION, REFR_URLHOST, PAGE_URLHOST, GEO_LATITUDE, GEO_COUNTRY, ETL_TAGS, BR_FEATURES_FLASH, PAGE_URLQUERY, REFR_URLQUERY, PAGE_URL, DOC_WIDTH, OS_TIMEZONE, COLLECTOR_TSTAMP, DOMAIN_USERID, PP_YOFFSET_MIN, USERAGENT, SE_PROPERTY, LOAD_TSTAMP, TI_QUANTITY, OS_FAMILY, GEO_REGION, V_COLLECTOR, PAGE_URLPORT, DVCE_SENT_TSTAMP, TI_CATEGORY, BR_FEATURES_QUICKTIME, TR_TOTAL_BASE, GEO_TIMEZONE, EVENT_FORMAT, TR_CITY, PAGE_REFERRER, REFR_URLSCHEME)

2023-07-21T14:00:06.254+02:00	contexts_nl_basjes_yauaa_context_1

2023-07-21T14:00:06.254+02:00	True for Info(s3://xxxxx/website/transformed/good/run=2023-07-21-11-50-00/,nl.basjes,yauaa_context,1,Context)

2023-07-21T14:00:06.744+02:00	INFO Migration: Migrating contexts_com_snowplowanalytics_snowplow_web_page_1 (pre-transaction)

2023-07-21T14:00:07.065+02:00	INFO Migration: Migrating contexts_com_snowplowanalytics_snowplow_ua_parser_context_1 (pre-transaction)

2023-07-21T14:00:07.356+02:00	INFO Migration: Migrating contexts_nl_basjes_yauaa_context_1 (pre-transaction)

2023-07-21T14:00:08.177+02:00	INFO Load: Loading transaction for s3://xxxxx/website/transformed/good/run=2023-07-21-11-50-00/ has started

2023-07-21T14:00:08.177+02:00	INFO Load: Loading s3://xxxxxwebsite/transformed/good/run=2023-07-21-11-50-00/

2023-07-21T14:00:08.182+02:00	INFO Load: COPY events FROM s3://xxxxx/website/transformed/good/run=2023-07-21-11-50-00/

2023-07-21T14:00:10.716+02:00

Copy
ERROR com.snowplowanalytics.snowplow.rdbloader: Transaction aborted. Sleeping for 30 seconds for the first time. Caught exception: net.snowflake.client.jdbc.SnowflakeSQLException: Failure using stage area. Cause: [Access Denied (Status Code: 403; Error Code: AccessDenied)]
ERROR com.snowplowanalytics.snowplow.rdbloader: Transaction aborted. Sleeping for 30 seconds for the first time. Caught exception: net.snowflake.client.jdbc.SnowflakeSQLException: Failure using stage area. Cause: [Access Denied (Status Code: 403; Error Code: AccessDenied)]

Hi @Laichzeit0 so the last error message you posted there does seem to be a permissions issue where the user issuing the COPY does not appear to have sufficient access to the stage?

In any case we are starting to move away from this model of using a stage at all and have recently pushed a much simpler Snowflake setup guide out which leverages temporary AWS IAM credentials for issuing copy statements.

In our testing this setup was much easier to deploy and work with over the existing setup with stages!

Would you be open to trying out that approach instead?

1 Like

Hi,

Sorry for the late reply, but yes the issue was permissions related. If I remember correctly it was since our Snowflake is in Azure and Snowplow in AWS so we had to add the NAT IP from AWS to be allowed in Snowflake.

1 Like