Verify open source GCP install

Hi,

I’m attempting to verify that my open source GCP install is fully working…

I have:
- Pushed the static schema
- Created some custom schemas
- Sent some messages using both the custom and static schemas

Initially I looked in the sp-bad-1-topic and saw that there was some issue, I think it was a 404, which I managed to resolve.

Now, though:
- No errors in the JavaScript I’m using to send events
- No errors in the above topic
- No messages in the other topics

What’s the best way to verify this is alive and kicking?

I have found the database and now have access to it.

I assume the pipeline database is the correct one? And again, I assume I need to be looking in the Postgres DB.

When running the query below, I get: relation "atomic.events" does not exist

SELECT * FROM atomic.events WHERE event_name = 'page_view';
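Before querying, it can help to see what actually exists in the database. A sketch of how to inspect it from psql; the host is a placeholder and the user/database names are assumed from the quick-start defaults, so substitute your own:

```shell
# Connect to the pipeline Cloud SQL instance (host and credentials are placeholders)
psql -h CLOUD_SQL_IP -U snowplow -d snowplow

# Then, inside psql:
#   \dn           -- list schemas; "atomic" only exists once the Postgres Loader has created it
#   \dt atomic.*  -- list tables in the atomic schema
#   SELECT count(*) FROM atomic.events;  -- once the table exists
```

If the atomic schema is missing entirely, the Postgres Loader has not written anything yet, which points back to the pipeline rather than the query.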

Hi @SRosam, the best way would be to move through the components step by step, checking whether there is data in each PubSub topic and verifying the logs of each component.

It sounds like your Collector is working fine, so you should see messages arriving in the first PubSub topic (“raw”). If you are not seeing anything in the “good” or “bad-1” topics, there is generally an issue in your Enrich application, so you should look at its logs to find out what’s going wrong.

But generally, stepping through the pipeline, verifying that data is landing where it should, and checking the application logs is how you debug it and confirm it’s working.
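As a concrete sketch of that step-through on GCP, you can attach a temporary subscription to each topic and pull a few messages to see where the data stops. The topic name below assumes the quick-start naming with your “sp” prefix, so adjust it to match your deployment:

```shell
# Attach a temporary subscription to the raw topic and pull a few messages
gcloud pubsub subscriptions create debug-raw --topic=sp-raw-topic
gcloud pubsub subscriptions pull debug-raw --limit=5 --auto-ack

# Repeat for the enriched/good and bad-1 topics, then clean up
gcloud pubsub subscriptions delete debug-raw
```

If the raw topic has data but the good topic does not, the problem sits between the Collector and Enrich.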

Hope this helps!

Thanks @josh

I eventually got this in the bad stream.

I’m guessing either the Iglu server isn’t accessible, or the schema isn’t available?

{
   "schema":"iglu:com.snowplowanalytics.snowplow.badrows/schema_violations/jsonschema/2-0-0",
   "data":{
      "processor":{
         "artifact":"snowplow-enrich-pubsub",
         "version":"2.0.5"
      },
      "failure":{
         "timestamp":"2022-09-07T09:09:29.924058Z",
         "messages":[
            {
               "schemaKey":"iglu:com.quix.steve.test/mouse-move-event/jsonschema/1-0-0",
               "error":{
                  "error":"ResolutionError",
                  "lookupHistory":[
                     {
                        "repository":"Iglu Central",
                        "errors":[
                           {
                              "error":"RepoFailure",
                              "message":"Unexpected exception fetching: org.http4s.client.UnexpectedStatus: unexpected HTTP status: 404 Not Found"
                           }
                        ],
                        "attempts":23,
                        "lastAttempt":"2022-09-07T09:06:56.747Z"
                     },
                     {
                        "repository":"Iglu Central - Mirror 01",
                        "errors":[
                           {
                              "error":"RepoFailure",
                              "message":"Unexpected exception fetching: org.http4s.client.UnexpectedStatus: unexpected HTTP status: 404 Not Found"
                           }
                        ],
                        "attempts":23,
                        "lastAttempt":"2022-09-07T09:06:56.789Z"
                     },
                     {
                        "repository":"Iglu Client Embedded",
                        "errors":[
                           {
                              "error":"NotFound"
                           }
                        ],
                        "attempts":1,
                        "lastAttempt":"2022-09-06T11:09:04.186Z"
                     },
                     {
                        "repository":"Iglu Server",
                        "errors":[
                           {
                              "error":"RepoFailure",
                              "message":"Unexpected exception fetching: org.http4s.client.UnexpectedStatus: unexpected HTTP status: 400 Bad Request"
                           }
                        ],
                        "attempts":23,
                        "lastAttempt":"2022-09-07T09:06:56.656Z"
                     }
                  ]
               }
            }
         ]
      },
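The 400 Bad Request from the “Iglu Server” repository is the entry worth chasing: Enrich is reaching your server, but the request itself is being rejected. One way to reproduce the lookup by hand is to fetch the failing schema directly; the IP and key below are placeholders, and the schema path comes from the schemaKey in the bad row:

```shell
# Health check - should return OK if the server is up
curl http://IGLU_IP/api/meta/health

# Reproduce the failing lookup; Iglu Server reads the key from the apikey header
curl -H 'apikey: YOUR_SUPER_API_KEY' \
  http://IGLU_IP/api/schemas/com.quix.steve.test/mouse-move-event/jsonschema/1-0-0
```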

Calling http://[IGLUSERVER]/api/meta/server?api_key returns the following, which indicates 0 schemas:

{
    "version": "0.7.0",
    "authInfo": {
        "vendor": "",
        "schema": null,
        "key": []
    },
    "database": "postgres",
    "schemaCount": 0,
    "debug": false,
    "patchesAllowed": true
}

But calling http://iglu_server/api/schemas
returns a load of schemas
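One thing to note about the meta response: the authInfo block shows an empty vendor and no key permissions, which suggests the server did not recognise any credentials, and a bare ?api_key query parameter will not authenticate you. A sketch of the same call with the key supplied in the apikey header, which is how Iglu Server expects it (address and key are placeholders):

```shell
curl -H 'apikey: YOUR_SUPER_API_KEY' http://IGLU_SERVER/api/meta/server
```

It is worth re-running the meta call this way before concluding anything from the schemaCount.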

@SRosam can you share your tfvars file please, making sure to redact anything sensitive in there? My hunch is that there is a bad input somewhere that is causing lookups against your Iglu Server to fail.

# Will be prefixed to all resource names
# Use this to easily identify the resources created and provide entropy for subsequent environments
prefix = "sp"

# The project to deploy the infrastructure into
project_id = "abc123"

# Where to deploy the infrastructure
region = "europe-west2"

# --- Default Network
# Update to the network you would like to deploy into
#
# Note: If you opt to use your own network then you will need to define a subnetwork to deploy into as well
network    = "default"
subnetwork = ""

# --- SSH
# Update this to your IP Address
ssh_ip_allowlist = ["x.x.x.x/32", "x.x.x.x/32","x.x.x.x/32"]
# Generate a new SSH key locally with `ssh-keygen`
# ssh-keygen -t rsa -b 4096 
ssh_key_pairs = [
  {
    user_name  = "snowplow"
    public_key = "ssh-rsa xxx"
  }
]

# --- Snowplow Iglu Server
iglu_db_name     = "iglu"
iglu_db_username = "iglu"
# Change and keep this secret!
iglu_db_password = "!"

# Used for API actions on the Iglu Server
# Change this to a new UUID and keep it secret!
iglu_super_api_key = "my_guid"

# NOTE: To push schemas to your Iglu Server, you can use igluctl
# igluctl: https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/iglu/igluctl
# igluctl static push --public schemas/ http://CHANGE-TO-MY-IGLU-IP 00000000-0000-0000-0000-000000000000

# See for more information: https://github.com/snowplow-devops/terraform-google-iglu-server-ce#telemetry
# Telemetry principles: https://docs.snowplowanalytics.com/docs/open-source-quick-start/what-is-the-quick-start-for-open-source/telemetry-principles/
user_provided_id  = ""
telemetry_enabled = true

# --- SSL Configuration (optional)
ssl_information = {
  certificate_id = ""
  enabled        = false
}

# --- Extra Labels to append to created resources (optional)
labels = {}

Hi @SRosam this is just for the Iglu deployment? Can you share the tfvars for your pipeline as well?

# Will be prefixed to all resource names
# Use this to easily identify the resources created and provide entropy for subsequent environments
prefix = "sp"

# The project to deploy the infrastructure into
project_id = "valid-xxxx"

# Where to deploy the infrastructure
region = "europe-west2"

# --- Default Network
# Update to the network you would like to deploy into
#
# Note: If you opt to use your own network then you will need to define a subnetwork to deploy into as well
network    = "default"
subnetwork = ""

# --- SSH
# Update this to your IP Address
ssh_ip_allowlist = ["x.x.x.x/32", "x.x.x.x/32","x.x.x.x/32"]
# Generate a new SSH key locally with `ssh-keygen`
# ssh-keygen -t rsa -b 4096 
ssh_key_pairs = [
  {
    user_name  = "snowplow"
    public_key = "abc123"
  }
]

# --- Iglu Server Configuration
# Iglu Server DNS output from the Iglu Server stack
iglu_server_dns_name = "http://x.x.x.x/"
# Used for API actions on the Iglu Server
# Change this to the same UUID from when you created the Iglu Server
iglu_super_api_key = "my-guid"

# --- Snowplow Postgres Loader
pipeline_db_name     = "snowplow"
pipeline_db_username = "snowplow"
# Change and keep this secret!
pipeline_db_password = "abc123"
# IP ranges that you want to query the Pipeline Postgres Cloud SQL instance from directly over the internet.  An alternative access method is to leverage
# the Cloud SQL Proxy service which creates an IAM authenticated tunnel to the instance
#
# Details: https://cloud.google.com/sql/docs/postgres/sql-proxy
#
# Note: this exposes your data to the internet - take care to ensure your allowlist is strict enough
pipeline_db_authorized_networks = [
  {
    name  = "foo"
    value = "x.x.x.x/32"
  },
  {
    name  = "bar"
    value = "x.x.x.x/32"
  },
  {
    name  = "baz"
    value = "x.x.x.x/32"
  }
]
# Note: the size of the database instance determines the number of concurrent connections - each Postgres Loader instance creates 10 open connections so having
# a sufficiently powerful database tier is important to not running out of connection slots
pipeline_db_tier = "db-g1-small"

# See for more information: https://registry.terraform.io/modules/snowplow-devops/collector-pubsub-ce/google/latest#telemetry
# Telemetry principles: https://docs.snowplowanalytics.com/docs/open-source-quick-start/what-is-the-quick-start-for-open-source/telemetry-principles/
user_provided_id  = ""
telemetry_enabled = true

# --- SSL Configuration (optional)
ssl_information = {
  certificate_id = ""
  enabled        = false
}

# --- Extra Labels to append to created resources (optional)
labels = {}

Can you see anything wrong here, @josh?

This is the only potential issue I could spot quickly: the trailing slash in iglu_server_dns_name = "http://x.x.x.x/". Looking here, we add a trailing slash ourselves, so the extra one in your tfvars might cause an issue with schema resolution.
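To illustrate the hunch: if the resolver appends its own slash to the configured URI, the trailing slash already present in iglu_server_dns_name produces a double slash in every lookup URL, which a server can answer with 400 Bad Request. A quick local sketch (the path is just an example from the bad row):

```shell
base="http://x.x.x.x/"   # value from the tfvars, with trailing slash
path="api/schemas/com.quix.steve.test/mouse-move-event/jsonschema/1-0-0"

echo "${base}/${path}"   # joining adds another slash: http://x.x.x.x//api/...

base="${base%/}"         # strip any trailing slash before joining
echo "${base}/${path}"   # clean URL: http://x.x.x.x/api/...
```

Dropping the trailing slash from iglu_server_dns_name in the tfvars is the easy fix to try first.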

Can you share anything from the application logs that could point to this?