Terraform quick start on GCP is not working; trying to find the manual setup page but unable to locate it in the documentation

I have tried to set up Snowplow on GCP using the Terraform module provided in the Snowplow documentation. In the very first step, setting up the Iglu Server, the Terraform job fails because of a health check failure: the health check fails continually even though the VM is healthy and running. I have enabled all the required APIs, as suggested in some of the solutions on the Snowplow website. I have also tried setting it up from a Linux machine, but both ways I run into the same error.

Could you please share a link to the steps for manually setting up Snowplow on GCP?

What do the logs say for the Iglu Server for that VM? Generally if it’s failing the health check there should be something in the logs to indicate where the failure is occurring and as a result the rest of the stack is not provisioned.

Hi Mike, I have reproduced the same error and captured it in the attached image. The error log from the console and some logs from Cloud Logging are copied below:
Snowplow-Error
running vm

Error msg on terminal

Error: timeout while waiting for state to become 'created' (last state: 'creating', timeout: 20m0s)
│
│   with module.iglu_server.google_compute_region_instance_group_manager.grp,
│   on .terraform\modules\iglu_server\main.tf line 230, in resource "google_compute_region_instance_group_manager" "grp":
│  230: resource "google_compute_region_instance_group_manager" "grp" {
│
Cloud logging – warning msgs
Time - 7:33 pm IST (last log)
{
  "textPayload": "2023-08-23 14:03:30.213 UTC [617]: [2-1] db=,user= LOG:  automatic analyze of table \"cloudsqladmin.public.heartbeat\" system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec",
  "insertId": "s=72ed0b03e32b4b62afddaf86f0cdf8f8;i=4184;b=b0a40380d639435d8dc7d5221fe3429a;m=5e347988;t=6039793859418;x=a212a1a48609141-0@a1",
  "resource": {
    "type": "cloudsql_database",
    "labels": {
      "region": "us-central",
      "project_id": "snowplow-poc-123",
      "database_id": "snowplow-poc-123:sp-iglu-db-558d4bfb"
    }
  },
  "timestamp": "2023-08-23T14:03:30.213912Z",
  "severity": "INFO",
  "labels": {
    "INSTANCE_UID": "1-852779d9-c055-4474-b749-6af25de04d34",
    "LOG_BUCKET_NUM": "68"
  },
  "logName": "projects/snowplow-poc-123/logs/cloudsql.googleapis.com%2Fpostgres.log",
  "receiveTimestamp": "2023-08-23T14:04:12.100784364Z"
}

Time - 7:23 pm IST
{
  "insertId": "3655777108714915225622482866447037347",
  "jsonPayload": {
    "@type": "type.googleapis.com/compute.InstanceGroupManagerEvent",
    "instanceHealthStateChange": {
      "instanceWithId": "projects/881847669974/zones/us-central1-c/instances/5622482866447037347",
      "previousDetailedHealthState": "UNKNOWN",
      "ipAddress": "10.128.0.2",
      "networkWithId": "projects/881847669974/global/networks/3725966222115212588",
      "notificationTime": "2023-08-23T13:53:50.624Z",
      "network": "projects/snowplow-poc-123/global/networks/default",
      "instance": "projects/snowplow-poc-123/zones/us-central1-c/instances/sp-iglu-server-8b6c",
      "healthCheck": "projects/snowplow-poc-123/global/healthChecks/sp-iglu-server",
      "detailedHealthState": "TIMEOUT"
    }
  },
  "resource": {
    "type": "gce_instance_group_manager",
    "labels": {
      "instance_group_manager_id": "365577710871491522",
      "location": "us-central1",
      "project_id": "snowplow-poc-123",
      "instance_group_manager_name": "sp-iglu-server-grp"
    }
  },
  "timestamp": "2023-08-23T13:53:50.624Z",
  "severity": "WARNING",
  "labels": {
    "compute.googleapis.com/instance_location": "us-central1-c",
    "compute.googleapis.com/instance_name": "sp-iglu-server-8b6c",
    "compute.googleapis.com/instance_id": "5622482866447037347"
  },
  "logName": "projects/snowplow-poc-123/logs/compute.googleapis.com%2Finstance_group_manager_events",
  "receiveTimestamp": "2023-08-23T13:53:51.806943432Z"
}



Time - 7:15 pm IST

{
  "insertId": "3655777108714915225622482866447037347",
  "jsonPayload": {
    "@type": "type.googleapis.com/compute.InstanceGroupManagerEvent",
    "instanceHealthStateChange": {
      "instance": "projects/snowplow-poc-123/zones/us-central1-c/instances/sp-iglu-server-8b6c",
      "healthCheck": "projects/snowplow-poc-123/global/healthChecks/sp-iglu-server",
      "instanceWithId": "projects/881847669974/zones/us-central1-c/instances/5622482866447037347",
      "ipAddress": "10.128.0.2",
      "networkWithId": "projects/881847669974/global/networks/3725966222115212588",
      "detailedHealthState": "TIMEOUT",
      "previousDetailedHealthState": "UNKNOWN",
      "notificationTime": "2023-08-23T13:45:13.588Z",
      "network": "projects/snowplow-poc-123/global/networks/default"
    }
  },
  "resource": {
    "type": "gce_instance_group_manager",
    "labels": {
      "instance_group_manager_id": "365577710871491522",
      "project_id": "snowplow-poc-123",
      "instance_group_manager_name": "sp-iglu-server-grp",
      "location": "us-central1"
    }
  },
  "timestamp": "2023-08-23T13:45:13.588Z",
  "severity": "WARNING",
  "labels": {
    "compute.googleapis.com/instance_location": "us-central1-c",
    "compute.googleapis.com/instance_name": "sp-iglu-server-8b6c",
    "compute.googleapis.com/instance_id": "5622482866447037347"
  },
  "logName": "projects/snowplow-poc-123/logs/compute.googleapis.com%2Finstance_group_manager_events",
  "receiveTimestamp": "2023-08-23T13:45:14.429114442Z"
}



{
  "insertId": "3655777108714915225622482866447037347",
  "jsonPayload": {
    "instanceHealthStateChange": {
      "instanceWithId": "projects/881847669974/zones/us-central1-c/instances/5622482866447037347",
      "detailedHealthState": "TIMEOUT",
      "healthCheck": "projects/snowplow-poc-123/global/healthChecks/sp-iglu-server",
      "instance": "projects/snowplow-poc-123/zones/us-central1-c/instances/sp-iglu-server-8b6c",
      "networkWithId": "projects/881847669974/global/networks/3725966222115212588",
      "ipAddress": "10.128.0.2",
      "network": "projects/snowplow-poc-123/global/networks/default",
      "notificationTime": "2023-08-23T13:45:13.588Z",
      "previousDetailedHealthState": "UNKNOWN"
    },
    "@type": "type.googleapis.com/compute.InstanceGroupManagerEvent"
  },
  "resource": {
    "type": "gce_instance_group_manager",
    "labels": {
      "instance_group_manager_name": "sp-iglu-server-grp",
      "project_id": "snowplow-poc-123",
      "location": "us-central1",
      "instance_group_manager_id": "365577710871491522"
    }
  },
  "timestamp": "2023-08-23T13:45:13.588Z",
  "severity": "WARNING",
  "labels": {
    "compute.googleapis.com/instance_name": "sp-iglu-server-8b6c",
    "compute.googleapis.com/instance_location": "us-central1-c",
    "compute.googleapis.com/instance_id": "5622482866447037347"
  },
  "logName": "projects/snowplow-poc-123/logs/compute.googleapis.com%2Finstance_group_manager_events",
  "receiveTimestamp": "2023-08-23T13:45:14.429114442Z"
}


{
  "textPayload": "2023-08-23 13:43:28.193 UTC [30]: [1-1] db=cloudsqladmin,user=cloudsqladmin ERROR:  relation \"public.heartbeat\" does not exist at character 13",
  "insertId": "s=72ed0b03e32b4b62afddaf86f0cdf8f8;i=32c5;b=b0a40380d639435d8dc7d5221fe3429a;m=16784f23;t=603974bc969b3;x=9dd17553cb4db030-0@a2",
  "resource": {
    "type": "cloudsql_database",
    "labels": {
      "project_id": "snowplow-poc-123",
      "region": "us-central",
      "database_id": "snowplow-poc-123:sp-iglu-db-558d4bfb"
    }
  },
  "timestamp": "2023-08-23T13:43:28.199493Z",
  "severity": "ERROR",
  "labels": {
    "LOG_BUCKET_NUM": "68",
    "INSTANCE_UID": "1-852779d9-c055-4474-b749-6af25de04d34"
  },
  "logName": "projects/snowplow-poc-123/logs/cloudsql.googleapis.com%2Fpostgres.log",
  "receiveTimestamp": "2023-08-23T13:43:29.873172918Z"
}


Time - 7:13 pm IST

{
  "textPayload": "2023-08-23 13:43:28.193 UTC [30]: [1-1] db=cloudsqladmin,user=cloudsqladmin ERROR:  relation \"public.heartbeat\" does not exist at character 13",
  "insertId": "s=72ed0b03e32b4b62afddaf86f0cdf8f8;i=32c5;b=b0a40380d639435d8dc7d5221fe3429a;m=16784f23;t=603974bc969b3;x=9dd17553cb4db030-0@a2",
  "resource": {
    "type": "cloudsql_database",
    "labels": {
      "region": "us-central",
      "project_id": "snowplow-poc-123",
      "database_id": "snowplow-poc-123:sp-iglu-db-558d4bfb"
    }
  },
  "timestamp": "2023-08-23T13:43:28.199493Z",
  "severity": "ERROR",
  "labels": {
    "INSTANCE_UID": "1-852779d9-c055-4474-b749-6af25de04d34",
    "LOG_BUCKET_NUM": "68"
  },
  "logName": "projects/snowplow-poc-123/logs/cloudsql.googleapis.com%2Fpostgres.log",
  "receiveTimestamp": "2023-08-23T13:43:29.873172918Z"
}
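
The pasted entries repeat, so it can help to reduce an export like this to the unique health state changes. The following is a minimal sketch, not part of the quick start: `health_transitions` is a hypothetical helper, and the sample data is a trimmed copy of the entries above. It relies on Cloud Logging treating entries with the same log name, timestamp and `insertId` as duplicates.

```python
def health_transitions(entries):
    """Return unique (timestamp, instance, previous, current) health changes, oldest first."""
    seen = set()
    rows = []
    for entry in entries:
        change = entry.get("jsonPayload", {}).get("instanceHealthStateChange")
        if change is None:
            continue  # not an instance group manager health event (e.g. a Cloud SQL log line)
        # Same logName + timestamp + insertId means the same entry was exported twice.
        key = (entry.get("logName"), entry.get("timestamp"), entry.get("insertId"))
        if key in seen:
            continue
        seen.add(key)
        rows.append((
            entry["timestamp"],
            change["instance"].rsplit("/", 1)[-1],  # keep only the instance name
            change["previousDetailedHealthState"],
            change["detailedHealthState"],
        ))
    return sorted(rows)


# Trimmed copies of entries from this thread; the 13:45 event appears twice, as in the paste.
_LOG = "projects/snowplow-poc-123/logs/compute.googleapis.com%2Finstance_group_manager_events"
_INSTANCE = "projects/snowplow-poc-123/zones/us-central1-c/instances/sp-iglu-server-8b6c"


def _event(timestamp):
    return {
        "insertId": "3655777108714915225622482866447037347",
        "logName": _LOG,
        "timestamp": timestamp,
        "jsonPayload": {
            "instanceHealthStateChange": {
                "instance": _INSTANCE,
                "previousDetailedHealthState": "UNKNOWN",
                "detailedHealthState": "TIMEOUT",
            }
        },
    }


sample = [
    _event("2023-08-23T13:53:50.624Z"),
    _event("2023-08-23T13:45:13.588Z"),
    _event("2023-08-23T13:45:13.588Z"),
    {
        "logName": "projects/snowplow-poc-123/logs/cloudsql.googleapis.com%2Fpostgres.log",
        "timestamp": "2023-08-23T13:43:28.199493Z",
        "severity": "ERROR",
    },
]

rows = health_transitions(sample)
for row in rows:
    print(*row)
```

On this sample it leaves two events, both moving the instance from UNKNOWN to TIMEOUT. To run it on a real export, load the JSON file into a list of entries first.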

@mike, would you be able to suggest a solution for the error I mentioned above?

Hi @Amit_Shrivastava1 the logs we would need to help debug this are the actual logs from the VM instance that was being deployed. This will tell us why it did not launch correctly.

Would you mind sharing those with us?

Hi @josh, there is no error log for the VM, so I have exported the full log as well as the VM log for the Terraform quick start execution. Here is the git link where you can access the JSON log files.

Please have a look and advise.

Hi @Amit_Shrivastava1 the logs we need can be found by navigating to the VM instance in the “Compute Engine” UI. From there:

  1. Select the VM;
  2. Select the “Observability” panel;
  3. Select “Logs”

This should give you the log stream from the VM instance which includes output from the launch script that is being executed on the VM and hopefully give us some insight into what is going wrong.
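
If the console is awkward to share from, roughly the same stream can be pulled with `gcloud logging read`. The sketch below only builds the command; `vm_log_command` is a hypothetical helper, the project and instance ID are the ones from the logs posted earlier in this thread, and it assumes the VM's logs are attached to the `gce_instance` resource in Cloud Logging.

```python
import shlex


def vm_log_command(project_id, instance_id, freshness="1d", limit=200):
    """Build a `gcloud logging read` command for one Compute Engine instance."""
    log_filter = (
        'resource.type="gce_instance" '
        f'AND resource.labels.instance_id="{instance_id}"'
    )
    return [
        "gcloud", "logging", "read", log_filter,
        f"--project={project_id}",
        f"--freshness={freshness}",  # how far back to look
        f"--limit={limit}",          # maximum number of entries to return
        "--format=json",
    ]


cmd = vm_log_command("snowplow-poc-123", "5622482866447037347")
print(shlex.join(cmd))  # copy the printed line into a shell, or pass cmd to subprocess.run
```

Redirecting the JSON output to a file gives something that can be attached to a reply with any secrets removed.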

Hi @josh, I’m facing the same issues as Amit:

  • “terraform apply” times out because of VM instance health checks

Upon inspecting the logs, I can see that the Instance Group (re)creates the instance 3 times during a run, but it can’t get it to post a positive health check.

I couldn’t spot any error in the logs of the VM. I tried to run “terraform apply” 6 times and it always ends up here.

Any idea why the VM fails the health checks?

Hi @conq7 can you please share your tfvars file with any secrets redacted and the git hash of the quick-start examples that you have checked out?

@josh

git hash:
20109423f3a052974d8f0f43466b8853ec52eaa6

tfvars

# Will be prefixed to all resource names
# Use this to easily identify the resources created and provide entropy for subsequent environments
prefix = "testv2"

# The project to deploy the infrastructure into
project_id = "XXXXX-snowplow-opensource"

# Where to deploy the infrastructure
region = "northamerica-northeast2"

# --- Default Network
# Update to the network you would like to deploy into
#
# Note: If you opt to use your own network then you will need to define a subnetwork to deploy into as well
network    = "default"
subnetwork = ""

# --- SSH
# Update this to your IP Address
ssh_ip_allowlist = ["82.XXX.XXX.XX/32"]
# Generate a new SSH key locally with `ssh-keygen`
# ssh-keygen -t rsa -b 4096 
ssh_key_pairs = [
  {
    user_name  = "raul"
    public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDSU2SNYMEhUbZbxrQRtg/3tfp6e8+EouQCzI5bIU5Dox8VjqgSOF7rVCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX rsa-key-20XXXXX"
  }
]

# --- Snowplow Iglu Server
iglu_db_name     = "iglu"
iglu_db_username = "iglu"
# Change and keep this secret!
iglu_db_password = "XXXXXXXXX"

# Used for API actions on the Iglu Server
# Change this to a new UUID and keep it secret!
iglu_super_api_key = "259cc8a1-816a-XXXXXXXXXXXXXXX"

# NOTE: To push schemas to your Iglu Server, you can use igluctl
# igluctl: https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/iglu/igluctl
# igluctl static push --public schemas/ http://CHANGE-TO-MY-IGLU-IP 00000000-0000-0000-0000-000000000000

# See for more information: https://github.com/snowplow-devops/terraform-google-iglu-server-ce#telemetry
# Telemetry principles: https://docs.snowplowanalytics.com/docs/open-source-quick-start/what-is-the-quick-start-for-open-source/telemetry-principles/
user_provided_id  = ""
telemetry_enabled = false

# --- SSL Configuration (optional)
ssl_information = {
  certificate_id = ""
  enabled        = false
}

# --- Extra Labels to append to created resources (optional)
labels = {}

Thanks @conq7 - and would you happen to have any org controls in place for this project that might be interfering with the default network / connectivity?

I have just done a clean deploy of the “default” version of Iglu Server and it deploys successfully without any errors - so my thinking is either this is a failure somewhere in the launch-script that is happening sporadically or there is a control in place for your org preventing something from deploying cleanly.

Can you try SSHing into the node and executing the script manually to see if you can find where it is failing?
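
While on the node, a small probe of the health endpoint can show whether Iglu Server is answering at all. This is only a sketch: the `/api/meta/health` path and port 8080 are assumptions about the default Iglu Server setup and should be checked against the health check that Terraform created, `probe` is a hypothetical helper, and it assumes Python 3 is available on the VM.

```python
import http.client
import urllib.error
import urllib.request

# Assumed defaults; confirm the real port and path in the GCP health check definition.
HEALTH_URL = "http://localhost:8080/api/meta/health"


def probe(url, timeout=5.0):
    """Return (ok, detail) for a single HTTP GET against the health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300, f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return False, f"HTTP {err.code}"
    except (OSError, http.client.HTTPException) as err:
        # Nothing usable came back: connection refused, timeout, DNS failure, bad response.
        return False, f"no response: {err}"


if __name__ == "__main__":
    ok, detail = probe(HEALTH_URL)
    print("healthy" if ok else "unhealthy", "-", detail)
```

A "no response" result while the VM itself is up would point at the server process not listening, whereas an HTTP error status would suggest it started but is unhappy, for example with its database connection.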

@amit_shrivastava, I had the same problem and realised I had forgotten to enable the APIs.
Once I enabled the APIs below, it worked properly for me.

  • Compute Engine API
  • Cloud Resource Manager API
  • Identity and Access Management (IAM) API
  • Cloud Pub/Sub API
  • Cloud SQL Admin API
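
For anyone who prefers the command line to the console, the five APIs above map to the service names below, and they can be enabled in one `gcloud services enable` call. This is a sketch that only builds the command; `enable_command` is a hypothetical helper and `my-project-id` is a placeholder for your own project ID.

```python
import shlex

# Console names from the list above, mapped to their service names.
REQUIRED_APIS = {
    "Compute Engine API": "compute.googleapis.com",
    "Cloud Resource Manager API": "cloudresourcemanager.googleapis.com",
    "Identity and Access Management (IAM) API": "iam.googleapis.com",
    "Cloud Pub/Sub API": "pubsub.googleapis.com",
    "Cloud SQL Admin API": "sqladmin.googleapis.com",
}


def enable_command(project_id):
    """Build a single `gcloud services enable` command covering every required API."""
    return [
        "gcloud", "services", "enable",
        *REQUIRED_APIS.values(),
        f"--project={project_id}",
    ]


command = enable_command("my-project-id")  # placeholder project ID
print(shlex.join(command))
```

Running `gcloud services list --enabled` against the same project afterwards is a quick way to confirm that all five show up before retrying `terraform apply`.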