hi there!
I have been testing the platform for the past few weeks on a test system and everything has been working great!
Yesterday I connected it to a production server (average traffic of about 30,000 pageviews/day).
The collector, Kinesis streams, etc. worked well, but I have hit a block with the EmrEtlRunner process. It has been running for about 10 hours already and it's still on step 2: Enrich Raw Events! There's only one entry in stderr:
log4j:ERROR Failed to rename [/mnt/var/log/hadoop/steps/s-2O6T4H2PNZWLB/syslog] to [/mnt/var/log/hadoop/steps/s-2O6T4H2PNZWLB/syslog.2017-03-25-02].
The EMR cluster has been set up with 2 core nodes (m1.medium boxes), and during testing it ran for about an hour to process the Thrift data and send it to Redshift. The data volume is significantly higher on production, of course, so I am not clear where the issue lies. I am also not sure whether the process has crashed completely (the EMR monitor indicates both core nodes as running).
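For reference, the cluster sizing comes from the jobflow section of the EmrEtlRunner config.yml; mine is roughly the standard layout with just the sizes mentioned above, i.e. something like:

aws:
  emr:
    jobflow:
      master_instance_type: m1.medium   # master node
      core_instance_count: 2            # the 2 core nodes mentioned above
      core_instance_type: m1.medium
      task_instance_count: 0            # no task nodes
      task_instance_type: m1.medium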
What would be the best approach at this time?
- Would increasing the number of core nodes from 2 to 4 help?
- Should I stop this process now and restart? If so, how can I move the event data back from the raw processing bucket into the raw in bucket (see the command sketched after this list)? Will that affect the event data that has been collecting there over the past 10 hours?
- I am not clear how many events have been tracked, but so far about 6,000 raw Thrift records have been deposited into the ETL raw processing bucket folder.
- In my JavaScript tracker, I am also recording page ping events after 30 seconds (repeating every 10); the call is sketched below as well. Would turning this off help? I understand I do need this to capture bounce data, etc.
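For the restart question above, this is roughly the move I have in mind, assuming the standard raw in/processing bucket layout (the bucket name and prefixes below are placeholders, not my real ones):

# move the staged raw files back from processing to in so a fresh run can pick them up
aws s3 mv s3://<raw-bucket>/processing/ s3://<raw-bucket>/in/ --recursive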
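And this is the page ping setup referred to above, assuming the standard v2 JavaScript tracker activity-tracking call, i.e. first ping after 30 seconds of activity, then one every 10 seconds:

// enableActivityTracking must be called before trackPageView for page pings to fire
window.snowplow('enableActivityTracking', 30, 10); // minimumVisitLength = 30s, heartBeatDelay = 10s
window.snowplow('trackPageView');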
Here are the other logs from the EMR process:
Controller log (end):
INFO ProcessRunner started child process 7618 :
hadoop 7618 2200 0 02:20 ? 00:00:00 bash /usr/lib/hadoop/bin/hadoop jar /mnt/var/lib/hadoop/steps/s-2O6T4H2PNZWLB/snowplow-hadoop-enrich-1.8.0.jar com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob --hdfs --input_format thrift --etl_tstamp 1490407375035 --iglu_config ewogICJzY2hlbWEiOiAiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3MuaWdsdS9yZXNvbHZlci1jb25maWcvanNvbnNjaGVtYS8xLTAtMSIsCiAgImRhdGEiOiB7CiAgICAiY2FjaGVTaXplIjogNTAwLAogICAgInJlcG9zaXRvcmllcyI6IFsKICAgICAgewogICAgICAgICJuYW1lIjogIklnbHUgQ2VudHJhbCIsCiAgICAgICAgInByaW9yaXR5IjogMCwKICAgICAgICAidmVuZG9yUHJlZml4ZXMiOiBbICJjb20uc25vd3Bsb3dhbmFseXRpY3MiIF0sCiAgICAgICAgImNvbm5lY3Rpb24iOiB7CiAgICAgICAgICAiaHR0cCI6IHsKICAgICAgICAgICAgInVyaSI6ICJodHRwOi8vaWdsdS5pbnNpZGVob29rLmNvbSIKICAgICAgICAgIH0KICAgICAgICB9CiAgICAgIH0sCiAgICAgIHsKICAgICAgICAibmFtZSI6ICJJbnNpZGVIb29rIElnbHUgUmVwb3NpdG9yeSIsCiAgICAgICAgInByaW9yaXR5IjogNSwKICAgICAgICAidmVuZG9yUHJlZml4ZXMiOiBbICJjb20uaW5zaWRlaG9vay5pZ2x1IiBdLAogICAgICAgICJjb25uZWN0aW9uIjogewogICAgICAgICAgImh0dHAiOiB7CiAgICAgICAgICAgICJ1cmkiOiAiaHR0cDovL2lnbHUuaW5zaWRlaG9vay5jb20iCiAgICAgICAgICB9CiAgICAgICAgfQogICAgICB9CgogICAgXQogIH0KfQ== --enrichments eyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9lbnJpY2htZW50cy9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6W3sic2NoZW1hIjoiaWdsdTpjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3cvdWFfcGFyc2VyX2NvbmZpZy9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6eyJ2ZW5kb3IiOiJjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3ciLCJuYW1lIjoidWFfcGFyc2VyX2NvbmZpZyIsImVuYWJsZWQiOnRydWUsInBhcmFtZXRlcnMiOnt9fX0seyJzY2hlbWEiOiJpZ2x1OmNvbS5zbm93cGxvd2FuYWx5dGljcy5zbm93cGxvdy9yZWZlcmVyX3BhcnNlci9qc29uc2NoZW1hLzEtMC0wIiwiZGF0YSI6eyJuYW1lIjoicmVmZXJlcl9wYXJzZXIiLCJ2ZW5kb3IiOiJjb20uc25vd3Bsb3dhbmFseXRpY3Muc25vd3Bsb3ciLCJlbmFibGVkIjp0cnVlLCJwYXJhbWV0ZXJzIjp7ImludGVybmFsRG9tYWlucyI6WyJpbnNpZGVob29rLmNvbSIsInd3dy1zdGFnZS5pbnNpZGVob29rLmNvbSIsInd3dy1kZXYuaW5zaWRlaG9vay5jb20iLCJpaC53d3cubG9jYWwiXX19fV19 --input_folder hdfs:///local/snowplow/raw-events/ --output_folder hdfs:///local/snowplow/enriched-events/ --bad_rows_folder s3://ih-anlytics-data-dev1/enriched/bad/run=2017-03-25-02-02-55/
2017-03-25T02:20:27.633Z INFO HadoopJarStepRunner.Runner: startRun() called for s-2O6T4H2PNZWLB Child Pid: 7618
INFO Synchronously wait child process to complete : hadoop jar /mnt/var/lib/hadoop/steps/s-2O6T4H2P…
INFO Process still running
INFO Process still running
INFO Process still running
INFO Process still running
Syslog:
2017-03-25 02:20:55,507 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] sink: Hfs["TextDelimited[['app_id', 'platform', 'etl_tstamp', 'collector_tstamp', 'dvce_created_tstamp', 'event', 'event_id', 'txn_id', 'name_tracker', 'v_tracker', 'v_collector', 'v_etl', 'user_id', 'user_ipaddress', 'user_fingerprint', 'domain_userid', 'domain_sessionidx', 'network_userid', 'geo_country', 'geo_region', 'geo_city', 'geo_zipcode', 'geo_latitude', 'geo_longitude', 'geo_region_name', 'ip_isp', 'ip_organization', 'ip_domain', 'ip_netspeed', 'page_url', 'page_title', 'page_referrer', 'page_urlscheme', 'page_urlhost', 'page_urlport', 'page_urlpath', 'page_urlquery', 'page_urlfragment', 'refr_urlscheme', 'refr_urlhost', 'refr_urlport', 'refr_urlpath', 'refr_urlquery', 'refr_urlfragment', 'refr_medium', 'refr_source', 'refr_term', 'mkt_medium', 'mkt_source', 'mkt_term', 'mkt_content', 'mkt_campaign', 'contexts', 'se_category', 'se_action', 'se_label', 'se_property', 'se_value', 'unstruct_event', 'tr_orderid', 'tr_affiliation', 'tr_total', 'tr_tax', 'tr_shipping', 'tr_city', 'tr_state', 'tr_country', 'ti_orderid', 'ti_sku', 'ti_name', 'ti_category', 'ti_price', 'ti_quantity', 'pp_xoffset_min', 'pp_xoffset_max', 'pp_yoffset_min', 'pp_yoffset_max', 'useragent', 'br_name', 'br_family', 'br_version', 'br_type', 'br_renderengine', 'br_lang', 'br_features_pdf', 'br_features_flash', 'br_features_java', 'br_features_director', 'br_features_quicktime', 'br_features_realplayer', 'br_features_windowsmedia', 'br_features_gears', 'br_features_silverlight', 'br_cookies', 'br_colordepth', 'br_viewwidth', 'br_viewheight', 'os_name', 'os_family', 'os_manufacturer', 'os_timezone', 'dvce_type', 'dvce_ismobile', 'dvce_screenwidth', 'dvce_screenheight', 'doc_charset', 'doc_width', 'doc_height', 'tr_currency', 'tr_total_base', 'tr_tax_base', 'tr_shipping_base', 'ti_currency', 'ti_price_base', 'base_currency', 'geo_timezone', 'mkt_clickid', 'mkt_network', 'etl_tags', 'dvce_sent_tstamp', 'refr_domain_userid', 'refr_dvce_tstamp', 'derived_contexts', 'domain_sessionid', 'derived_tstamp', 'event_vendor', 'event_name', 'event_format', 'event_version', 'event_fingerprint', 'true_tstamp']]"]["hdfs:/local/snowplow/enriched-events"]
2017-03-25 02:20:55,510 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] parallel execution is enabled: true
2017-03-25 02:20:55,510 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] starting jobs: 3
2017-03-25 02:20:55,510 INFO cascading.flow.Flow (flow com.snowplowanalytics.snowplow.enrich.hadoop.EtlJob): [com.snowplowanalytics....] allocating threads: 3
2017-03-25 02:20:55,518 INFO cascading.flow.FlowStep (pool-5-thread-1): [com.snowplowanalytics....] starting step: (1/3)
2017-03-25 02:20:55,938 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-1): Connecting to ResourceManager at ip-10-0-1-153.ec2.internal/10.0.1.153:8032
2017-03-25 02:20:56,629 INFO org.apache.hadoop.yarn.client.RMProxy (pool-5-thread-1): Connecting to ResourceManager at ip-10-0-1-153.ec2.internal/10.0.1.153:8032
2017-03-25 02:21:17,993 INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat (pool-5-thread-1): Total input paths to process : 6288
2017-03-25 02:21:17,994 INFO org.apache.hadoop.conf.Configuration.deprecation (pool-5-thread-1): mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
2017-03-25 02:21:25,563 INFO cascading.util.Update (UpdateRequestTimer): newer Cascading release available: 2.6.3
2017-03-25 02:21:38,354 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-1): number of splits:3144
2017-03-25 02:21:38,879 INFO org.apache.hadoop.mapreduce.JobSubmitter (pool-5-thread-1): Submitting tokens for job: job_1490407757382_0002
2017-03-25 02:21:39,285 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl (pool-5-thread-1): Submitted application application_1490407757382_0002
2017-03-25 02:21:39,426 INFO org.apache.hadoop.mapreduce.Job (pool-5-thread-1): The url to track the job: http://ip-10-0-1-153.ec2.internal:20888/proxy/application_1490407757382_0002/
2017-03-25 02:21:39,435 INFO cascading.flow.FlowStep (pool-5-thread-1): [com.snowplowanalytics....] submitted hadoop job: job_1490407757382_0002
2017-03-25 02:21:39,437 INFO cascading.flow.FlowStep (pool-5-thread-1): [com.snowplowanalytics....] tracking url: http://ip-10-0-1-153.ec2.internal:20888/proxy/application_1490407757382_0002/
Thanks very much for your help!