Hi all,
I’m playing around with a JavaScript enrichment (in a test environment) using beam-enrich,
but I’m running into a rather terse error message when starting up the enricher.
If I turn this enrichment off in the configuration, everything runs fine, so I’m certain it is this enrichment that triggers the error.
Initially I thought there might be something odd in my script, so I changed it to match the example script here, hoping for at least a different error, but no luck. I also tried the simplest possible script (just a process function returning a JavaScript object string).
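For completeness, by “simplest possible script” I mean something along these lines (the schema and field are just placeholders; the getter follows the documented process(event) contract):

function process(event) {
  // Return a single hard-coded derived context.
  // "com.acme/test_context" is a placeholder schema, not a real one.
  return [{
    schema: "iglu:com.acme/test_context/jsonschema/1-0-0",
    data: { appId: event.getApp_id() }
  }];
}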
The enrichment configuration I have looks like this:
{
  "schema": "iglu:com.snowplowanalytics.snowplow/javascript_script_config/jsonschema/1-0-0",
  "data": {
    "vendor": "com.snowplowanalytics.snowplow",
    "name": "javascript_script_config",
    "enabled": true,
    "parameters": {
      "script": "base64 encoded script here"
    }
  }
}
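In case the encoding itself is suspect: the script value is the whole enrichment script as standard base64, produced along these lines (Node; the file name is just illustrative):

const fs = require('fs');
// Read the enrichment script and emit it as a single base64 string.
console.log(Buffer.from(fs.readFileSync('enrichment.js', 'utf8')).toString('base64'));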
The error I get when starting the beam-enrich job (via its Docker container) looks like this:
[main] WARN com.networknt.schema.JsonMetaSchema - Unknown keyword exclusiveMinimum - you should define your own Meta Schema. If the keyword is irrelevant for validation, just use a NonValidationKeyword
[main] INFO org.apache.beam.runners.dataflow.options.DataflowPipelineOptions$StagingLocationFactory - No stagingLocation provided, falling back to gcpTempLocation
[main] INFO org.apache.beam.runners.dataflow.DataflowRunner - Executing pipeline on the Dataflow Service, which will have billing implications related to Google Compute Engine usage and other Google Cloud Services.
[main] INFO org.apache.beam.runners.dataflow.util.PackageUtil - Uploading 250 files from PipelineOptions.filesToStage to staging location to prepare for execution.
[main] INFO org.apache.beam.runners.dataflow.util.PackageUtil - Staging files complete: 250 files cached, 0 files newly uploaded in 3 seconds
[main] INFO org.apache.beam.runners.dataflow.DataflowPipelineTranslator - Adding raw-from-pubsub/PubsubUnboundedSource as step s1
[main] INFO org.apache.beam.runners.dataflow.DataflowPipelineTranslator - Adding raw-from-pubsub/MapElements/Map as step s2
Exception in thread "main" java.lang.ClassCastException: cannot assign instance of java.lang.invoke.SerializedLambda to field com.spotify.scio.util.Functions$$anon$7.g of type scala.Function1 in instance of com.spotify.scio.util.Functions$$anon$7
    at java.io.ObjectStreamClass$FieldReflector.setObjFieldValues(ObjectStreamClass.java:2233)
    at java.io.ObjectStreamClass.setObjFieldValues(ObjectStreamClass.java:1405)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2291)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2285)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2209)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2067)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
    at org.apache.beam.sdk.util.SerializableUtils.deserializeFromByteArray(SerializableUtils.java:71)
    at org.apache.beam.runners.core.construction.ParDoTranslation.doFnWithExecutionInformationFromProto(ParDoTranslation.java:610)
    at org.apache.beam.runners.core.construction.ParDoTranslation.getSchemaInformation(ParDoTranslation.java:314)
    at org.apache.beam.runners.core.construction.ParDoTranslation.getSchemaInformation(ParDoTranslation.java:299)
    at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$8.translateSingleHelper(DataflowPipelineTranslator.java:1003)
    at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$8.translate(DataflowPipelineTranslator.java:995)
    at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$8.translate(DataflowPipelineTranslator.java:992)
    at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.visitPrimitiveTransform(DataflowPipelineTranslator.java:494)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:665)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.visit(TransformHierarchy.java:657)
    at org.apache.beam.sdk.runners.TransformHierarchy$Node.access$600(TransformHierarchy.java:317)
    at org.apache.beam.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:251)
    at org.apache.beam.sdk.Pipeline.traverseTopologically(Pipeline.java:460)
    at org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator.translate(DataflowPipelineTranslator.java:433)
    at org.apache.beam.runners.dataflow.DataflowPipelineTranslator.translate(DataflowPipelineTranslator.java:192)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:797)
    at org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:188)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:315)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:301)
    at com.spotify.scio.ScioContext.execute(ScioContext.scala:598)
    at com.spotify.scio.ScioContext$$anonfun$run$1.apply(ScioContext.scala:586)
    at com.spotify.scio.ScioContext$$anonfun$run$1.apply(ScioContext.scala:574)
    at com.spotify.scio.ScioContext.requireNotClosed(ScioContext.scala:694)
    at com.spotify.scio.ScioContext.run(ScioContext.scala:574)
    at com.snowplowanalytics.snowplow.enrich.beam.Enrich$.main(Enrich.scala:94)
    at com.snowplowanalytics.snowplow.enrich.beam.Enrich.main(Enrich.scala)
Any hints much appreciated.