Today I updated our JSON schemas and versioned them. Since I only added optional fields (the new fields also accept null as a value), I figured I could just increment the minor version of the schema rather than the major version. My reasoning was that optional (nullable) fields shouldn't require a brand new Redshift table.
I created the new versions of the JSON schema files, ran the igluctl command to generate the DDL and jsonpaths files, ran the DDL files in Redshift, and uploaded the new schema and jsonpaths files to our S3 repository. Tonight we were re-processing some events and got the error pasted below.
I'm assuming this has to do with the updates I made today to the jsonpaths files in our repo. When I looked at the jsonpaths files that were created, I noticed that they were only versioned at the major version level, i.e. the files output by igluctl are named <schema>_1.json rather than <schema>_1-0-1.json. So when I uploaded them to my repository, I overwrote the old version of the jsonpaths.
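To illustrate the naming behaviour I'm describing, here is a minimal sketch. It is not igluctl's actual code: the function name `jsonpaths_filename` and the schema name `my_context` are made up for illustration, and the only thing it assumes is what I observed, namely that the output file name keeps just the major version.

```python
def jsonpaths_filename(schema_name: str, schema_version: str) -> str:
    """Build a jsonpaths file name that keeps only the major version.

    Hypothetical helper mirroring the observed naming (<schema>_1.json);
    it is not taken from igluctl itself.
    """
    # A version such as "1-0-1" splits into three numeric parts.
    major, _middle, _last = schema_version.split("-")
    return f"{schema_name}_{major}.json"


# Both versions map to the same file name, so uploading the new file
# replaces the old one in the repository.
old_name = jsonpaths_filename("my_context", "1-0-0")
new_name = jsonpaths_filename("my_context", "1-0-1")
```

With this naming, `old_name` and `new_name` are identical, which is why the upload overwrote the earlier jsonpaths file.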
My questions are:
(1) Does this sound like the right diagnosis for why I'm seeing the errors below? The error doesn't tell me which jsonpaths file it is failing on, but the timing of these errors relative to my version update today makes me think this has to be the case.
(2) Was I right to update the minor version of the schema rather than create a new major version? I'd like to be able to add nullable fields to my contexts without having to manage a bunch of different versions of tables in Redshift.
(3) If both answers above are yes, how can I generate jsonpaths files that will allow custom contexts from the previous minor version schema to be loaded along with contexts with the more recent versions?
COMMIT;: ERROR: Number of jsonpaths and the number of columns should match. JSONPath size: 12, Number of columns in table or column list: 19
Detail:
-----------------------------------------------
error: Number of jsonpaths and the number of columns should match. JSONPath size: 12, Number of columns in table or column list: 19
code: 8001
context:
query: 85016
location: s3_utility.cpp:670
process: padbmaster [pid=32510]
-----------------------------------------------
uri:classloader:/storage-loader/lib/snowplow-storage-loader/redshift_loader.rb:89:in `load_events_and_shredded_types'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_reference.rb:43:in `send_to'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/call_with.rb:76:in `call_with'
uri:classloader:/gems/contracts-0.11.0/lib/contracts/method_handler.rb:138:in `block in redefine_method'
uri:classloader:/storage-loader/bin/snowplow-storage-loader:54:in `block in (root)'
uri:classloader:/storage-loader/bin/snowplow-storage-loader:51:in `<main>'
org/jruby/RubyKernel.java:973:in `load'
uri:classloader:/META-INF/main.rb:1:in `<main>'
org/jruby/RubyKernel.java:955:in `require'
uri:classloader:/META-INF/main.rb:1:in `(root)'
uri:classloader:/META-INF/jruby.home/lib/ruby/stdlib/rubygems/core_ext/kernel_require.rb:1:in `<main>'
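
Since the error doesn't say which jsonpaths file is failing, a check along these lines could help narrow it down. This is only a rough sketch: it assumes the files use the standard Redshift JSONPaths layout (a single object with a "jsonpaths" array), the function names and the `$.data.field_N` paths are made up for illustration, and the expected column count would have to come from the table definition.

```python
import json


def count_jsonpaths(jsonpaths_text: str) -> int:
    """Return the number of path expressions in a JSONPaths document."""
    document = json.loads(jsonpaths_text)
    return len(document["jsonpaths"])


def matches_table(jsonpaths_text: str, column_count: int) -> bool:
    """True when the file has exactly one path per table column."""
    return count_jsonpaths(jsonpaths_text) == column_count


# Made-up example reproducing the numbers from the error above:
# a file with 12 paths checked against a table with 19 columns.
sample = json.dumps({"jsonpaths": [f"$.data.field_{i}" for i in range(12)]})
sample_ok = matches_table(sample, 19)
```

Here `sample_ok` is False, which corresponds to the "JSONPath size: 12, Number of columns in table or column list: 19" mismatch. Running the same check over each jsonpaths file against its table's column count should show which one is out of sync.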