One of my servers, used for IP checks, went down, and as a result all my events went into the “enrich-bad-json” Kinesis stream. Now that the service is back up, I want to restore all those missing events to my Elasticsearch.
I have tried the following:
I polled the Kinesis stream that the collector writes events to and tried to put those events back into the stream so that the enricher could pick them up, but while the enricher is processing the data I get the following error:
Error deserializing raw event: Cannot read. Remote side has closed. Tried to read 2 bytes, but only got 0 bytes. (This is often indicative of an internal error on the server side. Please check your server logs.)
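The replay step described above can be sketched roughly as follows. This is only an illustrative sketch, assuming a boto3-style Kinesis client is passed in; the function name `replay_shard` and its stream and shard arguments are hypothetical. It also assumes the source stream is different from the target stream, since otherwise the loop would re-read its own writes. The point it shows is that `Data` is forwarded as the raw bytes returned by `get_records`, with no decoding or re-encoding in between.

```python
def replay_shard(kinesis, source_stream, target_stream, shard_id, batch_size=100):
    """Copy the records currently in one shard of source_stream to target_stream.

    `kinesis` is assumed to be a boto3 Kinesis client (or anything exposing
    the same get_shard_iterator / get_records / put_record calls).
    Returns the number of records copied.
    """
    iterator = kinesis.get_shard_iterator(
        StreamName=source_stream,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start from the oldest retained record
    )["ShardIterator"]

    copied = 0
    while iterator:
        response = kinesis.get_records(ShardIterator=iterator, Limit=batch_size)
        for record in response["Records"]:
            # Forward the payload untouched: record["Data"] is already bytes.
            kinesis.put_record(
                StreamName=target_stream,
                Data=record["Data"],
                PartitionKey=record["PartitionKey"],
            )
            copied += 1
        # Stop once the reader has caught up with the tip of the shard.
        if response.get("MillisBehindLatest", 0) == 0:
            break
        iterator = response.get("NextShardIterator")
    return copied
```

With boto3 installed, this would be called as `replay_shard(boto3.client("kinesis"), "source-stream", "target-stream", "shardId-000000000000")`, where the stream names and shard id are placeholders.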
This is a sample of how my data looks when I pull it from the Kinesis stream:
\x0b\x00d\x00\x00\x00\r42.106.193.13\n\x00\xc8\x00\x00\x01f\xceuq\xa1\x0b\x00\xd2\x00\x00\x00\x05UTF-8\x0b\x00\xdc\x00\x00\x00\x12ssc-0.13.0-kinesis\x0b\x01,\x00\x00\x00\xb8Mozilla/5.0 (Linux; Android 5.0.2; Mi 4i Build/LRX22G; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/61.0.3163.98 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/176.0.0.42.87;]\x0b\x016\x00\x00\x00jhttps://www.popxo.com/trending/priyanka-wore-tiffany-jewellery-worth-95-crore-at-her-bridal-shower-768514/\x0b\x01@\x00\x00\x00\x02/i\x0b\x01J\x00\x00\x02nstm=1541062094348&e=pp&url=https%3A%2F%2Fwww.popxo.com%2Ftrending%2Fpriyanka-wore-tiffany-jewellery-worth-95-crore-at-her-bridal-shower-768514%2F&page=Priyanka%20Wore%20Jewellery%20Worth%209.5%20Crore%20At%20Her%20Bridal%20Shower%20%7C%20POPxo&refr=http%3A%2F%2Fm.facebook.com%2F&pp_mix=0&pp_max=0&pp_miy=2397&pp_may=2397&tv=js-2.8.0&tna=cf&aid=popxo-web&p=web&tz=Asia%2FKolkata&lang=en-GB&cs=UTF-8&res=360x640&cd=32&cookie=1&eid=f7161e3e-42cb-480d-bb65-8fa61e4afe0f&dtm=1541062094327&vp=360x572&ds=360x17246&vid=9&sid=a80f8ac5-787c-4000-adb3-605da38e2614&duid=5d022148-0118-46ed-b4ea-92b27943a152&fp=2481695805&uid=730786\x0f\x01^\x0b\x00\x00\x00\r\x00\x00\x00\x15Host: track.popxo.com\x00\x00\x002Accept: image/webp, image/apng, image/*, */*;q=0.8\x00\x00\x00\x1eAccept-Encoding: gzip, deflate\x00\x00\x00#Accept-Language: en-GB, en-US;q=0.8\x00\x00\x05{Cookie: __gads=ID=63640a719a2fbbdf:T=1530956020:S=ALNI_MYHYy7Jul6pjsMhgT1D5G1yBQCkNA; _privy_a=%7B%22referring_domain%22%3A%22m.facebook.com%22%2C%22referring_url%22%3A%22http%3A%2F%2Fm.facebook.com%2F%22%2C%22utm_medium%22%3A%22social%22%2C%22utm_source%22%3A%22Facebook%22%2C%22search_term%22%3Anull%2C%22initial_url%22%3A%22https%3A%2F%2Fwww.popxo.com%2F2016%2F03%2Feverything-you-need-to-know-about-having-sex-during-your-period%2F%3Frpxeng%3D1%22%2C%22sessions_count%22%3A2%2C%22pages_viewed%22%3A2%7D; 
_privy_D8866583716CDA595B39701E=%7B%22uuid%22%3A%224457ab75-f0de-46ac-9b88-9328ba6f45db%22%2C%22variations%22%3A%7B%7D%2C%22country_code%22%3A%22HK%22%7D; __unam=8606ef0-16474170b60-175ab92d-2; cto_lwid=4c8224e2-fabf-4b11-86d5-3d2e13cd8a2d; _ga=GA1.2.1967126997.1530956014; _gid=GA1.2.655348186.1541051268; WZRK_G=59d2a089137d46bf9b877d3b1cd7aada; _parsely_session={%22sid%22:8%2C%22surl%22:%22https://www.popxo.com/trending/priyanka-wore-tiffany-jewellery-worth-95-crore-at-her-bridal-shower-768514/%22%2C%22sref%22:%22http://m.facebook.com/%22%2C%22sts%22:1541061963927%2C%22slts%22:1541051294163}; _parsely_visitor={%22id%22:%22b84d26d5-112b-4667-bfd5-82d0cf7ad0fc%22%2C%22session_count%22:8%2C%22last_session_ts%22:1541061963927}; _fbp=fb.1.1541061913311.1941076511; WZRK_S_8R5-WK8-Z64Z=%7B%22p%22%3A1%2C%22s%22%3A1541061909%2C%22t%22%3A1541062029%7D; sp=dc5e97c4-ae3c-471e-8201-06f68fd0f2fd\x00\x00\x00sReferer: https://www.popxo.com/trending/priyanka-wore-tiffany-jewellery-worth-95-crore-at-her-bridal-shower-768514/\x00\x00\x00\xc4user-agent: Mozilla/5.0 (Linux; Android 5.0.2; Mi 4i Build/LRX22G; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/61.0.3163.98 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/176.0.0.42.87;]\x00\x00\x00%X-Requested-With: com.facebook.katana\x00\x00\x00\x1eX-Forwarded-For: 42.106.193.13\x00\x00\x00\x15X-Forwarded-Port: 443\x00\x00\x00\x18X-Forwarded-Proto: https\x00\x00\x00\x16Connection: keep-alive\x00\x00\x00\x1bTimeout-Access: <function1>\x0b\x01\x90\x00\x00\x00\x0ftrack.popxo.com\x0b\x01\x9a\x00\x00\x00$dc5e97c4-ae3c-471e-8201-06f68fd0f2fd\x0bzi\x00\x00\x00Aiglu:com.snowplowanalytics.snowplow/CollectorPayload/thrift/1-0-0\x00
My understanding is that this is not in Thrift format, which is probably why I am getting that error.
I tried converting it to Thrift format, without success. I am now trying to read the unenriched Thrift files that s3-store writes, to see whether that works.
Could anybody suggest another way to resolve this issue?
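One way to sanity-check whether such a payload follows Thrift's binary protocol is to decode the first field header by hand. In that protocol a struct field starts with a 1-byte type, a 2-byte big-endian field id, and, for strings (type 11), a 4-byte big-endian length followed by the bytes. The sketch below is only a check under those assumptions, not a full Thrift parser; the helper name `read_first_string_field` is hypothetical, and `sample` is just the first few bytes of the dump above.

```python
import struct

THRIFT_TYPE_STRING = 11  # type code for string/binary in Thrift's binary protocol


def read_first_string_field(payload: bytes):
    """Decode the leading field of a payload, assuming it is a Thrift string field.

    Returns (field_type, field_id, value). Raises ValueError if the first
    byte is not the string type code.
    """
    field_type, field_id = struct.unpack_from(">bh", payload, 0)
    if field_type != THRIFT_TYPE_STRING:
        raise ValueError(f"first field has type {field_type}, expected a string (11)")
    (length,) = struct.unpack_from(">i", payload, 3)
    value = payload[7:7 + length].decode("utf-8")
    return field_type, field_id, value


# First bytes of the sample shown above, as a bytes literal.
sample = b"\x0b\x00d\x00\x00\x00\r42.106.193.13\n\x00\xc8"
print(read_first_string_field(sample))  # (11, 100, '42.106.193.13')
```

If the payload had been converted to its printed text form at some point (a string containing a literal backslash followed by `x0b`), the first byte would be 92 rather than 11 and the helper would raise `ValueError`, so the same check can help tell raw bytes apart from their textual representation.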
Thanks