Re: [geomesa-users] Error while parsing action - only when multithreaded

In general yes, it's safer to keep client and server in sync. I believe that things should still work if your clients are 1.3.2-SNAPSHOT and your server is 1.3.1, but I'm not 100% sure.

Thanks,

Emilio

On 04/26/2017 11:43 AM, David Boyd wrote:

Emilio:

   Thanks a bunch. So I will have to think about versioning and updates.
I take it I need to update both what is built into my clients and what is
on my Accumulo cluster?



On 4/26/17 8:47 AM, Emilio Lahr-Vivaz wrote:
Hi David,

Glad to help! I believe the issue you're seeing has already been identified and fixed in current master:

https://github.com/locationtech/geomesa/commit/893549ea47cd07bfcd8257087e200c37609fa312

This will be included in the next release (1.3.2), which should be out in the next week or so. Alternatively, you can use version 1.3.2-SNAPSHOT and include our snapshot repository in your pom:

  <repository>
    <id>geomesa-snapshots</id>
    <url>https://repo.locationtech.org/content/repositories/geomesa-snapshots</url>
    <releases>
      <enabled>false</enabled>
    </releases>
    <snapshots>
      <enabled>true</enabled>
    </snapshots>
  </repository>


To explain: we track various stats on ingested data to aid with query planning and counts. The bug is in merging stats together, which is why you don't see it with a single-threaded ingest, where there is nothing to merge.
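
As an illustration only, here is a minimal Java sketch of why merging comes into play once there is more than one writer. The class `MinMaxSketch` and its methods are hypothetical and are not GeoMesa's stat classes; the sketch assumes each writer keeps its own min/max summary and that the summaries are combined afterwards.

```java
// Hypothetical per-writer min/max summary; not GeoMesa's actual implementation.
class MinMaxSketch {
    // Start with an "empty" range so the first observed value sets both bounds.
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    // Record one value seen by this writer.
    void observe(double value) {
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    // Fold another writer's summary into this one.
    // With a single writer this method is never called.
    void merge(MinMaxSketch other) {
        min = Math.min(min, other.min);
        max = Math.max(max, other.max);
    }

    double min() { return min; }
    double max() { return max; }

    public static void main(String[] args) {
        // Two writers, each summarizing the longitudes it ingested.
        MinMaxSketch writerA = new MinMaxSketch();
        writerA.observe(-77.5);
        writerA.observe(12.0);

        MinMaxSketch writerB = new MinMaxSketch();
        writerB.observe(151.2);
        writerB.observe(-122.4);

        // The combined bounds only exist after a merge step.
        writerA.merge(writerB);
        System.out.println("min=" + writerA.min() + " max=" + writerA.max());
    }
}
```

A fault in a merge step like this would only show up when several summaries are combined, which is consistent with the error appearing only in the multi-threaded runs.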

Thanks,

Emilio

On 04/25/2017 07:22 PM, David Boyd wrote:
All:

   First a big thank you to everyone on this list for responding to my issues.
Special shout out to Emilio who helped me through several issues.

I have another weird one.

My application reads lines of data from a Kafka topic and creates objects and features
from those lines.

The code is set up to support multiple threads (each with its own runnable) to get some
parallelism (eventually this will be a Spark job, but this was quickest).

In any case, when I run my data file through with only a single thread persisting,
everything runs clean.

However, when I set up multiple threads I get this error:

2017-04-25 15:36:10,184 | ERROR | [Thread-12] | (GDELT_Consumer.java:122) - Error while parsing action 'stat/OneOrMore/ZeroOrMore/Sequence/org$locationtech$geomesa$utils$stats$StatParser$$singleStat/org$locationtech$geomesa$utils$stats$StatParser$$histogram/org$locationtech$geomesa$utils$stats$StatParser$$histogramAction1' at input position (line 1, pos 998):
Count();MinMax("AgentGeoLocation");MinMax("recordKey");MinMax("AgentCode");MinMax("AgentName");MinMax("NameMetaphone");MinMax("AgentCountryCode");MinMax("AgentKnownGroupCode");MinMax("AgentEthnicCode");MinMax("AgentReligion1Code");MinMax("AgentReligion2Code");MinMax("AgentType1Code");MinMax("AgentType2Code");MinMax("AgentType3Code");MinMax("AgentGeoType");MinMax("AgentGeoFullname");MinMax("AgentGeoCountryCode");MinMax("AgentGeoADM1Code");MinMax("AgentGeoADM2Code");MinMax("AgentGeoFeatureID");TopK("recordKey");TopK("AgentCode");TopK("AgentName");TopK("NameMetaphone");TopK("AgentCountryCode");TopK("AgentKnownGroupCode");TopK("AgentEthnicCode");TopK("AgentReligion1Code");TopK("AgentReligion2Code");TopK("AgentType1Code");TopK("AgentType2Code");TopK("AgentType3Code");TopK("AgentGeoType");TopK("AgentGeoFullname");TopK("AgentGeoCountryCode");TopK("AgentGeoADM1Code");TopK("AgentGeoADM2Code");TopK("AgentGeoFeatureID");Histogram("AgentGeoLocation",10000,"POINT (-190 -100)","POINT (-170 -80)");Histogram("recordKey",1000,"00251fe1-1754-46a9-bec0-efbbfc558579","ff7c316b-6245-4838-a9d2-bfb73eb2cd6a");Histogram("AgentCode",1000,"AFG","tam");Histogram("AgentName",1000,"ABBOT","ZIMBABWE");Histogram("NameMetaphone",1000,"0F","XNSP");Histogram("AgentCountryCode",1000,"AFG","ZWE");Histogram("AgentKnownGroupCode",1000,"UNO0","UNOz");Histogram("AgentEthnicCode",1000,"chm","tam");Histogram("AgentReligion1Code",1000,"BUD","MOS");Histogram("AgentReligion2Code",1000,"CPT","LDS");Histogram("AgentType1Code",1000,"AGR","UAF");Histogram("AgentType2Code",1000,"BUS","MIL");Histogram("AgentType3Code",1000,"EDU","LAB");Histogram("AgentGeoType",1000,"0","5");Histogram("AgentGeoFullname",1000,"Acton, Shropshire, United Kingdom","Zimbabwe");Histogram("AgentGeoCountryCode",1000,"AE","ZI");Histogram("AgentGeoADM1Code",1000,"AE03","ZI");Histogram("AgentGeoADM2Code",1000,"12608","WA061");Histogram("AgentGeoFeatureID",1000,"-1201139","ZI");Frequency("recordKey",20);Frequency("AgentCode",20);Frequency("AgentName",20);Frequency("NameMetaphone",20);Frequency("AgentCountryCode",20);Frequency("AgentKnownGroupCode",20);Frequency("AgentEthnicCode",20);Frequency("AgentReligion1Code",20);Frequency("AgentReligion2Code",20);Frequency("AgentType1Code",20);Frequency("AgentType2Code",20);Frequency("AgentType3Code",20);Frequency("AgentGeoType",1);Frequency("AgentGeoFullname",20);Frequency("AgentGeoCountryCode",20);Frequency("AgentGeoADM1Code",20);Frequency("AgentGeoADM2Code",20);Frequency("AgentGeoFeatureID",20)
^

java.lang.IllegalArgumentException: requirement failed: Value out of bounds ([-180.0 180.0]): -190.0
org.parboiled.errors.ParserRuntimeException: Error while parsing action 'stat/OneOrMore/ZeroOrMore/Sequence/org$locationtech$geomesa$utils$stats$StatParser$$singleStat/org$locationtech$geomesa$utils$stats$StatParser$$histogram/org$locationtech$geomesa$utils$stats$StatParser$$histogramAction1' at input position (line 1, pos 998):

Now, I put debug statements in my program to dump out all the coordinates, and there are no -190 values.
I even put fencing in my code, with logging, to prevent out-of-range values.
I examined the file and no -190 coordinate exists.
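
For readers following along, a fence of the kind described might look roughly like the sketch below. The class and method names are hypothetical and are not taken from the application in question; the bounds are the usual longitude/latitude limits, matching the `[-180.0 180.0]` range reported in the exception.

```java
// Hypothetical coordinate fence; names are illustrative only.
final class CoordinateFence {
    private CoordinateFence() {}

    // Returns true only for finite coordinates inside the valid lon/lat range.
    // NaN fails every comparison, so it is rejected as well.
    static boolean inBounds(double lon, double lat) {
        return lon >= -180.0 && lon <= 180.0
            && lat >= -90.0 && lat <= 90.0;
    }

    public static void main(String[] args) {
        double[][] samples = {
            {-77.5, 38.7},    // ordinary point
            {-190.0, -100.0}  // the values that appear in the error message
        };
        for (double[] p : samples) {
            if (!inBounds(p[0], p[1])) {
                // A real consumer would log and skip the record here.
                System.err.println("Out-of-range coordinate skipped: " + p[0] + ", " + p[1]);
            }
        }
    }
}
```

If a check like this never fires on the input data, the out-of-range value has to originate somewhere after the features leave the application code.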

It looks to be some sort of timing issue related to the indexes getting updated but I have no clue.

It does not occur at the same place or the same number of times in the file.

Full stack trace starts at line 4850 of the attached file.

Any thoughts?

Email me and I will find some way to send source and/or a shaded jar for debugging.





_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users




-- 
========= mailto:dboyd@xxxxxxxxxxxxxxxxx ============
David W. Boyd                     
VP,  Data Solutions       
10432 Balls Ford, Suite 240  
Manassas, VA 20109         
office:   +1-703-552-2862        
cell:     +1-703-402-7908
============== http://www.incadencecorp.com/ ============
ISO/IEC JTC1 WG9, editor ISO/IEC 20547 Big Data Reference Architecture
Chair ANSI/INCITS TC Big Data
Co-chair NIST Big Data Public Working Group Reference Architecture
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
Board Member- USSTEM Foundation - www.usstem.org


