Re: [geomesa-users] odd ingest error, stuck on debugging it

Hi Diane,

GEOMESA-1375 was for escaping strings when we encode them in the stats, but I don't think that would fix your problem. I believe it was GEOMESA-1384 that fixed your issue, which was merged in 1.2.5: geomesa.atlassian.net/browse/GEOMESA-1384

The issue was caused by 'empty' geometries: calling .getCentroid on one returns an empty point, which was then encoded as the string POINT(? ?). That string is not valid WKT, so it could not be parsed back once the stat had been encoded as a string. When we create a new MinMax or Histogram stat for tracking ingested features, we pass in the current bounds for the attribute like so: MinMax("name", "amy", "xavier"). For a geometry, the stat contains the bounding box as two WKT points: MinMax("geom", "POINT(-180 -90)", "POINT(180 90)"). Once an invalid geometry is set as the min or max, the string becomes invalid, the stat updater fails, and ingestion fails with it. That is why deleting the existing stats is required to fix the issue.
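To make the failure mode concrete, here is a minimal, self-contained sketch of the round trip. It is not GeoMesa or JTS code: the StatRoundTrip class, its toy Point type, and the encode/decode helpers are all hypothetical, and writing "?" for a NaN coordinate is only an assumption made to mimic the POINT(? ?) string described above.

```java
final class StatRoundTrip {

    /** Toy 2D point (hypothetical); NaN coordinates stand in for an "empty" geometry. */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Encodes a point as WKT-like text, writing "?" for NaN (assumption, see lead-in). */
    static String encode(Point p) {
        return "POINT(" + number(p.x) + " " + number(p.y) + ")";
    }

    private static String number(double d) {
        return Double.isNaN(d) ? "?" : Double.toString(d);
    }

    /** Parses the text back into a point; throws if a coordinate is not a number. */
    static Point decode(String wkt) {
        if (!wkt.startsWith("POINT(") || !wkt.endsWith(")")) {
            throw new IllegalArgumentException("Not a point: " + wkt);
        }
        String[] parts = wkt.substring(6, wkt.length() - 1).trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected two coordinates: " + wkt);
        }
        try {
            return new Point(Double.parseDouble(parts[0]), Double.parseDouble(parts[1]));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid number in " + wkt, e);
        }
    }

    public static void main(String[] args) {
        Point valid = new Point(-180, -90);
        Point empty = new Point(Double.NaN, Double.NaN);

        // A valid bound survives the write/read cycle.
        decode(encode(valid));

        // An "empty" bound is written without complaint but cannot be read back,
        // so the failure only appears on the next update of the stored stat.
        String stored = encode(empty);
        try {
            decode(stored);
        } catch (IllegalArgumentException e) {
            System.out.println("read-back failed for " + stored + ": " + e.getMessage());
        }
    }
}
```

The point of the sketch is the asymmetry: the write succeeds and the error surfaces later, on read, which matches a record that ingests fine into a fresh catalog but fails against one whose stats already hold the bad value.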

You could probably find the error in an existing stat by using the geomesa tools to interrogate them. However, we don't really offer any way to fix a stat other than deleting and re-generating it. You could do it, but it would require calling the stats API directly in Java.
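If you do manage to get a stat out as a string, a small check like the one below could flag the bad bound. This is only a sketch under assumptions: it presumes the stat text embeds points in the MinMax("geom", "POINT(-180 -90)", "POINT(180 90)") form shown above, and StatStringCheck and invalidPoints are hypothetical names, not part of GeoMesa.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StatStringCheck {

    // Matches POINT(...) literals and captures the text between the parentheses.
    private static final Pattern POINT = Pattern.compile("POINT\\s*\\(([^)]*)\\)");

    /** Returns every POINT literal whose coordinates are not two finite numbers. */
    static List<String> invalidPoints(String stat) {
        List<String> bad = new ArrayList<>();
        Matcher m = POINT.matcher(stat);
        while (m.find()) {
            String[] coords = m.group(1).trim().split("\\s+");
            if (coords.length != 2 || !isFinite(coords[0]) || !isFinite(coords[1])) {
                bad.add(m.group());
            }
        }
        return bad;
    }

    private static boolean isFinite(String s) {
        try {
            return Double.isFinite(Double.parseDouble(s));
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String stat = "MinMax(\"geom\", \"POINT(? ?)\", \"POINT(180 90)\")";
        // Prints the literals that would fail to parse as points.
        System.out.println(invalidPoints(stat));
    }
}
```

This only locates the offending value; repairing it in place would still mean going through the stats API, so deleting and re-generating remains the practical fix.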

Thanks,

Emilio

On 12/08/2016 02:01 PM, Diane Griffith wrote:

Emilio,

 

That did seem to allow me to ingest a small sample again.  Larger ingest will happen shortly.  I tested this as the fix on the 1.2.6 instance.

 

The thing is, I am pretty sure that the 1.2.6 instance had that catalog dropped and re-ingested while it was on 1.2.6, before this stats problem showed up.  It showed up around the same time for both the 1.2.6 and 1.2.3 instances.  I have no way of knowing what value was a problem in the stats though.  So do you think 1.2.7 had any additional stats changes we should consider for when we are able to upgrade?  I saw “GEOMESA-1375 – Stats string created for GeoMesa stats may be incorrect” in the 1.2.7 release notes.  Could that lead to the type of stacktrace I was seeing?

 

Is there a way to find out what is causing the failure in the stats?  A way for us to find the error in the existing stat? 

 

Thanks,

Diane

From: geomesa-users-bounces@xxxxxxxxxxxxxxxx [mailto:geomesa-users-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Emilio Lahr-Vivaz
Sent: Thursday, December 08, 2016 11:53 AM
To: Geomesa User discussions
Subject: Re: [geomesa-users] odd ingest error, stuck on debugging it

 

Hi Diane,

I would guess that the error is in an existing stat that has already been written, which is why it works on a fresh ingestion. We have had some bugs in handling of unexpected values in our stats implementation - I believe that these have all been fixed by 1.2.6. When you ingest new data, it reads the existing stats and attempts to update them. To fix the problem, you can delete the existing stats and then re-generate them from the command-line tools.

First, delete the existing stats using the accumulo shell:

$ table <catalog>_stats
$ deletemany -f
$ compact

Wait for the compaction to finish, then use the geomesa tools to run the 'stats-analyze' command:

http://www.geomesa.org/documentation/user/commandline_tools.html#stats-analyze

Hope that helps - let us know!

Thanks,

Emilio

On 12/08/2016 11:25 AM, Diane Griffith wrote:

We have been ingesting data fine to existing catalogs for a while across different geomesa versions (so after upgrades).  Recently one of our catalogs has started failing ingests of new data.  There was no code change, and no one believes there was a schema or input file change.  I have compared the feature schema in the accumulo catalog table to the schema being used to create the simple feature for the ingest process and I see no difference (again, no one thinks a change happened here).  I have one record that reproduces the failure and have inspected all of its columns against the types in the simple feature type/catalog table schema.  Again, nothing looks wrong.

 

I then tried to ingest that same record that will fail against the existing catalog to a new catalog/table and it successfully ingests.

 

We are using Java to ingest, and the only error message I get on the failure (I may need to see whether I can turn on additional geomesa logging) is:

 

com.vividsolutions.jts.io.ParseException: Invalid number: ? (line 1)
  at com.vividsolutions.jts.io.WKTReader.parseErrorWithLine(WKTReader.java:427)
  at com.vividsolutions.jts.io.WKTReader.getNextNumber(WKTReader.java:298)
  at com.vividsolutions.jts.io.WKTReader.getPreciseCoordinate(WKTReader.java:257)
  at com.vividsolutions.jts.io.WKTReader.readPointText(WKTReader.java:515)
  at com.vividsolutions.jts.io.WKTReader.readGeometryTaggedText(WKTReader.java:472)
  at com.vividsolutions.jts.io.WKTReader.read(WKTReader.java:205)
  at com.vividsolutions.jts.io.WKTReader.read(WKTReader.java:174)
  at org.locationtech.geomesa.utils.text.WKTUtils$class.read(WKTUtils.scala:48)
  at org.locationtech.geomesa.utils.text.WKTUtils$.read(WKTUtils.scala:65)
  at org.locationtech.geomesa.utils.stats.Stat$StatParser$$anonfun$histogramParser$9.apply(Stat.scala:417)
  at org.locationtech.geomesa.utils.stats.Stat$StatParser$$anonfun$histogramParser$9.apply(Stat.scala:397)
  at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:137)

….

 

Our schema has 2 geometries, a point geometry and a polygon geometry.  The point is the default geometry.  I believe the data is fine for creating both the Point geometry and the Polygon geometry.  Again I can ingest a failing record just fine to a new table name/catalog. 

 

This is happening on geomesa 1.2.6 as well as 1.2.3 instances.

 

The stacktrace was long, but further down it seems to fail while applying the stats on ingest.  Then again, I may be interpreting the stack trace wrong; I just know the geometry is fine if I ingest the record to a different table name.

 

Any idea how to get more information on why it is failing for an existing table name/catalog but not for a new table name/catalog (same record and schema)?  Is there any logging I can turn on to get more info, or something I can look for on the catalog through the command line tools interface?

 

Thanks,

Diane

 




_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://www.locationtech.org/mailman/listinfo/geomesa-users

 


