Re: [geomesa-users] Null geometry field values

Hi David,

For 2, data types without native geometries are already supported (or they should be... it's not our normal use case so it's not tested very extensively).

For 1, allowing null geometries creates some technical issues. I'm including the dev list for a more technical discussion.

One option would be to just not include features with null geometries in the spatial indices, and not include features with null geometries and/or null dates in the spatio-temporal index. However, our attribute indexing also includes a secondary spatio-temporal index, so we wouldn't be able to index them there either... the only way to retrieve the features would be by feature ID, which would hit the record table. Any other query would not return them unless the record table was specifically requested via query hint.
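To make the trade-off of this first option concrete, here is a toy model in plain Python. It is only a sketch under stated assumptions: `ToyStore`, `Feature`, and the `use_record_table` flag are hypothetical names invented for illustration, not GeoMesa classes or query hints, and a simple list stands in for the real index tables.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Feature:
    fid: str
    geom: Optional[Tuple[float, float]]  # (lon, lat) point, or None for a null geometry
    name: str


class ToyStore:
    """Hypothetical stand-in: one record table plus indices that skip null geometries."""

    def __init__(self) -> None:
        self.records: Dict[str, Feature] = {}  # "record table": every feature, keyed by ID
        self.indexed: List[Feature] = []       # spatial/attribute indices: geometry required

    def write(self, f: Feature) -> None:
        self.records[f.fid] = f
        if f.geom is not None:  # option one: null geometries are simply not indexed
            self.indexed.append(f)

    def by_id(self, fid: str) -> Optional[Feature]:
        return self.records.get(fid)

    def bbox(self, xmin: float, ymin: float, xmax: float, ymax: float) -> List[Feature]:
        return [f for f in self.indexed
                if xmin <= f.geom[0] <= xmax and ymin <= f.geom[1] <= ymax]

    def by_name(self, name: str, use_record_table: bool = False) -> List[Feature]:
        # Without the (hypothetical) hint, the query only sees indexed features.
        source = self.records.values() if use_record_table else self.indexed
        return [f for f in source if f.name == name]


store = ToyStore()
store.write(Feature("a", (10.0, 20.0), "x"))
store.write(Feature("b", None, "x"))
```

In this model, `store.by_name("x")` misses feature `b` even though its attribute matches; only `store.by_id("b")` or an explicit record-table scan finds it, which is the limitation described above.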

Currently, scans without any reasonable index are run against the Z3 (or Z2) table; we do that to help the Accumulo block cache. In this scenario, we could instead scan the record table. That would help with some queries, but attribute queries would still not return the features correctly.

A second option is to index the features with some kind of marker row (I believe this is what you're suggesting). For our spatial index this could possibly work fairly seamlessly, as any query with a spatial predicate would exclude the marker rows, while an inclusive predicate would scan the whole table. Queries with just a temporal component, however, would need special handling. We would have to add additional scan ranges to check for the null geometries in the spatio-temporal index. This would potentially slow down queries for all users, even if their data doesn't have nulls.
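A rough sketch of the marker-row idea, again as a toy model rather than the actual index layout: rows are sorted `(week, cell, fid)` tuples, a 10-degree grid stands in for the space-filling curve, and `NULL_CELL` is a hypothetical reserved value for null geometries. All names here are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right, insort

NULL_CELL = 1_000_000  # hypothetical marker value, outside the valid cell range 0..647


def cell_of(geom):
    """Map a (lon, lat) point to a 10-degree grid cell, or to the marker for None."""
    if geom is None:
        return NULL_CELL
    lon, lat = geom
    col = min(int((lon + 180) // 10), 35)
    row = min(int((lat + 90) // 10), 17)
    return row * 36 + col


class ToyIndex:
    def __init__(self):
        self.rows = []  # sorted (week, cell, fid) keys

    def write(self, fid, week, geom):
        insort(self.rows, (week, cell_of(geom), fid))

    def scan(self, ranges):
        """Return feature IDs for each inclusive (lo, hi) key range, in range order."""
        out = []
        for lo, hi in ranges:
            i, j = bisect_left(self.rows, lo), bisect_right(self.rows, hi)
            out.extend(fid for _, _, fid in self.rows[i:j])
        return out


def spatial_ranges(week, cells):
    # A spatial predicate only yields ranges over real cells, so marker rows are skipped.
    return [((week, c, ""), (week, c, "\uffff")) for c in cells]


def temporal_ranges(week, include_nulls):
    # A temporal-only query covers every real cell for the week...
    ranges = [((week, 0, ""), (week, 647, "\uffff"))]
    if include_nulls:
        # ...and needs one extra range per time bin to pick up the marker rows.
        ranges.append(((week, NULL_CELL, ""), (week, NULL_CELL, "\uffff")))
    return ranges


idx = ToyIndex()
idx.write("a", 5, (10.0, 20.0))
idx.write("b", 5, None)
idx.write("c", 6, None)
```

The extra range in `temporal_ranges` is issued whether or not any marker rows exist, which is the per-query overhead mentioned above for users whose data has no nulls.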

Does anyone have any better suggestions? Comments?

Thanks,

Emilio

On 09/15/2017 10:08 AM, David Boyd wrote:
All:

I remember discussing this once before but want to follow up to see if there is a workaround or
something I am missing.

Currently, GeoMesa requires every feature type/schema to have a geometry field with a valid value. This makes sense
since that is the primary element of the indexing approach.

I have two issues that I need to address:

1. Data with empty or null geometry values.
2. Data types without native geometry fields.

The first instance is the most common. Currently my hack for that stores either a point at the south pole
or a very small bounding box at the south pole.

However, this bothers some people and even me to some extent because it is fake data, and can get returned
in a geometry search.
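A minimal illustration of that problem in plain Python (the feature IDs, coordinates, and helper names are made up for the example):

```python
SOUTH_POLE = (0.0, -90.0)  # placeholder stored when a record has no real geometry

features = {
    "station-1": (166.7, -77.8),  # a real Antarctic location
    "no-geom-1": SOUTH_POLE,      # record with no geometry, stored with the placeholder
}


def bbox_query(xmin, ymin, xmax, ymax):
    """Return the IDs of all features whose point falls inside the bounding box."""
    return sorted(fid for fid, (lon, lat) in features.items()
                  if xmin <= lon <= xmax and ymin <= lat <= ymax)


# A search over Antarctica returns the placeholder alongside the real data.
antarctica = bbox_query(-180.0, -90.0, 180.0, -60.0)

# Finding "records without geometry" means matching the sentinel, which cannot be
# told apart from a genuine observation taken exactly at the pole.
without_geometry = sorted(fid for fid, pt in features.items() if pt == SOUTH_POLE)
```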

Are there plans to have GeoMesa accept a null geometry field, or what would it take? It would be a special case in the indexing that would require a specific bucket/partition for those records. But in the end it would be little
different in terms of index distribution than what I am doing now.

The advantage would be that none of this fake data would get returned in queries, and one could actually query for all records with empty geometries. This would work for all supported geometry types.

Thoughts?

FYI - in the second instance I end up adding a fake field with the same fake values.
