Re: [geomesa-dev] [geomesa-users] Null geometry field values
Hi David,
For 2, data types without native geometries are already supported (or
they should be... it's not our normal use case so it's not tested very
extensively).
For 1, allowing null geometries creates some technical issues. I'm
including the dev list for a more technical discussion.
One option would be to just not include features with null geometries in
the spatial indices, and not include features with null geometries
and/or null dates in the spatio-temporal index. However, our attribute
indexing also includes a secondary spatio-temporal index, so we wouldn't
be able to index them there either... the only way to retrieve the
features would be by feature ID, which would hit the record table. Any
other query would not return them unless the record table was
specifically requested via query hint.
Currently, scans without any reasonable index are run against the Z3 (or
Z2) table - we do that to help the Accumulo block cache. In this
scenario, we could instead scan the record table. That would help with
some queries, but attribute queries would still not return the features
correctly.
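
To make the trade-offs of this first option concrete, here is a minimal
sketch in Python. It is a toy model only: the class, the table names and
the "name" attribute are made up for illustration and are not GeoMesa's
actual table layout or API. It assumes that features with null
geometries are written only to the record table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Feature:
    fid: str
    name: str                            # an indexed attribute (hypothetical)
    geom: Optional[Tuple[float, float]]  # (lon, lat), or None for a null geometry


class ToyStore:
    """Toy model of option 1. The tables are stand-ins, not GeoMesa's real layout."""

    def __init__(self) -> None:
        self.record: Dict[str, Feature] = {}           # keyed by feature ID
        self.spatial: List[Feature] = []               # stand-in for the Z2/Z3 tables
        self.attribute: Dict[str, List[Feature]] = {}  # stand-in for the attribute index

    def write(self, f: Feature) -> None:
        self.record[f.fid] = f   # every feature lands in the record table
        if f.geom is None:
            return               # null geometry: skipped by the other indices
        self.spatial.append(f)
        self.attribute.setdefault(f.name, []).append(f)

    def by_id(self, fid: str) -> Optional[Feature]:
        return self.record.get(fid)

    def by_attribute(self, name: str) -> List[str]:
        return sorted(f.fid for f in self.attribute.get(name, []))

    def full_scan(self, use_record_table: bool) -> List[str]:
        # Choose which table backs a query that has no usable index.
        source = self.record.values() if use_record_table else self.spatial
        return sorted(f.fid for f in source)


store = ToyStore()
store.write(Feature("a", "truck", (10.0, 20.0)))
store.write(Feature("b", "truck", None))

print(store.by_id("b") is not None)  # True: ID lookups hit the record table
print(store.full_scan(False))        # ['a']: scanning the spatial table misses "b"
print(store.full_scan(True))         # ['a', 'b']: scanning the record table finds it
print(store.by_attribute("truck"))   # ['a']: attribute queries still miss "b"
```

The last line is the remaining gap: even with full scans moved to the
record table, an attribute query does not see the null-geometry feature.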
A second option is to index the features with some kind of
marker row (I believe this is what you're suggesting). For our spatial
index this could possibly work fairly seamlessly, as any query with a
spatial predicate would exclude the marker rows, while an inclusive
predicate would scan the whole table. Queries with just a temporal
component, however, would need special handling. We would have to add
additional scan ranges to check for the null geometries in the
spatio-temporal index. This would potentially slow down queries for all
users, even if their data doesn't have nulls.
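
As a rough illustration of the extra scan ranges, here is a toy sketch
in Python. The key layout is an assumption made up for this example
(a prefix plus a time bin, with a separate prefix for marker rows); it is
not how the Z3 index is actually keyed. It only shows that a
temporal-only query has to plan a second set of ranges to pick up the
marker rows.

```python
from typing import List, Tuple

Key = Tuple[int, int]     # (prefix, time_bin); a real key would carry much more
Range = Tuple[Key, Key]   # inclusive (start, end)

NORMAL, NULL_GEOM = 0, 1  # hypothetical key prefixes; NULL_GEOM marks null geometries


def temporal_ranges(first_bin: int, last_bin: int, include_null_geoms: bool) -> List[Range]:
    """Plan scan ranges for a query that has only a temporal predicate."""
    bins = range(first_bin, last_bin + 1)
    ranges = [((NORMAL, b), (NORMAL, b)) for b in bins]
    if include_null_geoms:
        # One additional range per time bin to cover the marker rows.
        ranges += [((NULL_GEOM, b), (NULL_GEOM, b)) for b in bins]
    return ranges


def scan(rows: List[Tuple[Key, str]], ranges: List[Range]) -> List[str]:
    """Return the feature IDs whose key falls inside any of the ranges."""
    return sorted(fid for key, fid in rows
                  if any(lo <= key <= hi for lo, hi in ranges))


rows = [((NORMAL, 5), "a"), ((NORMAL, 9), "c"), ((NULL_GEOM, 5), "b")]
without_nulls = temporal_ranges(4, 6, False)
with_nulls = temporal_ranges(4, 6, True)

print(len(without_nulls), scan(rows, without_nulls))  # 3 ['a']
print(len(with_nulls), scan(rows, with_nulls))        # 6 ['a', 'b']
```

In this toy layout the number of ranges doubles whether or not any
marker rows exist, which is the cost to all users mentioned above.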
Does anyone have any better suggestions? Comments?
Thanks,
Emilio
On 09/15/2017 10:08 AM, David Boyd wrote:
All:
I remember discussing this once before but want to follow up to see
if there is a workaround or
something I am missing.
Currently, GeoMesa requires every feature type/schema to have a
geometry field with a valid value. This makes sense
since that is the primary element of the indexing approach.
I have two issues that I need to address:
1. Data with empty or null geometry values.
2. Data types without native geometry fields.
The first instance is the most common. Currently my hack for that
stores either a point at the south pole
or a very small bounding box at the south pole.
However, this bothers some people and even me to some extent because
it is fake data, and can get returned
in a geometry search.
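
A minimal Python sketch of why the placeholder leaks into results. The
function names and the in-memory dictionary are made up for illustration
and are not GeoMesa calls; the sketch assumes the placeholder is a point
at longitude 0, latitude -90.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]       # (lon, lat)
SOUTH_POLE: Point = (0.0, -90.0)  # placeholder standing in for "no geometry"


def store_geom(geom: Optional[Point]) -> Point:
    """Substitute the placeholder when a record has no geometry."""
    return geom if geom is not None else SOUTH_POLE


def bbox_query(points: Dict[str, Point], min_lon: float, min_lat: float,
               max_lon: float, max_lat: float) -> List[str]:
    """Return the IDs of all points inside the (inclusive) bounding box."""
    return sorted(fid for fid, (lon, lat) in points.items()
                  if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat)


points = {"a": store_geom((10.0, 20.0)), "b": store_geom(None)}

print(bbox_query(points, -180, -90, 180, 90))  # ['a', 'b']: the placeholder matches
print(bbox_query(points, 0, 0, 45, 45))        # ['a']
```

Any search whose box reaches the pole returns record "b", even though it
has no real location.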
Are there plans, or what would it take, to have GeoMesa accept a null
geometry field? It would be a special case
in the indexing that would require a specific bucket/partition for
those records. But in the end it would be little
different in terms of index distribution than what I am doing now.
The advantage would be none of this fake data would get returned in
queries and one could actually query for
all records with empty geometries. This would work for all
supported geometry types.
Thoughts?
FYI - in the second instance I end up adding a fake field with the
same fake values.