Re: [geomesa-users] Adaption RowId, Spatial Bin Level

Marcel,

The current HEAD on GitHub will already use the "_z3" index table if
(and only if) your feature's geometry is a POINT.  You shouldn't need to
do any additional work.  Just build the current version, and ingest your
POINT data.

We expect to update the documentation on the web site shortly, but the
quick version is this:  The new table uses a key that is built from a
concatenation of t0 and Z(x, y, t1) where

  t0:  The most significant bits from the date-time

  x:   Longitude of the point

  y:   Latitude of the point

  t1:  The least significant bits from the date-time

  Z:   a 3-dimensional Z-curve
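
To make the key layout above concrete, here is a minimal sketch of how such
a key could be assembled.  It is an illustration only, not GeoMesa's actual
code:  the class and method names are hypothetical, and the 21 bits per
dimension, the week-sized t0 bin, and the 2-byte + 8-byte layout are all
assumptions chosen so that the interleaved value fits in a single long.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of a "t0 + Z(x, y, t1)" row key; not GeoMesa's implementation.
final class Z3KeySketch {
    // Assumption: 21 bits per dimension, so the interleaved value uses 63 bits.
    static final int BITS = 21;
    static final long MAX = (1L << BITS) - 1;
    // Assumption: t0 is the number of whole weeks since the Unix epoch.
    static final long WEEK_MILLIS = 7L * 24 * 60 * 60 * 1000;

    // Map a value in [min, max] onto an integer in [0, 2^BITS - 1].
    static long normalize(double value, double min, double max) {
        double clamped = Math.max(min, Math.min(max, value));
        return (long) ((clamped - min) / (max - min) * MAX);
    }

    // 3-dimensional Z-curve: bit i of x goes to bit 3i, y to 3i+1, t to 3i+2.
    static long interleave3(long x, long y, long t) {
        long z = 0L;
        for (int i = 0; i < BITS; i++) {
            z |= ((x >> i) & 1L) << (3 * i);
            z |= ((y >> i) & 1L) << (3 * i + 1);
            z |= ((t >> i) & 1L) << (3 * i + 2);
        }
        return z;
    }

    // Build the key bytes; assumes epochMillis is non-negative.
    static byte[] rowKey(double lon, double lat, long epochMillis) {
        long week = epochMillis / WEEK_MILLIS;    // t0: most significant part of the time
        long offset = epochMillis % WEEK_MILLIS;  // t1: least significant part of the time
        long x = normalize(lon, -180.0, 180.0);
        long y = normalize(lat, -90.0, 90.0);
        long t = normalize((double) offset, 0.0, (double) WEEK_MILLIS);
        long z = interleave3(x, y, t);
        // Big-endian bytes keep rows sorted by week first, then by Z value.
        return ByteBuffer.allocate(10).putShort((short) week).putLong(z).array();
    }

    public static void main(String[] args) {
        byte[] key = rowKey(-78.5, 38.0, 1442255160000L);
        StringBuilder hex = new StringBuilder();
        for (byte b : key) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex);
    }
}
```

Putting t0 first groups rows by coarse time period, while the interleaved
bits keep points that are close in space and time close together within
each period.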

You are correct that the new version does not use Geohashes any more.

Best of luck.  Thanks for writing!

Sincerely,
  -- Chris


On Mon, 2015-09-14 at 20:26 +0200, Marcel wrote:
> Is there a better way you would recommend for some tests using the _z3 
> table instead of the _st_idx table?
> So if this function is deprecated, I assume the new index structure
> doesn't use z-curves and geohashes anymore?
> Could you tell me some facts about the new index structure?
> 
> Thanks,
> Marcel Jacob.
> 
> 
> Am 14.09.2015 19:30, schrieb Chris Eichelberger:
> > Marcel,
> >
> > One other note that came up from internal discussions about your
> > question:  Customizing the index-schema format is something that we
> > recommend most people do *not* do, because:
> >
> > 1.  this function is deprecated;
> >
> > 2.  it is very easy to introduce errors;
> >
> > 3.  it requires significant background knowledge about your data and
> > your cluster in order to tune the value properly;
> >
> > 4.  unless you have very specialized data, it is probably unnecessary.
> >
> > Sorry for any confusion!
> >
> > Sincerely,
> >    -- Chris
> >
> > On Mon, 2015-09-14 at 12:45 -0400, Chris Eichelberger wrote:
> >> Marcel,
> >>
> >> There is some in-source documentation on how to construct this string
> >> here:
> >>
> >>
> >> https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/index/IndexSchema.scala#L21-L55
> >>
> >> An easier (and better) way is to use the IndexSchemaBuilder class later
> >> in the same file:
> >>
> >>
> >> https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/main/scala/org/locationtech/geomesa/accumulo/index/IndexSchema.scala#L330
> >>
> >> There is an example of how to use this in the unit test:
> >>
> >>
> >> https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-datastore/src/test/scala/org/locationtech/geomesa/accumulo/index/IndexSchemaTest.scala#L174-L183
> >>
> >> Sincerely,
> >>    -- Chris
> >>
> >>
> >> On Mon, 2015-09-14 at 18:13 +0200, Marcel wrote:
> >>> I found the simplified version for the default index-schema format with
> >>> 10 bits:  YXTTYXTTYX
> >>> What exactly does the default index-schema format look like?
> >>>
> >>> featureType.getUserData().put(Constants.SFT_INDEX_SCHEMA, "?");
> >>>
> >>> Thanks,
> >>> Marcel Jacob.
> >>>
> >>>
> >>> Am 14.09.2015 14:56, schrieb Chris Eichelberger:
> >>>> Marcel,
> >>>>
> >>>> Note:  The following discussion, while somewhat interesting, is also
> >>>> deprecated:  The index-schema format described here was specific to the
> >>>> "_st_idx" table that is being replaced by the "_z3" table and index
> >>>> structure that we have found often out-performs the older index.
> >>>>
> >>>> Deprecated "_st_idx" index-schema format...  Table III's "spatial bins"
> >>>> column refers to the number of (base-32-encoded) Geohash characters that
> >>>> appear in the RowID.  Our experience was, generally, that the best
> >>>> performance occurred when the specificity of the RowID was appropriate
> >>>> for the granularity of data being indexed.  That is to say, if you have
> >>>> global data that is nearly uniformly distributed, a single (5-bit,
> >>>> base-32) Geohash character in the RowID may be appropriate; but for data
> >>>> concerning a single metropolitan area, 3-4 Geohash characters in the
> >>>> RowID might provide for better differentiation among rows.  This setting
> >>>> was controlled by what we called the "index-schema format", and while
> >>>> there is a setting to do this, it is not widely publicized.  (You have
> >>>> to set a user-data value for the "geomesa.index.st.schema" key within
> >>>> the simple-feature type when the schema is first registered in GeoMesa.)
> >>>>
> >>>> I hope this helps to address your interest.  If you have additional
> >>>> questions, please continue to share them.
> >>>>
> >>>> Thanks!
> >>>>
> >>>> Sincerely,
> >>>>     -- Chris
> >>>>
> >>>>
> >>>> On Mon, 2015-09-14 at 14:22 +0200, Marcel wrote:
> >>>>> Hello all,
> >>>>> I found in your paper on page 7 (Table III) that increasing the
> >>>>> number of spatial bin levels (maybe from 1 to 2) can have a positive
> >>>>> effect on query response times. Can you tell me where I can set this
> >>>>> adaptation of the RowID for my MapReduce job? Would you recommend a
> >>>>> spatial bin level of 2 or even higher?
> >>>>>
> >>>>> Thanks,
> >>>>> Marcel Jacob.
> >>>>> _______________________________________________
> >>>>> geomesa-users mailing list
> >>>>> geomesa-users@xxxxxxxxxxxxxxxx
> >>>>> To change your delivery options, retrieve your password, or unsubscribe from this list, visit
> >>>>> http://www.locationtech.org/mailman/listinfo/geomesa-users