Hi Emilio,
I have a quick question about an HBase query. I am seeing a lot
of data being scanned for small spatial queries. I ran the
geospatial query below against the OSMNodes table; the query and
table details follow. I am seeing a total of 5,553,421,708 read
requests on HBase, with requests hitting most of the regions and
region servers. Is there a reason why each region was scanned
for this query?
Query:
"DWITHIN(geometry, POINT(-122.332426 47.607282), 50, meters) AND ingestionTimestamp <= '2020-05-27 16:59:31' AND nextTimestamp > '2020-05-27 16:59:31'"
Table Schema:
geomesa-hbase describe-schema -c atlas -f OSMNodes
INFO Describing attributes of feature 'OSMNodes'
geometry | Point (Spatio-temporally indexed)
ingestionTimestamp | Timestamp (Spatio-temporally indexed)
nextTimestamp | Timestamp
serializerVersion | String
featurePayload | String
User data:
geomesa.index.dtg | ingestionTimestamp
geomesa.indices | z3:6:3:geometry:ingestionTimestamp,id:4:3:
geomesa.stats.enable | true
geomesa.z.splits | 60
Query plan (via the GeoMesa CLI):
geomesa-hbase explain -c atlas -f OSMNodes -q "DWITHIN(geometry, POINT(-122.332426 47.607282), 50, meters) AND ingestionTimestamp <= '2020-05-27 16:59:31' AND nextTimestamp > '2020-05-27 16:59:31'"
Planning 'OSMNodes' (DWITHIN(geometry, POINT (-122.332426
47.607282), 50.0, meters) AND ingestionTimestamp <=
2020-05-27T16:59:31+00:00) AND nextTimestamp >
2020-05-27T16:59:31+00:00
Original filter: (DWITHIN(geometry, POINT (-122.332426
47.607282), 50.0, meters) AND ingestionTimestamp <=
'2020-05-27 16:59:31') AND nextTimestamp > '2020-05-27
16:59:31'
Hints: bin[false] arrow[false] density[false] stats[false]
sampling[none]
Sort: none
Transforms: none
Strategy selection:
Query processing took 17ms for 1 options
Filter plan:
FilterPlan[Z3Index(geometry,ingestionTimestamp)[DWITHIN(geometry,
POINT (-122.332426 47.607282), 50.0, meters) AND
ingestionTimestamp <=
2020-05-27T16:59:31+00:00][nextTimestamp >
2020-05-27T16:59:31+00:00]]
Strategy selection took 1ms for 1 options
Strategy 1 of 1: Z3Index(geometry,ingestionTimestamp)
Strategy filter:
Z3Index(geometry,ingestionTimestamp)[DWITHIN(geometry, POINT
(-122.332426 47.607282), 50.0, meters) AND
ingestionTimestamp <=
2020-05-27T16:59:31+00:00][nextTimestamp >
2020-05-27T16:59:31+00:00]
Geometries: FilterValues(List(POLYGON
((-122.3317610175119 47.607282, -122.33177379496394
47.60715226835226, -122.33181163628976 47.607027522218985,
-122.33187308726842 47.606912555524126, -122.33195578637329
47.606811786373285, -122.33205655552413 47.60672908726843,
-122.33217152221899 47.60666763628976, -122.33229626835225
47.606629794963936, -122.332426 47.606617017511894,
-122.33255573164774 47.606629794963936, -122.33268047778101
47.60666763628976, -122.33279544447586 47.60672908726843,
-122.33289621362671 47.606811786373285, -122.33297891273158
47.606912555524126, -122.33304036371024 47.607027522218985,
-122.33307820503606 47.60715226835226, -122.3330909824881
47.607282, -122.33307820503606 47.60741173164774,
-122.33304036371024 47.60753647778101, -122.33297891273158
47.60765144447587, -122.33289621362671 47.60775221362671,
-122.33279544447586 47.60783491273157, -122.33268047778101
47.60789636371024, -122.33255573164774 47.60793420503606,
-122.332426 47.6079469824881, -122.33229626835225
47.60793420503606, -122.33217152221899 47.60789636371024,
-122.33205655552413 47.60783491273157, -122.33195578637329
47.60775221362671, -122.33187308726842 47.60765144447587,
-122.33181163628976 47.60753647778101, -122.33177379496394
47.60741173164774, -122.3317610175119
47.607282))),true,false)
Intervals:
FilterValues(List((-∞,2020-05-27T16:59:31Z]),true,false)
Plan: ScanPlan
Tables:
atlas_OSMNodes_z3_geometry_ingestionTimestamp_v6
Ranges (7440):
[%00;%0a;E$A%08;%00;%00;%00;%00;%00;::%00;%0a;E$A%0c;],
[%01;%0a;E$A%08;%00;%00;%00;%00;%00;::%01;%0a;E$A%0c;],
[%02;%0a;E$A%08;%00;%00;%00;%00;%00;::%02;%0a;E$A%0c;],
[%03;%0a;E$A%08;%00;%00;%00;%00;%00;::%03;%0a;E$A%0c;],
[%04;%0a;E$A%08;%00;%00;%00;%00;%00;::%04;%0a;E$A%0c;]
Scans (120):
['%0a;ElA%98;%00;%00;%00;%00;%00;::'%0a;Ema%8c;],
[:%0a;ElA%98;%00;%00;%00;%00;%00;:::%0a;Ema%8c;],
[%14;::%14;%0a;ElA%8c;],
[(%0a;ElA%98;%00;%00;%00;%00;%00;::(%0a;Ema%8c;],
[%12;%0a;ElA%98;%00;%00;%00;%00;%00;::%12;%0a;Ema%8c;]
Column families: d
Remote filters: MultiRowRangeFilter,
Z3HBaseFilter[(epoch,2629:2629),(zt,0:2009670),(zxy,335934:1603233:335941:1603248)],
CqlFilter[(DWITHIN(geometry, POINT (-122.332426 47.607282),
50.0, meters) AND ingestionTimestamp <=
2020-05-27T16:59:31+00:00) AND nextTimestamp >
2020-05-27T16:59:31+00:00]
Plan creation took 135ms
Query planning took 433ms
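
As a sanity check on the Z3HBaseFilter values in the plan above, here is a rough Python sketch that recomputes them from the query envelope and the upper time bound. It is not GeoMesa code: the helper names (normalize, z3_filter_bounds) are made up for illustration, and it assumes weekly time bins counted from the Unix epoch and 21 bits of precision per Z3 dimension, neither of which is stated in the explain output.

```python
from datetime import datetime, timezone

# Assumptions (not stated in the explain output): weekly time bins counted
# from the Unix epoch, and 21 bits of precision per Z3 dimension.
BITS = 21
WEEK_SECONDS = 7 * 24 * 3600


def normalize(value, lo, hi, bits=BITS):
    """Map a value in [lo, hi) onto an integer cell index in [0, 2**bits)."""
    return int((value - lo) / (hi - lo) * (1 << bits))


def z3_filter_bounds(lon_min, lat_min, lon_max, lat_max, upper):
    """Return (epoch, zt_max, (x_min, y_min, x_max, y_max)) for an upper time bound."""
    seconds = int(upper.timestamp())
    # Weekly bin number and the offset in seconds inside that bin.
    epoch, offset = divmod(seconds, WEEK_SECONDS)
    zt_max = normalize(offset, 0, WEEK_SECONDS)
    zxy = (
        normalize(lon_min, -180.0, 180.0),
        normalize(lat_min, -90.0, 90.0),
        normalize(lon_max, -180.0, 180.0),
        normalize(lat_max, -90.0, 90.0),
    )
    return epoch, zt_max, zxy


# Envelope of the buffered polygon and the upper time bound, copied from the plan.
epoch, zt_max, zxy = z3_filter_bounds(
    -122.3330909824881, 47.606617017511894,
    -122.3317610175119, 47.6079469824881,
    datetime(2020, 5, 27, 16, 59, 31, tzinfo=timezone.utc),
)
print(epoch, zt_max, zxy)

# The two bytes after the leading byte in the range keys (%0a;E) read as a
# big-endian short. Treating them as the time bin is a guess on my part.
key_epoch = int.from_bytes(bytes([0x0A, ord("E")]), "big")
print(key_epoch)
```

Under those assumptions the values line up with the Z3HBaseFilter line above (epoch 2629, zt up to 2009670, and the zxy box), which suggests the remote filter covers a single weekly bin and a small spatial box, even though the plan lists 7440 ranges.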
--
Regards,
Amit Kumar Srivastava