I have a GeoMesa table hosted on an HBase cluster. After
switching from GeoMesa 2.4.1 to 3.0.0, some queries started to
fail with a "ZooKeeper session timeout" error.
Stack trace from my app:
1604525339987,"java.util.NoSuchElementException: Could not
obtain the next
Failed after attempts=3, exceptions:"
1604525339987,"Wed Nov 04 20:55:03 UTC 2020,
RpcRetryingCaller{globalStartTime=1604523178583, pause=100,
retries=3}, java.io.IOException: Call to
ip-10-0-22-145.ec2.internal/ failed on local
exception: org.apache.hadoop.hbase.ipc.CallTimeoutException:
Call id=5864, waitTime=73235, rpcTimetout=60000"
1604525339987,"Wed Nov 04 20:56:19 UTC 2020,
RpcRetryingCaller{globalStartTime=1604523178583, pause=100,
Can't get the location for replica 0"
1604525339987,"Wed Nov 04 20:57:48 UTC 2020,
RpcRetryingCaller{globalStartTime=1604523178583, pause=100,
retries=3}, java.io.IOException: Call to
ip-10-0-22-145.ec2.internal/ failed on local
exception: org.apache.hadoop.hbase.ipc.CallTimeoutException:
Call id=5869, waitTime=66664, rpcTimetout=60000"
1604525339987, at
1604525339987, at
1604525339987, at
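As a quick sanity check on the trace above, here is a minimal Python sketch that pulls the wait time and RPC timeout out of the CallTimeoutException messages and reports by how much each call overran. The regex and the function name `timeout_overruns` are my own illustration, not part of GeoMesa or HBase; "rpcTimetout" is spelled the way the log prints it.

```python
import re

# Matches the CallTimeoutException detail as it appears in the trace.
# "rpcTimetout" is intentionally spelled as in the log output.
CALL_RE = re.compile(r"Call id=(\d+), waitTime=(\d+), rpcTimetout=(\d+)")


def timeout_overruns(text):
    """Return (call_id, wait_ms, timeout_ms, overrun_ms) for each match."""
    results = []
    for call_id, wait, timeout in CALL_RE.findall(text):
        wait_ms, timeout_ms = int(wait), int(timeout)
        results.append((int(call_id), wait_ms, timeout_ms, wait_ms - timeout_ms))
    return results


# The two timeout messages copied from the stack trace.
sample = (
    "Call id=5864, waitTime=73235, rpcTimetout=60000\n"
    "Call id=5869, waitTime=66664, rpcTimetout=60000\n"
)

for row in timeout_overruns(sample):
    print(row)
```

Both calls waited longer than the 60000 ms RPC timeout reported in the trace, so the failures look like scans running past the client-side timeout rather than connection refusals.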
I also found these warnings in the log:
1604523293568,"04 Nov 2020 20:54:11,817 [WARN]
org.apache.zookeeper.ClientCnxn: Client session timed out,
have not heard from server in 62688ms for sessionid
1604523335819,"04 Nov 2020 20:55:35,795 [WARN]
org.apache.zookeeper.ClientCnxn: Client session timed out,
have not heard from server in 42227ms for sessionid
1604523368963,"04 Nov 2020 20:56:08,963 [WARN]
org.apache.zookeeper.ClientCnxn: Unable to reconnect to
ZooKeeper service, session 0x47045cd3d has expired"
1604523368963,"04 Nov 2020 20:56:08,963 [WARN]
This client just lost it's session with ZooKeeper, closing it.
It will be recreated next time someone needs it"
1604523379744,"04 Nov 2020 20:56:19,742 [WARN]
(pool-9-thread-18) org.apache.hadoop.hbase.zookeeper.ZKUtil:
quorum=ip-10-0-21-114.ec2.internal:2181, baseZNode=/hbase
Unable to get data of znode
1604523379748,"04 Nov 2020 20:56:19,742 [WARN]
(pool-9-thread-13) org.apache.hadoop.hbase.zookeeper.ZKUtil:
quorum=ip-10-0-21-114.ec2.internal:2181, baseZNode=/hbase
Unable to get data of znode
1604523379749,"04 Nov 2020 20:56:19,742 [WARN]
(pool-9-thread-19) org.apache.hadoop.hbase.zookeeper.ZKUtil:
quorum=ip-10-0-21-114.ec2.internal:2181, baseZNode=/hbase
Unable to get data of znode
The query which triggered this issue:
INTERSECTS(geom, POLYGON ((-100.78857422 28.58452172,
-100.78857422 31.273855990000005,
-93.71337890999999 31.273855990000005,
-93.71337890999999 28.58452172,
-100.78857422 28.58452172)))
AND timestamp <= '2020-10-23 16:52:20'
AND timestamp > '2019-08-01 00:00:00'
The test_TestTable_xz3_geom_timestamp_v2 table is around
272 GB (GZ compressed), and this query returns around 1.7 GB
of data (uncompressed).
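To give a rough sense of how much of the index this query touches, here is a small Python sketch that computes the bounding-box extent and the length of the time range from the filter values. The variable names are my own; the numbers are copied from the query.

```python
from datetime import datetime

# Corner coordinates and time bounds copied from the ECQL filter.
min_lon, max_lon = -100.78857422, -93.71337890999999
min_lat, max_lat = 28.58452172, 31.273855990000005
start = datetime(2019, 8, 1, 0, 0, 0)
end = datetime(2020, 10, 23, 16, 52, 20)

# Spatial extent in degrees and temporal extent as a timedelta.
width_deg = max_lon - min_lon
height_deg = max_lat - min_lat
span = end - start

print(f"bbox: {width_deg:.2f} x {height_deg:.2f} degrees")
print(f"time range: {span.days} days")
```

The box is about 7.08 by 2.69 degrees and the time range is 449 days, so the scan covers a fairly large slice of the table.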
I can reproduce the issue with this query fairly
consistently. The same query succeeds if I simply replace the
GeoMesa jar on the classpath, going from 3.0.0/3.1.0 back to 2.4.1.
I will keep looking into what changed between the
releases, but I would like to know whether others are also
experiencing similar issues or can provide some insights on
Jun Cai