Hello,
Have you looked at the documentation on customizing index creation?
https://www.geomesa.org/documentation/user/datastores/index_config.html#customizing-index-creation
By default, GeoMesa HBase will create several different indices to
support different query patterns. Without any configuration, you
will get 3 indices, which means your data will be stored 3 times
over. This supports efficient queries with spatial filters (z2),
spatio-temporal filters (z3), and feature ID lookups (id). You can
also specify attribute indices, for querying by attribute values.
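As a minimal sketch of what that configuration can look like: the page linked above describes restricting the indices through the `geomesa.indices.enabled` user-data key, which can be appended to the feature type specification string. The helper below only builds such a string. The function name, the attribute names, and the choice of `z3,id` are illustrative assumptions, so please check the exact syntax against the docs for your GeoMesa version.

```python
def spec_with_indices(attributes, indices):
    """Append a geomesa.indices.enabled user-data hint to a spec string.

    `attributes` is an attribute spec such as "name:String,*geom:Point",
    and `indices` is a list of index names such as ["z3", "id"].
    This helper is hypothetical; it is not part of GeoMesa.
    """
    if not indices:
        raise ValueError("at least one index must be enabled")
    # User data follows the attributes after a ';'. The value is quoted
    # because it contains commas.
    return "{};geomesa.indices.enabled='{}'".format(attributes, ",".join(indices))


# Attribute names are made up for illustration.
spec = spec_with_indices("name:String,dtg:Date,*geom:Point:srid=4326", ["z3", "id"])
print(spec)
```

With only z3 and id enabled, the data would be written twice instead of three times, at the cost of losing the index that serves purely spatial filters (z2).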
In contrast, GeoMesa FileSystem will only store your data once,
which generally supports spatio-temporal queries (it depends on your
partition scheme). Additionally, there are space savings from the
file format (Parquet or ORC, which can support things like
dictionary encoding and other optimizations over a particular
column), and from not having to store an index key for each feature
(in general an extra 10 or so bytes).
You may also want to look into table compression in HBase. I believe
we enable gzip compression by default, but you can specify other
algorithms through the user-data keys
'geomesa.table.compression.enabled' and
'geomesa.table.compression.type'. (I just looked and it appears we
have not documented that config.)
There may also be HDFS redundancy in HBase, which would take up even
more space (that may not be an issue if you are running HBase on
S3).
All of that said, I would still expect HBase to take more space, but
you may be able to narrow the gap.
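To make the multiplication explicit, here is a rough back-of-the-envelope sketch of how the number of indices, HDFS replication, and compression combine. The function name and all of the numbers are assumptions for illustration, not measurements, and the model ignores per-row key overhead and differences in serialization format.

```python
def estimate_hbase_size(input_gb, num_indices=3, replication=3, compression_ratio=1.0):
    """Rough estimate of on-disk size for data ingested into GeoMesa HBase.

    All parameters are assumptions to be replaced with your own numbers:
    - num_indices: each enabled index stores a full copy of the data
    - replication: HDFS block replication (use 1 for HBase on S3)
    - compression_ratio: compressed size / uncompressed size (1.0 = none)
    """
    if input_gb < 0 or num_indices < 1 or replication < 1:
        raise ValueError("size must be non-negative and counts at least 1")
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    return input_gb * num_indices * replication * compression_ratio


# Illustrative numbers only, not measurements.
default_setup = estimate_hbase_size(10, num_indices=3, replication=1)
two_indices = estimate_hbase_size(10, num_indices=2, replication=1, compression_ratio=0.5)
print(default_setup, two_indices)
```

Plugging in your own index count and a measured compression ratio should show which of the factors is worth tuning first.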
Thanks,
Emilio
On 10/29/18 10:37 AM, Martin Kellner wrote:
Hello,
I repeated the setup the next day without any problems.
Since then, I never experienced that error again.
Probably I made some mistake referencing my files on S3.
I have now tried both GeoMesa FileSystem and GeoMesa HBase.
GeoMesa FileSystem stores the data quite efficiently in S3
(it even takes less space than my input data, since it stores
everything in .parquet files).
However, GeoMesa HBase requires a lot of storage. For me, it
takes 8 times the size of my input data.
So far I have not found a good way to reduce that storage
demand (I tried making the id shorter and playing around
with the ingest configuration).
So my understanding is that it is simply normal for GeoMesa on
HBase to demand a lot of storage.
Is this correct, or am I missing something?
Thank you,
Martin
Hello,
Where did you try to set those properties? Did you see the
section in the docs on configuring access to s3?
https://www.geomesa.org/documentation/user/cli/filesystems.html#enabling-s3-ingest
I believe that the URL prefix that you use makes a
difference as well - s3 vs s3a vs s3n. I think s3a is the
preferred prefix to use, but some commands tend to require
one or the other.
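To illustrate why the prefix matters: each Hadoop S3 connector reads its credentials from differently named properties, so setting the fs.s3a.* keys has no effect on a path that starts with s3://. A small sketch of that mapping follows. The function name and the bucket are made up for illustration.

```python
from urllib.parse import urlparse

# Hadoop credential property names per URL scheme. The s3 names appear in
# the error message and the s3a names in the attempted configuration; the
# s3n names follow the same pattern as s3.
CREDENTIAL_PROPERTIES = {
    "s3": ("fs.s3.awsAccessKeyId", "fs.s3.awsSecretAccessKey"),
    "s3n": ("fs.s3n.awsAccessKeyId", "fs.s3n.awsSecretAccessKey"),
    "s3a": ("fs.s3a.access.key", "fs.s3a.secret.key"),
}


def credential_properties(url):
    """Return the (access key, secret key) property names for a URL."""
    scheme = urlparse(url).scheme
    if scheme not in CREDENTIAL_PROPERTIES:
        raise ValueError("not an S3-style URL: " + url)
    return CREDENTIAL_PROPERTIES[scheme]


# Hypothetical bucket and path, for illustration only.
print(credential_properties("s3://my-bucket/geomesa"))
print(credential_properties("s3a://my-bucket/geomesa"))
```

If the path passed to the ingest command and the properties in the configuration file do not map to the same row of this table, the connector will report missing credentials.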
Thanks,
Emilio
On 10/25/18 8:06 AM, Martin Kellner wrote:
Hi,
I just tried to set up GeoMesa FileSystem.
I want to store the files on S3, but when I
try the ingest I get the following error
message:
ERROR AWS Access Key ID and Secret Access
Key must be specified as the username or
password (respectively) of a s3 URL, or by
setting the fs.s3.awsAccessKeyId or
fs.s3.awsSecretAccessKey properties
(respectively).
java.lang.IllegalArgumentException: AWS
Access Key ID and Secret Access Key must be
specified as the username or password
(respectively) of a s3 URL, or by setting
the fs.s3.awsAccessKeyId or
fs.s3.awsSecretAccessKey properties
(respectively).
Of course I tried to set
<property>
<name>fs.s3a.access.key</name>
<value>XXXXXXXXXXXXX</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>YYYYYYYYYYYYYYYYYYYYYYY</value>
</property>
Unfortunately I still get the same error
message.
Do I have to re-initialize something to
apply those changes?
Thank you very much,
Martin
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users