[geomesa-users] Geomesa map/reduce ingest speed query

Hello,


We’ve been transitioning from a version of Geomesa that predates the ‘z3’ index to 1.1.0_rc.2. We tried an in-place upgrade of our 1.0.x tables, but unfortunately it didn’t work (I think the problem relates to my Scala compiler topping out at Function22, since I have 30+ attributes in my table).


Anyway, I figured I could just re-ingest the data, since that was typically something I could do overnight, and I was going to be out for a few days anyway.


My ingestion code uses Map/Reduce and is based on the old geomesa.org GDELT Map/Reduce ingestion example; with version 1.0.x it worked fine. Now, after just over a week of processing, I’m only 21% of the way through a dataset of around 9 million features (each feature has 30+ attributes, one timestamp, one POINT geometry, and 3 secondary indexes). Each map task has a 1GB heap (which I have room to increase if necessary), and I have plentiful space on HDFS.
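For reference, my mapper follows roughly the pattern sketched below. This is only a minimal sketch along the lines of the old GDELT tutorial, not my exact code: the class name, the feature type name ("MyFeatureType"), and the "geomesa.*" configuration keys are placeholders, and the attribute parsing is omitted.

import java.io.IOException;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.geotools.data.DataStore;
import org.geotools.data.DataStoreFinder;
import org.geotools.data.FeatureWriter;
import org.geotools.data.Transaction;
import org.geotools.feature.simple.SimpleFeatureBuilder;
import org.opengis.feature.simple.SimpleFeature;
import org.opengis.feature.simple.SimpleFeatureType;

public class FeatureIngestMapper extends Mapper<LongWritable, Text, Text, Text> {

    private DataStore dataStore;
    private FeatureWriter<SimpleFeatureType, SimpleFeature> writer;
    private SimpleFeatureBuilder builder;

    @Override
    protected void setup(Context context) throws IOException {
        // Connection parameters come from the job configuration; the "geomesa.*"
        // keys are placeholders for however the job passes them in
        Map<String, Serializable> params = new HashMap<String, Serializable>();
        params.put("instanceId", context.getConfiguration().get("geomesa.instance.id"));
        params.put("zookeepers", context.getConfiguration().get("geomesa.zookeepers"));
        params.put("user", context.getConfiguration().get("geomesa.user"));
        params.put("password", context.getConfiguration().get("geomesa.password"));
        params.put("tableName", context.getConfiguration().get("geomesa.table"));

        dataStore = DataStoreFinder.getDataStore(params);
        builder = new SimpleFeatureBuilder(dataStore.getSchema("MyFeatureType"));
        // One appending writer per map task; writes go through an Accumulo
        // BatchWriter under the covers
        writer = dataStore.getFeatureWriterAppend("MyFeatureType", Transaction.AUTO_COMMIT);
    }

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException {
        // Parse the delimited input line into the 30+ attributes, the timestamp and
        // the POINT geometry, and set them on the builder (parsing omitted here)
        SimpleFeature parsed = builder.buildFeature(null); // null -> auto-generated feature id
        SimpleFeature toWrite = writer.next();
        toWrite.setAttributes(parsed.getAttributes());
        writer.write();
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        // Closing the writer flushes the underlying BatchWriter, so buffered
        // mutations are only fully pushed to the tablet servers at task end
        if (writer != null) {
            writer.close();
        }
        if (dataStore != null) {
            dataStore.dispose();
        }
    }
}

The FeatureWriter is opened in setup() and only closed in cleanup(), so buffered mutations are flushed at the end of each task, which I suspect is why the MutationsRejectedException below is thrown from TabletServerBatchWriter.close().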


It seems that my map tasks are repeatedly failing with a number of different errors (listed at the bottom of this email). I also tried ingesting a larger dataset (~43 million points) with fewer (7) non-geometry attributes and ran into similar issues.


Any suggestions?


Thanks!


Ben


--

Error: Java heap space

--

java.lang.reflect.UndeclaredThrowableException: Unknown exception in doAs

                at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1451)

                at org.apache.hadoop.mapred.Child.main(Child.java:262)

Caused by: java.security.PrivilegedActionException: org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0  security codes: []  # server errors 0 # exceptions 1

                at java.security.AccessController.doPrivileged(Native Method)

                at javax.security.auth.Subject.doAs(Subject.java:415)

                at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)

                ... 1 more

Caused by: org.apache.accumulo.core.client.MutationsRejectedException: # constraint violations : 0  security codes: []  # server errors 0 # exceptions 1

                at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.checkForFailures(TabletServerBatchWriter.java:536)

                at org.apache.accumulo.core.client.impl.TabletServerBatchWriter.close(TabletServerBatchWriter.java:353)

                at org.apache.acc

--

org.apache.hadoop.io.SecureIOUtils$AlreadyExistsException: EEXIST: File exists

        at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:178)

        at org.apache.hadoop.mapred.TaskLog.writeToIndexFile(TaskLog.java:310)

        at org.apache.hadoop.mapred.TaskLog.syncLogs(TaskLog.java:383)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:415)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)

        at org.apache.hadoop.mapred.Child.main(Child.java:262)

Caused by: EEXIST: File exists

        at org.apache.hadoop.io.nativeio.NativeIO.open(Native Method)

        at org.apache.hadoop.io.SecureIOUtils.createForWrite(SecureIOUtils.java:172)

        ... 7 more

--


