Byron,
What AWS region are you using? We may need to replicate
the bootstrap script to each region in S3. (Though I'd be
surprised you got as far as you did if you didn't have
access to the bucket.)
Thanks,
Anthony
Byron Chigoy
<bchigoy@xxxxxxxxxxx> writes:
> Thanks Jason -
> That did not work. We checked `docker exec -t -i
accumulo-master find /opt -name '*.jar'` and the file
names there match the file names in
geomesa_spark_scala/kernel.json. We are wondering
about the appended -SNAPSHOT (our .jar files vs. yours).
>
> In order to get GeoMesa (at least ingestion) to
work, as well as GeoServer, we had to adjust the
bootstrap script. Perhaps that is where we went wrong?
This was due to the following:
>
> 1. Access error to
s3://geomesa-docker/bootstrap-geodocker-accumulo.sh;
we can see the file contents via boto and the CLI `ls`:
> import boto3
> s3 = boto3.resource('s3')
> my_bucket = s3.Bucket('geomesa-docker')
> for obj in my_bucket.objects.all():
>     print(obj.key)
> But we are unable to grab it via the CLI `cp`
or urllib.request.urlretrieve.
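One possible explanation for this split (listing works, fetching fails) is that s3:ListBucket and s3:GetObject are separate S3 permissions granted on different resources (the bucket vs. its objects). As a sketch of what the bucket owner would need, not the policy actually attached to geomesa-docker, a bucket policy allowing both anonymous listing and fetching might look like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::geomesa-docker"
    },
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::geomesa-docker/*"
    }
  ]
}
```

If only the first statement were present, you would see exactly this behavior: keys enumerate fine, but any `cp` or urlretrieve of an object is denied.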
>
> 2. We adapted a copy from
geowave-geomesa-comparative-analysis/analyze/bootstrap-geodocker-accumulo.sh.
The relevant changes were:
>
> IMAGE=quay.io/geomesa/accumulo-geomesa:latest
> vs
> IMAGE=quay.io/geodocker/accumulo:${TAG:-"latest"}
>
> AND
>
> DOCKER_OPT="-d --net=host --restart=always"
> if is_master ; then
>   docker pull $IMAGE
>   docker pull quay.io/geomesa/geoserver:latest
>   docker pull quay.io/geomesa/geomesa-jupyter:latest
>   docker run $DOCKER_OPT --name=accumulo-master $DOCKER_ENV $IMAGE master --auto-init
>   docker run $DOCKER_OPT --name=accumulo-monitor $DOCKER_ENV $IMAGE monitor
>   docker run $DOCKER_OPT --name=accumulo-tracer $DOCKER_ENV $IMAGE tracer
>   docker run $DOCKER_OPT --name=accumulo-gc $DOCKER_ENV $IMAGE gc
>   docker run $DOCKER_OPT --name=geoserver quay.io/geomesa/geoserver:latest
>   docker run $DOCKER_OPT --name=jupyter quay.io/geomesa/geomesa-jupyter:latest
> else # is worker
>   docker pull $IMAGE
>   docker run -d --net=host --name=accumulo-tserver $DOCKER_ENV $IMAGE tserver
> fi
>
> Versus
>
> DOCKER_OPT="-d --net=host --restart=always"
> if is_master ; then
>   docker run $DOCKER_OPT --name=accumulo-master $DOCKER_ENV $IMAGE master --auto-init
>   docker run $DOCKER_OPT --name=accumulo-monitor $DOCKER_ENV $IMAGE monitor
>   docker run $DOCKER_OPT --name=accumulo-tracer $DOCKER_ENV $IMAGE tracer
>   docker run $DOCKER_OPT --name=accumulo-gc $DOCKER_ENV $IMAGE gc
>   docker run $DOCKER_OPT --name=geoserver quay.io/geodocker/geoserver:latest
> else # is worker
>   docker run -d --net=host --name=accumulo-tserver $DOCKER_ENV $IMAGE tserver
> fi
>
> 3. Bootstrap config changes were:
-i=quay.io/geomesa/accumulo-geomesa:latest, -n=gis,
-p=secret, -e=TSERVER_XMX=10G,
-e=TSERVER_CACHE_DATA_SIZE=6G,
-e=TSERVER_CACHE_INDEX_SIZE=2G
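For reference, arguments like those are handed to the bootstrap script at cluster creation time. A rough sketch of the corresponding EMR CLI call follows; the cluster name, release label, and instance count/type are placeholders of ours, not values from this thread:

```shell
aws emr create-cluster \
  --name geomesa-accumulo \
  --release-label emr-5.3.0 \
  --instance-count 4 --instance-type m3.xlarge \
  --bootstrap-actions Path=s3://geomesa-docker/bootstrap-geodocker-accumulo.sh,\
Args=[-i=quay.io/geomesa/accumulo-geomesa:latest,-n=gis,-p=secret,-e=TSERVER_XMX=10G,-e=TSERVER_CACHE_DATA_SIZE=6G,-e=TSERVER_CACHE_INDEX_SIZE=2G]
```

This is only a configuration sketch; swapping the bootstrap Path for an adapted copy of the script (as described above) is the part we changed.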
>
> 4. Errors from the Jupyter startup log (IPs replaced
with <IP>):
>
> bash: warning: setlocale: LC_ALL: cannot change
locale (en_US.UTF-8)
> [I <IP> NotebookApp] Kernel started:
83a4cb2d-8004-4c69-ad21-ad46ac2b4a48
> Starting Spark Kernel with
SPARK_HOME=/usr/local/spark
> bash: warning: setlocale: LC_ALL: cannot change
locale (en_US.UTF-8)
> bash: warning: setlocale: LC_ALL: cannot change
locale (en_US.UTF-8)
>
(Scala,org.apache.toree.kernel.interpreter.scala.ScalaInterpreter@1bc715b8)
>
(PySpark,org.apache.toree.kernel.interpreter.pyspark.PySparkInterpreter@292d1c71)
>
(SparkR,org.apache.toree.kernel.interpreter.sparkr.SparkRInterpreter@2b491fee)
>
(SQL,org.apache.toree.kernel.interpreter.sql.SqlInterpreter@3f1c5af9)
> 17/02/15 15:44:00 WARN toree.Main$$anon$1: No
external magics provided to PluginManager!
> 17/02/15 15:44:04 WARN
layer.StandardComponentInitialization$$anon$1: Locked to
Scala interpreter with SparkIMain until decoupled!
> 17/02/15 15:44:04 WARN
layer.StandardComponentInitialization$$anon$1: Unable to
control initialization of REPL class server!
> [W 15:44:04.777 NotebookApp] Notebook
GDELT+Analysis.ipynb is not trusted
> [W 15:44:04.803 NotebookApp] 404 GET
/nbextensions/widgets/notebook/js/extension.js?v=20170215154320
(<IP>) 2.94ms
referer=
http://ec2<IP>.compute-1.amazonaws.com:8890/notebooks/GDELT%2BAnalysis.ipynb
> 17/02/15 15:44:04 WARN util.NativeCodeLoader:
Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
> [W 15:44:06.577 NotebookApp] Timeout waiting for
kernel_info reply from
83a4cb2d-8004-4c69-ad21-ad46ac2b4a48
> 17/02/15 15:44:06 WARN yarn.Client: Neither
spark.yarn.jars nor spark.yarn.archive is set, falling
back to uploading libraries under SPARK_HOME.
> 17/02/15 15:44:13 ERROR spark.SparkContext: Error
initializing SparkContext.
> org.apache.spark.SparkException: Yarn application
has already ended! It might have been killed or unable
to launch application master.
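A side note on the log above: the "Neither spark.yarn.jars nor spark.yarn.archive is set" warning is separate from the final SparkContext failure, but it can be silenced (and YARN submission made faster) by pre-staging Spark's jars on HDFS and pointing spark-defaults.conf at them. The HDFS path below is an assumption, not something from this setup:

```
# Stage the jars once (run on the master):
#   hdfs dfs -mkdir -p /user/spark/jars
#   hdfs dfs -put /usr/local/spark/jars/*.jar /user/spark/jars/
# Then in $SPARK_HOME/conf/spark-defaults.conf:
spark.yarn.jars  hdfs:///user/spark/jars/*.jar
```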
>
> Any feedback is greatly appreciated.
> Byron, Texas A&M Transportation Institute
>
>
>
>
>
> From:
geomesa-users-bounces@xxxxxxxxxxxxxxxx
<geomesa-users-bounces@xxxxxxxxxxxxxxxx> on behalf
of Jim Hughes
<jnh5y@xxxxxxxx>
> Sent: Tuesday, February 14, 2017 4:15 PM
> To:
geomesa-users@xxxxxxxxxxxxxxxx
> Subject: Re: [geomesa-users] GeoMesa Docker EMR -
Jupyter Notebook help
>
> Hi Byron,
>
> As it happens, I'm setting up a GeoMesa demo, and
I have a quick fix.
>
> You'll want to connect to the Jupyter docker container
(say, with `docker exec -it jupyter /bin/sh`), and edit
this file:
/var/lib/hadoop-hdfs/.local/share/jupyter/kernels/geomesa_spark_scala/kernel.json.
>
> The line with the Toree Spark opts should read:
>
> "__TOREE_SPARK_OPTS__":
"--driver-java-options=-Xmx4096M
--driver-java-options=-Dlog4j.logLevel=info --master
yarn --jars
file:///opt/geomesa/dist/spark/geomesa-accumulo-spark-runtime_2.11-1.3.0.jar,file:///opt/geomesa/dist/spark/geomesa-spark-converter_2.11-1.3.0.jar,file:///opt/geomesa/dist/spark/geomesa-spark-geotools_2.11-1.3.0.jar",
>
> One of the jars changed names (from
geomesa-accumulo-spark_2.11-1.3.0-shaded.jar to
geomesa-accumulo-spark-runtime_2.11-1.3.0.jar). That
difference caused the issues; I need to sort out
re-building the Docker images.
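If it helps to script that edit, here's a small sketch; the helper name is mine, but the jar names and the kernel.json path come from this thread:

```shell
# Replace the old shaded jar name with the renamed runtime jar
# in a Toree kernel.json (in place, via GNU sed).
fix_kernel_jars() {
  sed -i \
    's|geomesa-accumulo-spark_2.11-1.3.0-shaded.jar|geomesa-accumulo-spark-runtime_2.11-1.3.0.jar|g' \
    "$1"
}

# Usage (inside the container, after `docker exec -it jupyter /bin/sh`):
# fix_kernel_jars /var/lib/hadoop-hdfs/.local/share/jupyter/kernels/geomesa_spark_scala/kernel.json
```

After running it, restarting the kernel from the Jupyter UI should pick up the corrected jar list.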
>
> Let me know if that doesn't sort it out!
>
> Cheers,
>
> Jim
>
>
> On 02/14/2017 04:07 PM, Byron Chigoy wrote:
>
> Hi - probably pretty basic, but we are able to get
the Docker bootstrap tutorial working on AWS. We are
pulling from https://quay.io/organization/geomesa. Once
started, we can ingest the GDELT example and get the
descriptive statistics. We are also able to bring the
GDELT example into GeoServer.
>
> However, while the Jupyter container starts, the
GeoMesa Spark - Scala kernel fails (it just says
"kernel busy"). We started the notebook on another port
to see the error behavior and collected the errors
below. Any help or clues would be most appreciated.
>
>
>
>
>
(Scala,org.apache.toree.kernel.interpreter.scala.ScalaInterpreter@5ef0d29e)
>
(PySpark,org.apache.toree.kernel.interpreter.pyspark.PySparkInterpreter@38f57b3d)
>
(SparkR,org.apache.toree.kernel.interpreter.sparkr.SparkRInterpreter@51850751)
>
(SQL,org.apache.toree.kernel.interpreter.sql.SqlInterpreter@3ce3db41)
> 17/02/14 20:50:29 WARN toree.Main$$anon$1: No
external magics provided to PluginManager!
> 17/02/14 20:50:32 WARN
layer.StandardComponentInitialization$$anon$1: Locked to
Scala interpreter with SparkIMain until decoupled!
> 17/02/14 20:50:32 WARN
layer.StandardComponentInitialization$$anon$1: Unable to
control initialization of REPL class server!
> 17/02/14 20:50:33 WARN util.NativeCodeLoader:
Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
> [W 20:50:34.769 NotebookApp] Timeout waiting for
kernel_info reply from
fe9c2776-f5d7-47bc-b5dd-d2769f631f2f
> 17/02/14 20:50:35 WARN yarn.Client: Neither
spark.yarn.jars nor spark.yarn.archive is set, falling
back to uploading libraries under SPARK_HOME.
> 17/02/14 20:50:42 ERROR spark.SparkContext: Error
initializing SparkContext.
> org.apache.spark.SparkException: Yarn application
has already ended! It might have been killed or unable
to launch application master.
>
>
> Byron
>
>
>
>
_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password,
or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users