Re: [geomesa-users] Geomesa Spark build : ClassNotFoundException: org.apache.commons.io.Charsets

Hi Tom,

Thank you very much for the sample, it's very useful.
However, I have some problems with it. I cannot find any repository providing the org.locationtech.geomesa.spark.api.java.* classes. To work around that, I used the org.locationtech.geomesa.spark.* classes, such as SparkContext and SpatialRDDProvider, instead of the Java ones. When executing the jar, I get an error on the line that instantiates the JavaSpatialRDDProvider by applying the parameters to the JavaGeoMesaSpark class (in my case, SpatialRDDProvider and GeoMesaSpark):

Exception in thread "main" java.lang.RuntimeException: Could not find a SparkGISProvider
        at org.locationtech.geomesa.spark.GeoMesaSpark$$anonfun$apply$2.apply(GeoMesaSpark.scala:47)
        at org.locationtech.geomesa.spark.GeoMesaSpark$$anonfun$apply$2.apply(GeoMesaSpark.scala:47)
        at scala.Option.getOrElse(Option.scala:121)
        at org.locationtech.geomesa.spark.GeoMesaSpark$.apply(GeoMesaSpark.scala:47)
        at org.locationtech.geomesa.spark.GeoMesaSpark.apply(GeoMesaSpark.scala)
        at com.praxedo.geomesa.geomesa_spark.Test.main(Test.java:34)

I guess I am missing some Maven dependencies providing the org.locationtech.geomesa.spark.api.java.* classes. These are the dependencies I currently have defined in my pom:

<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>3.8.1</version>
        <scope>test</scope>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.0</version>
    </dependency>

    <dependency>
        <groupId>org.locationtech.geomesa</groupId>
        <artifactId>geomesa-spark-converter_2.11</artifactId>
        <version>1.3.0</version>
    </dependency>

    <dependency>
        <groupId>org.locationtech.geomesa</groupId>
        <artifactId>geomesa-spark-core_2.11</artifactId>
        <version>1.3.0</version>
    </dependency>

    <dependency>
        <groupId>org.locationtech.geomesa</groupId>
        <artifactId>geomesa-spark-geotools_2.11</artifactId>
        <version>1.3.0</version>
    </dependency>

    <dependency>
        <groupId>org.locationtech.geomesa</groupId>
        <artifactId>geomesa-spark-sql_2.11</artifactId>
        <version>1.3.0</version>
    </dependency>

    <dependency>
        <groupId>org.geotools</groupId>
        <artifactId>gt-main</artifactId>
        <version>16.1</version>
    </dependency>

</dependencies>

Is it also possible that I am getting the "Could not find a SparkGISProvider" error because of a missing dependency?
Could you please share your Maven dependencies?

Thanks,
José

2017-03-06 16:28 GMT+01:00 Tom Kunicki <tkunicki@xxxxxxxx>:


Hi, Jose.

There is an explicit Java API binding on the master branch that will be released with GeoMesa 1.3.1.  Here is a sample I whipped up against my local test data set:


import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.geotools.data.Query;
import org.geotools.filter.text.ecql.ECQL;
import org.locationtech.geomesa.spark.api.java.*;

import java.util.HashMap;
import java.util.Map;

public class GeoMesaJavaSpark {

    public static void main(String... args) throws Exception {

        SparkConf conf = new SparkConf()
                .setMaster("local[4]")
                .setAppName("Sample Application");
        conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        conf.set("spark.kryo.registrator", "org.locationtech.geomesa.spark.GeoMesaSparkKryoRegistrator");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        Map<String, String> params = new HashMap<>();
        params.put("instanceId", "local");
        params.put("zookeepers","localhost");
        params.put("user", "root");
        params.put("password", "secret");
        params.put("tableName", "geomesa.gdelt");

        JavaSpatialRDDProvider provider = JavaGeoMesaSpark.apply((Map)params);

        String filter = "BBOX(geom, -125, -24, -66, 50) AND dtg >= 2015-12-31T00:00:00Z";
        Query query = new Query("gdelt", ECQL.toFilter(filter));
        JavaSpatialRDD rdd = provider.rdd(new Configuration(), jsc, params, query);

        System.out.println(rdd.count());
        System.out.println(rdd.first());
        System.out.println(rdd.asGeoJSONString().first());
        System.out.println(rdd.asKeyValueList().first());
        System.out.println(rdd.asKeyValueMap().first());

    }
}


Tom



On Mar 6, 2017, at 9:38 AM, Jose Bujalance <joseab56@xxxxxxxxx> wrote:

Hi,
I just modified some versions in the master pom to match my environment. I may try your solution later. In the meantime, I decided to use Maven to get the built jars instead. However, I'm now not sure how to use the GeoMesa-Spark integration in Java, as I haven't found any Java example code in the documentation. I am trying to "translate" this simple Scala code into Java:

// Datastore params
val dsParams = Map("bigtable.table.name" -> "Geoloc_Praxedo_catalog")

// set SparkContext
val conf = new SparkConf().setMaster("local[*]").setAppName("testSpark")
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
conf.set("spark.kryo.registrator", classOf[GeoMesaSparkKryoRegistrator].getName)
val sc = SparkContext.getOrCreate(conf)

// create RDD with a geospatial query using Geomesa functions
val spatialRDDProvider = GeoMesaSpark(dsParams)
val filter = ECQL.toFilter("BBOX(coords, 48.815215, 2.249294, 48.904295, 2.419337)")
val query = new Query("history_feature_nodate", filter)
val resultRDD = spatialRDDProvider.rdd(new Configuration, sc, dsParams, query)

resultRDD.count

Is there any useful link or documentation for understanding how the geomesa-spark Java API works?

Thanks a lot,
José

2017-03-06 15:21 GMT+01:00 Emilio Lahr-Vivaz <elahrvivaz@xxxxxxxx>:
Hi,

Did you modify the pom? The module does build fine on master. Commons IO looks like it should be pulled in via geomesa-convert-common:

https://github.com/locationtech/geomesa/blob/master/geomesa-convert/geomesa-convert-common/pom.xml#L34-L37

At any rate, you can add that snippet to the pom.xml for the failing module to explicitly include it. That should fix the error.
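For reference, a sketch of what that explicit dependency would look like in the failing module's pom.xml (the version shown is an assumption; use whatever version the linked geomesa-convert-common pom pins):

```xml
<!-- Explicitly pull in Commons IO, which provides org.apache.commons.io.Charsets
     (present since commons-io 2.3). The version below is an assumption; match the
     version declared in the linked geomesa-convert-common pom. -->
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.5</version>
</dependency>
```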

Thanks,

Emilio


On 03/06/2017 05:54 AM, Jose Bujalance wrote:
Hi,

I am trying to build the geomesa-spark module from source, but the build fails on the GeoMesa Spark Converter RDD Provider module. Some of the tests fail with the following error:

Tests run: 3, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 4.779 sec <<< FAILURE! - in org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest
The ConverterSpatialRDDProvider should::read from local files(org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest)  Time elapsed: 1.686 sec  <<< ERROR!
java.lang.NoClassDefFoundError: org/apache/commons/io/Charsets
        at org.apache.hadoop.security.Credentials.<clinit>(Credentials.java:222)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:334)
        at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:184)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProvider.rdd(ConverterSpatialRDDProvider.scala:64)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$7.apply(ConverterSpatialRDDProviderTest.scala:50)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$7.apply(ConverterSpatialRDDProviderTest.scala:49)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.io.Charsets
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at org.apache.hadoop.security.Credentials.<clinit>(Credentials.java:222)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:334)
        at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:184)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProvider.rdd(ConverterSpatialRDDProvider.scala:64)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$7.apply(ConverterSpatialRDDProviderTest.scala:50)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$7.apply(ConverterSpatialRDDProviderTest.scala:49)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

The ConverterSpatialRDDProvider should::read from local files with filtering(org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest)  Time elapsed: 0.319 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.Credentials
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:334)
        at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:184)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProvider.rdd(ConverterSpatialRDDProvider.scala:64)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$11.apply(ConverterSpatialRDDProviderTest.scala:58)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$11.apply(ConverterSpatialRDDProviderTest.scala:56)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

The ConverterSpatialRDDProvider should::read from a local file using Converter Name lookup(org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest)  Time elapsed: 0.083 sec  <<< ERROR!
java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.Credentials
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:334)
        at org.apache.hadoop.mapreduce.Job.getInstance(Job.java:184)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProvider.rdd(ConverterSpatialRDDProvider.scala:64)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$15.apply(ConverterSpatialRDDProviderTest.scala:70)
        at org.locationtech.geomesa.spark.converter.ConverterSpatialRDDProviderTest$$anonfun$2$$anonfun$apply$15.apply(ConverterSpatialRDDProviderTest.scala:64)
        at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
        at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
        at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
        at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)

I don't see how I can install the Commons IO library so that the tests can access it. Has anybody successfully built GeoMesa Spark, and do you have an idea how to solve this error?

Thanks, 
José


_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users



