Hi, David.
I found this message from about a year ago. If you're looking to use Spark SQL or DataFrames, this sample could be made more concise.
Tom
From: Tom Kunicki <tkunicki@xxxxxxxx>
Subject: Re: [geomesa-users] Geomesa Spark build : ClassNotFoundException: org.apache.commons.io.Charsets
Date: March 6, 2017 at 10:28:51 AM EST
To: Geomesa User discussions <geomesa-users@xxxxxxxxxxxxxxxx>
Hi, Jose.
There is an explicit Java API binding on the master branch that will be released with GeoMesa 1.3.1. Here is a sample I whipped up against my local test data set:
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.geotools.data.Query;
import org.geotools.filter.text.ecql.ECQL;
import org.locationtech.geomesa.spark.api.java.*;

import java.util.HashMap;
import java.util.Map;

public class GeoMesaJavaSpark {

    public static void main(String... args) throws Exception {

        // Local Spark context using Kryo with the GeoMesa registrator
        SparkConf conf = new SparkConf()
                .setMaster("local[4]")
                .setAppName("Sample Application");
        conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        conf.set("spark.kryo.registrator", "org.locationtech.geomesa.spark.GeoMesaSparkKryoRegistrator");
        JavaSparkContext jsc = new JavaSparkContext(conf);

        // Data store connection parameters (values are for a local test instance)
        Map<String, String> params = new HashMap<>();
        params.put("instanceId", "local");
        params.put("zookeepers", "localhost");
        params.put("user", "root");
        params.put("password", "secret");
        params.put("tableName", "geomesa.gdelt");

        JavaSpatialRDDProvider provider = JavaGeoMesaSpark.apply((Map) params);

        // Restrict the "gdelt" feature type by bounding box and date with an ECQL filter
        String filter = "BBOX(geom, -125, -24, -66, 50) AND dtg >= 2015-12-31T00:00:00Z";
        Query query = new Query("gdelt", ECQL.toFilter(filter));
        JavaSpatialRDD rdd = provider.rdd(new Configuration(), jsc, params, query);

        // Print the feature count, then the first feature in each available representation
        System.out.println(rdd.count());
        System.out.println(rdd.first());
        System.out.println(rdd.asGeoJSONString().first());
        System.out.println(rdd.asKeyValueList().first());
        System.out.println(rdd.asKeyValueMap().first());
    }
}