[geomesa-users] GEOMESA Spark SQL argument type mismatch on string field compare

From: Dave Boyd <dboyd@xxxxxxxxxxxxxxxxx>
Date: Thu, 28 Mar 2019 14:54:36 +0000
Thread-topic: GEOMESA Spark SQL argument type mismatch on string field compare

All:
I am running geomesa-accumulo-spark-runtime_2.11-2.2.2-SNAPSHOT within a Zeppelin notebook.
I am hitting a problem whenever I try an equality comparison on a string field in
Spark SQL. Maybe I just do not have the syntax right.
I create the DataFrame as follows:

val linkagesdataFrame = spark.read.format("geomesa").options(dsParams).option("geomesa.feature", "coalescelinkage").load()
linkagesdataFrame.createOrReplaceTempView("linkageview")

The schema of the DataFrame is:

root
 |-- __fid__: string (nullable = false)
 |-- entity2version: string (nullable = true)
 |-- linklabel: string (nullable = true)
 |-- linktype: integer (nullable = true)
 |-- source: string (nullable = true)
 |-- datecreated: timestamp (nullable = true)
 |-- title: string (nullable = true)
 |-- version: string (nullable = true)
 |-- entity2name: string (nullable = true)
 |-- entity2source: string (nullable = true)
 |-- objectkey: string (nullable = true)
 |-- lastmodified: timestamp (nullable = true)
 |-- name: string (nullable = true)
 |-- entity2key: string (nullable = true)
 |-- linkstatus: integer (nullable = true)

This SQL statement fails with the error below:

%sql
select * from linkageview where name = "NGrams"

java.lang.IllegalArgumentException: argument type mismatch
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.locationtech.geomesa.spark.SparkVersions$$anonfun$1$$anonfun$apply$2.apply(SparkVersions.scala:34)
    at org.locationtech.geomesa.spark.SparkVersions$$anonfun$1$$anonfun$apply$2.apply(SparkVersions.scala:34)
    at org.locationtech.geomesa.spark.SparkVersions$.copy(SparkVersions.scala:48)
    at org.apache.spark.sql.SQLRules$SpatialOptimizationsRule$$anonfun$apply$1.applyOrElse(SQLRules.scala:261)
    at org.apache.spark.sql.SQLRules$SpatialOptimizationsRule$$anonfun$apply$1.applyOrElse(SQLRules.scala:221)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:288)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:288)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:287)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:293)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:293)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:331)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:293)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:293)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:293)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:331)
    at org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:188)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:329)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:293)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:277)
    at org.apache.spark.sql.SQLRules$SpatialOptimizationsRule$.apply(SQLRules.scala:221)
    at org.apache.spark.sql.SQLRules$SpatialOptimizationsRule$.apply(SQLRules.scala:143)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:85)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:82)
    at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
    at scala.collection.immutable.List.foldLeft(List.scala:84)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:82)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:74)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:74)
    at org.apache.spark.sql.execution.QueryExecution.optimizedPlan$lzycompute(QueryExecution.scala:73)
    at org.apache.spark.sql.execution.QueryExecution.optimizedPlan(QueryExecution.scala:73)
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:79)
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:75)
    at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:84)
    at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:84)
    at org.apache.spark.sql.Dataset.withTypedCallback(Dataset.scala:2791)
    at org.apache.spark.sql.Dataset.head(Dataset.scala:2112)
    at org.apache.spark.sql.Dataset.take(Dataset.scala:2327)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.zeppelin.spark.SparkZeppelinContext.showData(SparkZeppelinContext.java:108)
    at org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:135)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:103)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:632)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Interestingly, this query works just fine:

select * from linkageview where name like 'NGrams%'

This seems to affect only direct equality comparisons on string fields.
I have run a number of other complex queries with aggregations, and
even some gnarly joins, and they all work just fine.
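
A minimal sketch of how the comparison could be isolated from a Scala paragraph in the same notebook. It assumes the Zeppelin-provided `spark` session plus the `linkagesdataFrame` and `linkageview` defined above. The `Try` wrappers, the variable names, and the integer value `1` are illustrative and were not part of the original notebook, and no outcome is implied for any of the three cases:

```scala
import org.apache.spark.sql.functions.col
import scala.util.Try

// Assumption: `spark`, `linkagesdataFrame` and the "linkageview" temp view
// already exist, as created earlier in this message.

// 1. The same equality filter through the DataFrame API instead of %sql.
val viaApi = Try(linkagesdataFrame.filter(col("name") === "NGrams").take(1))

// 2. The same SQL text, but with a single-quoted string literal.
val viaSql = Try(spark.sql("select * from linkageview where name = 'NGrams'").take(1))

// 3. An equality comparison on an integer column (the value 1 is arbitrary),
//    to check whether only string columns are affected.
val viaInt = Try(spark.sql("select * from linkageview where linktype = 1").take(1))

// Print Success(<row count>) or Failure(<exception>) for each variant.
Seq("DataFrame API" -> viaApi, "SQL, single quotes" -> viaSql, "SQL, integer column" -> viaInt)
  .foreach { case (label, result) => println(s"$label: ${result.map(_.length)}") }
```

Each `take(1)` forces the plan through the optimizer, so any variant that hits the same rule should surface as a `Failure` rather than aborting the paragraph.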

Any thoughts?

-- 
========= mailto:dboyd@xxxxxxxxxxxxxxxxx ============
David W. Boyd                     
VP,  Data Solutions       
10432 Balls Ford, Suite 240  
Manassas, VA 20109         
office:   +1-703-552-2862        
cell:     +1-703-402-7908
============== http://www.incadencecorp.com/ ============
ISO/IEC JTC1 WG9, editor ISO/IEC 20547 Big Data Reference Architecture
Chair ANSI/INCITS TC Big Data
Co-chair NIST Big Data Public Working Group Reference Architecture
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
Board Member- USSTEM Foundation - www.usstem.org

The information contained in this message may be privileged 
and/or confidential and protected from disclosure.  
If the reader of this message is not the intended recipient 
or an employee or agent responsible for delivering this message 
to the intended recipient, you are hereby notified that any 
dissemination, distribution or copying of this communication 
is strictly prohibited.  If you have received this communication 
in error, please notify the sender immediately by replying to 
this message and deleting the material from any computer.

Follow-Ups:
- Re: [geomesa-users] GEOMESA Spark SQL argument type mismatch on string field compare
  - From: Emilio Lahr-Vivaz
