Re: [geomesa-users] Spark lost task java.lang.IllegalStateException: unread block data

In our compute jar, we have to relocate some packages to avoid classpath conflicts. That might be what's tripping you up inside Eclipse - I'm not sure of a way to relocate classes in the IDE, though...
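
I haven't verified this, but one workaround when launching from the IDE might be to ship the already-shaded jar to the executors with SparkConf.setJars, so that tasks deserialize against the same relocated classes the driver sees. A minimal sketch only, reusing the shaded jar path that shows up in the JAI warning in your log:

import org.apache.spark.{SparkConf, SparkContext}

object ShadedJarSketch {
  def main(args: Array[String]): Unit = {
    // Sketch: point the executors at the same shaded jar the driver uses,
    // so task deserialization sees identical (relocated) classes
    val sparkConf = new SparkConf(true)
      .setMaster("spark://munko-acer:7077")
      .setAppName("GeomesaQuerry")
      .setJars(Seq("/home/milan/Tools/geomesa-1.2.6/dist/spark/geomesa-compute-1.2.6-shaded.jar"))
    val sc = new SparkContext(sparkConf)
    // ... build the GeoMesa RDD and run the job as in your code below ...
    sc.stop()
  }
}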

Here are the relevant pom snippets:

1.2.6 - https://github.com/locationtech/geomesa/blob/geomesa-1.2.6/geomesa-compute/pom.xml#L165
1.3.x - https://github.com/locationtech/geomesa/blob/master/geomesa-accumulo/geomesa-accumulo-compute/pom.xml#L172
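
For reference, a shade-plugin relocation block has the general shape below. This is illustrative only - the package pattern here is a placeholder, and the actual relocations are in the poms linked above:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite a conflicting package into a shaded namespace, in both
               the bundled class files and the bytecode that references them -->
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.locationtech.geomesa.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>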

On 12/01/2016 07:11 AM, Milan Muňko wrote:

Hello everybody,

I would like to ask about an error I am getting. When I compile and run my application from Eclipse, tasks fail with: Lost task 1.0 in stage 0.0 (TID 1, 147.175.3.163): java.lang.IllegalStateException: unread block data.

I have no problem running the same code when it is compiled as part of geomesa-compute, so I am clearly missing some jars in the project build path.

I am trying to run this code:

package com.aimaps.iot;

import java.text.SimpleDateFormat
import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SparkConf, SparkContext}
import org.geotools.data.{DataStoreFinder, Query}
import org.geotools.factory.CommonFactoryFinder
import org.geotools.filter.text.ecql.ECQL
import org.locationtech.geomesa.accumulo.data.AccumuloDataStore
import org.locationtech.geomesa.compute.spark.GeoMesaSpark

import scala.collection.JavaConversions._

object GeomesaQuerry {
 
  // Connection parameters for the GeoMesa Accumulo data store
  val dbParams = Map(
    "instanceId" -> "accumulo",
    "zookeepers" -> "localhost:2181",
    "user"       -> "root",
    "password"   -> "pass1234",
    "tableName"  -> "gdelt_ukraine")
   
  val feature = "event"
  val geom = "geom"
  val date = "SQLDATE"
 
  // Bounding box over Ukraine and an April 2014 date range for the CQL filter
  val bbox   = "22.128811, 44.390411, 40.218079, 52.375359"
  // Note: April has 30 days; the original "2014-04-31" was leniently parsed as May 1st
  val during = "2014-04-01T00:00:00.000Z/2014-04-30T23:59:59.999Z"
  val filter = s"bbox($geom, $bbox) AND $date during $during"
 
  def main(args: Array[String]) {

    // Connect to the GeoMesa Accumulo data store
    val ds = DataStoreFinder.getDataStore(dbParams).asInstanceOf[AccumuloDataStore]

    // Build the query against the "event" feature type
    val q = new Query(feature, ECQL.toFilter(filter))

    val sparkConf = new SparkConf(true)
    sparkConf.setMaster("spark://munko-acer:7077")
    sparkConf.setAppName("GeomesaQuerry")
    //sparkConf.setExecutorEnv("SPARK_EXECUTOR_MEMORY", "4G")

    // Initialize the SparkConf for GeoMesa before creating the context
    val sc = new SparkContext(GeoMesaSpark.init(sparkConf, ds))

    // An RDD of SimpleFeatures backed by the GeoMesa query
    val queryRDD = GeoMesaSpark.rdd(new Configuration, sc, dbParams, q, None)
   
    // Pair each feature with its date attribute, formatted as yyyyMMdd
    val dayAndFeature = queryRDD.mapPartitions { iter =>
      val df = new SimpleDateFormat("yyyyMMdd")
      val ff = CommonFactoryFinder.getFilterFactory2
      val exp = ff.property(date)
      iter.map { f => (df.format(exp.evaluate(f).asInstanceOf[java.util.Date]), f) }
    }

    // Count the features per day
    val countByDay = dayAndFeature.map( x => (x._1, 1)).reduceByKey(_ + _)

    countByDay.collect().foreach(println)
    println("\n")

    ds.dispose()
  }
 
}


And I get this error:

Error while parsing JAI registry file "file:/home/milan/Tools/geomesa-1.2.6/dist/spark/geomesa-compute-1.2.6-shaded.jar!/META-INF/registryFile.jai" :
Error in registry file at line number #31
A descriptor is already registered against the name "org.geotools.ColorReduction" under registry mode "rendered"
Error in registry file at line number #32
A descriptor is already registered against the name "org.geotools.ColorInversion" under registry mode "rendered"
Running Spark version 2.0.2
Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Your hostname, munko-acer resolves to a loopback address: 127.0.1.1; using 147.175.3.163 instead (on interface enp2s0)
Set SPARK_LOCAL_IP if you need to bind to another address
Changing view acls to: milan
Changing modify acls to: milan
Changing view acls groups to:
Changing modify acls groups to:
SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(milan); groups with view permissions: Set(); users  with modify permissions: Set(milan); groups with modify permissions: Set()
Successfully started service 'sparkDriver' on port 40519.
Registering MapOutputTracker
Registering BlockManagerMaster
Created local directory at /tmp/blockmgr-4814c20a-c5c5-4c74-b535-c5c44c5dd6cf
MemoryStore started with capacity 1949.1 MB
Registering OutputCommitCoordinator
Logging initialized @2810ms
jetty-9.2.z-SNAPSHOT
Started o.s.j.s.ServletContextHandler@288ca5f0{/jobs,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@4068102e{/jobs/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@44bd4b0a{/jobs/job,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@6c008c24{/jobs/job/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@216e0771{/stages,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@21079a12{/stages/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@fcc6023{/stages/stage,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@67c5ac52{/stages/stage/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@36417a54{/stages/pool,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@2b8bb184{/stages/pool/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@472a11ae{/storage,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@dc79225{/storage/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@30e9ca13{/storage/rdd,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@46185a1b{/storage/rdd/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@51288417{/environment,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@60cf62ad{/environment/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@1e0895f5{/executors,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@1ac4ccad{/executors/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@fd9ebde{/executors/threadDump,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@14982a82{/executors/threadDump/json,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@4ee5b2d9{/static,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@72f8ae0c{/,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@323f3c96{/api,null,AVAILABLE}
Started o.s.j.s.ServletContextHandler@6726cc69{/stages/stage/kill,null,AVAILABLE}
Started ServerConnector@723d9a10{HTTP/1.1}{0.0.0.0:4040}
Started @3024ms
Successfully started service 'SparkUI' on port 4040.
Bound SparkUI to 0.0.0.0, and started at http://147.175.3.163:4040
Connecting to master spark://munko-acer:7077...
Successfully created connection to munko-acer/127.0.1.1:7077 after 26 ms (0 ms spent in bootstraps)
Connected to Spark cluster with app ID app-20161201125709-0005
Executor added: app-20161201125709-0005/0 on worker-20161201102135-147.175.3.163-44853 (147.175.3.163:44853) with 4 cores
Granted executor ID app-20161201125709-0005/0 on hostPort 147.175.3.163:44853 with 4 cores, 1024.0 MB RAM
Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34069.
Server created on 147.175.3.163:34069
Registering BlockManager BlockManagerId(driver, 147.175.3.163, 34069)
Registering block manager 147.175.3.163:34069 with 1949.1 MB RAM, BlockManagerId(driver, 147.175.3.163, 34069)
Registered BlockManager BlockManagerId(driver, 147.175.3.163, 34069)
Executor updated: app-20161201125709-0005/0 is now RUNNING
Started o.s.j.s.ServletContextHandler@f1868c9{/metrics/json,null,AVAILABLE}
SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
Desired query plan requires multiple scans - falling back to full table scan
The query being executed requires multiple scans, which is not currently supported by geomesa. Your result set will be partially incomplete. Query: BBOX(geom, 22.128811,44.390411,40.218079,52.375359) AND SQLDATE DURING 2014-04-01T00:00:00+00:00/2014-05-01T23:59:59.999+00:00
Block broadcast_0 stored as values in memory (estimated size 2.9 MB, free 1946.2 MB)
Block broadcast_0_piece0 stored as bytes in memory (estimated size 237.7 KB, free 1946.0 MB)
Added broadcast_0_piece0 in memory on 147.175.3.163:34069 (size: 237.7 KB, free: 1948.9 MB)
Created broadcast 0 from newAPIHadoopRDD at GeoMesaSpark.scala:142
Registered executor NettyRpcEndpointRef(null) (147.175.3.163:46958) with ID 0
Registering block manager 147.175.3.163:37089 with 366.3 MB RAM, BlockManagerId(0, 147.175.3.163, 37089)
Starting job: collect at GeomesaQuerry.scala:55
Registering RDD 3 (map at GeomesaQuerry.scala:53)
Got job 0 (collect at GeomesaQuerry.scala:55) with 2 output partitions
Final stage: ResultStage 1 (collect at GeomesaQuerry.scala:55)
Parents of final stage: List(ShuffleMapStage 0)
Missing parents: List(ShuffleMapStage 0)
Submitting ShuffleMapStage 0 (MapPartitionsRDD[3] at map at GeomesaQuerry.scala:53), which has no missing parents
Block broadcast_1 stored as values in memory (estimated size 4.1 KB, free 1946.0 MB)
Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.4 KB, free 1946.0 MB)
Added broadcast_1_piece0 in memory on 147.175.3.163:34069 (size: 2.4 KB, free: 1948.9 MB)
Created broadcast 1 from broadcast at DAGScheduler.scala:1012
Submitting 2 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[3] at map at GeomesaQuerry.scala:53)
Adding task set 0.0 with 2 tasks
Stage 0 contains a task of very large size (4581 KB). The maximum recommended task size is 100 KB.
Starting task 0.0 in stage 0.0 (TID 0, 147.175.3.163, partition 0, PROCESS_LOCAL, 4691322 bytes)
Starting task 1.0 in stage 0.0 (TID 1, 147.175.3.163, partition 1, PROCESS_LOCAL, 4691421 bytes)
Launching task 0 on executor id: 0 hostname: 147.175.3.163.
Launching task 1 on executor id: 0 hostname: 147.175.3.163.
Lost task 1.0 in stage 0.0 (TID 1, 147.175.3.163): java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Starting task 1.1 in stage 0.0 (TID 2, 147.175.3.163, partition 1, PROCESS_LOCAL, 4691421 bytes)
Lost task 0.0 in stage 0.0 (TID 0) on executor 147.175.3.163: java.lang.IllegalStateException (unread block data) [duplicate 1]
Launching task 2 on executor id: 0 hostname: 147.175.3.163.
Starting task 0.1 in stage 0.0 (TID 3, 147.175.3.163, partition 0, PROCESS_LOCAL, 4691322 bytes)
Launching task 3 on executor id: 0 hostname: 147.175.3.163.
Lost task 1.1 in stage 0.0 (TID 2) on executor 147.175.3.163: java.lang.IllegalStateException (unread block data) [duplicate 2]
Starting task 1.2 in stage 0.0 (TID 4, 147.175.3.163, partition 1, PROCESS_LOCAL, 4691421 bytes)
Launching task 4 on executor id: 0 hostname: 147.175.3.163.
Lost task 0.1 in stage 0.0 (TID 3) on executor 147.175.3.163: java.lang.IllegalStateException (unread block data) [duplicate 3]
Starting task 0.2 in stage 0.0 (TID 5, 147.175.3.163, partition 0, PROCESS_LOCAL, 4691322 bytes)
Launching task 5 on executor id: 0 hostname: 147.175.3.163.
Lost task 1.2 in stage 0.0 (TID 4) on executor 147.175.3.163: java.lang.IllegalStateException (unread block data) [duplicate 4]
Starting task 1.3 in stage 0.0 (TID 6, 147.175.3.163, partition 1, PROCESS_LOCAL, 4691421 bytes)
Launching task 6 on executor id: 0 hostname: 147.175.3.163.
Lost task 0.2 in stage 0.0 (TID 5) on executor 147.175.3.163: java.lang.IllegalStateException (unread block data) [duplicate 5]
Starting task 0.3 in stage 0.0 (TID 7, 147.175.3.163, partition 0, PROCESS_LOCAL, 4691322 bytes)
Launching task 7 on executor id: 0 hostname: 147.175.3.163.
Lost task 1.3 in stage 0.0 (TID 6) on executor 147.175.3.163: java.lang.IllegalStateException (unread block data) [duplicate 6]
Task 1 in stage 0.0 failed 4 times; aborting job
Cancelling stage 0
Stage 0 was cancelled
ShuffleMapStage 0 (map at GeomesaQuerry.scala:53) failed in 1.704 s
Job 0 failed: collect at GeomesaQuerry.scala:55, took 1.798401 s
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 4 times, most recent failure: Lost task 1.3 in stage 0.0 (TID 6, 147.175.3.163): java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1454)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1442)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1441)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1441)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1667)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1622)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1611)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:632)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1873)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1886)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1899)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1913)
    at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:912)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
    at org.apache.spark.rdd.RDD.collect(RDD.scala:911)
    at com.aimaps.iot.GeomesaQuerry$.main(GeomesaQuerry.scala:55)
    at com.aimaps.iot.GeomesaQuerry.main(GeomesaQuerry.scala)
Caused by: java.lang.IllegalStateException: unread block data
    at java.io.ObjectInputStream$BlockDataInputStream.setBlockDataMode(ObjectInputStream.java:2449)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1385)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:253)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Invoking stop() from shutdown hook
Lost task 0.3 in stage 0.0 (TID 7) on executor 147.175.3.163: java.lang.IllegalStateException (unread block data) [duplicate 7]
Stopped ServerConnector@723d9a10{HTTP/1.1}{0.0.0.0:4040}
Removed TaskSet 0.0, whose tasks have all completed, from pool
Stopped o.s.j.s.ServletContextHandler@6726cc69{/stages/stage/kill,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@323f3c96{/api,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@72f8ae0c{/,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@4ee5b2d9{/static,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@14982a82{/executors/threadDump/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@fd9ebde{/executors/threadDump,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@1ac4ccad{/executors/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@1e0895f5{/executors,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@60cf62ad{/environment/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@51288417{/environment,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@46185a1b{/storage/rdd/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@30e9ca13{/storage/rdd,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@dc79225{/storage/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@472a11ae{/storage,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@2b8bb184{/stages/pool/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@36417a54{/stages/pool,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@67c5ac52{/stages/stage/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@fcc6023{/stages/stage,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@21079a12{/stages/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@216e0771{/stages,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@6c008c24{/jobs/job/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@44bd4b0a{/jobs/job,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@4068102e{/jobs/json,null,UNAVAILABLE}
Stopped o.s.j.s.ServletContextHandler@288ca5f0{/jobs,null,UNAVAILABLE}
Stopped Spark web UI at http://147.175.3.163:4040
Shutting down all executors
Asking each executor to shut down
MapOutputTrackerMasterEndpoint stopped!
MemoryStore cleared
BlockManager stopped
BlockManagerMaster stopped
OutputCommitCoordinator stopped!
Successfully stopped SparkContext
Shutdown hook called
Deleting directory /tmp/spark-bc960a3b-309c-4acb-bfe2-60969598c699



Thank you for your help.
Milan
--


Ing. Milan Muňko | Co-Founder
AI-MAPS s. r. o., Tallerova 4, 811 02 Bratislava, Slovakia
Mobile: +421 944 612 592
Email: milan.munko@xxxxxxxxxxx
Web: www.ai-maps.com