
Re: [geomesa-users] Error importing csv into geomesa-redis with nifi

What do you have `schema-user-data` set to? It seems like the processor is not finding your existing schema, and is trying to create a new one - but the user data is invalid, i.e. you have something like `geomesa.z.splits=0`. I think you must have configured it somewhere to get that error, as by default it would just be empty. But, I'm also not sure why it's not finding the existing schema that you created through the CLI, as it shouldn't even be trying to create a new schema.
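To make the failure mode concrete, here is a hypothetical sketch (not GeoMesa's actual code) of the check that IndexConfigurationCheck.validateIndices appears to perform, judging by the error message: a `geomesa.z.splits` user-data entry must parse to an integer between 1 and 127, so a stray `geomesa.z.splits=0` would fail schema validation exactly like this.

```python
def validate_z_shards(user_data: dict) -> None:
    """Raise ValueError if geomesa.z.splits is present but out of range.

    Illustrative re-implementation of the validation implied by the
    stack trace; the real check lives in GeoMesaSchemaValidator.
    """
    raw = user_data.get("geomesa.z.splits")
    if raw is None:
        return  # unset is fine; GeoMesa falls back to its default
    shards = int(raw)
    if not 1 <= shards <= 127:
        raise ValueError("requirement failed: Z shards must be between 1 and 127")

validate_z_shards({"geomesa.z.splits": "4"})      # passes
try:
    validate_z_shards({"geomesa.z.splits": "0"})  # reproduces the reported error
except ValueError as e:
    print(e)
```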

Thanks,

Emilio

On 7/8/22 11:15 AM, Michele Andreoli wrote:
Hi,

I will try to update the GeoMesa processors to 3.4.1 as soon as possible.

The full stack trace:
java.lang.IllegalArgumentException: requirement failed: Z shards must be between 1 and 127
at scala.Predef$.require(Predef.scala:219)
at org.locationtech.geomesa.utils.index.IndexConfigurationCheck$.validateIndices(GeoMesaSchemaValidator.scala:120)
at org.locationtech.geomesa.utils.index.GeoMesaSchemaValidator$.validate(GeoMesaSchemaValidator.scala:29)
at org.locationtech.geomesa.index.geotools.MetadataBackedDataStore.createSchema(MetadataBackedDataStore.scala:133)
at org.locationtech.geomesa.index.geotools.GeoMesaDataStore$SchemaCompatibility$DoesNotExist.apply(GeoMesaDataStore.scala:694)
at org.geomesa.nifi.datastore.processor.mixins.DataStoreIngestProcessor$IngestProcessor.checkSchema(DataStoreIngestProcessor.scala:207)
at org.geomesa.nifi.datastore.processor.RecordIngestProcessor$RecordIngest$$anon$1$$anonfun$process$1.apply(RecordIngestProcessor.scala:83)
at org.geomesa.nifi.datastore.processor.RecordIngestProcessor$RecordIngest$$anon$1$$anonfun$process$1.apply(RecordIngestProcessor.scala:80)
at org.locationtech.geomesa.utils.io.package$WithClose$.apply(package.scala:64)
at org.geomesa.nifi.datastore.processor.RecordIngestProcessor$RecordIngest$$anon$1.process(RecordIngestProcessor.scala:80)
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2675)
at org.apache.nifi.controller.repository.StandardProcessSession.read(StandardProcessSession.java:2643)
at org.geomesa.nifi.datastore.processor.RecordIngestProcessor$RecordIngest.ingest(RecordIngestProcessor.scala:78)
at org.geomesa.nifi.datastore.processor.mixins.DataStoreIngestProcessor$$anonfun$onTrigger$1.apply(DataStoreIngestProcessor.scala:118)
at org.geomesa.nifi.datastore.processor.mixins.DataStoreIngestProcessor$$anonfun$onTrigger$1.apply(DataStoreIngestProcessor.scala:114)
at scala.collection.Iterator$class.foreach(Iterator.scala:742)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at org.geomesa.nifi.datastore.processor.mixins.DataStoreIngestProcessor$class.onTrigger(DataStoreIngestProcessor.scala:114)
at org.geomesa.nifi.processors.redis.PutGeoMesaRedisRecord.onTrigger(PutGeoMesaRedisRecord.scala:15)
at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1283)
at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:214)
at org.apache.nifi.controller.scheduling.AbstractTimeBasedSchedulingAgent.lambda$doScheduleOnce$0(AbstractTimeBasedSchedulingAgent.java:63)
at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)


And these are the flow-file attributes:
RouteOnAttribute.Route = isGeomesa
catalog = simulator
executesql.query.duration = 7
executesql.query.executiontime = 7
executesql.query.fetchtime = 0
executesql.resultset.index = 0
executesql.row.count = 0
feature = sim
fileToImport = sim_00020_1656669600_1656676800.netstate.counter.results.csv
fileType = file
filename = sim_00020_1656669600_1656676800.netstate.counter.results.csv
fragment.count = 1
fragment.identifier = 33d3daa3-8539-4283-b5e3-762c634a05b5
fragment.index = 0
mime.type = application/json
path = /Data/co-mil/simulation/proc_results/sim/2022/07/01
pathToImport = /Data/co-mil/simulation/proc_results/sim/2022/07/01
record.count = 1
record.jobDetFilename = sim_00020_1656669600_1656676800.netstate.counter.csv
record.jobDetHdfsPath = /Data/co-mil/simulation/sim/2022/07/01
record.jobDetId = 13
record.jobDetProcFilename = sim_00020_1656669600_1656676800.netstate.counter.results.csv
record.jobDetProcHdfsPath = /Data/co-mil/simulation/proc_results/sim/2022/07/01
record.jobDetStatus = 5
record.jobDetSubType = netstate.counter
record.jobDetType = 1656676800
record.jobGeomesaResImport = simulator.sim
record.jobId = 11
record.jobPk = 20
record.jobProcScript = python:sim_geojson_merger.py
record.jobServiceEndpoint = /simulation
record.jobStartTime = 1656669600
record.jobTabResImport = Empty string set
record.jobType = sim
segment.original.filename = 792c5d86-d4b6-4e2d-9e8d-404296380fb6
uuid = 65c392df-6a77-4193-b603-dffa0690230f

On Fri, Jul 8, 2022 at 4:10 PM Emilio Lahr-Vivaz <elahrvivaz@xxxxxxxx> wrote:
Hello,

Could you provide the full stack trace of the error? The error seems odd... are there any flow-file attributes on your csv input?

You might try updating to the latest GeoMesa version (3.4.1) to see if that fixes the issue. GeoMesa 3.2.x is built against NiFi 1.12.1, so that could potentially cause issues when running in NiFi 1.16. The latest release is built against NiFi 1.15.3, but I'm not sure anyone has tried it with 1.16 yet. Generally NiFi is flexible with version differences, but I have seen it be an issue in the past. If you upgrade, you might try using the PutGeoMesaRecord processor with a RedisDataStoreService, as the PutGeoMesaRedisRecord processor has been deprecated.

Thanks,

Emilio

On 7/8/22 9:18 AM, Michele Andreoli wrote:
Hi,

I am having trouble using the NiFi (1.16.1) processor PutGeoMesaRedisRecord 3.2.0 to ingest CSV data into geomesa-redis.

The data are in a comma-separated CSV with a geometry field and a datetime field, for example (the first 5 lines; my CSV contains hundreds of rows):
dtg,domain,owner,poly_cid,mean_of_transport,counter,name,geom,length
2022-07-01 10:00:00,co-mil,nifi,436,bike,0,Via_Anselmo_Ronchetti,"LINESTRING (9.2001050000000000 45.4657210000000000, 9.2001460000000000 45.4658880000000000, 9.2001620000000000 45.4661670000000000, 9.2001620000000000 45.4663500000000000, 9.2001650000000000 45.4663540000000000, 9.2001790000000000 45.4666470000000000, 9.2001530000000000 45.4669260000000000)",134.27992849449834
2022-07-01 10:00:00,co-mil,nifi,436,bus,0,Via_Anselmo_Ronchetti,"LINESTRING (9.2001050000000000 45.4657210000000000, 9.2001460000000000 45.4658880000000000, 9.2001620000000000 45.4661670000000000, 9.2001620000000000 45.4663500000000000, 9.2001650000000000 45.4663540000000000, 9.2001790000000000 45.4666470000000000, 9.2001530000000000 45.4669260000000000)",134.27992849449834
2022-07-01 10:00:00,co-mil,nifi,436,car,0,Via_Anselmo_Ronchetti,"LINESTRING (9.2001050000000000 45.4657210000000000, 9.2001460000000000 45.4658880000000000, 9.2001620000000000 45.4661670000000000, 9.2001620000000000 45.4663500000000000, 9.2001650000000000 45.4663540000000000, 9.2001790000000000 45.4666470000000000, 9.2001530000000000 45.4669260000000000)",134.27992849449834
2022-07-01 10:00:00,co-mil,nifi,436,metro,0,Via_Anselmo_Ronchetti,"LINESTRING (9.2001050000000000 45.4657210000000000, 9.2001460000000000 45.4658880000000000, 9.2001620000000000 45.4661670000000000, 9.2001620000000000 45.4663500000000000, 9.2001650000000000 45.4663540000000000, 9.2001790000000000 45.4666470000000000, 9.2001530000000000 45.4669260000000000)",134.27992849449834
2022-07-01 10:00:00,co-mil,nifi,436,pawns,1,Via_Anselmo_Ronchetti,"LINESTRING (9.2001050000000000 45.4657210000000000, 9.2001460000000000 45.4658880000000000, 9.2001620000000000 45.4661670000000000, 9.2001620000000000 45.4663500000000000, 9.2001650000000000 45.4663540000000000, 9.2001790000000000 45.4666470000000000, 9.2001530000000000 45.4669260000000000)",134.27992849449834
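As a quick sanity check (not part of the original thread), a row like the ones above parses cleanly with a comma separator and `"` as the quote character, i.e. the quoted LINESTRING stays a single field; the coordinates below are abbreviated for readability.

```python
import csv
import io

# One abbreviated data row in the same shape as the CSV above: 9 fields,
# with the WKT geometry quoted because it contains commas.
sample = ('2022-07-01 10:00:00,co-mil,nifi,436,bike,0,Via_Anselmo_Ronchetti,'
          '"LINESTRING (9.200105 45.465721, 9.200153 45.466926)",134.27992849449834\n')

row = next(csv.reader(io.StringIO(sample), delimiter=",", quotechar='"'))
print(len(row))                      # 9 fields, matching the header
print(row[7].startswith("LINESTRING"))
```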

First of all, I created a schema on Redis through the geomesa-redis command-line tool:
./geomesa-redis_2.11-3.2.0/bin/geomesa-redis create-schema -u redis://myuser:mypwd@redis-geomesa:11637 -c simulator -f sim -s dtg:Date,domain:String,owner:String,poly_cid:Integer,mean_of_transport:String,counter:Integer,name:String,geom:LineString:srid=4326,length:Double
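For illustration only, here is a toy parser for the shape of the -s argument above, under the assumption that it is a comma-separated list of name:Type[:option] attributes optionally followed by ;key='value' user-data entries (the place where index hints such as a hypothetical geomesa.z.splits override would live); this is not GeoMesa's actual parser.

```python
def parse_spec(spec: str):
    """Split an assumed name:Type spec into attributes and user-data."""
    attr_part, _, user_part = spec.partition(";")
    attrs = []
    for field in attr_part.split(","):
        name, _, type_and_opts = field.partition(":")
        attrs.append((name, type_and_opts))
    user_data = {}
    if user_part:
        for entry in user_part.split(","):
            key, _, value = entry.partition("=")
            user_data[key] = value.strip("'")
    return attrs, user_data

# Abbreviated version of the spec above, with a hypothetical user-data entry.
spec = ("dtg:Date,domain:String,geom:LineString:srid=4326,length:Double"
        ";geomesa.z.splits='2'")
attrs, user_data = parse_spec(spec)
print(attrs[2])    # ('geom', 'LineString:srid=4326')
print(user_data)   # {'geomesa.z.splits': '2'}
```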

After that, I configured the NiFi processor PutGeoMesaRedisRecord 3.2.0:
<properties>
<entry><key>record-reader</key><value>CSVReaderGeomesa</value></entry>
<entry><key>feature-type-name</key><value>sim</value></entry>
<entry><key>feature-id-col</key><value>${geomesa.id.col}</value></entry>
<entry><key>geometry-cols</key><value>geom:LineString:srid=4326</value></entry>
<entry><key>geometry-serialization</key><value>WKT</value></entry>
<entry><key>json-cols</key><value>${geomesa.json.cols}</value></entry>
<entry><key>default-date-col</key><value>${geomesa.default.dtg.col}</value></entry>
<entry><key>visibilities-col</key><value>${geomesa.visibilities.col}</value></entry>
<entry><key>schema-compatibility</key><value><name>existing</name></value></entry>
<entry><key>write-mode</key><value><name>append</name></value></entry>

<entry><key>redis.url</key><value>redis://myuser:mypwd@redis-geomesa:11637</value></entry>
<entry><key>redis.catalog</key><value>simulator</value></entry>
...
</properties>

And the CSVReaderGeomesa controller service:
<properties>
...
<entry><key>schema-access-strategy</key><value>infer-schema</value></entry>
<entry><key>csv-reader-csv-parser</key><value>commons-csv</value></entry>
<entry><key>Date Format</key><value>yyyy-MM-dd HH:mm:ss</value></entry>
<entry><key>Time Format</key><value>HH:mm:ss</value></entry>
<entry><key>Timestamp Format</key><value>yyyy-MM-dd HH:mm:ss</value></entry>
<entry><key>CSV Format</key><value>custom</value></entry>
<entry><key>Value Separator</key><value>,</value></entry>
<entry><key>Record Separator</key><value>\n</value></entry>
<entry><key>Skip Header Line</key><value>true</value></entry>
<entry><key>ignore-csv-header</key><value>true</value></entry>
<entry><key>Quote Character</key><value>"</value></entry>
<entry><key>Escape Character</key><value>\</value></entry>
...
</properties>

So, when NiFi loads and processes the CSV, the PutGeoMesaRedisRecord processor gives me an error:
java.lang.IllegalArgumentException: requirement failed: Z shards must be between 1 and 127

If I reduce the CSV to only 1 line, it works.

I don't understand whether the error lies in the schema creation on Redis, in the NiFi processor configuration, or in the CSV data.


--
Michele Andreoli


_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxx
To unsubscribe from this list, visit https://dev.eclipse.org/mailman/listinfo/geomesa-users



--
Michele Andreoli

