Re: [geomesa-users] Shapefile ingestion

Hi Diethard,

I had a little time to fix the issue. The PR should be merged into master soon; it is up here if you want to build it yourself:

https://github.com/locationtech/geomesa/pull/1512

I tested it out with your shapefile: 244 features were ingested and 4 failed. Hopefully that will be good enough for now...



Thanks,

Emilio

On 05/17/2017 10:46 AM, Emilio Lahr-Vivaz wrote:
That would be awesome, thanks!

On 05/17/2017 10:41 AM, Diethard Steiner wrote:
Ok, thanks for clarifying. I'll wait for your commit next week then. Please let me know when it's ready ... I am also preparing a blog post/tutorial on this, so this could be beneficial for more users.

On 17 May 2017, at 15:12, Emilio Lahr-Vivaz <elahrvivaz@xxxxxxxx> wrote:

The tutorial was probably written under an earlier indexing scheme. We used to be more lenient in how we indexed values: we would treat values greater than 180 as 180. However, this led to some users getting unexpected results, so we decided to fail early.
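
As a rough illustration of the two behaviours, here is a hypothetical Python sketch. It is not the actual GeoMesa code (which is Scala), and the function names are made up for this example:

```python
# Hypothetical sketch of lenient vs. fail-early handling of lon/lat values.
# Names and structure are illustrative only, not GeoMesa's implementation.

LON_BOUNDS = (-180.0, 180.0)
LAT_BOUNDS = (-90.0, 90.0)


def clamp(value, bounds):
    """Pull a value back inside [lo, hi]."""
    lo, hi = bounds
    return max(lo, min(hi, value))


def normalize_lenient(lon, lat):
    """Older-style behaviour: silently treat out-of-range values as the bound."""
    return clamp(lon, LON_BOUNDS), clamp(lat, LAT_BOUNDS)


def normalize_strict(lon, lat):
    """Fail-early behaviour: reject anything outside the bounds."""
    lon_ok = LON_BOUNDS[0] <= lon <= LON_BOUNDS[1]
    lat_ok = LAT_BOUNDS[0] <= lat <= LAT_BOUNDS[1]
    if not (lon_ok and lat_ok):
        raise ValueError(f"Values out of bounds: {lon} {lat}")
    return lon, lat
```

With the longitude from the reported error (180.00000190734863), the lenient version would quietly index the value as 180.0, while the strict version raises.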

Thanks,

Emilio

On 05/17/2017 10:07 AM, Diethard Steiner wrote:
Hi Emilio,

Thanks a lot for your reply! It would be great if you could look into this next week.
Since this shapefile is referenced in the GeoMesa Spark Jupyter tutorial, I am a bit surprised that ingestion doesn't work. Maybe the reason is that I am using the simplified shapefile as opposed to the normal one. I will try the latter as soon as I find time and report back.

I also tried to open the simplified shapefile in QGIS and it was displayed just fine.

Best regards,
Diethard

On 17 May 2017, at 14:34, Emilio Lahr-Vivaz <elahrvivaz@xxxxxxxx> wrote:

Hi Diethard,

From the exception, it looks like there are some polygons in the shapefile that are outside the lat/lon bounds of [-90, 90] and [-180, 180]. When we encounter coordinates like that we fail the indexing as invalid.

It's also possible that there are some subtle precision errors going on, since the coordinates are very close to 180. I'm not sure if that would be coming from the shapefile or something in our processing.

I think for a quick fix, in our shapefile ingestion code we should be ingesting the features one at a time, instead of all at once. This would allow us to skip any polygons that have invalid coordinates, but still ingest the rest of the shapefile:

https://github.com/locationtech/geomesa/blob/master/geomesa-utils/src/main/scala/org/locationtech/geomesa/utils/geotools/GeneralShapefileIngest.scala#L91

As an example, in our non-shapefile ingest code we ingest features one at a time:

https://github.com/locationtech/geomesa/blob/master/geomesa-tools/src/main/scala/org/locationtech/geomesa/tools/ingest/AbstractIngest.scala#L125-L138
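
The one-at-a-time approach could be sketched roughly as follows. This is a Python illustration under assumptions, not the code from either link: `write_feature` stands in for whatever writes and indexes a single feature, `check_bounds` is a made-up validator, and features are plain dicts with hypothetical `id`, `lon` and `lat` keys.

```python
# Hypothetical sketch: write features individually so that one invalid
# feature does not abort the whole ingest. Not GeoMesa's actual code.


def check_bounds(feature):
    """Raise ValueError if the (illustrative) lon/lat keys are out of range."""
    lon, lat = feature["lon"], feature["lat"]
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError(f"Values out of bounds: {lon} {lat}")


def ingest_one_at_a_time(features, write_feature):
    """Write each feature separately.

    Returns the number of features written and a list of
    (feature id, error message) pairs for the ones that were skipped.
    """
    written = 0
    failures = []
    for feature in features:
        try:
            write_feature(feature)
            written += 1
        except ValueError as err:
            # Record the failure and carry on with the next feature.
            failures.append((feature.get("id"), str(err)))
    return written, failures
```

A caller would pass in a writer that validates and stores each feature, then report the two counts at the end.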

I don't think I have time to look into it this week, but verifying the CRS and coordinates in the shapefile would also be a good idea. I should have time next week if it's not resolved by then.
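
One quick way to check the coordinates without any GIS tooling is to read the bounding box stored in the `.shp` header. A minimal Python sketch, assuming the standard 100-byte shapefile header layout (big-endian file code 9994 at byte 0; little-endian doubles Xmin, Ymin, Xmax, Ymax at bytes 36 to 67). The function names are hypothetical, and the header box is only as accurate as the tool that wrote the file:

```python
import struct


def read_shp_bbox(path):
    """Return (xmin, ymin, xmax, ymax) from the main .shp file header."""
    with open(path, "rb") as f:
        header = f.read(100)
    if len(header) < 100:
        raise ValueError("file is too short to contain a shapefile header")
    # The file code is a big-endian int and should always be 9994.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError(f"unexpected file code {file_code}, not a shapefile?")
    # The bounding box is four little-endian doubles starting at byte 36.
    return struct.unpack("<4d", header[36:68])


def bbox_in_lonlat_bounds(bbox):
    """True if the box fits inside [-180, 180] x [-90, 90]."""
    xmin, ymin, xmax, ymax = bbox
    return -180.0 <= xmin and xmax <= 180.0 and -90.0 <= ymin and ymax <= 90.0
```

If `bbox_in_lonlat_bounds` returns False, at least one geometry is likely to fail the bounds check at indexing time. The CRS itself would be described in the `.prj` file, if one is present.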

Thanks,

Emilio

On 05/16/2017 06:23 PM, Diethard Steiner wrote:
Thanks a lot Jim and Emilio for your replies!

I quickly tried without specifying the feature/schema:

geomesa ingest -u root -p password \
  -c myNamespace.countries -f shapeFile TM_WORLD_BORDERS_SIMPL-0.3.shp

And I get the following error:

ERROR requirement failed: Values out of bounds ([-180.0 180.0] [-90.0 90.0]): [-180.0 180.00000190734863] [-90.0 -60.54722595214844]
java.lang.IllegalArgumentException: requirement failed: Values out of bounds ([-180.0 180.0] [-90.0 90.0]): [-180.0 180.00000190734863] [-90.0 -60.54722595214844]
    at scala.Predef$.require(Predef.scala:219)
    at org.locationtech.geomesa.curve.XZ2SFC.org$locationtech$geomesa$curve$XZ2SFC$$normalize(XZ2SFC.scala:321)
    at org.locationtech.geomesa.curve.XZ2SFC.index(XZ2SFC.scala:55)
    at org.locationtech.geomesa.index.index.XZ2Index$$anonfun$toIndexKey$1.apply(XZ2Index.scala:124)

(which is the same error I got before, or at least a very similar one)

You can download the shapefile from [Thematicmapping.org](http://thematicmapping.org/downloads/world_borders.php): From the **Download** section choose `TM_WORLD_BORDERS_SIMPL-0.3.zip`. I've also attached it.

It's quite late over here now ... so I'll only have a chance to check your reply tomorrow. Thanks in advance for your help!

Best regards,
Diethard


On Tue, May 16, 2017 at 10:31 PM, Jim Hughes <jnh5y@xxxxxxxx> wrote:
Hi Diethard,

Feel free to pass on the rest of the exception and/or a link to the shapefile in question. We assume that the data is in EPSG:4326 (longitude, latitude) and that the points are inside that CRS's bounding box.

If the shapefile violates either of those conditions, it might need a little pre-processing TLC.

Generally, with those assumptions, as Emilio said, you shouldn't need to create the schema beforehand.

Cheers,

Jim


On 05/16/2017 05:15 PM, Emilio Lahr-Vivaz wrote:
Hi Diethard,

Likely the schema you're creating doesn't quite align with the shapefile. Try the ingest command without creating the schema first - the ingest command will create it for you. You can either delete the existing schema or change the catalog table.

We use the geotools shapefile data store to read a shapefile - I believe that you pass it the primary .shp file, but it expects the other files to be alongside.
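
If in doubt, a quick check that the companion files sit next to the `.shp` file could look like this. It is a small Python sketch, not part of GeoMesa or GeoTools; `missing_sidecars` is a made-up helper, and it assumes the usual convention that the `.shx` and `.dbf` files share the base name of the `.shp` file:

```python
from pathlib import Path


def missing_sidecars(shp_path, required=(".shx", ".dbf")):
    """Return the extensions of companion files not found next to the .shp."""
    shp = Path(shp_path)
    # with_suffix swaps only the final extension, so a name like
    # TM_WORLD_BORDERS_SIMPL-0.3.shp maps to TM_WORLD_BORDERS_SIMPL-0.3.dbf.
    return [ext for ext in required if not shp.with_suffix(ext).exists()]
```

An empty list means the index and attribute files are in place; adding `".prj"` to `required` would also flag a missing CRS definition.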

Thanks,

Emilio

On 05/16/2017 03:00 PM, Diethard Steiner wrote:
Hi,

I am trying to follow the Spark Jupyter tutorial and I am struggling a bit with the shapefile import. The documentation shows a very basic example of how to ingest a shapefile, referencing the `shp` file directly and none of the other required files of the shapefile bundle. Is the `shp` file actually the only one that will be imported? No attributes from the dbf file etc.?

I tried the following but I get an out-of-bounds exception:

# create feature/schema
geomesa create-schema \
    -u root \
    -c myNamespace.countries \
    -f countriesFeature \
    -s shape:Polygon,fips:String,iso2:String,iso3:String,un:Integer,name:String,area:String,pop2005:Double,region:Integer,subregion:Integer,lon:Float,lat:Float

# ingest data
geomesa ingest -u root -p password \
  -c myNamespace.countries -f shapeFile countriesFeature TM_WORLD_BORDERS_SIMPL-0.3.shp

Can you please clarify how to correctly import this file?

Thanks,
Diethard


_______________________________________________
geomesa-users mailing list
geomesa-users@xxxxxxxxxxxxxxxx
To change your delivery options, retrieve your password, or unsubscribe from this list, visit
https://dev.locationtech.org/mailman/listinfo/geomesa-users