- Read CSV data and convert the longitude/latitude text columns into geometry-typed data for further processing
- The input is a CSV file with a header row
- 2. The test code is as follows:
import findspark
findspark.init()
from geospark.utils import KryoSerializer, GeoSparkKryoRegistrator
from geospark.core.SpatialRDD import PointRDD
from geospark.core.enums import FileDataSplitter
from pyspark.sql import SparkSession
from geospark.register import GeoSparkRegistrator
from geospark.utils.adapter import Adapter
from pyspark import SparkConf,SparkContext
from pyspark.sql.functions import *
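# Use Kryo serialization with GeoSpark's registrator so geometry objects can be serialized
spark = (
    SparkSession.builder
    .config("spark.serializer", KryoSerializer.getName)
    .config("spark.kryo.registrator", GeoSparkKryoRegistrator.getName)
    .getOrCreate()
)
# Register the GeoSpark SQL functions (ST_*) on this session
GeoSparkRegistrator.registerAll(spark)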
inputLocation = r"D:\pycharm\pythonProject\GeoSpark\functions\taxi\taxi\taxi.csv"
# Read the CSV with its header row; no schema is given, so every column is loaded as a string
df = spark.read.format("csv").option("header","true").load(inputLocation)
df.createOrReplaceTempView("view")
# Concatenate the longitude and latitude text into a WKT point string
textSpatialDf = spark.sql("""
select *,'Point('||view.pickup_longitude||' '||view.pickup_latitude||')' as geom from view
""")
# textSpatialDf.select("geom").show(truncate=False)
textSpatialDf.createOrReplaceTempView("textView")
# Parse the WKT string into a geometry column
spatialDf = spark.sql("""
select *,ST_PointFromText(textView.geom,'WKT') as geometry from textView
""")
spatialDf.show()
spatialDf.printSchema()
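The first query builds the `geom` text purely by string concatenation. As a rough illustration of what that expression yields for a single row, here is a plain-Python sketch. The helper name `to_point_wkt` is hypothetical (it is not part of GeoSpark or PySpark), and the coordinate range check is an added assumption that the SQL above does not perform.

```python
def to_point_wkt(longitude: str, latitude: str) -> str:
    """Mirror the SQL expression 'Point(' || lon || ' ' || lat || ')' for one row."""
    # float() raises ValueError if the text is not numeric
    lon, lat = float(longitude), float(latitude)
    # Assumed sanity check: WGS84 longitude/latitude ranges
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError(f"coordinate out of range: {longitude} {latitude}")
    # Reuse the original text so no digits are changed by float formatting
    return f"Point({longitude.strip()} {latitude.strip()})"


print(to_point_wkt("-73.994770000000003", "40.736828000000003"))
# prints: Point(-73.994770000000003 40.736828000000003)
```

Note that WKT puts longitude (x) first and latitude (y) second, which matches the column order used in the query.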
- The test results are as follows:
+---------+-------------------+-------------------+---------------+-------------------+-------------------+------------------+---------+------------------+-------------------+------------------+------------+-----------+---------+-------+------------------+------------------+------------------+--------------------+--------------------+
|vendor_id| pickup_datetime| dropoff_datetime|passenger_count| trip_distance| pickup_longitude| pickup_latitude|rate_code|store_and_fwd_flag| dropoff_longitude| dropoff_latitude|payment_type|fare_amount|surcharge|mta_tax| tip_amount| tolls_amount| total_amount| geom| geometry|
+---------+-------------------+-------------------+---------------+-------------------+-------------------+------------------+---------+------------------+-------------------+------------------+------------+-----------+---------+-------+------------------+------------------+------------------+--------------------+--------------------+
| CMT|2014-01-09 20:45:25|2014-01-09 20:52:31| 1|0.69999999999999996|-73.994770000000003|40.736828000000003| 1| N|-73.982226999999995|40.731789999999997| CRD| 6.5| 0.5| 0.5|1.3999999999999999| 0|8.9000000000000004|Point(-73.9947700...|POINT (-73.99477 ...|
| CMT|2014-01-09 20:46:12|2014-01-09 20:55:12| 1| 1.3999999999999999|-73.982392000000004|40.773381999999998| 1| N|-73.960448999999997|40.763995000000001| CRD| 8.5| 0.5| 0.5|1.8999999999999999| 0| 11.4|Point(-73.9823920...|POINT (-73.982392...|
| CMT|2014-01-09 20:44:47|2014-01-09 20:59:46| 2| 2.2999999999999998|-73.988569999999996|40.739406000000002| 1| N|-73.986626000000001| 40.765217| CRD| 11.5| 0.5| 0.5| 1.5| 0| 14|Point(-73.9885699...|POINT (-73.98857 ...|
| CMT|2014-01-09 20:44:57|2014-01-09 20:51:40| 1| 1.7|-73.960212999999996|40.770463999999997| 1| N|-73.979862999999995|40.777050000000003| CRD| 7.5| 0.5| 0.5| 1.7| 0|10.199999999999999|Point(-73.9602129...|POINT (-73.960213...|
| CMT|2014-01-09 20:47:09|2014-01-09 20:53:32| 1|0.90000000000000002|-73.995371000000006|40.717247999999998| 1| N|-73.984367000000006|40.720523999999997| CRD| 6| 0.5| 0.5| 1.75| 0| 8.75|Point(-73.9953710...|POINT (-73.995371...|
| CMT|2014-01-09 20:45:07|2014-01-09 20:51:01| 1|0.90000000000000002|-73.983811000000003|40.749654999999997| 1| N|-73.989746999999994|40.756574999999998| CRD| 6| 0.5| 0.5|1.3999999999999999| 0|8.4000000000000004|Point(-73.9838110...|POINT (-73.983811...|
| CMT|2014-01-09 20:44:04|2014-01-09 21:05:45| 1| 3.6000000000000001|-73.984138000000002|40.726317000000002| 1| N|-73.962868999999998| 40.758443| CRD| 16.5| 0.5| 0.5| 5.25| 0| 22.75|Point(-73.9841380...|POINT (-73.984138...|
| CMT|2014-01-09 20:43:23|2014-01-09 20:52:07| 1| 2.1000000000000001| -73.979906|40.745849999999997| 1| N|-73.959090000000003|40.773639000000003| CRD| 9| 0.5| 0.5| 2| 0| 12|Point(-73.979906 ...|POINT (-73.979906...|
| CMT|2014-01-09 20:43:04|2014-01-09 20:54:29| 1| 3.3999999999999999|-73.981147000000007|40.758918000000001| 1| N|-73.942509999999999|40.785975000000001| CRD| 12| 0.5| 0.5|2.6000000000000001| 0| 15.6|Point(-73.9811470...|POINT (-73.981147...|
| CMT|2014-01-09 20:50:23|2014-01-09 20:58:10| 1| 2.2999999999999998|-73.955192999999994|40.765467999999998| 1| N|-73.979022999999998|40.740577999999999| CRD| 9| 0.5| 0.5| 1| 0| 11|Point(-73.9551929...|POINT (-73.955193...|
| CMT|2014-01-09 20:51:36|2014-01-09 21:15:07| 1| 9.5|-73.885274999999993|40.773048000000003| 1| N|-73.980879000000002|40.777383999999998| CRD| 28.5| 0.5| 0.5| 6.96|5.3300000000000001|41.789999999999999|Point(-73.8852749...|POINT (-73.885275...|
| CMT|2014-01-09 20:48:04|2014-01-09 21:01:37| 1| 3.2999999999999998|-73.991782000000001| 40.748911| 1| N|-73.988359000000003| 40.714205| CRD| 12.5| 0.5| 0.5|4.0499999999999998| 0|17.550000000000001|Point(-73.9917820...|POINT (-73.991782...|
| CMT|2014-01-09 20:47:49|2014-01-09 20:56:11| 2| 1.8|-73.965716999999998|40.758674999999997| 1| N|-73.984059000000002|40.737448000000001| CRD| 8.5| 0.5| 0.5|1.8999999999999999| 0| 11.4|Point(-73.9657169...|POINT (-73.965717...|
| CMT|2014-01-09 20:48:47|2014-01-09 20:56:52| 2| 1.3999999999999999|-73.977008999999995|40.751620000000003| 1| N|-73.982642999999996|40.766573999999999| CRD| 7.5| 0.5| 0.5| 1.7| 0|10.199999999999999|Point(-73.9770089...|POINT (-73.977009...|
| CMT|2014-01-09 20:47:51|2014-01-09 21:02:31| 3| 2.6000000000000001|-73.977655999999996|40.753680000000003| 1| N|-73.952248999999995| 40.777676| CRD| 12.5| 0.5| 0.5| 1| 0| 14.5|Point(-73.9776559...|POINT (-73.977656...|
| CMT|2014-01-09 20:49:49|2014-01-09 21:20:38| 1| 11.199999999999999|-73.788265999999993|40.647542000000001| 1| N|-73.949224999999998|40.652700000000003| CRD| 35.5| 0.5| 0.5| 0| 0| 36.5|Point(-73.7882659...|POINT (-73.788266...|
| CMT|2014-01-09 16:51:35|2014-01-09 17:00:17| 1| 1.7| -74.007503|40.725991999999998| 1| N|-73.988181999999995|40.734583000000001| CRD| 8.5| 1| 0.5| 2| 0| 12|Point(-74.007503 ...|POINT (-74.007503...|
| CMT|2014-01-09 16:43:29|2014-01-09 16:59:15| 1| 4.7000000000000002|-74.014865999999998| 40.709353| 1| N|-73.986084000000005|40.759081000000002| CRD| 16| 1| 0.5| 4| 0| 21.5|Point(-74.0148659...|POINT (-74.014866...|
| CMT|2014-01-09 16:46:50|2014-01-09 16:56:41| 1| 1.6000000000000001| -73.967675| 40.763109| 1| N|-73.952590999999998|40.778185999999998| CRD| 9| 1| 0.5|2.1000000000000001| 0| 12.6|Point(-73.967675 ...|POINT (-73.967675...|
| CMT|2014-01-09 16:47:00|2014-01-09 17:37:58| 1| 17.899999999999999|-73.781730999999994|40.644728999999998| 2| N|-73.978604000000004|40.761822000000002| CRD| 52| 0| 0.5| 11.56|5.3300000000000001|69.390000000000001|Point(-73.7817309...|POINT (-73.781731...|
+---------+-------------------+-------------------+---------------+-------------------+-------------------+------------------+---------+------------------+-------------------+------------------+------------+-----------+---------+-------+------------------+------------------+------------------+--------------------+--------------------+
only showing top 20 rows
root
|-- vendor_id: string (nullable = true)
|-- pickup_datetime: string (nullable = true)
|-- dropoff_datetime: string (nullable = true)
|-- passenger_count: string (nullable = true)
|-- trip_distance: string (nullable = true)
|-- pickup_longitude: string (nullable = true)
|-- pickup_latitude: string (nullable = true)
|-- rate_code: string (nullable = true)
|-- store_and_fwd_flag: string (nullable = true)
|-- dropoff_longitude: string (nullable = true)
|-- dropoff_latitude: string (nullable = true)
|-- payment_type: string (nullable = true)
|-- fare_amount: string (nullable = true)
|-- surcharge: string (nullable = true)
|-- mta_tax: string (nullable = true)
|-- tip_amount: string (nullable = true)
|-- tolls_amount: string (nullable = true)
|-- total_amount: string (nullable = true)
|-- geom: string (nullable = true)
|-- geometry: geometry (nullable = false)
Of course, other methods can perform this conversion as well.
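One such alternative, sketched here under assumptions, skips the intermediate WKT string and passes the two coordinates straight to `ST_Point`. This assumes a GeoSpark 1.x release in which `ST_Point` takes two Decimal arguments, so the string columns are cast first; check the function list of your installed version. The helper `build_st_point_query` is a hypothetical name used only to assemble the SQL text.

```python
def build_st_point_query(view: str, lon_col: str, lat_col: str) -> str:
    """Return Spark SQL that adds a geometry column built from two coordinate columns."""
    # Only accept simple identifiers, since the names are spliced into the SQL text
    for name in (view, lon_col, lat_col):
        if not name.isidentifier():
            raise ValueError(f"not a simple identifier: {name!r}")
    # Assumption: ST_Point(Decimal, Decimal) as in GeoSpark 1.x
    return (
        "SELECT *, ST_Point("
        f"CAST({view}.{lon_col} AS Decimal(24,20)), "
        f"CAST({view}.{lat_col} AS Decimal(24,20))) AS geometry "
        f"FROM {view}"
    )


query = build_st_point_query("view", "pickup_longitude", "pickup_latitude")
print(query)
```

With the session and the temporary view `view` from the test code, the generated text would be run as `spark.sql(query)`. Whether it behaves identically to the `ST_PointFromText` route has not been verified here.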