For example, suppose the original table has the following schema:

[Image: schema of the original table]
Now we want to convert the schema of this DataFrame to:
id: String
goods_name: String
price: Array<String>
SQL conversion
spark.sql("create table speedup_tmp_test_spark_schema_parquet12 using parquet as select cast(id as string), cast(goods_name as string), cast(price as array<string>) from tmp_test_spark_schema_parquet")
case class conversion
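// Needed for .toDF() on an RDD of case class instances
import spark.implicits._
// Target schema; define this at the top level (or in spark-shell), not inside a method
case class newSchemaClass(id: String, goods_name: String, price: Array[String])
// Original DataFrame
val df = spark.sql("select * from tmp_test_spark_schema_parquet")
// New DataFrame: convert each Row to the case class, reading price (column index 2) as a Seq[Int]
val newDF = df.rdd.map { r =>
  newSchemaClass(r(0).toString, r(1).toString, r.getSeq[Int](2).map(_.toString).toArray)
}.toDF()
// Fetch concrete data: the price column (index 2) of the third row, as a java.util.List[String]
newDF.collect()(2).getList[String](2)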
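
As a possible third route, the same casts can be written with the DataFrame API instead of SQL or an RDD round trip. The following is a minimal sketch, not the method from the original post: the object name `CastSchemaSketch`, the helper `castSchema`, the local session, and the sample rows are all hypothetical, and it assumes `price` starts out as an array of integers, as the case class example suggests.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object CastSchemaSketch {
  // Hypothetical helper: cast the three columns to the target types.
  // withColumn with an existing name replaces that column in place,
  // so column names and order are unchanged.
  def castSchema(df: DataFrame): DataFrame =
    df.withColumn("id", col("id").cast("string"))
      .withColumn("goods_name", col("goods_name").cast("string"))
      .withColumn("price", col("price").cast("array<string>"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cast-schema-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumed sample data standing in for tmp_test_spark_schema_parquet
    val df = Seq(
      (1, "apple", Seq(10, 12)),
      (2, "pear", Seq(7))
    ).toDF("id", "goods_name", "price")

    val newDF = castSchema(df)
    newDF.printSchema() // inspect the converted schema

    spark.stop()
  }
}
```

Compared with the case class approach, this stays inside the DataFrame API and avoids converting every row to a Scala object, which may matter on large tables.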