1. Introduction

In the previous article we discussed how GeoTrellis designs its underlying data type model. How does GeoTrellis actually read data out of a TIFF file?

Let us look again at the class structure diagram below:

Green: class inheritance
Red: trait implementation

Most of the traits mixed into the UInt32GeoTiffTile class relate to one of two kinds of behavior:

Segment-related traits
Macro-related traits

We start with the Segment-related traits. Before analyzing the Segment model, some background on how data is laid out in a GeoTiff is needed.

2. Layout of image data in a TIFF file

The official documentation describes the data structure of TIFF files. Image data can be laid out in a TIFF file in two ways:

Striped layout
Tiled layout

A more detailed description is also available here.

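2.1 Striped layout

As the name suggests, in a striped TIFF file the unit of storage is the strip: the image data is divided into a number of strips, each containing a fixed number of rows. Together the strips make up the full image data.

A striped layout is described by three TIFF tags:

RowsPerStrip: the number of rows in a strip. Every strip in the image must have the same number of rows, with some exceptions (such as the last strip, which may have fewer).
StripOffsets: an offset table giving the starting position of each strip in the TIFF file.
- StripOffsets does not impose an order, so strips may appear in the file in any order.
- Some readers turn a TIFF file into garbage made of discontinuous bands. A likely cause is that, for speed, they assume strips are stored in order instead of reading them according to the actual offset table.
StripByteCounts: an array of strip sizes, giving the size of each strip in bytes.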
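The relationship between these three tags can be sketched in a few lines. This is only an illustration: StripLayoutSketch and its method names are hypothetical, the file is modeled as an in-memory byte array, and compression is ignored.

```scala
object StripLayoutSketch {
  // Number of strips needed to cover the image (ceiling division).
  def stripCount(imageLength: Int, rowsPerStrip: Int): Int =
    (imageLength + rowsPerStrip - 1) / rowsPerStrip

  // Rows actually stored in strip i; only the last strip may be shorter.
  def rowsInStrip(i: Int, imageLength: Int, rowsPerStrip: Int): Int =
    math.min(rowsPerStrip, imageLength - i * rowsPerStrip)

  // Read strip i through the offset and byte-count tables,
  // never assuming that strips are stored in order or back to back.
  def readStrip(file: Array[Byte], offsets: Array[Long], byteCounts: Array[Long], i: Int): Array[Byte] = {
    val start = offsets(i).toInt
    file.slice(start, start + byteCounts(i).toInt)
  }
}
```

For a 100-row image with RowsPerStrip = 32, this gives four strips, the last of which holds only 4 rows.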

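Advantages of the striped layout:

Only the strips that are needed have to be read into memory, which saves memory.
Thanks to the offset table, random access to the data is easier.

Disadvantage of the striped layout:

When only a small part of the data is needed but it spans many rows, a lot of redundant data is read.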

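2.2 Tiled layout

TIFF 6.0 introduced the tiled layout. A tile can be thought of as a two-dimensional strip with both a width and a height. It is the more common layout today.

A tiled layout is described by four TIFF tags:

TileWidth/TileLength: analogous to RowsPerStrip. Each must be a multiple of 16, but the two need not be equal.
TileOffsets: analogous to StripOffsets.
TileByteCounts: analogous to StripByteCounts.

Compared with the striped layout, the tiled layout has a finer granularity, makes reading a local region cheaper, and is favorable to compression.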
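To make the cost difference concrete, the following rough sketch counts how many pixels have to be fetched to cover a small window under each layout. It assumes uncompressed data, treats every strip and tile as full-sized, and uses the hypothetical names ReadCostSketch, stripedPixels and tiledPixels.

```scala
object ReadCostSketch {
  // Number of blocks of size `block` touched by the range [start, start + len).
  private def blocksTouched(start: Int, len: Int, block: Int): Int =
    (start + len - 1) / block - start / block + 1

  // Pixels fetched from a striped file: every touched strip spans the full image width.
  def stripedPixels(imageWidth: Int, rowsPerStrip: Int, row: Int, height: Int): Long =
    blocksTouched(row, height, rowsPerStrip).toLong * rowsPerStrip * imageWidth

  // Pixels fetched from a tiled file: only the tiles intersecting the window.
  def tiledPixels(tileWidth: Int, tileLength: Int, col: Int, row: Int, width: Int, height: Int): Long =
    blocksTouched(col, width, tileWidth).toLong *
      blocksTouched(row, height, tileLength) * tileWidth * tileLength
}
```

For a 100 x 100 window at column 5000, row 5000 of an image 10000 pixels wide, 16-row strips require 7 strips (1,120,000 pixels), while 256 x 256 tiles require a single tile (65,536 pixels).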

2.3 Differences and connections between the two layouts

For GeoTiff data, whichever layout the source uses, access follows the same pattern:

Using the offsets and byte counts, traverse each strip/tile.

What differs is how concrete coordinates are computed, which depends on the layout. After all, the vast majority of real operations target specific positions rather than the entire byte stream of a strip/tile.

This is why GeoTrellis designs the Segment data model.

3. Overview of the Segment model

In the code we can find several structures with Segment in their names. Their main roles are as follows (CellType stands for a family with one member per data type):

SegmentBytes: defines reading Byte data from a ByteReader
- LazySegmentBytes: reads data lazily, on demand
- ArraySegmentBytes: reads all data up front
GeoTiffSegment: defines the abstract get/map logic of a Segment
- CellTypeGeoTiffSegment: implements the map methods
  - CellTypeWithNoDataGeoTiffSegment: implements the get methods
GeoTiffSegmentCollection: defines how Segments are obtained/traversed from bytes
- CellTypeGeoTiffSegmentCollection: implements decompressing Byte data into the corresponding data type
GeoTiffSegmentLayout: defines the abstract logic for locating a Segment index from a column/row
SegmentTransform: defines locating the index of a value inside a Segment from a column/row
- StripedSegmentTransform: the lookup for striped layouts
- TiledSegmentTransform: the lookup for tiled layouts
GeoTiffSegmentLayoutTransform: defines the logic for accessing concrete values in a Segment

Roughly, they are composed as follows:

GeoTiffSegmentCollection contains:
- SegmentBytes
- GeoTiffSegment
GeoTiffSegmentLayoutTransform contains:
- SegmentTransform
- GeoTiffSegmentLayout

We start with reading Byte data into a concrete type.

4. Reading Byte data into a concrete type

4.1 The GeoTiffSegmentCollection trait

In the inheritance diagram, UInt32GeoTiffTile mixes in the UInt32GeoTiffSegmentCollection trait, which in turn extends GeoTiffSegmentCollection.

The code is as follows:

trait GeoTiffSegmentCollection {
  
  type T >: Null <: GeoTiffSegment

  val segmentBytes: SegmentBytes
  val decompressor: Decompressor
  val bandType: BandType

  // Abstract decompression function: turns (segment index, raw bytes) into a GeoTiffSegment
  val decompressGeoTiffSegment: (Int, Array[Byte]) => T

  // Cache of the most recently returned segment
  private var _lastSegment: T = null
  private var _lastSegmentIndex: Int = -1

  // Fetch the Segment for a given segment index, reusing the cached one when possible
  def getSegment(i: Int): T = {
    if(i != _lastSegmentIndex) {
      _lastSegment = decompressGeoTiffSegment(i, segmentBytes.getSegment(i))
      _lastSegmentIndex = i
    }
    _lastSegment
  }

  // Iterate over the requested Segments
  def getSegments(ids: Traversable[Int]): Iterator[(Int, T)] = {
    for { (id, bytes) <- segmentBytes.getSegments(ids) }
      yield id -> decompressGeoTiffSegment(id, bytes)
  }
}

trait UInt32GeoTiffSegmentCollection extends GeoTiffSegmentCollection {
  type T = UInt32GeoTiffSegment

  val bandType = UInt32BandType
    
  // Concrete decompression function for UInt32 data
  lazy val decompressGeoTiffSegment =
    (i: Int, bytes: Array[Byte]) => new UInt32GeoTiffSegment(decompressor.decompress(bytes, i))
}

We focus on the core method of GeoTiffSegmentCollection, getSegment. It provides one important capability: given a segment index, return a Segment object.

Reading the code, its logic is:

Read the raw, compressed Byte data from the data source.
Decompress it.
Wrap the decompressed Byte data in a GeoTiffSegment object of the specified type and return it.

Each step corresponds to a field or method:

segmentBytes field: reads the bytes. [defined in the SegmentBytes trait]
decompressor field: decompresses the compressed bytes. [defined in the Decompressor class]
decompressGeoTiffSegment method: converts the decompressed bytes into the actual type of the GeoTiff file (UInt32 in this example), producing a UInt32GeoTiffSegment object. [the conversion itself lives in the UInt32GeoTiffSegment class]
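The three steps, together with the single-entry cache that getSegment keeps, can be sketched as follows. This is a simplified stand-in rather than the GeoTrellis implementation: DecodedSegment and CachedSegments are hypothetical names, decompression is treated as the identity, and each byte is decoded as one unsigned value.

```scala
object SegmentCacheSketch {
  // Hypothetical stand-in for a decoded segment.
  final case class DecodedSegment(index: Int, values: Vector[Int])

  class CachedSegments(rawSegments: Vector[Array[Byte]]) {
    // Counts decode calls so that the effect of the cache is observable.
    var decodeCount = 0

    private var lastIndex = -1
    private var lastSegment: DecodedSegment = null

    // Steps 2 and 3: "decompress" (identity here) and convert bytes to unsigned values.
    private def decode(i: Int, bytes: Array[Byte]): DecodedSegment = {
      decodeCount += 1
      DecodedSegment(i, bytes.map(_ & 0xFF).toVector)
    }

    def getSegment(i: Int): DecodedSegment = {
      if (i != lastIndex) {
        lastSegment = decode(i, rawSegments(i)) // step 1: fetch the raw bytes
        lastIndex = i
      }
      lastSegment
    }
  }
}
```

Repeated reads from the same segment, which is the common case when scanning neighbouring pixels, decode it only once.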

Let us start with SegmentBytes and see how GeoTrellis implements this logic.

4.2 The SegmentBytes trait

Recall the call to the GeoTiffTile constructor: segmentBytes comes from the GeoTiffInfo object passed in:

// Call site
GeoTiffTile(
    info.segmentBytes, // passed in
    info.decompressor,
    info.segmentLayout,
    info.compression,
    info.cellType,
    Some(info.bandType),
    info.overviews.map(geoTiffSinglebandTile)
)

object GeoTiffTile {
  def apply(
    segmentBytes: SegmentBytes, // formal parameter
    decompressor: Decompressor,
    segmentLayout: GeoTiffSegmentLayout,
    compression: Compression,
    cellType: CellType,
    bandType: Option[BandType] = None,
    overviews: List[GeoTiffTile] = Nil
  ): GeoTiffTile = {
    bandType match {
      case Some(UInt32BandType) =>
        cellType match {
          case ct: FloatCells =>
            new UInt32GeoTiffTile(
              segmentBytes, // passed on to the constructor
              decompressor,
              segmentLayout,
              compression,
              ct,
              overviews.map(applyOverview(_, compression, cellType, bandType)).collect { case gt: UInt32GeoTiffTile => gt }
            )
    // ... omitted

Where segmentBytes is actually assigned:

// In the definition of GeoTiffInfo

val segmentBytes: SegmentBytes =
  if (streaming)
    LazySegmentBytes(byteReader, tiffTags)
  else
    // byteReader is the same reader that was used to read the TIFF tags
    // tiffTags has already been read completely at this point
    ArraySegmentBytes(byteReader, tiffTags)

Let us first look at the definition of the SegmentBytes trait:

trait SegmentBytes extends Seq[Array[Byte]] with Serializable {
  def getSegment(i: Int): Array[Byte]
  def getSegments(indices: Traversable[Int]): Iterator[(Int, Array[Byte])]
  def getSegmentByteCount(i: Int): Int
  def apply(idx: Int): Array[Byte] = getSegment(idx)
  def iterator: Iterator[Array[Byte]] =
    getSegments(0 until length).map(_._2)
}

We can see that:

SegmentBytes can be viewed as a sequence of byte arrays. Each byte array is one strip/tile, and the whole sequence makes up the full image data.
What SegmentBytes has to provide is access to these byte arrays (strips/tiles), either one at a time or through an iterator.

Because data is always read from the storage medium as bytes, SegmentBytes is the data source of the entire Segment model.

Next, let us look at the two classes that implement the SegmentBytes trait: ArraySegmentBytes and LazySegmentBytes.

4.2.1 The LazySegmentBytes class

LazySegmentBytes reads data from a ByteReader into memory on demand:

class LazySegmentBytes(
  byteReader: ByteReader,
  tiffTags: TiffTags,
  maxChunkSize: Int = 32 * 1024 * 1024,
  maxOffsetBetweenChunks: Int = 1024
) extends SegmentBytes {

  import LazySegmentBytes.Segment

  def length: Int = tiffTags.segmentCount
    
  // Pick the offset table and byte-count table that match the layout (striped or tiled)
  val (segmentOffsets, segmentByteCounts) =
    if (tiffTags.hasStripStorage) {
      val stripOffsets = tiffTags &|->
        TiffTags._basicTags ^|->
        BasicTags._stripOffsets get
      val stripByteCounts = tiffTags &|->
        TiffTags._basicTags ^|->
        BasicTags._stripByteCounts get
      (stripOffsets.get, stripByteCounts.get)
    } else {
      val tileOffsets = tiffTags &|->
        TiffTags._tileTags ^|->
        TileTags._tileOffsets get
      val tileByteCounts = tiffTags &|->
        TiffTags._tileTags ^|->
        TileTags._tileByteCounts get
      (tileOffsets.get, tileByteCounts.get)
    }

  def getSegmentByteCount(i: Int): Int = segmentByteCounts(i).toInt

  // Pack Segments into chunks (read buffers)
  protected def chunkSegments(segmentIds: Traversable[Int]): List[List[Segment]]  = {
    {for { id <- segmentIds } yield {
      // Record each Segment's start offset and length without reading any pixel data
      val offset = segmentOffsets(id)
      val length = segmentByteCounts(id)
      Segment(id, offset, offset + length - 1)
    }}.toSeq
      .sortBy(_.startOffset) // GeoTiff does not require segments to be stored in order, so sort by ascending offset to read the file front to back
      .foldLeft((0L, List(List.empty[Segment]))) { case ((chunkSize, headChunk :: commitedChunks), seg) =>
      // chunkSize: size in bytes of the chunk currently being filled
      // headChunk: the chunk currently being filled (head of the list, most recently started), a List[Segment]
      // commitedChunks: all previously completed chunks
      // seg: the incoming Segment

      // Decide whether seg is close enough to stay in the current chunk
      val isSegmentNearChunk =
        // headChunk is Nil before the first Segment is added, so headOption is the safe way to inspect it
        headChunk.headOption.map { c =>
          // Check that the gap between the last added Segment and seg is small enough
          seg.startOffset - c.endOffset <= maxOffsetBetweenChunks
        }.getOrElse(true) // an empty chunk accepts any Segment

      // If neither the size limit nor the gap limit is exceeded
      if (chunkSize + seg.size <= maxChunkSize && isSegmentNearChunk)
        // keep adding to the current chunk
        (chunkSize + seg.size) -> ((seg :: headChunk) :: commitedChunks)
      else
        // start a new chunk whose first element is seg
        seg.size -> ((seg :: Nil) :: headChunk :: commitedChunks)
    }
  }._2.reverse.map(_.reverse) // two reversals: the chunk list and each chunk's Segments were both built by prepending


  // Read a single Segment directly, without chunking
  def getSegment(i: Int): Array[Byte] = {
    val startOffset = segmentOffsets(i)
    val endOffset = segmentOffsets(i) + segmentByteCounts(i) - 1
    getBytes(startOffset, segmentByteCounts(i))
  }
  
  // Read the bytes of every Segment in one chunk
  protected def readChunk(segments: List[Segment]): Map[Int, Array[Byte]] = {
    segments
      .map { segment =>
        segment.id -> getBytes(segment.startOffset, segment.endOffset - segment.startOffset + 1)
      }
      .toMap
  }

  // Return an iterator over the bytes of all requested Segments, chunk by chunk
  def getSegments(indices: Traversable[Int]): Iterator[(Int, Array[Byte])] = {
    val chunks = chunkSegments(indices)
    chunks
      .toIterator // convert to an iterator so that reading is lazy
      .flatMap(chunk => readChunk(chunk)) // each step of the iteration reads one chunk
  }

  // The method that actually reads the bytes
  private[geotrellis] def getBytes(offset: Long, length: Long): Array[Byte] = {
    byteReader.position(offset)
    byteReader.getBytes(length.toInt)
  }

}

object LazySegmentBytes {
  def apply(byteReader: ByteReader, tiffTags: TiffTags): LazySegmentBytes =
    new LazySegmentBytes(byteReader, tiffTags)

  // Logical description of a Segment: its id and its byte range in the file
  case class Segment(id: Int, startOffset: Long, endOffset: Long) {
    def size: Long = endOffset - startOffset + 1
  }
}

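As the name suggests, LazySegmentBytes reads the file lazily, one chunk at a time, through the getSegments method. The steps are:

Record the offset and size of each requested strip/tile in a Segment object, without reading its data.
Sort the Segments by offset and merge consecutive ones into a chunk (List[Segment]) as long as two conditions both hold: the total size of the strips/tiles in the chunk does not exceed maxChunkSize (32MB by default), and the gap between the end of the previous Segment and the start of the next one does not exceed maxOffsetBetweenChunks (1024 bytes by default).
The result is a list of chunks, List[List[Segment]].
When reading, the chunk list is converted to an iterator, and each step of the iteration processes one chunk.
Because iterators are lazy, the code attached to an element only runs when that element is requested. Attaching the byte-reading code to each step therefore means data is only read when iteration reaches the corresponding chunk, so each read brings at most one chunk's worth of data (32MB by default) into memory. This strikes a balance between efficiency and resource usage.
- Without lazy loading, all data would be read into memory up front, which could double memory usage and waste resources.
- Without grouping Segments into chunks, even less memory would be used, but the larger number of small IO operations could hurt efficiency.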
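The grouping rule is easier to follow with tiny limits. The sketch below restates the two conditions in simplified form and is not the GeoTrellis code itself: ChunkingSketch and chunk are hypothetical names, and the limits are passed explicitly instead of using the 32MB/1024-byte defaults.

```scala
object ChunkingSketch {
  final case class Segment(id: Int, startOffset: Long, endOffset: Long) {
    def size: Long = endOffset - startOffset + 1
  }

  // Group segments, sorted by file position, into chunks. A new chunk starts when
  // adding the segment would exceed maxChunkSize, or when the gap between the
  // previously added segment and this one exceeds maxGap.
  def chunk(segments: Seq[Segment], maxChunkSize: Long, maxGap: Long): List[List[Segment]] = {
    val sorted = segments.sortBy(_.startOffset)
    val (_, reversed) = sorted.foldLeft((0L, List.empty[List[Segment]])) { (acc, seg) =>
      val (size, chunks) = acc
      val fits = chunks.headOption.exists { current =>
        size + seg.size <= maxChunkSize && seg.startOffset - current.head.endOffset <= maxGap
      }
      if (fits) (size + seg.size, (seg :: chunks.head) :: chunks.tail)
      else (seg.size, List(seg) :: chunks)
    }
    // Chunks and their segments were built by prepending, so reverse both.
    reversed.reverse.map(_.reverse)
  }
}
```

With four 100-byte segments at offsets 0, 100, 5000 and 5100, a size limit of 1000 and a gap limit of 10, the result is two chunks: segments 0 and 1 together, then segments 2 and 3. Lowering the size limit to 150 puts every segment in its own chunk.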

4.2.2 The ArraySegmentBytes class

ArraySegmentBytes serves data directly from memory. Its companion object wraps LazySegmentBytes to load everything up front:

class ArraySegmentBytes(compressedBytes: Array[Array[Byte]]) extends SegmentBytes {

  def length = compressedBytes.length
  def getSegment(i: Int) = compressedBytes(i)
  def getSegmentByteCount(i: Int): Int = compressedBytes(i).length
  def getSegments(indices: Traversable[Int]): Iterator[(Int, Array[Byte])] =
    indices.toIterator
      .map { i => i -> compressedBytes(i) }
}

object ArraySegmentBytes {

  def apply(byteReader: ByteReader, tiffTags: TiffTags): ArraySegmentBytes = {
    // Use LazySegmentBytes to read all segments of the file into memory at once
    val streaming = LazySegmentBytes(byteReader, tiffTags)
    val compressedBytes = Array.ofDim[Array[Byte]](streaming.length)
    streaming.getSegments(compressedBytes.indices).foreach {
      case (i, bytes) => compressedBytes(i) = bytes
    }
    new ArraySegmentBytes(compressedBytes)
  }
}

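4.3 The Decompressor class

Decompressor defines the logic for decompressing Byte data. It also comes from the GeoTiffInfo object passed to the GeoTiffTile constructor.

Data is usually compressed before being stored in a TIFF file, so it has to be decompressed when read. The compression algorithms supported by default can be seen here. We do not need to look at how the compression/decompression methods themselves are implemented, since they are standard general-purpose algorithms. We only need to see how they interact with the GeoTrellis logic.

The factory method of Decompressor is as follows: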

object Decompressor {
  def apply(tiffTags: TiffTags, byteOrder: ByteOrder): Decompressor = {
    import geotrellis.raster.io.geotiff.tags.codes.CompressionType._

    // Handle byte order
    def checkEndian(d: Decompressor): Decompressor = {
      // ByteBuffer defaults to big-endian, so little-endian data has to be flipped
      if(byteOrder != ByteOrder.BIG_ENDIAN && tiffTags.bitsPerPixel > 8) {
        d.flipEndian(tiffTags.bytesPerPixel / tiffTags.bandCount)
      } else { d }
    }

    // Handle the predictor
    def checkPredictor(d: Decompressor): Decompressor = {
      val predictor = Predictor(tiffTags)
      if(predictor.checkEndian)
        checkEndian(d).withPredictor(predictor)
      else { d.withPredictor(predictor) }
    }

    val segmentCount = tiffTags.segmentCount
    val segmentSizes = Array.ofDim[Int](segmentCount)
    val bandCount = tiffTags.bandCount
    if(!tiffTags.hasPixelInterleave || bandCount == 1) {
      cfor(0)(_ < segmentCount, _ + 1) { i =>
        segmentSizes(i) = tiffTags.imageSegmentByteSize(i).toInt
      }
    } else {
      cfor(0)(_ < segmentCount, _ + 1) { i =>
        segmentSizes(i) = tiffTags.imageSegmentByteSize(i).toInt * tiffTags.bandCount
      }
    }

    // Choose the decompressor according to the compression type recorded in the metadata
    tiffTags.compression match {
      case Uncompressed =>
        checkEndian(NoCompression)
      case LZWCoded =>
        checkPredictor(LZWDecompressor(segmentSizes))
      case ZLibCoded | PkZipCoded =>
        checkPredictor(DeflateCompression.createDecompressor(segmentSizes))
      case PackBitsCoded => // PackBits compression does not support a predictor
        checkEndian(PackBitsDecompressor(segmentSizes))
      case JpegCoded => // lossy compression, the predictor concept does not apply
        checkEndian(JpegDecompressor(tiffTags))

      // Unsupported compression types throw an exception
      case HuffmanCoded =>
        val msg = "compression type CCITTRLE is not supported by this reader."
        throw new GeoTiffReaderLimitationException(msg)
      // ... several other unsupported compression types omitted
    }
  }
}

Details on predictors can be found here.
There is also a discussion of TIFF compression here.
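The byte-order handling in checkEndian can be illustrated on its own. The sketch below is an assumption about what flipping a segment amounts to, not the body of the library's flipEndian method: EndianSketch is a hypothetical name, and the input length is assumed to be a multiple of bytesPerSample.

```scala
import java.nio.{ByteBuffer, ByteOrder}

object EndianSketch {
  // Reverse the byte order inside every sample of bytesPerSample bytes.
  def flipEndian(bytes: Array[Byte], bytesPerSample: Int): Array[Byte] = {
    val out = new Array[Byte](bytes.length)
    for (i <- bytes.indices) {
      val sampleStart = i - i % bytesPerSample
      out(sampleStart + (bytesPerSample - 1 - i % bytesPerSample)) = bytes(i)
    }
    out
  }

  // Decode the first 16-bit sample with the default (big-endian) ByteBuffer order.
  def firstShortBigEndian(bytes: Array[Byte]): Short =
    ByteBuffer.wrap(bytes).getShort()

  // Decode the first 16-bit sample as little-endian, for comparison.
  def firstShortLittleEndian(bytes: Array[Byte]): Short =
    ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort()
}
```

After flipping, little-endian data can be read through a ByteBuffer in its default big-endian order, which is why single-byte samples (bitsPerPixel of 8 or less) need no flip at all.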

4.4 The GeoTiffSegment class and its subclasses

Compressed data is read through SegmentBytes, decompressed into raw bytes by the Decompressor, and finally converted into values of the actual data type in GeoTiffSegment.

As described in the previous article on the data model, GeoTrellis has 7*3+1 actual data types because of how NoData values are defined. A matching GeoTiffSegment is therefore needed for each CellType.

Taking Float32 as an example, let us see how GeoTiffSegment implements this:

// Abstract GeoTiffSegment: declares the methods without implementing them
trait GeoTiffSegment {
  def size: Int
  def getInt(i: Int): Int // get the value at index i
  def getDouble(i: Int): Double

  def bytes: Array[Byte]

  def map(f: Int => Int): Array[Byte] 
  def mapDouble(f: Double => Double): Array[Byte]
  def mapWithIndex(f: (Int, Int) => Int): Array[Byte]
  def mapDoubleWithIndex(f: (Int, Double) => Double): Array[Byte]
}

// For Float32: implements part of the declared methods
abstract class Float32GeoTiffSegment(val bytes: Array[Byte]) extends GeoTiffSegment {
  protected val buffer = ByteBuffer.wrap(bytes).asFloatBuffer
  // a Float32 value takes 4 bytes
  val size: Int = bytes.size / 4

  // Read the stored value directly
  def get(i: Int): Float = buffer.get(i)

  def getInt(i: Int): Int
  def getDouble(i: Int): Double
  protected def intToFloatOut(v: Int): Float
  protected def doubleToFloatOut(v: Double): Float

  // Implementation of the map family of methods
  def map(f: Int => Int): Array[Byte] = {
    val arr = Array.ofDim[Float](size)
    // Read every value as an Int, apply f, and convert the result back to Float
    cfor(0)(_ < size, _ + 1) { i =>
      arr(i) = intToFloatOut(f(getInt(i)))
    }
    // Write the results back into a Byte array
    val result = new Array[Byte](size * FloatConstantNoDataCellType.bytes)
    val bytebuff = ByteBuffer.wrap(result)
    bytebuff.asFloatBuffer.put(arr)
    result
  }

  def mapWithIndex(f: (Int, Int) => Int): Array[Byte] = {
    val arr = Array.ofDim[Float](size)
    cfor(0)(_ < size, _ + 1) { i =>
      arr(i) = intToFloatOut(f(i, getInt(i)))
    }
    val result = new Array[Byte](size * FloatConstantNoDataCellType.bytes)
    val bytebuff = ByteBuffer.wrap(result)
    bytebuff.asFloatBuffer.put(arr)
    result
  }
  
  // ... Double-related methods omitted; they mirror the Int versions

}

// Mode without a NoData value
class Float32RawGeoTiffSegment(bytes: Array[Byte]) extends Float32GeoTiffSegment(bytes) {
  def getInt(i: Int): Int = get(i).toInt
  def getDouble(i: Int): Double = get(i).toDouble

  // Plain numeric conversion is enough
  protected def intToFloatOut(v: Int): Float = v.toFloat
  protected def doubleToFloatOut(v: Double): Float = v.toFloat
}

// Mode with the fixed (constant) NoData value
class Float32ConstantNoDataGeoTiffSegment(bytes: Array[Byte]) extends Float32GeoTiffSegment(bytes) {
  // Use the predefined conversion methods
  // These are macros and are covered later
  def getInt(i: Int): Int = f2i(get(i))
  def getDouble(i: Int): Double = f2d(get(i))

  protected def intToFloatOut(v: Int): Float = i2f(v)
  protected def doubleToFloatOut(v: Double): Float = d2f(v)
}

// Mode with a user-defined NoData value
class Float32UserDefinedNoDataGeoTiffSegment(bytes: Array[Byte], val userDefinedFloatNoDataValue: Float)
    extends Float32GeoTiffSegment(bytes)
       with UserDefinedFloatNoDataConversions {

  // Use the user-defined conversion methods
  def getInt(i: Int): Int = udf2i(get(i))
  def getDouble(i: Int): Double = udf2d(get(i))

  protected def intToFloatOut(v: Int): Float = i2udf(v)
  protected def doubleToFloatOut(v: Double): Float = d2udf(v)
}

We can see that:

Even for Float32 data, the get/map functions boil down to operations on Int/Double.
Because NoData conversion is involved, every conversion between stored values and Int/Double branches by CellType.
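The three branches can be compared side by side in a small sketch. The conversion rules below are assumptions made for illustration (NaN as the Float NoData value, Int.MinValue as the Int NoData value); the helper names are hypothetical and are not the f2i/udf2i macros themselves.

```scala
import java.nio.ByteBuffer

object NoDataSketch {
  // Assumption: Int.MinValue marks NoData in Int results.
  val IntNoData: Int = Int.MinValue

  // Raw mode: plain numeric conversion, NoData is not recognised.
  def rawToInt(v: Float): Int = v.toInt

  // Constant NoData mode: NaN maps to the Int NoData marker.
  def constantToInt(v: Float): Int = if (v.isNaN) IntNoData else v.toInt

  // User-defined NoData mode: a caller-supplied value marks NoData.
  def userDefinedToInt(v: Float, noData: Float): Int =
    if (v == noData) IntNoData else v.toInt

  // Decode big-endian Float32 values from a decompressed byte array (4 bytes each).
  def decodeFloats(bytes: Array[Byte]): Array[Float] = {
    val buffer = ByteBuffer.wrap(bytes).asFloatBuffer()
    Array.tabulate[Float](bytes.length / 4)(i => buffer.get(i))
  }
}
```

The same stored NaN becomes 0 in raw mode but the NoData marker in constant mode, which is why the conversion has to branch per CellType.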

This gives a rough picture of how GeoTiffSegmentCollection reads values of the actual type out of byte arrays.

For GeoTiffSegmentCollection, the unit of reading is the Segment. It is a logical structure with no physical meaning of its own. The segment index can be used to traverse all the data, but reading a specific region requires a way to convert between Segments and actual column/row numbers. That is the purpose of GeoTiffSegmentLayoutTransform.

5. Reading data from a specified position

5.1 The GeoTiffSegmentLayout class

Like the segmentBytes and decompressor objects, segmentLayout comes from the GeoTiffInfo object passed to the constructor:

// Describes the Segment layout as a grid of tiles
// layoutCols/Rows: number of Segment columns/rows in the image
// tileCols/Rows: width/height of one Segment in pixels
case class TileLayout(layoutCols: Int, layoutRows: Int, tileCols: Int, tileRows: Int)

// Factory method on the companion object
object GeoTiffSegmentLayout {
  def apply(
    totalCols: Int,
    totalRows: Int,
    storageMethod: StorageMethod,
    interleaveMethod: InterleaveMethod,
    bandType: BandType
  ): GeoTiffSegmentLayout = {
    
    val tileLayout =
      storageMethod match {
        // A tiled layout maps over almost unchanged
        case Tiled(blockCols, blockRows) =>
          // Number of tiles across and down, rounded up
          val layoutCols = math.ceil(totalCols.toDouble / blockCols).toInt
          val layoutRows = math.ceil(totalRows.toDouble / blockRows).toInt
          TileLayout(layoutCols, layoutRows, blockCols, blockRows)
        case s: Striped =>
          val rowsPerStrip = math.min(s.rowsPerStrip(totalRows, bandType), totalRows).toInt
          // Number of strips needed to cover all rows
          val layoutRows = math.ceil(totalRows.toDouble / rowsPerStrip).toInt
          // In a striped layout each layout row holds exactly one Segment
          // A strip spans the full row, so the Segment width is the image width
          TileLayout(1, layoutRows, totalCols, rowsPerStrip)
      }
    GeoTiffSegmentLayout(totalCols, totalRows, tileLayout, storageMethod, interleaveMethod)
  }
}

// Definition of GeoTiffSegmentLayout
case class GeoTiffSegmentLayout(
  totalCols: Int,
  totalRows: Int,
  tileLayout: TileLayout,
  storageMethod: StorageMethod,
  interleaveMethod: InterleaveMethod
) {
  def isTiled: Boolean =
    storageMethod match {
      case _: Tiled => true
      case _ => false
    }
  def isStriped: Boolean = !isTiled
  def hasPixelInterleave: Boolean = interleaveMethod == PixelInterleave

  // Compute the index of the Segment that contains the given column/row
  private [geotiff] def getSegmentIndex(col: Int, row: Int): Int = {
    // Column of the containing Segment within the layout grid
    val layoutCol = col / tileLayout.tileCols
    // Row of the containing Segment within the layout grid
    val layoutRow = row / tileLayout.tileRows
    // Segments are numbered row by row
    (layoutRow * tileLayout.layoutCols) + layoutCol
  }
  
  // ... other methods omitted
}

Here a Segment and a Tile are the same thing: the former name emphasizes its role in reading data, the latter its role in the layout. For ease of understanding, both are referred to as a Segment below.

GeoTiffSegmentLayout locates the index of the Segment from a column/row, which only narrows the position down to a range. Pinpointing the column/row inside that Segment is left to the SegmentTransform trait.

5.2 The SegmentTransform trait
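A worked example may help with the layout arithmetic. The sketch re-declares a local TileLayout case class and takes rowsPerStrip as a plain number, so it mirrors the calculation described above without depending on the library; SegmentIndexSketch and its method names are hypothetical.

```scala
object SegmentIndexSketch {
  final case class TileLayout(layoutCols: Int, layoutRows: Int, tileCols: Int, tileRows: Int)

  // Tiled storage: one Segment per tile, counts rounded up at the image edges.
  def tiled(totalCols: Int, totalRows: Int, blockCols: Int, blockRows: Int): TileLayout =
    TileLayout(
      math.ceil(totalCols.toDouble / blockCols).toInt,
      math.ceil(totalRows.toDouble / blockRows).toInt,
      blockCols,
      blockRows
    )

  // Striped storage: a single column of Segments, each as wide as the image.
  def striped(totalCols: Int, totalRows: Int, rowsPerStrip: Int): TileLayout = {
    val rows = math.min(rowsPerStrip, totalRows)
    TileLayout(1, math.ceil(totalRows.toDouble / rows).toInt, totalCols, rows)
  }

  // Segments are numbered row by row across the layout grid.
  def segmentIndex(layout: TileLayout, col: Int, row: Int): Int =
    (row / layout.tileRows) * layout.layoutCols + col / layout.tileCols
}
```

For a 1000 x 800 image with 256 x 256 tiles the layout is a 4 x 4 grid, and pixel (col 300, row 600) falls in Segment 2 * 4 + 1 = 9. With 16-row strips the same pixel falls in Segment 37.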

private [geotiff] trait SegmentTransform {
  // Each Segment has its own SegmentTransform
  def segmentIndex: Int
  def segmentLayoutTransform: GeoTiffSegmentLayoutTransform
  protected def segmentLayout = segmentLayoutTransform.segmentLayout

  protected def bandCount = segmentLayoutTransform.bandCount

  protected def layoutCols: Int = segmentLayout.tileLayout.layoutCols
  protected def layoutRows: Int = segmentLayout.tileLayout.layoutRows

  protected def tileCols: Int = segmentLayout.tileLayout.tileCols
  protected def tileRows: Int = segmentLayout.tileLayout.tileRows

  // Column/row of this Segment within the layout grid of the whole image
  protected def layoutCol: Int = segmentIndex % layoutCols
  protected def layoutRow: Int = segmentIndex / layoutCols
    
  // ... omitted

}

// Using the tiled layout as an example
private [geotiff] case class TiledSegmentTransform(segmentIndex: Int, segmentLayoutTransform: GeoTiffSegmentLayoutTransform) extends SegmentTransform {
  // Compute the index, within this Segment, of the given column/row
  def gridToIndex(col: Int, row: Int): Int = {
    val tileCol = col - (layoutCol * tileCols)
    val tileRow = row - (layoutRow * tileRows)
    tileRow * tileCols + tileCol
  }

}

5.3 GeoTiffSegmentLayoutTransform

The GeoTiffSegmentLayoutTransform trait combines the SegmentTransform trait and the GeoTiffSegmentLayout class:

trait GeoTiffSegmentLayoutTransform {
  private [geotrellis] def segmentLayout: GeoTiffSegmentLayout
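  // A lazy val combined with an extractor pattern: the fields are extracted from segmentLayout on first use
  // (the fourth field of the case class is storageMethod, bound here to the name isTiled)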
  private lazy val GeoTiffSegmentLayout(totalCols, totalRows, tileLayout, isTiled, interleaveMethod) =
    segmentLayout
    
  // Get the index of the Segment
  private [geotiff] def getSegmentIndex(col: Int, row: Int): Int =
    segmentLayout.getSegmentIndex(col, row)

  // Get the transform for the Segment with the given index
  private [geotiff] def getSegmentTransform(segmentIndex: Int): SegmentTransform = {
    val id = segmentIndex % bandSegmentCount
    if (segmentLayout.isStriped)
      StripedSegmentTransform(id, GeoTiffSegmentLayoutTransform(segmentLayout, bandCount))
    else
      TiledSegmentTransform(id, GeoTiffSegmentLayoutTransform(segmentLayout, bandCount))
  }

  // ... omitted; bandCount and bandSegmentCount are not shown in this excerpt
}

object GeoTiffSegmentLayoutTransform {
  def apply(_segmentLayout: GeoTiffSegmentLayout, _bandCount: Int): GeoTiffSegmentLayoutTransform =
    new GeoTiffSegmentLayoutTransform {
      val segmentLayout = _segmentLayout
      val bandCount = _bandCount
    }
}

As the class diagram shows, GeoTiffTile implements the GeoTiffSegmentLayoutTransform trait, which gives it the ability to read a value of the concrete type at a given column/row:

def get(col: Int, row: Int): Int = {
    // Index of the Segment containing the position (method from GeoTiffSegmentLayout)
    val segmentIndex = getSegmentIndex(col, row)
    // Index of the position within that Segment (methods from GeoTiffSegmentLayoutTransform and SegmentTransform)
    val i = getSegmentTransform(segmentIndex).gridToIndex(col, row)
    // Read the value at that index (methods from SegmentBytes and GeoTiffSegment)
    getSegment(segmentIndex).getInt(i)
}
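The whole lookup can be traced on a toy image. The sketch below hard-codes a 4 x 4 image stored as four 2 x 2 tiles, with each pixel value equal to row * 4 + col. It follows the same arithmetic as the tiled case above, but PixelLookupSketch and its in-memory segments are hypothetical stand-ins for the real classes.

```scala
object PixelLookupSketch {
  val tileCols = 2
  val tileRows = 2
  val layoutCols = 2

  // Decoded segments in row-major segment order.
  val segments: Vector[Vector[Int]] = Vector(
    Vector(0, 1, 4, 5),    // segment 0: top-left tile
    Vector(2, 3, 6, 7),    // segment 1: top-right tile
    Vector(8, 9, 12, 13),  // segment 2: bottom-left tile
    Vector(10, 11, 14, 15) // segment 3: bottom-right tile
  )

  // Which segment contains the pixel?
  def getSegmentIndex(col: Int, row: Int): Int =
    (row / tileRows) * layoutCols + col / tileCols

  // Where inside that segment is the pixel?
  def gridToIndex(segmentIndex: Int, col: Int, row: Int): Int = {
    val layoutCol = segmentIndex % layoutCols
    val layoutRow = segmentIndex / layoutCols
    val tileCol = col - layoutCol * tileCols
    val tileRow = row - layoutRow * tileRows
    tileRow * tileCols + tileCol
  }

  def get(col: Int, row: Int): Int = {
    val segmentIndex = getSegmentIndex(col, row)
    segments(segmentIndex)(gridToIndex(segmentIndex, col, row))
  }
}
```

For pixel (col 3, row 2) the segment index is 3, the index inside that segment is 1, and the value read is 11, which matches 2 * 4 + 3.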

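6. Summary

By analyzing the class diagram, we divided the traits into two groups:

Segment-related
Macro-related

We focused on the Segment-related traits and introduced the Segment model. The Segment model provides two main capabilities:

Locating the exact position of data
Reading raw Byte data and converting it to the actual data type

The core concept is the Segment. What is a Segment? It is a logical concept:

In a tiled layout, one tile is one Segment
In a striped layout, one strip is one Segment
When computing a position from a column/row, the Tile being operated on is also a Segment

The Segment model covers the whole path from reading to access.

Macros also play a major role in reading and converting data. In the next part we analyze the role of the macro model in GeoTrellis.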

How it works (18): How GeoTrellis reads GeoTiff (C), the Segment model
