Preparation

Libraries used:

superagent (requires the type definitions from @types/superagent)
npm install superagent @types/superagent -D
cheerio (parses HTML with a jQuery-like API)
npm install cheerio @types/cheerio -D

Getting started

Fetching the HTML with superagent

async getRawHtml(){
    const result = await superagent.get(this.url)
    console.log('result test',result.text)
  }

Parsing the HTML content with cheerio

cheerio follows jQuery's syntax.

eq(n): selects the n-th matched element (zero-based)

getCourseInfo(html:string){
    const courseInfos:Course[] = []
    const $ = cheerio.load(html)
    $('.content').find('.course-item').map((index,element) =>{
      const desc = $(element).find('.course-desc')
      const name = desc.eq(0).text()
      const count = parseInt(desc.eq(1).text().split('：')[1])
      courseInfos.push({
        name,
        count
      })
    })
    console.log('result',courseInfos)
  }
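
The split('：')[1] expression above assumes that the second .course-desc element always contains a full-width colon followed by a number. A minimal sketch of a more defensive parse is shown below. The helper name parseCount and the sample strings are hypothetical and are not part of the original code.

```typescript
// Hypothetical helper: extract the number that follows a full-width colon,
// e.g. "Students：52" -> 52. Returns 0 when no number can be parsed.
function parseCount(text: string): number {
  const parts = text.split('：')
  // parts[1] is undefined when the separator is missing
  const raw = (parts[1] ?? '').trim()
  const count = parseInt(raw, 10)
  return Number.isNaN(count) ? 0 : count
}

console.log(parseCount('Students：52'))  // 52
console.log(parseCount('no separator')) // 0
```

Falling back to 0 is only one possible choice; skipping the item or throwing may suit other cases better.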

Writing the scraped content to a JSON file

Use fs to read and write the file.
Define an interface for the key and value types of the object.

interface Content{
  [propName:number]:Course[]
}

generateJsonFile(courseResult:courseResult){
    // write to json file
    let fileContent:Content = {}
    const filePath = path.resolve(__dirname,'../data/course.json')
    if(fs.existsSync(filePath)){
      fileContent = JSON.parse(fs.readFileSync(filePath,'utf-8'))
    }
    fileContent[courseResult.time] = courseResult.data
    fs.writeFileSync(filePath,JSON.stringify(fileContent,null,2),'utf-8')
  }
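
The merge step in generateJsonFile (read the previous content if there is any, add the new entry under its timestamp, serialize again) can be sketched without touching the disk. The function name mergeIntoJson and the sample data are assumptions made for illustration; the interfaces mirror the ones used in this post.

```typescript
interface Course {
  name: string
  count: number
}

interface CourseResult {
  time: number
  data: Course[]
}

// Each scrape is stored under its timestamp
interface Content {
  [time: number]: Course[]
}

// Hypothetical pure helper: merge one scrape into the previous file content.
// existingJson is the text read from course.json, or null if the file
// does not exist yet.
function mergeIntoJson(existingJson: string | null, result: CourseResult): string {
  const content: Content = existingJson ? JSON.parse(existingJson) : {}
  content[result.time] = result.data
  return JSON.stringify(content, null, 2)
}

const first = mergeIntoJson(null, { time: 1000, data: [{ name: 'TypeScript', count: 10 }] })
const second = mergeIntoJson(first, { time: 2000, data: [{ name: 'TypeScript', count: 12 }] })
console.log(Object.keys(JSON.parse(second))) // [ '1000', '2000' ]
```

Keeping the merge free of fs calls makes it easy to check on its own; the caller stays responsible for reading and writing the file.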

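Finally, tidy up the code and separate the file-reading behavior from the file-writing behavior.

Optimizing the code with the composition pattern

Extracting the site-specific logic

Move the code that analyzes the HTML and generates the file content into a separate class, and call it from the crawler.

crowller is only responsible for fetching and writing
analyzer is only responsible for analysis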

// class crowller
constructor(private analyzer:any){
    this.initSpiderProcess()
  }

//index
 const analyzer = new Analyzer()
const crowller = new Crowller(analyzer)

Put all of the analyzer-specific logic inside the analyze method, and define an interface for the analyzer class and its single analyze method.

// crowller.ts
async initSpiderProcess(){
    const html = await this.getRawHtml()
    const fileContent = this.analyzer.analyze(html,this.filePath)
    this.writeFile(fileContent)
  }
  constructor(private url:string,private analyzer:Analyzer){
    this.initSpiderProcess()
  }
//analyzer.ts

public analyze(html:string,filePath:string){
    const courseInfo = this.getCourseInfo(html)
    const fileContent = this.generateJsonFile(courseInfo,filePath)
    return JSON.stringify(fileContent,null,2) 
  }
// Each analyzer only needs to implement this one method
export default class Web1Analyzer implements Analyzer{
  public analyze(html:string,filePath:string){
    const courseInfo = html;
    return courseInfo
  }
}
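
The split described above can be sketched in a network-free form: the crawler depends only on an interface, and each analyzer supplies its own analyze method. The classes TitleAnalyzer, LengthAnalyzer and DemoCrawler are hypothetical stand-ins, and analyze takes only the HTML here (the version in this post also takes a filePath).

```typescript
// The crawler depends only on this interface, not on a concrete analyzer
interface Analyzer {
  analyze(html: string): string
}

// Two hypothetical analyzers with different site-specific logic
class TitleAnalyzer implements Analyzer {
  analyze(html: string): string {
    const match = html.match(/<title>(.*?)<\/title>/)
    return match?.[1] ?? ''
  }
}

class LengthAnalyzer implements Analyzer {
  analyze(html: string): string {
    return String(html.length)
  }
}

// Stand-in for the Crowller class: fetching is replaced by a fixed string
// so the sketch runs without network access
class DemoCrawler {
  private html: string
  private analyzer: Analyzer

  constructor(html: string, analyzer: Analyzer) {
    this.html = html
    this.analyzer = analyzer
  }

  run(): string {
    return this.analyzer.analyze(this.html)
  }
}

const page = '<html><title>demo</title></html>'
console.log(new DemoCrawler(page, new TitleAnalyzer()).run())  // demo
console.log(new DemoCrawler(page, new LengthAnalyzer()).run()) // 32
```

Swapping the analyzer changes what is extracted without touching the crawler class.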

Finally, here is all of the relevant code:

// analyzer.ts
import cheerio from 'cheerio'
import fs from 'fs'
import path from 'path'

import {Analyzer} from './crowller'

interface Course{
  name:string,
  count:number
}

interface courseResult{
  time:number,
  data:Course[]
}

interface Content{
  [propName:number]:Course[]
}

export default class Web1Analyzer implements Analyzer{
  private getCourseInfo(html:string){
    const courseInfos:Course[] = []
    const $ = cheerio.load(html)
    $('.content').find('.course-item').map((index,element) =>{
      const desc = $(element).find('.course-desc')
      const name = desc.eq(0).text()
      const count = parseInt(desc.eq(1).text().split('：')[1])
      courseInfos.push({
        name,
        count
      })
    })
    const result = {
      time:new Date().getTime(),
      data:courseInfos
    }
    return result
  }

  private generateJsonFile(courseResult:courseResult,filePath:string){
    // write to json file
    let fileContent:Content = {}
    if(fs.existsSync(filePath)){
      fileContent = JSON.parse(fs.readFileSync(filePath,'utf-8'))
    }
    fileContent[courseResult.time] = courseResult.data
    return fileContent
  }

  public analyze(html:string,filePath:string){
    const courseInfo = this.getCourseInfo(html)
    const fileContent = this.generateJsonFile(courseInfo,filePath)
    return JSON.stringify(fileContent,null,2) 
  }
}

// crowller
import superagent from 'superagent'
import cheerio from 'cheerio'
import fs from 'fs'
import path from 'path'

import Web1Analyzer from './analyzer'
import analyzerB from './analyzerB'

interface Course{
  name:string,
  count:number
}

interface courseResult{
  time:number,
  data:Course[]
}

interface Content{
  [propName:number]:Course[]
}

export interface Analyzer{
  analyze:(html:string,filePath:string) => string;
}

class Crowller{
  private filePath = path.resolve(__dirname,'../data/course.json')
  async getRawHtml(){
    const result = await superagent.get(this.url)
    return result.text
  }
  
  writeFile(fileContent:string){
    fs.writeFileSync(this.filePath,fileContent,'utf-8')
  }
  async initSpiderProcess(){
    const html = await this.getRawHtml()
    const fileContent = this.analyzer.analyze(html,this.filePath)
    this.writeFile(fileContent)
  }
  constructor(private url:string,private analyzer:Analyzer){
    this.initSpiderProcess()
  }
}

const secret = 'x3b174jsx'
const url = `http://www.dell-lee.com/typescript/demo.html?secret=${secret}`
const analyzer = new analyzerB()
new Crowller(url,analyzer)
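// superagent is written in JS; TS cannot consume a JS library directly (ts --> js)

// so TS goes through a .d.ts declaration ("translation") file from @types: ts -> .d.ts -> js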

TypeScript Study Notes (10) ----- A TypeScript Crawler in Practice
