[MySQL] Converting a delimiter-separated string into multiple rows

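Background

I have recently been working on a user-tagging project. Tags are currently stored as (user id, tag ids), with the tag ids kept in a single comma-separated string. Tracking how tags change between two consecutive days with this layout means using the find_in_set function, which has two drawbacks: queries are slow because find_in_set cannot use an index, and there are already around 300 tags, with more to come, so writing one condition per tag makes the SQL extremely long.
The current strategy is to split the user tag table into (user id, tag id) rows. Joining the tables for two consecutive days then shows which tags were dropped and which were added between yesterday and today.
That strategy requires a way to convert the (user id, tag ids) form into the (user id, tag id) form.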
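The day-over-day comparison described above can be sketched as follows. This is only an illustration under assumptions: the split table `user_tag`, its columns, and the day values 20 and 21 are hypothetical and do not come from the original project.

```sql
-- Hypothetical split table: one row per (user, tag, day).
CREATE TABLE IF NOT EXISTS user_tag (
  userid INT NOT NULL,
  tagid  INT NOT NULL,
  `day`  INT NOT NULL,
  PRIMARY KEY (userid, tagid, `day`)
);

-- Tags a user had yesterday (day = 20) but no longer has today (day = 21).
SELECT y.userid, y.tagid
FROM user_tag AS y
LEFT JOIN user_tag AS t
       ON t.userid = y.userid
      AND t.tagid  = y.tagid
      AND t.`day`  = 21
WHERE y.`day` = 20
  AND t.userid IS NULL;

-- Tags gained today: the same anti-join with the two days swapped.
SELECT t.userid, t.tagid
FROM user_tag AS t
LEFT JOIN user_tag AS y
       ON y.userid = t.userid
      AND y.tagid  = t.tagid
      AND y.`day`  = 20
WHERE t.`day` = 21
  AND y.userid IS NULL;
```

Because each tag is its own row, both queries can use the primary key instead of scanning a comma-separated string, and neither needs to mention individual tag ids.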

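Exploration

Take a string such as 1,2,3,4,5,212. To get at 1, 2, 3, 4, 5 and 212 individually, the usual programming approach is to split the string on "," into an array and then iterate over the array. MySQL has no array type, but we can compute the length of the string and, from that, the number of values it holds once split on ",". We can also use MySQL's substring functions to extract the value at a given position. Let's see how.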

Implementation

  • Table structures
CREATE TABLE `tagids_label` (
  `userid` int(11) NOT NULL COMMENT 'User id',
  `label` int(11) NOT NULL COMMENT 'Marker; three days of data are kept for now, day % 3',
  `day` int(11) NOT NULL COMMENT 'Day of the statistics date',
  `tagids` text NOT NULL COMMENT 'Tag ids, separated by ASCII commas',
  `createTime` datetime NOT NULL COMMENT 'Creation time',
  `updateTime` datetime NOT NULL COMMENT 'Update time',
  PRIMARY KEY (`userid`,`label`),
  KEY `index_day` (`day`),
  KEY `index_label` (`label`),
  KEY `index_label_userid` (`userid`,`label`),
  KEY `index_createTime_userid` (`userid`,`createTime`),
  KEY `index_userid` (`userid`),
  KEY `index_createtime` (`createTime`) USING BTREE,
  FULLTEXT KEY `index_tagids` (`tagids`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='User tag result table';
CREATE TABLE `sequence` (
  `seq` int(3) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

P.S. The sequence table holds consecutive integers from 1 up to the maximum number of tags.
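The article does not show how the sequence table is populated. One possible way, sketched here under assumptions, is a recursive common table expression. It requires MySQL 8.0 or later, and the upper bound of 500 is an illustrative guess rather than a value from the original.

```sql
-- Fill sequence with the integers 1..500.
-- 500 is an assumed upper bound on the number of tags per user.
-- WITH RECURSIVE requires MySQL 8.0+.
INSERT INTO sequence (seq)
WITH RECURSIVE n (i) AS (
  SELECT 1
  UNION ALL
  SELECT i + 1 FROM n WHERE i < 500
)
SELECT i FROM n;
```

The bound only has to be at least as large as the longest tag list. On older MySQL versions, a plain multi-row INSERT with the numbers written out serves the same purpose.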

  • Getting the number of elements after splitting on the delimiter

    • Original data
    SELECT * FROM `tagids_label` WHERE `userid` = 2;
    
    171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227
    
    • Number of elements after splitting on the delimiter
    SELECT length(`tagids`), length(REPLACE(`tagids`, ',', '')), length(`tagids`) - length(REPLACE(`tagids`, ',', '')) + 1 FROM `tagids_label` WHERE `userid` = 2;
    
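    • Note: length(tagids) returns the length of the string in bytes. Each digit or ASCII punctuation mark takes one byte; each Chinese character or full-width punctuation mark takes 3 bytes in utf8. Because tagids holds only digits and commas, length(tagids) is the total number of digits and commas.
      replace(tagids,',','') removes every "," from tagids, so length(replace(tagids,',','')) is the number of digits in tagids.
      length(tagids)-length(replace(tagids,',','')) is therefore the number of commas in tagids, and adding 1 gives the number of elements obtained by splitting tagids on ",", that is, the number of tags.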
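One edge case the formula above does not cover is an empty tagids string: it contains no commas, so the formula still returns 1 even though there are no tags. A minimal sketch of a guarded version follows; the alias `tag_count` is made up for illustration.

```sql
-- Count the tags in tagids, treating an empty string as zero tags.
-- CHAR_LENGTH counts characters rather than bytes; for an ASCII comma
-- the difference is the same as with LENGTH.
SELECT
  userid,
  CASE
    WHEN tagids = '' THEN 0
    ELSE CHAR_LENGTH(tagids) - CHAR_LENGTH(REPLACE(tagids, ',', '')) + 1
  END AS tag_count
FROM tagids_label
WHERE userid = 2;
```

For the sample row shown earlier, which has 60 commas, this should return 61.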

  • Extracting substrings with substring_index
    • Cutting the string at the delimiter ","
    SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',1)
    UNION ALL 
    SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',2)
    UNION ALL 
    SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',3)
    UNION ALL 
    SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',4)
    UNION ALL 
    ......
    UNION ALL
    SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227',',',61)
    
    Result:
    171
    171,172
    171,172,173
    171,172,173,174
    ......
    171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227
    
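    • Note: substring_index(str, delim, count) takes the string to cut, the delimiter, and the number of delimiter occurrences to count. A positive count returns everything to the left of the count-th delimiter; count = -1 returns the last delimited element. Applying substring_index once more, with count = -1, to each row of the result above therefore yields the individual elements.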
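The nested call can be tried on a short literal first. The sketch below uses a three-element string chosen for illustration and also shows what happens when the requested position is larger than the number of elements.

```sql
SET @s = '171,172,173';

-- 2nd element: the prefix up to the 2nd comma is '171,172',
-- and its last comma-separated piece is '172'.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(@s, ',', 2), ',', -1) AS second_element;

-- Pitfall: there is no 5th element. When count exceeds the number of
-- delimiters, the inner call returns the whole string, so the outer call
-- returns the last element, '173', again.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(@s, ',', 5), ',', -1) AS fifth_element;
```

This is why a query that iterates over positions has to cap the position at the element count. Otherwise the last tag is repeated for every extra position.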

  • Final implementation
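SELECT
    b.userid,
    -- Keep the first seq elements, then take the last one: the seq-th tag id.
    SUBSTRING_INDEX(
        SUBSTRING_INDEX(b.tagids, ',', s.seq),
        ',', -1
    ) AS sub_id,
    s.seq
FROM sequence AS s
JOIN (SELECT * FROM `tagids_label` WHERE userid = 2) AS b
    -- Use only as many sequence numbers as there are tags in the row.
    ON s.seq BETWEEN 1
        AND 1 + LENGTH(b.tagids) - LENGTH(REPLACE(b.tagids, ',', ''))
ORDER BY
    b.userid,
    s.seq;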

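  • Result: one row per tag, with columns userid, sub_id (the tag id) and seq (the tag's position in the list). For the sample tagids string shown earlier, that is 61 rows, from sub_id 171 at seq 1 to sub_id 227 at seq 61.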
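To reach the (user id, tag id) layout from the background section, the same join can be run over every row and written into a table. The following is a sketch under assumptions, not the author's actual job: the target table `user_tag` and its columns are hypothetical.

```sql
-- Hypothetical target table: one row per (user, tag, day).
CREATE TABLE IF NOT EXISTS user_tag (
  userid INT NOT NULL,
  tagid  INT NOT NULL,
  `day`  INT NOT NULL,
  PRIMARY KEY (userid, tagid, `day`)
);

-- Split every tagids string into one row per tag.
-- INSERT IGNORE skips duplicates, e.g. a tag id listed twice in one string.
INSERT IGNORE INTO user_tag (userid, tagid, `day`)
SELECT
  t.userid,
  CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(t.tagids, ',', s.seq), ',', -1) AS UNSIGNED) AS tagid,
  t.`day`
FROM tagids_label AS t
JOIN sequence AS s
  ON s.seq BETWEEN 1
     AND LENGTH(t.tagids) - LENGTH(REPLACE(t.tagids, ',', '')) + 1
WHERE t.tagids <> '';
```

The WHERE clause skips empty strings, for which the element-count formula would otherwise report one element. The sequence table must contain at least as many numbers as the longest tag list, or the tags beyond that position are silently dropped.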