SQL Server 大數(shù)據(jù)管理——數(shù)據(jù)歸檔
背景:
????????數(shù)據(jù)庫幾百GB,甚至TB級別數(shù)據(jù)庫,歷史數(shù)據(jù)偶爾會用到,不能直接刪除,就需要定期歸檔歷史數(shù)據(jù)。以往在歸檔歷史數(shù)據(jù)方案:
方案1:做全備保存歸檔數(shù)據(jù),刪除線上庫歸檔數(shù)據(jù)
方案2:創(chuàng)建一個新庫,將歸檔數(shù)據(jù)導(dǎo)入到新庫,備份保留新庫,刪除線上庫歸檔數(shù)據(jù)
????????對于方案1,這樣會有較多的不需要或者不能歸檔的數(shù)據(jù)重復(fù)備份,占用大量空間,并且一旦歸檔數(shù)據(jù)需要使用,拷貝、還原需要時間較長,效率低;同時對還原需要空間也較難以匹配;方案2解決了數(shù)據(jù)重復(fù)備份占用空間的問題,但數(shù)據(jù)轉(zhuǎn)移需要時間較長
因此,采用本文敘述的方案,表分區(qū)+數(shù)據(jù)庫主文件備份的備份策略,可以同時很好的解決方案1、2的問題。
一. ???主文件數(shù)據(jù)移動到輔文件
因為在進行文件/組還原時,還需要用到主文件,所以在每次數(shù)據(jù)歸檔時,同時要對主文件備份歸檔,所以主文件要盡可能的小。需要盡可能的將數(shù)據(jù)移動到輔助文件上
1.1??新表處理:
修改數(shù)據(jù)庫默認文件組,或者創(chuàng)建表時指定文件組,使得新創(chuàng)建的表均在輔助文件組上
--創(chuàng)建測試數(shù)據(jù)庫
CREATE DATABASE [test]
CONTAINMENT = NONE
ON? PRIMARY
( NAME = N'test', FILENAME = N'D:\DB\test.mdf' , SIZE = 8192KB , FILEGROWTH = 65536KB )
LOG ON
( NAME = N'test_log', FILENAME = N'D:\DB\test_log.ldf' , SIZE = 8192KB , FILEGROWTH = 65536KB )
--增加文件組和輔助文件
alter database test add filegroup data
alter database test add file
(name=data,filename='D:\DB\data.ndf', SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP [data]
alter database test add file
(name=data1,filename='D:\DB\data1.ndf',size=8192KB,filegrowth=65536KB) to filegroup data
修改默認文件組腳本
USE [test]
GO
IF NOT EXISTS (SELECT name FROM sys.filegroups WHERE is_default=1 AND name = N'data')
ALTER DATABASE [test] MODIFY FILEGROUP [data] DEFAULT
GO
create table testdd(id int)
select OBJECT_NAME(object_id) tableName,i.name indexName,s.name spaceName
from sys.indexes i
left join sys.data_spaces s on i.data_space_id=s.data_space_id
where object_id =OBJECT_ID('dbo.testdd','U')
通過系統(tǒng)視圖發(fā)現(xiàn),創(chuàng)建的testdd表是在默認的data文件組下:
1.2??已經(jīng)存在的表
1.2.1 不需要歸檔的表
不需要歸檔的表,可以通過創(chuàng)建、或刪除聚集索引將數(shù)據(jù)移動到輔助文件上
alter database test add filegroup tableMetaData
alter database test add file
(name=tableMetaData,filename='D:\DB\tableMetaData.ndf', SIZE = 8192KB , FILEGROWTH = 65536KB ) TO FILEGROUP tableMetaData
alter database test add filegroup dataMD
alter database test add file(
? ? ? name= dataMD,filename='D:\DB\dataMD.ndf',size=8192KB,filegrowth=65536KB
) to filegroup dataMD
create table testp(id int) on [primary]
create table testd(id int) on data
create table testMD(id int) on dataMD
create table testm(id int) on tableMetaData
--testp中插入100萬行數(shù)據(jù)
DECLARE @max AS INT, @rc AS INT;
SET @max = 1000000;
SET @rc = 1;
INSERT INTO testp VALUES(1);
WHILE @rc * 2 <= @max
BEGIN
????INSERT INTO dbo.testp SELECT id + @rc FROM dbo.testp;
????SET @rc = @rc * 2;
END
INSERT INTO dbo.testp
SELECT id + @rc FROM dbo.testp WHERE id + @rc <= @max;
go
--testm中插入100萬行數(shù)據(jù)
DECLARE @max AS INT, @rc AS INT;
SET @max = 100000;
SET @rc = 1;
INSERT INTO testm VALUES(1);
WHILE @rc * 2 <= @max
BEGIN
????INSERT INTO dbo.testm SELECT id + @rc FROM dbo.testm;
????SET @rc = @rc * 2;
END
INSERT INTO dbo.testm
SELECT id + @rc FROM dbo.testm WHERE id + @rc <= @max;
go
為testp表在文件組data上創(chuàng)建聚集索引
創(chuàng)建索引之前文件數(shù)據(jù)量
主文件和輔文件tableMetaData已經(jīng)增長至9216,其他文件任然初始大小
1.2.1.1????????創(chuàng)建聚集索引移動數(shù)據(jù)
CREATE CLUSTERED INDEX [CIx_testp] ON [dbo].[testp]
(
? ? ? [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = ON, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [data]
GO
從下圖可以看到此時testp的數(shù)據(jù)已經(jīng)移到data文件組下
通過下圖可以看到,data文件大小增加一次,但主文件大小并沒有減少
1.2.1.2????????刪除聚集索引移動數(shù)據(jù)
CREATE CLUSTERED INDEX [CIx_testm] ON [dbo].[testm]
(
? ? ? [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = ON', ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [data]
GO
--? 刪除聚集索引,并指數(shù)據(jù)定存儲的文件組
DROP INDEX [CIx_testm] ON dbo.testm WITH (MOVE TO [dataMD], ONLINE=ON)
GO
從下圖可以看到,testm表數(shù)據(jù)已經(jīng)從tableMetaData文件組移動到dataMD文件組
同時從下圖可以看到,通過刪除聚集索引轉(zhuǎn)移數(shù)據(jù),原文件大小仍然沒有改變(tableMetaData仍然是9216KB)
1.3??需要定期歸檔表數(shù)據(jù)的轉(zhuǎn)移
根據(jù)歸檔規(guī)則,一般按數(shù)據(jù)產(chǎn)生的時間進行分區(qū),創(chuàng)建表分區(qū),一個分區(qū)對應(yīng)一個文件組的一個文件
create table testpp(id int,date datetime) on [primary]
DECLARE @max AS INT, @rc AS INT;
SET @max = 1000000;
SET @rc = 1;
INSERT INTO testpp VALUES(1,'2018-01-01');
WHILE @rc * 2 <= @max
BEGIN
INSERT INTO dbo.testpp SELECT id + @rc,DATEADD(MI,@rc,[date]) FROM dbo.testpp;
SET @rc = @rc * 2;
END
INSERT INTO dbo.testpp
SELECT id + @rc,DATEADD(MI,@rc,[date]) FROM dbo.testpp WHERE id + @rc <= @max;
go
此時主文件大小仍然為9216KB沒有增長
系統(tǒng)視圖更新遲緩,導(dǎo)致文件大小和真實數(shù)據(jù)大小不一致
下面對testpp表進行分區(qū)
--添加文件組、文件
use test
alter database test
add filegroup Before2019
alter database test
add filegroup After2019
alter database test
add file(name=N'Before2019',filename='D:\DB\testPartion\Before2019.ndf'
,size=1mb, filegrowth=1mb)
to filegroup Before2019
alter database test
add file(name=N'After2019',filename='D:\DB\testPartion\After2019.ndf'
,size=1mb,? filegrowth=1mb)
to filegroup After2019
--創(chuàng)建分區(qū)函數(shù)
create partition function RangeTime(datetime)
as range left for values('2019-01-01')
--創(chuàng)建分區(qū)方案
create partition scheme RangeSchema_CreateTime
as partition [RangeTime]
to(Before2019,After2019)
GO
--為testpp表添加分區(qū)
USE [test]
GO
BEGIN TRANSACTION
CREATE CLUSTERED INDEX [ClusteredIndex_on_RangeSchema_CreateTime_636570819394124711] ON [dbo].[testpp]
(
[date]
)WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF) ON [RangeSchema_CreateTime]([date])
DROP INDEX [ClusteredIndex_on_RangeSchema_CreateTime_636570819394124711] ON [dbo].[testpp]
COMMIT TRANSACTION
1.4??收縮主文件
不管是創(chuàng)建聚集索引轉(zhuǎn)移數(shù)據(jù),刪除聚集索引轉(zhuǎn)移數(shù)據(jù),或者創(chuàng)建表分區(qū)轉(zhuǎn)移數(shù)據(jù),源文件的空間都不會自動釋放,需要對源文件進行收縮
USE [test]
GO
DBCC SHRINKFILE(N'test', 0) withno_infomsgs
--參數(shù)說明
--emptyfile 轉(zhuǎn)移數(shù)據(jù)到同文件組的其他文件
--notruncate 移動數(shù)據(jù)頁到文件前段,但不釋放空間
--truncateonly 不移動數(shù)據(jù)頁,釋放尾部空間
--with no_infomsgs 取消顯示所有信息消息
GO
二. ???歸檔歷史數(shù)據(jù)
2.1??將歸檔分區(qū)數(shù)據(jù)轉(zhuǎn)為普通表數(shù)據(jù)
--在歸檔數(shù)據(jù)所在的分區(qū)文件組上創(chuàng)建和歸檔表結(jié)構(gòu)相同的表
create table testpp2018(id int,date datetime) on Before2019
--查詢partition_number 及其對應(yīng)數(shù)據(jù)所在的文件組
;with cte as
(select
object_id
,OBJECT_NAME(i.object_id) tableName
,i.index_id
,dds.partition_scheme_id
,dds.destination_id as partition_number
,fg.groupid
,fg.groupname
,f.fileid
,f.name
,f.filename
--,p.partition_id
--,p.rows
from sys.destination_data_spaces dds,sys.indexes i,sys.sysfilegroups fg,sys.sysfiles f
where dds.partition_scheme_id=i.data_space_id
and dds.data_space_id=fg.groupid
and fg.groupid=f.groupid
)
,cte1 as(
select
ps.data_space_id as partition_scheme_id
,ps.name partiton_schemes_name
,pf.name partition_function_name
,pf.function_id
--,prv.value AS BoundaryValue
from sys.partition_schemes ps ,sys.partition_functions pf--,sys.partition_range_values prv
where ps.function_id=pf.function_id
--and pf.function_id=prv.function_id
)
select cte.tableName,cte.groupname,cte.name,cte.filename
,cte.partition_number,cte1.partiton_schemes_name,cte1.partition_function_name,p.rows
,prv.boundary_id,prv.value BoundaryValue
from cte
inner join cte1 on cte.partition_scheme_id=cte1 .partition_scheme_id
left join sys.partition_range_values prv on cte1.function_id=prv.function_id and cte.partition_number=prv.boundary_id
left join sys.partitions? p on cte.object_id=p.object_id and cte.index_id=p.index_id and cte.partition_number=p.partition_number
where? cte.object_id=OBJECT_ID('dbo.testpp','U')
--要移出2019年以前的數(shù)據(jù),即partition_number=1 的分區(qū),從分區(qū)表中移除歸檔數(shù)據(jù)到普通表
alter table testpp switch partition 1 to testpp2018
從上圖可以看到分區(qū)1的記錄數(shù)已經(jīng)為0,下面將把分區(qū)1合并掉,并移除對應(yīng)的文件和文件組
use test
go
--合并分區(qū)1
alter partition function RangeTime()
merge range('2019-01-01 0:00:00')
--移除歸檔分區(qū)的文件和文件組
alter database test remove file [Before2019]
alter database test remove filegroup [Before2019]
通過創(chuàng)建聚集索引的方式將表數(shù)據(jù)從分區(qū)文件移動到主文件
CREATE CLUSTERED INDEX [ClusteredIndex-20180321-145814] ON [dbo].[testpp2018]
(
[id] ASC
) ON [PRIMARY]
GO
DROP INDEX [ClusteredIndex-20180321-145814] ON [dbo].[testpp2018] WITH ( ONLINE = OFF )
GO
2.2??備份主文件
BACKUPDATABASE [test] FILEGROUP=N'PRIMARY'
TO? DISK=N'E:\backup\test_PRIMARY_2018.bak'
WITHNOFORMAT,INIT,? NAME =N'test-完整 數(shù)據(jù)庫 備份',SKIP,NOREWIND,NOUNLOAD,? STATS= 10
GO
2.3??數(shù)據(jù)恢復(fù)測試
RESTOREDATABASE [test1] FILE=N'test'
FROM? DISK=N'E:\backup\test_PRIMARY.bak'
WITH? FILE= 1,
MOVEN'test'TON'D:\DB\test1\test1.mdf',?
MOVEN'test_log'TON'D:\DB\test1\test1_1.ldf',?
RECOVERY,? replace,? STATS= 10
GO
2.4??歸檔說明
備份的表格、時間區(qū)間列表
備份文件路徑名稱
數(shù)據(jù)恢復(fù)腳本,見2.3
2.5??刪除歸檔數(shù)據(jù)
truncatetable [dbo].[testpp2018]
————————————————
版權(quán)聲明:本文為CSDN博主「三空道人」的原創(chuàng)文章,遵循CC 4.0 BY-SA版權(quán)協(xié)議,轉(zhuǎn)載請附上原文出處鏈接及本聲明。
原文鏈接:https://blog.csdn.net/zhoujunah/article/details/79665279