TRUING DAI HOC SlT PHAM TP.CM
KHOA TOAN-TIN HOC
B0 MON TIN HOC
Tran VTnh Tien Dire-Nguyen Phu Thinh
NGHIEN CLTU C6NG NGHE OLAP
VA I NC DVNG
LUAN VAN CU' NHAN CONG NGHE THONG TIN
TP.HCM, Thang 4-2010
TRUING DAI HOC SlT PHAM TP.CM
KHOA TOAN-TIN HOC
BO MON TIN HOC
Tran Vinh Ti6n Due-Nguyln Phu Thinh
NGHIEN CtTU C6NG NGHE OLAP
VA U*NG DUNG
GIAO VIEN HUONG DAN: NGUYEN THI TRA LINH
TP.HCM, Thang 4-2010
1
Leri cam on.
Dau tien, chung em xin gui ldi cam cm sau sac den Co Nguyen Thi Tra
94 trang |
Chia sẻ: huong20 | Ngày: 07/01/2022 | Lượt xem: 300 | Lượt tải: 0
Tóm tắt tài liệu Luận văn Nghiên cứu công nghệ olap ứng dụng, để xem tài liệu hoàn chỉnh bạn click vào nút DOWNLOAD ở trên
Linh
va Thay Tran Ngoc Bao la nhung ngudi da true tiep hudng dan de ngay hom nay
chung em co the hoan thanh luan van nay. Chac chan rang neu khong co su diu dat
va tan tinh giup do cua Co va Thay thi chung em se gap rat nhidu kho khan trong
qua trinh nghien cuu thuc hien luan van.
Ke den, chung em muon tran trong cam cm cong lao to ldn cua Ba Me,
nhung dang sinh thanh da nuoi nang va day do chung em nen ngudi nhu ngay hom
nay. Chac chan rang sau nay du co lam gi, du co di dau, chung em cung se khong
lam phu long Ba Me.
Cuoi cung, xin gui ldi cam om chan thanh den trudng Dai hoc Su Pham
Thanh Pho Ho Chi Minh, noi da danh su quan tarn rat ldn doi vdi sinh vien cong
nghe thong tin chung em trong suot qua trinh hoc tap tai day. Dac biet, xin tran
trong cam on toan the cac Thay Co khoa Toan-Tin. Tren con dudng gop nhat nhung
kien thuc quy bau cua ngay hom nay, cac thay, cac co, va ban be la nhung ngudi da
day dd, cung chung em sat canh va trai nghiem trong suot 4 nam dai hQC.
2
Muc luc• •
Lai c&m an
Myc lyc
Bang cac ki hifu, chu viit tit
Bang danh muc cac bang bieu
Bang danh myc cac hinh ve
Tom tit n$i dung luan van
Md dau
Co so ly thuyet
Chuang 1 Cong nghy OLAP
1.1 Gidi thi?u ve Data Warehouse
1.1.1 Dinhnghia
1.1.2 Y nghia cua Data Warehouse
1.1.3 Bac diem cua Data Warehouse
1.1.4 Cau true cua Data Warehouse .
1.2 T6ng quan ve OLAP
1.2.1 Giaithiyu
1.2.2 Dac diem cua OLAP:
1.3 C&c mo hinh OLAP
1.3.1 MOLAP (Multidimensional OLAP)
1.3.2 ROLAP (Relational OLAP)
1.3.3 HOLAP (Hybrid OLAP)
1.3.4 So sanh cac mo hinh
Chuang 2 OLAP trong MS SQL Server Analysis Service (SSAS)
2.1. Kiin true cac thanh phan OLAP trong SSAS
2.1.1 Cube
2.1.2 Dimension .
2.1.3 Measure .
2.1.4 Fact table (Fact)
2.1.5 Slice
2.1.6 Partition
2
3
6
7
8
10
11
15
15
15
15
16
16
19
19
19
21
21
21
22
25
26
27
27
27
29
31
31
32
33
3
2.1.7 Role
2.1.8 Schedule (Lap lich)
2.2. Quy trinh thilt kl cac d6i tugng trong SSAS
2.2.1 So luge ve quy trinh xay dung Data Warehouse
2.2.1.1 Quy trinh chung
2.2.1.2 Cong cy thyc hifn:
2.2.2 Quy trinh tao Cube:
Chuong3 Component Pivot Table trong Excel
3.1. PivotTable
3.1.1 Gidithifu
3.1.2 Cachsudyng
3.2. PivotChart
3.3. Gioi thieu mQt so cong cy h6 try OLAP tuong ty nhu Pivot Table cua Excel....47
3.3.1 RadarCube OLAP
3.3.2 Data Dynamics Analysis
3.3.3 ComponentOne OLAP for WinForm ,
3.3.4 Devexpress ASPxPivotGrid .
3.4 So sanh Pivot Table vdi cac cong cy khac:
LTu diem: .
Nhuyc diem:
Ket luan: .
Chuomg 4 Cac thu viyn lap trinh lien quan din SSAS
4.1. AMO
4.2. ADOMD.NET
4.3. SMO
4.4. MDX
4.4.1 Ciu true chung
4.4.2 Vidymlu
4.4.3 Luuy .
Chuomg 5 Xay dung cong cy Web ho try phan tich du liyu cho cac ung dyng OLTP58
5.1. Phan tich, thilt kl .
5.1.1 Phan tich .
5.1.1.1 Quy trinh xir ly nghiyp vy:.,
5.1.1.2 Yeu cau chuc nang:
5.1.1.3 Mo hinh quan niem du lifu:
33
34
35
35
35
37
37
.43
43
43
43
46
47
48
49
49
50
50
50
51
52
52
53
53
55
55
55
56
58
58
58
60
66
4
5.1.2 Thiltkl
5.1.2.1 Dulifu:
5.1.2.2 Kiln true module: ..
5.1.2.3 He th6ng giao di?n:
5.2. Cai dat, lap trinh
Danh gia, ban luan vl kit qua
Ket luan - Huong phat triln
Cac kit qua da dat duoc
1. Vl ly thuyet
2. Vl th\rc nghi?m
Danh muc tai lieu tham khao
.72
72
79
80
88
89
91
91
91
91
93
5
Bang cac ki hieu, chfr viet tat
Tieng Anh:
AMO : Analysis Management Objects
Business Intelligence
Data Transformation Services
Extract-Transform - Load
BI:
DTS:
ETL:
HOLAP: Hybrid OLAP
Multi-dimensional Expressions
Multidimensional OLAP
Online Analytical Processing
Relational OLAP
SQL Server Management Objects
SQL Server Analysis Services
MDX:
MOLAP:
OLAP:
ROLAP :
SMO:
SSAS:
Tieng Viet:
• CSDL: Co so du lieu
6
Bang danh muc cac bang bieu
Bang 1: So sanh cac mo hinh OLAP. .26
7
Bang danh muc cac hinh ve
Hinh 1: Cac dang cor so du lieu..
Hinh 2: VI du v6 mot mlu bao cao co dinh
Hinh 3: Minh hoa vi tri cua ung dung
Hinh 4: Data Warehouse
Hinh 5: Bac diem cua Data Warehouse
Hinh 6: Cau true cua Data Warehouse
Hinh 7: Vi tri cua OLAP
Hinh 8: Mo hinh du lieu MOLAP
Hinh 9: Mo hinh du lieu ROLAP
Hinh 10: Luge do hinh sao
Hinh 11: Luge do hinh bong tuyet
Hinh 12: Mo hinh du lieu HOLAP
Hinh 13: VI dg v'e Kh6i (Cube)
Hinh 14: VI dg ve 6 rong trong khoi
Hinh 15: VI dg ve cac Dimension (ChiSu) khac nhau
Hinh 16: Fact Table
Hinh 17: Slice...
Hinh 18: Cac d6i tugng trong SSAS
Hinh 19: PivotTable
Hinh 20: PivotChart
Hinh 21: RadarCube OLAP
Hinh 22: Data Dynamics Analysis
Hinh 23: ComponentOne OLAP for WinForm
Hinh 24: Devexpress ASPxPivotGrid
Hinh 25: Mo hinh xu ly nghiep vg quan ly template
Hinh 26: So do Use-Case mo ta he thong chuc nang cua Administrator
Hinh 27: So do Use-Case mo ta he thong chuc nang cua Designer
Hinh 28: So do User-Case mo ta he thong chuc nang cua Custom User
Hinh 29: So do Use-Case mo ta chuc nang quan ly Template
Hinh 30: So do Use-Case mo ta chuc nang phan quyen Template
Hinh 31: So do Use-Case mo ta chuc nang quan ly Cube
Hinh 32: So do Use-Case mo ta chuc nang phan quyen Cube
Hinh 33: So do Use-Case mo ta chuc nang quan ly User
Hinh 34: So do Use-Case mo ta chuc nang quan ly Group
Hinh 35: So do Use-Case mo ta chuc nang quan ly Profile
Hinh 36: So d6 Use-Case mo ta chuc nang cau hinh he thong
Hinh 37: Mo hinh quan niem du lieu he th6ng OLAP
Hinh 38: Mo hinh vat ly
Hinh 39: Man hinh chinh phan he administrator
Hinh 40: Man hinh chuc nang Copy Rights
11
12
13
15
17
19
19
.22
.23
.24
.24
.25
.28
29
.30
.31
.32
.35
.43
.46
.47
.48
.49
.50
.59
.60
.61
.61
.62
.62
.63
.63
.64
,65
.65
,66
.67
.73
,80
,81
8
Hinh 41: Man hinh chuc nang Active cube
Hinh 42: Man hinh chuc nang Assign cube to group ....
Hinh 43: Man hinh chuc nang Assign user to cube
Hinh 44: Man hinh chuc nang List user .
Hinh 45: Man hinh chuc nang Assign Groups .
Hinh 46: Man hinh chuc nang Assign Users .
Hinh 47: Man hinh chuc nang Import cube
Hinh 48: Man hinh chuc nang Assign rights to profile..
Hinh 49: Man hinh chuc nang Schedule .
Hinh 50: Man hinh chuc nang Setting Connection .
Hinh 51: Man hinh chinh phan he designer .
Hinh 52: Man hinh chuc nang Design template .
Hinh 53: Man hinh chuc nang Export template .
Hinh 54: Man hinh chuc nang Assign template to user.
Hinh 55: Man hinh chinh phan he custom user .
.81
.81
.82
,82
,83
.83
,83
.84
.84
.85
.85
.86
.86
.87
.87
9
Tom tat noi dung luan van
> Tim hieu tong quan ve cong nghe OLAP
> Tim hieu OLAP trong MS SQL Server Analysis Service (SSAS)
> Tim hieu ve component Pivot Table trong Excel
• Pivot Table
• PivotChart
> Tim hi6u cac thu vien lap trinh lien quan dSn SSAS: MDX, AMO, SMO,
ADOMD.NET, ...
> Xay dvmg cong cu h6 trq phan tich du lieu cho cac ung dung OLTP
10
M ifdau
> Gioi thieu so1 lirtfc:
Ngay nay, each mang thong tin va su bung no thong tin tren toan cau da dan den
cac thuc trang nhu sau:
• Du lieu: Dugc lint tru d nhieu dang ccf s& die lieu khac nhau
Unlimited Data Sources
Hlnh 1: Cac dang co1sn dfr li$u
• Nhieu he thong thong tin dugc xay dung gay ra he qua:
Nhung giao dien khac nhau
Nhung dang bieu du lieu khac nhau
Thong tin triing lap va khong nhat quan
• Trong khi do, cac phan mem va he thong bao cao thuemg chi cung cap
mot so mau bao cao co dinh.
VD: Bao cao ve diem cuoi ki cua sinh vien
11
1KET QUA HOC TAP
Ma SV TOO Khoa Toan
Le Tuin Ngay sinh: 2/15/1991 12:00:00AIHo ten:
SO TT MON Ma. Ten mon Diem
Co ly thuy6tCO 2.00'
2 CS Co so do lieu 3.00
C4u true dCr li§u3 CT 6.00
RR ToEn rfiri rac 10.004
ToEn cao c4pTC5 1.00
6 TT Tri tu# nhSn tao 3.00
Hinh 2: Vi du ve mot mau bao cao co dinh
Vay tinh huong dat ra la ta se xu ly nhu the nao khi ngudi dung muon:
Xem them thong tin ve 1 thanh phin nao do trong CSDL ngoai nhung
thanh phan da co tren report
Xem report theo nhieu hinh thuc khac nhau, ngoai dang table da qua
quen thuoc
4 Khong con each nao khac hon la ta phai tao them 1 report moi !!!
> Y tirong, li do chon de tai:
Tir thuc trang tren, co th§ thay rang nhu cau phan tick so lieu de ho trq ra
cac quyet dinh chien luqc mot each kip thdi da tro nen ngay cang cap thiet.
12
Trong moi trudng kinh doanh hien dai, cac lanh dao doanh nghiep can co
nhung thong tin mang tinh quan tri chinh xac, nhanh chong va a nhieu goc do khac
nhau nham thuc hien viec ra quyet dinh kinh doanh hieu qua.
Vi thS, y tucmg cua de tai la sir dung he thdng OLAP k<k hop voi Data
warehouse de tao ra 1 cong cu giup ngudi dung - dac biet la nha quan ly, nhung
ngudi ra quyet dinh - de dang khai thac, truy xuat duqc cac mau du lieu ma ho quan
tam.
a
Operational System
Olap Analysisa ETL |M»tadalaj aERP Extraction.Transformation.Loadingaÿ Reporting
CRM
Data Warehouse s
Data Mining
Hinh 3: Minh hoa v| tri cua tmg dung
> Muc dich de tai
Xay dung 1 cong cu phan tich du lieu true tuyen tren nen web cho phep:
• Phan tich sd lieu dua tren nhilu chi tieu, yiu to.
13
• Co he thong phan quyen chat che ve chuc nang va du lieu tren ngudi
dung lan nhom ngudi dung
• Co he thong report, chart,... true quan, da dang, ho trq keo tha
• Co he thong template giup ngudi dung de dang horn trong qua trinh xem
va phan tich du lieu
• Co chuc nang Import, Export template, du lieu,... phuc vu cho nhu cau
chuyen doi giua cac he thong khac nhau
• Tuy theo nhu cau / lira chon cua ngudi su dung khi phan tich so lieu.
• Doc lap vdi he thong co sd du lieu cua ngudi dung
> Doi tirong va pham vi nghien ciru
• Cong nghe Data Warehouse va OLAP
• Cac mo hinh OLAP: MOLAP, ROLAP, HOLAP
• Tim hidu OLAP trong MS SQL Server Analysis Service (SSAS)
• Quy trinh thiet ke cac doi tuqng trong SSAS: Data Source, Data Source
View, Cube, Dimension, ...
• Component Pivot Table va PivotChart trong Excel
• Cac thu vien lap trinh lien quan d6n SSAS: MDX, AMO, SMO,
ADOMD.NET,...
• Xay dung cong cu ho trq phan tich du lieu cho cac ung dung OLTP
14
Ctf so" ly thuyet
Chmmg 1 Cong nghe OLAP
1.1 Gioi thieu ve Data Warehouse
1.1.1 Dinh nghTa
Pre-Data
Warehouse
Data
Cleansing
Data
Repositories
Front-End
Analytics
OLAP
r—iData Mart
Data
WarehouseETL Data MiningF=|
Data Mart
OLTP Server
L" Data
JTy Visualizationg
ReportingMeta-Data Repository
k
Data Flow
T
Hinh 4: Data Warehouse
> Kho du lieu (data warehouse), goi mot each chlnh xac hom la kho thong tin
(information warehouse), la mot ca sa du lieu huang doi tuang duac thiet ke vai
viec tiep can cac y kien trong moi linh vuc kinh doanh. No cung cap cac cong cu
de dap ting thong tin can thiet cho cac nha quan tri kinh doanh tai moi cap do to
chuc - khong nhung chi la nhung yeu cau du lieu phiic hop, ma con la dieu kien
thuan tien nhat de dat duqc viec lay thong tin nhanh, chinh xac.
15
> La mot su pha tron cua nhieu cong nghe, bao gom cac ca so du lieu da chieu va
moi quan he giua chung, kien true chu khach, giao dien ngudi dung do hoa,...
> Kho du lieu thudng rat Ion tdi hang tram GB hay tham chi hang Terabyte. Kho
du lieu duqc xay dung de tien loi cho viec truy cap theo nhieu nguon, nhieu kieu
du lieu khac nhau sao cho co the ket hop duqc ca nhung ung dung cua cac cong
nghe hien dai va ke thira duqc tir nhung he thong da co sin tir trudc.
1.1.2 Y nghTa cua Data Warehouse
> Truy cap du lieu mot each thong nhat :
• Thu thap va k6t hop thong tin.
• Cung cap mot khung nhin tich hop
• Giao dien ngudi su dung khong bien doi
• Ho trq kha nang chia se.
> Du lieu duqc tich hop va td chuc:
• Lam cho he thong trd nen de hieu
• Ro rang
• De phan tich
> Chuyen doi du lieu thanh thong tin: cung cap thong tin chinh xac dung thdi diem
va dung dinh dang
1.1.3 Dac dilm cua Data Warehouse
16
Subject
Oriented
integrated
Data
Warehouse
Non Volatile Time Variant
Hinh 5: Dae diem cua Data Warehouse
> Tich hap (Integrated):
Du lieu tap hop tur nhieu nguon khac nhau trong mot to chuc/cong ty va da duoc
dong nhat, xap xep, rut gon.
> La mot tap hop thong tin chinh xac, chat luong va nhat quan
> Chuan hoa :
• Cac qui uac ten
• Cac thuoc tinh
• Cac don vi do lucmg
> Huong chu the (Subject-oriented): Chuyen tu hudng ung dung sang hudng ho
trq quyet dinh
• Duoc to chuc xung quanh cac chu the chinh, nhu khach hang (customer),
san phim (product), ban hang (sales).
• Tap trung vao viec mo hinh hoa va phan tich du lieu cho cac nha dua ra
quyet dinh, ma khong tap trung vao cac hoat dong hay cac xu ly toan tac
17
hang ngay.
• Cung cap mot khung nhin dcm gian va sue tich xung quanh cac su kien
cua cac chu the
> Bien thoi gian (Time-variant): Cac du lieu duqc dinh nghTa vao mot thai diem
xac dinh va duqc ghi chep lai phuc vu cho viec so sanh du lieu theo chiSu thcri
gian
• Yeu cau quan trong d6i voi kho du lieu la pham vi ve thdi gian dai horn so
voi cac he thong tac nghiep :
Co so du lieu tac nghiep: du lieu chi co gia tri hien then
Du lieu cua kho du lieu: cung cap thong tin lich su (vi du nhu,
5-10 nam truoc)
• Y6u t6 then gian duqc luu tru trong CSDL
> Co dinh (Non-volatile ): du lieu co the duqc them vao nhung khong duqc sua
chua hoac xoa bo. Nham gium nha quan ly co the tong hop cong viec kinh
doanh cua toan to chuc/cong ty.
Load
Operational Databases Warehouse Database
INSERT Read
UPDATE
Read
DELETE
18
1.1.4 Cau true cua Data Warehouse
Data Warehouse
(Cube)
IData Mart
Hinh 6: Cau true cua Data Warehouse
> Mot Data warehouse duoc cau tao tir mot hay nhieu cube (khai niem nay se de
cap cr phan sau).
> Enterprise Data warehouse: tap hop toan b§ thong tin ve cac doi tuong cua toan
to chuc/cong ty. Vi d\i: san pham, khach hang, doanh thu ...
> Data mart (kho dir lieu chu de): la mot tap con cua Data warehouse, chi tap trung
vao nhung doi tuomg duorc chon.
1.2 Tong quan \e OLAP
1.2.1 Gioithieu
%Disparate Data SourcesOLTP Database OLAP CubeDatamart/
Data Warehouse
>
Access Database
iText Files
Hinh 7: Vj tri cua OLAP
19
> N4u he thbng xu ly chuyin giao du lieu OLTP tap trung vao viec thu thap, luu
tru va bien doi du lieu mot each chuan xac, thi OLAP tap trung vao viec su dung
cac du lieu da duqc bien doi vao viec ra quyet dinh san xuat kinh doanh. OLAP
la mot muc trong cac phan mem co so du lieu cung cap giao dien qua do ngubi
su dung co the bien doi hoac gibi han cac du lieu so khai tuy theo cac ham da
dinh nghTa hoac do chinh nguoi su dung dinh nghia, sau do nhanh chong kiem
tra cac ket qua trong cac chieu khac nhau cua du lieu
> He thong OLAP la mot he thong quan ly du lieu giau nang luc. No cho phep
nguoi su dung phan tich du lieu qua viec cat lat (slice) du lieu theo nhieu khia
canh khac nhau, khoan xuong (drill down) muc chi tiet hon hay cuon len (roll
up) muc tong hop hon cua du lieu. Ban chat cot loi cua OLAP la du lieu duqc
lay ra tir Data warehouse hoac tu Datamart sau do duqc chuyen thanh mo hinh
da chieu va duqc luu tru trong mot kho du lieu da chieu.
> Cu the, OLAP la mot ky thuat su dung cac the hien du lieu da chieu goi la cac
khoi (cube) nham cung cap kha nang truy xuat nhanh den du lieu cua kho du
lifu. Tao kh6i cho du lieu trong cac bang chieu (dimension table) va bang su
kien (fact table) trong kho du lieu de cung cap kha nang thuc hien cac truy van
tinh vi va phan tich cho cac ung dung client.
> OLAP cung cap nhieu lqi ich cho nguoi phan tich, cho vi du nhu:
• Cung cap mo hinh du lieu da chieu true quan cho phep de dang lua chon,
dinh hubng va kham pha du lieu.
• Cung cap mot ngon ngu truy van phan tich, cung cap sue manh de kham
pha cac moi quan he trong du lieu kinh doanh phuc tap.
• Du lieu duqc tinh toan truoc doi vbi cac truy van thubng xuyen nham
lam cho thbi gian tra loi rat nhanh doi vbi cac truy van dac biet.
20
• Cung cap cac cong cu manh giup ngucri dung tao cac khung nhin mcri cua
du lieu dua tren mot tap cac ham tinh toan dac biet.
• Ho trg tao mo hinh chuc nang de du bao, phan tich xu the phat trien va
phan tich thong ke.
• Tim ki4m va hi§n thi du lieu duoi dang bilu d6, khong gian 2D, 3D, ...
> OLAP dugc dat ra de xu ly cac tmy van lien quan den lugng du lieu rdt lorn ma
neu cho thuc thi cac truy van nay trong he thong OLTP se khong the cho ket qua
hoac se mat rat nhieu thdi gian.
1.2.2 Dac dilm cua OLAP:
> Dugc bi6t d4n nhu la mot phSn cua kho du lieu
> Cung cap cac bao cao, phan tich tien tinh toan, cac do thi, bieu d6
> Cho phep phan tich true tuyen du lieu
> Tham do su tuong tac du lieu
> Cung cap cac giao dien da dang cho ngucri dung
> Cung cap kha nang phan tich du lieu phuc tap bang phuemg thuc don gian
1.3 Cac mo hinh OLAP
1.3.1 MOLAP (Multidimensional OLAP)
> Mo hinh MOLAP luu tru du lieu co so (la du lieu tu cac bang cua kho du lieu)
va thong tin tong hgp (la cac do do dugc tinh toan tu cac bang) trong cac khoi
21
Data mart or
data warehouse
Data in OLAP
environment
A
SQL Server
MOLAR
| dataOracle MOLAP data
MOLAR aggregations
Other
Hinh 8: Mo hinh dfr li?u MOLAP
> Du lieu duac tim thay tai diem giao cua cac chieu
Customer Store
Store
Time Time
SALES FINANCE
Product GLLine
> Du lieu trong MOLAP can phai dugc process sau mot thai gian nhat dinh de cap
nhat lai nhung du lieu moi trong Data Warehouse.
> MOLAP thich hop cho cac truy van tong hop dfr lieu theo thfri gian dai.
Vi du: tong hop doanh thu cua tung chi nhanh trong vong 2 nam tro lai.
1.3.2 ROLAP (Relational OLAP)
> Mo hinh OLAP quan he (ROLAP) luu tru du lieu co so va thong tin t6ng hop
trong cac bang quan he
22
Data in OLAP
environment
Data mart or
data warehouse
SQL Server
Relational
database
Oracle
ROLAP data
ROLAP aggregations
Other
Hinh 9: Mo hinh dir li?u ROLAP
> Du lieu dugc tim thly tai di6m giao cua 1 cot va 1 mot hang
Table Columns
IRows
Key values to join
> Co 2 kieu mo hinh du lieu:
• Luge do hinh sao (star): m6i mot chieu chi dugc xay dung tir mot bang
chieu
23
Product Table Store Table
Productjd
Product_desc
StoreJd
Districtjd
\
Sales Fact Table
Productjd
Storejd
Itemjd
Dayjd
Sales_dollars
Sales_unlts
Time Table Item Table
Dayjd
Monthjd
Period id
Yearjd_
Itemjd
ltem_desc
Hlnh 10: Luge do hinh sao
• Luge do hinh bong tuyet (snowflake): moi chieu dugc xay dung tu nhieu
bang chieu
Product Table Store Table District TableProductjd
Product desc
Storejd
Store_de$c
Districtjd
Districtjd
District_desc
\ /7
Sales Fact Table
Itemjd
Storejd
Sales_dollars
Sales_units
\
Time Table Dept Table
Deptjd
Dept_desc
Mgrjd
Mar Table
Deptjd
Mgrjd
Mgr_name
Item Table
Weekjd
Periodjd
Yearjd
Itemjd
ltem_desc
Deptjd
Hinh 11: Luge do hinh bong tuyet
> ROLAP thich hop cho cac truy van dir lieu theo thoi gian thuc (xay ra
thucmg xuyen va doi hoi do chinh xac tuc thoi).
Vi du: xem thong tin chung khoan
24
1.3.3 HOLAP (Hybrid OLAP)
> La mo hinh OLAP lai (HOLAP) la su ket hop giua MOLAP va ROLAP.
Data mart or
data warehouse
Data In OLAP
environment
ry 7 ril ,gn-HtSQL Server
I#Other
MOLAP
FT
1
Hinh 12: Mo hinh dir li?u HOLAP
> HOLAP la mo hinh lai giua MOLAP va HOLAP, k4t hop nhung uu di6m va
loai bo nhung khuyet diem cua 2 mo hinh tren nham tao ra mot mo hinh toi uu
> Lai ich cua vi$c luu tru trong cau true HOLAP la:
• Luu du lieu t6ng hap trong Cube (theo MOLAP) d§ tang t6c do xu ly cac
truy van phuc tap
• Luu tru du lieu ca so trong cac bang quan he (theo ROLAP) nham tiet
kiem khong gian luu tru
Vi du: Trong mot cong ty chung khoan:
Du lieu ve thong tin nhan vien, doanh thu cua 2 nam tra ve trude
=> luu trong cube
Du lieu ve thong tin chung khoan, gia co phieu hang ngay
=> luu trong bang quan he
Toi uu hoa duqc van de luu tru va xu ly cua he thong.
25
1.3.4 So sanh cac mo hinh
MOLAP ROLAP HOLAP
So chieu >2 2 >2
Kh6iLuu tru du lieu ca so Bang quan he Bang quan he
Luu tru thong tin tong hop Kh6i Kh6iBang quan he
Hieu suat thuc hien truy van Cham nhatNhanh nhat Nhanh
Nhi6u ThipTieu thu khong gian luu tru Trung binh
ThipChi phi bao tri Cao Trung binh
Bang 1: So sanh cac mo hinh OLAP
KETLUAN:
Chon mo hinh nao can dua vao hieu suat, muc dich khai thac, va luang die
lieu,... (vi du MOLAP: thlch hop v&i data marts <50 GB, ROLAP: luong die lieu
co the lom horn)
26
Chirong 2 OLAP trong MS SQL Server Analysis
Service (SSAS)
Gioi thieu sc liro’c ve SSAS
> Tir truoc ddn nay, SQL Server dirge biSt dSn voi vai tro la mot he quan tri co so
du lieu co trach nhiem ho trg quan li, hru trir du lieu voi Database Engine. Tuy
nhien, tir phien ban SQL Server 2005 d6n SQL Server 2008 va moi nhat la SQL
Server 2008 R2, thi bo SQL Server da duoc tich hop nhilu goi dich vu h6 tro
viec tich hgp va khai thac kha nang tiem tang cua mot co so du lieu nhu SQL
Server Integrating Services, SQL Server Analysis Services va SQL Server
Reporting Services.
> Analysis Services la mot dich vu ho tro manh me viec phan tich, khai thac thong
tin tiem tang ben trong cua mot he co so du lieu.
> Analysis Services la cong cu de su dung, tich hop va linh dong giup dinh nghTa
cac chieu, xay dung cac khoi du lieu da chieu va cimg cap cac chuomg trinh ung
diing truy xuat tod cac khoi nay.
2.1. Kien true cac thanh phan OLAP trong SSAS
2.1.1 Cube
> Cube (Khoi) la phan tu chinh trong xu ly phan tich true tuyen, la tap con
(subset) du lieu tir kho du lieu, duoc to chuc va tong hop trong cac cau true da
chieu.
> Cube dung cac dimensions, fact table va cac measure de mo ta du lieu trong
cube.
Vi du : mot cube 3 chieu
27
lolt-iiptuiesK'ii
Dimensions:
NY Time.Product.StoreJ1 »*
LA A
. rol!ÿm
Store...w Juice |l£
a Milk S
« Coke p6
Cream gj
Soap -i2
Hierarchies:
Product =» Brand -ÿ ...
Day -»ÿ Week-> Quarter
Store —> Region -» Country
!
Bread g| loweek
/ M T W Til F S S
/ Time
*6 unitsof bread solrl in T A on K I
Hinh 13: Vi du ve Khoi (Cube)
> Mot cube co th6 co nhi6u chi6u
• Neu co nhieu hom 3 chieu se duoc g<?i la ‘sieu khoi’ (’’hypercube”)
• V6 mat ly thuyet thi so chieu la khong han ch4
• Thuong thi so chieu la tir 4 den 12
> Mot cube bao gom nhieu 6 du lieu
• La mot lien ket giua cac gia tri cua chieu
• Mot 6 co the la rong (khong co du lieu cho lien ket nay)
• Khoi thua (sparsecube): co nhieu 6 r6ng
• Khoi day dac (densecube): co it 6 rong
Vi du ve 6 rone:
28
measure value, for fact "Jill got 87 in CS5678”
Jack A1
87 73 "students" dimension
25 95EE2222
CS4002/ 6289
/
"courses"
dimension empty cell, no factabout Jill and CS4002
Hinh 14: Vi du vl 6 rong trong kh6i
> Cube bieu dien du lieu duai dang tong hop nhieu hon la chi tiet nhu khi dung
bang thong thuong. Do do, cube cho nguoi dung thay dugc cai nhin tom tit,
tong hop cua toan bo du lieu.
> Mot cube co the chua toan bo hoac mot phan cua du lieu nguon (Data source
view). Vi vay cac cube chi dugc luu tru va truy cap qua gioi han rieng cua no.
Du lieu dugc chuyen tCr nguon den cac cube qua cong cu ETL (Extract,
Transform, and Load).
2.1.2 Dimension
> Cac chieu la each mo ta chung loai ma theo do cac du lieu trong Cube dugc
phan chia de phan tich
> Moi chieu co the dugc xay dung tren mot hoac nhieu bang chieu (dimension
table) nham mo ta cho chilu do
> Cac chieu dugc anh xa ttr thong tin cua cac bang chieu trong kho du lieu vao
cac muc phan cap, vi du nhu chieu Dia ly thi gom cac muc nhu Luc dia, Quoc
gia, Tinh-Thanh pho. Trong vi du mo hinh kinh doanh a tren, chieu Thai gian co
the dugc mo ta bdi cac thuoc tinh nhu Nam, Quy, Thang va Ngay.
29
> Cac chieu dugc tao mot each doe lap va co the chia se giua cac OLAP Cube
nham xay dung cac Cube di dang va de chac chan ring thong tin tong hop cho
phan tich luon on dinh.
Time Product Location
[AM] All
"j Region| Manufacturer |
yX | Quarter | ;
Week | District || Brand |Ij Month|
r i | Store |Product I
Hinh IS: Vi du ve cac Dimension (Chieu) khac nhau
> Chieu c6 su phan c4p :
Phan cap la cot song cua viec gop du lieu hay noi mot each khac la dua vao cac
phan cap ma viec gop du lieu mod co the thuc hien dugc. Phan lorn cac chieu deu co
mot ciu true da muc hay phan cap.
Vi du: Neu chung ta lam nhung quyet dinh ve gia san pham de toi da doanh thu
thi chung ta can quan sat a nhung du lieu ve doanh thu san pham dugc gop theo gia
san pham, tuc la chung ta da thuc hien mot each gop. Khi can lam nhung quyet dinh
khac thi chung ta can thuc hien nhung phep gop tuong ung khac.
Nhu vay co the co qua nhieu tien trinh gop. The nen cac tien trinh gop nay can
phai dugc thuc hien mot each de dang, linh hoat de co the h6 trg nhimg phan tich
khong hoach dinh truoc. Dieu nay co the dugc giai quyet tren co sd co su trg giup
cua nhung phan cip rang va sau.
30
2.1.3 Measure
> Cac don vi do cua khoi la cac cot trong bang Fact. Cac dorn vi do lucmg xac dinh
nhung gia tri so tir bang Fact ma dugc tong hop phan tlch nhu dinh gia, tri gia,
hoac so luong ban.
> La don vi do de danh gia, phan tich du lieu.
2.1.4 Fact table (Fact)
> Fact la bang mo ta su kien ma he thong muon phan tich, ket hgrp voi cac
dimension da dugc xay dimg sin nham to chuc du lieu mot each co he th6ng.
> Mot fact gom cac truong chua khoa chinh cua cac dimension ma no ket hop,
dong thai chua cac measure cua cube. Moi fact co cac measure khac nhau.
> Fact dugc luu tai mot cap du lieu tho nao do
Cac fchra n'Ui fact table nr
khoacim cac <iiivi«u$icm table
Stoielufo Mesures
\
\ Prod Code TimeCode StoreCodt Sales Qty
Fact tableProsi'.ctlnfo I
[ TuneInfo
I
Hinh 16: Fact Table
31
2.1.5 Slice
> Slice la mot “lat cat”, tuong ting mot gia tri duy nhat cho chieu “duoc chieu”
vdi cac gia tri thanh vien cua cac chieu khac.
columnspage
Product: shoes Variables: all
Sales MarginOrect Costs Total CostsIndirect Costs
520 320 430 90January 110.shoes
February 400 250 130 380 20
March 430 300 120 420 10
. * April 490 320 150 470 20A
£ May 520 310 490 30180
June 390 230 150 380 10
470 290 160 450 2:July
August 500 360 150 510 -10P
September 450 290 430 22140
October 50480 290 140 430
rows November 510 310 150 460 50
Time: months December 550 330 160 490 60
© ©Basic OLAP Analysis Relational OLAP Analysis
/
[pUi Revenue for Laptop
Computers In 2003
At All Stores
I i
!,•!: r
- IZZIZ::!Z M i/8
3 JGeofyaphy Revenue for All ElectronicsIn 2003 and Q1 2004
At Stores In tfie NE Region
/Ia
F
Total Revenue and Costa
In Jan 2004 and Jan 2003
At Top 10 Revenue Stores
Data Warehou*ÿ
Hinh 17: Slice
32
2.1.6 Partition
> Tat ca cac kh6i deu co toi thiiu mot phan hoach (Partition) de chua du lieu cua
no. Mot partition don dupe tu dong tao ra khi khoi dupe dinh nghia.
> Khi ta tao mot partition moi cho mot khoi, partition mod nay duoc them vao
trong tap hop cac partition da ton tai doi voi khoi.
> Khoi phan anh du lieu da duoc ket noi co trong tit ca cac partition cua no. Mot
bang partition cua khoi la vo hinh doi voi ngudi dung.
> Cac partition tieu bieu cho mot cong cu manh, mem deo cho viec quan tri cac
khii OLAP, dac biet cac khii ldn.
Vi du: Mot khoi chua thong tin thuong mai co the chua trong mot hoac nhieu
partition cho du lieu cua nhung nam trudc va cac partition cho moi quy cua nam
hien tai. Cuoi nam cac bang partition cua bon quy co the duqc hop nhat trong
mpt partition don cho nam do. Cac bang partition co the duoc luu tru voi cac su
lua chon ket hop khac nhau theo phuong thuc luu tru, dinh vi du lieu nguon va
thiit ki kit hop. Tinh mim deo nay cho phep ta thiit ki cac chiin luoc lvm tru
khoi thich hop voi cac yeu cau cua ta.
2.1.7 Role
> Bao mat trong Analysis Services duoc tich hop san va dua tren bao mat cua
Microsoft Windows, ©i Analysis Services dupe bao mat, ta tao cac role trong
database OLAP. Moi role co the chua mot hoac nhieu user (group) dupe tao
trong he dieu hanh. M6i role dupe tao ra se lien ket voi cac cube trong database
OLAP. Nhd vay, ta co the quan ly su bao mat cua cube bang each han che truy
cap metadata (cac member trong dimemsion) cung nhu truy cap data (cac gia tri
luu tru trong cube).
> Analysis Services co mot role co dinh dupe tao san, bao gom cac thanh vien cua
group Administrators tren may chu server, dupe goi la server role.
> Be co the thuc thi moi tac vu trong he thong Analysis Services, user dang dung
33
phai la thanh vien cua group Administrators. Tat ca thanh vien trong group nay
dupe toan quyen tren cac cube va cac database cua server.
> Trong thuc te, ngudi ta tao ra cac role tren moi database, goi la database role,
cho phep mot user chi dupe quyen quan tri tren mot database.
> Ngoai ra, ta con co the tao cac role d cap do cube (truy cap toan bo mot so cube
quy dinh) hoac a cap do dimension (truy cap data a mot s6 dimension cua mot
so cube nhat dinh)
2.1.8 Schedule (Lap lich)
> Tai mot thdi diem nao do, he thong OLAP server can phai process lai de update
nhung thay doi tir phia Database Server, dam bao cho Client co the truy xuat
dupe nhung du lieu moi nhat va day du cua he thong.
> Co 2 each process: thu cong va tu dong
• Thu cong: co the tien hanh process thu cong khi can thiet ( co su thay doi
du lieu hay ciu true he thong).
• Tu dong: co the lap lich cho he thong tu dong process sau mot khoang
thdi gian quy dinh trade.
34
2.2. Quy trinh thiet ke cac doi tirong trong SSAS
Cube model
Cube *± Dimension
Cube dimension
Hierarchy
Cube hierarchy I
",Cube level Facts Join Level,TCube facts
Measure Measure Attribute Join Attribute
1mnDimension tables Dimension tablesIHl Fact table Relationaltables in DB2
Hinh 18: Cac ddi tirong trong SSAS
2.2.1 So1 lucre ve quy trinh xay dung Data Warehouse
2.2.1.1 Quy trinh chung
> Nhan dang du lieu nguon (Identify the data source):
• Mot trong nhung buac dau tien de xay dung Data Warehouse la phai
nhan biet cac nguon du lieu. Ta can phai tinh toan va xem xet nhung du
lieu n
Các file đính kèm theo tài liệu này:
- luan_van_nghien_cuu_cong_nghe_olap_ung_dung.pdf