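Distribution examples

The following examples show how data is distributed according to the options that you define in the CREATE TABLE statement.

DISTKEY example

Look at the schema of the USERS table in the TICKIT database. USERID is defined as both the SORTKEY column and the DISTKEY column: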


select "column", type, encoding, distkey, sortkey 
from pg_table_def where tablename = 'users';
    
    column     |          type          | encoding | distkey | sortkey
---------------+------------------------+----------+---------+---------
 userid        | integer                | none     | t       |       1
 username      | character(8)           | none     | f       |       0
 firstname     | character varying(30)  | text32k  | f       |       0

...
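As a rough illustration of how such a schema might be declared, the sketch below marks one column as both the distribution key and the sort key. The table name users_sketch is hypothetical, and only the three columns visible in the output above are listed; the real USERS table has more columns that are elided here.

```sql
-- Hypothetical table name; column list limited to the columns shown above.
create table users_sketch (
    userid    integer not null distkey sortkey,  -- distkey = t, sortkey = 1
    username  char(8),
    firstname varchar(30)
);
```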

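USERID is a good choice for the distribution column on this table. If you query the SVV_DISKUSAGE system view, you can see that the table is very evenly distributed. Column numbers are zero-based, so USERID is column 0.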


select slice, col, num_values as rows, minvalue, maxvalue
from svv_diskusage
where name='users' and col=0 and rows>0
order by slice, col;

slice| col | rows  | minvalue | maxvalue
-----+-----+-------+----------+----------
0    | 0   | 12496 | 4        | 49987
1    | 0   | 12498 | 1        | 49988
2    | 0   | 12497 | 2        | 49989
3    | 0   | 12499 | 3        | 49990
(4 rows)

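The table contains 49,990 rows. The rows (num_values) column shows that each slice contains about the same number of rows. The minvalue and maxvalue columns show the range of values on each slice. Each slice includes nearly the entire range of values, so there is a good chance that every slice participates in running a query that filters for a range of user IDs.

This example demonstrates distribution on a small test system. The total number of slices is typically much higher.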
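To see how many slices a given cluster has, one option is to query the STV_SLICES system table, which maps slices to nodes. This is a minimal sketch; the result depends entirely on the cluster's node type and node count.

```sql
-- Count the slices on each node of the current cluster.
select node, count(*) as slices
from stv_slices
group by node
order by node;
```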

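If you frequently join or group using the STATE column, you might choose to distribute on the STATE column. The following example shows a case where you create a new table, USERSKEY, with the same data as the USERS table but set the DISTKEY to the STATE column. In this case, the distribution is not as even. Slice 0 (13,587 rows) holds about 34% more rows than slice 3 (10,150 rows). In a much larger table, this amount of distribution skew can have an adverse impact on query processing.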


create table userskey distkey(state) as select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'userskey' and col=0 and rows>0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 13587 |        5 |    49989
    1 |   0 | 11245 |        2 |    49990
    2 |   0 | 15008 |        1 |    49976
    3 |   0 | 10150 |        4 |    49986
(4 rows)
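One way to put a single number on this skew is to divide the row count of the fullest slice by that of the emptiest one. The query below is a sketch of that calculation against SVV_DISKUSAGE; the aliases rows_per_slice and skew_ratio are made up for illustration. It sums num_values per slice because larger tables occupy more than one block per slice, and slices that hold no rows for the table do not appear at all.

```sql
-- Ratio of the largest to the smallest per-slice row count for USERSKEY.
select max(rows_per_slice)::float / min(rows_per_slice) as skew_ratio
from (
    select slice, sum(num_values) as rows_per_slice
    from svv_diskusage
    where name = 'userskey' and col = 0
    group by slice
) as per_slice;
```

For the figures shown above, the ratio works out to 15,008 / 10,150, or about 1.48, because slice 2 rather than slice 0 holds the most rows.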

DISTSTYLE EVEN example

If you create a new table with the same data as the USERS table but set the DISTSTYLE to EVEN, rows are always evenly distributed across slices.


create table userseven diststyle even as 
select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'userseven' and col=0 and rows>0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 12497 |        4 |    49990
    1 |   0 | 12498 |        8 |    49984
    2 |   0 | 12498 |        2 |    49988
    3 |   0 | 12497 |        1 |    49989  
(4 rows)

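However, because distribution is not based on a specific column, query processing can be degraded, especially if the table is joined to other tables. The lack of distribution on a joining column often influences the type of join operation that can be performed efficiently. Joins, aggregations, and grouping operations are optimized when both tables are distributed and sorted on their respective joining columns.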
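The sketch below illustrates the case where both tables are distributed and sorted on the joining column. The tables buyers_sketch and orders_sketch and their columns are hypothetical and are not part of the TICKIT database. In the EXPLAIN output, a join step labeled DS_DIST_NONE indicates that no rows need to be redistributed; the plan actually chosen depends on table statistics and data volume.

```sql
-- Hypothetical tables, both distributed and sorted on the join column userid.
create table buyers_sketch (
    userid integer not null distkey sortkey,
    state  char(2)
);

create table orders_sketch (
    orderid integer not null,
    userid  integer not null distkey sortkey,
    amount  decimal(8,2)
);

-- Inspect the plan for a join on the shared distribution key.
explain
select b.state, sum(o.amount) as total_amount
from orders_sketch o
join buyers_sketch b on o.userid = b.userid
group by b.state;
```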

DISTSTYLE ALL example

If you create a new table with the same data as the USERS table but set the DISTSTYLE to ALL, all the rows are distributed to the first slice of each node.


create table usersall diststyle all as
select * from users;

select slice, col, num_values as rows, minvalue, maxvalue from svv_diskusage
where name = 'usersall' and col=0 and rows > 0
order by slice, col;

slice | col | rows  | minvalue | maxvalue
------+-----+-------+----------+----------
    0 |   0 | 49990 |        4 |    49990
    2 |   0 | 49990 |        2 |    49990

(2 rows)
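To compare the tables from these examples side by side, a query against the SVV_TABLE_INFO system view can report each table's distribution style together with its row skew. This is a sketch that assumes all four tables exist in the current database; SVV_TABLE_INFO lists only non-empty tables, and by default it is visible only to superusers.

```sql
-- skew_rows is the ratio of rows in the fullest slice to rows in the emptiest slice.
select "table", diststyle, skew_rows
from svv_table_info
where "table" in ('users', 'userskey', 'userseven', 'usersall')
order by "table";
```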
