如何使用AnalyticDB for PostgreSQL 6.0 进行TPC-H 1TB数据的测试

TPC-H 简介

TPC-H是事务处理性能委员会( Transaction ProcessingPerformance Council )制定的基准程序之一,TPC-H 主要目的是评价特定查询的决策支持能力,该基准模拟了决策支持系统中的数据库操作,测试数据库系统复杂查询的响应时间。
TPC-H 里的表是模拟一个配件销售管理系统进行建模。详情参考 TPCH Specification

8张表逻辑关系

如何使用AnalyticDB for PostgreSQL 6.0 进行TPC-H 1TB数据的测试

测试数据量的说明

数据量的大小对查询速度有直接的影响,TPC-H 中使用SF描述数据量,1SF 对应1GB 单位。1000SF,即1TB。1SF对应的数据量只是8个表的总数据量不包括索引等空间占用,准备数据时需预留更多空间。

AnalyticDB for PostgreSQL 6.0 规格选择

选择性价比适中的两种规格:
• ADB PG 6.0 SSD存储+单节点1核+实例节点数64
• ADB PG 6.0 SSD存储+单节点4核+实例节点数32
• 后续的测试过程均使用默认配置,未针对TPC-H的查询SQL做特别的修改

测试步骤

开通一个ECS实例

准备一台ECS(建议规格:ecs.g6.4xlarge规格、CentOS系统、ESSD 2T数据盘,建议与AnalyticDB for PostgreSQL 6.0 实例用相同region和VPC网络),用于1T数据生成、数据上传/入库、客户端测试。

开通一个AnalyticDB for PostgreSQL 6.0 实例

参考配置如下图。建议与ECS实例用相同区域和VPC网络。如何使用AnalyticDB for PostgreSQL 6.0 进行TPC-H 1TB数据的测试

开通外网,修改白名单,创建数据库账号

进入阿里云分析型数据库PostgreSQL产品页,进入分析型数据库PostgreSQL版控制台,找到已开通的AnalyticDB for PostgreSQL 6.0 实例,点击“实例名链接”进入详情页,参考下图位置,修改配置项。如何使用AnalyticDB for PostgreSQL 6.0 进行TPC-H 1TB数据的测试

生成TPC-H 1T数据

  • SSH进入到ECS实例,下载TPC-H dbgen代码,编译后dbgen目录生成 dbgen/qgen 执行程序。
git clone https://github.com/gregrahn/tpch-kit.git
cd tpch-kit/dbgen
make
  • 生成1T数据,运行 ./dbgen --help 查看如何生成,命令参考:
    ./dbgen -vf -s 1000
  • 也可以并行生成分片文件,如下shell脚本参考(10个分片文件):
for((i=1;i<=10;i++));
do
./dbgen -s 1000 -S $i -C 10 -f &
done
  • 处理生成的 tbl 文件,tbl文件每行最后会多1个'|',可以用seed命令将每行后面的'|'去掉,shell脚本参考:
sed -i 's/.$//' ./region.tbl &
sed -i 's/.$//' ./nation.tbl &
for((i=1;i<=10;i++));
do
sed -i 's/.$//' ./lineitem.tbl.$i &
sed -i 's/.$//' ./orders.tbl.$i &
sed -i 's/.$//' ./customer.tbl.$i &
sed -i 's/.$//' ./partsupp.tbl.$i &
sed -i 's/.$//' ./part.tbl.$i &
sed -i 's/.$//' ./supplier.tbl.$i &
done

向数据库中建表

在ECS机器上检查是否存在PSQL命令,如果没有,安装PSQL客户端:
sudo yum install postgresql

准备TPC-H涉及到的8张表创建SQL,建表语句参考如下。
列存表适合向量计算、JIT架构。对大批量数据的访问和统计,效率更高。因此建表语句中使用了
• AO列存表
• ZSTD 开启压缩

create table nation (
n_nationkey  integer not null,
n_name       char(25) not null,
n_regionkey  integer not null,
n_comment    varchar(152)
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (n_nationkey);
create table region (
r_regionkey  integer not null,
r_name       char(25) not null,
r_comment    varchar(152)
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (r_regionkey);
create table part (
p_partkey     integer not null,
p_name        varchar(55) not null,
p_mfgr        char(25) not null,
p_brand       char(10) not null,
p_type        varchar(25) not null,
p_size        integer not null,
p_container   char(10) not null,
p_retailprice numeric(15,2) not null,
p_comment     varchar(23) not null
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (p_partkey);
create table supplier (
s_suppkey     integer not null,
s_name        char(25) not null,
s_address     varchar(40) not null,
s_nationkey   integer not null,
s_phone       char(15) not null,
s_acctbal     numeric(15,2) not null,
s_comment     varchar(101) not null
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (s_suppkey);
create table partsupp (
ps_partkey     integer not null,
ps_suppkey     integer not null,
ps_availqty    integer not null,
ps_supplycost  numeric(15,2)  not null,
ps_comment     varchar(199) not null
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (ps_partkey);
create table customer (
c_custkey     integer not null,
c_name        varchar(25) not null,
c_address     varchar(40) not null,
c_nationkey   integer not null,
c_phone       char(15) not null,
c_acctbal     numeric(15,2)   not null,
c_mktsegment  char(10) not null,
c_comment     varchar(117) not null
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (c_custkey);
create table orders (
o_orderkey       bigint not null,
o_custkey        integer not null,
o_orderstatus    char(1) not null,
o_totalprice     numeric(15,2) not null,
o_orderdate      date not null,
o_orderpriority  char(15) not null,
o_clerk          char(15) not null,
o_shippriority   integer not null,
o_comment        varchar(79) not null
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (o_orderkey);
create table lineitem (
l_orderkey    bigint not null,
l_partkey     integer not null,
l_suppkey     integer not null,
l_linenumber  integer not null,
l_quantity    numeric(15,2) not null,
l_extendedprice  numeric(15,2) not null,
l_discount    numeric(15,2) not null,
l_tax         numeric(15,2) not null,
l_returnflag  char(1) not null,
l_linestatus  char(1) not null,
l_shipdate    date not null,
l_commitdate  date not null,
l_receiptdate date not null,
l_shipinstruct char(25) not null,
l_shipmode     char(10) not null,
l_comment      varchar(44) not null
)
with (appendonly=true, orientation=column, compresstype=zstd, compresslevel=9)
distributed by (l_orderkey);

运行SQL脚本文件的shell脚本参考:

export PGPASSWORD=<数据库账号密码>
psql -h <ADB PG实例内网或外网地址> -p 3432 -U <数据库账号> -f <创建表的SQL脚本文件路径> 

导入数据

准备工作就绪,可以开始导入数据了,导入数据有两种方式:

  1. 通过 copy from导入
  2. 通过OSS外表方式导入

下面分别介绍两种导入方法

COPY方式导入

SQL脚本参考:

copy nation from '/data/tpch_1t/nation.tbl' DELIMITER '|';
copy region from '/data/tpch_1t/region.tbl' DELIMITER '|';
copy supplier from '/data/tpch_1t/supplier.tbl' DELIMITER '|';

tbl路径以实际路径为准,导入shell脚本参考创建表的shell脚本(或psql进入数据库执行)。为提高导入效率(ECS网络带宽保障),可以把SQL拆开并发导入。

使用OSS外表方式导入数据

将生成的数据文件上传到oss

./ossutil64 cp -r <tbl文件目录> oss://<oss bucket>/<目录>/
-i <AccessKey ID> -k <Access Key Secret>
-e <EndPoint>

使用AnalyticDB for PostgreSQL 6.0的OSS外表进行TPCH测试数据导入,已在北京可用区的OSS上准备好了可以使用OSS外表的数据,且为公共可读权限。用户无需再进行数据生成,然后再导入数据到OSS。
OSS外表文档参考: OSS数据导入AnalyticDB for PostgreSQL

创建OSS外部表的建表语句示例(含OSS地址)

create readable external table ext_nation ( n_nationkey int, n_name varchar(25), n_regionkey integer,
n_comment varchar(152))
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/nation.tbl
id=LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y') FORMAT 'TEXT' (DELIMITER '|' ) ;
create readable external table ext_region ( R_REGIONKEY int, R_NAME CHAR(25),R_COMMENT VARCHAR(152))
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/region.tbl
id= LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y') FORMAT 'TEXT' (DELIMITER '|' ) ;
CREATE readable external TABLE ext_lineitem ( l_orderkey bigint, l_partkey bigint, l_suppkey bigint,
l_linenumber bigint, l_quantity double precision, l_extendedprice double precision,
l_discount double precision, l_tax double precision, l_returnflag CHAR(1),
l_linestatus CHAR(1), l_shipdate DATE, l_commitdate DATE, l_receiptdate DATE,
l_shipinstruct CHAR(25), l_shipmode CHAR(10), l_comment VARCHAR(44))
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/lineitem.tbl
id= LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y ') FORMAT 'TEXT' (DELIMITER '|' ) ;
CREATE readable external TABLE ext_orders ( o_orderkey bigint , o_custkey bigint , o_orderstatus CHAR(1) ,
o_totalprice double precision, o_orderdate DATE , o_orderpriority CHAR(15) , o_clerk CHAR(15) ,
o_shippriority bigint , o_comment VARCHAR(79) )
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/orders.tbl
id= LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y') FORMAT 'TEXT' (DELIMITER '|' ) ;
CREATE readable external TABLE ext_part ( p_partkey bigint , p_name VARCHAR(55) , p_mfgr CHAR(25) ,
p_brand CHAR(10) , p_type VARCHAR(25) , p_size bigint , p_container CHAR(10) ,
p_retailprice double precision , p_comment VARCHAR(23) )
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/part.tbl
id= LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y') FORMAT 'TEXT' (DELIMITER '|' ) ;
CREATE readable external TABLE ext_partsupp ( ps_partkey bigint , ps_suppkey bigint ,
ps_availqty bigint , ps_supplycost double precision , ps_comment VARCHAR(199) )
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/partsupp.tbl
id= LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y') FORMAT 'TEXT' (DELIMITER '|' ) ;
CREATE readable external TABLE ext_supplier ( s_suppkey bigint , s_name CHAR(25) ,
s_address VARCHAR(40) , s_nationkey bigint , s_phone CHAR(15) , s_acctbal DECIMAL(15,2) ,
s_comment VARCHAR(101) )
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/supplier.tbl
id= LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y') FORMAT 'TEXT' (DELIMITER '|' ) ;
CREATE readable external TABLE ext_customer ( c_custkey bigint , c_name VARCHAR(25) ,
c_address VARCHAR(40) , c_nationkey bigint , c_phone CHAR(15) , c_acctbal double precision ,
c_mktsegment CHAR(10) , c_comment VARCHAR(117) )
location('oss://oss-cn-beijing.aliyuncs.com
filepath=data/tpch_data_1000x/customer.tbl
id= LTAI0gcwwnjiwalU key= Z64Sf5jfk9xDBpwzivxesrGWCWN5CX
bucket=oss-y') FORMAT 'TEXT' (DELIMITER '|' ) ;

从OSS外表写入TPC-H数据到AnalyticDB for PostgreSQL

insert into nation select * from ext_nation;
insert into region select * from ext_region;
insert into lineitem select * from ext_lineitem;
insert into orders select * from ext_orders;
insert into customer select * from ext_customer;
insert into part select * from ext_part;
insert into partsupp select * from ext_partsupp;
insert into supplier select * from ext_supplier;

至此数据导入完毕,进入查询执行阶段

收集统计信息

analyze nation;
analyze region;
analyze lineitem;
analyze orders;
analyze customer;
analyze part;
analyze partsupp;
analyze supplier;

执行查询

使用如下shell脚本测试,也可以通过psql等其他客户端逐条执行查询SQL。具体的22条SQL语句见本文最后。

total_cost=0
for i in {1..22}
do
echo "begin run Q${i}, query/q$i.sql , `date`"
begin_time=`date +%s.%N`
#psql -h ${实例连接地址} -p ${端口号} -U ${数据库用户} -f query/q${i}.sql > ./log/log_q${i}.out
rc=$?
end_time=`date +%s.%N`
cost=`echo "$end_time-$begin_time"|bc`
total_cost=`echo "$total_cost+$cost"|bc`
if [ $rc -ne 0 ] ; then
printf "run Q%s fail, cost: %.2f, totalCost: %.2f, `date`n" $i $cost $total_cost
else
printf "run Q%s succ, cost: %.2f, totalCost: %.2f, `date`n" $i $cost $total_cost
fi
done

测试结果

各表数据量说明,1TB数据(不含索引)

表名 数据条目数
customer 15000w
lineitem 600000w
nation 25
orders 150000w
part 20000w
partsupp 80000w
region 5
supplier 1000w

每个SQL的执行时间统计

SQL 1core * 64节点 SSD型(RT:秒) 4core * 32节点 SSD型(RT:秒)
Q1 338.58 429.23
Q2 30.38 35.82
Q3 105.71 87.32
Q4 84.33 111.34
Q5 148.58 173.19
Q6 34.44 43.67
Q7 179.93 103.53
Q8 89.64 120.09
Q9 416.47 264.56
Q10 71.54 72.75
Q11 24.82 15.53
Q12 46.12 71.27
Q13 119.61 84.84
Q14 66.86 41.13
Q15 120.14 93.65
Q16 24.07 22.98
Q17 897.21 1171.89
Q18 284.00 379.58
Q19 78.82 54.93
Q20 180.50 112.77
Q21 240.50 245.00
Q22 121.30 77.21
Total 3703.56 3812.29

22个SQL

-- Q1
select
l_returnflag,
l_linestatus,
sum(l_quantity) as sum_qty,
sum(l_extendedprice) as sum_base_price,
sum(l_extendedprice * (1 - l_discount)) as sum_disc_price,
sum(l_extendedprice * (1 - l_discount) * (1 + l_tax)) as sum_charge,
avg(l_quantity) as avg_qty,
avg(l_extendedprice) as avg_price,
avg(l_discount) as avg_disc,
count(*) as count_order
from
lineitem
where
l_shipdate <= date '1998-12-01' - interval '93 day'
group by
l_returnflag,
l_linestatus
order by
l_returnflag,
l_linestatus;
-- Q2
select
s_acctbal,
s_name,
n_name,
p_partkey,
p_mfgr,
s_address,
s_phone,
s_comment
from
part,
supplier,
partsupp,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and p_size = 23
and p_type like '%STEEL'
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
and ps_supplycost = (
select
min(ps_supplycost)
from
partsupp,
supplier,
nation,
region
where
p_partkey = ps_partkey
and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'EUROPE'
)
order by
s_acctbal desc,
n_name,
s_name,
p_partkey
limit 100;
-- Q3
select
l_orderkey,
sum(l_extendedprice * (1 - l_discount)) as revenue,
o_orderdate,
o_shippriority
from
customer,
orders,
lineitem
where
c_mktsegment = 'MACHINERY'
and c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate < date '1995-03-24'
and l_shipdate > date '1995-03-24'
group by
l_orderkey,
o_orderdate,
o_shippriority
order by
revenue desc,
o_orderdate
limit 10;
-- Q4
select
o_orderpriority,
count(*) as order_count
from
orders
where
o_orderdate >= date '1996-08-01'
and o_orderdate < date '1996-08-01' + interval '3' month
and exists (
select
*
from
lineitem
where
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
group by
o_orderpriority
order by
o_orderpriority;
-- Q5
select
n_name,
sum(l_extendedprice * (1 - l_discount)) as revenue
from
customer,
orders,
lineitem,
supplier,
nation,
region
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and l_suppkey = s_suppkey
and c_nationkey = s_nationkey
and s_nationkey = n_nationkey
and n_regionkey = r_regionkey
and r_name = 'MIDDLE EAST'
and o_orderdate >= date '1994-01-01'
and o_orderdate < date '1994-01-01' + interval '1' year
group by
n_name
order by
revenue desc;
-- Q6
select
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= date '1994-01-01'
and l_shipdate < date '1994-01-01' + interval '1' year
and l_discount between 0.06 - 0.01 and 0.06 + 0.01
and l_quantity < 24;
-- Q7
select
supp_nation,
cust_nation,
l_year,
sum(volume) as revenue
from
(
select
n1.n_name as supp_nation,
n2.n_name as cust_nation,
extract(year from l_shipdate) as l_year,
l_extendedprice * (1 - l_discount) as volume
from
supplier,
lineitem,
orders,
customer,
nation n1,
nation n2
where
s_suppkey = l_suppkey
and o_orderkey = l_orderkey
and c_custkey = o_custkey
and s_nationkey = n1.n_nationkey
and c_nationkey = n2.n_nationkey
and (
(n1.n_name = 'JORDAN' and n2.n_name = 'INDONESIA')
or (n1.n_name = 'INDONESIA' and n2.n_name = 'JORDAN')
)
and l_shipdate between date '1995-01-01' and date '1996-12-31'
) as shipping
group by
supp_nation,
cust_nation,
l_year
order by
supp_nation,
cust_nation,
l_year;
-- Q8
select
o_year,
sum(case
when nation = 'INDONESIA' then volume
else 0
end) / sum(volume) as mkt_share
from
(
select
extract(year from o_orderdate) as o_year,
l_extendedprice * (1 - l_discount) as volume,
n2.n_name as nation
from
part,
supplier,
lineitem,
orders,
customer,
nation n1,
nation n2,
region
where
p_partkey = l_partkey
and s_suppkey = l_suppkey
and l_orderkey = o_orderkey
and o_custkey = c_custkey
and c_nationkey = n1.n_nationkey
and n1.n_regionkey = r_regionkey
and r_name = 'ASIA'
and s_nationkey = n2.n_nationkey
and o_orderdate between date '1995-01-01' and date '1996-12-31'
and p_type = 'STANDARD BRUSHED BRASS'
) as all_nations
group by
o_year
order by
o_year;
-- Q9
select
nation,
o_year,
sum(amount) as sum_profit
from
(
select
n_name as nation,
extract(year from o_orderdate) as o_year,
l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount
from
part,
supplier,
lineitem,
partsupp,
orders,
nation
where
s_suppkey = l_suppkey
and ps_suppkey = l_suppkey
and ps_partkey = l_partkey
and p_partkey = l_partkey
and o_orderkey = l_orderkey
and s_nationkey = n_nationkey
and p_name like '%chartreuse%'
) as profit
group by
nation,
o_year
order by
nation,
o_year desc;
-- Q10
select
c_custkey,
c_name,
sum(l_extendedprice * (1 - l_discount)) as revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
from
customer,
orders,
lineitem,
nation
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= date '1994-08-01'
and o_orderdate < date '1994-08-01' + interval '3' month
and l_returnflag = 'R'
and c_nationkey = n_nationkey
group by
c_custkey,
c_name,
c_acctbal,
c_phone,
n_name,
c_address,
c_comment
order by
revenue desc
limit 20;
-- Q11
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'INDONESIA'
group by
ps_partkey having
sum(ps_supplycost * ps_availqty) > (
select
sum(ps_supplycost * ps_availqty) * 0.0001000000
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'INDONESIA'
)
order by
value desc;
-- Q12
select
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('REG AIR', 'TRUCK')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1994-01-01'
and l_receiptdate < date '1994-01-01' + interval '1' year
group by
l_shipmode
order by
l_shipmode;
-- Q13
select
c_count,
count(*) as custdist
from
(
select
c_custkey,
count(o_orderkey)
from
customer left outer join orders on
c_custkey = o_custkey
and o_comment not like '%pending%requests%'
group by
c_custkey
) as c_orders (c_custkey, c_count)
group by
c_count
order by
custdist desc,
c_count desc;
-- Q14
select
100.00 * sum(case
when p_type like 'PROMO%'
then l_extendedprice * (1 - l_discount)
else 0
end) / sum(l_extendedprice * (1 - l_discount)) as promo_revenue
from
lineitem,
part
where
l_partkey = p_partkey
and l_shipdate >= date '1994-11-01'
and l_shipdate < date '1994-11-01' + interval '1' month;
-- Q15
create view revenue0 (supplier_no, total_revenue) as
select
l_suppkey,
sum(l_extendedprice * (1 - l_discount))
from
lineitem
where
l_shipdate >= date '1997-10-01'
and l_shipdate < date '1997-10-01' + interval '3' month
group by
l_suppkey;
select
s_suppkey,
s_name,
s_address,
s_phone,
total_revenue
from
supplier,
revenue0
where
s_suppkey = supplier_no
and total_revenue = (
select
max(total_revenue)
from
revenue0
)
order by
s_suppkey;
drop view revenue0;
-- Q16
select
p_brand,
p_type,
p_size,
count(distinct ps_suppkey) as supplier_cnt
from
partsupp,
part
where
p_partkey = ps_partkey
and p_brand <> 'Brand#44'
and p_type not like 'SMALL BURNISHED%'
and p_size in (36, 27, 34, 45, 11, 6, 25, 16)
and ps_suppkey not in (
select
s_suppkey
from
supplier
where
s_comment like '%Customer%Complaints%'
)
group by
p_brand,
p_type,
p_size
order by
supplier_cnt desc,
p_brand,
p_type,
p_size;
-- Q17
select
sum(l_extendedprice) / 7.0 as avg_yearly
from
lineitem,
part
where
p_partkey = l_partkey
and p_brand = 'Brand#42'
and p_container = 'JUMBO PACK'
and l_quantity < (
select
0.2 * avg(l_quantity)
from
lineitem
where
l_partkey = p_partkey
);
-- Q18
select
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice,
sum(l_quantity)
from
customer,
orders,
lineitem
where
o_orderkey in (
select
l_orderkey
from
lineitem
group by
l_orderkey having
sum(l_quantity) > 312
)
and c_custkey = o_custkey
and o_orderkey = l_orderkey
group by
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice
order by
o_totalprice desc,
o_orderdate
limit 100;
-- Q19
select
sum(l_extendedprice* (1 - l_discount)) as revenue
from
lineitem,
part
where
(
p_partkey = l_partkey
and p_brand = 'Brand#43'
and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
and l_quantity >= 5 and l_quantity <= 5 + 10
and p_size between 1 and 5
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#45'
and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
and l_quantity >= 12 and l_quantity <= 12 + 10
and p_size between 1 and 10
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
)
or
(
p_partkey = l_partkey
and p_brand = 'Brand#11'
and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
and l_quantity >= 24 and l_quantity <= 24 + 10
and p_size between 1 and 15
and l_shipmode in ('AIR', 'AIR REG')
and l_shipinstruct = 'DELIVER IN PERSON'
);
-- Q20
select
s_name,
s_address
from
supplier,
nation
where
s_suppkey in (
select
ps_suppkey
from
partsupp
where
ps_partkey in (
select
p_partkey
from
part
where
p_name like 'magenta%'
)
and ps_availqty > (
select
0.5 * sum(l_quantity)
from
lineitem
where
l_partkey = ps_partkey
and l_suppkey = ps_suppkey
and l_shipdate >= date '1996-01-01'
and l_shipdate < date '1996-01-01' + interval '1' year
)
)
and s_nationkey = n_nationkey
and n_name = 'RUSSIA'
order by
s_name;
-- Q21
select
s_name,
count(*) as numwait
from
supplier,
lineitem l1,
orders,
nation
where
s_suppkey = l1.l_suppkey
and o_orderkey = l1.l_orderkey
and o_orderstatus = 'F'
and l1.l_receiptdate > l1.l_commitdate
and exists (
select
*
from
lineitem l2
where
l2.l_orderkey = l1.l_orderkey
and l2.l_suppkey <> l1.l_suppkey
)
and not exists (
select
*
from
lineitem l3
where
l3.l_orderkey = l1.l_orderkey
and l3.l_suppkey <> l1.l_suppkey
and l3.l_receiptdate > l3.l_commitdate
)
and s_nationkey = n_nationkey
and n_name = 'MOZAMBIQUE'
group by
s_name
order by
numwait desc,
s_name
limit 100;
-- Q22
select
cntrycode,
count(*) as numcust,
sum(c_acctbal) as totacctbal
from
(
select
substring(c_phone from 1 for 2) as cntrycode,
c_acctbal
from
customer
where
substring(c_phone from 1 for 2) in
('13', '31', '23', '29', '30', '18', '17')
and c_acctbal > (
select
avg(c_acctbal)
from
customer
where
c_acctbal > 0.00
and substring(c_phone from 1 for 2) in
('13', '31', '23', '29', '30', '18', '17')
)
and not exists (
select
*
from
orders
where
o_custkey = c_custkey
)
) as custsale
group by
cntrycode
order by