Working with the TPCH Data

This topic provides information about loading the TPCH demo data and running the example queries. You can also load the data and run queries using the interactive AnzoGraph Tutorial notebook as described in Zeppelin Notebook Integration or you can copy the queries in this topic and run them in the AnzoGraph Query Console (if available with your solution), command line interface, or another interface.

Loading the TPCH Data

Important: Loading the large TPCH data set requires at least 16 GB of available RAM. Attempting to load this data set with a smaller memory configuration may result in an out-of-memory error.

The attached query includes several INSERT queries that load the TPCH data into a graph named ghib. The queries load the data from the TPCH CSV files in the sf1.csv.gz directory in the Cambridge Semantics csi-sdl-data-tpch bucket on Amazon S3. Click the following link to open the query so that you can copy the contents to the Query Console or to your own file and run the queries:

Open the query for copying

The load takes a few minutes to run. When the load completes, you can run this query to list the TPCH predicates and display the number of times each predicate is referenced:

SELECT ?p (count(*) as ?count)
FROM <ghib>
WHERE { ?s ?p ?o . }
GROUP BY ?p
ORDER BY ?p

Getting to Know the TPCH Data

The TPCH demo is similar to the SQL TPC-H Decision Support data, which Cambridge Semantics converted to the RDF graph model. The data set and queries model a use case where a vendors purchase parts from suppliers and sell them to customers. Each sale is captured in an order with a line item for each part. Details include data such as the price paid, shipping status of each line item, and the clerk who took the order. Since parts may be available from multiple suppliers, the data includes the quantity and price from each supplier.

To help familiarize you with the data set, the following diagram shows the model or ontology for the GHIB graph. Circles represent subjects or classes of data and rectangles represent properties.

To help familiarize you with the triples in the GHIB graph, the diagram below shows an instance of a subset of the triples in the graph.

Running the TPCH Queries

This section lists each of the TPCH queries and provides a brief description.

  1. Pricing Summary Report
  2. Minimum Cost Supplier
  3. Shipping Priority
  4. Order Priority Checking
  5. Local Supplier Volume
  6. Forecasting Revenue Change
  7. Volume Shipping
  8. National Market Share
  9. Product Type Profit Measure
  10. Returned Item Reporting
  11. Important Stock Identification
  12. Shipping Modes and Order Priority
  13. Customer Distribution
  14. Promotion Effect
  15. Top Supplier
  16. Parts/Supplier Relationship
  17. Small-Quantity-Order Revenue
  18. Large Volume Customer
  19. Discounted Revenue
  20. Potential Part Promotion
  21. Suppliers Who Kept Orders Waiting
  22. Global Sales Opportunity

Pricing Summary Report

The query below returns a summary pricing report for all of the items shipped, billed, and returned in an 86-day period. In the results, items are grouped by their return and line item status. For each group the query returns the total number of items, total base price, total discounted price, total charged, and the average quantity of items, average price, average discount, and number of orders.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT
  ?l_returnflag
  ?l_linestatus
  (FORMATNUMBER(SUM(?l_quantity),'#.##') AS ?sum_qty)
  (FORMATNUMBER(SUM(?l_extendedprice),'#.##') AS ?sum_base_price)
  (FORMATNUMBER(SUM(?l_extendedprice * (1 - ?l_discount)),'#.##') AS ?sum_disc_price)
  (FORMATNUMBER(SUM(?l_extendedprice * (1 - ?l_discount) * (1 + ?l_tax)),'#.##') AS ?sum_charge)
  (FORMATNUMBER(AVG(?l_quantity),'#.##') AS ?avg_qty)
  (FORMATNUMBER(AVG(?l_extendedprice),'#.##') AS ?avg_price)
  (FORMATNUMBER(AVG(?l_discount),'#.##') AS ?avg_disc)
  (COUNT(*) AS ?count_order)
FROM <ghib>
WHERE {
  ?item :l_returnflag ?l_returnflag ;
  :l_linestatus ?l_linestatus ;
  :l_quantity ?l_quantity ;
  :l_extendedprice ?l_extendedprice ;
  :l_discount ?l_discount ;
  :l_shipdate ?l_shipdate ;
  :l_tax ?l_tax .
  FILTER(?l_shipdate <= (xsd:date('1998-12-01')-'P86D'^^xsd:duration)) .
}
GROUP BY ?l_returnflag ?l_linestatus
ORDER BY ?l_returnflag ?l_linestatus

Minimum Cost Supplier

The query below determines which supplier offers the lowest price for a given part in a given region. Since multiple suppliers in a region could offer the same part at the same (minimum) cost, the query also lists the suppliers with the 100 highest account balances. For each supplier, the query returns the part's number and manufacturer and the supplier's account balance, name, nation, address, phone number, and comments.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT
  ?s_acctbal
  ?s_name
  ?n_name
  (STRAFTER(STR(?part),'part_') AS ?p_partkey)
  ?p_mfgr
  ?s_address
  ?s_phone
  ?s_comment
FROM <ghib>
WHERE {
  ?part :p_mfgr ?p_mfgr ;
        :p_type ?p_type ;
        :p_size 38 .
  FILTER(REGEX(?p_type,'.*NICKEL')).
  ?supplier :s_name ?s_name ;
            :s_address ?s_address ;
            :s_comment ?s_comment ;
            :s_phone ?s_phone ;
            :s_acctbal ?s_acctbal ;
            :s_nation ?nation .
  << ?supplier :s_part ?part >> :s_supplycost ?s_supplycost .
  ?nation :n_name ?n_name;
          :r_name 'AFRICA' .
  { 
    SELECT ?part (MIN(?s_supplycost) AS ?min_supplycost)
    WHERE {
      ?part :p_type ?p_type ;
            :p_size 38 .
      FILTER(regex(?p_type,'.*NICKEL')) .
      ?supplier :s_nation ?nation .
      ?nation :r_name 'AFRICA'  .
      << ?supplier :s_part ?part >> :s_supplycost ?s_supplycost .
  }
  GROUP BY ?part
 }
  FILTER(?s_supplycost = ?min_supplycost)
}
ORDER BY DESC(?s_acctbal) ?n_name ?s_name ?part
LIMIT 100

Shipping Priority

The query below determines shipping priority by returning the 10 highest-value orders that have not yet shipped. For each order, the query returns the order number, revenue, order date, and shipping priority.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT
  (STRAFTER(STR(?order),'order_') AS ?l_orderkey)
  (SUM(?l_extendedprice * (1 - ?l_discount)) AS ?revenue)
  ?o_orderdate
  ?o_shippriority
FROM <ghib>
WHERE {
  ?item :l_order ?order ;
        :l_extendedprice ?l_extendedprice ;
        :l_discount ?l_discount ;
        :l_shipdate ?l_shipdate   .
  ?order :o_orderdate ?o_orderdate ;
         :o_shippriority ?o_shippriority ;
         :o_customer/:c_mktsegment 'BUILDING' .
  FILTER(?o_orderdate < '1995-03-03'^^xsd:date) .
  FILTER(?l_shipdate > '1995-03-03'^^xsd:date) .
}
GROUP BY ?order ?o_orderdate ?o_shippriority
ORDER BY DESC(?revenue) ?o_orderdate
LIMIT 10

Order Priority Checking

The query below returns the number of orders in each order priority group for a given quarter in a given year.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?o_orderpriority (COUNT(*) AS ?order_count)
FROM <ghib>
WHERE {
  ?order :o_orderpriority ?o_orderpriority;
         :o_orderdate ?o_orderdate .
  FILTER(?o_orderdate >= '1993-06-01'^^xsd:date) .
  FILTER(?o_orderdate < '1993-09-01'^^xsd:date) .
  FILTER( EXISTS { ?item :l_order ?order ;
                         :l_commitdate ?l_commitdate;
                         :l_receiptdate ?l_receiptdate .
          FILTER(?l_commitdate < ?l_receiptdate) . } )
}
GROUP BY ?o_orderpriority
ORDER BY ?o_orderpriority
LIMIT 10

Local Supplier Volume

The query below helps determine whether to invest in a local distribution center by returning the revenue volume for one year by local suppliers in the "America" region. The results are filtered to include only the orders where both the supplier and customer are in the same nation. The query displays the nations and their revenue volume in descending order.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?n_name (SUM(?l_extendedprice * (1 - ?l_discount)) AS ?revenue)
FROM <ghib>
WHERE {
  ?item :l_extendedprice ?l_extendedprice;
        :l_discount ?l_discount;
        :l_order ?order;
        :l_supplier ?supplier .
      {
        ?order :o_orderdate ?o_orderdate;
               :o_customer ?customer .
        FILTER(?o_orderdate >= '1994-01-01'^^xsd:date) .
        FILTER(?o_orderdate < '1994-01-01'^^xsd:date + 'P1Y'^^xsd:duration) .
          { 
            ?customer :c_nation ?c_nation .
            ?c_nation :n_name ?n_name;
                      :r_name 'AMERICA'.
          }
      } 
 { 
  ?supplier :s_nation ?s_nation .
  ?s_nation :n_name ?n_name;
            :r_name 'AMERICA'.
 }
}
GROUP BY ?n_name
ORDER BY DESC(?revenue)
LIMIT 10

Forecasting Revenue Change

The query below helps determine ways to raise profits by calculating the revenue increase that would result after eliminating certain discounts. The query considers all of the items shipped in one year where the quantity is less than 25 and the discount on the item is between 6% (0.06) and 8% (0.08). The results list the increase in revenue for that year if the discount was not applied.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT (SUM(?l_extendedprice * ?l_discount) AS ?revenue)
FROM <ghib>
WHERE {
  ?item :l_extendedprice ?l_extendedprice;
        :l_discount ?l_discount;
        :l_shipdate ?l_shipdate; 
        :l_quantity ?l_quantity .
  FILTER(?l_shipdate >= '1994-01-01'^^xsd:date).
  FILTER(?l_shipdate < '1995-01-01'^^xsd:date).
  FILTER(?l_discount >= 0.06 && ?l_discount <= 0.08).
  FILTER(?l_quantity < 25).
}

Volume Shipping

The query below provides information to help in the re-negotiation of shipping contracts. The query returns the gross discounted revenues derived from line items for which parts were shipped from a supplier in either nation to a customer in the other nation during 1995 and 1996. The results list the supplier nation, the customer nation, the year, and the revenue from the shipments for that year.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?supp_nation ?cust_nation ?l_year (SUM(?volume) AS ?revenue)
FROM <ghib>
WHERE { 
  {
    SELECT (?n_name1 AS ?supp_nation) (?n_name2 AS ?cust_nation)
           (year(?l_shipdate) AS ?l_year) (?l_extendedprice * (1 - ?l_discount) AS ?volume)
    WHERE { 
            { 
              ?item :l_supplier ?supplier ;
                    :l_shipdate ?l_shipdate ;
                    :l_extendedprice ?l_extendedprice ;
                    :l_discount ?l_discount ;
                    :l_order ?order .
              ?supplier :s_nation/:n_name ?n_name1 .
            }
            { ?order :o_customer/:c_nation/:n_name ?n_name2 . }
            FILTER((?n_name1 = 'CHINA' && ?n_name2 = 'RUSSIA') ||
                  (?n_name1 = 'RUSSIA' && ?n_name2 = 'CHINA')) .
            FILTER(?l_shipdate >= '1995-01-01'^^xsd:date &&
                   ?l_shipdate <= '1996-12-31'^^xsd:date) .
            }
     }
}
GROUP BY ?supp_nation ?cust_nation ?l_year
ORDER BY ?supp_nation ?cust_nation ?l_year

National Market Share

The query below determines how a nation's market share for a certain part type has changed over two years. The query shows results for the 1995 and 1996 market share of suppliers in Russia who supply the "Large Plated Tin" part.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?o_year
       (SUM(CASE WHEN ?nation = 'RUSSIA' THEN ?volume
        ELSE 0
        END) / SUM(?volume) AS ?mkt_share)
FROM <ghib>
WHERE {
{
    SELECT (YEAR(?o_orderdate) AS ?o_year)
    (?l_extendedprice * (1 - ?l_discount) AS ?volume)
    (?n_name2 AS ?nation)
    WHERE {
{ 
    ?item :l_order ?order ;
          :l_supplier ?supplier ;
          :l_extendedprice ?l_extendedprice ;
          :l_discount ?l_discount ;
          :l_part ?part .
    ?order :o_customer ?customer ;
           :o_orderdate ?o_orderdate .
    FILTER(?o_orderdate >= '1995-01-01'^^xsd:date &&
           ?o_orderdate <= '1996-12-31'^^xsd:date) .
}
{ 
    ?supplier :s_nation/:n_name ?n_name2 . 
} 
    ?part :p_type 'LARGE PLATED TIN' .
    ?customer :c_nation ?nation1 .
    ?nation1 :r_name 'EUROPE' ;
             :n_name ?n_name1 .
  }
 }
}
GROUP BY ?o_year
ORDER BY ?o_year

Product Type Profit Measure

The query below determines how much profit is made for a given line of parts by supplier per year. The query returns the supplier nation, order year, and total profit for a line of parts that includes the text "mint" in the name.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?nation ?o_year (SUM(?amount) AS ?sum_profit)
FROM <ghib>
WHERE {
  {
    SELECT ?nation (YEAR(?o_orderdate) AS ?o_year)
           ((?l_extendedprice * (1 - ?l_discount) - ?s_supplycost * ?l_quantity) AS ?amount)
    WHERE { 
       ?order :o_orderdate ?o_orderdate .
       {
        { 
          ?supplier :s_nation/:n_name ?nation .
          << ?supplier :s_part ?part >> :s_supplycost ?s_supplycost .
        }
       ?part :p_name ?p_name .
       FILTER(REGEX(?p_name,".*mint.*")) . 
       ?item :l_supplier ?supplier ;
             :l_part ?part ;
             :l_order ?order ;
             :l_extendedprice ?l_extendedprice ;
             :l_discount ?l_discount ;
             :l_quantity ?l_quantity .
     }
   }
 }
}
GROUP BY ?nation ?o_year
ORDER BY ?nation DESC(?o_year)

Returned Item Reporting

The query below determines the amount of revenue lost in one quarter due to customer returns. The query reports the top 20 customers (in terms of lost revenue) who returned parts that were ordered between 1/1/1994 and 4/1/1994 and displays the amount of revenue that was lost from each customer. The results display the customer key, name, account balance, phone number, address, nation, and comment details as well as the amount of lost revenue.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT 
  (STRAFTER(STR(?customer),'customer_') AS ?c_custkey)
  ?c_name (SUM(?l_extendedprice * (1 - ?l_discount)) AS ?revenue)
  ?c_acctbal ?n_name ?c_address ?c_phone ?c_comment
FROM <ghib>
WHERE {
  ?customer :c_name ?c_name ;
            :c_acctbal ?c_acctbal ;
            :c_address ?c_address ;
            :c_phone ?c_phone ;
            :c_comment ?c_comment ;
            :c_nation/:n_name ?n_name .
  ?order :o_orderdate ?o_orderdate ;
         :o_customer ?customer .
  FILTER(?o_orderdate >= '1994-01-01'^^xsd:date) .
  FILTER(?o_orderdate < '1994-04-01'^^xsd:date) .
  ?item :l_order ?order ;
        :l_returnflag 'R' ;
        :l_discount ?l_discount ;
        :l_extendedprice ?l_extendedprice .
}
GROUP BY ?c_custkey ?c_name ?c_acctbal ?c_phone ?n_name ?c_address ?c_comment
ORDER BY DESC(?revenue)
LIMIT 20

Important Stock Identification

The query below identifies the parts that suppliers in a given nation have in stock that represent the most significant percentage of the total value of all available parts. The query displays results for suppliers in Saudi Arabia and lists the part number and the value of those parts in descending order of value.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT (STRAFTER(STR(?part),'part_') AS ?s_partkey) ?value
FROM <ghib>
WHERE {
  {
    SELECT ?part (SUM(?s_supplycost * ?s_availqty) AS ?value)
    WHERE { 
      ?supplier :s_nation/:n_name 'SAUDI ARABIA' .
      <<?supplier :s_part ?part>> :s_supplycost ?s_supplycost .
      <<?supplier :s_part ?part>> :s_availqty ?s_availqty .
  }
  GROUP BY ?part
  }
  {
    SELECT (SUM(?s_supplycost * ?s_availqty) * 0.0001000000 AS ?supply_cost)
    WHERE {
      ?supplier :s_nation/:n_name 'SAUDI ARABIA' .
      <<?supplier :s_part ?part>> :s_supplycost ?s_supplycost .
      <<?supplier :s_part ?part>> :s_availqty ?s_availqty .
  }
 }
 FILTER(?value > ?supply_cost)
}
ORDER BY DESC(?value)

Shipping Modes and Order Priority

The query below determines whether using less expensive shipping methods cause critical-priority orders to arrive to customers late.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?l_shipmode 
       (SUM(CASE WHEN ?o_orderpriority = '1-URGENT' || ?o_orderpriority = '2-HIGH'
        THEN 1
        ELSE 0
        END) AS ?high_line_count)
       (SUM(CASE WHEN ?o_orderpriority != '1-URGENT' && ?o_orderpriority != '2-HIGH'
        THEN 1 
        ELSE 0 
        END) AS ?low_line_count)
FROM <ghib>
WHERE { 
  ?item :l_shipmode ?l_shipmode ;
        :l_commitdate ?l_commitdate ;
        :l_receiptdate ?l_receiptdate ;
        :l_shipdate ?l_shipdate ;
        :l_order/:o_orderpriority ?o_orderpriority .
  FILTER(?l_shipmode in ('SHIP','RAIL')) .
  FILTER(?l_commitdate < ?l_receiptdate) .
  FILTER(?l_shipdate < ?l_commitdate) .
  FILTER(?l_receiptdate >= '1997-01-01'^^xsd:date)  .
  FILTER(?l_receiptdate < '1998-01-01'^^xsd:date) .
}
GROUP BY ?l_shipmode
ORDER BY ?l_shipmode

Customer Distribution

The query below reports on relationships between customers and their orders. The results display the distribution of customers by the number of orders they have placed.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?c_count (COUNT(*) AS ?custdist)
FROM <ghib> 
WHERE { 
    SELECT ?customer (COUNT(?order) AS ?c_count)
    WHERE {
      ?customer :c_nation ?nation .
      OPTIONAL {
         ?order :o_customer ?customer ;
                :o_comment ?o_comment .
      FILTER(!REGEX(?o_comment,'.*pending.*deposits.*'))
      }
      }
      GROUP BY ?customer
}
GROUP BY ?c_count
ORDER BY DESC(?custdist) DESC(?c_count)

Promotion Effect

The query below displays the revenue that results from the market response to a promotion.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT (100.00 * SUM(CASE WHEN (REGEX(?p_type,'PROMO.*'))
        THEN (?l_extendedprice * (1 - ?l_discount)) 
        ELSE 0
        END) / SUM(?l_extendedprice * (1 - ?l_discount)) AS ?promo_revenue)
FROM <ghib>
WHERE {
  ?item :l_shipdate ?l_shipdate ;
        :l_extendedprice ?l_extendedprice ;
        :l_discount ?l_discount ;
        :l_part ?part .
  FILTER(?l_shipdate >= '1997-07-01'^^xsd:date &&
  ?l_shipdate < '1997-08-01'^^xsd:date) .
  ?part :p_type ?p_type .
}

Top Supplier

The query below finds the supplier who contributed the most overall revenue for parts shipped between 12/1/1994 and 3/1/1995. The results list the supplier's key, name, address, phone number, and total revenue.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT (STRAFTER(STR(?supplier),'supplier_') AS ?s_suppkey) ?s_name ?s_address
       ?s_phone ?total_revenue
FROM <ghib>
WHERE { ?supplier :s_name ?s_name ;
                  :s_address ?s_address ;
                  :s_phone ?s_phone .
  { SELECT ?supplier (SUM(?l_extendedprice * (1 - ?l_discount)) AS ?total_revenue)
    WHERE { ?item :l_supplier ?supplier ;
                  :l_extendedprice ?l_extendedprice;
                  :l_discount ?l_discount;
                  :l_shipdate ?l_shipdate .
          FILTER (?l_shipdate >= '1994-12-01'^^xsd:date &&
                  ?l_shipdate < '1995-03-01'^^xsd:date)
  }
  GROUP BY ?supplier
  } .
  { SELECT (MAX(?total_revenue) AS ?max_total_revenue)
    WHERE { { SELECT ?supplier (SUM(?l_extendedprice * (1 - ?l_discount)) AS ?total_revenue)
              WHERE { ?item :l_supplier ?supplier ;
                            :l_extendedprice ?l_extendedprice;
                            :l_discount ?l_discount;
                            :l_shipdate ?l_shipdate .
                    FILTER (?l_shipdate >= '1994-12-01'^^xsd:date &&
                            ?l_shipdate < '1995-03-01'^^xsd:date)
  }
  GROUP BY ?supplier
  }
 }
}
FILTER(?total_revenue = ?max_total_revenue)
}
ORDER BY ?s_suppkey

Parts/Supplier Relationship

The query delow determines the number of suppliers who can supply parts with certain attributes.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?p_brand ?p_type ?p_size (COUNT(DISTINCT ?supplier) AS ?supplier_cnt)
FROM <ghib>
WHERE {
  ?supplier :s_part ?part .
  ?part :p_brand ?p_brand ;
        :p_type ?p_type ;
        :p_size ?p_size .
  FILTER(?p_brand != 'Brand#23') .
  FILTER(!REGEX(?p_type,'MEDIUM BURNISHED.*')) .
  FILTER(?p_size IN (35, 40, 47, 9, 20, 3, 6, 27)) .
  FILTER (NOT EXISTS {
      SELECT ?supplier 
      WHERE { 
        ?supplier :s_comment ?s_comment .
        FILTER(REGEX(?s_comment,'.*Customer.*Complaints')) .
      }
   }
 )
}
GROUP BY ?p_brand ?p_type ?p_size
ORDER BY DESC(?supplier_cnt) ?p_brand ?p_type ?p_size
LIMIT 200

Small-Quantity-Order Revenue

The query below helps determine whether it is beneficial to eliminate orders for small quantities of certain parts. The query calculates the average amount of yearly revenue that would be lost by concentrating sales on large shipments only.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ((SUM(?l_extendedprice)) / 7.0 AS ?avg_yearly)
FROM <ghib>
WHERE {
   { 
     ?item :l_part ?part ;
           :l_extendedprice ?l_extendedprice ;
           :l_quantity ?l_quantity .
     ?part :p_brand 'Brand#32' ;
           :p_container 'JUMBO JAR' .
   }
   { SELECT ?part (0.2 * AVG(?l_quantity) AS ?avg_qty)
     WHERE { ?part :p_brand 'Brand#32' ;
                   :p_container 'JUMBO JAR' .
             ?item :l_part ?part ;
                   :l_quantity ?l_quantity .
   }
   GROUP BY ?part
 }
 FILTER(?l_quantity < ?avg_qty) .
}

Large Volume Customer

The query below ranks customers who place large volume orders.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?c_name (STRAFTER(STR(?customer),'customer_') AS ?c_custkey)
       (STRAFTER(STR(?order),'order_') AS ?o_orderkey) ?o_orderdate
       ?o_totalprice (SUM(?l_quantity) AS ?sum_quantity)
FROM <ghib>
WHERE { 
   { SELECT ?order
     WHERE { ?item :l_order ?order ;
                   :l_quantity ?l_quantity .
       }
       GROUP BY ?order
       HAVING (SUM(?l_quantity) > 313)
   }
   ?customer :c_name ?c_name .
   ?order :o_customer ?customer ;
          :o_orderdate ?o_orderdate ;
          :o_totalprice ?o_totalprice .
   ?item :l_quantity ?l_quantity ;
         :l_order ?order .
}
GROUP BY ?c_name ?customer ?order ?o_orderdate ?o_totalprice
ORDER BY DESC(?o_totalprice) ?o_orderdate
LIMIT 100

Discounted Revenue

The query below reports the gross discounted revenue attributed to the sale of selected parts.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT (SUM(?l_extendedprice * (1 - ?l_discount)) AS ?revenue)
FROM <ghib>
WHERE { 
  ?part :p_brand ?p_brand ;
        :p_container ?p_container ;
        :p_size ?p_size .
  ?item  :l_part ?part ;
         :l_extendedprice ?l_extendedprice ;
         :l_discount ?l_discount ;
         :l_quantity ?l_quantity ;
         :l_shipmode ?l_shipmode ;
         :l_shipinstruct ?l_shipinstruct .
  FILTER( ( ?p_brand = 'Brand#44'
            && ?p_container IN ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
            && ?l_quantity >=10 && ?l_quantity <= 10+10
            && ?p_size >= 1 && ?p_size <= 5
            && ?l_shipmode IN ('AIR', 'AIR REG')
            && ?l_shipinstruct = 'DELIVER IN PERSON'
         ) ||
         ( ?p_brand = 'Brand#51'
           && ?p_container IN ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
           && ?l_quantity >=15 && ?l_quantity <= 15+10
           && ?p_size >= 1&& ?p_size <=10
           && ?l_shipmode IN ('AIR', 'AIR REG')
           && ?l_shipinstruct = 'DELIVER IN PERSON'
         ) ||
         ( ?p_brand = 'Brand#25'
           && ?p_container IN ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
           && ?l_quantity >= 25 && ?l_quantity <= 25 + 10
           && ?p_size >= 1 && ?p_size <=15
           && ?l_shipmode IN ('AIR', 'AIR REG')
           && ?l_shipinstruct = 'DELIVER IN PERSON'
     )
   ) .
}

Potential Part Promotion

The query below identifies suppliers in a particular nation who have selected parts that may be good candidates for a promotional offer.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?s_name ?s_address
FROM <ghib>
WHERE { 
  ?supplier :s_name ?s_name;
            :s_address ?s_address;
            :s_nation ?nation .
  ?nation :n_name 'BRAZIL' .
  { 
    SELECT DISTINCT ?supplier
    WHERE { 
      << ?supplier :s_part ?part >> :s_availqty ?s_availqty .
      ?part :p_name ?p_name .
      FILTER(REGEX(?p_name, '^royal.*')) .
      { 
        SELECT ?part ?supplier (0.5 * SUM(?l_quantity) AS ?sum_quantity)
        WHERE { 
          ?item :l_part ?part ;
                :l_supplier ?supplier ;
                :l_quantity ?l_quantity;
                :l_shipdate ?l_shipdate .
          FILTER(?l_shipdate >= '1997-01-01'^^xsd:date && ?l_shipdate < '1998-01-01'^^xsd:date).
        }
        GROUP BY ?part ?supplier 
        }
      FILTER(?s_availqty > ?sum_quantity ) .
   }
 }
}
ORDER BY ?s_name

Suppliers Who Kept Orders Waiting

The query below identifies suppliers who did not ship required parts on time.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?s_name (COUNT(*) AS ?numwait)
FROM <ghib>
WHERE {
  ?supplier1 :s_name ?s_name ;
             :s_nation ?nation .
  ?nation :n_name 'UNITED KINGDOM' .
  ?item1 :l_supplier ?supplier1 ;
         :l_order ?order1 ;
         :l_receiptdate ?l_receiptdate1 ;
         :l_commitdate ?l_commitdate1 .
  FILTER(?l_receiptdate1 > ?l_commitdate1) .
  ?order1 :o_orderstatus 'F' . 
  FILTER(EXISTS { ?item2 :l_order ?order1 ; :l_supplier ?supplier2 .
  FILTER(?supplier1 != ?supplier2) . } ) .
  FILTER(NOT EXISTS { ?item3 :l_order ?order1 ; 
                             :l_supplier ?supplier3 ;
                             :l_receiptdate ?l_receiptdate3 ;
                             :l_commitdate ?l_commitdate3 .
                      FILTER(?supplier1 != ?supplier3) .
                      FILTER(?l_receiptdate3 > ?l_commitdate3) . } ) .
}
GROUP BY ?s_name
ORDER BY DESC(?numwait) ?s_name
LIMIT 100

Global Sales Opportunity

The query below identifies geographies where customers might be likely to make a purchase. The query reports on customers in seven specific countries (as identified by country code) who have not placed orders in seven years but who still have a positive account balance that is greater than the average amount.

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX    : <ghib:tpch/>
SELECT ?cntrycode (COUNT(*) AS ?numcust) (SUM(?c_acctbal) AS ?totacctbal)
FROM <ghib>
WHERE { 
  SELECT ?cntrycode ?c_acctbal 
  WHERE { 
    customer :c_phone ?c_phone ;
             :c_acctbal ?c_acctbal .
    BIND(SUBSTR(?c_phone,1,2) AS ?cntrycode)
    FILTER(?cntrycode in ('22', '28', '43', '20', '33', '34', '23')) .
    FILTER(?c_acctbal > ?avgaccbal) .
{ SELECT (AVG(?c_acctbal) AS ?avgaccbal)
  WHERE {
    ?customer :c_acctbal ?c_acctbal ;
              :c_phone ?c_phone .
    FILTER(?c_acctbal > 0.00) .
    FILTER(SUBSTR(?c_phone,1,2) in ('22', '28', '43', '20', '33', '34', '23')) .
    }
  }
  FILTER (NOT EXISTS { ?order :o_customer ?customer } ) .
 }
}
GROUP BY ?cntrycode
ORDER BY ?cntrycode
Related Topics