SELECT

The SELECT statement specifies columns that you want to retrieve from a table.

SELECT [ <qualifier> ] <select_expressions>
    [{TO | INTO} <variable_comma_list>]
    [FROM <extended_table_specification_comma_list>]
    [WHERE <search_condition>]
    [GROUP BY {col_name | expr} [ASC | DESC], ...[WITH ROLLUP]]
    [HAVING  <search_condition> ]
    [ORDER BY {col_name | expr} [ASC | DESC], ... [NULLS {FIRST | LAST}]
    [LIMIT [offset,] row_count]
    [USING INDEX { index_name [,index_name, ...] | NONE }]
    [FOR UPDATE [OF <spec_name_comma_list>]]

    <qualifier> ::= ALL | DISTINCT | DISTINCTROW | UNIQUE

    <select_expressions> ::= * | <expression_comma_list> | *, <expression_comma_list>

    <variable_comma_list> ::= [:] identifier, [:] identifier, ...

    <extended_table_specification_comma_list> ::=
        <table_specification>   [
                                    {, <table_specification> } ... |
                                    <join_table_specification> ... |
                                    <join_table_specification2> ...
                                ]

<table_specification> ::=
    <single_table_spec> [<correlation>] |
    <metaclass_specification> [ <correlation> ] |
    <subquery> <correlation> |
    TABLE ( <expression> ) <correlation>

<correlation> ::= [AS] <identifier> [(<identifier_comma_list>)]

<single_table_spec> ::= [ONLY] <table_name> |
                      ALL <table_name> [ EXCEPT <table_name> ]

<metaclass_specification> ::= CLASS <class_name>

<join_table_specification> ::=
    {
        [INNER | {LEFT | RIGHT} [OUTER]] JOIN |
        STRAIGHT_JOIN
    } <table_specification> ON <search_condition>

<join_table_specification2> ::=
    {
        CROSS JOIN |
        NATURAL [ LEFT | RIGHT ] JOIN
    } <table_specification>
  • qualifier: A qualifier. When omitted, it is set to ALL.

    • ALL: Retrieves all records of the table.
    • DISTINCT: Retrieves only records with unique values without allowing duplicates. DISTINCT, DISTINCTROW, and UNIQUE are used interchangeably.
  • <select_expressions>

    • *: By using SELECT * statement, you can retrieve all columns from the table specified in the FROM clause.

    • expression_comma_list: expression can be a path expression (ex.: tbl_name.col_name), variable or table name. All general expressions including arithmetic operations can also be used. Use a comma (,) to separate each expression in the list. You can specify aliases by using the AS keyword for columns or expressions to be queried. Specified aliases are used as column names in GROUP BY, HAVING and ORDER BY clauses. The position index of a column is assigned based on the order in which the column was specified. The starting value is 1.

      As AVG, COUNT, MAX, MIN, or SUM, an aggregate function that manipulates the retrieved data can also be used in the expression.

  • table_name.*: Specifies the table name and using * has the same effect as specifying all columns for the given table.

  • variable_comma_list: The data retrieved by the select_expressions can be stored in more than one variable.

  • [:]identifier: By using the :identifier after TO (or INTO), you can store the data to be retrieved in the ':identifier' variable.

  • <single_table_spec>

    • If a superclass name is specified after the ONLY keyword, only the superclass, not the subclass inheriting from it, is selected.
    • If a superclass name is specified after the ALL keyword, the superclass as well as the subclass inheriting from it are both selected.
    • You can define the list of subclass not to be selected after the EXCEPT keyword.

The following example shows how to retrieve host countries of the Olympic Games without any duplicates. This example is performed on the olympic table of demodb. The DISTINCT or UNIQUE keyword makes the query result unique. For example, when there are multiple olympic records of which each host_nation value is 'Greece', you can use such keywords to display only one value in the query result.

SELECT DISTINCT host_nation
FROM olympic;
  host_nation
======================
  'Australia'
  'Belgium'
  'Canada'
  'Finland'
  'France'
...

The following example shows how to define an alias to a column to be queried and sort the result record by using the column alias in the ORDER BY clause. At this time, the number of the result records is limited to 5 by using the LIMIT clause.

SELECT host_year as col1, host_nation as col2
FROM olympic
ORDER BY col2 LIMIT 5;
         col1  col2
===================================
         2000  'Australia'
         1956  'Australia'
         1920  'Belgium'
         1976  'Canada'
         1948  'England'
SELECT CONCAT(host_nation, ', ', host_city) AS host_place
FROM olympic
ORDER BY host_place LIMIT 5;
  host_place
======================
  'Australia,  Melbourne'
  'Australia,  Sydney'
  'Belgium,  Antwerp'
  'Canada,  Montreal'
  'England,  London'

FROM Clause

The FROM clause specifies the table in which data is to be retrieved in the query. If no table is referenced, the FROM clause can be omitted. Retrieval paths are as follows:

  • Single table
  • Subquery
  • Derived table
SELECT [<qualifier>] <select_expressions>
[
    FROM <table_specification> [ {, <table_specification> | <join_table_specification> }... ]
]

<select_expressions> ::= * | <expression_comma_list> | *, <expression_comma_list>

<table_specification> ::=
    <single_table_spec> [<correlation>] |
    <metaclass_specification> [<correlation>] |
    <subquery> <correlation> |
    TABLE (<expression>) <correlation>

<correlation> ::= [AS] <identifier> [(<identifier_comma_list>)]

<single_table_spec> ::= [ONLY] <table_name> |
                      ALL <table_name> [EXCEPT <table_name>]

<metaclass_specification> ::= CLASS <class_name>
  • <select_expressions>: One or more columns or expressions to query is specified. Use * to query all columns in the table. You can also specify an alias for a column or an expression to be queried by using the AS keyword. This keyword can be used in GROUP BY, HAVING and ORDER BY clauses. The position index of the column is given according to the order in which the column was specified. The starting value is 1.
  • <table_specification>: At least one table name is specified after the FROM clause. Subqueries and derived tables can also be used in the FROM clause. For details on subquery derived tables, see Subquery Derived Table.
--FROM clause can be omitted in the statement
SELECT 1+1 AS sum_value;
    sum_value
=============
            2
SELECT CONCAT('CUBRID', '2008' , 'R3.0') AS db_version;
  db_version
======================
  'CUBRID2008R3.0'

Derived Table

In the query statement, subqueries can be used in the table specification of the FROM clause. Such subqueries create derived tables where subquery results are treated as tables. A correlation specification must be used when a subquery that creates a derived table is used.

Derived tables are also used to access the individual element of an attribute that has a set value. In this case, an element of the set value is created as an instance in the derived table.

Subquery Derived Table

Each instance in the derived table is created from the result of the subquery in the FROM clause. A derived table created form a subquery can have any number of columns and records.

FROM (subquery) [AS] [derived_table_name [(column_name [{, column_name } ... ])]]
  • The number of column_name and the number of columns created by the subquery must be identical.
  • derived_table_name can be omitted.

The following example shows how to retrieve the sum of the number of gold (gold) medals won by Korea and that of silver medals won by Japan. This example shows a way of getting an intermediate result of the subquery and processing it as a single result, by using a derived table. The query returns the sum of the gold values whose nation_code is 'KOR' and the silver values whose nation_code column is 'JPN'.

SELECT SUM (n)
FROM (SELECT gold FROM participant WHERE nation_code = 'KOR'
      UNION ALL
      SELECT silver FROM participant WHERE nation_code = 'JPN') AS t(n);

Subquery derived tables can be useful when combined with outer queries. For example, a derived table can be used in the FROM clause of the subquery used in the WHERE clause. The following example shows nation_code, host_year and gold records whose number of gold medals is greater than average sum of the number of silver and bronze medals when one or more silver or bronze medals were won. In this example, the query (the outer SELECT clause) and the subquery (the inner SELECT clause) share the nation_code attribute.

SELECT nation_code, host_year, gold
FROM participant p
WHERE gold > (SELECT AVG(s)
              FROM (SELECT silver + bronze
                    FROM participant
                    WHERE nation_code = p.nation_code
                    AND silver > 0
                    AND bronze > 0)
                   AS t(s));
  nation_code      host_year      gold
=========================================
  'JPN'                2004         16
  'CHN'                2004         32
  'DEN'                1996          4
  'ESP'                1992         13

WHERE Clause

In a query, a column can be processed based on conditions. The WHERE clause specifies a search condition for data.

WHERE <search_condition>

    <search_condition> ::=
        <comparison_predicate>
        <between_predicate>
        <exists_predicate>
        <in_predicate>
        <null_predicate>
        <like_predicate>
        <quantified_predicate>
        <set_predicate>

The WHERE clause specifies a condition that determines the data to be retrieved by search_condition or a query. Only data for which the condition is true is retrieved for the query results. (NULL value is not retrieved for the query results because it is evaluated as unknown value.)

The logical operator AND or OR can be used for multiple conditions. If AND is specified, all conditions must be true. If OR is specified, only one needs to be true. If the keyword NOT is preceded by a condition, the meaning of the condition is reserved. The following table shows the order in which logical operators are evaluated.

Priority Operator Function
1 () Logical expressions in parentheses are evaluated first.
2 NOT Negates the result of the logical expression.
3 AND All conditions in the logical expression must be true.
4 OR One of the conditions in the logical expression must be true.

GROUP BY ... HAVING Clause

The GROUP BY clause is used to group the result retrieved by the SELECT statement based on a specific column. This clause is used to sort by group or to get the aggregation by group using the aggregation function. Herein, a group consists of records that have the same value for the column specified in the GROUP BY clause.

You can also set a condition for group selection by including the HAVING clause after the GROUP BY clause. That is, only groups satisfying the condition specified by the HAVING clause are queried out of all groups that are grouped by the GROUP BY clause.

By SQL standard, you cannot specify a column (hidden column) not defined in the GROUP BY clause to the SELECT column list. However, by using extended CUBRID grammars, you can specify the hidden column to the SELECT column list. If you do not use the extended CUBRID grammars, the only_full_group_by parameter should be set to yes. For details, see Statement/Type-Related Parameters.

SELECT ...
GROUP BY {col_name | expr | position} [ASC | DESC], ...
          [WITH ROLLUP] [HAVING <search_condition>]
  • col_name | expr | position: Specifies one or more column names, expressions, aliases or column location. Items are separated by commas. Columns are sorted on this basis.

  • [ASC | DESC]: Specifies the ASC or DESC sorting option after the columns specified in the GROUP BY clause. If the sorting option is not specified, the default value is ASC.

  • <search_condition>: Specifies the search condition in the HAVING clause. In the HAVING clause, you can refer to columns and aliases specified in the GROUP BY clause, or columns used in aggregate functions.

    Note

    Even the hidden columns not specified in the GROUP BY clause can be referred to, if the value of the only_full_group_by parameter is set to yes. At this time, the HAVING condition does not affect to the query result.

  • WITH ROLLUP: If you specify the WITH ROLLUP modifier in the GROUP BY clause, the aggregate information of the result value of each GROUPed BY column is displayed for each group, and the total of all result rows is displayed at the last row. When a WITH ROLLUP modifier is defined in the GROUP BY clause, the result value for all rows of the group is additionally displayed. In other words, total aggregation is made for the value aggregated by group. When there are two columns for Group By, the former is considered as a large unit and the latter is considered as a small unit, so the total aggregation row for the small unit and the total aggregation row for the large unit are added. For example, you can check the aggregation of the sales result per department and salesperson through one query.

-- creating a new table
CREATE TABLE sales_tbl
(dept_no INT, name VARCHAR(20), sales_month INT, sales_amount INT DEFAULT 100, PRIMARY KEY (dept_no, name, sales_month));

INSERT INTO sales_tbl VALUES
(201, 'George' , 1, 450), (201, 'George' , 2, 250), (201, 'Laura'  , 1, 100), (201, 'Laura'  , 2, 500),
(301, 'Max'    , 1, 300), (301, 'Max'    , 2, 300),
(501, 'Stephan', 1, 300), (501, 'Stephan', 2, DEFAULT), (501, 'Chang'  , 1, 150),(501, 'Chang'  , 2, 150),
(501, 'Sue'    , 1, 150), (501, 'Sue'    , 2, 200);

-- selecting rows grouped by dept_no
SELECT dept_no, avg(sales_amount)
FROM sales_tbl
GROUP BY dept_no;
      dept_no         avg(sales_amount)
=======================================
          201     3.250000000000000e+02
          301     3.000000000000000e+02
          501     1.750000000000000e+02
-- conditions in WHERE clause operate first before GROUP BY
SELECT dept_no, avg(sales_amount)
FROM sales_tbl
WHERE sales_amount > 100
GROUP BY dept_no;
      dept_no         avg(sales_amount)
=======================================
          201     4.000000000000000e+02
          301     3.000000000000000e+02
          501     1.900000000000000e+02
-- conditions in HAVING clause operate last after GROUP BY
SELECT dept_no, avg(sales_amount)
FROM sales_tbl
WHERE sales_amount > 100
GROUP BY dept_no HAVING avg(sales_amount) > 200;
      dept_no         avg(sales_amount)
=======================================
          201     4.000000000000000e+02
          301     3.000000000000000e+02
-- selecting and sorting rows with using column alias
SELECT dept_no AS a1, avg(sales_amount) AS a2
FROM sales_tbl
WHERE sales_amount > 200 GROUP
BY a1 HAVING a2 > 200
ORDER BY a2;
           a1                        a2
=======================================
          301     3.000000000000000e+02
          501     3.000000000000000e+02
          201     4.000000000000000e+02
-- selecting rows grouped by dept_no, name with WITH ROLLUP modifier
SELECT dept_no AS a1, name AS a2, avg(sales_amount) AS a3
FROM sales_tbl
WHERE sales_amount > 100
GROUP BY a1, a2 WITH ROLLUP;
           a1  a2                                          a3
=============================================================
          201  'George'                 3.500000000000000e+02
          201  'Laura'                  5.000000000000000e+02
          201  NULL                     4.000000000000000e+02
          301  'Max'                    3.000000000000000e+02
          301  NULL                     3.000000000000000e+02
          501  'Chang'                  1.500000000000000e+02
          501  'Stephan'                3.000000000000000e+02
          501  'Sue'                    1.750000000000000e+02
          501  NULL                     1.900000000000000e+02
         NULL  NULL                     2.750000000000000e+02

ORDER BY Clause

The ORDER BY clause sorts the query result set in ascending or descending order. If you do not specify a sorting option such as ASC or DESC, the result set in ascending order by default. If you do not specify the ORDER BY clause, the order of records to be queried may vary depending on query.

SELECT ...
ORDER BY {col_name | expr | position} [ASC | DESC], ...] [NULLS {FIRST | LAST}]
  • col_name | expr | position: Specifies a column name, expression, alias, or column location. One or more column names, expressions or aliases can be specified. Items are separated by commas. A column that is not specified in the list of SELECT columns can be specified.
  • [ASC | DESC]: ASC means sorting in ascending order, and DESC is sorting in descending order. If the sorting option is not specified, the default value is ASC.
  • [NULLS {FIRST | LAST}]: NULLS FIRST sorts NULL at first, NULLS LAST sorts NULL at last. If this syntax is omitted, ASC sorts NULL at first, DESC sorts NULL at last.
-- selecting rows sorted by ORDER BY clause
SELECT *
FROM sales_tbl
ORDER BY dept_no DESC, name ASC;
      dept_no  name                  sales_month  sales_amount
==============================================================
          501  'Chang'                         1           150
          501  'Chang'                         2           150
          501  'Stephan'                       1           300
          501  'Stephan'                       2           100
          501  'Sue'                           1           150
          501  'Sue'                           2           200
          301  'Max'                           1           300
          301  'Max'                           2           300
          201  'George'                        1           450
          201  'George'                        2           250
          201  'Laura'                         1           100
          201  'Laura'                         2           500
-- sorting reversely and limiting result rows by LIMIT clause
SELECT dept_no AS a1, avg(sales_amount) AS a2
FROM sales_tbl
GROUP BY a1
ORDER BY a2 DESC
LIMIT 3;
           a1           a2
=======================================
          201     3.250000000000000e+02
          301     3.000000000000000e+02
          501     1.750000000000000e+02

The following is an example how to specify the NULLS FIRST or NULLS LAST after ORDER BY clause.

CREATE TABLE tbl (a INT, b VARCHAR);

INSERT INTO tbl VALUES
(1,NULL), (2,NULL), (3,'AB'), (4,NULL), (5,'AB'),
(6,NULL), (7,'ABCD'), (8,NULL), (9,'ABCD'), (10,NULL);
SELECT * FROM tbl ORDER BY b NULLS FIRST;
            a  b
===================================
            1  NULL
            2  NULL
            4  NULL
            6  NULL
            8  NULL
           10  NULL
            3  'ab'
            5  'ab'
            7  'abcd'
            9  'abcd'
SELECT * FROM tbl ORDER BY b NULLS LAST;
            a  b
===================================
            3  'ab'
            5  'ab'
            7  'abcd'
            9  'abcd'
            1  NULL
            2  NULL
            4  NULL
            6  NULL
            8  NULL
           10  NULL

Note

Translation of GROUP BY alias

CREATE TABLE t1(a INT, b INT, c INT);
INSERT INTO t1 VALUES(1,1,1);
INSERT INTO t1 VALUES(2,NULL,2);
INSERT INTO t1 VALUES(2,2,2);

SELECT a, NVL(b,2) AS b
FROM t1
GROUP BY a, b;  -- Q1

When you run the above SELECT query, "GROUP BY a, b" is translated as:

  • "GROUP BY a, NVL(b, 2)"(alias name b) in 9.2 or before. The result is the same as Q2's result as below.

    SELECT a, NVL(b,2) AS bxxx
    FROM t1
    GROUP BY a, bxxx;  -- Q2
    
            a            b
    ======================
            1            1
            2            2
    
    • "GROUP BY a, b"(column name b) in 9.3 or higher. The result is the same as Q3's result as below.
    SELECT a, NVL(b,2) AS bxxx
    FROM t1
    GROUP BY a, b;  -- Q3
    
            a            b
    ======================
            1            1
            2            2
            2            2
    

LIMIT Clause

The LIMIT clause can be used to limit the number of records displayed. You can specify a very big integer for row_count to display to the last row, starting from a specific row. The LIMIT clause can be used as a prepared statement. In this case, the bind parameter (?) can be used instead of an argument.

INST_NUM () and ROWNUM cannot be included in the WHERE clause in a query that contains the LIMIT clause. Also, LIMIT cannot be used together with HAVING GROUPBY_NUM ().

LIMIT {[offset,] row_count | row_count [OFFSET offset]}

<offset> ::= <limit_expression>
<row_count> ::= <limit_expression>

<limit_expression> ::= <limit_term> | <limit_expression> + <limit_term> | <limit_expression> - <limit_term>
<limit_term> ::= <limit_factor> | <limit_term> * <limit_factor> | <limit_term> / <limit_factor>
<limit_factor> ::= <unsigned int> | <input_hostvar> | ( <limit_expression> )
  • offset: Specifies the offset of the starting row to be displayed. The offset of the starting row of the result set is 0; it can be omitted and the default value is 0. It can be one of unsigned int, a host variable or a simple expression.
  • row_count: Specifies the number of records to be displayed. It can be one of unsigned integer, a host variable or a simple expression.
-- LIMIT clause can be used in prepared statement
PREPARE stmt FROM 'SELECT * FROM sales_tbl LIMIT ?, ?';
EXECUTE stmt USING 0, 10;
-- selecting rows with LIMIT clause
SELECT *
FROM sales_tbl
WHERE sales_amount > 100
LIMIT 5;
      dept_no  name                  sales_month  sales_amount
==============================================================
          201  'George'                        1           450
          201  'George'                        2           250
          201  'Laura'                         2           500
          301  'Max'                           1           300
          301  'Max'                           2           300
-- LIMIT clause can be used in subquery
SELECT t1.*
FROM (SELECT * FROM sales_tbl AS t2 WHERE sales_amount > 100 LIMIT 5) AS t1
LIMIT 1,3;

-- above query and below query shows the same result
SELECT t1.*
FROM (SELECT * FROM sales_tbl AS t2 WHERE sales_amount > 100 LIMIT 5) AS t1
LIMIT 3 OFFSET 1;
      dept_no  name                  sales_month  sales_amount
==============================================================
          201  'George'                        2           250
          201  'Laura'                         2           500
          301  'Max'                           1           300
-- LIMIT clause allows simple expressions for both offset and row_count
SELECT *
FROM sales_tbl
WHERE sales_amount > 100
LIMIT ? * ?, (? * ?) + ?;

Join Query

A join is a query that combines the rows of two or more tables or virtual tables (views). In a join query, a condition that compares the columns that are common in two or more tables is called a join condition. Rows are retrieved from each joined table, and are combined only when they satisfy the specified join condition.

A join query using an equality operator (=) is called an equi-join, and one without any join condition is called a cartesian product. Meanwhile, joining a single table is called a self join. In a self join, table ALIAS is used to distinguish columns, because the same table is used twice in the FROM clause.

A join that outputs only rows that satisfy the join condition from a joined table is called an inner or a simple join, whereas a join that outputs both rows that satisfy and do not satisfy the join condition from a joined table is called an outer join.

An outer join is divided into a left outer join which outputs all rows of the left table as a result(outputs NULL when the right table's columns don't match conditions), a right outer join which outputs all rows of the right table as a result(outputs NULL when the left table's columns don't match conditions) and a full outer join which outputs all rows of both tables. If there is no column value that corresponds to a table on one side in the result of an outer join query, all rows are returned as NULL.

FROM <table_specification> [{, <table_specification>
    | { <join_table_specification> | <join_table_specification2> } ...]

<table_specification> ::=
    <single_table_spec> [<correlation>] |
    <metaclass_specification> [<correlation>] |
    <subquery> <correlation> |
    TABLE (<expression>) <correlation>

<join_table_specification> ::=
    {
        [INNER | {LEFT | RIGHT} [OUTER]] JOIN |
        STRAIGHT_JOIN
     } <table_specification> ON <search_condition>

<join_table_specification2> ::=
            {
        CROSS JOIN |
        NATURAL [ LEFT | RIGHT ] JOIN
    } <table_specification>
  • <join_table_specification>
    • [INNER] JOIN: Used for inner join and requires join conditions.
    • {LEFT | RIGHT} [OUTER] JOIN: LEFT is used for a left outer join query, and RIGHT is for a right outer join query.
    • STRAIGHT_JOIN: (on changing)
  • <join_table_specification2>
    • CROSS JOIN: Used for cross join and requires no join conditions.
    • NATURAL [LEFT | RIGHT] JOIN: Used for natural join and join condition is not used. It operates in the equivalent same way to have a condition between columns equivalent of the same name .

Inner Join

The inner join requires join conditions. The INNER JOIN keyword can be omitted. When it is omitted, the table is separated by a comma (,). The ON join condition can be replaced with the WHERE condition.

The following example shows how to retrieve the years and host countries of the Olympic Games since 1950 where a world record has been set. The following query retrieves instances whose values of the host_year column in the history table are greater than 1950. The following two queries output the same result.

SELECT DISTINCT h.host_year, o.host_nation
FROM history h INNER JOIN olympic o ON h.host_year = o.host_year AND o.host_year > 1950;

SELECT DISTINCT h.host_year, o.host_nation
FROM history h, olympic o
WHERE h.host_year = o.host_year AND o.host_year > 1950;
    host_year  host_nation
===================================
         1968  'Mexico'
         1980  'U.S.S.R.'
         1984  'United States of America'
         1988  'Korea'
         1992  'Spain'
         1996  'United States of America'
         2000  'Australia'
         2004  'Greece'

Outer Join

CUBRID does not support full outer joins; it supports only left and right joins. Path expressions that include subqueries and sub-columns cannot be used in the join conditions of an outer join.

Join conditions of an outer join are specified in a different way from those of an inner join. In an inner join, join conditions can be expressed in the WHERE clause; in an outer join, they appear after the ON keyword within the FROM clause. Other retrieval conditions can be used in the WHERE or ON clause, but the retrieval result depends on whether the condition is used in the WHERE or ON clause.

The table execution order is fixed according to the order specified in the FROM clause. Therefore, when using an outer join, you should create a query statement in consideration of the table order. It is recommended to use standard statements using { LEFT | RIGHT } [ OUTER ] JOIN, because using an Oracle-style join query statements by specifying an outer join operator (+) in the WHERE clause, even if possible, might lead the execution result or plan in an unwanted direction.

The following example shows how to retrieve the years and host countries of the Olympic Games since 1950 where a world record has been set, but including the Olympic Games where any world records haven't been set in the result. This example can be expressed in the following right outer join query. In this example, all instances whose values of the host_year column in the history table are not greater than 1950 are also retrieved. All instances of host_nation are included because this is a right outer join. host_year that does not have a value is represented as NULL.

SELECT DISTINCT h.host_year, o.host_year, o.host_nation
FROM history h RIGHT OUTER JOIN olympic o ON h.host_year = o.host_year
WHERE o.host_year > 1950;
    host_year    host_year  host_nation
================================================
         NULL         1952  'Finland'
         NULL         1956  'Australia'
         NULL         1960  'Italy'
         NULL         1964  'Japan'
         NULL         1972  'Germany'
         NULL         1976  'Canada'
         1968         1968  'Mexico'
         1980         1980  'USSR'
         1984         1984  'USA'
         1988         1988  'Korea'
         1992         1992  'Spain'
         1996         1996  'USA'
         2000         2000  'Australia'
         2004         2004  'Greece'

A right outer join query can be converted to a left outer join query by switching the position of two tables in the FROM clause. The right outer join query in the previous example can be expressed as a left outer join query as follows:

SELECT DISTINCT h.host_year, o.host_year, o.host_nation
FROM olympic o LEFT OUTER JOIN history h ON h.host_year = o.host_year
WHERE o.host_year > 1950;
    host_year    host_year  host_nation
================================================
         NULL         1952  'Finland'
         NULL         1956  'Australia'
         NULL         1960  'Italy'
         NULL         1964  'Japan'
         NULL         1972  'Germany'
         NULL         1976  'Canada'
         1968         1968  'Mexico'
         1980         1980  'USSR'
         1984         1984  'USA'
         1988         1988  'Korea'
         1992         1992  'Spain'
         1996         1996  'USA'
         2000         2000  'Australia'
         2004         2004  'Greece'

Outer joins can also be represented by using (+) in the WHERE clause. The above example is a query that has the same meaning as the example using the LEFT OUTER JOIN. The (+) syntax is not ISO/ANSI standard, so it can lead to ambiguous situations. It is recommended to use the standard syntax LEFT OUTER JOIN (or RIGHT OUTER JOIN) if possible.

SELECT DISTINCT h.host_year, o.host_year, o.host_nation
FROM history h, olympic o
WHERE o.host_year = h.host_year(+) AND o.host_year > 1950;
    host_year    host_year  host_nation
================================================
         NULL         1952  'Finland'
         NULL         1956  'Australia'
         NULL         1960  'Italy'
         NULL         1964  'Japan'
         NULL         1972  'Germany'
         NULL         1976  'Canada'
         1968         1968  'Mexico'
         1980         1980  'USSR'
         1984         1984  'USA'
         1988         1988  'Korea'
         1992         1992  'Spain'
         1996         1996  'USA'
         2000         2000  'Australia'
         2004         2004  'Greece'

In the above examples, h.host_year=o.host_year is an outer join condition, and o.host_year > 1950 is a search condition. If the search condition is not written in the WHERE clause but in the ON clause, the meaning and the result will be different. The following query also includes instances whose values of o.host_year are not greater than 1950.

SELECT DISTINCT h.host_year, o.host_year, o.host_nation
FROM olympic o LEFT OUTER JOIN history h ON h.host_year = o.host_year AND o.host_year > 1950;
    host_year    host_year  host_nation
================================================
         NULL         1896  'Greece'
         NULL         1900  'France'
         NULL         1904  'USA'
         NULL         1908  'United Kingdom'
         NULL         1912  'Sweden'
         NULL         1920  'Belgium'
         NULL         1924  'France'
         NULL         1928  'Netherlands'
         NULL         1932  'USA'
         NULL         1936  'Germany'
         NULL         1948  'England'
         NULL         1952  'Finland'
         NULL         1956  'Australia'
         NULL         1960  'Italy'
         NULL         1964  'Japan'
         NULL         1972  'Germany'
         NULL         1976  'Canada'
         1968         1968  'Mexico'
         1980         1980  'USSR'
         1984         1984  'USA'
         1988         1988  'Korea'
         1992         1992  'Spain'
         1996         1996  'USA'
         2000         2000  'Australia'
         2004         2004  'Greece'

In the above example, LEFT OUTER JOIN should attach all rows to the result rows even if the left table's rows do not match to the condition; therefore, the left table's condition, "AND o.host_year > 1950" is ignored. But "WHERE o.host_year > 1950" is applied after the join operation is completed. Please consider that a condition after ON clause and a condition after WHERE clause can be applied differently in OUTER JOIN.

Cross Join

The cross join is a cartesian product, meaning that it is a combination of two tables, without any condition. For the cross join, the CROSS JOIN keyword can be omitted. When it is omitted, the table is separated by a comma (,).

The following example shows how to write cross join.

SELECT DISTINCT h.host_year, o.host_nation
FROM history h CROSS JOIN olympic o;

SELECT DISTINCT h.host_year, o.host_nation
FROM history h, olympic o;

The above two queries output the same results.

    host_year  host_nation
===================================
         1968  'Australia'
         1968  'Belgium'
         1968  'Canada'
         1968  'England'
         1968  'Finland'
         1968  'France'
         1968  'Germany'
...
         2004  'Spain'
         2004  'Sweden'
         2004  'USA'
         2004  'USSR'
         2004  'United Kingdom'

144 rows selected. (1.283548 sec) Committed.

Natural Join

When column names to be joined to each table are the same, that is, when you want to grant equivalent conditions between each column with the same name, a natural join, which can replace inner/outer join, can be used.

CREATE TABLE t1 (a int, b1 int);
CREATE TABLE t2 (a int, b2 int);

INSERT INTO t1 values(1,1);
INSERT INTO t1 values(3,3);
INSERT INTO t2 values(1,1);
INSERT INTO t2 values(2,2);

The below is an example of running NATURAL JOIN.

SELECT /*+ RECOMPILE*/ *
FROM t1 NATURAL JOIN t2;

Running the above query is the same as running the below query, and they display the same result.

SELECT /*+ RECOMPILE*/ *
FROM t1 INNER JOIN t2 ON t1.a=t2.a;
        a           b1            a           b2
================================================
        1            1            1            1

The below is an example of running NATURAL LEFT JOIN.

SELECT /*+ RECOMPILE*/ *
FROM t1 NATURAL LEFT JOIN t2;

Running the above query is the same as running the below query, and they display the same result.

SELECT /*+ RECOMPILE*/ *
FROM t1 LEFT JOIN t2 ON t1.a=t2.a;
            a           b1            a           b2
====================================================
            1            1            1            1
            3            3         NULL         NULL

The below is an example of running NATURAL RIGHT JOIN.

SELECT /*+ RECOMPILE*/ *
FROM t1 NATURAL RIGHT JOIN t2;

Running the above query is the same as running the below query, and they display the same result.

SELECT /*+ RECOMPILE*/ *
FROM t1 RIGHT JOIN t2 ON t1.a=t2.a;
            a           b1            a           b2
====================================================
            1            1            1            1
         NULL         NULL            2            2

(on changing)

STRAIGHT_JOIN

CUBRIDSUS-12814

Subquery

A subquery can be used wherever expressions such as SELECT or WHERE clause can be used. If the subquery is represented as an expression, it must return a single column; otherwise it can return multiple rows. Subqueries can be divided into single-row subquery and multiple-row subquery depending on how they are used.

Single-Row Subquery

A single-row subquery outputs a row that has a single column. If no row is returned by the subquery, the subquery expression has a NULL value. If the subquery is supposed to return more than one row, an error occurs.

The following example shows how to retrieve the history table as well as the host country where a new world record has been set. This example shows a single-row subquery used as an expression. In this example, the subquery returns host_nation values for the rows whose values of the host_year column in the olympic table are the same as those of the host_year column in the history table. If there are no values that meet the condition, the result of the subquery is NULL.

SELECT h.host_year, (SELECT host_nation FROM olympic o WHERE o.host_year=h.host_year) AS host_nation,
       h.event_code, h.score, h.unit
FROM history h;
    host_year  host_nation            event_code  score                 unit
============================================================================================
    2004       'Greece'               20283       '07:53.0'             'time'
    2004       'Greece'               20283       '07:53.0'             'time'
    2004       'Greece'               20281       '03:57.0'             'time'
    2004       'Greece'               20281       '03:57.0'             'time'
    2004       'Greece'               20281       '03:57.0'             'time'
    2004       'Greece'               20281       '03:57.0'             'time'
    2004       'Greece'               20326       '210'                 'kg'
    2000       'Australia'            20328       '225'                 'kg'
    2004       'Greece'               20331       '237.5'               'kg'
...

Multiple-Row Subquery

The multiple-row subquery returns one or more rows that contain the specified column. The result of the multiple-row subquery can create SET, MULTISET and LIST) by using an appropriate keyword.

The following example shows how to retrieve nations, capitals and host cities for Olympic Game all together in the nation table. In this example, the subquery result is used to create a List from the values of the host_city column in the olympic table. This query returns name and capital value for nation table, as well as a set that contains host_city values of the olympic table with host_nation value. If the name value is an empty set in the query result, it is excluded. If there is no olympic table that has the same value as the name, an empty set is returned.

SELECT name, capital, list(SELECT host_city FROM olympic WHERE host_nation = name) AS host_cities
FROM nation;
  name                      capital                 host_cities
==================================================================
  'Somalia'                   'Mogadishu'           {}
  'Sri Lanka'                 'Sri Jayewardenepura Kotte' {}
  'Sao Tome & Principe'       'Sao Tome'            {}
  ...
  'U.S.S.R.'                  'Moscow'              {'Moscow'}
  'Uruguay'                   'Montevideo'          {}
  'United States of America'  'Washington.D.C'      {'Atlanta ', 'St. Louis', 'Los Angeles', 'Los Angeles'}
  'Uzbekistan'                'Tashkent'            {}
  'Vanuatu'                   'Port Vila'           {}

Such multiple-row subquery expressions can be used anywhere a collection-type value expression is allowed. However, they cannot be used where a collection-type constant value is required as in the DEFAULT specification in the class attribute definition.

If the ORDER BY clause is not used explicitly in the subquery, the order of the multiple-row query result is not set. Therefore, the order of the multiple-row subquery result that creates LIST must be specified by using the ORDER BY clause.

VALUES

The VALUES clause prints out the values of rows defined in the expression. In most cases, the VALUES clause is used for creating a constant table, however, the clause itself can be used. When one or more rows are specified in the VALUES clause, all rows should have the same number of the elements.

VALUES (expression[, ...])[, ...]
  • expression : An expression enclosed within parentheses stands for one row in a table.

The VALUES clause can be used to express the UNION ALL query, which consists of constant values in a simpler way. For example, the following query can be executed.

VALUES (1 AS col1, 'first' AS col2), (2, 'second'), (3, 'third'), (4, 'fourth');

The above query prints out the following result.

SELECT 1 AS col1, 'first' AS col2
UNION ALL
SELECT 2, 'second'
UNION ALL
SELECT 3, 'third'
UNION ALL
SELECT 4, 'fourth';

The following example shows use of the VALUES clause with multiple rows in the INSERT statement.

INSERT INTO athlete (code, name, gender, nation_code, event)
VALUES ('21111', 'Jang Mi-Ran ', 'F', 'KOR', 'Weight-lifting'),
       ('21112', 'Son Yeon-Jae ', 'F', 'KOR', 'Rhythmic gymnastics');

The following example shows how to use subquery in the FROM statement.

SELECT a.*
FROM athlete a, (VALUES ('Jang Mi-Ran', 'F'), ('Son Yeon-Jae', 'F')) AS t(name, gender)
WHERE a.name=t.name AND a.gender=t.gender;
         code  name                gender   nation_code        event
=====================================================================================================
        21111  'Jang Mi-Ran'       'F'      'KOR'              'Weight-lifting'
        21112  'Son Yeon-Jae'      'F'      'KOR'              'Rhythmic gymnastics'

FOR UPDATE

The FOR UPDATE clause can be used in SELECT statements for locking rows returned by the statement for a later UPDATE/DELETE.

SELECT ... [FOR UPDATE [OF <spec_name_comma_list>]]

    <spec_name_comma_list> ::= <spec_name> [, <spec_name>, ... ]
        <spec_name> ::= table_name | view_name
  • <spec_name_comma_list>: A list of table/view names referenced from the FROM clause.

Only table/view referenced in <spec_name_comma_list> will be locked. If the <spec_name_comma_list> is missing but FOR UPDATE is present then we assume that all tables/views from the FROM clause of the SELECT statement are referenced. Rows are locked using X_LOCK.

Note

Restrictions

  • It cannot be used in subqueries (but it can reference subqueries).
  • It cannot be used in a statement that has GROUP BY, DISTINCT or aggregate functions.
  • It cannot reference UNIONs.

The following shows how to use SELECT ... FOR UPDATE statements.

CREATE TABLE t1(i INT);
INSERT INTO t1 VALUES (1), (2), (3), (4), (5);

CREATE TABLE t2(i INT);
INSERT INTO t2 VALUES (1), (2), (3), (4), (5);
CREATE INDEX idx_t2_i ON t2(i);

CREATE VIEW v12 AS SELECT t1.i AS i1, t2.i AS i2 FROM t1 INNER JOIN t2 ON t1.i=t2.i;

SELECT * FROM t1 ORDER BY 1 FOR UPDATE;
SELECT * FROM t1 ORDER BY 1 FOR UPDATE OF t1;
SELECT * FROM t1 INNER JOIN t2 ON t1.i=t2.i ORDER BY 1 FOR UPDATE OF t1, t2;

SELECT * FROM t1 INNER JOIN (SELECT * FROM t2 WHERE t2.i > 0) r ON t1.i=r.i WHERE t1.i > 0 ORDER BY 1 FOR UPDATE;

SELECT * FROM v12 ORDER BY 1 FOR UPDATE;
SELECT * FROM t1, (SELECT * FROM v12, t2 WHERE t2.i > 0 AND t2.i=v12.i1) r WHERE t1.i > 0 AND t1.i=r.i ORDER BY 1 FOR UPDATE OF r;