Monday, April 3, 2017

Amazon RedShift: Key highlights on "Temporary Tables" and "Derived Tables'

Important points to remember about Temporary Tables are

  • Temporary tables are visible only within the current session. They are automatically dropped at the end of the session.
  • If we begin the table name with #, it will automatically creates a temporary table.
  • Operations on temporary tables
    • create table #employee_temp ...  
      • This sql automatically creates employee_temp as temporary table due to # at the front of the table name.
    • insert into #employee_temp select * from employee; 
      • # is required at the front of the temporary table name while inserting.
    • create temp table employee_temp like employee;
      • This command creates a temporary table employee_temp. It inherits its columns, Distkey ,sortkey and NOT NULL from employee table.
    • create temp table employee_temp as select * from employee;
      • This command creates a temporary table employee_temp. The temporary table only inherits the column names.
    • create temp table employee_temp as select firstname,lastname from employee;
      • Creates temporary table employee_temp that has firstname and lastname from employee.
    • create temp table emp_temp distkey(emp_id) as select * from employee;
    • create temp table emp_temp diststyle even as select * from employee;
    • create temp table emp_temp diststyle even sortkey(emp_name) as select * from employee;
  • A derived table lasts the life of a query but temporary table lasts the life of a session.
  • A temporary table can have the same name as permanent table,but not recommended.
  • All Users by default has temporary table creation privilege. To remove, revoke TEMP privilege from PUBLIC and grant TEMP privilege to specific user/group.
  • Deep Copy:
    • Deep copy operation recreates and repopulates the table using bulk insert  and also automatically sorts the table.
    • If a table has large unsorted region, deep copy is a preferred method compared to vacuum to sort the table.
    • 4 Methods to perform deep copy:
      • Use original DDL command - CREATE TABLE command. We can specify all PK, FK, distkey, sortkey in DDL command.
      • CREATE TABLE LIKE - The new table doesn't inherit PK and FK. It only inherits distkey, sortkey and not null attributes from parent table.
      • CTAS - The new table will not inherit PK, FK, not null, distkey, sortkey from parent table.
      • Create temporary table - If we need to retain all the attributes of the original table, then we have to create temporary table using CTAS command. Then, truncate parent table and then insert into parent table from temporary table.
    • During deep copy operation concurrent updates are not allowed whereas its allowed during vacuum.
    • v_generate_tbl_ddl script is used to generate the table DDL. 
  • Derived Tables:
    • Exists only within query.
    • Analogous to subquery concept in RDBMS world.
    • Derived table lives in memory.
    • Derived table must be given name.
    • All columns must be given name in the derived table. 
    • Derived table is materialized by a select statement inside the query and exists between open and closed parenthesis.
    • Different ways to define derived tables:
      • select * from (select dept,avg(salary) from employee) as salavg(dept, avgsal);
        • salavg is the name given to derived table. dept, avgsal are aliases defined externally to dept, avg(salary) columns.
      • select * from (select dept,avg(salary) avgsal from employee) as salavg;
        • Salavg is the name given to derived table. dept column is not aliased because it can default to normal column and avgsal alias is defined within the derived table query.
      • WITH salavg(avgsal) as (select avg(salary) from employee) select * from salavg;
        • Using with command we create derived table with column alias before running the main query.
      • WITH emp as (select * from employee) select * from emp;
        • Derived table emp selects all rows and columns from main table employee. 
      • The following query creates two derived tables using WITH command and then performs a join.
                              WITH emp(deptid,avgsal) as (select deptid, avg(sal) from employee) ,
                                      dept as (select deptid,deptname from department)
                              select emp.deptid, dept.deptname, avgsal from emp inner join dept
                              on emp.deptid = dept.deptid;          
    • The derived table is built first while executing a query.

Check my post on Key highlights on Explain.

No comments:

Post a Comment

Amazon Athena: Key highlights on Amazon Athena

Amazon Athena is a serverless interactive query service that is used to analyze data in Amazon S3 using standard SQL. Amazon Athena app...