blogger

    blogger
  • hot log

  • comment

    twitter

    • 空间续费,加上ssl连接。。

      2018-11-22 01:04

    • 12.10.xC6出来时候碰到"Could not determine encryption mode from page zero",竟果然是BUG!http://www-01.ibm.com/support/docview.wss?rs=630&uid=swg21973408

      2015-12-21 20:15

    • 空间已经转移到新加坡,希望会好一些。

      2015-11-16 14:34

    • 增加新域名liaosnet.cc,与liaosnet.com同指向。

      2015-10-09 15:25

    • 空间迁移到香港。或许速度会快一点。。

      2014-02-13 20:10

    • More»

    blog sort

    links

    record

    banner

    GBase 8t数据库SQL优化一例

    暗夜星空's Memory from Jul 14 , 2017 at 13:38 PM , under Category:IFX/GBase
    客户的数据库从Oracle迁移到GBase 8t,反应有个SQL语句变得很慢很慢。而在原Oracle环境下,执行只需要10秒钟。
    SQL语句如下:
    select count(*) 
    from TA_DOC doc left join TA_FORM_DATA t 
        on t.docid=doc.id
        and t.elementname = '纸质文件印数' 
    where isprint!=1 
        and doc.state!=2 
        and (select count(*) 
             from TA_TODO t 
             where currtype=4 
             and currdeptid=10000086  
             and curruserid is not null 
             and currusername is not null 
             and state=2 
             and mainsend!=4 
             and docid=doc.id)>0
    对该语句做sqexplain,发现用时达到80分钟,ta_doc扫了23881次(子查询select count(*) from ta_todo t),每次扫23881行,结果如下:
    Estimated Cost: 1561030
    Estimated # of Rows Returned: 1
          1) informix.doc: INDEX PATH
                Filters: ((informix.doc.state != 2.0000000000000000 AND informix.doc.isprint != 1.0000000000000000 ) AND  > 0 ) 
            (1) Index Name: informix. 202_417
                Index Keys: id   (Serial, fragments: ALL)
          2) informix.t: INDEX PATH
                Filters: informix.t.elementname = '纸质文件印数' 
            (1) Index Name: informix. 209_457
                Index Keys: docid fromid elementid   (Serial, fragments: ALL)
                Lower Index Filter: informix.t.docid = informix.doc.id 
        ON-Filters:(informix.t.docid = informix.doc.id AND informix.t.elementname = '纸质文件印数' ) 
        NESTED LOOP JOIN(LEFT OUTER JOIN)
        Subquery:
        ---------
        Estimated Cost: 805
        Estimated # of Rows Returned: 1
          1) informix.t: INDEX PATH
                Filters: (((((informix.t.docid = informix.doc.id AND informix.t.currtype = '4' ) AND informix.t.mainsend != 4.0000000000000000 ) AND informix.t.currusername IS NOT NULL ) AND informix.t.curruserid IS NOT NULL ) AND informix.t.currdeptid = 10000086 ) 
            (1) Index Name: informix.fk_todo_state
                Index Keys: state   (Serial, fragments: ALL)
                Lower Index Filter: informix.t.state = '2'
    Query statistics:
    -----------------
      Table map :
      ----------------------------
      Internal name     Table name
      ----------------------------
      t1                doc
      t2                t
      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t1     1275       6448      23881      81:32.02   1559152 
      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t2     1692       6810      12560      00:00.04   0       
      type     rows_prod  est_rows  time       est_cost
      -------------------------------------------------
      nljoin   1275       6449      81:32.05   1561031 
      type     rows_prod  est_rows  rows_cons  time
      -------------------------------------------------
      group    1          1         1275       81:32.05
    分析:程序员设计该业务逻辑时,并没有考虑到数据的增长情况,在数据量不断增大的情况下,ta_todo会执行的次数越来越多,越来越影响性能。从sqexplain看,虽然单一步花费的时间很快,但是执行的次数多了(23881次),总体时间就上来了。
    于是我们对该查询进行优化,子查询改为临时表操作,改为关联查询,即需要一次ta_doc与ta_form_data join操作,再与ta_todo查询结果的临时表进行关联查询,语句改为如下:
    select count(*) 
    from 
        (select doc.id as id 
         from TA_DOC doc left join TA_FORM_DATA t 
             on doc.id = t.docid 
             and t.elementname = '纸质文件印数' 
         where isprint!='1' 
             and doc.state!='2'
        ) t1,
        (select docid,count(*) as numofid
         from TA_TODO
         where currtype='4' 
             and currdeptid='10000086'  
             and curruserid is not null 
             and currusername is not null 
             and state='2' 
             and mainsend!='4'
         group by docid
        ) t2
    where t1.id = t2.docid
    使用sqexplain跟踪,发现有极大的改善:
    Estimated Cost: 4796
    Estimated # of Rows Returned: 1
      1) (Temp Table For Collection Subquery): SEQUENTIAL SCAN
      2) (Temp Table For Collection Subquery): SEQUENTIAL SCAN
    DYNAMIC HASH JOIN 
        Dynamic Hash Filters: (Temp Table For Collection Subquery).id = (Temp Table For Collection Subquery).docid 
    Query statistics:
    -----------------
      Table map :
      ----------------------------
      Internal name     Table name
      ----------------------------
      t1                doc
      t2                t
      t3                (Temp Table For Collection Subquery)
      t4                ta_todo
      t5                (Temp Table For Collection Subquery)
      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t1     23768      19344     23881      00:00.04   7357    
      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t2     4728       6810      67219      00:00.25   0       
      type     rows_prod  est_rows  time       est_cost
      -------------------------------------------------
      nljoin   23768      19345     00:00.18   12826   
      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t3     23768      19345     23768      00:00.00   780     
      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t4     1517       1955      2686       00:00.00   806     
      type     rows_prod  est_rows  rows_cons  time       est_cost
      ------------------------------------------------------------
      group    1290       1892      1517       00:00.00   2720    
      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t5     1290       1892      1290       00:00.00   78      
      type     rows_prod  est_rows  rows_bld  rows_prb  novrflo  time       est_cost
      ------------------------------------------------------------------------------
      hjoin    1275       366008    1290      23768     0        00:00.00   4796    
      type     rows_prod  est_rows  rows_cons  time
      -------------------------------------------------
      group    1          1         1275       00:00.00
    结果:将语句耗费的时间从4891秒,提升到0.3秒,效率提升1.6万倍;将同样的SQL优化到原Oracle环境下,也从10几秒,得升到0.3秒,效率也有36倍的提升。
    总结:Oracle的优化器对SQL的优化效率比较高,但也是有优化空间;GBase 8t/Informix的优化器相对保守(严格按照语句的要求进行优化),在SQL语句不怎么高效时,效率会比较慢。

    Tags: SQL优化

    Post a comment:

    Your email address will not be published. Required fields are marked﹡