2017年11月23日木曜日

[PsotgreSQL]JDBC経由でサーバサイドでの解析結果を使わないようにする

PostgreSQLにPreparedStatementを使用して繰り返しSQLを実行すると、効率化のためにサーバサイドでの解析結果が使用されるようになる。
これによって、条件に埋め込む値が変わっても、実行計画が作り直されないので想定外の実行計画になってしまう場合があるらしい。

動きを確認するために、以下のデータを持つテーブルを使って検証してみる。
select flg, count(*) from test group by flg;
 flg |  count  
-----+---------
 0   | 1000000
 1   |       1
(2 rows)


条件に固定値を指定した場合の実行計画は、"1"を指定した場合はインデックスが使われ"0"を指定した場合はフルスキャンとなる。
#EXPLAIN select * from test where flg = '1';
                              QUERY PLAN                              
----------------------------------------------------------------------
 Index Scan using test_idx on test  (cost=0.42..4.44 rows=1 width=17)
   Index Cond: (flg = '1'::bpchar)
(2 rows)

# EXPLAIN select * from test where flg = '0';
                          QUERY PLAN                           
---------------------------------------------------------------
 Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17)
   Filter: (flg = '0'::bpchar)
(2 rows)


これをPreparedStatementを使って実行する。
最初の10回は条件に"0"を指定しているのでフルスキャンとなり、最後実行では条件に"1"を指定しているのでインデックスが使われるはず。
try (Connection connection = dataSource.getConnection()) {
    try (Statement statement = connection.createStatement()) {
        statement.execute("load 'auto_explain'");
        statement.execute("set auto_explain.log_min_duration = 0");
        statement.execute("set auto_explain.log_analyze = true");
    }
    try (PreparedStatement statement = connection.prepareStatement("select * from test where flg = ?")) {
        for (int i = 0; i < 10; i++) {
            statement.setString(1, "0");
            statement.executeQuery();
        }
    }
    try (PreparedStatement statement = connection.prepareStatement("select * from test where flg = ?")) {
        statement.setString(1, "1");
        statement.executeQuery();
    }
}

実際の実行結果はこのようになる。
最後のSQL実行は、固定値を指定した時とは異なりインデックスが使われずフルスキャンとなる。
デフォルトだと5回繰り返してPreparedStatementが使われるとサーバサイドの解析結果が使われるらしいけど、Filterのflg=の部分が固定値じゃなくなったのは5回目じゃないが不思議…実行計画からは判断できないのだろうか。
デフォルト値から変えたい場合は、接続プロパティのprepareThresholdを使う。
LOG:  duration: 880.555 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.011..250.647 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 965.254 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.013..263.084 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 713.326 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.015..216.970 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 603.937 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.010..189.935 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 565.579 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.008..175.277 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 582.891 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.007..183.025 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 604.002 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.010..187.921 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 590.842 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.007..185.131 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 589.730 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.007..183.538 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 610.742 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.009..192.248 rows=1000000 loops=1)
          Filter: (flg = ($1)::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 114.769 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=114.755..114.756 rows=1 loops=1)
          Filter: (flg = ($1)::bpchar)
          Rows Removed by Filter: 1000000


サーバサイドの解析を使わないようにするには、接続プロパティのprepareThresholdに0を指定する。
例えば、「jdbc:postgresql://localhost/db?prepareThreshold=0」みたいに指定する。
prepareThreshold=0を指定した結果は、上の結果とは異なり最後の実行がインデックススキャンとなっている。
G:  duration: 813.314 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.011..248.204 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 769.790 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.013..236.008 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 663.801 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.013..198.553 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 562.532 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.009..177.537 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 599.577 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.009..188.422 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 611.355 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.008..192.423 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 626.638 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.010..197.469 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 639.909 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.012..200.366 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 590.332 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.007..187.930 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 587.067 ms  plan:
        Query Text: select * from test where flg = $1
        Seq Scan on test  (cost=0.00..18860.01 rows=1000001 width=17) (actual time=0.012..185.482 rows=1000000 loops=1)
          Filter: (flg = '0'::bpchar)
          Rows Removed by Filter: 1
LOG:  duration: 0.018 ms  plan:
        Query Text: select * from test where flg = $1
        Index Scan using test_idx on test  (cost=0.42..4.44 rows=1 width=17) (actual time=0.014..0.014 rows=1 loops=1)
          Index Cond: (flg = '1'::bpchar)

試した環境
  • サーバ:9.5
  • JDBC:org.postgresql:postgresql:42.1.4