Running queries on DBF files using OLEDB (VFPOLEDB) is too slow

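I am developing an interface that displays data from DBF files in a WinForms application. I started with OdbcConnection. It worked, but because of certain limitations of the Visual FoxPro driver (subqueries are not supported) I switched to OLEDB (VFPOLEDB). Now I can run complex queries, but a new difficulty has appeared that I have to resolve: the queries are too slow, 100 times slower than expected.

Below is demo code. There is a DBF table 'PROD'. The indexed field PRICE_N is used in the WHERE clause of the query. The table is on the same PC the application runs on. As you can see, the time spent running the query via ODBC (Microsoft Visual FoxPro Driver) and via OLEDB (VFPOLEDB) differs greatly.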

        TimeSpan timeSpanODBC;
        DateTime timeODBC = DateTime.Now;

        OdbcConnection odbcConnection = new OdbcConnection(@"Driver={Microsoft Visual FoxPro Driver};SourceType=DBF;SourceDB=C:\Users\Vakshul\Documents\dbfs;Exclusive=No;Collate=Machine;NULL=NO;DELETED=NO;BACKGROUNDFETCH=NO;");
        odbcConnection.Open();
        OdbcCommand odbcCommand = new OdbcCommand("SELECT utk_ved FROM prod WHERE (price_n='641857')", odbcConnection);
        odbcCommand.ExecuteScalar();
        timeSpanODBC = DateTime.Now - timeODBC;
        double timeOdbcEqual = timeSpanODBC.TotalMilliseconds;
        System.Console.WriteLine("Time spent via ODBC(milliseconds) using '=' to compare - {0}", timeOdbcEqual.ToString());


        timeODBC = DateTime.Now;

        odbcCommand = new OdbcCommand("SELECT utk_ved FROM prod WHERE (price_n like'641857')", odbcConnection);
        odbcCommand.ExecuteScalar();
        timeSpanODBC = DateTime.Now - timeODBC;
        double timeOdbcLike = timeSpanODBC.TotalMilliseconds;
        System.Console.WriteLine("Time spent via ODBC(milliseconds) using 'Like' to compare - {0}", timeOdbcLike.ToString());

        TimeSpan timeSpanOLEDB;
        DateTime timeOLEDB = DateTime.Now;

        OleDbConnection oleDbCon = new OleDbConnection(@"Provider=VFPOLEDB.1;Data Source=C:\Users\Vakshul\Documents\dbfs;Collating Sequence=MACHINE;Mode=Read");
        oleDbCon.Open();
        OleDbCommand oleDbcommand = new OleDbCommand("SELECT utk_ved FROM prod WHERE (price_n = '641857')", oleDbCon);
        oleDbcommand.ExecuteScalar();
        timeSpanOLEDB = DateTime.Now - timeOLEDB;
        double timeOLEDBEqual = timeSpanOLEDB.TotalMilliseconds;
        System.Console.WriteLine("Time spent via OLEDB(milliseconds) using '=' to compare - {0}", timeOLEDBEqual.ToString());

        timeOLEDB = DateTime.Now;

        oleDbcommand = new OleDbCommand("SELECT utk_ved FROM prod WHERE (price_n like '641857')", oleDbCon);
        oleDbcommand.ExecuteScalar();
        timeSpanOLEDB = DateTime.Now - timeOLEDB;
        double timeOLEDLike = timeSpanOLEDB.TotalMilliseconds;
        System.Console.WriteLine("Time spent via OLEDB(milliseconds) using 'Like' to compare - {0}", timeOLEDLike.ToString());

        System.Console.WriteLine("ODBC is faster than OLEDB {0} times using '=' to compare", Math.Round(timeOLEDBEqual / timeOdbcEqual, 0));
        System.Console.WriteLine("ODBC is faster than OLEDB {0} times using 'Like' to compare", Math.Round(timeOLEDLike / timeOdbcLike, 0));

Console, after the first run:

Time spent via ODBC(milliseconds) using '=' to compare - 5,0006
Time spent via ODBC(milliseconds) using 'Like' to compare - 3,5005
Time spent via OLEDB(milliseconds) using '=' to compare - 1630,207
Time spent via OLEDB(milliseconds) using 'Like' to compare - 1755,2228
ODBC is faster than OLEDB 326 times using '=' to compare
ODBC is faster than OLEDB 326 times using 'Like' to compare

Console, after the second run:
Time spent via ODBC(milliseconds) using '=' to compare - 4,5006
Time spent via ODBC(milliseconds) using 'Like' to compare - 4,5005
Time spent via OLEDB(milliseconds) using '=' to compare - 1526,1938
Time spent via OLEDB(milliseconds) using 'Like' to compare - 1595,2026
ODBC is faster than OLEDB 339 times using '=' to compare
ODBC is faster than OLEDB 339 times using 'Like' to compare

Console, after the third run:
Time spent via ODBC(milliseconds) using '=' to compare - 4,0005
Time spent via ODBC(milliseconds) using 'Like' to compare - 3,0004
Time spent via OLEDB(milliseconds) using '=' to compare - 1449,184
Time spent via OLEDB(milliseconds) using 'Like' to compare - 1451,1843
ODBC is faster than OLEDB 362 times using '=' to compare
ODBC is faster than OLEDB 362 times using 'Like' to compare

Console, after the fourth run:
Time spent via ODBC(milliseconds) using '=' to compare - 3,5004
Time spent via ODBC(milliseconds) using 'Like' to compare - 4,5006
Time spent via OLEDB(milliseconds) using '=' to compare - 1475,6874
Time spent via OLEDB(milliseconds) using 'Like' to compare - 1621,2059
ODBC is faster than OLEDB 422 times using '=' to compare
ODBC is faster than OLEDB 422 times using 'Like' to compare

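In this example, the indexed field PRICE_N is used in the WHERE clause of the query. I also tested the same queries with a non-indexed field in the WHERE clause instead of the indexed one. The result was the same: ~1400 to 1600 milliseconds. My impression is that the index is not used in the OLEDB (VFPOLEDB) case. I am not satisfied with this result; I need the index to be used.

If anyone has any suggestions, I would be grateful.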

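Just wondering... is your "Price_n" column a NUMERIC or a STRING column? If it is numeric, then I wonder whether VFP OleDb is trying to convert all the numbers to their STRING equivalent for the test, instead of converting the quoted string to the numeric equivalent of Price_n's data type, as I would expect, since that is the type the natural index is in.

If so, try changing your WHERE clauses respectively to

WHERE price_n = 641857

WHERE price_n LIKE 641857

But if the column is numeric, LIKE does not really apply, because a number is a number and not a partial string match like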

"1" LIKE "1"
"1" LIKE "10"
"1" LIKE "19302"... etc where they all start with a same value

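@Sergiy, there is a big difference in what you are doing. There is no ODBC driver for versions after VFP6 (that is, 2.5 or 2.6 is the last package that includes the ODBC driver). IOW, ODBC supports only the VFP6 engine. OTOH, VFPOLEDB supports the VFP9 engine (and all the added SQL capabilities, with bells and whistles).

Between those engines, there was an issue that makes queries on text fields slow: if the OS codepage differs from the table's codepage and the search is done on an index whose expression is of character type, then the index is not used and a table scan is done instead. This "bug" surfaced after the initial release of VFP9 and has not been corrected, AFAIK.

As for = vs LIKE: LIKE is implicitly ANSI, so it behaves like <g> using the == operator (exact match). With =, if ANSI is off, partial matches evaluate as true.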
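
To make the "=" behaviour concrete, here is a toy model in C#. It is not the VFP engine; it only assumes what the SET ANSI documentation describes: with ANSI off the two strings are compared up to the end of the shorter one, and with ANSI on the shorter one is padded with blanks first. The class and method names are made up for illustration.

```csharp
using System;

// Toy model of the comparison rules discussed above; NOT the VFP engine.
static class VfpCompareModel
{
    // "=" with SET ANSI OFF (assumed rule): compare only up to the
    // length of the shorter string, so a prefix counts as a match.
    public static bool EqualsAnsiOff(string left, string right)
    {
        int n = Math.Min(left.Length, right.Length);
        return string.CompareOrdinal(left, 0, right, 0, n) == 0;
    }

    // "=" with SET ANSI ON (assumed rule): pad the shorter string with
    // blanks to the longer length, then compare every character.
    public static bool EqualsAnsiOn(string left, string right)
    {
        int n = Math.Max(left.Length, right.Length);
        return string.Equals(left.PadRight(n), right.PadRight(n), StringComparison.Ordinal);
    }

    static void Main()
    {
        Console.WriteLine(EqualsAnsiOff("641857", "6418"));    // True  (partial match)
        Console.WriteLine(EqualsAnsiOn("641857", "6418"));     // False (full comparison)
        Console.WriteLine(EqualsAnsiOn("641857  ", "641857")); // True  (trailing blanks padded)
    }
}
```

In this model a LIKE without wildcards would correspond to the full comparison, which matches the "exact match" remark above.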

PS: Even after correcting the codepage, VFPOLEDB is slightly slower, but negligibly so.

Here are my timings with your code, on a test table with 1,000,000 rows:

Time spent via ODBC(milliseconds) using '=' to compare - 41.0023
Time spent via ODBC(milliseconds) using 'Like' to compare - 0
Time spent via OLEDB(milliseconds) using '=' to compare - 68.0038
Time spent via OLEDB(milliseconds) using 'Like' to compare - 2.0002
ODBC is faster than OLEDB 2 times using '=' to compare
ODBC is faster than OLEDB 2 times using 'Like' to compare

These are the timings after the second run:

Time spent via ODBC(milliseconds) using '=' to compare - 1
Time spent via ODBC(milliseconds) using 'Like' to compare - 1.0001
Time spent via OLEDB(milliseconds) using '=' to compare - 3.0001
Time spent via OLEDB(milliseconds) using 'Like' to compare - 0
ODBC is faster than OLEDB 3 times using '=' to compare
ODBC is faster than OLEDB 3 times using 'Like' to compare
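
The 0 readings above are probably a resolution artifact: DateTime.Now only advances with the system timer, so very short calls can appear to take no time at all. A minimal sketch of the same measurement with System.Diagnostics.Stopwatch follows. The QueryTimer helper is a made-up name, and the Thread.Sleep call is just a placeholder for the real query.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class QueryTimer
{
    // Runs the action once and returns the elapsed wall-clock time in milliseconds.
    public static double TimeMs(Action action)
    {
        Stopwatch sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        // Placeholder workload; in the test program this lambda would wrap
        // a call such as odbcCommand.ExecuteScalar().
        double ms = TimeMs(() => Thread.Sleep(20));
        Console.WriteLine("Elapsed (milliseconds) - {0:F3}", ms);
    }
}
```

With a helper like this, each of the four measurements becomes a single call, and the ratios can be computed from values that are not rounded down to zero.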

@Cetin Basoz,

Between those engines, there was an issue that makes queries on text fields slow: if the OS codepage differs from the table's codepage and the search is done on an index whose expression is of character type, then the index is not used and a table scan is done instead. This "bug" surfaced after the initial release of VFP9 and has not been corrected, AFAIK.

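You have grasped the crux of the problem!

I did not know about this quirk. I decided to check whether it is really so. If your assumption is correct, then the reason for the low speed is that the codepages of the DBF file and of my OS differ. To test this, I installed Visual FoxPro 9 (I had never dealt with it before) and transferred all the data to a new table. Then I opened the table in Table Designer and created a regular index on the PRICE_N field. As a result, the codepages of the new table and of my OS became the same.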
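
For anyone who wants to check a table's codepage without installing VFP, here is a rough sketch. It assumes the usual DBF layout, where byte 29 of the header holds the code page mark, and it maps only a handful of common marks. The DbfCodePage class is a made-up name, and the culture's ANSI code page is used only as an approximation of the "OS codepage".

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

static class DbfCodePage
{
    // Partial map of code page marks (header byte 29) to code pages.
    static readonly Dictionary<byte, int> Marks = new Dictionary<byte, int>
    {
        { 0x01, 437 }, { 0x02, 850 }, { 0x03, 1252 }, { 0x64, 852 },
        { 0x65, 866 }, { 0xC8, 1250 }, { 0xC9, 1251 }
    };

    // Returns the table's code page, or null when the mark is 0 or not in the partial map.
    public static int? Read(string dbfPath)
    {
        using (FileStream fs = File.OpenRead(dbfPath))
        using (BinaryReader reader = new BinaryReader(fs))
        {
            byte[] header = reader.ReadBytes(32);
            if (header.Length < 32) throw new InvalidDataException("DBF header is too short.");
            int cp;
            return Marks.TryGetValue(header[29], out cp) ? (int?)cp : null;
        }
    }

    static void Main(string[] args)
    {
        // Hypothetical usage: pass the path to the .dbf file as the first argument.
        if (args.Length == 0) { Console.WriteLine("usage: DbfCodePage <file.dbf>"); return; }
        int? tableCp = Read(args[0]);
        int osCp = CultureInfo.CurrentCulture.TextInfo.ANSICodePage;
        Console.WriteLine("Table code page: {0}, OS ANSI code page: {1}",
            tableCp.HasValue ? tableCp.Value.ToString() : "unknown", osCp);
    }
}
```

If the two numbers differ and the index expression is of character type, that is the situation described in the quoted explanation.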

Then I ran the test again. The results changed dramatically.

After the first run:

Time spent via ODBC(milliseconds) using '=' to compare - 12,5016
Time spent via ODBC(milliseconds) using 'Like' to compare - 3,5005
Time spent via OLEDB(milliseconds) using '=' to compare - 20,0025
Time spent via OLEDB(milliseconds) using 'Like' to compare - 3,0004
ODBC is faster than OLEDB 2 times using '=' to compare
ODBC is faster than OLEDB 2 times using 'Like' to compare

After the second run:

Time spent via ODBC(milliseconds) using '=' to compare - 3,0004
Time spent via ODBC(milliseconds) using 'Like' to compare - 2,5003
Time spent via OLEDB(milliseconds) using '=' to compare - 11,0014
Time spent via OLEDB(milliseconds) using 'Like' to compare - 3,5005
ODBC is faster than OLEDB 4 times using '=' to compare
ODBC is faster than OLEDB 4 times using 'Like' to compare

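Thank you, Cetin Basoz. Great comment :)

Although I am not allowed to change the codepage of the production DBF files, at least now that I know what is going on, I can breathe a sigh of relief.

@Sergiy, I may have a solution for you: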

string sqlEq = "SELECT utk_ved FROM prod WHERE Price_N = '641857'";
string sqlLike = "SELECT utk_ved FROM prod WHERE Price_N like '641857'";

TimeSpan timeSpanODBC;
DateTime timeODBC = DateTime.Now;

OdbcConnection odbcConnection = new OdbcConnection(@"Driver={Microsoft Visual FoxPro Driver};SourceType=DBF;SourceDB=C:\Users\Vakshul\Documents\dbfs;Exclusive=No;Collate=Machine;NULL=NO;DELETED=NO;BACKGROUNDFETCH=NO;");
odbcConnection.Open();
OdbcCommand odbcCommand = new OdbcCommand(sqlEq, odbcConnection);
odbcCommand.ExecuteScalar();
timeSpanODBC = DateTime.Now - timeODBC;
double timeOdbcEqual = timeSpanODBC.TotalMilliseconds;
System.Console.WriteLine("Time spent via ODBC(milliseconds) using '=' to compare - {0}", timeOdbcEqual.ToString());


timeODBC = DateTime.Now;

odbcCommand = new OdbcCommand(sqlLike, odbcConnection);
odbcCommand.ExecuteScalar();
timeSpanODBC = DateTime.Now - timeODBC;
double timeOdbcLike = timeSpanODBC.TotalMilliseconds;
System.Console.WriteLine("Time spent via ODBC(milliseconds) using 'Like' to compare - {0}", timeOdbcLike.ToString());

TimeSpan timeSpanOLEDB;
DateTime timeOLEDB = DateTime.Now;

OleDbConnection oleDbCon = new OleDbConnection(@"Provider=VFPOLEDB.1;Data Source=C:\Users\Vakshul\Documents\dbfs;Collating Sequence=MACHINE;Mode=Read");
oleDbCon.Open();
new OleDbCommand("set enginebehavior 80", oleDbCon).ExecuteNonQuery();
OleDbCommand oleDbcommand = new OleDbCommand(sqlEq, oleDbCon);
oleDbcommand.ExecuteScalar();
timeSpanOLEDB = DateTime.Now - timeOLEDB;
double timeOLEDBEqual = timeSpanOLEDB.TotalMilliseconds;
System.Console.WriteLine("Time spent via OLEDB(milliseconds) using '=' to compare - {0}", timeOLEDBEqual.ToString());

timeOLEDB = DateTime.Now;

oleDbcommand = new OleDbCommand(sqlLike, oleDbCon);

oleDbcommand.ExecuteScalar();
timeSpanOLEDB = DateTime.Now - timeOLEDB;
double timeOLEDLike = timeSpanOLEDB.TotalMilliseconds;
System.Console.WriteLine("Time spent via OLEDB(milliseconds) using 'Like' to compare - {0}", timeOLEDLike.ToString());

Note this line, executed on the same connection:

new OleDbCommand("set enginebehavior 80", oleDbCon).ExecuteNonQuery();

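This will not affect your results but will get you what you want. Off the top of my head, the only place it might have an effect is "group by" queries. Assuming you don't write GROUP BY queries the old, buggy VFP way, you should be safe.

My timings without "Set EngineBehavior 80":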

Time spent via ODBC(milliseconds) using '=' to compare - 4.0002
Time spent via ODBC(milliseconds) using 'Like' to compare - 1.0001
Time spent via OLEDB(milliseconds) using '=' to compare - 352.0201
Time spent via OLEDB(milliseconds) using 'Like' to compare - 659.0377

And with "Set EngineBehavior 80":

Time spent via ODBC(milliseconds) using '=' to compare - 3.0001
Time spent via ODBC(milliseconds) using 'Like' to compare - 2.0002
Time spent via OLEDB(milliseconds) using '=' to compare - 15.0008
Time spent via OLEDB(milliseconds) using 'Like' to compare - 3.0002