比较向量中的元素时花费的时间呈指数增长

Question

我有一段代码如下所示：

void MonteCarloRainbowOptionFunction::ValueInstrument()
{
    std::vector<MJArray> correlatedNormVariates = GetArraysOfCorrelatedGauassiansByBoxMuller(numberOfPaths, covMatrix); 
    std::vector<StatisticAllPaths> thesePathGatherers;
    for (unsigned long i = 0; i < S_vect.size(); i++)
    {
        StandardExcerciseOption thisOption(ThePayOffVect[i], TTM);
        StatisticAllPaths onePathGatherer;
        thesePathGatherers.push_back(onePathGatherer);
        OneStepMonteCarloValuation(thisOption, S_vect[i], impvol_vect[i], r, d_vect[i], numberOfPaths, correlatedNormVariates[i], thesePathGatherers[i]);
    }
    f = 0;
    for (unsigned long i = 0; i < numberOfPaths; i++)
    {       
        int count = 0;
        if (i % 2500 == 0)
        {
            count += 1;
            std::cout << i << " paths done" << "\n";
        }
        std::vector<double> outcomes;
        for (unsigned long j = 0; j < S_vect.size(); j++)
        {
            outcomes.push_back(thesePathGatherers[j].GetResultsSoFar()[0][i]);
        }
        switch (optionType)
        {
        case RainbowOptionType::best_of:
            f += *max_element(outcomes.begin(), outcomes.end());
            break;
        case RainbowOptionType::worst_of:
            f += *min_element(outcomes.begin(), outcomes.end());
            break;          
        default:
            throw std::runtime_error("Unsupported or unknown option type found in " + GetuniqueIdentifier()[0]);
            break;
        }   
    }
    f *= ((double)nominal/numberOfPaths);
    return;
}

在第一部分（在第一部分之前 "Time Check"）我模拟了大量的值，这些值通过函数 OneStepMonteCarloValuation 保存在向量中（实际上在向量向量的第一个向量中）thesePathGatherers。像这样（这样做花费的时间最少）：

void OneStepMonteCarloValuation(const StandardExcerciseOption& TheOption, double Spot, double Vol, double r, double d, unsigned long NumberOfPaths, MJArray normVariates, StatisticsMC& gatherer)
{
    if (normVariates.size() != NumberOfPaths)
        throw("mismatched number of paths and normal variates");
    //Pre-calculate as much as possible
    double Expiry = TheOption.GetExpiry();
    double variance = Vol * Vol * Expiry;
    double rootVariance = sqrt(variance);
    double itoCorrection = -0.5 * variance;
    double movedSpot = Spot * exp((r-d) * Expiry + itoCorrection);
    double thisSpot;
    double discounting = exp(-r * Expiry);
    for (unsigned long i = 0; i < NumberOfPaths; i++)
    {
        thisSpot = movedSpot * exp(rootVariance * normVariates[i]);
        double thisPayoff = TheOption.OptionPayOff(thisSpot);
        gatherer.DumpOneResult(discounting * thisPayoff);
    }
    return;
}

这里调用的 gatherer 是 class:

StatisticAllPaths::StatisticAllPaths(const unsigned long minimumNumberOfPaths) : PathsDone(0)
{
    ResultList.reserve(minimumNumberOfPaths);
}

void StatisticAllPaths::DumpOneResult(double result)
{
    ResultList.push_back(result);
    PathsDone++;
}

std::vector<std::vector<double>> StatisticAllPaths::GetResultsSoFar() const
{
    vector<double> innerVector(ResultList);
    vector<vector<double> > Results(1, innerVector);
    Results[0] = ResultList;
    return Results;
}

在第二部分中，我遍历所有向量并将第 i 个元素保存在临时 "output" 向量中，比较输出中的值并保存结果，然后执行相同的操作。遍历 thesePathGatherers 中的所有值。第一部分运行速度很快，但这个部分需要更长的时间，甚至无法一起完成足够多的 numberOfPaths。我添加了这部分：

    if (i % 2500 == 0)
    {
        count += 1;
        std::cout << i << " paths done" << "\n";
    }

调试并发现，例如，如果路径总数很大，则执行前 10^4 条路径会花费更长的时间，即每条路径花费的时间越长，您执行的路径越多。 GetResultsSoFar() implementation/usage 有问题吗？这是我唯一能想到的。任何帮助表示赞赏

如果有更多需要分享的内容，请告诉我。

Answer 1

好吧，您将 ResultList 向量复制了 4 次，这无济于事。

std::vector<std::vector<double>> StatisticAllPaths::GetResultsSoFar() const
{
    vector<double> innerVector(ResultList); //first copy
    vector<vector<double> > Results(1, innerVector); //second copy
    Results[0] = ResultList;// third copy
    return Results;// return by value = fourth copy
}

你最好通过引用传递你的矢量并填充它：

void StatisticAllPaths::GetResultsSoFar(std::vector<std::vector<double>> & result ) const
{
    result.clear(); //discard the current contents if any
    result.push_back(ResultList); //one copy
}

希望对您有所帮助

Answer 2

这里有一些答案着重于 GetResultsSoFar 中发生的过度复制，但其中 none 解决了您如何调用此函数的毫无意义的问题：

std::vector<double> outcomes;
for (unsigned long j = 0; j < S_vect.size(); j++)
{
    outcomes.push_back(thesePathGatherers[j].GetResultsSoFar()[0][i]);
}

在这里，您要为每个 "path gatherer" 收集一个结果值，但为此您要在提取一个值之前多次复制每个结果列表。我认为改进这一点的第一步是避免 any 复制 ResultList 因为它不需要。

相反，如何向结果返回一个 const 引用。我还将重命名此函数以避免将其与您现有的定义混淆。但随便你怎么称呼它：

const std::vector<double>& StatisticAllPaths::GetResultList() const
{
    return ResultList;
}

当然，循环随后将访问 ith 值作为 thesePathGatherers[j].GetResultList()[i]。这里的关键区别是根本没有发生复制。

outcomes 还有一个小的优化要做。由于您确切知道要推送多少项目，因此您应该预先保留该内存。

即

std::vector<double> outcomes;
outcomes.reserve(S_vect.size());
for (unsigned long j = 0; j < S_vect.size(); j++)
{
    outcomes.push_back(thesePathGatherers[j].GetResultList()[i]);
}

如果你想让它更整洁，可以考虑定义 double GetResult(size_t) const 只有 returns 一个结果，然后调用 thesePathGatherers[j].GetResult(i).

比较向量中的元素时花费的时间呈指数增长

Time taken increasing exponentially when comparing elements in vectors

c++

performance

vector

montecarlo