有效的编程:如何在不编写实际谓词的情况下使用 find_if

Effiective programming: How to use find_if without writing the actual predicate

假设我有以下定义:

typedef std::vector<std::string> StringVec;
typedef std::set<std::string> StringSet;

和以下对象:

const StringVec& v
const StringSet& s

并且我想使用 find_if 查找 v 中存在于 s 中的第一个字符串。

我最好的行动方案是什么?这可以通过绑定一些函数调用来避免必须编写新的谓词来完成吗?

编辑:s 很大,所以 find_first_of 是不可能的。

使用std::find_first_of。即使在 C++11 中,这也是比 std::find_if 加 lambda 更好的选择,因为它的名称更清楚地表明了代码的意图,而且它更短更简洁。

这是一个完整的例子:

#include <vector>
#include <set>
#include <string>
#include <algorithm>
#include <iostream>
#include <assert.h>

typedef std::vector<std::string> StringVec;
typedef std::set<std::string> StringSet;

int main()
{
    StringVec v;
    v.push_back("zero");
    v.push_back("one");
    v.push_back("two");

    StringSet s;
    s.insert("one");
    s.insert("two");
    s.insert("three");

    StringVec::const_iterator const find_it =
        std::find_first_of(v.begin(), v.end(), s.begin(), s.end());

    assert(find_it != v.end());

    std::cout << *find_it << "\n"; // prints "one"
}

请注意,如果您的集合很大,您可能会因为这种方法的算法复杂性而遇到性能问题。

另一种方式,使用 boost::bind 允许灵活的谓词而无需编写 class:

#include <iostream>
#include <set>
#include <vector>
#include <algorithm>
#include <boost/bind.hpp>

typedef std::vector<std::string> StringVec;
typedef std::set<std::string> StringSet;

namespace {

    bool contains(const StringSet& set, const std::string& s)
    {
        return set.count(s) != 0;
    }
}

StringVec::const_iterator search(const StringVec& vec, const StringSet& filter)
{
    using namespace std;

    StringVec::const_iterator it = std::find_if(vec.begin(),
                                                vec.end(),
                                                boost::bind(&contains,
                                                            boost::cref(filter),
                                                            _1));
    return it;
}

int main()
{

    StringVec v;
    v.push_back("hello");
    v.push_back("goodbye");
    v.push_back("echo");

    StringSet s;
    s.insert("a");
    s.insert("goodbye");
    s.insert("foo");

    StringVec::const_iterator it = search(v, s);

    static const std::string notfound = "not found";
    std::cout << ((it == v.end()) ? notfound : *it) << std::endl;

    return 0;
}

当然,如果您知道 StringVec 已排序,您可以反转算法并对过滤器中的每个项目使用二进制搜索。这将搜索时间减少到比 O(N.logM) 更好,因为我们知道搜索集也已排序,因此我们可以折扣数组中前一个 lower_bound 之前的所有元素搜索:

StringVec::const_iterator search_sorted(const StringVec& vec, const StringSet& filter)
{
    using namespace std;


    StringVec::const_iterator best = vec.begin();
    for (StringSet::const_iterator search_it = filter.begin() ; search_it != filter.end() ; ++search_it)
    {
        best = lower_bound(best, vec.end(), *search_it);
        if (best == vec.end() || *best == *search_it)
            return best;
    }
    return vec.end();
}

这是一个基于您的 set::findboost::bind 的工作示例:

const StringVec& v = {"a", "b", "z"};
const StringSet& s = {"c", "d", "z"};

StringSet::const_iterator (StringSet::*f)(const StringSet::value_type& val) const = &StringSet::find;

StringVec::const_iterator iter = std::find_if(v.begin(), v.end(), boost::bind(f, &s, _1) != s.end());

cout << *iter << endl;

问题是 find 函数有一个 const 重载,因此您必须在 bind 中进行强制转换或使用 pointer-to-member 来指定超载你想要的。

此外,您必须小心通过引用传递 s 绑定,否则它将复制 set 并且返回的 end 迭代器将不同于 end 你在比较中使用的迭代器。

我认为这是在多个 "corpora"(正文)中搜索已知词。

经典方法是将 "dictionary" 放入前缀树或 "Trie" 并从中匹配。

这是一个使用 Boost Spirit 的 Trie 实现 (symbols<>) 及其输入流匹配器(parseistream_iterator)的简单示例。

程序从/etc/dictionaries-common/words(99k 个单词)中读取一个字典从一个硬编码的随机单词字典(1000 个单词) 并将它们放在一个 s Trie 中。

接下来它读取所有在命令行中命名的文件并显示第一个匹配项:

Live On Coliru

#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/directive/seek.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <iostream>
#include <fstream>
#include <set>

namespace x3 = boost::spirit::x3;

typedef std::vector<std::string> StringVec;

int main(int argc, char** argv) {

   //  

    auto const& dict = [] {
        x3::symbols<x3::unused_type> dict;

        std::ifstream file("main.cpp");

        std::string line;
        while (getline(file, line) && line != "DICTIONARYSTART")
            ;
        while (file >> line && line != "DICTIONARYEND")
            dict.add(line)
            ;

        return dict;
    }();

    for (auto fname : std::vector<std::string> (argv+1, argv+argc)) {
        std::ifstream corpus(fname);
        corpus.unsetf(std::ios::skipws);

        boost::spirit::istream_iterator f(corpus), l;

        std::string matched;
        if (parse(f, l, x3::seek [ x3::raw [ dict ] ], matched)) {
            std::cout << "first match in " << fname << " is " << matched << "\n";
        }
    }
}

#if 0
DICTIONARYSTART
fining attractions centripetal misanthropy duplexes Estella allusively battleship legging Charolais badmouthed mooning reptile soggy unethical conniver roadbed phlebitis feathering decrepitude dreariest mudguard capaciousness stroll devaluations
discomfiture boner cream Rachelle marine cornets earnest upshot degenerative hick collocation dowagers gangling synod petunia howdahs coils troubled squealed Oxfords counselors tinderbox latitudes robustly Gordon
economizing stagger encrusts neural vindication foaled nightclubbing case knot purling disciplinarian thatching ingrain rhymes flyweight wholesaler giggling displease catheters paradoxes miscellanies Valenzuela Gamble dykes judgemental
Tagalog wandering children classmates climactic workmanship Patricia fonder chubbiest cogitated Natchez tremendously dominated distemper castrating approbation Charles taproots head newsprint Quaalude unruffled Manaus legatees furrowed
rubberize ambience alleviated enthroning vermilion composure neglect fallibly Ike irks frostbite amnestying token playacted ligaments ancestral pureeing telepathy legrooms barrows solidified alumna Brazos inter Gogol
cups gesundheit brigs ordeals adulated gunboats Gatorade table Winnebago knit Cadillac indecision Florentine playgrounds houseplants trifler idolizing revile epitome berry candies whirred afterthoughts pickup tansy
maggots glazed mineralogists seeds iridescence tightness gusher product blackjack methodologies writes globetrotter hub allowances chronologically rid advantages breathe terrifyingly enormities dérailleur shame admission nosey bending
brogues mansion pursuits smell courtesy faulting plough oxygenate marinating customer OKs grayest cuttlefish showcased impunity stratifying elopes carefulness die amplifier Gobi exorcists then reediest bayberry
immure diligent enamors rapider presenter enhancements seasoning furnish Hilario glissandi holograph furor equably betrayer scavenger uncontested breastplate bicycling eclecticism Honduran hinted whitened coped noncommercial guileless
vinyl masonry desecrate kitchenware layouts degenerate reincarnating fondu Guinevere Kitchener reaffirms ploys fetchingly workings dictate debts porpoised skivvying Badlands posts contiguous runes bristling streak preempted
innovation landmasses reciprocates jigs crone permuted unkinder expectations meatloaves starters Aurangzeb paymasters thankfulness Brahms editions giveaways Salk Neogene indulged schoolbook summoner rerun monogrammed Sikhism protectorates
beholders tasseled Ebonics Antilles imbibed knobbiest Summer orthodontics parachutes overprices Harrods accessioned gatecrasher vents syncopates sects safeguards tulle Vitim cut living serest Hittite marshaling diffidence
nightshades sorted inputted parterres botching casually maturest cede backstopping misprinted calluses deflected crawlspaces spectroscopes freebases Goths cheapening avatar indulgence plumped sired titter longitudinally anthropocentric Tlingit
boaster Ptolemies Lohengrin vulgarest genealogist hammed missionaries fireflies imperatives sawyer boomed Alioth Godunov endocrines counterfeiter tradesman homicide basked unnaturally smudge intercedes putative wearies decisiveness serenades
pertain mists boating balked Maud ambulances actor interrogatory unravel marked rhetoricians possibility yardarm crankshafts queenliest jurist bitterns heavenward medulla binoculars outbreak adulates polythene Singleton reoccupied
trawl pyromaniac splendider booty demographer parlaying perpetration acacia removing mixtures reasonableness fastener Australia lanyards debate jurists prefabricating lorn revulsion constituents enrol nicknamed exclamations centigram stemmed
canniest bushels orchard obloquy Pat hooray foregrounds ordains firefight stylizes ladyfinger entity digestive similarities rigors nettled customized fission Marlowe ambassador easterlies sands wariness wretcheder sprinting
likewise somnambulist Bullwinkle Woolite stagnation covered affirmative beachhead warn snatching decreed middlebrow seediness nauseous bandaged Britney polysyllabic defray Cayuga intricately brassy culotte bidden wassail nudging
maidservants Clairol Panza apparition stagnate gardening unhurt skylarked affable constitutionally tamarind inception Haiti researches civics scroll unparalleled meats gaiter troop insult misapplied Cedric some coupling
sawdust jacket cautiousness distillers octane grainiest falls transliterates embezzler abrogation jimmies prodigy narrowed gunwales democratizing Fergus babysitters Bisquick blindside resettle phases infinitives overgrows painfully Sicily
otter croons whistles immortalizes gamed add uncover budding fink seizure Reichstag monsieur mayflies plumpest promptings wiggle crack scullion Boswell modernity Gloucester reprehending remarking trusting chewy
diabolically scants numeral providence norm fruity sprang sailboards analgesics intelligibility Gospels church arcade finder Berlioz subcontracting crawls imposing thruway baked philatelic Dobbin poltergeists perfectionist attainable
buses support Semite Saxon awkwardest Richie becalming Protestantism predicting pleaders novelist non adulteresses sadness privatizes Jayapura interfered recurs edgewise briniest Kirghistan retriever welcomed indispensables enthralling
recopies Brandy politicizing divining experiments Antaeus perfectible yak Wittgenstein Herder torn facetting supine spiritualism rickets stonewalled slaking duodenum horizontal chasuble boxed smelling trustees parallelled exonerating
numerals doctors diabolic bruskest artificers cauldron diapered hoarder stickiest overspend considered stranglehold animators pidgin overstay willingness Sade reenlisting internationalizing panicky introductions hostage alkali eligibility bleep
understood Claude Nile rebuilds Tatum tusk pantsuit subtlety scornful sequined bosuns meatiest synchronizes mammary cigarettes skinflint salved imbecility loyaler domiciling worshipped slugged breezing reemphasize astronaut
Jimenez Stockhausen Mickey claimants germane affordable zinging rapscallions debilitating undefeated amebas SARS prostrated Lynch sunflower inflicting grayer pancreas fishers mademoiselle trustful doorknob Melanesia interviewers Ken
subsections clewing shammy precedes chum seamed Guatemalan flaw plate Bergson photoelectric cite grandmas prisms chino temperatures stricken encroachment cheeps lampoons Wurlitzer exhumations dawdled pellets hard
kielbasas spaciously heedless transsexuals hither campaigning messiahs vows bloodhounds shrubbery admonishment weatherproofed suffusing fondness breakfasts McDowell fragrance morphine crisper demos fluoridated asphyxiated tango distorter lymphoma
sweat Marines articulated ripest seamanship disentangling underplay cares redcoats mommas groundlessly gondola vaporizer Lilliputian sinned overloaded lineaments commends symmetricly churlishness trimaran dizzies blacklisting artifices spunk
deteriorated piggybacked deliquescent unobstructed infinities descriptive veneered Izanami aphasia tariffs narrated hank notably fripperies pare Wyoming solemner unheard denseness autumnal obviates Arabia holstered sweeter misfortune
lurking newtons tattooist armada discomfit lodge mismanaging duchesses Gregg chorused Jeeves absolves gastric xylophones sandwiching unbinding overgrown upturn gloried ticker authentication bleeping duckling humbug pipe
gravitational enlightened classification percentiles cloys excoriate gibber encyclopedia Chi plaice skipped nought panoramic dovetailing washers amir lumpish advantaged seedling Eton skimps pathetically anus appetizing publican
impetuous Jungfrau ravens guffaw Buckner pained bids lockup Bronson cerise xerography pommeled riposted opportune offensiveness Maldives Roumania cuffs Baroda zeppelins extracting revising protrude shiver soberly
junipers evangelized monuments Atria Major predict afloat annoyance backspacing inadequacy oversampling desolated koshering reevaluated Rambo nope nickels receiving pricier undermining Damien minds vaccinating losses beyond
badmouths nudge picking flowerpots Geminis character blocks wrigglers mailboxes centerpieces inhibiting sofa breadfruit tawdriest sympathize unimpressed businessman strop atmospherically cankers headwind decaying Harvey nightgowns pullback
plagiarists Elohim supporter singularities treadmills kitchenettes malefactors Gauls Randolph phase Kristopher farce territorial antacids falsehoods callously elaborations furbelow joyous slipped windsurfs precluded beckoned huckster keyed
arbor grizzliest kinfolks frays rogered inevitable hula glibbest bloodmobiles Romero enlarge respective Skype premier reclaims densities retirements Emmanuel chaplains kaftans barbecues overworking prophecy thundercloud sequences
unmade recluses flout multicultural aggrandizement colic Pisa frescoes rescuing warpath Sakhalin ambiguous pliant foreclose malingerers cheeped titillate mete segmented sponsoring Oman sherry masqueraded sharpshooter windbreaker
sheepdogs collated lobbed glitzy romanticize quenched authenticity bolstering inconsolable executes mothball palpated archeological dilute Halley eschewing dispatcher newlywed snarl scolloping Kringle unchanged ransoming tendentious physiologist
DICTIONARYEND
#endif

它使用 Coliru 代码存档源文件,并打印:

first match in /Archive2/1c/01a42044af380a/main.cpp is norm
first match in /Archive2/1c/02870c20800c0b/main.cpp is plate
first match in /Archive2/1c/063e4e86b6f274/main.cpp is some
first match in /Archive2/1c/06764eac9f9c15/main.cpp is norm
first match in /Archive2/1c/09ceb6bfcce356/main.cpp is plate
first match in /Archive2/1c/0af84db01a558d/main.cpp is plate
first match in /Archive2/1c/0c69c1f71ab944/main.cpp is plate
first match in /Archive2/1c/105f4dc87e9c37/main.cpp is inter
first match in /Archive2/1c/11dd8aa8f707f2/main.cpp is plate
first match in /Archive2/1c/12d960b2f3cedc/main.cpp is plate
first match in /Archive2/1c/14db343d423075/main.cpp is plate
first match in /Archive2/1c/1b2dab88094b58/main.cpp is plate
first match in /Archive2/1c/1c678972901e9a/main.cpp is case
first match in /Archive2/1c/1c94d656000dd3/main.cpp is plate
first match in /Archive2/1c/2b2c872f6a0997/main.cpp is plate
first match in /Archive2/1c/2cd15ce20cf655/main.cpp is plate
first match in /Archive2/1c/2cffc4cb1e4833/main.cpp is plate
first match in /Archive2/1c/2d97d3c8a41b49/main.cpp is plate
first match in /Archive2/1c/2fb6dd2100093b/main.cpp is norm
first match in /Archive2/1c/3238c02de16b42/main.cpp is plate
first match in /Archive2/1c/32ba2a7ce1a16b/main.cpp is inter
first match in /Archive2/1c/3a6a97e881cc7c/main.cpp is plate
first match in /Archive2/1c/3c7526bf74e26e/main.cpp is plate
first match in /Archive2/1c/40e751969aba85/main.cpp is plate
first match in /Archive2/1c/43ceb4839799aa/main.cpp is plate
first match in /Archive2/1c/4991da8c6646f4/main.cpp is plate
first match in /Archive2/1c/4b6fe597df982b/main.cpp is plate
first match in /Archive2/1c/4fb9bb993bc362/main.cpp is plate
first match in /Archive2/1c/537e7a224dc57a/main.cpp is product
first match in /Archive2/1c/54220dff73afae/main.cpp is plate
first match in /Archive2/1c/5480c8d4d62f98/main.cpp is cut
first match in /Archive2/1c/55888349cee769/main.cpp is case
first match in /Archive2/1c/569766d190c54b/main.cpp is plate
first match in /Archive2/1c/5b12f271614992/main.cpp is plate
first match in /Archive2/1c/5d49b8a392ff5a/main.cpp is plate
first match in /Archive2/1c/607e23e7b226dc/main.cpp is inter
first match in /Archive2/1c/61f8330e834856/main.cpp is warn
first match in /Archive2/1c/644cd8af6fa664/main.cpp is cut
first match in /Archive2/1c/6a4d22e8e1e505/main.cpp is add
first match in /Archive2/1c/6d7c12e6a57a6d/main.cpp is plate
first match in /Archive2/1c/6e3bde2c58f770/main.cpp is inter
first match in /Archive2/1c/6f738256e8c89a/main.cpp is inter
first match in /Archive2/1c/7052a20f642808/main.cpp is plate
first match in /Archive2/1c/741e3ac097daa0/main.cpp is non
first match in /Archive2/1c/75dd2cfb00ab3d/main.cpp is table
first match in /Archive2/1c/7ab18d1d9ddd9d/main.cpp is table
first match in /Archive2/1c/7b7ca1e998bb64/main.cpp is plate
first match in /Archive2/1c/7d7dd8dc289c16/main.cpp is plate
first match in /Archive2/1c/7efbc7d4fce469/main.cpp is plate
first match in /Archive2/1c/806c451005f8f7/main.cpp is inter
first match in /Archive2/1c/820d80113bc9e3/main.cpp is plate
first match in /Archive2/1c/829b1d8678382b/main.cpp is plate
first match in /Archive2/1c/83febbd048a14e/main.cpp is character
first match in /Archive2/1c/8412cafd7ee954/main.cpp is plate
first match in /Archive2/1c/85eb2f5e332d00/main.cpp is pare
first match in /Archive2/1c/86fd186b34200e/main.cpp is plate
first match in /Archive2/1c/8767373e2bf061/main.cpp is plate
first match in /Archive2/1c/88f93c2a4010b7/main.cpp is support
first match in /Archive2/1c/8d3c351c8c6724/main.cpp is plate
first match in /Archive2/1c/8faf4816d1b84c/main.cpp is plate
first match in /Archive2/1c/9631ff5a6d5637/main.cpp is plate
first match in /Archive2/1c/97f07d942dcd4f/main.cpp is plate
first match in /Archive2/1c/9b3261d75ae2b6/main.cpp is some
first match in /Archive2/1c/9e4aae17e3c03d/main.cpp is add
first match in /Archive2/1c/9f2ced4e69425b/main.cpp is pare
first match in /Archive2/1c/9f4e0b93faa629/main.cpp is vents
first match in /Archive2/1c/a04b5230929695/main.cpp is plate
first match in /Archive2/1c/a0829dfc522972/main.cpp is plate
first match in /Archive2/1c/a29393ee55e157/main.cpp is plate
first match in /Archive2/1c/a47ec51edba8cb/main.cpp is character
first match in /Archive2/1c/a55a6027325757/main.cpp is plate
first match in /Archive2/1c/a7ebec0567e48f/main.cpp is plate
first match in /Archive2/1c/a7fe7a320093c8/main.cpp is add
first match in /Archive2/1c/ab4fe7b5069d14/main.cpp is pare
first match in /Archive2/1c/af807c17c3ca60/main.cpp is plate
first match in /Archive2/1c/afd70d2e07a8a0/main.cpp is plate
first match in /Archive2/1c/b1c727cb701012/main.cpp is plate
first match in /Archive2/1c/b7ac0cb452eb87/main.cpp is plate
first match in /Archive2/1c/b85ce662d4855e/main.cpp is token
first match in /Archive2/1c/b9e74e0207d591/main.cpp is product
first match in /Archive2/1c/bb7ea3e28b6160/main.cpp is plate
first match in /Archive2/1c/bbec68a3c55cc3/main.cpp is entity
first match in /Archive2/1c/c0a6faeac8f6b5/main.cpp is plate
first match in /Archive2/1c/c28fe565ba3073/main.cpp is inter
first match in /Archive2/1c/c3ca4378cce289/main.cpp is plate
first match in /Archive2/1c/c4192337c76a90/main.cpp is plate
first match in /Archive2/1c/c59ec13eb3af37/main.cpp is then
first match in /Archive2/1c/c79ce6c11ee204/main.cpp is plate
first match in /Archive2/1c/c9d2ca66a41568/main.cpp is Chi
first match in /Archive2/1c/cbb3b28241a4de/main.cpp is plate
first match in /Archive2/1c/cea5bea5e1f536/main.cpp is plate
first match in /Archive2/1c/d76b20ad2bc6bb/main.cpp is table
first match in /Archive2/1c/d91ec78cfa0059/main.cpp is plate
first match in /Archive2/1c/d94170e5ae34a8/main.cpp is posts
first match in /Archive2/1c/d9ba600314aad8/main.cpp is inter
first match in /Archive2/1c/d9f38f5f33f3e4/main.cpp is plate
first match in /Archive2/1c/dd455e8c517ad1/main.cpp is plate
first match in /Archive2/1c/e44cf7524fe675/main.cpp is token
first match in /Archive2/1c/e493f86d2cb10b/main.cpp is plate
first match in /Archive2/1c/e54afb138c7663/main.cpp is plate
first match in /Archive2/1c/e8f3cb174be795/main.cpp is plate
first match in /Archive2/1c/e92e50d612166f/main.cpp is add
first match in /Archive2/1c/f7d400828f0043/main.cpp is plate
first match in /Archive2/1c/f7efe1f78fe0ec/main.cpp is plate
first match in /Archive2/1c/fa2ebbcb27f03e/main.cpp is plate
first match in /Archive2/1c/fd9e7b68a370e5/main.cpp is add
first match in /Archive2/1c/ffb1009a22a1bc/main.cpp is plate