How do I properly return a value from a class in this example? (PHP)

I wrote a small crawler, and I would like to know how to properly hand its result back to the code that creates the instance.

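My constructor sets some basic properties and calls the next method, which contains an if statement that may eventually lead to a foreach loop. When everything is done, I echo my result.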

This works fine, but I don't want to echo the json_encode data. I would rather have the $crawler variable at the bottom contain the json_encode data.

Here is my code:

<?php

class Crawler {

    private $url;
    private $class;
    private $regex;
    private $htmlStack;
    private $pageNumber = 1;
    private $elementsArray;

    public function __construct($url, $class, $regex=null) {
        $this->url = $url;
        $this->class = $class;
        $this->regex = $regex;

        $this->curlGet($this->url);
    }

    private function curlGet($url) {
        $curl = curl_init();

        curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
        curl_setopt($curl, CURLOPT_URL, $url);

        $this->htmlStack .= curl_exec($curl);

        $response = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        $this->paginate($response);
    }

    private function paginate($response) {
        if($response === 200) {
            $this->pageNumber++;
            $url = $this->url . '?page=' . $this->pageNumber;

            $this->curlGet($url);
        } else {
            $this->CreateDomDocument();
        }
    }

    private function curlGetDeep($link) {
        $curl = curl_init();

        curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
        curl_setopt($curl, CURLOPT_URL, $link);

        $product = curl_exec($curl);

        $dom = new Domdocument();
        @$dom->loadHTML($product);

        $xpath = new DomXpath($dom);

        $descriptions = $xpath->query('//div[contains(@class, "description")]');

        foreach($descriptions as $description) {
            return $description->nodeValue;
        }
    }

    private function CreateDomDocument() {
        $dom = new Domdocument();
        @$dom->loadHTML($this->htmlStack);

        $xpath = new DomXpath($dom);

        $elements = $xpath->query('//article[contains(@class, "' . $this->class . '")]');

        foreach($elements as $element) {
            $title = $xpath->query('descendant::div[@class="title"]', $element); 
            $title = $title->item(0)->nodeValue;

            $link = $xpath->query('descendant::a[@class="link-overlay"]', $element); 
            $link = $link->item(0)->getAttribute('href');
            $link = 'https://www.gall.nl' . $link;

            $image = $xpath->query('descendant::div[@class="image"]/node()/node()', $element);
            $image = $image->item(1)->getAttribute('src');

            $description = $this->curlGetDeep($link);

            if($this->regex) {
                $title = preg_replace($this->regex, '', $title);
            }

            if(!preg_match('/\dX(\d+)?/', $title)) {
                $this->elementsArray[] = [
                    'title' => $title,
                    'link' => $link,
                    'image' => $image,
                    'description' => $description
                ];
            }       
        }

        echo json_encode(['beers' => $this->elementsArray]);
    }
}

$crawler = new Crawler('https://www.gall.nl/shop/speciaal-bier/', 'product-block', '/\d+\,?\d*CL/i');

GitHub link for a better overview: https://github.com/stephan-v/crawler/blob/master/ArticleCrawler.php

I hope someone can help me, because I'm a bit confused about how to get this working properly.

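You can't do this in the constructor: new always evaluates to the object itself, so a constructor cannot hand back anything else. You can, however, assign the JSON to a class property in another method and return it from there. That is the only logical option.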
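To illustrate the point about constructors, here is a minimal sketch. It uses a hypothetical BeerList class with hard-coded items standing in for the crawled data; neither the class name nor the toJson() method comes from the original code.

```php
<?php

// Hypothetical stand-in for the crawler: items are passed in instead of crawled.
class BeerList {
    private $items;

    public function __construct(array $items) {
        $this->items = $items;
        // A "return json_encode(...)" here would be discarded by `new`.
    }

    // A regular method can return the encoded result to the caller.
    public function toJson() {
        return json_encode(['beers' => $this->items]);
    }
}

$list = new BeerList([['title' => 'Example Tripel']]);
$json = $list->toJson(); // $list is the object, $json is the JSON string
```

Here the object and the JSON string live in separate variables, which is why the value has to come from a method call (or a property) rather than from the constructor.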

I was too slow, man. So I'll just expand on ardabeyazoglu's answer with code here:

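Change

echo json_encode(['beers' => $this->elementsArray]);

to

$this->json = json_encode(['beers' => $this->elementsArray]);

and declare public $json; alongside the other class properties (creating the property dynamically still works, but it is deprecated as of PHP 8.2).

Then: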

$crawler = new Crawler(....);
var_dump($crawler->json);

You could add an accessor method, but a public property works as well.
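If you prefer the accessor route, it might look like the following sketch. CrawlResult and getJson() are hypothetical names, and the constructor takes a ready-made array here so the example runs without any network access; in the real crawler the assignment would sit where the echo currently is.

```php
<?php

// Hypothetical reduced class: only the parts relevant to storing the result.
class CrawlResult {
    // Private, so outside code can read the value only through getJson().
    private $json;

    public function __construct(array $elements) {
        // In the crawler, this assignment would replace the echo in CreateDomDocument().
        $this->json = json_encode(['beers' => $elements]);
    }

    // Accessor: read-only access to the stored JSON string.
    public function getJson() {
        return $this->json;
    }
}

$result = new CrawlResult([['title' => 'Example Stout']]);
$json = $result->getJson();
```

Keeping the property private means calling code cannot overwrite the result by accident, at the cost of one extra method.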