Why am I not able to scrape just this particular P tag?

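I am using the scrapy shell just to make sure my spider's selectors are correct. I am able to get every other part I need except this p tag, which contains the cross-reference part numbers. I am scraping from this particular page.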

When I try response.css('div.col-1-2-2 > div.rpr-help m-chm > div > p::text').extract() it returns blank.

When I try response.css('div > p::text').extract(), the results include the part I am looking for along with a bunch of data I do not want.

I feel like this is going to have a super simple answer, but I can't figure out what I'm missing here.

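Here is a snippet of the HTML from the part of the page I am trying to scrape. The 'p' tag I want is the last one, which begins with "Part Number":
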
<div class="col-1-2-2">
        
        <div id="img-detail" style="text-align:center;">
            <div id="img-detail-main">
                <a id="ctl00_cphMain_imgenlarge" rel="nofollow" href="/detail-img.aspx?id=3094537&amp;i=" class="cboxElement"><img id="ctl00_cphMain_iMain" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_01_l.jpg" style="border-width:0px;outline:none;">
                    <div class="img-overlay" style="display:none;"><img src="/images/play.png" style="height:107px;"></div>
                    <div id="main-text-overlay" style="display:none;"></div>
                </a>
            </div>
            
                    <div class="img-help">Click image to open expanded view</div>
                    <div id="img-detail-thumb">
                
                    <div class="a-button a-active">
                        <img id="ctl00_cphMain_rImgTh_ctl01_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_01_tt.jpg" style="border-width:0px;">
                        
                    </div>
                
                    <div class="a-button">
                        <img id="ctl00_cphMain_rImgTh_ctl02_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_02_tt.jpg" style="border-width:0px;">
                        
                    </div>
                
                    <div class="a-button">
                        <img id="ctl00_cphMain_rImgTh_ctl03_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_03_tt.jpg" style="border-width:0px;">
                        
                    </div>
                
                    <div class="a-button">
                        <img id="ctl00_cphMain_rImgTh_ctl04_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_04_tt.jpg" style="border-width:0px;">
                        
                    </div>
                
                    <div class="a-button">
                        <img id="ctl00_cphMain_rImgTh_ctl05_imgTh" src="https://cdn.appliancepartspros.com/images/product/cache/whirlpool-clutch-assembly-285785-ap3094537_05_tt.jpg" style="border-width:0px;">
                        
                    </div>
                
                    <div class="a-button">
                        <img id="ctl00_cphMain_rImgTh_ctl06_imgTh" class="diagram" data-dcmt="Clutch assembly AP3094537 is number 5 on this diagram. This is to give you an idea of the appearance and the location of the part. Your appliance model may be slightly different." src="https://483cda5f439700fab03b-6195bc77e724f6265ff507b1dc015ddb.ssl.cf1.rackcdn.com/0029384112_4.gif" style="border-width:0px;">
                        
                    </div>
                
                    <div class="a-button">
                        <img id="ctl00_cphMain_rImgTh_ctl07_imgTh" class="video" src="https://img.youtube.com/vi/7RS1l6t8efc/hqdefault.jpg" style="border-width:0px;">
                        
                            <div class="img-overlay"><img src="/images/play.png"></div>
                        
                    </div>
                
                    </div>
                
        </div>


        
        <div class="rpr-help m-chm">
            <div class="header">
                <h2 class="h6">Repair Help</h2>
            </div><!-- /end .header -->
            <div class="inner m-bsc">
                <ul>
                    
                    
                    <li><a href="#videol">Repair Video</a></li>
                    
                    <li><a href="#qa1">Repair Q&amp;A</a></li>
                </ul>
            </div>
            
                <div>
                <br>
                <span class="h4">Cross Reference Information</span><br>
                <p>Part Number 285785 (AP3094537) replaces  2670, 285331, 285380, 285422, 285540, 285761, 285785VP, 3350015, 3350114, 3350115, 3351342, 3351343, 387888, 388948, 388949, 3946794, 3946847, 3951311, 3951312, 62699, 63174, 63765, 64176, AH334641, EA334641, J27-662, LP326, PS334641.
                <br>
                </p>
                </div>
            
        </div>
    </div>

Hopefully this works: response.xpath('//div[@class="col-1-2-2"]//p/text()').extract_first()

You can also try this: response.xpath('(//div[@class="rpr-help m-chm"]//p//text())[1]').get()
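
As for why the first CSS attempt comes back empty: in a CSS selector, the space in `div.rpr-help m-chm` is a descendant combinator, so `m-chm` is read as an element named `<m-chm>` nested inside `div.rpr-help`, and the snippet has no such element. A div that carries both classes is written with the classes chained, `div.rpr-help.m-chm`. Below is a minimal sketch of a CSS-based alternative to the XPath answers above. The helper name `cross_reference_text` is made up for illustration, and it assumes the HTML that Scrapy receives matches the snippet in the question.

```python
def cross_reference_text(response):
    """Return the cross-reference paragraph text, or None if it is missing.

    `response` is anything with a Scrapy-style .css() method, such as the
    `response` object in the scrapy shell (hypothetical helper, not part
    of Scrapy itself).
    """
    # "div.rpr-help.m-chm" (no space) matches one div with both classes.
    # Written as "div.rpr-help m-chm", the space would make "m-chm" a
    # descendant element name, which matches nothing on this page.
    parts = response.css("div.rpr-help.m-chm > div > p::text").getall()

    # The <p> contains a trailing <br>, so ::text can yield more than one
    # text node; join them and collapse runs of whitespace.
    text = " ".join(" ".join(parts).split())
    return text or None
```

In the scrapy shell this would be called as `cross_reference_text(response)`. Joining all text nodes rather than taking only the first one is a design choice: it keeps the helper working if the paragraph is later split by additional `<br>` tags.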