Why is this Swift web scraper not working?

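I can't scrape an image's HTML link using code I found on YouTube (https://www.youtube.com/watch?v=0jTyKu9DGm8&list=PLYjXqILgs9uPwYlmSrIkNj2O3dwPCcoBK&index=2). The code runs perfectly in a Playground, but my implementation in an Xcode project has some problems. (More precisely: I'm not sure how to implement it in my project :))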

When I run this code in a Playground, it extracts the link I need and prints exactly the output I want.

import Foundation

let url = URL(string: "https://guide.michelin.com/th/en/bangkok-region/bangkok/restaurant/somtum-khun-kan")

let task = URLSession.shared.dataTask(with: url!) { (data, resp, error) in
    guard let data = data else {
        print("data was nil")
        return
    }
    guard let htmlString = String(data: data, encoding: String.Encoding.utf8) else {
        print("can not cast data into string")
        return
    }

    let leftSideOfTheString = """
    image":"
    """

    let rightSideOfTheString = """
    ","@type
    """

    guard let leftRange = htmlString.range(of: leftSideOfTheString) else {
        print("can not find left range of string")
        return
    }

    guard let rightRange = htmlString.range(of: rightSideOfTheString) else {
        print("can not find right range of string")
        return
    }

    let rangeOfValue = leftRange.upperBound..<rightRange.lowerBound

    print(htmlString[rangeOfValue])
}
task.resume()

I then put exactly the same code into a struct, with the URL as a property and the scraping code as a method, like so:

struct ImageLink {

    let url = URL(string: "https://guide.michelin.com/th/en/bangkok-region/bangkok/restaurant/somtum-khun-kan")

    func getImageLink() {
    
        let task = URLSession.shared.dataTask(with: url!) { (data, resp, error) in
            guard let data = data else {
                print("data was nil")
                return
            }
            guard let htmlString = String(data: data, encoding: String.Encoding.utf8) else {
                print("can not cast data into string")
                return
            }
        
            let leftSideOfTheString = """
                image":"
                """
        
            let rightSideOfTheString = """
                ","@type
                """
        
            guard let leftRange = htmlString.range(of: leftSideOfTheString) else {
                print("can not find left range of string")
                return
            }
        
            guard let rightRange = htmlString.range(of: rightSideOfTheString) else {
                print("can not find right range of string")
                return
            }
        
            let rangeOfValue = leftRange.upperBound..<rightRange.lowerBound
        
            print(htmlString[rangeOfValue])
        }
        task.resume()
    }
}

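Finally, to check whether the code gives me the right link, I created an instance in a view and added a button that prints the result of getImageLink(), as shown below. In the commented-out code you can see that I tried to display the image both by hardcoding the link and by inserting the function call. The former works as expected; the latter does not.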

import SwiftUI

struct WebPictures: View {

    var imageLink = ImageLink()

    var body: some View {
        VStack {
            //AsyncImage(url: URL(string: "\(imageLink.getImageLink())"))
            //AsyncImage(url: URL(string: "https://axwwgrkdco.cloudimg.io/v7/__gmpics__/c8735576e7d24c09b45a4f5d56f739ba?width=1000"))
            Button {
                print(imageLink.getImageLink())
            } label: {
                Text("Print Html")
            }
        }
    }
}

When I click the button to print the link, I get the following output:

()
2022-05-16 17:21:30.030264+0800 MichelinRestaurants[35477:925525] [boringssl] boringssl_metrics_log_metric_block_invoke(153) Failed to log metrics
https://axwwgrkdco.cloudimg.io/v7/__gmpics__/c8735576e7d24c09b45a4f5d56f739ba?width=1000

If I click the button a second time, only the following is printed:

()
https://axwwgrkdco.cloudimg.io/v7/__gmpics__/c8735576e7d24c09b45a4f5d56f739ba?width=1000

If anyone knows how to help me, it would be much appreciated!!

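This fails because you are not waiting for your function to fetch the link. You are in an asynchronous context here: getImageLink() returns immediately with no value, which is why () is printed first, and the link only becomes available later inside the completion handler. One possible solution: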

//Make it a class instead of a struct and conform to ObservableObject
class ImageLink: ObservableObject {

    let url = URL(string: "https://guide.michelin.com/th/en/bangkok-region/bangkok/restaurant/somtum-khun-kan")
    //Create a published var for your view to get notified when the value changes
    @Published var imageUrlString: String = ""
    func getImageLink() {
    
        let task = URLSession.shared.dataTask(with: url!) { (data, resp, error) in
            guard let data = data else {
                print("data was nil")
                return
            }
            guard let htmlString = String(data: data, encoding: String.Encoding.utf8) else {
                print("can not cast data into string")
                return
            }
        
            let leftSideOfTheString = """
                image":"
                """
        
            let rightSideOfTheString = """
                ","@type
                """
        
            guard let leftRange = htmlString.range(of: leftSideOfTheString) else {
                print("can not find left range of string")
                return
            }
        
            guard let rightRange = htmlString.range(of: rightSideOfTheString) else {
                print("can not find right range of string")
                return
            }
        
            let rangeOfValue = leftRange.upperBound..<rightRange.lowerBound
        
            print(htmlString[rangeOfValue])
            //Assign the scraped link to the published var, on the main thread because it drives the UI
            DispatchQueue.main.async {
                self.imageUrlString = String(htmlString[rangeOfValue])
            }
        }
        task.resume()
    }
}

And the view:

struct WebPictures: View {
    //Observe changes from your imagelink class
    @StateObject var imageLink = ImageLink()

    var body: some View {
        VStack {
            AsyncImage(url: URL(string: imageLink.imageUrlString)) // assign imageurl to asyncimage
            //AsyncImage(url: URL(string: "https://axwwgrkdco.cloudimg.io/v7/__gmpics__/c8735576e7d24c09b45a4f5d56f739ba?width=1000"))
            Button {
                imageLink.getImageLink()
            } label: {
                Text("Print Html")
            }
        }
    }
}
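
The same idea can be sketched outside SwiftUI. Because the download finishes after getImageLink() has already returned, the link has to be handed back through a callback rather than a return value. The helper names extractImageLink(from:) and fetchImageLink(from:completion:) below are made up for illustration and are not part of the original code. As an extra precaution, this sketch looks for the right-hand marker only after the left-hand one, so the resulting range can never be inverted.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLSession lives in this module on Linux
#endif

// Hypothetical helper: returns the text between `image":"` and the
// next `","@type`, or nil if either marker is missing.
func extractImageLink(from html: String) -> String? {
    let leftMarker = "image\":\""
    let rightMarker = "\",\"@type"
    guard let left = html.range(of: leftMarker),
          // Search for the right marker only in the text after the left marker.
          let right = html.range(of: rightMarker,
                                 range: left.upperBound..<html.endIndex)
    else { return nil }
    return String(html[left.upperBound..<right.lowerBound])
}

// Hypothetical helper: downloads the page and passes the link (or nil)
// to `completion` once the request has finished.
func fetchImageLink(from url: URL,
                    completion: @escaping @Sendable (String?) -> Void) {
    URLSession.shared.dataTask(with: url) { data, _, _ in
        let html = data.flatMap { String(data: $0, encoding: .utf8) }
        completion(html.flatMap(extractImageLink(from:)))
    }.resume()
}
```

A caller would pass a closure to fetchImageLink and use the link inside that closure; anything written after the call runs before the link exists, which is the behaviour seen in the question.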

Update:

To get the link as soon as the view appears, call it like this:

        VStack {
            AsyncImage(url: URL(string: imageLink.imageUrlString))
        }
        .onAppear {
            if imageLink.imageUrlString.isEmpty {
                imageLink.getImageLink()
            }
        }
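
Since AsyncImage already requires iOS 15, another option might be async/await, which lets the function really return the link instead of storing it from a callback. This is only a sketch under that assumption; imageLink(in:) and fetchImageLink(from:) are hypothetical names, not part of the code above.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLSession lives in this module on Linux
#endif

// Hypothetical helper: the text between the two markers, or nil if
// either marker is missing.
func imageLink(in html: String) -> String? {
    guard let left = html.range(of: "image\":\""),
          let right = html.range(of: "\",\"@type",
                                 range: left.upperBound..<html.endIndex)
    else { return nil }
    return String(html[left.upperBound..<right.lowerBound])
}

// Hypothetical helper: suspends until the page has loaded, then returns
// the scraped link (nil if the page is not UTF-8 or has no markers).
@available(macOS 12.0, iOS 15.0, tvOS 15.0, watchOS 8.0, *)
func fetchImageLink(from url: URL) async throws -> String? {
    let (data, _) = try await URLSession.shared.data(from: url)
    guard let html = String(data: data, encoding: .utf8) else { return nil }
    return imageLink(in: html)
}
```

In the view, this could be awaited from SwiftUI's .task modifier in place of .onAppear, with the result assigned to imageUrlString.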