Do HTML5-compliant parsers process HTML 4 and older correctly?
The Wikipedia article at https://en.wikipedia.org/wiki/Tag_soup#HTML5 says:
HTML5 aims to be the most complete solution to the problem of tag soup
thus far while remaining as backwards- and forwards-compatible as
possible. By contrast to XHTML, which departs from backwards
compatibility and takes the approach that parsers should become less
tolerant of badly formed markup, HTML5 acknowledges that badly formed
HTML code already exists in large quantities and will probably
continue to be used, and takes the view that the specification should
be expanded to ensure maximum compatibility with such code.
Thus, the HTML 5 specification has altered its definition of HTML
syntax both to accommodate common syntax in use today, and to
explicitly describe exactly how "badly formed code" should be treated
by the parser. The handling of badly formed code now has a place in
the specification itself, hopefully reducing the need for future HTML
parsers to implement additional, out-of-specification measures for
dealing with code that it does not recognize.
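Am I right in understanding that an HTML5 parser should correctly parse older HTML pages (such as HTML 2.0 or HTML 4)? I need an HTML parser that handles most pages on the Internet properly, so I found Google's Gumbo: https://github.com/google/gumbo-parser. Its description says it is an HTML5 parser. Is it therefore also suitable for parsing web pages that are not HTML5?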
Yes, this is one of the main differences between HTML5 and XHTML. You should be able to parse any HTML page with an HTML5 parser.
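As a rough illustration of what "tolerant of old markup" means in practice, here is a minimal sketch in Python. It uses the standard-library `html.parser` module, which is a lenient tokenizer and not a conforming HTML5 tree builder like Gumbo, so treat it only as a stand-in. The sample markup and the names `TagCollector`, `OLD_HTML` and `collect_start_tags` are made up for this example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every start tag seen while tokenizing a document."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook.
        self.start_tags.append(tag)


# Hypothetical HTML 4 era markup: uppercase tags, unquoted attribute
# values, a <FONT> element, and <P>/<LI> elements that are never closed.
OLD_HTML = """<HTML><BODY BGCOLOR=white>
<P>First paragraph
<P>Second <FONT COLOR=red>red text</FONT>
<UL><LI>one<LI>two</UL>
</BODY></HTML>"""


def collect_start_tags(markup):
    """Tokenize the markup and return the start tags in document order."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    return parser.start_tags


if __name__ == "__main__":
    print(collect_start_tags(OLD_HTML))
```

The old-style input is read without raising an error. A real HTML5 parser goes further: it also builds the document tree, implicitly closing the open `<P>` and `<LI>` elements according to the rules in the specification.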