How can I extract HTML escape chars/entities as text when scraping the web? (Ruby & Nokogiri)
I use this code in my Ruby + Mechanize (Nokogiri) script:
On a forum, the HTML of a thread title looks like this: <a href="showthread.php?t=233891">&amp;lt;/body> in footer?</a>
And this is the string I get from XPath: &lt;/body> in footer?
In the web browser I can see </body> in footer.
How can I do this for all HTML escape characters/entities?
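For reference, here is a minimal sketch of converting between the two forms using only Ruby's standard library. The input string is a made-up example, not output from the script above. As far as I know, `CGI.unescapeHTML` only decodes a handful of named entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`) plus numeric references, so it may not cover every entity a forum page can contain. With Nokogiri itself, a node's `text` should give the decoded characters and `inner_html` the markup with entities still in place.

```ruby
require "cgi"

# Hypothetical string, as it might come back from an XPath query.
raw = "&lt;/body&gt; in footer?"

# Decode entities into the characters a browser would display.
decoded = CGI.unescapeHTML(raw)

# Going the other way: turn literal markup characters back into entities.
encoded = CGI.escapeHTML(decoded)

puts decoded
puts encoded
```

Which direction is needed depends on whether the goal is the text as the browser renders it or the text as it appears in the page source.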