Or you can use pub.client:http to do that low-level HTTP and URL stuff for you, or its wrapper service pub.xml:loadXMLNode, which will also parse the XML/HTML for you. The Integration Server parser does a lot of work to handle bad HTML nicely (since most HTML is not legal XML).
You can then get at the contents of the parsed document using the other pub.xml services, such as pub.xml:queryXMLNode or pub.xml:xmlNodeToDocument.