2

I am trying to scrape a website using AngularJS / JavaScript.

I know that AngularJS provides an $http object with which I can make GET requests. I have previously used it to obtain JSON. Can I use the same object to obtain XML (HTML)? (I believe the response will be gzip-encoded.)

Thanks!

2 Answers

2

Getting an XML file with $http doesn't pass the response data into your callback as a DOM; response.data arrives as a plain string.

Use the example below as a pattern: convert the returned text with DOMParser, or with the appropriate ActiveX object in an old IE client.

var exampleModule = angular.module('exampleModule', []);
exampleModule.controller('exampleController', ['$scope', '$http', function ($scope, $http) {
    $http.get("example.xml").then(function (response) {
        var dom;
        if (typeof DOMParser != "undefined") {
            // Modern browsers: parse the response text into a DOM document.
            var parser = new DOMParser();
            dom = parser.parseFromString(response.data, "text/xml");
        }
        else {
            // Old IE: loadXML returns a boolean, so keep the document object itself.
            var doc = new ActiveXObject("Microsoft.XMLDOM");
            doc.async = false;
            doc.loadXML(response.data);
            dom = doc;
        }
        // Now dom is a DOMDocument with childNodes etc.
        return dom;
    });
}]);
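
If the same parsing is needed in several places, it could be pulled into a small helper and wired in through the `transformResponse` option of `$http`. This is only a sketch: `parseXml` is a hypothetical name, and the parser constructor is a parameter purely so the helper can be tried outside a browser. The `parsererror` check assumes the usual browser behaviour of reporting malformed XML inside the returned document instead of throwing.

```javascript
// Hypothetical helper: turns an XML string into a DOM document.
// ParserCtor defaults to the browser's DOMParser; it is only a parameter
// so a stand-in parser can be supplied where DOMParser does not exist.
function parseXml(text, ParserCtor) {
    var Ctor = ParserCtor || (typeof DOMParser !== "undefined" ? DOMParser : null);
    if (!Ctor) {
        throw new Error("No DOMParser available in this environment");
    }
    var dom = new Ctor().parseFromString(text, "text/xml");
    // Assumption: malformed XML shows up as a <parsererror> element.
    if (dom.getElementsByTagName("parsererror").length > 0) {
        throw new Error("Response is not well-formed XML");
    }
    return dom;
}

// Possible usage inside a controller (assumes $http is injected):
// $http.get("example.xml", {
//     // Wrapped in a function because transformResponse is also passed
//     // a headers getter, which must not be mistaken for ParserCtor.
//     transformResponse: function (data) { return parseXml(data); }
// }).then(function (response) {
//     // response.data is now the parsed document rather than a string.
// });
```

Supplying `transformResponse` in the request config replaces the default transforms for that request, which is acceptable here because the JSON handling is not wanted for XML.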

-1

You should be able to use $http to get response data other than JSON. The $http documentation explains that one of the default response transforms is: "If JSON response is detected, deserialize it using a JSON parser." However, if you request something else (for example, an HTML template), response.data should hold that HTML as a string. In fact, Angular itself uses $http to pull down HTML for ngInclude, etc.
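
To make the behaviour described above concrete, here is a simplified sketch of a "parse it if it looks like JSON, otherwise hand back the string" transform. It is an illustration only, not AngularJS's actual source: `transformResponseSketch` and its `contentType` parameter are hypothetical, and falling back to the raw string when parsing fails is a choice made for this example.

```javascript
// Simplified illustration of a JSON-detecting response transform.
// Not the real AngularJS implementation; names are hypothetical.
function transformResponseSketch(data, contentType) {
    if (typeof data !== "string") {
        return data;
    }
    var trimmed = data.trim();
    var declaredJson = typeof contentType === "string" &&
        contentType.indexOf("application/json") === 0;
    // Crude shape check: the text is wrapped in {...} or [...].
    var looksLikeJson = /^\{[\s\S]*\}$/.test(trimmed) ||
        /^\[[\s\S]*\]$/.test(trimmed);
    if (trimmed && (declaredJson || looksLikeJson)) {
        try {
            return JSON.parse(trimmed);
        } catch (e) {
            // Sketch-only choice: give back the original text on failure.
            return data;
        }
    }
    // XML, HTML and other text pass through unchanged as a string.
    return data;
}

var parsed = transformResponseSketch('{"a": 1}', "application/json");
var xmlText = transformResponseSketch("<root><item/></root>", "application/xml");
```

Here `parsed` is an object while `xmlText` is still the original string, which is why an XML or HTML response needs a separate parsing step.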

The gzip decoding (unzipping, in this case) should be handled by the browser before the response reaches $http.

1 Comment

If you're expecting the passed argument inside the success callback to be the responseXML property from a native XMLHttpRequest based on the file extension or MIME in the response header, it isn't.
