Python - Socket error

Question

My code:

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("www.python.org", 80))
s.sendall(b"GET https://www.python.org HTTP/1.0\n\n")
print(s.recv(4096))
s.close()

Why does the output show this?

b'HTTP/1.1 500 Domain Not Found\r\nServer: Varnish\r\nRetry-After: 0\r\ncontent-type: text/html\r\nCache-Control: private, no-cache\r\nconnection: keep-alive\r\nContent-Length: 179\r\nAccept-Ranges: bytes\r\nDate: Tue, 11 Jul 2017 15:23:55 GMT\r\nVia: 1.1 varnish\r\nConnection: close\r\n\r\n\n\n\nFastly error: unknown domain \n\n\nFastly error: unknown domain: . Please check that this domain has been added to a service.'

How can I fix it?

GET https://www.python.org -- I think you want "GET /" instead.
– Brian Cain, Commented Jul 11, 2017 at 15:37
@BrianCain is correct. After the HTTP verb you should provide the relative path to the resource you wish to access. Because you connected to the domain, your requests are already going to www.python.org. If you continue to have issues, add the Host HTTP header.
– h0r53, Commented Jul 11, 2017 at 15:38
The issue may actually be that the resource in question is accessed over HTTPS. You have to do a bit more work when using a raw socket to connect to an HTTPS service.
– h0r53, Commented Jul 11, 2017 at 15:52

Steffen Ullrich · Accepted Answer · 2017-07-11 15:48:59Z

4

This is wrong on multiple levels:

To access an HTTPS resource you need to create a TLS connection (e.g. by wrapping the existing TCP connection with the ssl module's wrap_socket, with proper certificate checking etc.) and then send the HTTP request. Of course, the TCP connection in this case should go to port 443 (https), not 80 (http).
The HTTP request line should contain only the path, not the full URL.
Line endings must be \r\n, not \n.
You should also send a Host header, since many servers require it.
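
For readers who still want to see the raw-socket route, the four points above can be sketched roughly as follows. This is only a minimal illustration, not a robust client: the helper names build_get_request and fetch_https are made up for this example, and it assumes the server closes the connection once the HTTP/1.0 response is complete.

```python
import socket
import ssl


def build_get_request(host, path="/"):
    """Build a minimal HTTP/1.0 GET request as bytes (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.0".format(path),  # path only, not the full URL
        "Host: {}".format(host),         # many servers require a Host header
        "",                              # empty line terminates the headers
        "",
    ]
    # Every line, including the terminating empty one, ends with \r\n.
    return "\r\n".join(lines).encode("ascii")


def fetch_https(host, path="/", port=443, timeout=10):
    """Send the request over TLS and return the raw response bytes.

    Assumption: the server closes the connection after responding,
    so reading until recv() returns b"" collects the whole response.
    """
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            tls.sendall(build_get_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)


# Example call (needs network access):
# print(fetch_https("www.python.org")[:200])
```

Note that a single recv(4096), as in the question, may return only part of the response, which is why the sketch loops until the peer closes the connection.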

And that's only the request. Properly handling the response is a different topic.
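
To give a rough idea of what response handling involves, here is a deliberately simplified sketch that splits a raw response into status, headers and body. The function name parse_response is invented for this illustration, and the code ignores chunked transfer encoding, compression, redirects and repeated header fields, all of which a real client has to deal with.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into (status, reason, headers, body).

    Simplified on purpose: no chunked encoding, no decompression,
    and a repeated header keeps only its last value.
    """
    # Headers and body are separated by the first empty line.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line looks like: HTTP/1.1 500 Domain Not Found
    parts = lines[0].split(" ", 2)
    status = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise to lower case.
        headers[name.strip().lower()] = value.strip()
    return status, reason, headers, body


sample = (b"HTTP/1.1 500 Domain Not Found\r\n"
          b"Server: Varnish\r\n"
          b"Content-Length: 179\r\n"
          b"\r\n"
          b"Fastly error: unknown domain")
status, reason, headers, body = parse_response(sample)
print(status, reason, headers["server"])
```

Even this small example has to make decisions (character encoding, duplicate headers) that a library like requests already handles.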

I strongly recommend using an existing library like requests. HTTP(S) is considerably more complex than most people think after looking at only a few traffic captures.


5 Comments

DisappointedByUnaccountableMod Over a year ago

Here is the requests quickstart: docs.python-requests.org/en/master/user/quickstart/…

h0r53 Over a year ago

I highly recommend the requests library instead of raw sockets, unless you want to learn the hard way.

Steffen Ullrich Over a year ago

@Ch.Sohaib: Are you asking for sample code for requests? That would be print(requests.get('https://www.python.org').content). Or are you asking how to fix your code? I don't think that is worth it, since too much is wrong.

Steffen Ullrich Over a year ago

@Ch.Sohaib: I use stackoverflow.com more as a way to help others create the right code and learn that way, rather than to write code for others. I've pointed out several problems with your code, which primarily come from an insufficient understanding of how HTTP and HTTPS work. I recommend you first improve your understanding of HTTP(S) and try to fix the mentioned problems yourself. If you have specific problems with this I'm willing to help, but I won't just write the code for you. I recommend starting with plain HTTP and, once you manage that, continuing with HTTPS.

Ch. Sohaib Over a year ago

Ok no problem bro.

h0r53 · Accepted Answer · 2017-07-11 16:02:09Z

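import requests

# Fetch the page over HTTPS; requests handles TLS and certificate checks.
x = requests.get('https://www.python.org')
# The parenthesised form works on both Python 2.7 and Python 3.
print(x.text)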

With the requests library, HTTPS requests are very simple! If you're doing this with raw sockets, you have to do a lot more work to negotiate a cipher and so on. Try the above code (Python 2.7).

I would also note that, in my experience, Python is excellent for doing things quickly. If you are learning about networking and cryptography, try writing an HTTPS client on your own using sockets. If you want to automate something quickly, use the tools that are available to you. I almost always use requests for this type of task. As an additional note, if you're interested in parsing HTML content, check out the PyQuery library. I've used it to automate interaction with many web services.

Requests

PyQuery
