
How can I POST a large file and read the response as a stream with the Python requests library?

I have to send a large text file (2 million log lines) and receive the response as a stream. With a small batch (30,000 lines) it works: the streamed response starts immediately. With the large file nothing happens, and the request ends in a timeout if I specify one (without a timeout it hangs indefinitely).

import requests

url='http://********:59599'
header = {'specific-app-header':'01-fr-open-edition-03'}

def post(file_path):
    with open(file_path, "rb") as f:
        r = requests.post(
            url, headers=header, data=f, stream=True, proxies=None, timeout=180
            )
        for line in r.iter_lines(decode_unicode=True):
            print(line)

post('/tmp/30_000_lines.log')
# ... the lines are displayed
post('/tmp/2_000_000_lines.log')
# ...  requests.exceptions.ConnectionError: ('Connection aborted.', TimeoutError('timed out'))

My configuration is:

$python -m requests.help
{
   "implementation": {
    "name": "CPython",
    "version": "3.11.2"
  },
  "platform": {
    "release": "6.1.0-31-amd64",
    "system": "Linux"
  },
  "requests": {
    "version": "2.32.3"
  },
  "urllib3": {
    "version": "2.3.0"
  },
.....
}

I should point out that the equivalent curl command works perfectly (i.e. I get the streamed response):

curl -X POST http://********:59599 -H specific-app-header:01-fr-open-edition-03 -v --data-binary @/tmp/2_000_000_lines.log


* Connected to ** (*) port 59599 (#0)
> POST / HTTP/1.1
> Host: *:59599
> User-Agent: curl/7.88.1
> Accept: */*
> ezPAARSE-Predefined-Settings:01-fr-open-edition-03
> Content-Length: 530457889
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
> 
< HTTP/1.1 100 Continue
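
To find out whether the upload itself stalls, one option is to feed the file to requests through a generator that reports progress. The sketch below is only a diagnostic aid under stated assumptions: `iter_file_chunks` and `print_progress` are hypothetical helper names, and the 1 MiB chunk size is arbitrary. Note that when `data=` is a generator, requests sends the body with `Transfer-Encoding: chunked` and no `Content-Length`, unlike the curl trace above; whether this server accepts chunked uploads is unknown.

```python
import os
import sys


def iter_file_chunks(path, chunk_size=1024 * 1024, report=None):
    """Yield the file at `path` in binary chunks.

    `report(sent, total)` is called each time the consumer comes back for
    more data, i.e. after the previous chunk has been taken by the caller.
    """
    total = os.path.getsize(path)
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
            sent += len(chunk)
            if report is not None:
                report(sent, total)


def print_progress(sent, total):
    """Write a one-line progress counter to stderr (hypothetical helper)."""
    sys.stderr.write(f"\ruploaded {sent}/{total} bytes")
    sys.stderr.flush()
```

It could be used as `requests.post(url, headers=header, data=iter_file_chunks(file_path, report=print_progress), stream=True, timeout=180)`. requests sends the whole request body before it starts reading the response, whereas curl can read the response while it is still uploading. If the counter stops well before the total, that would be consistent with (but not proof of) a server that stops accepting input because nobody is reading the response it has already started to stream.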
  • Maybe the problem is on the server side, but we can't see the server code to check. Commented Sep 17 at 12:28
  • I answered directly in the post: with the curl command it works. Commented Sep 17 at 13:02
  • curl may use different parameters. It certainly sends a different User-Agent header, but there may be other differences as well. You could run curl with -v or --verbose (or even -v -v) to get more details about its connection. Commented Sep 17 at 13:16
  • You could also test both against http://httpbin.org/post, which sends back all the headers, cookies, etc. that the client sent to it. Or you could use a local proxy such as Charles or mitmproxy to compare the requests. Commented Sep 17 at 13:19
  • By the way, you can also use curl-cffi instead of requests, or as an adapter inside requests (curl-adapter). Commented Sep 17 at 13:30
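
For a 500 MB upload, a local echo server may be more practical than a public service for comparing what curl and requests actually send. The following is a minimal sketch using only the standard library; `EchoHeadersHandler` and `start_echo_server` are hypothetical names, and the handler only counts bodies announced with `Content-Length` (a chunked request body is not decoded).

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class EchoHeadersHandler(BaseHTTPRequestHandler):
    """Reply to any POST with the request headers and body size as JSON."""

    # With HTTP/1.1 the standard library answers "Expect: 100-continue" itself.
    protocol_version = "HTTP/1.1"

    def do_POST(self):
        if "Content-Length" not in self.headers:
            # Body length unknown (e.g. chunked upload): do not reuse the socket.
            self.close_connection = True
        length = int(self.headers.get("Content-Length", 0))
        # Drain the body in pieces so a large upload is never held in memory.
        remaining = length
        while remaining > 0:
            chunk = self.rfile.read(min(65536, remaining))
            if not chunk:
                break
            remaining -= len(chunk)
        payload = json.dumps(
            {"headers": dict(self.headers.items()), "body_bytes": length - remaining}
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_echo_server(port=0):
    """Start the server in a daemon thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), EchoHeadersHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After `server = start_echo_server()`, the chosen port is `server.server_address[1]`; pointing both the curl command and the requests script at `http://127.0.0.1:<port>/` returns the headers each client sent, so they can be compared side by side. This server replies only after the whole body has arrived, so it shows header differences but does not reproduce a server that streams its response during the upload.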

1 Answer

As furas suggested, I used the curl_cffi library. The script below works well.

import curl_cffi

url='http://********:59599'
header = {'specific-app-header':'01-fr-open-edition-03'}

def post(file_path):
    mp = curl_cffi.CurlMime()

    mp.addpart(
            name="files",
            filename="files.log",
            content_type="application/x-www-form-urlencoded",
            local_path=file_path,
            )
    resp = curl_cffi.post(url, headers=header, stream=True, multipart=mp)
    for line in resp.iter_lines():
        if line:
            print(line.decode())

post('../finder_result/oej/oej-2025-01-01.log')
# ... the lines are displayed
post('/tmp/2_000_000_lines.log')
# ... the lines are also displayed

Thank you for all your advice.
