0

I have been struggling to scrape a website with the code below, but it returns null records: printing the output does not show the requested data. The site I want to scrape is https://coinmarketcap.com/. There are several pages (64) that need to be loaded into the data frame.

import requests
import pandas as pd

url = "https://api.coinmarketcap.com/data-api/v3/topsearch/rank"

req= requests.post(url)
main_data=req.json()

Can anyone help me sort this out?

1
  • For websites that provide APIs, I would recommend using the API instead. They offer a free API, and their Terms of Use prohibit web scraping. See my example below on how to do it legally. Commented Sep 2, 2021 at 5:21

3 Answers

1

Use a GET request instead of a POST request in the call and it will work:

import requests
res=requests.get("https://api.coinmarketcap.com/data-api/v3/topsearch/rank")
main_data=res.json()
data=main_data['data']['cryptoTopSearchRanks']

For all pages: you can find this URL in the browser's Network tab. Open the XHR filter, reload the page, then go to the second page; the request URL will appear in the XHR tab, and you can copy it and call it yourself. I have shortened the URL here:

import requests
from bs4 import BeautifulSoup

res=requests.get("https://coinmarketcap.com/")
soup=BeautifulSoup(res.text,"html.parser")
# The last word of the last matching <p> is used as the total record count (the class name is site-generated and may change)
last_page=soup.find_all("p",class_="sc-1eb5slv-0 hykWbK")[-1].get_text().split(" ")[-1]
res=requests.get(f"https://api.coinmarketcap.com/data-api/v3/cryptocurrency/listing?start=1&limit={last_page}&sortBy=market_cap&sortType=desc&convert=USD,BTC,ETH&cryptoType=all&tagType=all&audited=false&aux=ath")

Now use the json method:

data=res.json()['data']['cryptoCurrencyList']
print(len(data))

Output:

6304
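If a single request for every record is too large, the same `start` and `limit` parameters could be used to fetch the listing in pages. The following is only a sketch: `page_windows` and `page_url` are hypothetical helper names, the page size of 100 is an assumption, and it assumes the endpoint accepts the reduced set of query parameters shown here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.coinmarketcap.com/data-api/v3/cryptocurrency/listing"


def page_windows(total, page_size=100):
    """Yield (start, limit) pairs that cover `total` records; start is 1-based."""
    for start in range(1, total + 1, page_size):
        yield start, min(page_size, total - start + 1)


def page_url(start, limit):
    """Build the listing URL for one page (parameter subset is an assumption)."""
    query = {
        "start": start,
        "limit": limit,
        "sortBy": "market_cap",
        "sortType": "desc",
        "convert": "USD",
    }
    return f"{BASE_URL}?{urlencode(query)}"


# 6304 records in pages of 100 gives 64 request URLs
urls = [page_url(start, limit) for start, limit in page_windows(6304)]
print(len(urls))
print(urls[-1])
```

Each URL can then be passed to `requests.get`, and the `cryptoCurrencyList` entries from every response appended to one list.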

2 Comments

Do you know that this is prohibited by their terms of use?
Sorry, I didn't know about that!
0

To read the data you need to use the GET method, not POST:

import requests
import pandas as pd
import json

url = "https://api.coinmarketcap.com/data-api/v3/topsearch/rank"

req = requests.get(url)
main_data = req.json()

print(main_data)  # without pretty printing
pretty_json = json.loads(req.text)
print(json.dumps(pretty_json, indent=4))  # with pretty print
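When the printed payload contains a null where data was expected, it can help to look up nested keys with a helper that reports which step is missing. This is a minimal sketch: `dig` is a hypothetical helper, and the two payloads are made up for illustration (the key names are the ones used in the first answer).

```python
def dig(payload, *keys):
    """Walk nested dict keys; raise ValueError naming the first missing or null step."""
    current = payload
    for depth, key in enumerate(keys):
        if not isinstance(current, dict) or current.get(key) is None:
            path = " -> ".join(keys[: depth + 1])
            raise ValueError(f"no data at {path}")
        current = current[key]
    return current


# Made-up payloads for illustration only
ok = {"data": {"cryptoTopSearchRanks": [{"symbol": "BTC"}]}}
empty = {"data": None}

print(dig(ok, "data", "cryptoTopSearchRanks"))
try:
    dig(empty, "data", "cryptoTopSearchRanks")
except ValueError as err:
    print("lookup failed:", err)
```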


0

Their terms of use prohibit web scraping. The site provides a well-documented API that has a free tier. Register and get an API token:

from requests import Session

url = 'https://pro-api.coinmarketcap.com/v1/cryptocurrency/listings/latest'
parameters = {
  'start':'1',
  'limit':'5000',
  'convert':'USD'
}
headers = {
  'Accept': 'application/json',
  'X-CMC_PRO_API_KEY': HIDDEN_TOKEN, # replace that with your API Key
}

session = Session()
session.headers.update(headers)

response = session.get(url, params=parameters)
data = response.json()
print(data)
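Since the question asks for a data frame, the nested records in the response could be flattened into one row per coin first. This is a rough sketch: `flatten` is a hypothetical helper, and the sample record is made up, with its shape (a nested `quote` object keyed by currency) assumed rather than taken from a real response.

```python
def flatten(record, parent="", sep="."):
    """Return a flat dict, joining the keys of nested dicts with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


# Made-up record; the real response layout is an assumption here
sample = {
    "name": "Bitcoin",
    "symbol": "BTC",
    "quote": {"USD": {"price": 50000.0, "market_cap": 9.4e11}},
}

rows = [flatten(sample)]
print(rows[0])
```

The resulting list of flat dictionaries can be passed to `pd.DataFrame(rows)`. pandas also provides `pd.json_normalize`, which performs a similar flattening directly on a list of nested records.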
