
I have code that searches this site (https://osu.ppy.sh/beatmapsets?m=0) for only the maps with the difficulty I want, but I can't get the loop right.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
from time import sleep

# Set link and path
driver = webdriver.Chrome(executable_path=r"C:\Users\Gabri\anaconda3\chromedriver.exe")
driver.get("https://osu.ppy.sh/beatmapsets?m=0")
wait = WebDriverWait(driver, 20)

# Variables, lists and counters
lista = {}
links, difficulty, maps2, final = [], [], [], []
line, column, = 1, 1
link_test = ''

n = int(input('insert how many maps do you want: '))
c = 1

# Open link in Chrome and search map by map
while True:
    if c > n:
        break
    sleep(1)
    wait.until(EC.element_to_be_clickable(
        (By.CSS_SELECTOR, f".beatmapsets__items-row:nth-of-type(1)>.beatmapsets__item:nth-of-type(1)")))
    games = driver.find_element_by_css_selector(
        f".beatmapsets__items-row:nth-of-type({line}) .beatmapsets__item:nth-of-type({column}) .beatmapset-panel__info-row--extra")
    actions = ActionChains(driver)
    actions.move_to_element(games).perform()
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".beatmaps-popup__group")))
    scores = driver.find_elements_by_css_selector(
        ".beatmaps-popup__group .beatmaps-popup-item__col.beatmaps-popup-item__col--difficulty")

    # This is the part I can't automate. For example, to show 6 maps I would have to add 2 more ifs,
    # changing the variables (line) and (column) accordingly.

    # I would like a loop with 'while' or 'for ... in', but I don't know how to write it.
    # I tried asking 'how many maps do you want?' before the code starts, so that number would be how many times the code runs,
    # but it didn't work =(

    if c % 2 != 0:
        column = 2
        if c % 2 == 0:
            line += 1
    else:
        line += 1
        column = 1

    # Convert the difficulty strings to floats
    for score in scores:
        a = score.text
        b = a.replace(',', '.')
        difficulty.append(float(b))

    # Save the link of the map currently being printed in the list 'links'
    games.click()
    sleep(3)
    link_test = driver.current_url
    links.append(link_test)
    link_test = ''
    driver.back()

    # Dict with map, link and difficulty
    lista = {
        'map': f"{c}",
        'link': f"{links}",
        'difficulty': f"{difficulty}"}
    c += 1
    # Print each map in dict 'lista'
    print(f"Map: {lista['map']}\nLink: {links}\nDifficulty: {lista['difficulty']}\n")

    # This part is my filter: if a map has difficulty 6.00 or more, it is added to the list 'final' for download
    for b in difficulty:
        if b >= 6.00:
            # Printing the link here raised "TypeError: unhashable type: 'list'", and this slicing was the way I found to solve it
            # I know it is not the best way to solve this error, but at least I tried =,)
            xam = str(links[0])
            xam1 = xam.replace("'", '')
            xam2 = xam1.replace("[", '')
            xam3 = xam2.replace("]", '')
            final.append(xam3)

    # Clear all lists so the dict 'lista' has no duplicate items when the next map is selected
    difficulty.clear()
    lista.clear()
    links.clear()

# Print how many maps with difficulty 6.00 or more have been found
print(f'There are {len(sorted(set(final)))} maps to download')

# This question is for a future download step. I'm still coding that part, so you can ignore this =3
pergunta = input('Do you want to download them? \n[ Y ]\n[ N ]\n>>> ').lower().strip()

# Remove duplicate links and show all the filtered links
if pergunta == 'y':
    for x in final:
        maps2.append(x)
    print(sorted(set(maps2)))

In the if part, I need help making it automatic, in a way that doesn't use as many ifs as I did. Maybe with variables that increment themselves with 'v += n'? I don't know ;-;

PS: If you find any logic errors or a way to optimize my code, I will be happy to learn and fix it.

  • I have seen this posted at least 3 times. What did not work for you in the last 2 attempts? Commented May 26, 2021 at 14:26
  • @cruisepandey In the last 2 posts I didn't try anything with this part because I was focused on solving other problems. I was waiting for someone to help me before, but today I'm trying to solve this problem myself. Right now I'm trying to make a simple self-incrementing variable with +=. If I make progress, I will edit the code and explain what I did. Commented May 26, 2021 at 14:40
  • Is there any pattern to line and column as the maps_quantity number increases? It appears to be random from the code snippet above. Commented May 26, 2021 at 14:49
  • @JD2775 Yes, they work like coordinates: line 1 column 1 is the 1st map, line 1 column 2 is the 2nd map, line 2 column 1 is the 3rd map... I followed the layout of the page, which has two columns and several lines. Commented May 26, 2021 at 14:52
  • @JD2775 See this example --> imgur.com/a/NtbBxXL Commented May 26, 2021 at 14:58

2 Answers


You're doing way more work than you have to. When you visit the page in a browser and log your network traffic, every time you scroll down to load more beatmaps you'll see XHR (XMLHttpRequest) HTTP GET requests being made to a REST API. The response is JSON and contains all the beatmap information you could ever want. All you need to do is imitate that HTTP GET request, no Selenium required:

def get_beatmaps():
    import requests

    url = "https://osu.ppy.sh/beatmapsets/search"

    params = {
        "m": "0",
        "cursor[approved_date]": "0",
        "cursor[_id]": "0"
    }

    while True:
        # Send the cursor params so each request asks for the next page
        response = requests.get(url, params=params)
        response.raise_for_status()

        data = response.json()

        cursor_id = data["cursor"]["_id"]
        if cursor_id == params["cursor[_id]"]:
            break
        
        yield from data["beatmapsets"]
        params["cursor[approved_date]"] = data["cursor"]["approved_date"]
        params["cursor[_id]"] = cursor_id


def main():
    from itertools import islice

    num_beatmaps = 10 # Get info for first ten beatmaps

    beatmaps = list(islice(get_beatmaps(), num_beatmaps))

    for beatmap in beatmaps:
        print("{} - {}".format(beatmap["artist"], beatmap["title"]))
        for version in beatmap["beatmaps"]:
            print("    [{}]: {}".format(version["version"], version["difficulty_rating"]))
        print()

    return 0


if __name__ == "__main__":
    import sys
    sys.exit(main())

Output:

Aitsuki Nakuru - Monochrome Butterfly
    [Gibune's Insane]: 4.55
    [Excitement]: 5.89
    [Collab Extra]: 5.5
    [Hard]: 3.54
    [Normal]: 2.38

Sweet Trip - Chocolate Matter
    [drops an end to all this disorder]: 4.15
    [spoken & serafeim's hard]: 3.12

Aso Natsuko - More-more LOVERS!!
    [SS!]: 5.75
    [Sonnyc's Expert]: 5.56
    [milr_'s Hard]: 3.56
    [Dailycare's Insane]: 4.82

Takayan - Jinrui Mina Menhera
    [Affection]: 4.43
    [Normal]: 2.22
    [Narrative's Hard]: 3.28

Asaka - Seize The Day (TV Size)
    [Beautiful Scenery]: 3.7
    [Kantan]: 1.44
    [Seren's Oni]: 3.16
    [XK's Futsuu]: 2.01
    [ILOVEMARISA's Muzukashii]: 2.71
    [Xavy's Seize The Moment]: 4.06

Swimy - Acchi Muite (TV Size)
    [Look That Way]: 4.91
    [Azu's Cup]: 1.72
    [Platter]: 2.88
    [Salad]: 2.16
    [Sya's Rain]: 4.03

Nakazawa Minori (CV: Hanazawa Kana) - Minori no Zokkon Mirai Yohou (TV Size)
    [Expert]: 5.49
    [Normal]: 2.34
    [Suou's Hard]: 3.23
    [Suou's Insane]: 4.38
    [Another]: 4.56

JIN - Children Record (Re:boot)
    [Collab Hard]: 3.89
    [Maki's Normal]: 2.6
    [hypercyte & Seto's Insane]: 5.01
    [Kagerou]: 6.16

Coalamode. - Nemophila (TV Size)
    [The Hidden Dungeon Only I Can Enter]: 3.85
    [Silent's Hard]: 3
    [Normal]: 2.29

MISATO - Necro Fantasia
    [Lunatic]: 6.06

>>>

The way this example is written now, it grabs the first ten beatmaps from the API, prints the artist and title, and the name and difficulty of each version of that beatmap. You can change it to suit your needs, and filter the output based on difficulty.
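
As a rough idea of what such a filter could look like, here is a minimal sketch that keeps only the beatmap sets with at least one version at or above a threshold. The 6.0 threshold comes from the question. The function name `filter_hard_beatmapsets`, the `"id"` key and the `https://osu.ppy.sh/beatmapsets/<id>` link pattern are assumptions on my part, so check them against the real JSON before relying on them. The sample data is hand-written, not API output.

```python
MIN_DIFFICULTY = 6.0  # threshold taken from the question


def filter_hard_beatmapsets(beatmapsets, min_difficulty=MIN_DIFFICULTY):
    """Return links for sets with at least one version >= min_difficulty."""
    links = []
    for beatmapset in beatmapsets:
        ratings = [version["difficulty_rating"] for version in beatmapset["beatmaps"]]
        if any(rating >= min_difficulty for rating in ratings):
            # Assumption: each set has an "id" and its page is /beatmapsets/<id>
            links.append("https://osu.ppy.sh/beatmapsets/{}".format(beatmapset["id"]))
    return links


# Hand-written sample shaped like the fields used above (not real API data)
sample = [
    {"id": 1, "beatmaps": [{"version": "Hard", "difficulty_rating": 3.54}]},
    {"id": 2, "beatmaps": [{"version": "Normal", "difficulty_rating": 2.6},
                           {"version": "Kagerou", "difficulty_rating": 6.16}]},
]

print(filter_hard_beatmapsets(sample))
```

With the generator from the example, the call would be something like `filter_hard_beatmapsets(islice(get_beatmaps(), num_beatmaps))`. Each set appears only once in the result, so no de-duplication with `set()` is needed.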

That being said, I don't know anything about OSU or beatmaps. If you could describe what the final output should actually look like, I can tailor my solution.


3 Comments

Wow, I was trying something totally different, but I like your code a lot, haha. In the final code (once I solve all my problems) I will use the links in the list 'maps2' for downloading, only for maps with difficulty 6.00 or more. I have a piece of code like yours, but I wrote mine this way because I tried to clean it up as much as possible to make it easier to read ;). I don't think you need to change your code, because I already have an idea of how to make a filter, but I always appreciate help if you want to give it =)
I'm slow to answer because I don't like to just Ctrl+C, Ctrl+V code into mine without learning about each line; I prefer to go through each line and understand how and why it is used. So don't be angry if I take a long time to show a sign of life, hehe, I'm just learning and absorbing.
No worries. Take a look at this answer I posted on a different question, where I go more in-depth on how to log your network traffic, finding API endpoints and imitating requests.

After a lot of testing, I solved all my problems (for now, hehe). I just added:

    if c % 2 != 0:
        column = 2
        if c % 2 == 0:
            line += 1
    else:
        line += 1
        column = 1

I'm so thankful to everyone who helped me =)))
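
For anyone reading later: the line and column can also be computed directly from the map number, without any ifs (the inner `if c % 2 == 0` above can never be true, because it sits inside the `c % 2 != 0` branch). This is only a minimal sketch that assumes the two-column layout described in the comments; `grid_position` and `COLUMNS` are names made up for the illustration.

```python
COLUMNS = 2  # assumption: the page shows two maps per line


def grid_position(index, columns=COLUMNS):
    """Convert a 1-based map number into a 1-based (line, column) pair."""
    # divmod on the 0-based index gives (full lines before it, offset in the line)
    line, column = divmod(index - 1, columns)
    return line + 1, column + 1


for c in range(1, 7):
    line, column = grid_position(c)
    print(f"map {c}: line {line}, column {column}")
```

In the Selenium loop this would replace the if/else block: call `line, column = grid_position(c)` at the start of each iteration and use the two values in the CSS selector.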
