Codeparrot/githubpairs & co #819

Open
Muennighoff wants to merge 37 commits into bigscience-workshop:eval-hackathon from Muennighoff:codeparrot/githubpairs
Muennighoff commented Aug 24, 2022

APPS requires mapping the dataset with the function below, which picks one of each example's solutions at random:

import json
import random

def add_solution_apps(example):
    # "solutions" holds a JSON-encoded list; keep one randomly chosen entry.
    example["solution"] = random.choice(json.loads(example["solutions"]))
    return example
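
A minimal usage sketch of the APPS mapping, assuming each record stores its solutions as a JSON-encoded list of strings. The record below is made up for illustration and is not taken from the dataset; the function is repeated so the snippet runs on its own.

```python
import json
import random


def add_solution_apps(example):
    # Decode the JSON list and keep one randomly chosen solution.
    example["solution"] = random.choice(json.loads(example["solutions"]))
    return example


# Hypothetical record (assumption: "solutions" is a JSON-encoded list of strings).
record = {
    "question": "Print the sum of two integers.",
    "solutions": json.dumps([
        "a, b = map(int, input().split())\nprint(a + b)",
        "print(sum(map(int, input().split())))",
    ]),
}

random.seed(0)  # only to make the random pick repeatable in this sketch
mapped = add_solution_apps(dict(record))
print(mapped["solution"])
```

With a Hugging Face `datasets` object this would presumably be applied as `ds.map(add_solution_apps)`. A record whose `solutions` string is empty would make `json.loads` raise, so such records would need to be filtered out first.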

XLCost requires mapping the dataset with the function below, which rebuilds line breaks and indentation from the tokenized code:

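def clean_code_xlcost(example):
    # Rebuild line breaks and tab indentation from the marker tokens in "code".
    clean_lines = []
    cur_indent = 0
    for line in example["code"].split("NEW_LINE"):
        # INDENT / DEDENT markers raise or lower the current nesting depth.
        cur_indent += line.count("INDENT")
        cur_indent -= line.count("DEDENT")
        line = line.replace("INDENT", "").replace("DEDENT", "")
        # Replace the STRNEWLINE and TABSYMBOL markers with a newline and a tab.
        line = line.replace("STRNEWLINE", "\n")
        line = line.replace("TABSYMBOL", "\t")
        clean_lines.append("\t" * cur_indent + line.strip())
    example["code_clean"] = "\n".join(clean_lines)
    return example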
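
To illustrate what the XLCost mapping produces, here is a small sketch run on a hand-written tokenized snippet. The input string is a hypothetical example in the NEW_LINE / INDENT / DEDENT style, not an actual dataset record; the function is repeated so the snippet runs on its own.

```python
def clean_code_xlcost(example):
    clean_lines = []
    cur_indent = 0
    for line in example["code"].split("NEW_LINE"):
        # Adjust nesting depth before emitting the line.
        cur_indent += line.count("INDENT")
        cur_indent -= line.count("DEDENT")
        line = line.replace("INDENT", "").replace("DEDENT", "")
        line = line.replace("STRNEWLINE", "\n")
        line = line.replace("TABSYMBOL", "\t")
        clean_lines.append("\t" * cur_indent + line.strip())
    example["code_clean"] = "\n".join(clean_lines)
    return example


# Hypothetical tokenized input, written by hand for illustration.
sample = {
    "code": "def add ( a , b ) : NEW_LINE INDENT return a + b "
            "NEW_LINE DEDENT print ( add ( 1 , 2 ) )"
}

cleaned = clean_code_xlcost(sample)["code_clean"]
print(cleaned)
# def add ( a , b ) :
# 	return a + b
# print ( add ( 1 , 2 ) )
```

The spacing between tokens is left untouched; only line breaks and leading tabs are restored, which is enough for this sample to parse as Python again.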
Muennighoff changed the base branch from main to eval-hackathon on August 24, 2022, 13:44