I have the following piece of code. It works and prints the data as it should. I'm trying (unsuccessfully) to put the results into a dataframe so I can export them to a csv file. I am looping through a json file and the results are correct; I just need the two columns that are printed to go into a dataframe instead. I took out the code that was causing the error so that it will run.

import json
import requests
import re 
import pandas as pd

data = {}
df = pd.DataFrame(columns=['subtechnique', 'name'])
df

RE_FOR_SUB_TECHNIQUE = r"(T\d+)\.(\d+)"
r = requests.get('https://raw.githubusercontent.com/mitre/cti/master/enterprise-attack/enterprise-attack.json', verify=False)

data = r.json()

objects = data['objects']
for obj in objects:
    ext_ref = obj.get('external_references',[])
    revoked = obj.get('revoked') or '*****'
    subtechnique = obj.get('x_mitre_is_subtechnique')
    name = obj.get('name')    
    for ref in ext_ref:
        ext_id = ref.get('external_id') or ''
        if ext_id:
            re_match = re.match(RE_FOR_SUB_TECHNIQUE, ext_id)
            if re_match:
                technique = re_match.group(1)
                sub_technique = re_match.group(2)
                print('{},{}'.format(technique+'.'+sub_technique, name))
     

Alternatively, is there an easier way to take each row produced by the loop and append it to a csv file?

Any help is appreciated.

Thanks

1 Answer

In this instance, it's likely easier to just write the csv file directly, rather than go through Pandas:

import csv

with open("enterprise_attack.csv", "w") as f:
    my_writer = csv.writer(f)
    for obj in objects:
        ext_ref = obj.get('external_references',[])
        revoked = obj.get('revoked') or '*****'
        subtechnique = obj.get('x_mitre_is_subtechnique')
        name = obj.get('name')
        for ref in ext_ref:
            ext_id = ref.get('external_id') or ''
            if ext_id:
                re_match = re.match(RE_FOR_SUB_TECHNIQUE, ext_id)
                if re_match:
                    technique = re_match.group(1)
                    sub_technique = re_match.group(2)
                    print('{},{}'.format(technique+'.'+sub_technique, name))
                    my_writer.writerow([technique+"."+sub_technique, name])

Note that this overwrites the output of any previous run. To keep the output of multiple runs, change the file mode to "a":

with open("enterprise_attack.csv", "a") as f:
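
If the rows are wanted in memory first (for a DataFrame, say), one option is to collect them in a list and write them out afterwards. The following is only a sketch: extract_rows and sample_objects are names made up for illustration, and the sample data is fake but shaped like data['objects'] in the question.

```python
import re

RE_FOR_SUB_TECHNIQUE = r"(T\d+)\.(\d+)"

def extract_rows(objects):
    """Return (sub-technique ID, name) pairs for IDs shaped like T1234.001."""
    rows = []
    for obj in objects:
        name = obj.get("name")
        for ref in obj.get("external_references", []):
            ext_id = ref.get("external_id") or ""
            match = re.match(RE_FOR_SUB_TECHNIQUE, ext_id)
            if match:
                # group(0) is the matched "T1234.001" text, i.e. the same
                # string as technique + '.' + sub_technique in the loop above
                rows.append((match.group(0), name))
    return rows

# Fake sample objects: only the first one has a sub-technique ID
sample_objects = [
    {"name": "Name1", "external_references": [{"external_id": "T1000.001"}]},
    {"name": "Name2", "external_references": [{"external_id": "T1000"}]},
    {"name": "Name3", "external_references": [{"source_name": "x"}]},
]
rows = extract_rows(sample_objects)
```

The resulting list can be passed to my_writer.writerows(rows), or, if pandas is preferred after all, to pd.DataFrame(rows, columns=['subtechnique', 'name']) followed by to_csv("enterprise_attack.csv", index=False).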

5 Comments

That works for the most part, but I'm getting an extra return in the output. How can I get rid of it?
What do you mean by a return?
When I run the code, there is a carriage return after each line and I need that gone (if possible). All examples are fake, just there to show the results. I get T123.1234, Name1 \n T125.3902, Name2 \n when it should be T123.1234 Name1 T125.1256 Name2. I can't show it here, but there is a blank line between the results and that blank line needs to be gone.
Ah I understand now. Change the first line to with open("enterprise_attack.csv", "w", newline='') as f:
Thank you Edunne, exactly what I needed
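
For anyone wondering why newline='' helps: csv.writer ends every row with "\r\n" itself. If the file is opened in text mode without newline='' on a platform that translates "\n" to "\r\n" (Windows, which is presumably what was in use here), each row ends in "\r\r\n" and many programs show that as a blank line. A small sketch with fake rows and a temporary file (both made up for illustration) shows what ends up on disk when newline='' is passed:

```python
import csv
import os
import tempfile

rows = [("T1000.001", "Name1"), ("T1000.002", "Name2")]  # fake sample rows

# Write to a throwaway file; newline="" turns off newline translation
path = os.path.join(tempfile.mkdtemp(), "demo.csv")
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the raw bytes back to see the exact line endings
with open(path, "rb") as f:
    raw = f.read()

print(raw)
```

Each row ends in a single "\r\n", with no doubled carriage return, so no blank lines appear between rows.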
