MERose

I wrote a short script for writing a list to CSV. It takes a list named outtext and saves it across multiple .csv files. Each file contains cutoff elements of the list (the last file may contain fewer), where cutoff is a user-specified number of elements/lines. This is useful when files must stay below a size limit (e.g. GitHub restricts file sizes to 100 MB). The files are numbered from 0 to n - 1, where n is the length of the list divided by cutoff, rounded up.
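To make the intended split concrete, here is a minimal sketch that only computes which file would receive how many rows, without writing anything to disk. The helper name `plan_chunks` and the use of `os.path.splitext` are assumptions for illustration and are not part of the script in question.

```python
import math
import os


def plan_chunks(num_rows, cutoff, output_file):
    """Return (filename, row_count) pairs for splitting num_rows rows.

    Hypothetical helper for illustration only: it computes the plan and
    writes nothing to disk.
    """
    base, ext = os.path.splitext(output_file)  # "path/name", ".csv"
    num_files = math.ceil(num_rows / cutoff)   # round up for a partial last file
    plan = []
    for filenumber in range(num_files):
        start = filenumber * cutoff
        rows = min(cutoff, num_rows - start)   # the last file may be shorter
        plan.append((f"{base}{filenumber}{ext}", rows))
    return plan


print(plan_chunks(350000, 150000, "path/name.csv"))
# [('path/name0.csv', 150000), ('path/name1.csv', 150000), ('path/name2.csv', 50000)]
```

With 350000 rows and a cutoff of 150000, three files are planned and only the last one is shorter than cutoff.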

However, the code looks quite clunky and is quite long, given that it performs a relatively simple task:

import csv
import math

# dummy content: each element is one row (a list of fields)
outtext = [["dummy"]] * 300000

# Parameters specified by user
output_file = "path/name.csv"
cutoff = 150000  # maximum number of rows per file

# Split "path/name.csv" into ["path/name", "csv"] so the file number can be inserted
output_file_tokens = output_file.rsplit('.', 1)
num_files = math.ceil(len(outtext) / cutoff)

for filenumber in range(num_files):
    output_file = output_file_tokens[0] + str(filenumber) + "." + output_file_tokens[1]
    # newline='' lets the csv module control line endings
    with open(output_file, 'w', newline='') as f:
        writer = csv.writer(f)
        for line in outtext[:cutoff]:
            writer.writerow(line)
    # drop the rows just written so the next file starts with the next chunk
    del outtext[:cutoff]
    print(">>> " + output_file + " successfully saved")

Is there room for improvement?

Save list over many csv files each with given number of lines