Performance tips:

- `ujson` can bring more speed as a drop-in replacement for `simplejson`
- since both `simplejson` and `xlrd` are pure Python, you may get performance improvements "for free" by switching to PyPy
- you may (or may not) see speed and memory usage improvements by switching to `openpyxl`, especially in "read-only" mode (a sketch follows this list)
- in the `excel_to_json` function, you are accessing the same values from `row_values` by index multiple times. Defining intermediate variables (e.g. defining `name = row_values[6]` and using the `name` variable later on) and avoiding accessing an element by index more than once might have a positive impact (see the sketch after this list)
- I'm not sure I completely understand the inner `for r in range(1, mw.nrows)` loop. Can you `break` once `row_values[0] == rowe[0]` evaluates to `True`?
- are you sure you need the `OrderedDict` and cannot get away with a regular `dict`? (there is a serious overhead for CPython versions prior to 3.6)
- instead of `.dumps()` and a separate function to dump a JSON string to a file, use the `.dump()` method to dump to a file directly
- make sure to use the `with` context manager when opening a file (the last two points are shown together in the final sketch after this list)
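
For the `openpyxl` point, here is a minimal sketch of read-only mode; the file name `data.xlsx` is just a placeholder:

```python
from openpyxl import load_workbook

# read_only=True streams rows lazily instead of loading the whole sheet into memory
workbook = load_workbook("data.xlsx", read_only=True)
sheet = workbook.active

for row in sheet.iter_rows(values_only=True):
    print(row)

workbook.close()  # read-only mode keeps the file handle open until closed
```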
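For the intermediate-variable and early-`break` suggestions, a rough sketch is below. I'm guessing at the surrounding structure, so the parameter names and the result shape are hypothetical; the sheet objects are assumed to be `xlrd` sheets:

```python
def excel_to_json(sheet, main_sheet):
    results = []
    for row_index in range(1, sheet.nrows):
        row_values = sheet.row_values(row_index)
        name = row_values[6]  # read the index once, reuse the variable
        if not name:
            continue

        for main_index in range(1, main_sheet.nrows):
            main_row = main_sheet.row_values(main_index)
            if row_values[0] == main_row[0]:
                results.append({"name": name, "match": main_row})
                break  # stop scanning once the matching row is found
    return results
```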
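And the last two points combined, writing straight to a file with `.dump()` inside a `with` block (the default file name is just an example):

```python
import json

def save_json(data, path="output.json"):
    # json.dump() writes directly to the file object; no intermediate string needed
    with open(path, "w") as output_file:
        json.dump(data, output_file, indent=4)
```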
Code Style notes:
- follow PEP8 guidelines in terms of whitespace usage in expressions and statements
- properly organize imports
- `if row_values[6] == "":` can be simplified to `if not row_values[6]:` (similar for some other `if` conditions later on)
- the `generate_json()` call should be put into the `if __name__ == '__main__':` block to avoid it being executed on import (see the sketch after this list)
- the `excel_to_json()` function is not quite easy to grasp - see if you can add a helpful docstring and/or comments to improve clarity and readability
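
A minimal sketch of the `__main__` guard, assuming `generate_json` takes no arguments:

```python
def generate_json():
    ...  # existing logic

if __name__ == '__main__':
    generate_json()  # runs only when executed as a script, not on import
```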
Other notes:
- improve variable naming. Variables like `sh`, `mw`, `rowe` are very close to being meaningless. I would also replace `wb` with the more explicit `workbook`
- have you considered using `pandas.read_excel()` to read the contents into a dataframe and then dumping it via `.to_json()` (after applying the desired transformations)? A rough sketch follows
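
Here the file names and the `"name"` column are placeholders, since I don't know your actual schema:

```python
import pandas as pd

# read the sheet into a DataFrame in one call
df = pd.read_excel("data.xlsx")

# apply the desired transformations here, e.g. dropping rows with an empty name
df = df[df["name"].notna()]

# dump the result directly to a JSON file
df.to_json("output.json", orient="records", indent=4)
```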