Revisions to Python JIT instantiation

added 148 characters in body

edited Jan 8, 2018 at 11:39

James Schinner

Update

New review following on from this one

deleted 37 characters in body

edited Jan 7, 2018 at 23:39

Jamal

I'm using Python 3.6.

This code is about instantiating object attributes just in time.

I'm working on a REST API, where I currently create Python objects from the JSON data returned by the server, though not all attributes of the created objects will be called upon in user code. Hence, only instantiating the attributes the user accesses via dot notation seems like a valid way to improve performance.

The core of the idea is to use Python scoping to mutate a list in the scope outside the function evaluate. I dislike this the most.

Do you think it's a good idea?

Do you have a better solution?

Do you think the complexity is worth it?

I will try to create more context here as to why I want this. I have implemented this idea in my package async_v20 found here. I have written docs here.

When the server sends JSON:

In some cases there may be 5000 objects to instantiate, or objects may be nested around 3 deep. Objects get instantiated in this module.

This example runs a benchmark:

RUNNING JitAttributes
TOOK  0.2956113815307617  seconds
RUNNING Attributes
TOOK  4.880501985549927  seconds

>>> jit_instance = JitAttributes(
...                 lazy_evaluate(list, range(a)),
...                 lazy_evaluate(list, range(b)),
...                 lazy_evaluate(list, range(c)),
...             )
>>> jit_instance.foo
# EVALUATING            <- Only evaluates once :)
# [0, 1, 2, 3, ... ]
>>> jit_instance.foo
# [0, 1, 2, 3, ... ]
>>> non_jit_instance = Attributes(
...                 list(range(a)),
...                 list(range(b)),
...                 list(range(c)),
...             )

>>> non_jit_instance.foo
# [0, 1, 2, 3, ...]
>>> non_jit_instance.foo
# [0, 1, 2, 3, ...]
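
The code under review is not reproduced in this revision view. As a rough sketch of the idea only, assuming `lazy_evaluate` returns a closure that caches its result by mutating a list in the enclosing scope, and assuming `JitAttributes` is an immutable tuple subclass with the hypothetical fields `foo`, `bar` and `baz`, a session like the one above could come from something like this:

```python
def lazy_evaluate(func, *args, **kwargs):
    """Return a zero-argument callable that runs func once and caches the result."""
    cache = []  # mutated from inside evaluate(); this is the scoping trick

    def evaluate():
        if not cache:
            print("EVALUATING")
            cache.append(func(*args, **kwargs))
        return cache[0]

    return evaluate


class JitAttributes(tuple):
    """Immutable container whose fields are evaluated on first access (assumed layout)."""

    __slots__ = ()

    def __new__(cls, foo, bar, baz):
        # Store the un-evaluated callables; nothing is computed here.
        return super().__new__(cls, (foo, bar, baz))

    @property
    def foo(self):
        return self[0]()

    @property
    def bar(self):
        return self[1]()

    @property
    def baz(self):
        return self[2]()


jit_instance = JitAttributes(
    lazy_evaluate(list, range(3)),
    lazy_evaluate(list, range(4)),
    lazy_evaluate(list, range(5)),
)
first = jit_instance.foo   # prints EVALUATING, then builds the list
second = jit_instance.foo  # served from the cache, nothing is printed
```

An empty list doubles as the "not evaluated yet" marker, which avoids `nonlocal` and a separate sentinel value, at the cost of the slightly odd-looking mutation from inside `evaluate`.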

Rollback to Revision 4

edited Jan 7, 2018 at 23:35

Jamal

I will try to create more context here as to why I want this. I have implemented this idea in my package async_v20 found here. I have written docs here.

Objects get instantiated in this module.

UPDATE

To quote my reviewer Austin Hastings

I suggest that you look at implementing a simpler approach, where you store (say) '_attrname' as the lazy-evaluating function, and use getattr to expand that value into 'attrname' when you fetch it the first time.

This idea is at odds with my immutable tuple classes, though the Zen of Python made it clear what to do.

Special cases aren't special enough to break the rules.

Although practicality beats purity.

So I got to work and made all my class definitions inherit from object. This meant changing all my __new__ methods to __init__ and a large refactor.
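
A minimal sketch of the quoted suggestion with a plain object subclass. The class name, the keyword-argument constructor and the attribute name `foo` are hypothetical and not taken from async_v20; the sketch also assumes the reviewer means the `__getattr__` hook:

```python
class LazyAttributes:
    """Stores a callable under '_name' and expands it into 'name' on first access."""

    def __init__(self, **thunks):
        for name, thunk in thunks.items():
            setattr(self, "_" + name, thunk)

    def __getattr__(self, name):
        # __getattr__ only runs when normal lookup fails,
        # i.e. before 'name' has been cached on the instance.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            thunk = self.__dict__["_" + name]
        except KeyError:
            raise AttributeError(name) from None
        value = thunk()
        setattr(self, name, value)  # later lookups bypass __getattr__
        return value


calls = []  # records how often the lazy function actually runs
obj = LazyAttributes(foo=lambda: calls.append("foo") or list(range(3)))
first = obj.foo   # evaluates and caches
second = obj.foo  # plain attribute lookup, no evaluation
```

Because the cached value lands in the instance `__dict__`, the second access costs the same as any ordinary attribute lookup.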

So was it worth it? Yep!

I wrote a benchmark script here

The benchmark consists of running an aiohttp server locally, serving some static JSON data, while timing 1000 simultaneous calls to OandaClient.get_candles().
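
The benchmark script itself is not shown here. As a sketch of its general shape only, the following profiles many concurrent coroutines with cProfile and sorts the stats by time. `fake_get_candles` is a hypothetical stand-in for OandaClient.get_candles(), and the local aiohttp server is replaced by an in-memory JSON payload, so the numbers it produces say nothing about the real package:

```python
import asyncio
import cProfile
import io
import json
import pstats

# Made-up static payload standing in for the JSON served by the local server.
PAYLOAD = json.dumps({"candles": [{"o": 1.0, "c": 1.1}] * 100})


async def fake_get_candles():
    """Hypothetical stand-in for OandaClient.get_candles(); no network involved."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return json.loads(PAYLOAD)


async def run_calls(count):
    # Schedule all calls at once and wait for every result.
    return await asyncio.gather(*(fake_get_candles() for _ in range(count)))


def profile_calls(count=1000):
    loop = asyncio.new_event_loop()
    profiler = cProfile.Profile()
    try:
        profiler.enable()
        results = loop.run_until_complete(run_calls(count))
        profiler.disable()
    finally:
        loop.close()
    report = io.StringIO()
    # "time" sorts by internal time spent in each function.
    pstats.Stats(profiler, stream=report).sort_stats("time").print_stats(5)
    return results, report.getvalue()


results, report = profile_calls()
print(report)
```

`run_until_complete` on a fresh loop is used instead of `asyncio.run` so the sketch also works on Python 3.6.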

Here are the results with cProfile stats sorted by time:

Prior to changes

After changes

OK, I need to come clean. In between those two changes I also implemented __slots__. All the changes took me most of my weekend and I'm not about to take it out for this test, so I'm not sure how much of an impact that made.

added 1923 characters in body

edited Jan 7, 2018 at 23:33

James Schinner

deleted 4 characters in body

edited Jan 5, 2018 at 23:43

James Schinner

Added context

edited Jan 5, 2018 at 23:05

James Schinner

edited tags

edited Jan 5, 2018 at 17:24

Peilonrayz ♦

asked Jan 5, 2018 at 13:10

James Schinner
