Tag Info

Hot answers tagged vectorization

9 votes

My blit function for my own graphics library

"it seems to be optimized quite well by the compiler, as it uses a lot of vectorized instructions" Well, kind of. GCC didn't use vectorization at all, and Clang used some of it but in a very strange ...

user555045

12.4k

answered Jul 3, 2021 at 10:31

8 votes

Accepted

Implementation of linear regression in Python

Just reviewing normalizeFeatures. Instead of a comment explaining what the function does, write a docstring. (Docstrings are available from the interactive ...

Gareth Rees

50.1k

answered Aug 22, 2017 at 9:26

8 votes

AVX Vectorized Multi-threaded Mandelbrot Renderer

There are some & 0xff operations that are not necessary: (aMask & (~iMask & 0xff)), because the bits reset by ...

user555045

12.4k

answered Feb 2, 2020 at 9:17

7 votes

My blit function for my own graphics library

There are a few things that can be lifted out of loops. The three screen-related variables, screen_height, screen_width, and ...

1201ProgramAlarm

7,821

answered Jul 3, 2021 at 16:23

6 votes

Accepted

Compute a numerical derivative

You can use np.roll to compute the centered differences as a vectorised operation rather than in a for loop: ...

301_Moved_Permanently

29.4k

answered Apr 6, 2018 at 9:10
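
The excerpt above is cut off before its code, so the following is only a minimal sketch of the idea it names, not the answer's own implementation. It assumes periodic, evenly spaced samples; the function name centered_difference and the test signal are hypothetical.

```python
import numpy as np

def centered_difference(y, h):
    """Centered difference (y[i+1] - y[i-1]) / (2h) with wrap-around ends."""
    y = np.asarray(y, dtype=float)
    # np.roll(y, -1)[i] is y[i+1] and np.roll(y, 1)[i] is y[i-1];
    # both wrap around, so the end points are only meaningful for periodic data.
    return (np.roll(y, -1) - np.roll(y, 1)) / (2.0 * h)

# Hypothetical test signal: one full period of sin(x), end point excluded.
x = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
h = x[1] - x[0]
dy = centered_difference(np.sin(x), h)
```

For non-periodic data the first and last entries would need separate one-sided differences, since np.roll wraps the array around.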

5 votes

Generic pixel class to seamlessly alpha-blend and convert between different pixel structure layouts

The code seems to get more and more questionable as we read downward. Starting at the bottom: ...

Quuxplusone

19.7k

answered Sep 5, 2018 at 2:58

5 votes

Accepted

Vectorized 16-bit addition in Standard C

You Have an XY-Problem: Let’s take a look at a very naïve loop to do this in C: ...

Davislor

9,115

answered Feb 26, 2023 at 23:45

5 votes

Calculating premium splits for policies

Remove your warnings ignore. The warnings are there for a reason. Remove all of your unused datetime imports. Any half-decent ...

Reinderien

71.1k

answered Dec 16, 2023 at 18:01

4 votes

Accepted

Converting Array of Floats to UINT8 (`char`) or UINT16 (`unsigned short`) Using SSE4

"why not load 4 packed __m128 and then store 32 Pixels at once" I think that's 16 pixels, but it's a good plan. The pack instructions were used inefficiently (in the linked question that was not really ...

user555045

12.4k

answered Oct 21, 2017 at 21:57

4 votes

Compute a numerical derivative

You can vectorize the calculation ...

Maarten Fabré

9,400

answered Apr 6, 2018 at 9:46

4 votes

Accepted

Determining whether a list of pathways and its genes are all included in another list

If you were to add a cat("hello") at the top of your all_in function, you would find that your function is called 25 times, once ...

flodel

3,555

answered Apr 19, 2018 at 3:49

4 votes

Accepted

Find the minimum value that data could have had before it was rounded

You can firstly change by difference != 0 and then use na.locf to replace NAs by last ...

m0nhawk

answered Apr 16, 2018 at 17:53

4 votes

Accepted

Vectorizing matrix operation instead of for loop in circular matrices

This is easy using numpy.roll, for example: zx = np.roll(x, 1) * (np.roll(x, 2) + np.roll(x, -1)) - x

Gareth Rees

50.1k

answered Oct 12, 2018 at 6:42
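
To make the index arithmetic in that one-liner explicit, here is a small sketch comparing it with an equivalent loop over a circular array. The loop is a reading of the np.roll expression, not the code from the original question, and the function names are hypothetical.

```python
import numpy as np

def circular_loop(x):
    """Explicit loop with modular (wrap-around) indexing."""
    n = len(x)
    out = np.empty(n, dtype=float)
    for i in range(n):
        out[i] = x[(i - 1) % n] * (x[(i - 2) % n] + x[(i + 1) % n]) - x[i]
    return out

def circular_roll(x):
    """Same computation: np.roll(x, k)[i] equals x[(i - k) % n]."""
    return np.roll(x, 1) * (np.roll(x, 2) + np.roll(x, -1)) - x

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
same = np.allclose(circular_loop(x), circular_roll(x))
```

A positive shift in np.roll moves elements towards higher indices, which is why np.roll(x, 1) lines up x[i-1] with position i.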

4 votes

Accepted

PyTorch Vectorized Implementation for Thresholding and Computing Jaccard Index

I think a different approach is needed to achieve a better performance. The current approach recomputes the Jaccard similarity from scratch for each possible threshold value. However, going from one ...

GZ0

2,361

answered Sep 21, 2019 at 5:51

4 votes

Accepted

Vectorizing equations for fsolve

Yes. You can simplify this code. I made two main changes. First, I forget that \$x_{50}\$ is known and write an equation for it just like all the other variables, then I replace that equation by the ...

David

answered Feb 25, 2020 at 4:58

4 votes

Accepted

Vectorized crosstabulation in Python for two arrays with two categories each

Algorithm: If you look at your code, and follow the if-elif part, you see that there are 4 combinations of i and ...

Maarten Fabré

9,400

answered Aug 4, 2020 at 8:54

4 votes

Snake game from the viewpoint of the snake

UX: It is not obvious what the user should do when the GUI opens up. You should display some simple instructions in the GUI, such as: ...

toolic

15.8k

answered Nov 29, 2024 at 15:47

3 votes

Converting Array of `Float32` (`float`) to Array of `UINT8` (`unsigned char`) Using AVX2

Harold's comment is correct. Consider what happens for float inputs like 5000000000 * 1.0. Conversion to int32_t with ...

Peter Cordes

3,761

answered Apr 27, 2019 at 16:59

3 votes

Remove outliers from a point cloud

Please add a docstring. Simplify_by_avg_weighted might be named points_near_centroid (or ...

J_H

42.3k

answered Dec 21, 2017 at 15:27

3 votes

Accepted

Normalise list of N dimensional numpy arrays

The trick is to use the keepdims parameter. ...

Seanny123

1,617

answered Jan 23, 2018 at 0:56
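
The excerpt does not show which function the keepdims parameter is passed to, so the following sketch assumes np.linalg.norm along the last axis. The array shape and values are hypothetical.

```python
import numpy as np

# Hypothetical data: a 2 x 3 grid of 4-dimensional vectors.
vectors = np.arange(1.0, 25.0).reshape(2, 3, 4)

# Without keepdims the reduced axis disappears and the result has shape (2, 3).
# With keepdims=True it stays as a length-1 axis, shape (2, 3, 1),
# which broadcasts directly against the original (2, 3, 4) array.
norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
unit = vectors / norms
```

The same pattern applies to other reductions such as sum or mean, which also accept keepdims.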

3 votes

Accepted

Function that fills a time series row-by-row by using the values in the row before

Use pandas masks: df_buy_sell[condition] lets you select all rows in the dataframe that match your condition. You could then apply your entire function block ...

mochi

1,144

answered Sep 5, 2017 at 4:46

3 votes

Rolling regressions in R

Here is another solution which uses the rollRegres package ...

Benjamin Christoffersen

answered Jan 24, 2019 at 23:58

3 votes

Accepted

Vectorization, 7-bit encoding

"My implementation works just fine, until we go over 2^31 due to compare not doing unsigned comparison." The "incompleteness" of the set of comparisons is an old problem, and the workarounds are also ...

user555045

12.4k

answered Jan 21, 2020 at 2:42
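
The excerpt is truncated before it names a workaround, so this is a sketch of one well-known technique, not necessarily the one the answer recommends. Flipping the sign bit of both operands makes a signed comparison give the unsigned result. It is written with NumPy integer views to stay runnable; with SSE/AVX intrinsics the same idea would be an XOR with a sign-bit constant before a signed compare such as _mm_cmpgt_epi32. The names SIGN_BIT and unsigned_greater are hypothetical.

```python
import numpy as np

SIGN_BIT = np.int32(-2**31)  # bit pattern 0x80000000

def unsigned_greater(a_u32, b_u32):
    """Unsigned a > b using only a signed 32-bit comparison."""
    # Reinterpret the bits as signed, then flip the top bit of each operand.
    # This maps the unsigned order 0..2^32-1 onto the signed order -2^31..2^31-1.
    a = a_u32.view(np.int32) ^ SIGN_BIT
    b = b_u32.view(np.int32) ^ SIGN_BIT
    return a > b

a = np.array([1, 2**31, 2**32 - 1, 5], dtype=np.uint32)
b = np.array([2**31, 1, 2**31, 5], dtype=np.uint32)

fixed = unsigned_greater(a, b)
naive = a.view(np.int32) > b.view(np.int32)  # wrong once values reach 2^31
```

Here naive disagrees with the true unsigned comparison for the operands at or above 2^31, while fixed matches it.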

3 votes

N-Body Optimization

Data layout: You have already experienced first-hand a disadvantage of using "1 physics vector = 1 SIMD vector" (such as __m256d pos), causing some ...

user555045

12.4k

answered Jun 2, 2020 at 19:50

3 votes

Vectorizing a working custom similarity function further using numpy

The comprehension statements are far too long and complicated and need to be broken up (but shouldn't exist at all). The easy vectorisation pass involves replacing all of the comprehensions with ...

Reinderien

71.1k

answered Dec 1, 2024 at 2:34

3 votes

Tips to Finetuning to increase the GFLOPS of a SIMD kernel

Fine-grained profiling results normally need to be taken with a grain of salt. A very common situation is that (e.g.) a load takes a while, but the time is attributed to a later instruction instead. I ...

user555045

12.4k

answered Jan 25, 2022 at 19:31

3 votes

Finding specific promotions from two columns

The main issue with numpy's vectorize function is that it's actually not vectorized. It's an unfortunate misnomer: The vectorize...

tdy

2,266

answered Sep 16, 2024 at 21:14
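
A small sketch can make the point concrete by counting how often the Python function runs under np.vectorize. The function pick_larger and the sample arrays are hypothetical and unrelated to the promotion data in the question.

```python
import numpy as np

calls = {"n": 0}

def pick_larger(a, b):
    """Scalar Python function; counts how many times it is invoked."""
    calls["n"] += 1
    return a if a > b else b

a = np.array([1.0, 5.0, 3.0, 8.0])
b = np.array([4.0, 2.0, 3.0, 9.0])

# np.vectorize still calls the Python function once per element.
looped = np.vectorize(pick_larger, otypes=[float])(a, b)

# A genuinely vectorised equivalent evaluates the condition on whole arrays.
vectorised = np.where(a > b, a, b)
```

After this runs, calls["n"] is at least the number of elements, whereas the np.where version never enters a Python-level function per element.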

3 votes

Accepted

C - SIMD Code to invert a transformation matrix

Putting the code through LLVM MCA or https://uica.uops.info/ yields one unsurprising (I think) result and one surprise (for me anyway). The not-surprise: The bottleneck (on Intel Skylake in this ...

user555045

12.4k

answered Oct 2, 2024 at 1:06

2 votes

Vectorized and Multi Threaded Image Convolution

Something to look into: data access patterns. You go across all image lines, processing the first few pixels (where the kernel straddles the boundary), then again across all image lines, processing the ...

Cris Luengo

7,021

answered Oct 24, 2017 at 3:30

Only top scored, non community-wiki answers of a minimum length are eligible

101 questions tagged vectorization

vectorization × 101
python × 50
performance × 44
numpy × 32
matlab × 21
c × 11
r × 11
pandas × 8
c++ × 6
python-3.x × 6
matrix × 6
simd × 6
statistics × 5
array × 4
image × 4
c# × 3
random × 3
mathematics × 3
simulation × 3
iteration × 3
machine-learning × 3
numerical-methods × 3
x86 × 3
sse × 3
beginner × 2
