Vectorize Ramp in OpenGLCompute backend #6372

Merged
dsharletg merged 1 commit into halide:master from OmarEmaraDev:openglcompute-vectorize-ramp on Nov 3, 2021

OmarEmaraDev commented Oct 30, 2021

Currently, ramps are generated as a number of independent scalar
expressions that are finally gathered into a vector. For instance,
indexing in vectorized code is filled with ramps like the following:

```
int _11 = int(1) * int(1);
int _12 = _10 + _11;
int _13 = int(2) * int(1);
int _14 = _10 + _13;
int _15 = int(3) * int(1);
int _16 = _10 + _15;
ivec4 _17 = ivec4(_10, _12, _14, _16);
```

This patch simplifies the generated code by emitting a single multiply-add
expression on a constant vector containing an arithmetic sequence, so that
the ramp above is generated as follows:

```
ivec4 _11 = ivec4(0, 1, 2, 3) * int(1) + _10;
```
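As an illustration of how such an expression could be assembled, here is a minimal sketch in C++. It is not the code from this patch or from Halide's code generator: `emit_ramp_glsl` is a hypothetical helper, and it assumes the base and stride have already been printed as GLSL scalar expressions.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not Halide's actual code generator: builds the GLSL
// text for an integer ramp with `lanes` lanes as a single multiply-add.
// `base` and `stride` are assumed to be already-printed GLSL scalar
// expressions (for example "_10" and "int(1)").
std::string emit_ramp_glsl(const std::string &base,
                           const std::string &stride,
                           int lanes) {
    // GLSL integer vectors only come in widths 2, 3 and 4.
    assert(lanes >= 2 && lanes <= 4);

    // Constant lane indices: ivecN(0, 1, ..., N-1).
    std::string indices = "ivec" + std::to_string(lanes) + "(";
    for (int i = 0; i < lanes; i++) {
        if (i > 0) {
            indices += ", ";
        }
        indices += std::to_string(i);
    }
    indices += ")";

    // Lane i of the result evaluates to i * stride + base.
    return indices + " * " + stride + " + " + base;
}
```

Called as `emit_ramp_glsl("_10", "int(1)", 4)`, this sketch returns the right-hand side of the line shown above, `ivec4(0, 1, 2, 3) * int(1) + _10`.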

This is more performant due to vectorization, more compact, and more
readable because the base and the stride are easily identifiable.
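
The two forms should produce the same lanes, since lane `i` is `base + i * stride` in both. The following sketch models that on the CPU in C++ with a 4-lane array standing in for `ivec4`. The names `Lanes4`, `ramp_gathered` and `ramp_multiply_add` are made up for this illustration and do not appear in Halide.

```cpp
#include <array>
#include <cstddef>

// Stand-in for a GLSL ivec4 (an assumption made for this illustration).
using Lanes4 = std::array<int, 4>;

// Old form: each lane is computed as an independent scalar expression and
// the results are gathered into a vector at the end.
Lanes4 ramp_gathered(int base, int stride) {
    int lane1 = base + 1 * stride;
    int lane2 = base + 2 * stride;
    int lane3 = base + 3 * stride;
    return {base, lane1, lane2, lane3};
}

// New form: ivec4(0, 1, 2, 3) * stride + base, modeled lane by lane.
Lanes4 ramp_multiply_add(int base, int stride) {
    const Lanes4 indices = {0, 1, 2, 3};
    Lanes4 out{};
    for (std::size_t i = 0; i < out.size(); i++) {
        out[i] = indices[i] * stride + base;
    }
    return out;
}
```

For any base and stride that do not overflow `int`, `ramp_gathered(base, stride) == ramp_multiply_add(base, stride)` holds in this model.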

dsharletg merged commit 76315a2 into halide:master Nov 3, 2021

dsharletg (Contributor) commented:

Thanks, this makes sense and is a nice change.
