Implements dpctl.tensor.repeat, dpctl.tensor.tile #1381
View rendered docs @ https://intelpython.github.io/dpctl/pulls/1381/index.html
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_40 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_41 ran successfully.
Array API standard conformance tests for dpctl=0.14.6dev4=py310ha25a700_44 ran successfully.
        hev.wait()
    else:
        repeats = dpt.asarray(repeats, dtype="i8", sycl_queue=exec_q)
        if not dpt.all(repeats >= 0):
Once we have tensor.min this could be made more efficient: dpt.min(repeats) >= 0
@ndgrigorian Since I think we must support
Doing this will make implementing more accumulators convenient
- Also adds a check that the sole element of a length 1 tuple is an integer before proceeding to the scalar case
52a44a9 to 459d209
Array API standard conformance tests for dpctl=0.14.6dev5=py310ha25a700_10 ran successfully.
Thank you @ndgrigorian! Looks great.
Deleted rendered PR docs from intelpython.github.com/dpctl, latest should be updated shortly. 🤞
This pull request implements dpctl.tensor.repeat, as well as the changes necessary to include it in dpctl.

This function repeats the elements of an array along a given axis, and accepts integers, tuples, and usm_ndarrays for the number of repetitions. The basic case, where repeats is a scalar, is implemented as _repeat_by_scalar in the _tensor_impl submodule. The more complicated case, where repeats is a tuple, is implemented as _repeat_by_sequence.

To implement _repeat_by_sequence, kernels for cumulative sums were moved into a separate header, and a 1D cumulative sum of general integers (rather than just nonzero) was added.

An example of the new functionality:

dpctl.tensor.tile is also implemented as a mostly top-level function.
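
To illustrate why a cumulative sum over general integers is useful for the sequence case, here is a minimal pure-Python sketch of the idea for 1D inputs. It is not the dpctl kernel: the helper names repeat_by_sequence, repeat_by_scalar, and tile_1d are hypothetical, plain lists stand in for usm_ndarrays, and the real implementation may differ in how it computes and uses the offsets.

```python
from itertools import accumulate


def repeat_by_sequence(values, repeats):
    """Repeat values[i] repeats[i] times, placing output via cumulative-sum offsets.

    Hypothetical reference helper, not part of dpctl.
    """
    if len(values) != len(repeats):
        raise ValueError("values and repeats must have the same length")
    if any(r < 0 for r in repeats):
        raise ValueError("repeats must be non-negative")

    # Inclusive cumulative sum: ends[i] is one past the last output slot
    # that belongs to values[i].
    ends = list(accumulate(repeats))
    total = ends[-1] if ends else 0

    out = [None] * total
    for i, value in enumerate(values):
        start = ends[i] - repeats[i]  # first output slot for values[i]
        for j in range(start, ends[i]):
            out[j] = value
    return out


def repeat_by_scalar(values, count):
    """Scalar case: every element is repeated the same number of times."""
    if count < 0:
        raise ValueError("count must be non-negative")
    return [v for v in values for _ in range(count)]


def tile_1d(values, reps):
    """1D tiling: the whole sequence is repeated reps times."""
    return list(values) * reps
```

Because each element's output range depends only on the cumulative sum, the ranges can be filled independently of one another, which is presumably what makes this formulation a good fit for a parallel kernel.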