PR types

Others

PR changes

OPs

Describe

  • Extend the functional scope of GpuLaunchConfig
  • Move the function GetThreadConfig from elementwise_no_broadcast.cu.h to gpu_launch_config.h, so that it works together with struct GpuLaunchConfig and extends what that struct covers
  • RoundToPowerOfTwo is not a GPU device function and is more relevant to gpu_launch_config.h than to gpu_device_function.h, so move it to gpu_launch_config.h and replace it with a new implementation
  • The macro ELEMENTWISE_BLOCK_SIZE was originally defined only for elementwise computation. Since GetThreadConfig is merged into GetGpuLaunchConfig1D, the macro is moved to gpu_launch_config.h as well. To reflect its wider scope, it is renamed from ELEMENTWISE_BLOCK_SIZE to PREDEFINED_BLOCK_SIZE
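
For reviewers who want a concrete picture of how these pieces fit together, here is a minimal host-side sketch. It is not the code in this PR. The value 512 for the block size, the 64-thread floor, the struct LaunchConfig1D, and the function MakeLaunchConfig1D with its vec_size parameter are all hypothetical and chosen only for illustration. RoundToPowerOfTwo is written here as a plain bit-twiddling helper, which may differ from the actual replacement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumption: illustrative stand-in for the PREDEFINED_BLOCK_SIZE macro.
// The real value is not stated in this PR description.
constexpr int kPredefinedBlockSize = 512;

// Round n up to the nearest power of two. This runs on the host, which is
// why it does not need to live next to the device functions.
inline int RoundToPowerOfTwo(int n) {
  assert(n >= 1 && n <= (1 << 30));
  unsigned int v = static_cast<unsigned int>(n) - 1;
  v |= v >> 1;
  v |= v >> 2;
  v |= v >> 4;
  v |= v >> 8;
  v |= v >> 16;
  return static_cast<int>(v + 1);
}

// Hypothetical, simplified launch configuration for a 1D kernel.
struct LaunchConfig1D {
  int threads_per_block;
  int blocks_per_grid;
};

// Hypothetical helper: choose a 1D configuration for numel elements, where
// each thread handles vec_size elements.
inline LaunchConfig1D MakeLaunchConfig1D(int64_t numel, int vec_size = 1) {
  assert(numel > 0 && vec_size > 0);
  // Number of threads that actually have work to do.
  const int64_t active = (numel + vec_size - 1) / vec_size;
  int threads = kPredefinedBlockSize;
  if (active < kPredefinedBlockSize) {
    // Small inputs: shrink the block to a power of two, but keep at least
    // 64 threads (an assumed floor, not taken from this PR).
    threads = std::max(RoundToPowerOfTwo(static_cast<int>(active)), 64);
  }
  const int blocks = static_cast<int>((active + threads - 1) / threads);
  return {threads, blocks};
}
```

Under these assumptions, MakeLaunchConfig1D(1000) yields 512 threads per block and 2 blocks, while MakeLaunchConfig1D(100) shrinks the block to 128 threads in a single block. Keeping the block-size constant, the rounding helper, and the configuration routine in one header is the kind of grouping the bullets above describe.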