Commit graph

5481 commits

Author SHA1 Message Date
bunnei 268b5764c7
Merge pull request #6791 from ameerj/astc-opt
astc_decoder: Various performance and memory optimizations
2021-08-06 21:45:24 -07:00
bunnei f183668a87
Merge pull request #6799 from ameerj/vp9-fixes
nvdec: Fix VP9 reference frame refreshes
2021-08-06 17:46:46 -07:00
ameerj e3688f0627 vp9: Cleanup unused variables
With reference frames refreshes fix, we no longer need to buffer two frames in advance.
We can also remove other unused or otherwise unneeded variables.
2021-08-06 20:08:11 -04:00
ameerj a3f80a97a3 vp9: Fix reference frame refreshes
This resolves the artifacting when decoding VP9 streams.
2021-08-06 20:08:08 -04:00
yzct12345 2868d4ba84
nvdec: Implement VA-API hardware video acceleration (#6713)
* nvdec: VA-API

* Verify formatting

* Forgot a semicolon for Windows

* Clarify comment about AV_PIX_FMT_NV12

* Fix assert log spam from missing negation

* vic: Remove forgotten debug code

* Address lioncash's review

* Mention VA-API is Intel/AMD

* Address v1993's review

* Hopefully fix CMakeLists style this time

* vic: Improve cache locality

* vic: Fix off-by-one error

* codec: Async

* codec: Forgot the GetValue()

* nvdec: Address ameerj's review

* codec: Fallback to CPU without VA-API support

* cmake: Address lat9nq's review

* cmake: Make VA-API optional

* vaapi: Multiple GPU

* Apply suggestions from code review

Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>

* nvdec: Address ameerj's review

* codec: Use anonymous instead of static

* nvdec: Remove enum and fix memory leak

* nvdec: Address ameerj's review

* codec: Remove preparation for threading

Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>
2021-08-03 23:43:11 -04:00
yzct12345 f56d0db5bd
decoders: Optimize swizzle copy performance (#6790)
This makes UnswizzleTexture up to two times faster. It is the main bottleneck in NVDEC video decoding.
2021-08-02 11:18:58 -04:00
Fernando S 30f0b7cf31
Merge pull request #6720 from ameerj/vk-screenshot
renderer_vulkan: Implement screenshots
2021-08-01 13:31:33 +02:00
Ameer J db32c3762b
Merge pull request #6765 from ReinUsesLisp/y-negate-vk
vk_rasterizer: Flip viewport on Y_NEGATE
2021-08-01 01:47:37 -04:00
ameerj c439fc9be9 astc_decoder: Reduce workgroup size
This reduces the amount of over dispatching when there are odd dimensions (i.e. ASTC 8x5), which rarely evenly divide into 32x32.
2021-08-01 01:22:27 -04:00
ameerj 5ab8053511 astc_decoder: Compute offset swizzles in-shader
Alleviates the dependency on the swizzle table and a uniform which is constant for all ASTC texture sizes.
2021-08-01 01:22:26 -04:00
ameerj b2862e4772 astc_decoder: Make use of uvec4 for payload data 2021-07-31 22:28:04 -04:00
ameerj a75d70fa90 astc_decoder: Simplify Select2DPartition 2021-07-31 21:36:26 -04:00
ameerj 5665d05547 astc_decoder: Optimize the use EncodingData
This buffer was a list of EncodingData structures sorted by their bit length, with some duplication from the cpu decoder implementation.
We can take advantage of its sorted property to optimize its usage in the shader.

Thanks to wwylele for the optimization idea.
2021-07-31 21:36:26 -04:00
ameerj 15c0c213b1 astc.h: Move data to cpp implementation
Moves leftover values that are no longer used by the gpu decoder back to the cpp implementation.
2021-07-31 21:26:42 -04:00
bunnei 7530594602
Merge pull request #6759 from ReinUsesLisp/pipeline-statistics
renderer_vulkan: Add setting to log pipeline statistics
2021-07-30 11:18:52 -07:00
ReinUsesLisp b185567a03 vk_rasterizer: Flip viewport on Y_NEGATE
Matches OpenGL's behavior. I don't believe this register flips geometry,
but we have to try to match behavior on both backends.
2021-07-29 02:17:53 -03:00
ameerj 7ac99bb127 renderers: Add explicit invert_y bool to screenshot callback
OpenGL and Vulkan images render in different coordinate systems. This allows us to specify the coordinate system of the screenshot within each renderer
2021-07-28 21:46:08 -04:00
ameerj 75e7f54fb0 renderer_vulkan: Implement screenshots 2021-07-28 21:45:55 -04:00
ameerj 548bac8989 vk_blit_screen: Add public CreateFramebuffer method 2021-07-28 21:43:02 -04:00
ameerj 1e6c5d323d vk_blit_screen: Make Draw method more generic
Allows specifying the framebuffer and render area dimensions, rather than being hard coded for the render window.
2021-07-28 21:37:30 -04:00
ReinUsesLisp 3b006f4fe2 renderer_vulkan: Add setting to log pipeline statistics
Use VK_KHR_pipeline_executable_properties when enabled and available to
log statistics about the pipeline cache in a game.

For example, this is on Turing GPUs when generating a pipeline cache
from Super Smash Bros. Ultimate:

Average pipeline statistics
==========================================
Code size:       6433.167
Register count:    32.939

More advanced results could be presented, at the moment it's just an
average of all 3D and compute pipelines.
2021-07-27 21:29:24 -03:00
bunnei 92887a65f0
Merge pull request #6749 from lioncash/rtarget
render_target: Add missing initializer for size extent
2021-07-27 17:28:53 -07:00
Rodrigo Locatti ab206d6378
Merge pull request #6748 from lioncash/engine-init
video_core/engine: Consistently initialize rasterizer pointers
2021-07-27 16:17:20 -03:00
bunnei 2717e79c74
Merge pull request #6745 from lioncash/copies
video_core: Remove some unused variables
2021-07-27 11:38:32 -07:00
Lioncash 00e100de08 render_target: Add missing initializer for size extent
Everything else has a default constructor that does the straightforward
thing of initializing most members to a default value, except for the
size.

We explicitly initialize the size (and others, for consistency), to
prevent potential uninitialized reads from occurring. Particularly given
the largeish surface area that this struct is used in.
2021-07-27 07:41:21 -04:00
Lioncash f8964dd89a video_core/engine: Consistently initialize rasterizer pointers
Ensures all of the engines have consistent and deterministic
initialization of the rasterizer pointers.
2021-07-27 07:30:57 -04:00
Lioncash 8c82c594f0 vulkan_wrapper: Fix SetObjectName() always indicating objects as images
We should be using the passed in object type instead.
2021-07-27 07:19:15 -04:00
Lioncash ec56a17acd buffer_cache: Remove unused small_vector in CommitAsyncFlushesHigh()
Given this is non-trivial, the constructor is required to execute, so
this removes a bit of redundant codegen.
2021-07-27 06:24:44 -04:00
Lioncash 075a744e38 gl_shader_cache: Remove unused variable 2021-07-27 06:23:49 -04:00
Lioncash 296728ec46 vk_compute_pass: Remove unused captures
Resolves two compiler warnings.
2021-07-27 06:17:52 -04:00
bunnei d6c799494c
Merge pull request #6696 from ameerj/speed-limit-rename
general: Rename "Frame Limit" references to "Speed Limit"
2021-07-26 18:51:00 -07:00
Rodrigo Locatti 7511f65e31
Merge pull request #6741 from ReinUsesLisp/stream-remove
vk_stream_buffer: Remove unused stream buffer
2021-07-26 20:35:01 -03:00
Rodrigo Locatti 1a94b3f452
Merge pull request #6740 from K0bin/hvv-fallback
Handle allocation failure in Staging buffer
2021-07-26 20:34:44 -03:00
Robin Kertels 75050c788c
vk_staging_buffer_pool: Fall back to host memory when allocation fails 2021-07-26 23:37:18 +02:00
Rodrigo Locatti c6991fa900
Merge pull request #6728 from ReinUsesLisp/null-buffer-usage
vk_buffer_cache: Add transform feedback usage to null buffer
2021-07-26 18:30:45 -03:00
ReinUsesLisp 5f9a4817a5 vk_stream_buffer: Remove unused stream buffer
Remove unused file.
2021-07-26 18:19:53 -03:00
ReinUsesLisp 771dcb2a56 vk_compute_pass: Fix pipeline barrier for indexed quads
Use an index buffer barrier instead of a vertex input read barrier.
2021-07-26 05:51:09 -03:00
ReinUsesLisp 27ed6e7c2b vk_buffer_cache: Add transform feedback usage to null buffer
Fixes bad API usages on Vulkan.
2021-07-26 05:49:37 -03:00
bunnei 98b26b6e12
Merge pull request #6585 from ameerj/hades
Shader Decompiler Rewrite
2021-07-25 11:39:04 -07:00
bunnei 84b9c42642
Merge pull request #6690 from ReinUsesLisp/dma-clear-fixups
buffer_cache: Misc fixups related to buffer clears
2021-07-24 01:27:50 -04:00
ameerj c80ae87b4e renderer_base: Removed redundant settings
use_framelimiter was not being used internally by the renderers.
set_background_color was always set to true as there is no toggle for the renderer background color, instead users directly choose the color of their choice.
2021-07-23 22:10:01 -04:00
ameerj 9dfbc9bdce general: Rename "Frame Limit" references to "Speed Limit"
This setting is best referred to as a speed limit, as it involves the limits of all timing based aspects of the emulator, not only framerate.
This allows us to differentiate it from the fps unlocker setting.
2021-07-23 22:10:01 -04:00
ReinUsesLisp a55ff22900 vulkan/blit_image: Commit descriptor sets within worker thread
Fixes race condition caused. The descriptor pool is not thread safe, so
we have to commit descriptor sets within the same thread.
2021-07-22 21:51:40 -04:00
ReinUsesLisp f6796cad9c vulkan_device: Blacklist Volta and older from VK_KHR_push_descriptor
Causes crashes on Link's Awakening intro. It's hard to debug if it's our
fault due to bugs in validation layers.
2021-07-22 21:51:40 -04:00
ReinUsesLisp 3c6d440015 Revert "renderers: Disable async shader compilation"
This reverts commit 4a152767286717fa69bfc94846a124a366f70065.
2021-07-22 21:51:40 -04:00
ReinUsesLisp 8381490a04 opengl: Fix asynchronous shaders
Wait for shader to build before configuring it, and wait for the shader
to build before sharing it with other contexts.
2021-07-22 21:51:40 -04:00
ReinUsesLisp 258f35515d shader_environment: Receive cache version from outside
This allows us invalidating OpenGL and Vulkan separately in the future.
2021-07-22 21:51:40 -04:00
ameerj 56478bc9ac shader: Fix disabled attribute default values 2021-07-22 21:51:40 -04:00
ameerj c9528282d9 gl_device: Simplify GLASM setting logic 2021-07-22 21:51:40 -04:00
ReinUsesLisp e1ed218b41 renderer_opengl: Use ARB_separate_shader_objects
Ensures that states set for a particular stage are not attached to other
stages which may not need them.
2021-07-22 21:51:40 -04:00