Commit Graph

11282 Commits (0faf7b17a166342da1263f11d7c65dcd58116f20)
 

Author SHA1 Message Date
bunnei 0faf7b17a1
Merge pull request #2392 from lioncash/swap
common/swap: Minor cleanup and improvements to byte swapping functions
2019-04-12 21:52:16 +07:00
Lioncash 0d8ef2d3b9 common/swap: Improve codegen of the default swap fallbacks
Uses arithmetic that can be identified more trivially by compilers for
optimizations. e.g. Rather than shifting the halves of the value and
then swapping and combining them, we can swap them in place.

e.g. for the original swap32 code on x86-64, clang 8.0 would generate:

    mov     ecx, edi
    rol     cx, 8
    shl     ecx, 16
    shr     edi, 16
    rol     di, 8
    movzx   eax, di
    or      eax, ecx
    ret

while GCC 8.3 would generate the ideal:

    mov     eax, edi
    bswap   eax
    ret

now both generate the same optimal output.

MSVC used to generate the following with the old code:

    mov     eax, ecx
    rol     cx, 8
    shr     eax, 16
    rol     ax, 8
    movzx   ecx, cx
    movzx   eax, ax
    shl     ecx, 16
    or      eax, ecx
    ret     0

Now MSVC also generates a similar, but equally optimal result as clang/GCC:

    bswap   ecx
    mov     eax, ecx
    ret     0

====

In the swap64 case, for the original code, clang 8.0 would generate:

    mov     eax, edi
    bswap   eax
    shl     rax, 32
    shr     rdi, 32
    bswap   edi
    or      rax, rdi
    ret

(almost there, but still missing the mark)

while, again, GCC 8.3 would generate the more ideal:

    mov     rax, rdi
    bswap   rax
    ret

now clang also generates the optimal sequence for this fallback as well.

This is a case where MSVC unfortunately falls short, despite the new
code, this one still generates a doozy of an output.

    mov     r8, rcx
    mov     r9, rcx
    mov     rax, 71776119061217280
    mov     rdx, r8
    and     r9, rax
    and     edx, 65280
    mov     rax, rcx
    shr     rax, 16
    or      r9, rax
    mov     rax, rcx
    shr     r9, 16
    mov     rcx, 280375465082880
    and     rax, rcx
    mov     rcx, 1095216660480
    or      r9, rax
    mov     rax, r8
    and     rax, rcx
    shr     r9, 16
    or      r9, rax
    mov     rcx, r8
    mov     rax, r8
    shr     r9, 8
    shl     rax, 16
    and     ecx, 16711680
    or      rdx, rax
    mov     eax, -16777216
    and     rax, r8
    shl     rdx, 16
    or      rdx, rcx
    shl     rdx, 16
    or      rax, rdx
    shl     rax, 8
    or      rax, r9
    ret     0

which is pretty unfortunate.
2019-04-12 00:07:39 +07:00
bunnei ea80e2bc57
Merge pull request #2235 from ReinUsesLisp/spirv-decompiler
vk_shader_decompiler: Implement a SPIR-V decompiler
2019-04-11 21:54:23 +07:00
bunnei 83a2fb3c3a
Merge pull request #2360 from lioncash/svc-global
kernel/svc: Deglobalize the supervisor call handlers
2019-04-11 21:50:05 +07:00
bunnei e2f2155dab
Merge pull request #2388 from lioncash/constexpr
kernel: Make handle type declarations constexpr
2019-04-11 21:49:45 +07:00
bunnei c0b2b7020d
Merge pull request #2387 from FernandoS27/fast-copy-relax
gl_rasterizer_cache: Relax restrictions on FastCopySurface
2019-04-11 21:49:21 +07:00
Lioncash 66b73fd399 common/swap: Mark byte swapping free functions with [[nodiscard]] and noexcept
Allows the compiler to inform when the result of a swap function is
being ignored (which is 100% a bug in all usage scenarios). We also mark
them noexcept to allow other functions using them to be able to be
marked as noexcept and play nicely with things that potentially inspect
"nothrowability".
2019-04-11 20:42:44 +07:00
Lioncash 9cb4b7be40 common/swap: Simplify swap function ifdefs
Including every OS' own built-in byte swapping functions is kind of
undesirable, since it adds yet another build path to ensure compilation
succeeds on.

Given we only support clang, GCC, and MSVC for the time being, we can
utilize their built-in functions directly instead of going through the
OS's API functions.

This shrinks the overall code down to just

if (msvc)
  use msvc's functions
else if (clang or gcc)
  use clang/gcc's builtins
else
  use the slow path
2019-04-11 20:36:19 +07:00
Lioncash 598954436f common/swap: Remove 32-bit ARM path
We don't plan to support host 32-bit ARM execution environments, so this
is essentially dead code.
2019-04-11 20:15:47 +07:00
Lioncash 6300ccbc3c kernel: Make handle type declarations constexpr
Some objects declare their handle type as const, while others declare it
as constexpr. This makes the const ones constexpr for consistency, and
prevent unexpected compilation errors if these happen to be attempted to be
used within a constexpr context.
2019-04-11 16:34:53 +07:00
Fernando Sahmkow c9305959d3 gl_rasterizer_cache: Relax restrictions on FastCopySurface and FastLayeredCopySurface 2019-04-11 13:14:28 +07:00
bunnei 6951741a94
Merge pull request #2278 from ReinUsesLisp/vc-texture-cache
video_core: Implement API agnostic view based texture cache
2019-04-10 21:17:35 +07:00
bunnei 0371650bd7
Merge pull request #2372 from FernandoS27/fermi-fix
Correct Fermi Copy on Linear Textures.
2019-04-10 21:17:03 +07:00
ReinUsesLisp 75d23a3679 vk_shader_decompiler: Implement flow primitives 2019-04-10 14:20:25 +07:00
ReinUsesLisp 58ad8dfac6 vk_shader_decompiler: Implement most common texture primitives 2019-04-10 14:20:25 +07:00
ReinUsesLisp 4667ed8e22 vk_shader_decompiler: Implement texture decompilation helper functions 2019-04-10 14:20:25 +07:00
ReinUsesLisp 676172e20d vk_shader_decompiler: Implement Assign and LogicalAssign 2019-04-10 14:20:25 +07:00
ReinUsesLisp d316d248ab vk_shader_decompiler: Implement non-OperationCode visits 2019-04-10 14:20:25 +07:00
ReinUsesLisp b758c861b0 vk_shader_decompiler: Implement OperationCode decompilation interface 2019-04-10 14:20:25 +07:00
ReinUsesLisp fec4eb9776 vk_shader_decompiler: Implement Visit 2019-04-10 14:20:25 +07:00
ReinUsesLisp ca51f99840 vk_shader_decompiler: Implement labels tree and flow 2019-04-10 14:20:25 +07:00
ReinUsesLisp 13aa664f3f vk_shader_decompiler: Implement declarations 2019-04-10 14:20:25 +07:00
ReinUsesLisp ad53b233c5 vk_shader_decompiler: Declare and stub interface for a SPIR-V decompiler 2019-04-10 14:20:25 +07:00
ReinUsesLisp 970d9e57c8 video_core: Add sirit as optional dependency with Vulkan
sirit is a runtime assembler for SPIR-V
2019-04-10 14:20:25 +07:00
bunnei 97648f4841
Merge pull request #2345 from ReinUsesLisp/multibind
gl_rasterizer: Use ARB_multi_bind to update buffers with a single call per drawcall
2019-04-10 11:23:19 +07:00
bunnei 1312cf15d6
Merge pull request #2377 from lioncash/todo
kernel/server_session: Remove obsolete TODOs
2019-04-10 10:29:24 +07:00
Lioncash 08d507a196 kernel/server_session: Remove obsolete TODOs
These are holdovers from Citra.
2019-04-09 23:34:49 +07:00
bunnei ed9dba89d3
Merge pull request #2375 from FernandoS27/fix-ldc
Remove unnecessary bounding in LD_C
2019-04-09 21:23:24 +07:00
bunnei f46c3164e7
Merge pull request #2353 from lioncash/surface
yuzu/debugger: Remove graphics surface viewer
2019-04-09 21:20:02 +07:00
Fernando Sahmkow c9f35d96be Remove bounding in LD_C 2019-04-09 20:02:11 +07:00
bunnei 2598433f9c
Merge pull request #2354 from lioncash/header
video_core/texures/texture: Remove unnecessary includes
2019-04-09 19:19:41 +07:00
bunnei 61f63bb994
Merge pull request #1957 from DarkLordZach/title-provider
file_sys: Provide generic interface for accessing game data
2019-04-09 19:16:37 +07:00
bunnei 353a099481
Merge pull request #2366 from FernandoS27/xmad-fix
Correct XMAD mode, psl and high_b on different encodings.
2019-04-09 19:15:01 +07:00
bunnei 1a3098f11a
Merge pull request #2132 from FearlessTobi/port-4437
Port citra-emu/citra#4437: "citra-qt: Make hotkeys configurable via the GUI (Attempt 2)"
2019-04-09 18:08:30 +07:00
bunnei 71182643f7
Merge pull request #2370 from lioncash/qt-warn
yuzu/loading_screen: Resolve runtime Qt string formatting warnings
2019-04-09 17:21:18 +07:00
bunnei bc7e149835
Merge pull request #2369 from FernandoS27/mip-align
gl_backend: Align Pixel Storage
2019-04-09 17:20:43 +07:00
bunnei 088c7c1bb5
Merge pull request #2368 from FernandoS27/fix-lop
Correct LOP_IMM encoding
2019-04-09 17:19:56 +07:00
Fernando Sahmkow cd91e98dab Correct Fermi Copy on Linear Textures. 2019-04-09 14:13:58 +07:00
Hexagon12 b81260c65c
Merge pull request #2371 from lioncash/pagetable
kernel/process: Set page table when page table resizes occur.
2019-04-09 20:13:37 +07:00
Lioncash 2abf979c35 kernel/process: Set page table when page table resizes occur.
We need to ensure dynarmic gets a valid pointer if the page table is
resized (the relevant pointers would be invalidated in this scenario).

In this scenario, the page table can be resized depending on what kind
of address space is specified within the NPDM metadata (if it's
present).
2019-04-09 13:00:56 +07:00
Lioncash b73e433dff yuzu/loading_screen: Resolve runtime Qt string formatting warnings
In our error console, when loading a game, the strings:

QString::arg: Argument missing: "Loading...", 0
QString::arg: Argument missing: "Launching...", 0

would occasionally pop up when the loading screen was running. This was
due to the strings being assumed to have formatting indicators in them,
however only two out of the four strings actually have them.

This only applies the arguments to the strings that have formatting
specifiers provided, which avoids these warnings from occurring.
2019-04-09 10:49:38 +07:00
Fernando Sahmkow 9f16833097 gl_backend: Align Pixel Storage
This commit makes sure GL reads on the correct pack size for the
respective texture buffer.
2019-04-08 17:16:02 +07:00
Fernando Sahmkow 5c55ae4e18 Correct LOP_IMN encoding 2019-04-08 13:39:12 +07:00
Fernando Sahmkow 16adc735a5 Correct XMAD mode, psl and high_b on different encodings. 2019-04-08 13:01:17 +07:00
Lioncash b117ca5fce kernel/svc: Deglobalize the supervisor call handlers
Adjusts the interface of the wrappers to take a system reference, which
allows accessing a system instance without using the global accessors.

This also allows getting rid of all global accessors within the
supervisor call handling code. While this does make the wrappers
themselves slightly more noisy, this will be further cleaned up in a
follow-up. This eliminates the global system accessors in the current
code while preserving the existing interface.
2019-04-07 20:30:05 +07:00
bunnei f14328bf0a
Merge pull request #2300 from FernandoS27/null-shader
shader_cache: Permit a Null Shader in case of a bad host_ptr.
2019-04-07 17:58:27 +07:00
bunnei c2fee0e519
Merge pull request #2355 from ReinUsesLisp/sync-point
maxwell_3d: Reduce severity of ProcessSyncPoint
2019-04-07 17:56:11 +07:00
bunnei 06ece52cfe
Merge pull request #2359 from FearlessTobi/port-2-prs
Port citra-emu/citra#4718: "fix clang-format target when using a path with spaces on windows"
2019-04-07 17:54:57 +07:00
bunnei 8aaf418bd6
Merge pull request #2306 from ReinUsesLisp/aoffi
shader_ir: Implement AOFFI for TEX and TLD4
2019-04-07 17:52:30 +07:00
bunnei 3c1ce290d0
Merge pull request #2361 from lioncash/pagetable
core/memory: Minor simplifications to page table management
2019-04-07 17:50:31 +07:00