39

MP3,1 (& others?) SSE 4.2 emulation (to enable AMD Metal driver) | MacRumors...

 1 year ago
source link: https://forums.macrumors.com/threads/mp3-1-others-sse-4-2-emulation-to-enable-amd-metal-driver.2206682/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

MP3,1 (& others?) SSE 4.2 emulation (to enable AMD Metal driver)

Syncretic

macrumors 6502

Original poster

MouSSE (Partial SSE4.2 Emulator) Public Release
Current version: 0.95 (8 Jun 2021)
When referring folks to MouSSE, please just link directly to this post, which will always contain the latest version & info.
EDIT (8 Jun 2021/1 Sep 2021): Monterey changed things up, so a new version was required. Version 0.95 (attached to this post) supports Monterey (as well as earlier MacOS versions). No new functionality was added, so if you're already using MouSSE on a pre-Monterey MacOS, there is no need to "upgrade" to v0.95. (I posted this version on 8 Jun 2021, but I was too busy to update Post #1 until 1 Sep 2021, hence the two dates.)

EDIT (22 May 2020): I found what should be a fairly rare bug in v0.92, which is fixed in the current attached version (v0.93). This is a low-priority update, so if you have a working system, don't feel any urgency to replace your copy of MouSSE.

EDIT (5 May 2020): There is a new version of MouSSE (v0.92) attached to this post. It represents a major rewrite, and should be more stable, more robust, and faster than the previous public version (v0.38). To upgrade from the previous release, either overwrite the existing kext with this one (then kextcache -i / and reboot) or get the newest MacOS patcher that contains MouSSE and let it upgrade for you (as of this writing, the authors of both patchers are aware of the new MouSSE version but have not yet released updates to their patchers).

EDIT (13 November 2019): If your system does not contain a Metal AMD GPU, you should not install MouSSE. There are several reported instances of instability in non-AMD-GPU systems, with no clear cause yet identified. Instructions on how to uninstall MouSSE appear in this post (see below).

This software has been tested on various hardware and software configurations, and it seems to be stable enough to release from beta. Please provide any feedback you have, positive or negative, success or failure, in this thread or via direct message to me.

*** IMPORTANT: USE AT YOUR OWN RISK!
This software is provided as-is, without any warranty whatsoever. There is no reason it should cause any problems, but if things go wrong, they're your responsibility. It has been tested and benchmarked on various systems and various versions of MacOS, but there are no guarantees that it will work on any given system or version of MacOS. The author is not responsible for any losses of time, money, hair, memory, peace of mind, car keys, or television remote controls related in any way to the use of this software.​
(And now, after that ringing endorsement...)​
*** What is it?
Newer AMD Mac video drivers use some SSE 4.2 instructions. Older CPUs (Penryn, Harpertown, and earlier) don't support those instructions. Older Mac Pro systems (such as the Mac Pro 3,1) use those older CPUs - therefore, the new AMD drivers won't work in those systems. MouSSE is a partial SSE4.2 emulator that allows those old CPUs to use the newer AMD drivers. While its primary focus has always been making the AMD drivers work, it also appears to allow World of Warcraft to run on a Mac Pro 3,1 regardless of whether or not AMD drivers are in use.​
In theory, depending upon what you're trying to accomplish, this emulator could prove useful on any Mac system with a Penryn/Harpertown/Wolfdale CPU. These include:​
Mac Pro 3,1 (Early 2008)​
Xserve 2,1 (Early 2008)​
MacBook Pro 4,1 (Early 2008)​
MacBook 4,1 (Early 2008)​
MacBook 7,1 (Mid-2010)​
MacBook Air 2,1 (October 2008)​
MacBook Air 3,1 (October 2010) (11")​
MacBook Air 3,2 (October 2010) (13")​
iMac 8,1 (April 2008)​
iMac 9,1 (March 2009)​
iMac 10,1 (October 2009) (with Core2 Duo, not i5)​
Mac Mini 3,1 (March 2009, October 2009)​
Mac Mini 4,1 (Mid 2010) (aka Mac Mini Server)​
Not all of these have been tested; your mileage may vary. In addition, CPUs even older than Penryn could utilize this emulator; however, since they lack SSE4.1, adding only partial SSE4.2 would probably be of little or no benefit. IMPORTANT: If you don't have one of those systems - and, in particular, if you have a newer system (such as a Mac Pro 4,1 or 5,1) - you should not install MouSSE. It can't do anything useful for those systems, and could potentially get in the way or cause other issues.
I personally have tested this on MacOS 10.13-10.15 (High Sierra, Mojave, and Catalina 15.0-15.4). Others report that it loads on 10.12 (Sierra), although its AMD drivers do not seem to require emulation.​
*** What does it do?
MouSSE traps illegal instructions in both privileged (kernel) mode and unprivileged (user) mode, and emulates the POPCNT, PCMPGTQ, and CRC32 instructions. As of this writing, PCMPGTQ and POPCNT are the only problematic instructions used in the AMD drivers. (And, apparently, the only two currently used in World of Warcraft.)​
MouSSE is fully reentrant, and automatically runs on all CPUs/cores/threads.​
*** What does it NOT do?
At this time, MouSSE does not implement any SSE4.2 instructions other than POPCNT, PCMPGTQ, and CRC32. If there is sufficient demand, a future version may include support for additional SSE4.2 (or other) instructions. If MouSSE encounters an unsupported instruction, it passes control to MacOS, and you will see the same error handling you'd see if MouSSE was not installed (usually, an "Illegal instruction" message and program termination).​
* As of version 0.32, MouSSE no longer behaves transparently when an​
unsupported instruction is encountered. In order to assist troubleshooting,​
MouSSE loads the 16 bytes at the illegal opcode into XMM15, and the first​
8 bytes of that into R15. Both registers are used because while 16 bytes​
provides all the context necessary for analysis, some crash dumps do not​
include the XMM registers, so R15 is also loaded (8 bytes is better than​
nothing). This violates the transparency principle, but given that the​
"illegal opcode" error is generally fatal, the effect is negligible. If you​
know of a situation where this is problematic, please let me know so I can​
devise an alternative. (As of version 0.45, XMM15 and R15 are left untouched,​
and the first 8 bytes of the illegal opcode are returned in RAX instead.)​
* As of version 0.38, a "magic" illegal opcode is handled: the​
normally-illegal instruction 3f 55 44 ("?UD") returns "MouSSE42" in RAX,​
as a way to test that MouSSE is loaded and active. Without MouSSE running,​
that instruction will throw a #UD exception on any system, new or old.​
* As of version 0.91, MouSSE keeps statistics on the instructions it emulates, as​
well as AVX/AVX2/AVX-512 instructions it finds (but does not emulate). These​
statistics are returned with the "magic" instruction is trapped.​
The new MouSSEstats utility can display these statistics.​
Also as of version 0.91, the SSE4.2 CRC32 instruction is implemented.​
At this point, only the SSE4.2 PCMP?STR? instructions remain un-emulated.​
Aside from whatever IOKit uses at startup, MouSSE uses no dynamically allocated memory, so there's no chance of a memory leak. MouSSE does not create any processes; it's simply an extension of the kernel.​
MouSSE doesn't read or write any files, or access any networks, or touch any other devices. All it does is watch for the instructions it knows about, and emulate those when they turn up. At present, logging only occurs at load time.​
MouSSE also does not check to see if your CPU already supports SSE4.2. If you install MouSSE on a newer system, it will load, but it won't ever do anything besides take up a bit of memory, because newer CPUs can handle the SSE4.2 instructions themselves, and MouSSE will silently bear the loneliness of a program that's never asked to actually do anything.​
* As of version 0.38, that's not entirely true - MouSSE does check for​
native SSE 4.2, and explicitly disables itself if the CPU natively supports it.​
At present, there's no simple way to make MouSSE completely unload itself​
in this situation, so it still takes up kernel memory space, but never does​
anything else.​
Because an earlier version of MouSSE seemed to cause some confusion, let me be clear: MouSSE does not "advertise" SSE4.2 capability, it simply provides emulation for certain instructions if the CPU tries to execute them. If the CPUID instruction is used to check for SSE4.2 capability on a pre-SSE4.2 CPU, that test will return "SSE4.2 not supported," because MouSSE does not trap CPUID. In practice, this is not an issue, since the SSE4.2 usage that MouSSE primarily targets (the AMD video driver) does not perform this CPUID check.​
Because of all this, MouSSE is passive - if no SSE4.2/illegal instructions are ever encountered, MouSSE will never do anything at all.​
*** Why do I see "AAA.LoadEarly.MouSSE" in my kextstat output?
*** Why is the kext named "AAAMouSSE.kext"?
MacOS has a complex startup procedure. Since the main point of MouSSE is to allow AMD video drivers to work, MouSSE must load and initialize before the AMD drivers do. Part of the system initialization creates a dependency tree, and MouSSE tries to put itself as close to the top of that tree as possible. Outside of that tree, things are handled alphabetically - so, sticking "AAA" at the beginning of the kext name puts it at the top of the alphabetical list (much like finding "AAA-BestPlumbers" at the beginning of the Yellow Pages (does anyone even use the Yellow Pages any more?)). Ordinarily, the full name of the kext would include a right-to-left domain name, such as "com.apple.xyzzy" - but here again, by using a top-level domain named "AAA", MouSSE can push itself to the top of the alphabetical list. It's all just an effort to have MouSSE load and initialize before anything that might use an SSE4.2 instruction.​
*** Anything else?
If you're running newer AMD drivers under a newer OS on an old Mac, you're probably already running with SIP disabled. If not, sorry - this software requires SIP to be disabled.​
To simplify the IOKit interface, MouSSE has a small C++ wrapper, but all of the important code is written in assembly language (for speed).​
Just to repeat myself, USE THIS SOFTWARE AT YOUR OWN RISK. I've been using it daily on my Mac Pro 3,1 with an AMD Radeon RX 570 for nearly a year without incident, but your mileage may vary. If you lose time, money, or your sanity because of flaws in this software, consider yourself forewarned.​
The semi-whimsical name came from elsewhere, but you can think of it as "Macs oughta understand SSE" (MouSSE) if you like. Or, if you're not fond of your sibling, "My outrageously ugly sister sickens everybody." Or, if you're a megalomaniac, "My objective ultimately should supersede everything." Whatever makes you happy.​
As of version 0.91, I've added a command-line utility called MouSSEstats that will display some statistics about what MouSSE has encountered since it was last loaded (presumably, the last time the system was rebooted).​
For the time being, this post will be the official repository for MouSSE, with updates posted as they occur. At some point in the future, I may move it to a separate host, which will be noted here. When referring people to MouSSE, please just link directly to this post, which will always have the latest version and information.​
*** How do I install it?
I haven't written a standalone installer. For the time being, if you are not installing MouSSE using a MacOS patcher, you need to do the installation manually. Because this forum doesn't allow uploads of .tgz files, the .tgz is zipped - meaning you'll have to uncompress/expand it twice. I suggest the following:​
  1. expand the ZIP file, then the enclosed .TGZ file
  2. open a terminal
  3. type cd {path to the .TGZ file}/MouSSE_RELEASE (change the path for your system)
  4. type sudo tar cf - ./AAAMouSSE.kext | (cd /Library/Extensions;tar xf -)
  5. type sudo kextcache -i /
  6. reboot
(If you're installing to an alternate partition, prepend the root of that partition to both paths, e.g. cd /Volumes/OtherDisk/Library/Extensions and kextcache -i /Volumes/OtherDisk/ - but note that if your currently-booted OS is Mojave or earlier, and /Volumes/OtherDisk/ contains Catalina or later, the kextcache operation will silently fail. This has to do with how Apple divided up the filesystems in Catalina, and has nothing to do with MouSSE.)​
* As of version 0.45, it is recommended to use /Library/Extensions instead​
of /System/Library/Extensions, particularly on Catalina (due to the root​
filesystem being read-only).​
Third-party kext managers, such as KextBeast, should also work (but I haven't tested any of those).​
*** How to I uninstall it?
Since MouSSE isn't the sort of thing you'll probably install and uninstall regularly, I also haven't bothered creating an uninstall script. To uninstall MouSSE on the current running system, just do the following:​
  1. open a terminal
  2. type cd /Library/Extensions, press ENTER
  3. type sudo rm -rf AAAMouSSE.kext, press ENTER
  4. type sudo kextcache -i /, press ENTER
  5. reboot
(To uninstall MouSSE from an alternative disk/partition, change the path in steps 2 and 4 to include the leading volume information, e.g. cd /Volumes/MyOtherMacOSinstallation/Library/Extensions and kextcache -i /Volumes/MyOtherMacOSinstallation/ - but note that if your currently-booted OS is Mojave or earlier, and /Volumes/MyOtherMacOSinstallation/ contains Catalina or later, the kextcache operation will silently fail. This has to do with how Apple divided up the filesystems in Catalina, and has nothing to do with MouSSE.)​
* As of version 0.45, it is recommended to use /Library/Extensions instead of /System/Library/Extensions, particularly on Catalina (due to the root filesystem being read-only).​
Just to be clear, if you load the kext manually (via kextload or kextutil), you can safely unload it and it will clean up after itself. If you have a version of MouSSE currently running, you can try replacing it on-the-fly by executing kextunload AAAMouSSE.kext;kextload ./AAAMouSSE.kext in the directory containing ./AAAMouSSE.kext - in my experience, about 75% of the time, this will cause the WindowServer to crash and restart (you'll have to log in again), but the system will remain up; about 20% of the time, nothing will crash/die and the replacement will be seamless; and about 5% of the time, changing MouSSE on-the-fly will result in a blackscreen/reboot.​
I suggest not doing this real-time substitution unless you're prepared for a hard crash, but it is workable most of the time.​
*** OK, I installed MouSSE, but now my new AMD video card is misbehaving...
The most common problems occur if you've previously installed patches to make legacy video cards work. If those patches are installed, MacOS can get confused about which display to use, which framebuffer to use for which display, etc. If you're experiencing problems with a supported modern AMD video card and you still have those patches installed, you'll need to remove them (which unfortunately, in most cases, involves reinstalling the OS). Using a legacy video card (to see the MacOS boot screen) and a modern AMD video card simultaneously, without the "legacy video" patches, is a hit-and-miss proposition - some combinations work some of the time, some don't work at all. While it would be nice to have the "best of both worlds" (visible boot screen and modern GPU acceleration), your best bet is to keep the legacy card on a nearby shelf and only use it when necessary.​
If you're having problems but don't have any legacy video patches installed, please try contacting the author on the MacRumors forum. (I also lurk on the "MacOS * on Unsupported Macs" threads, if you'd rather try me there.)​
(As an aside, if you want to change your setup so you can see boot screens on a video card lacking a Mac EFI ROM, check out this post - it works very well. You can also pair it with OpenCore.)​
*** I have a Penryn CPU, but I'm not using a Metal AMD card or running World of Warcraft. Should I install MouSSE?
The short answer is "no." If your system is running OK, and you're not planning to upgrade your video card to a modern AMD model, and/or you're not planning to run any software that requires SSE4.2, then installing MouSSE will just eat up some kernel memory and not provide you any real benefit.​
VERSION HISTORY (only versions that escaped captivity are listed):
v0.20 - 15 Aug 2019 - First version released into the wild.
v0.30 - 12 Sep 2019 - Tweaks to allow installation from El Capitan onward
v0.32 - 20 Sep 2019 - Capture of illegal opcode data in XMM15/R15
v0.35 - 07 Oct 2019 - Added more initialization diagnostics
v0.38 - 17 Oct 2019 - Added more logging, "magic" opcode detection
v0.45 - 28 Feb 2020 - Moved illegal opcode return to RAX for compatibility, fixed POPCNT +32d bug
v0.91 - 14 Apr 2020 - Major rewrite of parser. Fixed several bugs. Added many optimizations. Added CRC32 functionality. Added support for all addressing modes. Added statistics (returned when "magic" instruction is trapped). Recognizes (but does not emulate) AVX/AVX2/AVX-512 instructions.
v0.92 - 28 Apr 2020 - Major rewrite of trap-handling infrastructure, which should make MouSSE far more stable in challenging environments. Also a few minor bug fixes. Improvements to MouSSEstats program.
v0.93 - 22 May 2020 - Minor bug fix (rare - miscalculated displacement)
v0.95 - 08 Jun 2021 - Monterey support

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK