Fix for "Cannot map RAM" on Leopard

About SheepShaver, a PPC Mac emulator for Windows, MacOS X, and Linux that can run System 7.5.3 to MacOS 9.0.4.

mschmitt
Tinkerer
Posts: 80
Joined: Sun Jul 05, 2009 10:33 pm

Post by mschmitt »

kelvin31415 wrote:The biggest factor affecting SheepShaver CPU usage is the choice of video refresh strategy; for example, vosf vs. non-vosf.
Thanks.

That makes the difference even more pronounced.

If I --enable-vosf (the default setting), then the result is:
Video on SEGV signals: Yes

and SheepShaver uses about 4% of cpu at idle.


If I build with --disable-vosf, then my build uses a little over 12%, as previously mentioned. This is how the version I uploaded was compiled.

I suppose this setting is a cross-compile issue; a binary download has to take the lowest common denominator. Which makes me wonder: if the configure script can figure out whether VOSF works by running a test program, then why can't it do the same test at runtime? Maybe that is just the Unix philosophy of source package delivery.

Anyway, we still don't know why the version Ronald compiled sucks up so much CPU.

Note: My Preferences are set to "Don't use CPU When Idle" and the Refresh Rate is the default (60 Hz) (frameskip 1).
Ronald P. Regensburg
Expert User
Posts: 7821
Joined: Thu Feb 09, 2006 10:24 pm
Location: Amsterdam, Netherlands

Post by Ronald P. Regensburg »

mschmitt wrote:Anyway, we still don't know why the version Ronald compiled sucks up so much CPU.
All my builds in the past years use that much CPU, on my machine over 70%. When I disable JIT Compiler it is even higher, close to 80%. I stopped wondering why.

For this last Intel build (and for the March 2009 UB build) I used:
./autogen.sh --disable-vosf --enable-sdl-static --enable-sdl-video


Edit: Lowering the refresh rate from 60Hz (frameskip 1) to 15 Hz (frameskip 4) brings CPU usage down to little over 20%.
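
The two pairs quoted in this thread (frameskip 1 at 60 Hz, frameskip 4 at 15 Hz) fit a simple rule: the effective refresh rate is a 60 Hz base divided by the frameskip value. A minimal sketch under that assumption follows; the 60 Hz base is inferred from those two data points rather than taken from the SheepShaver source, and `refresh_hz` is a made-up helper name.

```shell
# Effective refresh rate for a positive frameskip value, assuming a 60 Hz base
# (an inference from the numbers in this thread, not from the source code).
refresh_hz() {
    echo $(( 60 / $1 ))
}

refresh_hz 1    # prints 60
refresh_hz 4    # prints 15
```

Only positive frameskip values are covered here; any special meaning of other values is outside this sketch.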
kelvin31415
Tinkerer
Posts: 83
Joined: Sat Apr 12, 2008 8:22 pm

Post by kelvin31415 »

mschmitt wrote:if the configure script can figure out whether VOSF works by running a test program, then why can't it do the same test at runtime?
I seem to recall that it does; i.e., if you build with VOSF enabled but at runtime the code determines that it would not perform better that way anyway, it doesn't use it. At least, I recall seeing code that claims to be doing that.
mschmitt wrote:Anyway, we still don't know why the version Ronald compiled sucks up so much CPU.
Screen size and refresh rate, most likely.

Post by Ronald P. Regensburg »

kelvin31415 wrote:
mschmitt wrote:Anyway, we still don't know why the version Ronald compiled sucks up so much CPU.
Screen size and refresh rate, most likely.
There must be something else. With the exact same settings, on the same (my) machine, and using the same (ROM, disk image, etc.) files, my build uses around 73% CPU and mschmitt's build uses around 20% CPU.

Post by kelvin31415 »

What differences are there between the two builds that we know of? For example, configuration options and SDL version?

Post by mschmitt »

kelvin31415 wrote:What differences are there between the two builds that we know of? For example, configuration options and SDL version?
  1. I built on Leopard, I think Ronald built on Tiger.
  2. We're using different versions of Xcode Tools: I'm using Xcode 3.1.2, Ronald is using 2.5.
  3. I included --without-gtk, Ronald kept the default. I'm not sure yet what that resulted in for his build.
  4. I compiled with SDL 1.2.13, Ronald used 1.2.10.
  5. We're compiling on different machines, and may have different frameworks etc. installed. For example, I have mono installed, and we already know that this had an impact on the build; if I don't specify --without-gtk then it compiles in a way that requires a particular mono framework at runtime.
Update: the config header that was generated by the configuration process has no functional differences. So at least at that level we're compiling with the same preprocessor defines.
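
One way to check two generated config headers for functional differences is to compare only their active preprocessor defines, ignoring comments, blank lines, and ordering. This is a rough sketch of that idea, not necessarily how the comparison above was done; `config_defines` and `same_config` are made-up helper names, and the paths in the usage line are placeholders.

```shell
# Print only the active #define / #undef lines of a header, with whitespace
# normalized and the lines sorted, so layout differences do not matter.
config_defines() {
    grep -E '^[[:space:]]*#[[:space:]]*(define|undef)[[:space:]]' "$1" |
        sed -e 's/[[:space:]]\{1,\}/ /g' -e 's/^ //' -e 's/ $//' |
        sort
}

# Succeeds (exit status 0) when both headers contain the same set of defines.
same_config() {
    [ "$(config_defines "$1")" = "$(config_defines "$2")" ]
}

# Usage (placeholder paths):
#   same_config build-a/config.h build-b/config.h && echo "same defines"
```

Commented-out entries such as `/* #undef FOO */` are skipped on purpose, so a macro that is unset in one header and defined in the other still shows up as a difference through its `#define` line.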
Last edited by mschmitt on Thu Jul 09, 2009 12:25 am, edited 1 time in total.

Post by kelvin31415 »

The most likely thing on your list may be the difference in SDL versions, as there could have been performance work done in SDL between 1.2.10 and 1.2.13. Also, the SDL version determines which cursor management approach will be used ("hardware" vs. software) by SheepShaver.

The reason the SDL version and cursor management approach are currently (unfortunately) tied is that neither SDL version works properly with both hardware and software cursor.

Post by mschmitt »

New experiment:

I followed vasi's instructions and built against the OS X 10.4 SDK, by exporting the following before running configure:

export MACOSX_DEPLOYMENT_TARGET=10.4 
export CPPFLAGS="-isysroot /Developer/SDKs/MacOSX10.4u.sdk" 
export LDFLAGS="-Wl,-syslibroot,/Developer/SDKs/MacOSX10.4u.sdk"
I did this before configuring and building SDL (1.2.13), and then proceeded to build SheepShaver with --enable-vosf. The result had no drop in performance.

Then I configured and built SDL 1.2.10, still against the 10.4 SDK. In a previous post I said the 1.2.10 build failed due to a problem with Leopard's Core Audio headers. But this time it worked.

So I configured and built SheepShaver again, against 10.4 and SDL 1.2.10, with

./configure --enable-sdl-static --enable-sdl-video --without-gtk --enable-vosf
The result is that I have the hardware cursor, and about the same performance. At idle it is using around 4% of CPU. I actually think SDL 1.2.13 is slightly more efficient at idle than 1.2.10, but they are still close.

I have uploaded this version, with a new file name and link. It is still Intel only.

http://files.getdropbox.com/u/879516/Sh ... 8-2009.zip

Post by kelvin31415 »

I am more interested in the --disable-vosf version. Do we have comparative results for cpu consumption with your build vs. Ronald's build, both with --disable-vosf, running on the same machine with the same SheepShaver configuration?

In some profiling runs that I did many moons ago, 85% of SheepShaver's cpu time was spent in the video refresh code (with vosf disabled). Compiler improvements could be making a difference.

To get good subjective performance (e.g. cursor tracking) with vosf enabled, I had to reduce the frame rate more than I liked. Since SheepShaver's CPU consumption matters much less to me than its subjective performance, I've been disabling vosf in my own builds.

Edit: found some of my old posts on VOSF at http://www.emaculation.com/forum/viewto ... 7499#27499

Post by mschmitt »

kelvin31415 wrote:I am more interested in the --disable-vosf version. Do we have comparative results for cpu consumption with your build vs. Ronald's build, both with --disable-vosf, running on the same machine with the same SheepShaver configuration?
...
Yes. Referring to the following builds:
  1. Ronald's build has --disable-vosf.
  2. The 07-05-2009 build I posted in the top post of this thread has --disable-vosf. SDL 1.2.13, built on 10.5.
  3. I have a build that is exactly the same as the 07-08-2009 build I just posted, except that this one has --disable-vosf. It is SDL 1.2.10, built against the 10.4 SDK. (In theory this one should be the closest match to Ronald's.)
On my machine (3.06 GHz iMac w/4 GB RAM):
#1: CPU 42% at idle, measured in Activity Monitor. (I think it is 42% of 1 CPU core.)
#2: CPU 12.3%
#3: CPU 12.4%

Ronald previously reported that on his machine:
#1: CPU 73%
#2: CPU 20%
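
To put the two machines side by side, the ratio between build #1 and build #2 can be worked out from the figures above. This is only arithmetic on the reported numbers; `cpu_ratio` is a made-up helper name.

```shell
# Ratio between two CPU percentages, printed with two decimals.
# awk is used because POSIX shell arithmetic is integer-only.
cpu_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

cpu_ratio 42 12.3    # first machine:  build #1 vs. build #2, prints 3.41
cpu_ratio 73 20      # second machine: build #1 vs. build #2, prints 3.65
```

On both machines build #1 uses roughly three and a half times the CPU of build #2, so the gap appears to scale with the build rather than with the machine.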


FYI, here is a link to build #3 referred to above:

http://files.getdropbox.com/u/879516/Sh ... 8-2009.zip

Post by kelvin31415 »

I haven't pulled a source tree in some time, so I'm missing any code changes that have gone in lately. Yet my CPU consumption experience is consistent with what you guys are seeing: my (non-vosf) builds are in the 50% CPU range, and yours are in the 20% range (same machine and SheepShaver configuration).

My builds are on Tiger. I will try to make some time to build identical source with identical options on Leopard; might take me a few days because my Leopard disk is not as readily available and I have only short periods of time to tinker here and there.

Post by Ronald P. Regensburg »

mschmitt wrote:Ronald previously reported that on his machine:
#1: CPU 73%
#2: CPU 20%
And #3 again CPU 20%. So the SDL version used does not seem to influence CPU usage.
These figures are on my Intel Core 2 Duo iMac with Leopard 10.5.7.

All builds that used SDL 1.2.10 have the hardware cursor and switch to the software cursor when needed (with colored cursors).

Yesterday's build with --enable-vosf only uses around 6% CPU on my machine, but it is hardly useable with extremely slow graphic performance and stuttering audio. Using a low Refresh Rate (frameskip 4) improves it a little but not enough for practical use.

My builds were built in Tiger with XCode 2.5. They run in both Tiger and Leopard. At the moment, I cannot test if/how the Leopard builds will run in Tiger. ( I need more disk space to keep all my backups and different working OS versions at the same time.)

Post by mschmitt »

Ronald P. Regensburg wrote:Yesterday's build with --enable-vosf only uses around 6% CPU on my machine, but it is hardly useable with extremely slow graphic performance and stuttering audio. Using a low Refresh Rate (frameskip 4) improves it a little but not enough for practical use.
Thanks for the feedback.

I updated the (Intel Only) link in this thread's top post to download build #3; which is the build with --disable-vosf, SDL 1.2.10, built against the OS X 10.4 SDK.

By the way, one possible difference could be when we pulled the source tree from CVS. I think I grabbed it around June 19th.

Post by Ronald P. Regensburg »

mschmitt wrote:By the way, one possible difference could be when we pulled the source tree from CVS. I think I grabbed it around June 19th.
I cannot imagine that is the essential difference. For these last builds I fetched it two days ago and added the files that you changed, but the huge CPU usage has been the same with all my builds in the past two years. Each time I took the latest source from CVS.

Post by Ronald P. Regensburg »

Ronald P. Regensburg wrote:My builds were built in Tiger with XCode 2.5. They run in both Tiger and Leopard. At the moment, I cannot test if/how the Leopard builds will run in Tiger. ( I need more disk space to keep all my backups and different working OS versions at the same time.)
On my machine all builds run the same in Tiger and Leopard and also use the same amount of CPU in both.

Post by mschmitt »

Ronald P. Regensburg wrote:After copying the changed files to SheepShaver source, I tried to build SheepShaver on both PPC and Intel in order to make a UB version. I built with SDL 1.2.10 in Tiger (to get the H-S compromise cursor) with XCode 2.5 installed.

Building on both PPC and Intel proceeded as expected, no errors or any other remarkable events.

The PPC version, however, does not run. It crashes as soon as it is launched.
For some reason I was under the impression that the download you posted was the UB version. The word "Intel" in the file name should have given me a clue otherwise.

Could you post the UB version, or send me a download link, even though the PPC part crashes? I want to try it on my machine.

Post by Ronald P. Regensburg »

The PPC version did not run, so I did not join it with the Intel version to create a UB version. I think I did not even save it. I will have a look.

Post by mschmitt »

Question for the experts:

I copied my entire build tree over to my PPC machine, ran make clean, and reran all the configure and make steps to recompile as PPC.

The result was that SheepShaver couldn't start because it couldn't allocate the low memory globals.

I tracked down the problem: the SheepShaver configure step runs a test to see if it can do a "pagezero hack". To determine whether it works, configure builds and runs a couple of test programs in src/Unix/Darwin.

What happened to me is that I already had a lowmem and a pagezero program in the Darwin directory, from when I compiled for Intel. When configure ran, it didn't recompile them, because make (as run by configure, not the make I ran later) determined that they were already up to date. So they were still compiled for Intel.

Then the configure script tried to run the test but couldn't, so it decided that the pagezero hack wouldn't work. So the PAGEZERO_HACK define wasn't set.

Is there some command I should have done that would have told configure to clear out the outputs in the Darwin folder?

Or is it a bug in the configure scripts?
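
Until that is answered, one workaround is to delete the two stale test programs by hand before rerunning configure, so that they have to be rebuilt for the current architecture. A rough sketch, assuming the only leftovers that matter are the lowmem and pagezero executables named above (whether object files or cached results also need clearing is not something this thread establishes); `clean_config_tests` is a made-up helper name.

```shell
# Remove the configure-time test programs left over from a build for another
# architecture. Only the two programs named in this post are touched.
clean_config_tests() {
    dir=$1
    for prog in lowmem pagezero; do
        if [ -f "$dir/$prog" ]; then
            rm -f "$dir/$prog" && echo "removed $dir/$prog"
        fi
    done
}

# Usage, from the top of the source tree, before running configure again:
#   clean_config_tests src/Unix/Darwin
```

Running `file` on the two programs before deleting them should show which architecture they were built for, which is a quick way to confirm that this is what went wrong.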

Post by Ronald P. Regensburg »

I did not keep the build, but I kept the source. So, I built it again. Here is the PPC version:
http://www.xs4all.nl/~ronaldpr/sheepsha ... 090712.zip

It does not run in Tiger on my PowerBook. I do not have a Leopard system on PPC at present.

When I launch it, it first shows the "crash that is not a crash" with the "unexpectedly quit" dialog that all previous builds also show on my PowerBook and that can simply be dismissed while SheepShaver continues running.

When in this build the dialog is dismissed, the SheepShaver window stays black and the MacOSX color beachball keeps turning. Force quitting SheepShaver is the only way out.

Edit: We posted at the very same moment.

Post by Ronald P. Regensburg »

mschmitt wrote:What happened to me is that I already had a lowmem and a pagezero program in the Darwin directory, from when I compiled for Intel. When configure ran, it didn't recompile them, because make (as run by configure, not the make I ran later) determined that they were already up to date. So they were still compiled for Intel.
I am not an expert, so I am not sure what the problem is, but I (again) used the source as it was last week before I compiled for either PPC or Intel (simply kept it archived). After I compiled for PPC, in the Darwin folder there were PPC executables lowmem and pagezero.

Post by mschmitt »

I believe I've figured out what goes wrong on PPC...

When running in native mode, there is some code in ppc_asm.S:

	// Preset registers for ROM boot routine
	lis	r3,0x40b0		// Pointer to ROM boot structure
	ori	r3,r3,0xd000
I think it is loading r3 = 0x40b0d000, which means it is hard-coding the ROM load address. Therefore, it crashes when I try to shift the ROM load point.
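
The arithmetic behind that reading: lis puts its 16-bit immediate into the upper half of the register, and ori then ORs its 16-bit immediate into the lower half. A small sketch to check the value, and to split some other address into the two immediates such a pair would need; `lis_ori` and `split_addr` are made-up helper names, not anything from the SheepShaver source.

```shell
# Combine the immediates of a lis/ori pair into the resulting 32-bit value.
lis_ori() {
    printf '0x%08x\n' $(( (($1 << 16) | $2) & 0xffffffff ))
}

# Hypothetical helper: split a 32-bit address into the upper and lower
# 16-bit immediates that a lis/ori pair would use to load it.
split_addr() {
    printf '0x%04x 0x%04x\n' $(( ($1 >> 16) & 0xffff )) $(( $1 & 0xffff ))
}

lis_ori 0x40b0 0xd000      # prints 0x40b0d000
split_addr 0x40b0d000      # prints 0x40b0 0xd000
```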

It works in Intel mode because it goes through the PPC emulation, which never hits this code.

The proof of this theory is that if I coerce the ROM to load at a different point (which crashes), and then change the code in ppc_asm.S to match, then it works fine.

I don't know how to fix the ppc_asm code.

So I changed the program so it only attempts to shift the RAM and ROM load points when running using PPC emulation (e.g. running on Intel). PPC mode will load the RAM & ROM at exactly where it has been loading it before.

The downside is that this means that given the same RAM load point as before, it will still have the 512 MB RAM limit when run on PPC. Which means, the only benefit of my changes for PPC users is that I've fixed the error message you get when it can't map memory.

I've uploaded a Universal Binary:

http://files.getdropbox.com/u/879516/Sh ... 2-2009.zip

The PPC part was built on OS X 10.4 using Xcode 2.4.1 and SDL 1.2.10.

Note: Even though it won't help PPC users, it still needs to be tested on PPC, because it contains a lot of changes to make the ROM load point a variable rather than a hard-coded constant.

Post by Ronald P. Regensburg »

I tried the last three UB builds on my PowerBook G4 and Tiger.

#1 July 21 2008
#2 March 19 2009 - 2.3 (090319) "with fullscreen working"
#3 July 12 2009 - 2.3 (20090712)


Behavior of all three is mostly equally good (or equally poor):

- "Unexpectedly quit" crash message at launch while continuing to run.
- CPU usage of 80-90%. Judging from the Activity Monitor graphs, they simply use all CPU time that is not used by other processes.
- Acceptable video performance with the hardware cursor, poor video performance when the software cursor kicks in.
- Audio performance is inconsistent in #2 and #3, sometimes OK and sometimes really poor. Audio performance is better and more consistent in #1.
- A freeze that seems related to sound (the click sound that is produced when a Launcher button is clicked) and that can only be ended by a force-quit. I could make it happen only once in #1, but several times in both #2 and #3.

Apart from changes that were added to CVS after July 2008, the main difference between #1 and the other two is that #1 was configured with --enable-sdl-audio ( --disable-sdl-audio is the default). Maybe we should try another build with (at least for the PPC version) --enable-sdl-audio.

So far, the advantages are for Intel only: "Cannot map RAM" error solved, no 512MB RAM limit, less CPU usage (still do not understand how that can be only with your builds). And for VirusBarrier users: The UB version does not trigger a virus alarm as the Intel only version does.

For PPC users, the July 21, 2008 version is still the best choice.


I am no developer and I have no insight into what is going on in the source. But I wonder if these changes so far are the best solution. Just reading the explanation, I get the impression of changes that need a patch, which needs another patch, ... After everything is sorted out, maybe the solution could be applied more elegantly? But maybe I am wrong about this.

How about the Linux version? Do the changes not affect compiling for Linux?

Post by mschmitt »

I haven't given up on getting the RAM/ROM shifting to work in PPC -- I have an idea of how to fix it.

Regarding your test results, I think the differences we are seeing in the builds are due to how we are building (both the parameters and the build platform), rather than the code changes I made. The code should either work or crash. It shouldn't have any impact on performance.

The performance difference on Intel is a real mystery. The one test we haven't tried is for someone else to do what I did: build on Leopard with the latest Xcode.

It would also be interesting to know if building the PPC version on Leopard improves performance. I can't try it directly; I don't have any PPC machine running Leopard, and I don't have a retail install disk. I was thinking of trying to cross-compile, but this is tricky, since the config process itself compiles programs and runs them.

It is strange to me that the PPC build has such terrible performance, since it runs natively. It should be just as fast as Classic. That's why we really need to figure out what is going on with the performance difference -- if we can solve it, maybe it will fix the PPC side too.

(The fact that my UB build is slow on PPC leads me to believe that it does have to do with the version of Xcode tools, since, like you, I built the PPC part on OS X 10.4.)

Yes, the program changes are in code shared by all platforms, including Linux. For most platforms the net effect should be no change. But Linux and OS X share the same main code. So as far as I can tell, Linux should have had the same 512 MB limit, and with this code Linux will load the RAM/ROM anywhere.

Note that Linux can run on both PPC and Intel, so it has the same issues as OS X. As the code is right now, Linux on PPC can't move the ROM.

Anyway, Linux should be tested.

Post by Ronald P. Regensburg »

mschmitt wrote:Regarding your test results, I think the differences we are seeing in the builds are due to how we are building (both the parameters and the build platform), rather than the code changes I made. The code should either work or crash. It shouldn't have any impact on performance.
Which differences are you referring to? Running on PPC, there are no differences at all in behavior or performance between my March 2009 build and yesterday's build by you. There is a difference in audio performance with my July 2008 build, but many changes were added to CVS between July 2008 and March 2009 and, as I mentioned above, the July 2008 build was configured differently.
mschmitt wrote:The performance difference on Intel is a real mystery. The one test we haven't tried is for someone else to do what I did: build on Leopard with the latest Xcode.
The difference is the difference in CPU usage. Apart from that, my builds with your code changes and your builds behave the same on my Intel machine. I have not yet built in Leopard. I will do that; the normal configuration on my iMac is 10.5.7 and XCode Tools 3.1.3. I did build with XCode 2.5 on Tiger to be sure the application would run in both Tiger and Leopard, but apparently that is not necessary.
mschmitt wrote:It would also be interesting to know if building the PPC version on Leopard improves performance. I can't try it directly; I don't have any PPC machine running Leopard, and I don't have a retail install disk. I was thinking of trying to cross-compile, but this is tricky, since the config process itself compiles programs and runs them.
I lent one of my external drives to a friend. When I have that drive back, I can install Leopard on it for my PowerBook G4.
mschmitt wrote:It is strange to me that the PPC build has such terrible performance, since it runs natively. It should be just as fast as Classic. That's why we really need to figure out what is going on with the performance difference -- if we can solve it, maybe it will fix the PPC side too.
Performance on PPC has always been poor in all versions I used, starting with Gwenole's May 2006 snapshot. It never came even close to performance in Classic. Performance of SheepShaver on Intel has improved greatly since May 2006.
mschmitt wrote:(The fact that my UB build is slow on PPC leads me to believe that it does have to do with the version of Xcode tools, since, like you, I built the PPC part on OS X 10.4.)
See above. Your UB is not slower than previous builds. I built with XCode 2.5 that is designed to be used on both Tiger and Leopard.
mschmitt wrote:Yes, the program changes are in code shared by all platforms, including Linux. For most platforms the net effect should be no change. But Linux and OS X share the same main code. So as far as I can tell, Linux should have had the same 512 MB limit, and with this code Linux will load the RAM/ROM anywhere.

Note that Linux can run on both PPC and Intel, so it has the same issues as OS X. As the code is right now, Linux on PPC can't move the ROM.

Anyway, Linux should be tested.
Right. No change can be added to CVS unless the result is OK for all configurations and platforms.

Post by Ronald P. Regensburg »

OK, here is my latest Intel build:
http://www.xs4all.nl/~ronaldpr/sheepsha ... 090713.zip

I built in Leopard 10.5.7, XCode 3.1.3, against 10.4 SDK, with SDL 1.2.10.

I configured with --enable-sdl-audio:

./autogen.sh --disable-vosf --enable-sdl-static --enable-sdl-audio --enable-sdl-video
because that appeared to be better for a PPC version and I will try to make a UB as soon as I can use my PowerBook with Leopard to build the PPC version the same way.

This build runs fine on my machine in Leopard with no 512MB RAM limit (did not try Tiger yet) and it indeed uses only around 17% CPU.