The #develop teamblog
#  Thursday, September 27, 2007

Yesterday I wrote about how I made compiling 30% faster by using multi-threading to let my dual-core CPU work with both cores. A problem was that I had to use serialize build events to send them back to the host process.

MSBuild provides these events to loggers (number of occurrences during the test shown in parenthesis):
MessageRaised   (21156)
ErrorRaised      (0)
WarningRaised   (18)
BuildStarted      (47)
BuildFinished     (47)
ProjectStarted  (449)
ProjectFinished (449)
TargetStarted (2910)
TargetFinished (2910)
TaskStarted (4308)
TaskFinished (4308)
CustomEventRaised (0)

All these events must be send to the SharpDevelop process, where logger AddIns can process them. Each one is serialized separately, and includes information about the full name of the project file being build and some other highly redundant information. Each event is about 1 KB in serialized form. Half of the events occurred in the worker running in the SharpDevelop process, but the other half occurred in the worker that ran in its own process; so half of the events had to be transferred, so we tranferred a total of 18 MB during the 25-second build. Remember the main cost is the serialization (.NET BinaryFormatter), not the data transfer itself.

Obviously, most of these events are not interesting for SharpDevelop, and the best way to improve performance is to ignore uninteresting events in the worker instead of ignoring them in the host.
So which events does SharpDevelop need? Potentially all, since AddIns can add loggers; but if those AddIns tell SharpDevelop which
events are required before the build starts, we can optimize for the common case that all AddIns are only interested in specific events - usually only in errors and warnings. Events that SharpDevelop currently does not use anywhere are MessageRaised, TargetStarted and TargetFinished - all are called quite frequently.
To our benchmark results from yesterday, I added an entry "Less events" where MessageRaised, TargetStarted and TargetFinished are deactivated:

Core Duo Notebook, 1 GB RAM, Vista Home Premium, Aero disabled:
 Rebuild:
   One worker:  35.2, 34.9, 35.2, 35.0, 34.3 : Average 35.1 seconds
   Two workers: 25.6, 23.9, 24.2, 23.5, 23.8 : Average 24.2 seconds
   Less events: 23.7, 23.5, 23.0, 22.4, 21.2 : Average 22.8 seconds

 Build (no changes):
   One worker:  7.4, 7.1, 7.2, 7.8, 7.1 : Average 7.3 seconds
   Two workers: 6.9, 6.8, 6.7, 6.6, 6.2 : Average 6.6 seconds
   Less events: 5.1, 5.0, 6.2, 5.6, 5.1 : Average 5.4 seconds

More than 1 second saved! (see yesterdays' post for details about the benchmark)

Now why does SharpDevelop need TaskStarted and TaskFinished? Take a look at the code:
void OnTaskStarted(object sender, TaskStartedEventArgs e)
{
  activeTaskName = e.TaskName;
  if (MSBuildEngine.CompileTaskNames.Contains(e.TaskName.ToLowerInvariant())) {
    AppendText("${res:MainWindow.CompilerMessages.CompileVerb} " + Path.GetFileNameWithoutExtension(e.ProjectFile));
  }
}

TaskStarted is just there to let the user know that building a project resulted in a recompilation - when doing a Rebuild, SharpDevelop
shows "Compiling ProjectName" for each project, whereas for a Build (no changes), no such lines are shown in the Output view.
But in fact, all consumers of TaskStarted probably look for specific task names. And all consumers of TaskFinished are also listening to TaskStarted and are also only interested in some specific types of tasks.
If we pass the build worker a list of task names we are interested in, it can send us only those events and we don't spend time serializing the tons of events about the tasks <Copy>, <Message> etc.
Together with the "Less events" changes, this decreases the event count from over 36000 to less than 1000.

I implemented this in revision 2696.

Now let's extend our benchmark:
Core Duo Notebook, 1 GB RAM, Vista Home Premium, Aero disabled, two workers, new event handling:
 Rebuild: 20.3, 20.9, 19.9, 20.2, 20.0 : Average 20.3 seconds
 Build (no changes): 4.3, 4.6, 5.0, 4.4, 4.2 : Average 4.5 seconds

That's better than "Less events" because we're sending even less events now, but logger AddIns can tell SharpDevelop that they need to receive additional events, so AddIns did not lose any functionality (but if you install such an AddIn, of course you would lose the performance benefits).
So we're now down from formerly 35.1 seconds (using only one core) to 20.3 seconds - that's 43% faster.
For "Build (no changes)", the reduction of the event count now makes using two workers efficient; from originally 7.3 seconds to 4.5 seconds is a 39% decrease in build time.

Categories: Daniel
Thursday, September 27, 2007 2:31:07 PM (GMT Daylight Time, UTC+01:00)  #    Comments [0]

 



© Copyright 2014 SharpDevelop Core Team

Subscribe to this weblog's RSS feed with SharpReader, Radio Userland, NewsGator or any other aggregator listening on port 5335 by clicking this button.   RSS 2.0|Atom 1.0  Send mail to the author(s)

 

Copyright ©2000-2009 IC#Code. All rights reserved. Projects sponsored by AlphaSierraPapa.