Find font, open font, close font … Go WebKit! Go!
File system access is pretty fast … if you are running on a desktop system with lots of caching resources and no real requirement to sync/flush content to the backing media until you have idle cycles. You can do a lot of churning and never even have it poke to the top of any desktop performance metering tools.
If you are running an embedded system though, filesystem access is your nemesis. Not only does the media that you are interfacing with have generally slower access times, but the environment is much more demanding. You get the double whammy of fewer CPU cycles to spare and little or no additional cache memory to compensate for slow device access. All that, and you’ve got some pesky power monitor that keeps trying to get the system to idle so it can sleep and save a few milliwatts of power.
The customers that have approached Crank Software about WebKit integration all have embedded systems that they are running, so as part of my WebKit optimization week I started out with the easy target: excessive filesystem access.
- Open a trace log file and open the search dialog (Search > Search, or just Ctrl+H in the editor).
- Select the Trace Search tab icon and select Add to create a new condition.
- Create the condition using the System class and the Path Manager code.
Now if you use this condition you will see all of the path-based file operations. You can use the data fields in the event, such as process or pid, to filter the events down to what you are interested in. Running this query, focusing in on the WebKit-based application as it launched against a test website, yielded:
Unbelievable! Considering that this was a trace that lasted only 30 seconds, 2100 file accesses seems more than a little unreasonable: either a bug or an area ripe for optimization. The pathnames provided us with lots of insight into what was happening (lots and lots of font access). Our current port uses FontConfig to manage the font mappings and FreeType to perform the rendering.
Our first change was to create a more ‘embedded friendly’ font configuration file. By default the font configuration is scanned every few seconds to support dynamic font addition and removal. Useful for desktops, but not needed by most embedded systems. With that change in place we picked up a few seconds of improvement, but were still churning.
Time to dive into the code and correlate it with our trace results:
The traces showed repetitive file access for the same font, so we added a simple filesystem name cache and used FreeType’s internal cache to avoid this churn. This dropped the file accesses by about an eighth and picked up another few seconds of improvement. Not bad for a bit of effort, but not the big gain we were looking for.
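To give a feel for what a name cache like that looks like, here is a minimal sketch. It is not the code from our port: `FontPathCache`, `pathFor` and the resolver callback are hypothetical names made up for illustration, and the callback simply stands in for whatever expensive lookup actually touches the filesystem.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical cache: remembers which file path a font family resolved to,
// so the expensive lookup (directory scans, stat/open calls) runs once per name.
class FontPathCache {
public:
    using Resolver = std::function<std::string(const std::string&)>;

    explicit FontPathCache(Resolver resolver) : m_resolver(std::move(resolver)) {
        assert(m_resolver && "a resolver callback is required");
    }

    // Returns the cached path, calling the resolver only on the first request.
    const std::string& pathFor(const std::string& family) {
        auto it = m_paths.find(family);
        if (it != m_paths.end())
            return it->second;               // hit: no filesystem work at all
        ++m_misses;                          // miss: pay for the lookup once
        return m_paths.emplace(family, m_resolver(family)).first->second;
    }

    std::size_t misses() const { return m_misses; }

private:
    Resolver m_resolver;
    std::unordered_map<std::string, std::string> m_paths;
    std::size_t m_misses = 0;
};
```

The sketch never evicts or invalidates entries, which only makes sense if the set of installed fonts does not change at runtime; that is an assumption that tends to hold on embedded targets but not on a desktop.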
The traces still showed that we were hitting the font configuration directories several times over, definitely not the intention of the source as far as we could tell. After a day of code inspection, the culprit turned out to be an innocent-looking routine that was responsible for cleaning up temporary font resources … unfortunately the cleanup also destroyed a static font configuration, causing it to be re-created each time a new font request was made! Fixing that bug and re-running our test load:
Now that’s looking better! While the number of file accesses is still high (318), it isn’t completely ridiculous (2100), and even better, most of those initial accesses are the shared library loading and don’t occur during the steady state operations of the browser. The even better news is the time savings that came with this reduction … it dropped a full ten (10) seconds off the general load time! We’re totally moving from Super Sloppy to Super Shiny! (thanks Mario and Paul!).
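For the curious, the shape of that cleanup bug can be sketched in a few lines. This is an illustration under assumptions, not the actual WebKit or FontConfig code: `FontConfiguration`, `FontSystem` and both cleanup variants are hypothetical, and the shared configuration is held in an ordinary object rather than a true static so the effect is easy to count.

```cpp
#include <cstddef>
#include <memory>

// Stand-in for an expensive-to-build font configuration (hypothetical).
struct FontConfiguration {};

class FontSystem {
public:
    // Each font request needs the shared configuration plus some scratch data.
    void handleRequest(bool useBuggyCleanup) {
        if (!m_config) {                       // built lazily, meant to live on
            m_config = std::make_unique<FontConfiguration>();
            ++m_builds;                        // each build means rescanning font dirs
        }
        m_scratch = std::make_unique<int>(0);  // temporary per-request resource
        if (useBuggyCleanup)
            cleanupBuggy();
        else
            cleanupFixed();
    }

    std::size_t builds() const { return m_builds; }

private:
    // Buggy: also tears down the long-lived configuration.
    void cleanupBuggy() { m_scratch.reset(); m_config.reset(); }
    // Fixed: releases only the temporary resources.
    void cleanupFixed() { m_scratch.reset(); }

    std::unique_ptr<FontConfiguration> m_config;
    std::unique_ptr<int> m_scratch;
    std::size_t m_builds = 0;
};
```

With the buggy cleanup every request rebuilds the configuration; with the fixed one it is built once and reused, which is the difference that showed up so clearly in the traces.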