purpose of framework speed benchmarks

I’ve followed the TechEmpower benchmarks, and every now and then I check out benchmarks of various projects (usually PHP) to see what the relative state of things is. Inevitably, someone points out that “these aren’t testing anything ‘real world’ – they’re useless!”. Usually it’s someone whose favorite framework has ‘lost’. I used to think along the same lines; namely, that “hello world” benchmarks don’t measure anything useful. I don’t hold quite the same position anymore, and I’ll explain why.

The purpose of a framework is to provide convenience, structure, guidance and hopefully some ‘best practices’ for working with the language and problem set you’re involved with. The convenience and structure come in the way of helper libraries designed to work a certain way together. In the form of code, these have a certain execution cost. What a basic “hello world” benchmark is measuring is the cost of at least some of that overhead.

What those benchmark results are telling you is “this is about the fastest this framework’s request cycle can be invoked while doing essentially nothing”. If a request cycle to do ‘hello world’ is, say, 12ms on hardware X, it will *never* be any faster than 12ms. Every single request you put through that framework will be 12ms *or slower*. Adding in cache lookups, database calls, disk access, computation, etc – those are things your application will need to do regardless of what supporting framework you’re building in (or not), but the baseline fastest performance framework X will ever achieve is 12ms.
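To make the arithmetic concrete, here’s a quick sketch (all numbers hypothetical – the 12ms is the made-up baseline from above, and the 20ms of ‘real work’ is equally made up):

```java
public class BaselineMath {
    // Max requests/sec a single worker can serve at a given per-request floor
    static double ceilingPerWorker(double perRequestMs) {
        return 1000.0 / perRequestMs;
    }

    public static void main(String[] args) {
        double baselineMs = 12.0; // hypothetical "hello world" floor from above
        double appWorkMs = 20.0;  // hypothetical cache/db/computation on top
        System.out.printf("framework floor alone: %.0f req/s per worker%n",
                ceilingPerWorker(baselineMs));
        System.out.printf("with real work added:  %.0f req/s per worker%n",
                ceilingPerWorker(baselineMs + appWorkMs));
    }
}
```

The framework’s floor never goes away – every request pays it in full, on top of whatever your application actually does.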

These benchmarks are largely about establishing that baseline expectation of performance. They’re not always presented that way, but misreading them is largely the fault of the readers. I used to get a lot more caught up in “but framework X is ‘better’” discussions, because I was still reading the numbers as a qualitative judgement.

But why does a baseline matter?  A standard response to slow frameworks is “they save developer time, and hardware is cheap – just get more hardware”.  Well… it’s not always that simple.  Unless you’re developing from day one to be scalable (abstracted data store instead of file system, centralized sessions vs on disk, etc), you’ll have some retooling to do.  Arguably this is a cost you’ll pay anyway, but if you’re using a framework with a very low baseline, you may not hit that wall for some time.  Secondly, ‘more hardware’ doesn’t really make anything go faster – it just allows you to handle more requests at the same speed; it will never make anything *faster*.

“Yeah yeah yeah, but so what?”  Google uses site speed in its ranking algorithm.  What the magic formula is, no one outside Google will ever know for sure, but sites that are slower than their competitors *may* have a slight disadvantage.  Additionally, as mobile usage grows, more systems are SOA/REST based – much of your traffic will be responding to smaller calls for blobs of data.  Each request may not be huge, but they’ll need to respond quickly to give a good experience on mobile devices.  200ms response times will likely hurt you, even in the short term, as users just move to other apps, especially in the consumer space.  Business app users might be a bit more forgiving if they have to use your system for business reasons, sort of like how legions of people were stuck using IE6 for one legacy HR app.  They’ll use it, but they’ll know there are better experiences out there.

To repeat from above, throwing more hardware at the problem will never make things *faster*, so if you’ve got a slower site that needs to be measurably faster, you’ve possibly got some rearchitecting to do.  Throw some caching in, and you may get somewhat better results, but at some point, some major code modifications may be in order, and the framework that got you as far as it did may have to be abandoned for something more performant (hand rolled libraries, different language, whatever).

Of course, there’s always a maintainability aspect – I don’t recommend PHP devs throw everything away and recode their websites in C.  While that might be the most performant, it might take years, vs some other framework or even a different language.  I’ve incorporated Java web stacks into my tool belt, and have some projects in Java as well as some in PHP.  I benchmarked a simple ‘hello world’ in Laravel 4, ZF2 and Java just this morning.  On the same hardware, the Java stack was about 3-4 times faster (yes, APC was on).  Does this mean that all Java apps are 4 times faster than PHP apps?  Of course not – it just means the baselines differ.  This was on PHP 5.4.34 – I’m interested in trying out PHP 7 soon to see what the improvements will be overall.

Grails configuration in views

I don’t know why it’s taken me this long to figure this out, but… injecting the Grails configuration object into the view layer is pretty simple.

In a Grails filter, make an ‘after’ handler like this:

after = { Map model ->
    model.config = grailsApplication.config
}

That’s pretty much it.  In your views, you can access ${config} directly – e.g. ${config.grails.serverURL} will output the configured server URL.

This *seems* to be safe.  Are there any downsides to this approach?

grails configuration taglib

I sort of can’t believe something like this doesn’t exist, but I’ve not been able to find one (I’ll find it 10 minutes after posting this I bet!)

package com.kimsal

class ConfigTagLib {

    def grailsApplication

    static namespace = "config"
    static defaultEncodeAs = [taglib: 'none']

    def show = { attrs ->
        // Walk the dotted key (e.g. 'foo.bar') down through the config object
        String value = attrs.key.tokenize('.').inject(grailsApplication.config) { cfg, prop -> cfg[prop] }
        out << value
    }

}

This will allow you to do something like

<config:show key="foo.bar"/>

and you’ll get the value for grailsApplication.config.foo.bar in your GSP.

Did I just miss this and it’s already out there someplace?

Also, props to http://www.baselogic.com/blog/development/grails-groovy-development/configslurper-with-dynamic-property-name-from-configurationholder-config-object/ for helping out with the dynamic accessing…

Grails MySQL memory leak with Tomcat

I’ve been plagued with these for a while:

SEVERE [ContainerBackgroundProcessor[StandardEngine[Catalina]]] org.apache.catalina.loader.WebappClassLoader.clearReferencesThreads The web application [##10379] appears to have started a thread named [Abandoned connection cleanup thread] but has failed to stop it. This is very likely to create a memory leak.

Using the MySQL Connector/J 5.1.29 driver in Grails apps (various versions of each over the years), this shows up any time a new app is deployed or undeployed.

I use ‘parallel deployment’ in Tomcat a lot, but after several deployments, we hit an ‘out of memory error’.

I *think* I’ve found the fix – I think I found it last year, but never documented it. So, this morning, I ‘fixed’ this in an app again, and am watching. So far, no leaks in the new deployment.

How did I fix it?

In Grails, in the ‘/src/groovy’ directory, I created MysqlThreadsListener.groovy

package com.kimsal

import com.mysql.jdbc.AbandonedConnectionCleanupThread

import javax.servlet.ServletContextEvent
import javax.servlet.ServletContextListener
import javax.servlet.annotation.WebListener

@WebListener
public class MysqlThreadsListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // Nothing to do
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        try {
            // Shut down the driver's cleanup thread so Tomcat can release the webapp classloader
            AbandonedConnectionCleanupThread.shutdown();
        } catch (InterruptedException e) {
            // Ignore – the container is going down anyway
        }
    }
}

And that’s… it? Will repost here if there’s still an issue…

UPDATE:

I needed to run ‘grails install-templates’ to get a stock web.xml file, which I then modified with a custom listener reference – the @WebListener annotation didn’t seem to work.

In src/templates/war/web.xml, in the web-app tag, add:

<listener>
    <listener-class>com.kimsal.MysqlThreadsListener</listener-class>
</listener>

Dealing with an incorrect security report

A colleague of mine had his company’s code “security audited” by one of their clients; specifically, the client hired a firm to do security testing of many (all?) of the services the client uses, and my colleague’s company provides one of those services.

They’ve been told they’re in danger of losing the account unless all of the “security holes” are patched. The problem is, some of the things being reported don’t seem to be security holes at all, but the automated scanner says they are, and nobody involved can tell the difference.

Here’s an example – you tell me if this is crazy or not.

For URL:
hostedapp.com/application/invite/?app_id="><script>alert(1407385165.6523)</script>&id=104

Output contains:
<input type="hidden" name="app_id" id="app_id" value="&quot;&gt;&lt;script&gt;alert(1407385165.6523)&lt;/script&gt;">

Report coming back is:
Cross-Site Scripting vulnerability found
Injected item: GET: app_id
Injection value: "><sCrIpT>alert(14073726.2017)</ScRiPt><input "
Detection value: 14073726.2017
This is a reflected XSS vulnerability, detected in an alert that was an immediate response
to the injection.

When you pass in a value, it is escaped; there is no alert box that pops up (in any browser at all). I’m *thinking* that the reporting tool is simply seeing that the number exists on the resulting page (“detection value”) and is flagging this as “bad”. That seems way too naive for a security tool, though.
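For illustration, here’s a minimal sketch (in Java, since the app’s language isn’t shown, and with a hand-rolled escaper purely for the example – the real app presumably uses its own framework’s encoder) of how the scanner’s ‘detection value’ can appear in the page while the payload stays inert:

```java
public class EscapeDemo {
    // Minimal HTML-attribute escaper, for illustration only.
    // '&' must be replaced first so we don't double-escape our own entities.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String payload = "\"><script>alert(1)</script>";
        String value = escapeHtml(payload);
        // The injected number/text still appears in the output, but no tag
        // boundary survives the encoding, so nothing ever executes.
        System.out.println("<input type=\"hidden\" name=\"app_id\" value=\"" + value + "\">");
    }
}
```

The output is the same pattern as the app’s actual response above: the attacker’s string is present, but only as inert text inside the attribute value.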

Is there some other explanation for why a security tool would look at this and still report that this was ‘insecure’?

Is your code portable to subfolders?

Have been dealing with a couple of PHP projects recently which have been a far bigger pain in the backside than I anticipated, and both had some of the same stumbling blocks.

In both cases, and in other projects I’ve seen, there’s a huge assumption that the code will be run from the root of a domain, and all URL and routing management has this assumption baked into everything it touches. What’s the answer? “Just make a new vhost!”, typically. Quite a pain, and it seems to be a real shortcoming of all(?) the major frameworks I’ve looked at of late. I remember being a bit surprised at Zend Framework, as far back as 2006(!), having this be the recommended way of building with the framework.

I’ve gotten more used to Java web stuff (or, at least Spring) which respects whatever pathing your app is deployed to.

redirect(uri:"/foo")

will redirect to bar.com/foo if the code is deployed at bar.com, but to bar.com/subbar/foo if the code is deployed at bar.com/subbar
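For comparison’s sake, a tiny sketch of what the servlet world gives you (contextUrl is a hypothetical helper I’m naming for the example; in a real servlet you’d get the prefix from request.getContextPath()):

```java
public class ContextPathDemo {
    // Hypothetical helper mimicking what Grails' redirect(uri:) does for you:
    // prepend whatever context path the app happens to be deployed under.
    static String contextUrl(String contextPath, String uri) {
        return contextPath + uri; // contextPath is "" at the root, "/subbar" otherwise
    }

    public static void main(String[] args) {
        System.out.println(contextUrl("", "/foo"));        // deployed at bar.com
        System.out.println(contextUrl("/subbar", "/foo")); // deployed at bar.com/subbar
    }
}
```

Because every link and redirect goes through that one prefix, the app genuinely doesn’t care where it’s deployed.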

I recently hit this snag in a PHP project I picked up which uses Slim framework. There are dozens and dozens of URL and route references in multiple files, like

$app->get('/sample-url-path(/)', function() use ($app) {
and
$app->redirect("/");

and there’s no way to just have the code work normally under something like “http://localhost/slimdev/”. I *have* to create a new hostname and vhost just to get this to run. Am I missing a simple global config option someplace that wouldn’t require me to rewrite dozens of lines of code?

Are there any PHP frameworks that can work relatively from a non-root-domain URL invocation?

Perhaps I just need to roll with things, but it makes working with anyone else’s code (even based on a framework) that much harder.

Maybe try grabbing your own code sometime and reinstalling it in a ‘non-traditional’ way, and see how many assumptions you’ve baked in are really necessary, vs just using defaults.

Titanium debug breaking app

I did a quick prototype of a mobile iOS app using Titanium yesterday, and hit a weird issue/bug, but not sure how to report it. Putting it here for now in case it helps someone else.

I’d run the app from Titanium Studio (3.4.0, but it happened in 3.3.0 as well – both using the 3.3 and 3.2.3 SDKs), and when clicking a button, the app would just *die*. There was no logging in the log window – in fact, it looked like the app restarted: the log window cleared and the startup logs from a new run repopulated, but the app didn’t actually restart. Very weird.

Tracked it down to one stupid line. In the function called by the button click, I was

Ti.API.debug(Alloy.Model.modelName);

That was it. Took way too long to track down, but removing that one line made it run just fine again.

What’s even weirder is that I hadn’t seen this earlier in the day. I wasn’t running from Titanium Studio then (well, actually, IIRC, I was using the ‘debug’ runner), but running from the command line with ‘ti build’ and the ‘--shadow’ TiShadow reloader project. For some reason the debug line is fine there, but if the code is running under the debugger inside Studio and that debug line is hit, it just dies.

I hope this helps someone else out there…

iPhone 5s, iOS 7.0.4, jailbroken, and using Skype

I got an iPhone 5s recently, and took advantage of the evasi0n jailbreak. I got Cydia, and reinstalled pdanet for tethering. But… Skype wouldn’t work – it sort of hung. Googled around a bit – needed to update Cydia Substrate. OK – did that. Still didn’t work. I’d read someplace that ‘xCon’ did the trick. Tried it – it actually made things worse: Skype wouldn’t load at all, just crashed on launch. Uninstalled xCon – and everything worked great. Not sure what happened there, but if you’ve got the setup in my post title, try that.

MySQL speed boost

I hit a problem the other day with concurrent queries causing deadlocks.  Using InnoDB gives you a lot of protection with respect to transaction support, but it carries a moderate amount of overhead, and unless you’re aware of what’s going on, you may be paying a higher price which can eventually cause performance or deadlock issues.

FWIW, I thought I knew what was going on, and I *sort of* did, but not entirely.

This article at High Scalability has some good introductory info, but I’ll cut to the chase as to what made a huge improvement for me.

Instead of a standard BEGIN to start a transaction, I set a specific isolation level for just *one* transaction:

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

This took my combined queries from 18 seconds down to 3. And 18 seconds was the average – those queries were often going to 30-60 seconds depending on what other concurrent queries were running. The default REPEATABLE READ isolation level in InnoDB takes (or waits on) a lot of locks, and this was the root of my problems.
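For reference, the pattern looks something like this (the query is just a placeholder; note that SET TRANSACTION without a GLOBAL or SESSION keyword applies only to the *next* transaction in that session):

```sql
-- Affects only the next transaction started in this session
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
START TRANSACTION;
-- placeholder for the actual long-running reporting query
SELECT COUNT(*) FROM some_table WHERE created_at >= '2014-01-01';
COMMIT;
```

Transactions after that one fall back to the session default (REPEATABLE READ, unless you’ve changed it).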

You need to understand what transaction isolation levels are doing, of course, but changing some queries to READ COMMITTED is still pretty safe for what I was doing there, and made a *HUGE* difference in speed. Of course, your mileage may vary, but definitely something to research if you haven’t yet and are facing performance issues.

Facebook app permissions bummer…

When building any site that will interact with Facebook, you need to have a user connect their Facebook account with your site.  You create an app listing on Facebook, get some handshake tokens, put them in your code, then have a user initiate a connection between your site and their Facebook account.

The initiation is usually a button that says something like “Connect with Facebook”.  Behind the button is some code that indicates your token and what permissions your site wants from the requesting user.  Usually you’ll want your site to have their email address, maybe some permissions to read their wall posts or perhaps even post on their wall.  For many types of sites (like a couple I’ve worked on over the last year) you *really* are only using Facebook as an authentication system, and you’re not planning on doing any interaction with Facebook at all, so you don’t really want any permissions to their data or wall or anything else.

However… Facebook *requires* that you get access to certain aspects of the users’ data.  Even if you don’t ask for it.  It’s confusing, poorly documented, and certainly causes many people to abandon signups partway through the process.

Specifically, Facebook will always tell the user that your site/app wants access to the user’s friends list.  Always.

The Facebook developer guide says

“The public profile and friend list is the basic information available to an app. All other permissions and content must be explicitly asked for.”

But… it doesn’t indicate that there will be a popup asking for this.

[Screenshot: Facebook permissions dialog – Screen Shot 2013-10-11 at 8.06.37 AM]

The only “permission scope” being requested is “email”.  But Facebook insists on presenting this warning that MY SITE is REQUESTING “friend list” permissions.  We’re *not* doing this – we do not want the friends list, but have no way of *not* getting it.

Even more confusing, really, is the Facebook documentation on this (their docs have always been an unholy mess, imo)

“When a user logs into your app and you request no additional permissions, the app will have access to only the user’s public profile and also their friend list.”

What happens when you *do* in fact request “additional permissions” is that your app is still presented to the user as asking for their friend list.  I suppose the word “additional” has an implication there, but really, this is computery/programmery stuff – be explicit about what happens in both situations.

More to the point, give people a way to *not* have access to friend lists.  This is off-putting to users, and in an age where privacy is a bigger concern than ever, requiring access to data that is not needed or wanted is negligent.  I suppose allowing that would disrupt all the FarmVille and Candy Crush clones that make a living by getting people to spam their friends.

I know this has been dealt with on stackoverflow more than a few times, but feel compelled to add my 2c.