Listening to Joe Stump from Digg.com talk about SOA, MySQL, and some PHP. One key thing he keeps repeating is using a service layer to access data asynchronously. His advice right now is to group data requests at the top of a user request, fire them off asynchronously, and then use the data in the rendering when it comes back. Services_Digg_Request is some code he’s written and published as a PEAR package to demonstrate a technique for bundling your data requests asynchronously. The package does a lot of low-level socket management for the various data endpoints that you want to request data from (HTTP-only services, I believe).
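To make the "request everything up front, render when it comes back" idea concrete, here is a minimal sketch. It is written in Python rather than PHP, it is not the Services_Digg_Request API, and it uses a thread pool instead of the raw socket handling the package does. The functions fetch_user and fetch_stories are hypothetical stand-ins for calls to real service endpoints.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_user(user_id):
    """Hypothetical service call; the sleep stands in for network latency."""
    time.sleep(0.1)
    return {"id": user_id, "name": "example-user"}


def fetch_stories(user_id):
    """Hypothetical service call returning that user's stories."""
    time.sleep(0.1)
    return ["story-a", "story-b"]


def handle_request(user_id):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Top of the request: start every data request before doing anything else.
        user_future = pool.submit(fetch_user, user_id)
        stories_future = pool.submit(fetch_stories, user_id)

        # Any work that does not need the data could happen here.

        # Rendering time: block only now, when the data is actually needed.
        user = user_future.result()
        stories = stories_future.result()

    return "{} has {} stories".format(user["name"], len(stories))


if __name__ == "__main__":
    print(handle_request(7))
```

Because both calls are in flight at the same time, the wait is roughly the duration of the slowest call rather than the sum of the two.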
Haven’t ‘dugg’ into the code too much, but the approach seems reasonable. He’s using the lovely __get() magic method, which I’m really not a fan of, but perhaps that’s nitpicking a bit.
In discussing HTTP layer requests, he suggests not using Apache, but lighttpd or nginx instead.
DBSlayer and Gearman were discussed in a bit of detail. DBSlayer is too tightly coupled to MySQL for Joe’s taste. Gearman looks cool, but there’s no documentation right now. Joe and someone from Yahoo are working on a PHP client package for Gearman. Big issue right now: the queue isn’t persistent, so restarting Gearman means you’ll lose any queued work. There’s an undocumented workaround; contact Joe for more info.
Other suggestions:
- run SOA requests in parallel
- bundle your logic in endpoints to have an endpoint do multiple things
- keep junior guys away from low-level service writing
- build intelligent caching strategies into your services layer
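As one illustration of the caching suggestion, here is a small sketch of a service wrapper that remembers results for a fixed time. This is my own example, not something shown in the talk: the CachedService class, its ttl_seconds parameter, and the injectable clock are all hypothetical, and a real services layer would more likely sit in front of a shared cache than a per-process dictionary.

```python
import time


class CachedService:
    """Wraps a fetch function and reuses each result until it expires."""

    def __init__(self, fetch, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch      # function that actually calls the endpoint
        self._ttl = ttl_seconds  # how long a cached value stays fresh
        self._clock = clock      # injectable so expiry can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            # Still fresh: skip the backend call entirely.
            return entry[1]
        # Missing or expired: fetch again and store with a new expiry time.
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    service = CachedService(lambda key: "data for {}".format(key), ttl_seconds=5.0)
    print(service.get("front-page"))  # goes to the backend
    print(service.get("front-page"))  # served from the cache
```

Keeping the cache inside the service layer means callers never need to know whether a value came from the backend or from memory.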