I was just going through some old work files and came across some research I did in 2010 and 2011 but had completely forgotten about until now. It was a fairly straightforward analysis of how often keywords occur in the domains, subdomains and URL paths of Google’s top 100 results for the term [broadband].
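For anyone wanting to try something similar, here is a minimal sketch of that kind of tally in Python. It is not the original analysis code. The sample URLs and the `keyword_positions` and `tally` helpers are made up for illustration, and the domain/subdomain split naively assumes the registered domain is the last two host labels, which is wrong for suffixes like .co.uk (a public suffix list would be needed for real data).

```python
from urllib.parse import urlsplit

# Hypothetical sample URLs standing in for a list of search results.
SAMPLE_URLS = [
    "http://www.broadband-example.com/",
    "http://broadband.example.com/deals",
    "http://www.example.org/broadband/compare",
    "http://www.example.net/news",
]


def keyword_positions(url, keyword):
    """Report whether the keyword appears in the domain, subdomain or path."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").lower().split(".")
    # Naive assumption: the registered domain is the last two labels.
    domain = ".".join(labels[-2:])
    subdomain = ".".join(labels[:-2])
    kw = keyword.lower()
    return {
        "domain": kw in domain,
        "subdomain": kw in subdomain,
        "path": kw in parts.path.lower(),
    }


def tally(urls, keyword):
    """Count how many URLs contain the keyword in each position."""
    counts = {"domain": 0, "subdomain": 0, "path": 0}
    for url in urls:
        for place, found in keyword_positions(url, keyword).items():
            counts[place] += int(found)
    return counts


counts = tally(SAMPLE_URLS, "broadband")
print(counts)
```

Splitting the host with `urlsplit` rather than a regex keeps the scheme, port and query string out of the match, so a keyword in a query parameter is not miscounted as a path hit.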
Below is my summary of the sessions I attended, along with a selection of the many interesting things I learned: Read more »
These days most crawling and scraping tools use XPath or, increasingly, CSS3 selectors to parse HTML rather than the lengthy and obtuse regular expressions of old.
However, unlike regexes, there’s not much in the way of online testing tools for CSS3 selectors (i.e. none that I could find), so using the excellent CSS selector support in Mojolicious I knocked up a simple CSS3 selector tester (say that 3 times fast) in about an hour. Read more »
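To show why selector-style parsing is nicer than regexes, here is a rough sketch in Python rather than Perl. It is not the tester itself and does not use Mojolicious. It relies on the limited XPath support in the standard library's `xml.etree.ElementTree`, which only copes with well-formed markup, so real-world HTML would need a more forgiving parser. The sample page and the `result_links` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (ElementTree rejects sloppy HTML).
PAGE = """<html><body>
<div class="result"><a href="http://example.com/broadband">Deals</a></div>
<div class="advert"><a href="http://example.org/offer">Offer</a></div>
<div class="result"><a href="http://example.net/compare">Compare</a></div>
</body></html>"""


def result_links(markup):
    """Return the href of every link directly inside a div with class="result"."""
    root = ET.fromstring(markup)
    # Roughly the CSS selector `div.result > a`. Note that the XPath predicate
    # matches the whole class attribute exactly, whereas the CSS class selector
    # would also match class="result featured".
    return [a.get("href") for a in root.findall(".//div[@class='result']/a")]


links = result_links(PAGE)
print(links)
```

The one-line path expression replaces what would otherwise be a fragile regex that has to cope with attribute order, whitespace and nesting.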
Slightly odd since I haven’t changed anything or even logged in for a month or so. My first instinct was to check Google Webmaster Tools for messages or outages of the blog, since I have been doing some crawl testing on the site. The only thing I found was a rather weird notice about a few pages showing “soft 404s”. That seems unlikely and doesn’t seem to be an issue now, and given it appeared ~20 days before the penalty/filter, it probably isn’t the culprit. Read more »
I’ve mentioned before that having a pet project is a key incentive to learn more in the world of coding.
I wanted a bookmarking service that was better than delicious.com and allowed full text search. There didn’t seem to be any out there that I liked, so I decided to try to build one in my spare time. When I started on bkmrx.com, beyond that vague notion, I had no idea how all the pieces would slot together. I had to do a lot of research, planning and scoping while working on it. Below are 20(ish) cool projects I found in the course of building it: Read more »