
2010: The year everything died

January 14, 2011 by ian

Man. 2010 turned internet marketing into a charnel house. RSS, SEO, the web were all pronounced dead by one pundit or another. Let's review:

Robert Scoble, Steve Rubel, Jason Calacanis, Jeremy Schoemaker and others all pronounced SEO dead because of Google Instant or blended search or a pressing need for more links to their blogs. Actually, Shoemoney pronounces SEO dead at least once a year, and has made it into a running gag with the SEO community, so I don't think he counts.

Reality. SEO is alive and well after another year of changes. See, there's this one fact all the coroners out there ignore: People still search for stuff. As long as they do, SEO will be around.

In their defense... As long as the entire SEO community rises up in hysteria, linking back to people who claim SEO is dead, expect pundits to make that claim. It's easier than getting links one at a time, that's for sure.

Steve Rubel, flush with victory after claiming SEO is dead (see above) and that the pageview is dead (back in 2007), decided to extend his win streak by claiming that yes, RSS is dead, or at least maimed, too. He then posted a desire to return to feed reading in 2011, so I'm just confused. But other publications and writers also say RSS is fading to dust.

Reality. Andy Beal shows pretty compelling evidence that RSS use is going up, fast. Smart marketers use the technology for all sorts of monitoring and information delivery/consumption. RSS isn't being used by consumers - that's true. That doesn't make it dead. That's like saying jet fuel's dead because we don't use it in cars yet.

In their defense... The world of RSS is an utter mess. Try to automate access to a Google Alert RSS feed and you get errors about half the time. It'd sure be nice, just once, to have a standard that's actually standard.
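If you have to script against a flaky feed endpoint anyway, dumb retry logic papers over a lot. A sketch - the fake feed here just simulates the roughly-half-the-time failures, it's not a real Google Alert fetch:

```python
import time

def fetch_with_retry(fetch, retries=3, delay=0.1):
    """Call fetch() until it succeeds or retries are exhausted.

    fetch is any zero-argument callable that returns the feed body
    or raises on a transient server error.
    """
    last_error = None
    for attempt in range(retries):
        try:
            return fetch()
        except Exception as err:  # in real use, catch urllib.error.URLError etc.
            last_error = err
            time.sleep(delay * (attempt + 1))  # back off a little each time
    raise last_error

# Simulate a feed that fails on the first request:
calls = {"n": 0}

def flaky_feed():
    calls["n"] += 1
    if calls["n"] < 2:
        raise IOError("HTTP 503 from the alerts endpoint")
    return "<rss>...</rss>"

print(fetch_with_retry(flaky_feed))  # succeeds on the second attempt
```

Not elegant, but neither is the feed's behavior.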

Wired Magazine decided that the web is dead (again). They claim that the rise of video and the decline of traditional web traffic means the old WWW ain't what she used to be.

Reality. Video is still primarily delivered using the web. So I'm not even sure that their statistics are valid. I also think that the rise of apps is going to be a temporary one. HTML5 and mobile-friendly javascript frameworks are poised to make web-driven mobile interfaces easier, more efficient and a better user experience than apps. And Wired's own venture into apps hasn't gone that great. That's right, I said it: Apps are dead (snicker).

In their defense... Declaring the web is dead sold a lotta magazines. Apps are damned cool right now. And Cisco gave them a really pretty-looking graph. Plus, Wired publishes a lot of great stuff. They have to go flying off the rails sometimes.

Subtext. The Cisco data they're using is a little murky. Wired doesn't actually say what Cisco is using as their metric. But Cisco typically uses bytes. If the graph Wired published in their article is based on bytes transferred, then the whole premise is badly flawed. Video is a heck of a lot bigger, and transferring 3 minutes of video requires a lot more bytes than transferring an HTML page that takes 3 minutes to read. You might as well claim America is on the rise because we're all fatter now.
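Some back-of-envelope math shows how lopsided a bytes-based metric is. The bitrate and page size below are my rough guesses for illustration, not Cisco's numbers:

```python
# Rough comparison of bytes transferred, assuming ~2 Mbit/s video
# and a 100 KB HTML page. Illustrative guesses, not Cisco's data.
video_bitrate_bits_per_sec = 2_000_000
video_seconds = 3 * 60
video_bytes = video_bitrate_bits_per_sec * video_seconds // 8  # bits -> bytes

html_page_bytes = 100 * 1024  # a 100 KB page that takes 3 minutes to read

ratio = video_bytes / html_page_bytes
print(f"Video moves roughly {ratio:.0f}x the bytes for the same 3 minutes of attention")
```

Measured in bytes, video "wins" by orders of magnitude even when nobody spends a second more on it.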

Keep writing, Steve! Keep going, Jason! You give the rest of us stuff to write about.

Here's to 2011!


Why Google's new in-page analytics sorta sucks: A video

November 12, 2010 by ian

I lie at the beginning of this video. I say that there are some things I love about Google Analytics' new 'in-page analytics' feature. Truth is, I hate it. You know I'm a fan of all things Google Analytics, but man, this one stinks up the joint like a rotting pig carcass. Watch the video to see why, and learn an alternative:

In which I say "don't use in-page analytics!", explain why, and present alternatives.

PS: I don't even mention the part where Google Analytics' page overlay then burrows into your browser like a tick and refuses to go away, so every time you load your site after using it, you get the overlay. Waaaah.


9 things I want to learn in 2011

January 7, 2011 by ian

I'm not a big New Year's Resolutions kind of guy. My whole life is making lists and crossing things off of them. Why do that one more time each year?

I do love to learn stuff though. Here's my learning 'to do' list for the year. I want to learn:

1. How to give a kick-ass keynote. I love public speaking and teaching. I want to take what seems like the next step and give a keynote or two. But I don't want to suck at it. You may get to be the audience, who knows? Please don't throw anything that can cause permanent injury.
2. How to use Google App Engine. For some reason, App Engine is a huge mental block for me. It's like fractions in 2nd grade - no one could explain 'em.
3. Data visualization. I know the basics, but I want to learn the theory and principles behind really good data visualization. Because I'm a geek.
4. Python's natural language tool kit. Because... Well, see #3.
5. A bit of linear algebra. Just the vectors stuff. Yeah, I know. See #3 and #4.
6. How to maintain the latest generation of bicycle drivetrains. What? You thought this would all be work? Compared to when I was a bike mechanic, the shifters, cranks, etc. on high-end bicycles these days might as well be printed circuits. I gotta start all over.
7. 4 new yoga routines. Hopefully ones that don't make my fat hang out, my shorts ride up, my gut gurgle like Old Faithful and my face turn beet red, but if I can avoid 2 out of 4 I'll be happy.
8. How to build the brand of my company and myself. Ironic, I know. I seem to be good at building others' brands, pipelines, etc. But when it comes to my own, well, I kind of suck. My laser focus turns into more of a silly string barrage.
9. Really great paper modeling techniques. My kids, who now play D&D with me, love to see the worlds we create brought to life. Paper models are a great way to do it. It's amazing I ever got married, really. Actually, it's amazing I ever got a date.
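Since #4 and #5 are on the list: the vectors stuff starts gentler than it sounds. A quick Python sketch of dot products and cosine similarity, the workhorse behind a lot of text analysis:

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(dot(v, v))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return dot(a, b) / (norm(a) * norm(b))

print(dot([1, 2, 3], [4, 5, 6]))          # 32
print(cosine_similarity([1, 0], [0, 1]))  # 0.0 (perpendicular)
print(cosine_similarity([2, 2], [1, 1]))  # close to 1.0 (same direction)
```

Treat documents as word-count vectors and cosine similarity tells you how alike they are. That's the bridge from #5 to #4.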



Python web crawler code - use at your own risk


Big changes to the crawler code:

- Switched from urllib, which left sockets open and created memory leaks, crashes and other computer higgledy-piggledy, to httplib.
- Now fetching mime-type and using it to separate images from text pages.
- Better URL handling.
- Cleaner output - removes the domain name from output for smaller, easier-to-handle files.
- Switch to crawl only the current site, or to check other sites, too.
- Checks for URLs previously crawled and marks them as such, but still notes them. That will provide you a complete list of all links on every page, without slowing the crawl.
- Better connection management for faster crawls.
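The mime-type separation boils down to reading the Content-Type response header. A minimal sketch - the helper name is mine, not necessarily what's in the crawler:

```python
def is_image(content_type):
    """Decide from a Content-Type header whether a URL points at an image.

    Headers often carry parameters ('text/html; charset=utf-8'),
    so strip everything after the first semicolon before checking.
    """
    if not content_type:
        return False
    mime = content_type.split(";")[0].strip().lower()
    return mime.startswith("image/")

print(is_image("image/png"))                 # True
print(is_image("text/html; charset=utf-8"))  # False
print(is_image(None))                        # False
```

Images get noted but not parsed for links; text pages get fed back into the crawl queue.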

Things that still bug me:

- The script will time out if the target site times out. Need a way to have it stop gracefully.
- Still not multithreaded.
- Not storing in a database. That's to keep the script simple and portable, but at some point it'll have to change.
- Needs a pretty interface. Working on that next.

Download the code (and contribute to the project by improving the code!) here:

[ CMCrawler - an open source Python web crawler ]
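On the timeout gripe above: one graceful-stop approach is to treat a timed-out host as just another skipped page rather than a fatal error. A sketch, assuming the fetch call raises socket.timeout:

```python
import socket

def fetch_or_skip(fetch, url):
    """Return (url, body) on success, (url, None) on timeout.

    fetch is any callable taking a URL; a real crawler would pass
    an httplib/http.client wrapper here.
    """
    try:
        return url, fetch(url)
    except socket.timeout:
        # Note the dead URL and move on instead of letting the crawl die.
        return url, None

def slow_server(url):
    raise socket.timeout("no response")

print(fetch_or_skip(slow_server, "http://example.com/slow"))
```

The crawl loop then just checks for a None body and keeps going.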


This is a command-line Python script. It doesn't get much uglier, just so ya know. But it's fast, lightweight and the output is easy to mash for generating XML sitemaps, checking for 404 errors on your site, or just getting a sense of a site's layout.


As a speed reference: It averages 90 seconds to crawl 700 or so pages. It is single threaded (at the moment).


You must have Python installed. If you don't, or don't know how to install it, frankly I don't suggest you mess with this just now. It's not a mature-enough program yet.


You also need one library that doesn't come standard with Python: The fantastic BeautifulSoup library. It's worth the effort, and without it, writing this crawler would have reduced me to a damp, gibbering lump of flesh under my desk.
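If you're curious what BeautifulSoup is doing for you, here's the same kind of link extraction sketched with only the standard library's html.parser - a toy version for illustration, not what the crawler actually runs:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags as a page is parsed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = '<html><body><a href="/about">About</a> <a href="/blog">Blog</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['/about', '/blog']
```

BeautifulSoup does this plus survives the horribly malformed HTML real sites serve, which is exactly why it's worth installing.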


Finally, you need to know how to use the command line on your computer, just a little bit.

1. Download the code.
2. Extract the compressed archive to your hard drive. Put it wherever you want - just make sure you remember the location.
3. Start up your command-line client. On my Mac I use the trusty BASH shell.
4. Navigate to the folder where you put the script.
5. Type python cmcrawler.py [domain to crawl] [stay within domain]. Domain to crawl is your site's domain, without the leading 'http://'. Stay within domain is a '0' for 'stay within this domain' or a '1' for 'crawl everything'. For God's sake, stick with 0 for now, OK?
6. The script will spit out the results of the crawl, as they happen. The results are tab-delimited, so you can easily cut-and-paste them into a text editor or Excel.
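Because the output is tab-delimited, it drops straight into Python's csv module if you want to post-process it. A sketch - I'm making up the column layout (URL, status, mime type) for illustration, so check the real output first:

```python
import csv
import io

# A fabricated two-line sample of tab-delimited crawler output.
raw = "/about\t200\ttext/html\n/logo.png\t200\timage/png\n"

rows = list(csv.reader(io.StringIO(raw), delimiter="\t"))
for url, status, mime in rows:
    print(url, status, mime)

# Split text pages from images using the mime-type column:
pages = [r for r in rows if r[2].startswith("text/")]
print(len(pages))  # 1
```

From there, generating an XML sitemap or a 404 report is a short loop, not a project.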

A few folks at #seochat last night asked for the code from a Python-driven web crawler I'm working on, so here it is, in a Github repository.


I'm just warning you: This is some ugly stuff in that code. This was the very first Python code I wrote. Ever. It does all the horrible things developers do when learning a new platform.


I'll update it as often as I can. The code is totally, 100% free for everyone to use. There are a few conditions though:

- You can't use this for a commercial project without talking to me first.
- Please improve it! Check out the issues page on Github and see what you can do. Send me feedback.
- You are not permitted to laugh at my lack of coding-fu.

Enjoy.


PS: This crawler is really me hacking together great libraries other people wrote. I get no credit for anything that works.


[ CMCrawler - an open source Python web crawler ]
