Technorati Confusion

Over the years (my blogging years) I've had an up and down relationship with Technorati.

In recent times, it's mostly been up. Now that I've given up on Trackbacks, the Technorati link in the flare at the bottom of each post is my replacement for them.

And on my blog, at the bottom of the left sidebar, I have a link to Technorati and Google so that you and I can check on who is linking to this blog. As a result, I visit Technorati and Google blog search at least once a day and often several times. They both do things differently so I tend to visit both to get a complete picture of who is linking to me at any time.

But there is something about Technorati that I don't understand. And it's been bothering me lately.

If you look at the top of the Technorati results for this blog:

Avc_technorati

You will see that they are counting 2,900 links from 1,245 sites and that the count was last updated 66 days ago. You will also see that 20 posts in the last day linked to my blog. I don't think the 2,900 link count is anywhere near right.

Google says that number is 4,238 as shown below and Google has been counting links for a lot less time than Technorati.

Avc_google

So I decided to look at some other blogs with audiences similar to mine. After looking around, I settled on Jeff Jarvis' Buzzmachine and Paul Graham's site, both of which I read and link to regularly. Here is the Alexa chart showing that these blogs have similar traffic to this one.

Graph

After looking at Technorati and Google for both Paul and Jeff, I learned that Technorati and Google seem to get similar results on Jeff and Paul's blogs, but not mine.

Here are Jeff's stats on Technorati and Google.

Jarvis_technorati

and

Jarvis_google

They match within about a hundred links.

Here are Paul's stats:

Graham_technorati

and

Graham_google

In Paul's case, Google has about 500 more links than Technorati, but again it's fairly close.

There are some other strange things. Jeff's Technorati count was updated 6 minutes ago, Paul's was updated 11 days ago, and mine was updated 66 days ago.

According to Technorati, I got 20 posts linking to me in the past day (about the usual), Paul got the same, and Jeff got 18 posts in the last 47 days?

I think something is wrong with the link counting system that Technorati is using, and it bothers me to see my link count remain static at 2,900 links from 1,245 sites for the past couple of months while others' counts are updated hourly.

Does anyone have any suggestions on how I can get Technorati to update my link count more regularly?

The reason that I am making such a big deal about this is that I have suggested to a number of companies over the past six months that they use Technorati's link count as a measure of a blog's influence and reputation, both for advertisers and inclusion in aggregation services.

Technorati is doing the blog world a great service by maintaining these link counts. But unfortunately, they are not doing it well enough for my taste.

Comments

Fred,

Great post, and sorry about the problems and confusion you've been having with Technorati.

First off, we are in the middle of a huge upgrade to our link count system. This has unfortunately taken longer than anyone anticipated, due to the overwhelming increase in the size of the blogosphere, some issues with blog hosting providers and how we do canonicalization, and bugs in the system. This is very important stuff to us (and to you!) and we have been working very hard to make things go right.

First off, some minor points. Since you use Typepad, and we have a relationship with Six Apart to get streams of your post updates, we are indexing your posts in near-realtime. However, Typepad thinks your blog is at http://avc.blogs.com/a_vc so that's why your "last updated" time was set at 66 days ago. We don't currently have a good way of linking the two URLs (http://avc.blogs.com and http://avc.blogs.com/a_vc ) together in a rational way. We are working with the Typepad folks to find a way around this, but there is no simple standard right now that they are using that we can follow. Perhaps when customers demand to them that this become a higher priority we'll see this get fixed more quickly.

In the meantime, you can always make sure that Technorati is indexing your blog at the URL you like by going to http://technorati.com/ping and putting in your blog URL. Then bookmark the resulting page, and you can make sure that we're always indexing your latest stuff by just clicking on that bookmark whenever you post.

I just did this for you, btw, so you'll see that your "last updated" date is more recent.

To clear up a misunderstanding, however, I want to make it clear that "last updated" is not necessarily the last time we did a full count of your link counts. Please don't confuse your "last updated" time with the last time that we calculated your full link counts. I apologize that sometimes the link counts get "stuck" for a few days (or even weeks in the worst case) at a time, but I hope that we'll be able to win back your confidence when we roll out our improved link counter very shortly.

Again, sorry about the problems and confusion, and thanks for bearing with us while we sort things out...

Of course, if you ever have any other questions, confusion, or feedback for us, please don't hesitate to drop me a line - dsifry AT technorati DOT com, or call me on my cell phone at 415 846-0232.

Dave

The answer is not to rely on Technorati any more.

Fred, apologies for bringing my case here, but I think it's relevant, and perhaps prompts David to further clarify.

David, as you pointed out to me earlier, I have the same confusing setup: my blog URL is www.zoliblog.com, but www.zoliblog.com/blog is used in some situations.
Originally I only had the main url claimed on Technorati, but I've registered the other one, for test purposes. The difference in numbers is quite amazing:

zoliblog.com has 330 links from 184 sites (a display which, btw, has not changed for close to a month, even though the actual links did).

zoliblog.com/blog has 1921 links from 166 sites. That's a huge difference.

What's the solution? Which url should I keep claimed on Technorati?

Thanks.

Dave - I'm 99% sure Fred has PingShot set in FeedBurner. This should be automatically pinging Technorati whenever there is a new post. This might be another way to get around the Typepad issue described above. Fred - if you don't have PingShot enabled, you should enable it.

Fred, I had something similar happen with technorati a while back. As Dave says, there are two issues here. The "last updated" is a ping issue, meaning they don't see your posts. At one point Technorati wasn't seeing my pings either, even though I was pinging their ping server directly. The problem fixed itself after a couple of weeks.

On the link count issue, I have no idea if it's an error or not. I never think of technorati as "accurate", but I do assume that any errors in their link counting roughly affect all blogs in the same way, so at least we're comparing apples to apples. If that's not the case, then it's a problem that hopefully will be fixed when the new system Dave Sifry mentions above is rolled out.

Do you think that is the worst part?

One day I was claiming a couple of my blogs on Technorati, and after going through a couple of steps I noticed that the title of the blog was incorrect. On a closer look, in the middle of the claiming process, they had started showing me that I was claiming somebody else's blog. I consider that a "security bug", and cancelled my claim immediately, because somebody else was a click away from claiming my blog.

And before that, I had claimed 3 extra blogs successfully, and 15 minutes later all of them had disappeared from my list of blogs.

For the size of their service, they should be doing a little bit better fixing some of these bugs.

Marcelo
Sampa Corp.
www.sampa.com

DaveS -

sorry, i don't buy the response that the dual-URL recognition is an issue for TypePad to fix. (note: i also have the same problem with T'rati not recognizing two URLs as the same blog).

clearly you guys seem to understand the problem, as you identified it above... and since Technorati is the site that is providing the metrics analysis, shouldn't it be *YOUR* responsibility to figure out how to make your own numbers work?

i've got to believe it's not that hard to figure out how to eliminate & combine stats from redundant URLs, particularly when the service is as popular as TypePad and when you guys seem to have had a good bit of time to address this.

blog update stats & link counting metrics have got to be at the heart of the service you guys offer... shouldn't this be higher-up on the agenda?

looking forward to the new updates,

- dmc

Thanks for the comment dave

There is no company providing exactly what you are (raw link count and # of unique sites)

And technorati picks up blogroll links and google does not

But I think my link count is way off and it bothers me

Of course everyone thinks they get more links than they do and I am surely guilty of wishful thinking

But I've watched my technorati rank drop every day for the past year while getting 20 or more links per day every day during that period and I see blogs in the top 100 that aren't getting that many per day

So it's frustrating

I am sure typepad and the dual URL are to blame. But there are other typepad users like seth godin who don't have this issue.

So I will forward this to TypePad and ask them to help you. They usually are willing to help

What exactly should I ask them to do?

Fred

guys - I have been following this and pingshot is spotty at best with Technorati. I usually go and manually ping. It is confusing

I find this all too funny! The 66 is the number of days since Technorati last indexed Fred's blog. My primary blog hasn't been indexed in 95 days.

http://www.technorati.com/blogs/http://www.kbcafe.com/iBLOGthere4iM

And I have others that haven't been indexed in 23 days.

http://www.technorati.com/blogs/http://www.kbcafe.com/Rmail

This problem has been going on for many months. Darren Rowse's blog wasn't indexed for more than 200 consecutive days.

http://www.problogger.net/archives/2006/01/30/technorati-problems-update/

Dave Sifry pretty much responded to Fred the same as he responded to Darren months ago. "We're working on it." And he manually updates your blog in their index. At what point do you stop kicking a dead horse?

Fred, your problem is that you think that they are IBM but in reality they are just 'web 2.0 kids gone wild'. as a father you should really know better;)

I see the same problem with my blog as well... inaccurate stats (in fact so inaccurate that it would be better to not have them there in the first place).
On the other hand, look at google blogsearch... a new service, and yet much more accurate... also picks up my blogs ultra fast..... and more importantly there is consistency in the way it works, something that I have come to NOT expect from technorati.

guess blogpulse.com is a nice service and could be a nicer alternative to technorati, if they add a few features etc.

Randy, well said. I've been saying it's high time they get acquired :-)  Combine what’s good in Technorati – the innovation – with the infrastructure the GYM can provide…

At least Google Video is the most popular blog at this given minute! http://technorati.com/pop/blogs/

I feel for Technorati. Sometimes they get too much praise, but they are also kicked a little too much as well.

It is not easy doing what they are trying to do. Seriously. And I hope for the sake of a lot of the A-listers they are not close to being done.

One request:
Please TRati, I realize you are going the way of the pageview Bmodel, but make it easier for a novice to subscribe to a search feed! :)

These are all great comments, and we're working on the issues as hard as we can.

Thanks again for all your support, and also for the honest criticism and feedback. We are by no means perfect, but we're working on getting better each day. Zoli, your issues should be fixed now. Randy, yours too. If you keep seeing any other issues, please do let me know and we'll get on it asap.

Dave

As someone who once upon a time built a Technorati competitor I can actually attest to the fact that there is an issue w/ respect to Technorati, Typepad and Feedburner. The problem is identifying the blog / feed in the system correctly. In my previous life we used to have this exact same problem and the issue (imho) is that Typepad makes things confusing for the system builder because there is both:

http://avc.blogs.com/ and http://avc.blogs.com/a_vc and then there's the Feedburner one to boot.

I know that this may seem like a little issue, but it's not, because (I'm guessing) the link count numbers will vary depending on how the blog is identified to Technorati at a specific call time: each "way" of identifying the blog was crawled more or less recently, with corresponding numbers. Fixing these issues by hand is simple. Dealing with them at large, when you're crawling tens of millions of blogs, isn't.

This problem, what's called "url normalization" (at least to me) really is tricky. And little things like this cause ripple effects throughout the whole system.

I'm not saying that there isn't a problem here but like a lot of technical things that can be easily fixed in small scale systems, this one is hugely difficult in big systems.
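To make the "url normalization" point concrete, here is a minimal sketch in Python of how two addresses for the same blog might be folded into one key before link counts are combined. It is not how Technorati does it. The alias table, the function names, and the normalization rules (drop the scheme, a leading "www." and a trailing slash) are all assumptions made up for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical alias table (an assumption for this sketch): maps a normalized
# URL to the URL the blog owner considers canonical.
ALIASES = {
    "avc.blogs.com/a_vc": "avc.blogs.com",
    "zoliblog.com/blog": "zoliblog.com",
}

def normalize(url: str) -> str:
    """Reduce a blog URL to 'host/path': lowercase host, no scheme,
    no leading 'www.', no trailing slash. The path keeps its case."""
    url = url.strip()
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the host
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def canonical(url: str) -> str:
    """Look the normalized URL up in the alias table, falling back to itself."""
    key = normalize(url)
    return ALIASES.get(key, key)

def merge_link_counts(counts: dict) -> dict:
    """Sum raw link counts for every URL that resolves to the same blog."""
    merged = {}
    for url, n in counts.items():
        key = canonical(url)
        merged[key] = merged.get(key, 0) + n
    return merged
```

The hard part described above is not this function but building the alias table for tens of millions of blogs. Also note that only raw link counts can be summed this way: "unique sites" counts would need the underlying list of linking sites, since the same site may link to both addresses.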

Lies, damned lies... and statistics.

The Vision is Good. The Execution is Not:

I've been hearing about issues with Technorati’s accuracy, systems, and customer service for several months now.

I even have my own problems with Technorati (though I am too much a small fish to do anything about it.)

I think Technorati is overwhelmed. I know for a fact that they are very aware of many different issues. How could they not be aware of at least some of the issues, right? I just think that they don't have the resources to do everything they need to be doing. This could be purely a resource problem or it could be a problem with misallocation of resources (i.e. a management problem).

But the bottom line is, obviously, a lot of people are not happy. And that's really hard, because, after all, Technorati is one of those huge and great ideas that has changed the face of blogging.

They are only at the tip of the iceberg with their vision. They are possibly the Nielsen of the Internet! We can't do without them.

Or, I should say: we can’t do without the idea of Technorati.

Technorati could take some lessons from FeedBurner. (that is, unless FeedBurner eats their lunch right now...) FeedBurner offers up consistent, or should I say, constant quality. I have NEVER had any problems with FeedBurner. I understand, they are two completely different product sets, but FB is making more and more inroads towards the consumer. Little by little, they are becoming less and less geeky-techie, and more and more mainstream-minded. And their quality is superb. So far, FeedBurner is a sure thing.

I think I'd be surprised if FB wasn't already working on a competing product to Technorati. And I bet it would work really, really well.

So, for God's sake, whether it be Technorati or a firm like FeedBurner, could somebody please feed this Vision some more money?

And while you are at it (that is, "you" with the money), can somebody please get me onto a new job/project? This is so much more invigorating than my current projects.

:-)

Jim

I am a TYPEPAD blog, FEEDBURNER user and I have not been indexed by TECHNORATI for two months. I spent the last 3 weeks going back and forth between the 3 companies over 50 emails, numerous support tickets, and I am nowhere. I was indexing just fine when the blog opened and then it just stopped. TECHNORATI blames it on FEEDBURNER & TYPEPAD, the re-direct they now use. I can't for the life of me get a decent answer from TYPEPAD or TECHNORATI. FEEDBURNER's response is "we have no idea, TECHNORATI doesn't work with us." And around and around you go, 3 companies who refuse to simply work together and fix the problem. As long as the re-direct is on, the indexing is off. Posts to the FEEDBURNER FORUM bring no results, as do my emails. I am taken aback that a TYPEPAD forum does not exist. Numerous requests to TYPEPAD to open a forum for users to share ideas and issues are met with no answers.

You probably did not ping Technorati about your own blog, which is why it has not been updated for 66 days. If you do, it is usually updated the same day, though sometimes there is a delay of a couple of days.
