These are the old pages from the weblog as they were published at Cornell. Visit for up-to-date entries.

March 05, 2004

Crunching feed data

I didn't get enough time today to work through the feed statistics to prove correctness, etc., so most of the analysis will have to wait until next week.

One of the issues I am trying to get a handle on is how to account for the impact of infrequent visits by feed reading agents. The period I am looking at is the 60 days of January plus February of this year, and as I wrote in this earlier posting, in that period 141 agents pulled one of the 3 feeds. But many of those agents were not frequent visitors: about 75% of them returned on fewer than 10 days, and those 75% are responsible for only about 8% of the total feed transfers.

As you can see in the graphs below, the number of return visits by unique clients follows an exponential distribution, while the number of feed transfers per class of return visitor is uniformly distributed. This is rather unusual for a system with decentralized clients, and I believe the uniformity is triggered by the fact that we have aggregators that are configured to pull in a closed feedback loop. That is, whether the user is reading or not, the aggregators will poll all feeds, triggering the uniformity (there are exceptions such as FeedDemon; we'll get into that later).
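A minimal sketch of how figures like these could be derived, assuming the server log has already been reduced to one (agent, day) pair per feed transfer. The record layout, the function name summarize_by_return_days, and the threshold parameter are hypothetical and are not taken from the actual analysis.

```python
from collections import defaultdict


def summarize_by_return_days(records, threshold=10):
    """Group agents by the number of distinct days on which they visited.

    records: iterable of (agent_id, day) tuples, one per feed transfer
    (an assumed layout, not the real log format).
    """
    days_seen = defaultdict(set)   # agent -> distinct days with a transfer
    transfers = defaultdict(int)   # agent -> total number of transfers
    for agent, day in records:
        days_seen[agent].add(day)
        transfers[agent] += 1

    # A "class" of visitor is the number of distinct days an agent returned.
    agents_per_class = defaultdict(int)
    transfers_per_class = defaultdict(int)
    for agent, days in days_seen.items():
        agents_per_class[len(days)] += 1
        transfers_per_class[len(days)] += transfers[agent]

    total_agents = len(days_seen)
    total_transfers = sum(transfers.values())
    infrequent = [a for a, d in days_seen.items() if len(d) < threshold]
    infrequent_transfers = sum(transfers[a] for a in infrequent)

    return {
        "agents_per_class": dict(agents_per_class),
        "transfers_per_class": dict(transfers_per_class),
        # Share of agents seen on fewer than `threshold` days.
        "infrequent_agent_share":
            len(infrequent) / total_agents if total_agents else 0.0,
        # Share of all transfers caused by those infrequent agents.
        "infrequent_transfer_share":
            infrequent_transfers / total_transfers if total_transfers else 0.0,
    }
```

With the real log, infrequent_agent_share and infrequent_transfer_share would correspond to the roughly 75% and 8% quoted above, and the two per-class dictionaries hold the data behind the two distributions.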

A CDF showing that 90% of the feed transfers are triggered
by about 25% of the visitors
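A rough sketch of how such a CDF could be computed, assuming a mapping from agent to total transfer count and assuming agents are ranked from busiest to least busy. The ordering and the names transfer_cdf and agent_share_for are illustrative assumptions; the original graph may have been built differently.

```python
def transfer_cdf(transfers_per_agent):
    """Return (agent_fraction, transfer_fraction) points, busiest agents first.

    transfers_per_agent: dict mapping an agent id to its transfer count
    (a hypothetical input shape).
    """
    counts = sorted(transfers_per_agent.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return []
    points = []
    running = 0
    for rank, count in enumerate(counts, start=1):
        running += count
        points.append((rank / len(counts), running / total))
    return points


def agent_share_for(points, target=0.9):
    """Smallest fraction of agents whose transfers reach `target` of the total."""
    for agent_fraction, transfer_fraction in points:
        if transfer_fraction >= target:
            return agent_fraction
    return None
```

Calling agent_share_for(transfer_cdf(counts), 0.9) on the real per-agent counts would give the fraction of visitors responsible for 90% of the transfers.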

A somewhat friendlier representation of the same data, matching the
number of feed transfers to the class of visitor.

Posted by Werner Vogels at March 5, 2004 04:59 PM