- It is refreshing to discover others who are pursuing research paths similar to ours.
- Their work gives more fuel to the approaches we have been taking.
- Some of the formalism is elegant and useful (particularly for talking about blogs and entries); however, some of it gets cumbersome (the lookup table is definitely necessary).
- Using the Lucene library to index the data is an idea that we could consider (we have been storing the data in a MySQL database of our own design).
- I'd be curious to know how many degrees of separation they crawled from their seed blogs. I would guess that they did not get too far since the blogs in our study appear to be less sparse on average. They reported over 2,000 blogs having 1 million entries (on average, 50 entries per blog per month). In Social Capital in the Blogosphere we retrieved blog content just two degrees away from Scoble (our single seed blog) and obtained over 38,000 blogs having 13 million entries (on average, 28.5 entries per blog per month).
- How did they perform blog entity resolution? (here is an approach that we used)
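
As a quick sanity check on the figures quoted above, the totals and the per-month averages can be combined to estimate how many months each data set covers. This is only a rough sketch: the function name `implied_months` is made up for illustration, and "over 2,000" and "over 38,000" are treated as exact counts, so the results are approximations rather than reported values.

```python
def implied_months(blogs, entries, entries_per_blog_per_month):
    """Months of coverage implied by the totals and the monthly posting rate."""
    entries_per_blog = entries / blogs
    return entries_per_blog / entries_per_blog_per_month

# Figures as quoted in the list above (assumption: rounded counts taken as exact).
theirs = implied_months(2_000, 1_000_000, 50)
ours = implied_months(38_000, 13_000_000, 28.5)

print(f"theirs: {theirs:.1f} months, ours: {ours:.1f} months")
# prints "theirs: 10.0 months, ours: 12.0 months"
```

Under these assumptions the two collections span similar periods (roughly ten and twelve months), so the difference in size would come mainly from the number of blogs reached rather than from a longer collection window.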