Saturday, August 02, 2008

Eclipse for Web Development

Well, I've been doing quite a bit of programming lately; however, it has been in a number of different languages. Naturally, I have considered Eclipse as a possible editor. Vanilla Eclipse is geared toward Java development and can be tricky and tedious to set up for Web development (hence the rise of Aptana). Eclipse can be great to work with, but only when you can get it working how you'd like. The following update sites may help you get Eclipse set up the way you want:

JavaScript
http://download.macromedia.com/pub/labs/jseclipse/autoinstall (JSEclipse)

Python
http://pydev.sourceforge.net/updates/ (PyDev)

PHP
http://update.phpeclipse.net/update/nightly (PHPEclipse)

Java Tapestry
http://m2eclipse.sonatype.org/update (Maven2)
http://jettylauncher.sourceforge.net/updates (Jetty)

Currently, all of these plugins can be loaded into a single installation of Eclipse Europa. However, I'm not sure that they are all compatible with Ganymede (latest version of Eclipse).

Tuesday, July 08, 2008

Google Visualization API


Here is another great presentation from Google I/O.

Monetizing Social Application Traffic


This is a presentation from Google I/O that was done by a company called SocialMedia.

Saturday, May 24, 2008

Nifty Data Technique

Google Spreadsheets has now added some nifty ways to auto-fill data. For instance, rather than typing all of the days or months, you can simply type two or three, select them, and then drag the little blue square in the bottom right corner of the selection. The rest of the days or months will then be populated below. That is nice, but what I think is much more interesting is that you can click and drag while holding down Ctrl (Windows and Linux) or Option (Mac) to pull data from Google Sets. So, in the image below, I only filled in the first three rows of each column. Then, I used the former technique to auto-fill the first three columns and the latter technique (holding down Ctrl or Option) to auto-fill the extra twelve rows.

Nowadays, software developers, such as Google, have a great opportunity to utilize the ginormous pile of data available online. The data that individuals generate is ever increasing and can be extraordinarily useful.


Save R Plot in EPS format

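Here is a code example of how to save an R plot in EPS (instead of PS):
# Open a PostScript device configured for EPS output
postscript(file="testplot.eps",
           paper="special",    # size the page to the plot itself
           width=10,           # inches
           height=10,          # inches
           horizontal=FALSE,   # portrait orientation
           onefile=FALSE)      # one figure per file

yvalues <- runif(50)  # 50 uniform random values
plot(yvalues)

dev.off()  # close the device so the file is written
The variation is adding paper="special" and horizontal=FALSE; R's documentation for postscript() also recommends onefile=FALSE when the goal is an EPS file.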

Friday, May 16, 2008

Java Programming Notes

Java Programming Notes is a handy Java reference by Fred Swartz. In his words, he explains:
These Java programming notes are written to fill in missing or weak topics in textbooks that I've taught from. Many pages are useful for reference, but not as an ordered tutorial. Some pages are still rough drafts, but I'm slowly working on fixing them.

Monday, May 05, 2008

Walmart Visualization

Here is an interesting animation of Walmart store growth over time. Below is a snapshot of the movie in progress (1991).

Sunday, May 04, 2008

Virtual Host Setup

To add a virtual host on your local machine (running Apache), do the following two things:

1. Add a virtual host definition to your Apache configuration file, like this:

<VirtualHost *:80>
ServerName sitename
DocumentRoot "/location/of/your/site/"
</VirtualHost>

2. Add a corresponding line to your HOSTS file (on my Mac, it is located at /etc/hosts).

127.0.0.1 sitename

You should then be able to access your site in any Web browser by going to:

http://sitename

This then allows you to develop locally in an environment nearer to how it will likely be deployed.
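
One quick way to confirm that step 2 took effect is to check that the name resolves to the loopback address. Here is a minimal Java sketch of that check. The class and method names (HostCheck, resolvesToLoopback) are hypothetical, and sitename is just the placeholder used above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for checking a local virtual host entry.
class HostCheck {
    // Returns true when the host name resolves to a loopback address
    // (for example 127.0.0.1), and false when it resolves elsewhere
    // or does not resolve at all.
    static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "sitename" is the placeholder from the HOSTS entry above.
        String host = args.length > 0 ? args[0] : "sitename";
        System.out.println(host + " resolves locally: " + resolvesToLoopback(host));
    }
}
```

If this reports false right after you edit the HOSTS file, run it again in a fresh JVM, since Java caches name lookups within a running process.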

Friday, May 02, 2008

Abstract classes and Interfaces

In response to some of the questions asked in class today, I compiled some properties of interfaces and abstract classes that should help guide your choice of whether to use an abstract class or an interface as a parent class.

Neither an interface nor an abstract class can be instantiated. Both can be used as a template for concrete (implemented) child classes.

Interfaces
  • example interface definition:
public interface Monkey {
    public double getWeight();
    public void setWeight(double w);
    public void walk();
    public void talk();
}
  • instance fields (i.e., member variables) are not allowed; only constants (implicitly public static final) may be declared
  • all methods are implicitly abstract
  • a child class can implement many interfaces in Java
  • concrete child classes must implement all methods
Abstract Classes
  • example abstract class definition:
public abstract class Monkey {
    private double weight;

    public Monkey() {
    }

    public double getWeight() {
        return weight;
    }

    public void setWeight(double w) {
        weight = w;
    }

    public abstract void walk();
    public abstract void talk();
}
  • may have members (e.g., weight)
  • may have implemented methods (e.g., getWeight, setWeight) and abstract methods (e.g., walk, talk)
  • a child class can only extend a single parent class in Java (multiple inheritance is not allowed)
  • concrete child classes must implement all of the parent's abstract methods
Section 4.4 in Data Structures and Problem Solving in Java discusses this more extensively. 

This talks more about when you might use one, the other, or both. Furthermore, I found some questions and answers about the two that interviewers like to use. ;)
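
To make the contrast concrete, here is a minimal sketch that uses both together. All of the names (Walker, Talker, AbstractMonkey, Chimp, Robot, MonkeyDemo) are hypothetical and chosen only so the example compiles on its own; the methods return strings rather than void so the behavior is easy to print.

```java
// Hypothetical names throughout; a sketch, not code from the textbook.
class MonkeyDemo {
    public static void main(String[] args) {
        Chimp chimp = new Chimp();
        chimp.setWeight(45.0);
        // Both objects can be treated as Walkers, even though they
        // share no implementation.
        Walker[] walkers = { chimp, new Robot() };
        for (Walker w : walkers) {
            System.out.println(w.walk());
        }
        System.out.println(chimp.getWeight());
    }
}

// Two small interfaces: a class may implement both.
interface Walker {
    String walk();
}

interface Talker {
    String talk();
}

// The abstract class holds shared state (weight) and shared behavior,
// and leaves walk() and talk() for its children to implement.
abstract class AbstractMonkey implements Walker, Talker {
    private double weight;

    public double getWeight() {
        return weight;
    }

    public void setWeight(double w) {
        weight = w;
    }
}

// A concrete child extends exactly one parent class and must supply
// every abstract method it inherits.
class Chimp extends AbstractMonkey {
    @Override
    public String walk() {
        return "knuckle-walk";
    }

    @Override
    public String talk() {
        return "hoot";
    }
}

// An unrelated class can honor the Walker contract without
// inheriting anything from AbstractMonkey.
class Robot implements Walker {
    @Override
    public String walk() {
        return "roll";
    }
}
```

The rough rule of thumb this illustrates: reach for an abstract class when children share state or code, and for an interface when unrelated classes only need to share a contract.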

Java Tutorials

Sun provides some excellent tutorials that cover most aspects of programming in Java. Learning the Java Language is a set of tutorials, or "trails", on fundamental topics.
The content of these trails is also available as a book, called The Java Tutorial, Fourth Edition.

Wednesday, April 23, 2008

ForwardTrack

ForwardTrack is an open source tool (now written entirely in PHP) that allows email campaigns to be tracked and mapped as they are forwarded from person to person. This is definitely useful, as it reveals the spread of information and some of the underlying social network.

Political Campaign Contributions

The Federal Election Commission (FEC) requires that all campaign contributions over $200 (per donor) be reported publicly. The reported information includes the donor's name, job title, zip code, and even address. All of it since 2001 is available electronically via FTP at ftp://ftp.fec.gov/FEC/electronic/.

In collaboration with political scientists here at BYU, we have been performing record linkage (a.k.a. entity resolution) on this data, so that they will be able to perform their studies more accurately.
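
As a toy illustration of what record linkage means here (and not a description of any particular project's method), the Java sketch below groups donor records that share a normalized name and five-character zip code. The class name, method names, and donor records are all made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy record linkage: hypothetical names and made-up donor records.
class DonorLinkage {
    // Build a crude match key from a donor name and zip code:
    // keep only the letters A-Z (after upper-casing) and the first
    // five characters of the zip.
    static String matchKey(String name, String zip) {
        String letters = name.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        String zip5 = zip.length() > 5 ? zip.substring(0, 5) : zip;
        return letters + "|" + zip5;
    }

    // Group record indices by match key; each group is one candidate entity.
    // Each record is a two-element array: {name, zip}.
    static Map<String, List<Integer>> link(String[][] records) {
        Map<String, List<Integer>> groups = new LinkedHashMap<String, List<Integer>>();
        for (int i = 0; i < records.length; i++) {
            String key = matchKey(records[i][0], records[i][1]);
            List<Integer> members = groups.get(key);
            if (members == null) {
                members = new ArrayList<Integer>();
                groups.put(key, members);
            }
            members.add(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        String[][] records = {
            {"Smith, John", "84602"},
            {"SMITH,  JOHN", "84602-1234"},
            {"Smith, Jane", "84602"}
        };
        // Records 0 and 1 share a key; record 2 stands alone.
        System.out.println(link(records));
    }
}
```

Exact matching on a key like this misses variants such as a middle initial or a misspelling, which is a large part of why real record linkage is hard.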

Fundrace
On a related note, fundrace.org has created an interesting mashup (shown below) that maps donors on a Google map, colored by the party or candidate donated to. It also reveals donor information and appears to do some coarse record linkage.


FEC Maps
Additionally, the FEC itself has started to produce maps both for the Presidential Election and House and Senate Elections.  The maps they provide aggregate the donated funds by state, party, and candidate.

Tuesday, April 22, 2008

Duncan Watts Downplays Viral Marketing Hype

A while back I skimmed Clive Thompson's article entitled Is the Tipping Point Toast?, but didn't have the time to read it all or investigate it any further until today.

Thompson's article pits Malcolm Gladwell's thesis (in The Tipping Point) against the recent research of Duncan Watts (cited below). I thought the article was well-written and adequately presented both sides of the issue. In short, Watts claims that spending time and money marketing to influential individuals is no better than marketing to the masses.

Through all of this, Watts makes some important points such as (quoted from Thompson's article):
  • The problem of popular viral marketing talk is that it is "incredibly vague"; "how an influential actually influences is not explained." "Precision matters when trying to explain highly social epidemics"
  • "Influentials don't govern person-to-person communication. We all do."
  • "Common sense is misleading"
  • Thompson writes that Watts found the "rank-and-file citizen [to be] far more likely to start a contagion"
So, today I finally took the time to learn more about Watts' recent research, available at the Collective Dynamics Group website (at Columbia University) as a working paper in the Papers section. Over the years, I had read some of Watts' work, so I was excited to see his recent findings. In this paper he presents an approach they call "Big Seed Marketing", which in essence combines a traditional mass marketing model with viral propagation.

The idea that there is "no free lunch" in viral marketing is useful to point out, as "there are many more unsuccessful attempts that one never hears about." He also points out that it is "hard, if not impossible" to predict which attempts will succeed.

The take-home message in the conclusion is that effective marketing campaigns can be produced without identifying "influentials", but simply by adding a mechanism of peer-to-peer sharing to propagate the message. (As an aside, the formalism presented in the paper is useful for discussing the problem and easily evaluating the results.)
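
The core of the formalism, as I read it, is a geometric series: if a seed of N0 people each pass the message on to R new people on average, with R below 1, the expected total reach is N0(1 + R + R^2 + ...) = N0 / (1 - R). Here is a minimal Java sketch under that constant-R assumption. The class and method names are hypothetical and the numbers are made up.

```java
// A sketch of the geometric-series reach calculation; hypothetical names.
class BigSeedReach {
    // Expected total reach for a seed of the given size when each reached
    // person recruits r new people on average. Requires 0 <= r < 1.
    static double expectedReach(double seed, double r) {
        if (r < 0.0 || r >= 1.0) {
            throw new IllegalArgumentException("r must be in [0, 1)");
        }
        return seed / (1.0 - r);
    }

    // The same quantity summed one generation at a time and truncated
    // after the given number of forwarding generations (0 = seed only).
    static double reachAfterGenerations(double seed, double r, int generations) {
        double total = 0.0;
        double current = seed;
        for (int g = 0; g <= generations; g++) {
            total += current;
            current *= r;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up numbers: a seed of 10,000 with R = 0.5 doubles the reach.
        System.out.println(expectedReach(10000, 0.5));
        System.out.println(reachAfterGenerations(10000, 0.5, 2));
    }
}
```

Even a modest R pays off when the seed is large: with R = 0.5 the campaign reaches twice as many people as the seed alone, with no need to pick out "influentials" first.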

Watts makes some good points; however, I would still argue that people with high social capital (you might call them "highly influential") can heighten the network effect. This is even evidenced in Watts' paper, as one of Tom Mauser's 'friends' was StopTheNRA, who, in turn, sent a large email blast (Table 1, footnote 1). So Tom Mauser had a significant enough relationship with StopTheNRA that they used their resources (their large email list) to forward his message.

Although there is an element of hype in the presentation of "Big Seed Marketing", I find it useful, as it presents a nice way of making the issue sticky and bringing to light these more subtle points. The desired effect of propagating these ideas seems to be occurring.

Update (4/23): Podcast with Duncan Watts on Buzz Marketing (mp3)

Tuesday, April 15, 2008

Looking for a Job?

There are lots of places to search for jobs online these days.
An interesting approach to finding your next job might be to leverage your social connections to match you with a good employer whose needs are in line with your skills. Of course, as nice as that sounds in theory, I would bet it could be challenging in practice.

Although I won't be needing a full-time job for another couple of years, it is always interesting to see what jobs are available (and what skills are in demand) by quickly searching on your skills and interests.

Tuesday, April 01, 2008

SIP Recap - Thursday

Here is a recap from the Social Information Processing Symposium:
  1. Brian Skyrms (UCI), Signaling Games: Some Dynamics of Evolution and Learning
  2. John Nicholson (USU), The Blind Leading the Blind: Toward Collaborative Online Route Information
  3. Cosma Shalizi (CMU), Social Media as Windows on the Social Life of the Mind
  4. Gustavo Glusman (Systems Biologist), Users, photos, groups, words: Analyzing mixed networks on Flickr
  5. Luc Steels (Vrije U), Social tagging in community memories
  6. Aram Galstyan (USC/ISI), Influence Propagation in Modular Networks
  7. Adam Anthony (UMBC), Generative Models for Clustering: The Next Generation
  8. Peter Pirolli (PARC), A Probabilistic Model of Semantics in Social Information Foraging
  9. Hak-Lae Kim (DERI), int.ere.st: Building a Tag Sharing Service with the SCOT Ontology
  10. Yu Zhang (Zhejiang U), Mining Target Marketing Groups From Users' Web of Trust on Epinions
  11. Andrei Broder (Yahoo), Reviewing the Reviewers: Characterizing Biases and Competencies using Socially Meaningful Attributes (see Sihem Amer-Yahia)

The Wednesday talks were excellent. In particular, I really enjoyed:
  • The subtleties of the blind leading the blind (see 2 above)
  • Gustavo's unique way of analyzing Flickr relationships (see 4)
  • Adam Anthony's overview of generative models that can be used in clustering (see 7)
  • Pirolli's analysis of Lostpedia using LDA (see 8)
  • Hak-Lae Kim's tag aggregator application (see 9)
  • The use of socially meaningful attributes as presented by Yahoo's Andrei Broder (see 11)