Archive for the ‘Uncategorized’ Category

The Perspectival Cleft

Sunday, December 27th, 2009

I love debate. For me, the line between a good conversation and an LD is thin, and mostly relates to time limits. Obviously, this love can be something of a challenge, socially speaking — though I do like to think that, as I’ve grown older, I’ve gradually developed a facility for “shooting the shit” or “small talk”, or whatever it is you call everything that’s not debate.

In general, other people seem to dislike debate-style conversations for two reasons: they don’t like the oppositional flavor of the discussion, or[1] they don’t see the point. I’m a competitive person, so I actually enjoy the first reason — as long as ad hominem attacks and other fallacies are left out of it. However, both the engineer and the artist in me are very sensitive to the second reason.

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.

Antoine de Saint Exupéry

As people say, you’re never going to convince someone who believes X of the merits of ~X, and that is doubly true within a single conversation. So why waste your breath?

In two words: perspectival cleft. This is a term that has been kicking around my head for a while now, and something akin to its meaning may have already been coined. But what I’m talking about is simple: the place where two reasonable people’s opinions diverge.[2] The place, in other words, where the disagreement is born.

The awareness of such a cleft can dramatically change the tenor of the conversation. Rather than attempting to force two people’s opinions to be parallel, you can simply work to uncover the exact location of the perspectival cleft. And once you do uncover it, it’s really a marvel. It’s sort of like finding a book that says one thing to one person, and something totally different to the other. Moreover, there naturally are reasons for the cleft itself. Identifying the cleft is a concrete step toward resolving them, and thereby the overarching disagreement. These consequences make the discovery of the cleft a practical goal — in addition to an amusing diversion — and thus one whose achievement is deeply satisfying.

  1. The word “or” is intrinsically inclusive, rendering the phrase “and/or” redundant. I vote to introduce “xor” into the common lexicon for the rare cases we actually demand exclusivity.
  2. This naturally assumes that they had converged to begin with, but I believe this is a fair assumption — most sane adults from the modern era actually have very similar fundamental beliefs about the world. This is true practically (people concede the utility of tools, for example), as well as morally (people concede the immorality of slavery).

Bayes

Wednesday, November 4th, 2009

I am studying Bayes' theorem in school right now, probably for the third or fourth time.

P(A|B) = P(B|A) P(A) / P(B)

Bayes' formula, courtesy of Wikipedia

Nevertheless, as with many aspects of mathematics (and everything, I suppose), I think I learn it a little differently and a little better each time.

The big realization for me on this occasion is that the theorem is succinctly and practically encapsulated by Carl Sagan’s famous epigram:

Extraordinary claims require extraordinary evidence.

Basically, we can assume that P(B|A) is high. B is the observed phenomenon, and A is the explanation, so it wouldn’t make much sense to talk about them together if A didn’t do a good job explaining why B occurs.

Then, it’s just a matter of the ratio between P(A) and P(B). If A is a wild, crazy, and new explanation, then it had better have some wild, crazy, and new data to back it up: a wild A has a small P(A), so B needs a comparably small P(B). That will help keep P(A) / P(B) close to 1, and thus P(A|B) close to 1, meaning that A is a good theory.
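To make the ratio argument concrete, here is a small Python sketch. The probabilities are made up purely for illustration; only Bayes' formula itself comes from the post.

```python
def posterior(p_b_given_a, p_a, p_b):
    """Bayes' formula: P(A|B) = P(B|A) * P(A) / P(B)."""
    if not 0 < p_b <= 1:
        raise ValueError("P(B) must be in (0, 1]")
    return p_b_given_a * p_a / p_b

# Illustrative numbers only. A is an extraordinary claim, so P(A) is tiny,
# and A explains B well, so P(B|A) is high.
p_a = 0.001
p_b_given_a = 0.99

# Ordinary evidence: B shows up fairly often whether or not A is true.
ordinary = posterior(p_b_given_a, p_a, p_b=0.10)

# Extraordinary evidence: B is almost as rare as A itself.
extraordinary = posterior(p_b_given_a, p_a, p_b=0.0011)

print(round(ordinary, 4))       # roughly 0.01
print(round(extraordinary, 4))  # roughly 0.9
```

With the same claim and the same P(B|A), the posterior moves from about 1% to about 90% only because the evidence became as improbable as the claim.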

Now obviously I’m not the first person to think this up (in fact, neither was Sagan). But it really clicked for me, and it’s fun to share when that happens.

On John D. Rockefeller

Tuesday, September 29th, 2009

I am reading Titan: The Life of John D. Rockefeller, Sr. right now, and it is truly a great book. It is a biography of Rockefeller (of course), but it is the most engaging, novelistic biography I’ve ever read. Ron Chernow did a wonderful job. If you want more reason to pick it up (or if you just want the nuggets), here are some of my favorite parts so far.

Quotes and Anecdotes

Rockefeller’s father decided to stop paying tuition for his high school with only a few months to go before graduation, and he was forced to drop out and find his first real job. He knew he wanted a position in the commodities trading industry — rather than as a laborer or in another trade — so he decided to simply apply to every commodities trading house in Cleveland, where he was living at the time.

Each morning, [Rockefeller] left his boardinghouse at eight o’clock, clothed in a dark suit with a high collar and black tie, to make his round of appointed firms. This grimly determined trek went on each day — six days a week for six consecutive weeks — until late in the afternoon. . . . Because he approached his job hunt devoid of any doubt or self-pity, [Rockefeller] could stare down all discouragement. “I was working every day at my business — the business of looking for work. I put in my full time at this every day.”

Rockefeller was dogged but also clever. In order to capture the profits of the coopers who made barrels for his oil, he decided to begin making barrels himself.

Other Cleveland coopers bought and shipped green timber to their shops, whereas Rockefeller had the oak sawed in the woods then dried in kilns, reducing its weight and slicing transportation costs in half.

This cleverness could also be used in the service of inefficiency, as shown when a competing firm attempted to build a pipeline from Oil Creek, in northwest Pennsylvania, to Williamsport, in the center of the state.

Standard Oil . . . embarked on a real-estate spree of monumental proportions, buying up strips of land or “dead lines” that ran in a straight line from the northern to the southern border of Pennsylvania, to block the [competing pipeline's] advance. Overnight, bewildered farmers became rich by selling parcels for extravagant sums to Standard Oil agents who invaded their sleepy towns.

And, most unfortunately, he could be outright unethical, purchasing stakes in several newspapers in order to ensure favorable coverage and habitually bribing politicians (though, in his defense, this was common during the period). On one occasion he outdid himself, though.

Standard Oil regarded [certain legislation] with such apprehension that Henry Flagler[1] returned from Florida, where he was recuperating from poor health, to spearhead the lobbying campaign. To foster the impression of a popular groundswell against the bill, he hired lawyers to [come to the legislature and] pose as incensed farmers and landowners in favor of the status quo.

These quotes are all from the first couple hundred pages of Titan, which deal with the building of the Standard Oil empire. The rest of the book is focused more on his philanthropy and later years[2], and so isn’t as interesting to me[3]. But anyway, try just the first third or so, and don’t be put off by its length!

Additional Musings

One of the clearest messages in Chernow’s book is that Rockefeller lived two lives. His business dealings, especially in the early years, were obviously cutthroat and unchristian[4], but, as a private, devoutly Baptist citizen, I believe he honestly was unable to see that. The boundary between his two worlds was so high and thick that he wasn’t even gripped by denial — he simply failed to see the contradictions. Although one might be concerned by this lack of objective introspection, I think it was an important contributor to Rockefeller’s success, enabling him to fortify himself with Christian self-righteousness, but at the same time to fight his competitors with a broad array of weapons. His example serves as a good reminder that extraordinary people are also abnormal — and that the traits are very often interlinked.

Another strong, and perhaps obvious, message is that though he was smart and incredibly tenacious, Rockefeller was also lucky. He started Standard Oil when its chief product was kerosene, then used primarily for illumination. Automobiles, electrical generation, and plastics were all many years away when he chose his industry. Though Rockefeller certainly would have been notable if these other products had never been developed — and, in fact, he already was in the 1880s, when they by and large hadn’t been — his legacy could not have developed to nearly the same stature. The role of luck is an important factor to keep in mind, both when praising success and criticizing failure, and one that is perhaps too often overlooked today.

  1. Flagler was one of Rockefeller’s oldest and closest business partners.
  2. The last forty years of his life Rockefeller was retired.
  3. Yet.
  4. And effective.

How To Use Google Spreadsheets With Google App Engine, Part 3

Thursday, September 17th, 2009

Sorry for the long delay (first post, second post)! I just started school again (senior year!), so the last few weeks have been packing, finding an apartment, moving in, choosing classes, actually going to class, et cetera et cetera.

Anyway, the following examples are really just pretty wrappers for functionality included in the raw Spreadsheets API, but they’ll give you something concrete to work from. They are pretty much just copy-pasted from my code, but I had to reformat them a bit for the blog. As always, take a good, hard look before using them in your system.

Blazing Batch Modifications

The biggest change from the last post is the ability to cache modification orders and execute them in batches. This obviously is handy for conserving bandwidth, CPU cycles, and even time, and is included in the actual Spreadsheets API.

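from google.appengine.api import urlfetch

def setCells(cellDict):
    # The XML markup in this listing was lost in formatting. It is rebuilt here
    # from the single-cell entry shown in Part 2 plus the GData batch elements,
    # so verify it against the Spreadsheets API docs. Content is not XML-escaped.
    wsID = 'yourWorksheetID'
    cellsURL = 'http://spreadsheets.google.com/feeds/cells/'+wsID+'/1/private/full'
    destURL = cellsURL+'/batch'
    postStr = '<feed xmlns="http://www.w3.org/2005/Atom" ' \
        'xmlns:batch="http://schemas.google.com/gdata/batch" ' \
        'xmlns:gs="http://schemas.google.com/spreadsheets/2006">'
    postStr += '<id>'+cellsURL+'</id>'
    for (rowNum, colNum), content in cellDict.items():
        cellID = 'R'+str(rowNum)+'C'+str(colNum)
        postStr += '<entry><batch:id>'+cellID+'</batch:id><batch:operation type="update"/>'
        postStr += '<id>'+cellsURL+'/'+cellID+'</id>'
        postStr += '<link rel="edit" type="application/atom+xml" href="'+cellsURL+'/'+cellID+'"/>'
        postStr += '<gs:cell row="'+str(rowNum)+'" col="'+str(colNum)+ \
            '" inputValue="'+str(content)+'"/></entry>'
    postStr += '</feed>'
    urlfetch.fetch(url=destURL, method=urlfetch.POST, payload=postStr, \
        headers={'Content-Type' : 'application/atom+xml', 'GData-Version' : '3.0', \
         'Authorization' : 'AuthSub token="yourAuthSubToken"', 'If-Match' : '*'})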

Be sure to make cellDict a dictionary of the form (rowNum, colNum) -> content, that is, a (row, column) tuple mapped to some string-castable content. And make sure to change yourWorksheetID and yourAuthSubToken.
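Because the payload is built by string concatenation, a cell value containing a quote or an ampersand would produce broken XML. Here is a rough sketch of a payload builder that escapes values. It is plain Python 3 using only the standard library; the name build_batch_feed is hypothetical, and the element layout is an assumption based on the single-cell entry from Part 2 plus the GData batch elements, so treat it as a starting point, not a verified payload.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

ATOM_NS = "http://www.w3.org/2005/Atom"
BATCH_NS = "http://schemas.google.com/gdata/batch"
GS_NS = "http://schemas.google.com/spreadsheets/2006"

def build_batch_feed(cells_url, cell_dict):
    """Return a batch-update payload for {(row, col): content}.

    cells_url is the worksheet's cells feed URL without the '/batch' suffix.
    """
    parts = [
        '<feed xmlns=%s xmlns:batch=%s xmlns:gs=%s>'
        % (quoteattr(ATOM_NS), quoteattr(BATCH_NS), quoteattr(GS_NS)),
        '<id>%s</id>' % escape(cells_url),
    ]
    for (row, col), content in sorted(cell_dict.items()):
        cell = 'R%dC%d' % (row, col)
        entry_url = '%s/%s' % (cells_url, cell)
        parts.append(
            '<entry>'
            '<batch:id>%s</batch:id>'
            '<batch:operation type="update"/>'
            '<id>%s</id>'
            '<link rel="edit" type="application/atom+xml" href=%s/>'
            '<gs:cell row="%d" col="%d" inputValue=%s/>'
            '</entry>'
            % (cell, escape(entry_url), quoteattr(entry_url),
               row, col, quoteattr(str(content)))
        )
    parts.append('</feed>')
    return ''.join(parts)

payload = build_batch_feed(
    'http://spreadsheets.google.com/feeds/cells/yourWorksheetID/1/private/full',
    {(1, 1): 'Displays', (2, 1): 'Tom & "Jerry"'},
)

# Parsing the payload back is a cheap check that it is well-formed XML.
root = ET.fromstring(payload)
cells = root.findall('{%s}entry/{%s}cell' % (ATOM_NS, GS_NS))
print([c.get('inputValue') for c in cells])
```

The round trip through ElementTree does not prove the server would accept the feed, only that awkward values no longer break the markup.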

Blissful Batch Reads

It would be awfully asymmetrical (and impractical) to leave out batch reads, so here is some sample code to that end. Again, this code only wraps the Spreadsheets API’s own batch read functionality.

import re
from google.appengine.api import urlfetch

def getCells(minRow, maxRow, minCol, maxCol):
    wsID = 'yourWorksheetID'
    destURL = 'http://spreadsheets.google.com/feeds/cells/'+ \
        wsID+'/1/private/full?min-row='+str(minRow)+'&max-row='+ \
        str(maxRow)+'&min-col='+str(minCol)+'&max-col='+str(maxCol)
    retStr = urlfetch.fetch(url=destURL, method=urlfetch.GET, \
        headers={'Content-Type' : 'application/atom+xml', \
        'GData-Version' : '3.0', 'Authorization' : \
        'AuthSub token="yourAuthSubToken"', 'If-Match' : '*'}).content
    # The pattern below was lost when the post was formatted; matching each
    # gs:cell opening tag is a reconstruction. The attribute patterns that
    # follow assume single-quoted attributes, as in inputValue='...'.
    gsOccurrences = re.findall(r"<gs:cell.*?>", retStr)
    retDict = {}
    for occ in gsOccurrences:
        row = int(re.findall(r"(?<=row=').*?(?=')", occ)[0])
        col = int(re.findall(r"(?<=col=').*?(?=')", occ)[0])
        content = re.findall(r"(?<=inputValue=').*?(?=')", occ)[0]
        retDict[(row, col)] = content
    return retDict

This will return a dictionary of the same format used as input in the setCells function. Again, make sure to change yourWorksheetID and yourAuthSubToken.
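The regular expressions depend on the server's quoting style and on values never containing an apostrophe. If that becomes a problem, an XML parser is a sturdier option. The sketch below is plain Python 3 with the standard library; parse_cells is a hypothetical name, and SAMPLE is a hand-written stand-in for a response, not real API output.

```python
import xml.etree.ElementTree as ET

GS_NS = 'http://schemas.google.com/spreadsheets/2006'

def parse_cells(feed_xml):
    """Return {(row, col): inputValue} for every gs:cell element in the feed."""
    root = ET.fromstring(feed_xml)
    cells = {}
    for cell in root.iter('{%s}cell' % GS_NS):
        key = (int(cell.get('row')), int(cell.get('col')))
        cells[key] = cell.get('inputValue')
    return cells

# Invented sample feed: two cells, one with an apostrophe in its value.
SAMPLE = """<feed xmlns='http://www.w3.org/2005/Atom'
      xmlns:gs='http://schemas.google.com/spreadsheets/2006'>
  <entry><gs:cell row='1' col='1' inputValue='Displays'>Displays</gs:cell></entry>
  <entry><gs:cell row='2' col='1' inputValue="it's 42">it's 42</gs:cell></entry>
</feed>"""

print(parse_cells(SAMPLE))
```

The result has the same (row, col) -> content shape as getCells, so the two are interchangeable from the caller's side.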

Final Thoughts

I've now been using the above code in production for about three weeks, and it really works well. Whenever I want an update on SearchEkko I can just check a fast-loading, real-time-updating Google Spreadsheet -- much handier than the default App Engine administrative interface. Hope it helps you out!

How To Use Google Spreadsheets With Google App Engine, Part 2

Tuesday, September 1st, 2009

This post is a continuation of this one. Last time I finished with acquiring your AuthSub session token, and now I’ll show you how to use it to manipulate your spreadsheet.

Reading Your Spreadsheet

It’s really easy to read one of your spreadsheet’s cells, especially with RESTClient. You just need to know your spreadsheet’s ID, which is the key value I mentioned earlier. Simply send a GET request to


http://spreadsheets.google.com/feeds/cells/key/1/private/full/cell

With headers:

Authorization: AuthSub token="yourSessionAuthToken"
GData-Version: 3.0

Replace key with your spreadsheet’s key, and cell with a string in the format “RXCY”, with X and Y being positive integers (row and column numbers).

Note that the section /1/ indicates that you’re reading the first worksheet in the spreadsheet, i.e. “Sheet1”. If you’d like to work with “Sheet2”, well, you can figure that out. GData-Version simply specifies which version of the API protocol you’re using.

Anyway, the server will return a bunch of XML — just look for inputValue='example cell value' and you’re good to go.
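If you would rather script the read than click through RESTClient, the request can be sketched as follows. This is plain Python 3 with urllib from the standard library, not App Engine's urlfetch; cell_url and read_request are hypothetical helper names, and the request is only prepared here, never sent.

```python
import urllib.request

FEED_ROOT = 'http://spreadsheets.google.com/feeds/cells'

def cell_url(key, row, col, worksheet=1):
    """Build the cell URL described above: .../key/worksheet/private/full/RxCy."""
    return '%s/%s/%d/private/full/R%dC%d' % (FEED_ROOT, key, worksheet, row, col)

def read_request(key, row, col, session_token):
    """Prepare (but do not send) the GET request for one cell."""
    return urllib.request.Request(
        cell_url(key, row, col),
        headers={
            'Authorization': 'AuthSub token="%s"' % session_token,
            'GData-Version': '3.0',
        },
        method='GET',
    )

req = read_request('yourSpreadsheetKey', 2, 3, 'yourSessionAuthToken')
print(req.full_url)
```

Passing req to urllib.request.urlopen would send it; the response body is the XML in which to look for inputValue.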

Modifying Your Spreadsheet

In order to modify the spreadsheet, you’ll send a PUT request to the same URL:


http://spreadsheets.google.com/feeds/cells/key/1/private/full/cell

The body of the request should be:

<entry xmlns="http://www.w3.org/2005/Atom" xmlns:gs="http://schemas.google.com/spreadsheets/2006">
  <link rel="edit" type="application/atom+xml" href="http://spreadsheets.google.com/feeds/cells/key/1/private/full/cell"/>
  <gs:cell row="X" col="Y" inputValue="whatever"/>
</entry>

Replace key and cell as before, and X, Y, and whatever self-explanatorily.

Finally, the headers of the request must be:

Content-Type: application/atom+xml
Authorization: AuthSub token="yourSessionAuthToken"
GData-Version: 3.0
If-Match: *

That is all pretty easy except the If-Match. The asterisk value for that header tells the spreadsheet to update even if there has been another recent write to that cell (destroying thread safety). If you want to be more punctilious, look here.
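Putting the URL, body, and headers together, a write might be sketched like this. Again this is plain Python 3 with the standard library, and write_request is a hypothetical name. The request is built but not sent, and the value is escaped so that quotes or ampersands do not break the XML.

```python
import urllib.request
from xml.sax.saxutils import quoteattr

def write_request(key, row, col, value, session_token, worksheet=1):
    """Prepare (but do not send) the PUT request that sets one cell."""
    url = ('http://spreadsheets.google.com/feeds/cells/%s/%d/private/full/R%dC%d'
           % (key, worksheet, row, col))
    body = (
        '<entry xmlns="http://www.w3.org/2005/Atom" '
        'xmlns:gs="http://schemas.google.com/spreadsheets/2006">'
        '<link rel="edit" type="application/atom+xml" href=%s/>'
        '<gs:cell row="%d" col="%d" inputValue=%s/>'
        '</entry>' % (quoteattr(url), row, col, quoteattr(str(value)))
    )
    return urllib.request.Request(
        url,
        data=body.encode('utf-8'),
        headers={
            'Content-Type': 'application/atom+xml',
            'Authorization': 'AuthSub token="%s"' % session_token,
            'GData-Version': '3.0',
            'If-Match': '*',  # overwrite even if the cell changed since the last read
        },
        method='PUT',
    )

req = write_request('yourSpreadsheetKey', 2, 3, 42, 'yourSessionAuthToken')
print(req.data.decode('utf-8'))
```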

Complaints And Possible Expansions

If all went well, you can now programmatically read and write a Google Spreadsheet. I’ll add some slightly more sophisticated sample code in the next post, but for now, here are a couple weaknesses I’ve found.

First, the spreadsheet mostly updates values in real-time, but occasionally (seemingly after a long time-out) requires a page refresh to update. Yeah, pretty minor. The other problem is more serious, though. Sometimes (presumably when traffic is high) App Engine’s urlfetch throws errors because the Spreadsheets server takes too long to respond to a request. This means that some updates aren’t recorded. Could be a problem if you’re really counting on statistics, but, as mentioned, then you should be trying something else. However, there are also opportunities for expansions that could alleviate both these issues.

The best expansion/optimization I can think of would be to use App Engine’s Memcache API. You could buffer spreadsheet updates, and roll back whenever the Spreadsheets server timed out. And you could also send the buffered requests in batches (more on that in the sample code), minimizing traffic. My itch got scratched before this point, but if you do get something like this working, let me know and I’ll use it happily! The other expansion would be to use tables instead of cell-level access. This seems to be the preferred method of quasi-database Google Spreadsheets usage, but it seemed like unnecessary hassle to me. If you get serious, though, it could be something to look into.
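The buffering idea could look something like the sketch below. Nothing here is from a working system: a plain dict stands in for Memcache, UpdateBuffer and flaky_send are hypothetical names, and send_batch is any callable that takes a {(row, col): content} dictionary and raises when the server times out.

```python
class UpdateBuffer:
    """Collect cell updates and send them in batches.

    A plain dict stands in for Memcache here. `send_batch` is any callable
    that takes {(row, col): content} and raises on failure.
    """

    def __init__(self, send_batch, batch_size=20):
        self.send_batch = send_batch
        self.batch_size = batch_size
        self.pending = {}

    def set(self, row, col, content):
        self.pending[(row, col)] = content  # a later write to a cell wins
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Try to send everything; keep the updates if the send fails."""
        if not self.pending:
            return True
        batch = dict(self.pending)
        try:
            self.send_batch(batch)
        except Exception:
            return False  # "roll back": nothing is dropped, retry later
        # Remove only what was sent, in case a value changed in the meantime.
        for key, value in batch.items():
            if self.pending.get(key) == value:
                del self.pending[key]
        return True


# Demo with a sender that fails once, the way a timeout would.
sent = []

def flaky_send(batch):
    if not sent:
        sent.append(None)
        raise IOError('simulated timeout')
    sent.append(batch)

buf = UpdateBuffer(flaky_send, batch_size=2)
buf.set(1, 1, 'a')
buf.set(1, 2, 'b')  # reaches batch_size; the flush fails and updates are kept
first_try_kept = dict(buf.pending)
buf.flush()         # second attempt succeeds and empties the buffer
```

A real version would need the buffer to survive across requests, which is where Memcache would come in; this sketch only shows the batching and keep-on-failure logic.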

How To Use Google Spreadsheets With Google App Engine, Part 1

Monday, August 31st, 2009

There have been some discussions about this already, notably a QCon presentation, but a redundant tutorial or two never hurt a soul. This is for the Python version of App Engine, but it shouldn’t be much different for Java.

Why?

App Engine datastore writes are really expensive, and prohibitively so for something like custom statistics. For example, in an early version of SearchEkko, where I read and wrote to an “Admin” Model object on most operations, it was costing over $6.20 CPM to serve SearchEkko search results pages! This when the marginal cost of web-delivered content is supposed to be zero. Obviously, App Engine wasn’t designed with this kind of usage in mind, but it was equally clear that I needed some type of custom statistics dashboard (how many displays, how many installs, etc). So I started poking around.

The Requirements (And Can-Do-Withouts)

To summarize, I needed some type of database solution that:

  • Offered cheap (as in resources) read and write access
  • Was cheap (as in money)
  • Was easily accessible from App Engine (so probably REST)
  • Tolerated frequent accesses and responded relatively quickly

Since all I needed was a way to track statistics, I could forgo a lot of typical database niceties, like:

  • Transactions and thread safety
  • Guaranteed data integrity
  • Bullet-proof security

Google Spreadsheets was what I eventually found, and so far it’s met my needs almost to a T.

Note: if you’re looking for something to use with real, critical data, look elsewhere. This is definitely a little hacky, and certainly isn’t perfect, but it’s a relatively easy way to track things you don’t care that much about in App Engine.

Setting Up Your Spreadsheet

Just go into Google Documents and create a new spreadsheet. The URL will be something like


http://spreadsheets.google.com/ccc?key=bNadyGyiH2Ma6Cx54NmiL2e__4g&hl=en

Note the value of the key argument — you’ll have to use it later.

When you’re accessing your spreadsheet from App Engine, it’s easiest to just set specific cells to the values you want. This means you can lay out the spreadsheet pretty intuitively. In my SearchEkko statistics spreadsheet, I simply made the first row labels for each column, with the corresponding data falling below each label, exactly as you’d set up a standard spreadsheet.

Accessing Your Spreadsheet

In order to access your spreadsheet you have to use the Google Spreadsheets API. There is a Python library you can use, but it’s apparently outdated now. Plus, for something pretty easy like this, I’d rather just use the base protocol, and that way I’ll be more likely to know what’s going on when something breaks.

In any case, you need to authenticate in order to access your private spreadsheet data from App Engine. The best way to do that for our purposes is by using AuthSub. AuthSub is similar to OAuth, but isn’t an open standard, and is a bit better-suited to Google-specific tasks. More to the point, it’s what I used, so it’s what you’re learning.

One of the easiest ways to play around with AuthSub is by downloading RESTClient, an addon for Firefox. Play around with it to learn how to submit HTTP POST and GET requests to specific URLs, and with modified headers, or use another tool of your choice.

The next steps are also documented here, but I’ve tried to simplify and streamline them for your coding pleasure.

  1. Direct your browser to
    
    https://www.google.com/accounts/AuthSubRequest?scope=http%3A%2F%2Fspreadsheets.google.com%2Ffeeds%2F&session=1&secure=0&next=http%3A%2F%2Fwww.example.com
    

    Naturally swap www.example.com with your own site. Grant access, and note the URL that it sends you to. Copy the token parameter. That’s your single-use AuthSub token.

    Note: This should work fine as-is if your site is hosted on App Engine, but if it’s not (ignoring for the moment why you’re reading this), there may be some additional steps here.

  2. Send a GET request (in RESTClient terms) to the URL
    
    https://www.google.com/accounts/AuthSubSessionToken
    

    Use headers:

    Content-Type: application/x-www-form-urlencoded
    Authorization: AuthSub token="yourAuthToken"
    

    Obviously replace "yourAuthToken" with the single-use token you received, but keep the quotes. You should get back a 200 OK status code, and the body will include your session token. Though some documentation shows this value having an expiration, it really is usable indefinitely. (You can revoke any tokens your Google Account has given out here).
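The percent-encoded URL in step 1 is easy to get wrong by hand, so here is a small sketch that builds it and pulls the token back out of the redirect. It is plain Python 3 with the standard library; both function names are hypothetical, and the shape of the redirect URL (the token arriving as a query parameter named token) is an assumption based on the step above.

```python
from urllib.parse import urlencode, urlparse, parse_qs

def authsub_request_url(next_url, scope='http://spreadsheets.google.com/feeds/'):
    """Build the step 1 URL, percent-encoding the scope and next parameters."""
    params = [('scope', scope), ('session', 1), ('secure', 0), ('next', next_url)]
    return 'https://www.google.com/accounts/AuthSubRequest?' + urlencode(params)

def token_from_redirect(redirect_url):
    """Pull the single-use token out of the URL the browser is sent to."""
    return parse_qs(urlparse(redirect_url).query)['token'][0]

print(authsub_request_url('http://www.example.com'))
print(token_from_redirect('http://www.example.com/?token=abc123'))
```

The second call uses a made-up redirect URL and token just to show the parsing.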

Alright, now you have your indefinite-use AuthSub session token, and you’re one giant step closer to mingling Google Spreadsheets and App Engine.

SearchEkko Has Launched

Wednesday, August 26th, 2009

Or at least it would have if I had a PR firm and a few thousand dollars to get some coverage. But the code has gotten to the point where there’s a real, useful product (in my opinion, anyway ;-). If you’re not in the know, here’s the blurb from the home page.

SearchEkko is like a “related posts” widget for your blog, but it’s only visible to readers who arrive from a search engine. It uses each visitor’s search query to find the best-matching pages on your site.

So basically it uses visitors’ intent to help drive page views and provide a better user experience. There are some good comments over at Hacker News.

What now?

The next step is to actually get some distribution. It’s installed on about fifteen sites at this point, and it’s displaying on the order of 1,000 times a day. Nipping at the heels of ShareThis, right?

Tim Westergren, CEO of Pandora and a man I would have taken a bullet for until all these new audio ads, pitched 347 VCs before he got his second round of funding. 347. That is almost a week of non-stop, twenty-four-hours-a-day, PowerPoint pitching. But of course it was much worse than that, instead spreading across the collapse of the internet bubble and three soul-crushing years.

Anyway, if Tim can do that, then I should be able to hang in there for a few months of direct sales. So my goal is to email the 1,000 top business/tech blogs on Technorati, and then see where I am.

Wish me luck.

Update: Paul Stamatiou just alerted me to WP Greet Box, which offers a very similar product. Their targeting seems to be worse than that provided by Yahoo BOSS, but they have more features (naturally including some I’d been planning on doing). In any case, I’ll have to muse on SearchEkko’s future more. In the meantime you can read about competition at Jessica Mah’s blog.

Our Next Form Of Government

Tuesday, August 18th, 2009

What we may be witnessing is not just the end of the Cold War … but the end of history as such … That is, the end point of mankind’s ideological evolution and the universalization of Western liberal democracy as the final form of human government.

Francis Fukuyama, The End of History and the Last Man

Fukuyama wrote those words in 1992, with the fall of the Soviet Union fresh and the bounds of Western achievement limitless. Since then there have been many missteps for democracy and its cousin the free market, with the current economic worries coming in at the top of the list. Yet few of us would seriously consider anything else – indeed, few could even conceive of an alternative.[1] After all, we always seem to muddle through. Why should the future be any different?

There are a million possible answers to that question, and it’s anyone’s guess which is right (if any). But muddling is hardly something to aim for. And, more importantly, our government hasn’t fundamentally changed in over two centuries. Wouldn’t it be surprising if something better wasn’t out there?[2]

An Alternative

I don’t much like paying taxes (who does?), but what I really hate is seeing that money misspent. So what if citizens got to choose where their tax dollars went? And I don’t mean how it is now, with this quasi-representational democracy. I mean each and every dollar earmarked for a specific purpose. You could pay a dollar of your taxes and have it be used only for scientific research and education, or only for infrastructure and immigration. The money would go directly to the agencies responsible for the tasks, and you could pay only for whatever you personally deemed most important.

Obviously some government domains are important and universal enough that they should have some guaranteed revenue. For example, maybe everyone would send at least 5% of their tax bill to defense. And some people might not care – they could just spend it in all domains equally, or according to some default arrangement. (But I bet most people would leap at the chance to better control where their dollars went.) And some people might want to zoom in past the broad categories, spending on scientific research but NOT stem cell research, for example.

You of course would also need some type of overarching administrative body, perhaps similar to a dramatically scaled back version of our current legislature. They would handle the mechanics of collecting tax money and other non-domain-specific issues (such as tax rates, creating new agencies, emergency revenue needs, etc.).

Pros and Cons

The biggest attraction is tightening the leash from taxpayers’ wallets to their government. No end of pain has come from democratic governments misinterpreting and debasing the wills of their citizens. If a government agency is ineffective, it won’t have a budget. End of story. Maybe people could even create bids to replace agencies that are doing poorly.

The biggest drawback seems to be introducing an unhealthy degree of focus on marketing and salesmanship into the government. Each agency will be wooing the taxpayer directly, even more than they already do today. But is this even a true con? Aggressive marketing sharpens companies’ visions, and holds them accountable when they can’t keep promises. I think the same would hold true for government agencies.

Will This Ever Happen?

Not likely. But it seems like a good dream. And, according to Thomas Jefferson, we’re long overdue for a revolution.

  1. Of course there are the Middle East and China, but outside those regions (and even within) there’s no denying that democracy has a special place in people’s hearts.
  2. And as always there’s a decent chance someone’s already thought of this particular alternative somewhere.

Magic Carpets and Pyramids

Wednesday, August 12th, 2009

Ideas don’t have much street cred in the tech world. Execution, not inspiration, is what creates great companies. And it’s true; execution beats the idea 99 times out of 100. Yet even seasoned Entrepreneurs in Residence are usually looking for a “good opportunity” — no small component of which is the idea. The idea is what drives the passion, and from the passion all else follows. But finding the right one can be hard.

This idea-less-ness is a stage I’ve been flitting in and out of for the last couple of years (my current project is here), and I think it’s particularly overwhelming for young people first getting interested in the tech world. However, I’ve discovered two classes of ideas to avoid during this period, and, having found them, I think I’m much further along the road to creating a real business.

Magic Carpets

The first class is “magic carpet ideas”. These are the ideas that have huge and proven demand, but, unfortunately, are impossible. At a larger scale, you might call them pipe dreams. Naturally the inherent infeasibility of these ideas tends to prevent them from reaching the light of day, but nevertheless they appear from time to time.

Even though most of these ideas get killed off before they hit the market, an ambitious young wantrepreneur can delude themself for quite some time (I know because I have). And that is time that could be spent working on a real idea — or at least engaging in a more satisfying fantasy. So what should you do to avoid this mess? Research. Of course it’s best to tackle ideas in areas that you know something about to begin with, but the bottom line is look into the details. Don’t waste time choosing names or making logos — figure out exactly what your company sells, how it makes it, and (eventually) who buys it.

Pyramids

The second category of ideas to avoid is “pyramid ideas”. And I don’t mean the Madoff variety (though they also have some drawbacks), but the sort that took 20 years and 200,000 men to build. These are the ideas that are, in fact, possible, but are way too big for a young bootstrapper to manage. I, for example, at one time hoped to create a new web browser. Anyone who has ever attended a business plan competition will agree that other examples are anything but lacking.

This class of idea is especially alluring as most VCs want to invest in home runs, not singles, which makes you want to start slugging as hard as you can.[1] And, even worse, examples of successful pyramid builders dominate the news. Nevertheless, in my experience it is very easy to be “working” on a large, overwhelming project while accomplishing very little. Chipping away at a smaller, more manageable project is a far surer path to tangible results.

So Which Ideas Are Good?

I’ve given you two types of ideas to avoid, but of course there are infinitely many wrong ways to do something. The good beginner ideas are ones that you’re passionate about and that you can really, truly do (like, start working on it this afternoon). These two goals seem even more important to me than demand and profitability for the first couple of projects. After all, if you successfully complete a project and it turns out you’re only scratching your own itch, the worst-case scenario is you learned something and you don’t have an itch anymore.[2]

  1. This, by and large, is a product of financial necessity and really does make sense. Though of course you can create a real, useful project without VC money.
  2. Naturally this advice is directed to people at the beginning of their entrepreneurial arc, and I’m sure it applies differently (if at all) to people wiser than I.