Posts Tagged algorithm

Simple Backtesting of Football Pickem Strategy

I was curious to see how my simple strategy for the weighted fantasy football pickem would do in previous years.  I found an archive of NFL point spreads, over under lines, and game results on goldsheet.com.  I don’t know how accurate the data are, but I wrote a little Python script to parse it.  The formatting of the data is pretty rough, so I’ve put a CSV version up on Google Docs.

Here’s a tabular summary of the results.  The key column is Percent, which shows the percentage of available points (MaxPosib) that my strategy got in each year.

    Year      Min      Max      Avg    Total   MaxPosib    Percent
==================================================================
    1993       27      103    60.94     1097       1466      74.83
    1994       46       95    64.82     1102       1500      73.47
    1995       50      114    81.18     1380       1747      78.99
    1996       51      119    82.82     1408       1748      80.55
    1997       52      114    82.24     1398       1745      80.11
    1998       56      120    86.94     1478       1734      85.24
    1999       65      115    91.06     1548       1959      79.02
    2000       62      113    90.53     1539       1959      78.56
    2001       43      117    90.29     1535       1960      78.32
    2002       62      136    95.59     1625       2088      77.83
    2003       72      134    98.53     1675       2088      80.22
    2004       67      129    95.47     1623       2088      77.73
    2005       77      127   101.47     1725       2088      82.61
    2006       65      134    89.59     1523       2091      72.84
    2007       56      136   103.41     1758       2091      84.07
------------------------------------------------------------------
 average    56.73   120.40    87.66  1494.27    1890.13      78.96
std.dev.    12.20    11.84    11.67   189.42     211.33       3.39
     min    27.00    95.00    60.94  1097.00    1466.00      72.84
     max    77.00   136.00   103.41  1758.00    2091.00      85.24

So I expect to score 79% of the total points available this year if I use my strategy of assigning weights to the favored team, ordered by point spread.

The next step is obviously to try some other strategies and see what the results are.  First I want to see how poorly a completely random strategy does.

Comments (1)

Automated Fantasy Pro Football Pickem Weights

I’m participating in the Yahoo! Sports Fantasy Pro Football Pickem with some family and friends.   I don’t regularly play fantasy sports, but I was coaxed into donating money to the pool during a family trip this summer.  I blame the beer and port.

For the uninitiated, each week you must pick a winner in all 16 games, assigning a confidence weight from 1 to 16 to each game.  If your pick wins, then you are awarded points equal to the confidence weight that you assigned to that game.

Now that my money’s in, I want to make a solid showing.  My strategy is to leverage the gambling lines, mapping the point spreads and over under lines to my picks and weights.

I threw together a little Python script to parse the Yahoo odds web page.  For each game, the script averages the point spreads and over under lines.  The favored team is selected, and the picks are ranked according to point spread, with wider spreads getting a larger weight.  Ties on point spread are broken by the over under line, with a larger over under line mapping to a larger weight.

Now I’m wondering how to improve this simple algorithm.  I might break ties by giving a smaller over under a higher weight.  The reasoning would be that smaller over under lines will have less variance.  Surely there are other ways to improve the rankings.

Here’s the table that my little script prints out.

weight | winner                         loser                     spread    o/u
================================================================================
    16 | Minnesota Vikings         over Detroit Lions              -9.83  45.17
    15 | Washington Redskins       over St. Louis Rams             -9.58  37.00
    14 | Green Bay Packers         over Cincinnati Bengals         -9.08  42.00
    13 | Tennessee Titans          over Houston Texans             -6.50  40.92
    12 | Atlanta Falcons           over Carolina Panthers          -6.08  42.50
    11 | Buffalo Bills             over Tampa Bay Buccaneers       -4.42  42.00
    10 | New England Patriots      over New York Jets              -3.42  46.17
     9 | Jacksonville Jaguars      over Arizona Cardinals          -3.08  42.50
     8 | Indianapolis Colts        over Miami Dolphins             -3.08  42.17
     7 | San Diego Chargers        over Baltimore Ravens           -3.00  40.33
     6 | Denver Broncos            over Cleveland Browns           -3.00  38.83
     5 | Kansas City Chiefs        over Oakland Raiders            -3.00  38.50
     4 | Pittsburgh Steelers       over Chicago Bears              -2.92  37.50
     3 | Dallas Cowboys            over New York Giants            -2.75  45.00
     2 | San Francisco 49ers       over Seattle Seahawks           -1.10  39.58
     1 | New Orleans Saints        over Philadelphia Eagles        -1.00  46.08

Leave a Comment

I don’t use debuggers

I just read this post, which says “Just about every developer uses a debugger, at least occasionally.”  I don’t know if I’m inadequate or working on a peculiar class of problems, but I haven’t used a debugger in roughly 10 years.

I have used a debugger twice.  Once I was chasing down a bug in a boutique embedded systems compiler.  The compiler had a bug, and staring at the C code didn’t cut it.  I also used the Perl debugger while writing my MS thesis (back when I wrote oodles of Perl code).  I used it because a friend showed it to me, and it seemed cool.

My primary debugging technique is code examination.  I simply look carefully at my code and think about it.  I also write unit tests.

I’m not fundementally opposed to debuggers.  It’s just that I haven’t used one in the past 10 years.

Leave a Comment

Zen and the Art of Elevators

I work in a high-rise building that secures its elevators with keycards.  In principle, the access system is simple and intuitive: employees and visitors are given keycards that they swipe past a sensor inside the elevator before pressing their floor button.  In reality, the building’s access system has a bad user interface, which regularly frustrates both employees and visitors.  During one early morning ride, the frustration was so bad that it almost boiled over into fisticuffs.

When I’m riding the elevator, I enjoy observing how various people react to the security system’s quirks.  I view it as a sort of a Rorschach test that exposes how people interact with technology.  A couple small quirks in the security system make the familiar interface of elevator floor buttons awfully cumbersome.  Only people who take the time to think about the implementation details succeed in calmly selecting their floor.  Those who take the Zen approach, avoiding any detailed thought of the elevator’s control system, typically end up spastically jabbing the buttons or missing their floor.

I drew up a simple flowchart that describes the security system’s behavior.  When the elevator doors open in the building lobby, the system is in the non-blinking state at the top left.  To keep the drawing simple, I haven’t drawn what happens when two people swipe cards or press buttons simultaneously.

Elevator Control Flowchart

Elevator Control Flowchart

The correct path for trouble-free operation is marked with bold arrows that move counter-clockwise around the drawing.  To follow this path, you must:

  1. Swipe your keycard past the sensor.  Be sure nobody else swipes their keycard at the same time.
  2. The sensor’s red light starts to blink.
  3. Wait 1 second.  Do not press the floor button too quickly because the press will be ignored.
  4. You’re now able to press the floor button.  The sensor is still blinking in the same way.
  5. Press the button for your floor.  It will light up.
  6. Everybody else must wait 5 more seconds for the sensor to stop blinking before they can swipe their card.

This description and the flowchart make a few design flaws apparent.  Most glaringly, any deviation from the correct path causes confusion or delay, and the flowchart depicts many potential detours.  The most common detours are:

  1. A person swipes their keycard and immediately presses their floor button.  The button doesn’t light up, and access is delayed.  The typically causes the person to start jabbing at the button.
  2. After somebody has selected her floor, but before the red light stops blinking, somebody else swipes his card and tries to press his floor button.  The swipe is ignored and the button press is rejected.  Everybody must wait roughly 5 seconds for the little red light to stop blinking before starting the process over.
  3. Two people swipe their card at roughly the same time.  Neither are sure which swipe, if any, was accepted.  The more aggressive of the two people starts jabbing their floor buttons while the other person steps back.  Typically all the button presses are ignored until the red light stops blinking and somebody tries again.

With such a poorly designed user interface, there’s no surprise that temporary visitors usually get confused.  I’m somewhat surprised that long-time employees remain confused.  In the terminology of Robert Pirsig’s Zen and the Art of Motorcycle Maintenance (which I’m re-reading now), these people are taking the Zen viewpoint toward technology.  They don’t want to trouble themselves with taking the time to think about the system’s details, which causes frustration when they encounter a problem.

If you visit my building, beware of the elevator.

Comments (1)

Randomized optimization algorithm

I thought of a simple randomized optimization algorithm today while I was working on a problem. The algorithm’s working well for me, and while I doubt it’s truly novel, it’s new to me.

Broadly speaking, I’m searching for for a vector of floating point numbers that optimizes my objective function. My objective function is complex enough to be thought of as a black box, and therefore there’s no closed-form gradient function. Although the function has no local minima, the parameters are subject to a number of boundary conditions.

I’m coding in python, so at first I tried using scipy.optimize.fmin. In the function passed to fmin, I returned a large number if the given vector violated the boundary conditions. Unfortunately those large numbers created singularities that caused fmin to get stuck. There are other functions within scipy.optimize that might work, but it wasn’t clear how to use them. This motivated me to code up my own algorithm.

The general approach is simple: at each step generate a random neighbor. If that neighbor is superior to the current point, move to it. I generate a neighbor by perturbing each element of my vector by some fixed delta. Specifically, I multiple a vector numpy.random.random_integers(low=-1, high=1, size=mysize) by delta. Then I normalize the result. Lastly, I find the closest vector point that satisfies the boundary conditions.

This has the succinctness that makes many randomized algorithm appealing. In addition, on my problem, its performance was better
than I expected. Two enhancements made it exceptional.

First, if I generate too many neighbors without finding a superior one, then I decrease delta by a factor of 10. The algorithm stops when delta drops below some threshold. This provides a natural transition between large, coarse steps for speed and small, fine steps for precision.

Second, when I’m at point x0, and I generate a superior point at x1, then I search for increasingly superior points along the trajectory from x0 to x1. Doing so yields the benefits of a poor man’s gradient.

example_search

The figure gives a rough illustration of the search technique. We start at a point with value 8, and randomly pick a neighbor using delta 4. Next we search along that trajectory and find a better point with value 4. Once again using delta 4 we pick a random superior neighbor. Note that we hop over a boundary point when we move there. Unfortunately we don’t randomly generate a better neighbor with delta 4 from that point (even though one exists). So we switch delta to 2, and continue the algorithm.

My code is currently embedded with logic that’s specific to the problem I’m solving.  If I abstract it out, I’ll post the code here.  I’m curious to hear from anybody else who’s implemented a randomized optimization algorithm like this one.

Comments (1)

Follow

Get every new post delivered to your Inbox.