Crowdsourcing Pres-by-CD, 2nd Thread

Last week I wrote about a new crowdsourcing project we’re undertaking here at SSP: compiling presidential voting results by congressional district. Here’s a quick status report:

  • We’ve figured out ways to calculate pres-by-CD for a little over half the states. See this spreadsheet, which anyone can edit.

  • However, there are many states where we don’t have a planned method for calculating the numbers. If you have thoughts about how to figure out those states, please add them (and any links you have to official or soon-to-be-official results).

  • Separately, if you’ve started doing work on some actual numbers, I strongly encourage you to share that work in separate Google spreadsheets. I’ve added a new column on the right in the “mothership” spreadsheet called “Calculations.” Please post URLs to any other spreadsheets you’ve created to crunch the data.

And if you have any other ideas for this project, please share them here in comments. Thanks!

UPDATE: Here’s a very simple example of what I mean by “showing your work.” The CT SoS very kindly makes raw presidential vote totals by CD available – you can see them in this PDF. I’ve uploaded a Google spreadsheet into which I imported those numbers from the PDF and then did some super-simple math to calculate the percentages in each district.

Even if you’re working with something more complicated like counties or precincts, you can and should create something similar to my CT sheet. That way, everyone can see what data you’re using and verify that things look right.
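For anyone who would rather script the arithmetic than do it in a spreadsheet, here is a minimal sketch of the two steps described above: rolling smaller units (counties or precincts) up into districts, then turning raw totals into percentages. The function names, the precinct labels, the "CD-1"/"CD-2" labels, and all vote counts are made up for illustration; they are not real results from CT or any other state, and the sketch assumes each unit sits entirely inside one district.

```python
from collections import defaultdict


def collapse_to_districts(unit_votes, unit_to_cd):
    """Sum raw votes from small units (precincts, whole counties) into districts.

    unit_votes: {unit name: {candidate: raw votes}}
    unit_to_cd: {unit name: district label}
    A unit missing from unit_to_cd raises KeyError, so gaps in the
    lookup table are noticed rather than silently dropped.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for unit, votes in unit_votes.items():
        cd = unit_to_cd[unit]
        for candidate, count in votes.items():
            totals[cd][candidate] += count
    return {cd: dict(candidates) for cd, candidates in totals.items()}


def percentages(votes):
    """Convert one district's raw totals into percentages of all votes cast."""
    total = sum(votes.values())
    if total == 0:
        raise ValueError("no votes recorded for this district")
    return {c: round(100 * n / total, 1) for c, n in votes.items()}


# Hypothetical numbers, for illustration only.
UNIT_VOTES = {
    "Precinct A": {"Obama": 400, "McCain": 250, "Other": 10},
    "Precinct B": {"Obama": 200, "McCain": 130, "Other": 10},
    "Precinct C": {"Obama": 150, "McCain": 340, "Other": 10},
}
UNIT_TO_CD = {"Precinct A": "CD-1", "Precinct B": "CD-1", "Precinct C": "CD-2"}

for cd, votes in sorted(collapse_to_districts(UNIT_VOTES, UNIT_TO_CD).items()):
    print(cd, votes, percentages(votes))
```

Keeping the raw totals next to the percentages, as the loop prints them, serves the same purpose as the raw-data tab in a spreadsheet: anyone can check the inputs. Split counties still need precinct-level data before they can go through something like this.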

UPDATE 2: Thanks to an awesome find by statsgeek, I put together a spreadsheet for MN as well. There are two tabs – “Formatted Data” is just a pretty condensed version of the raw data, which you can see in all its glory in the second tab. The state of MN actually made this extremely easy, going so far as to calculate the percentages each candidate got. It was just a matter of parsing the file properly and digging out the right numbers.

24 thoughts on “Crowdsourcing Pres-by-CD, 2nd Thread”

  1. Is a media source acceptable or only an official one?  The Boston Globe has town by town data for Massachusetts which is enough to get most of the MA CDs.  We just need to get precinct data for Boston, Fall River, and Wayland and we can do everything.

  2. for some of the CDs in Michigan.  All of the districts include partial counties, but I can come pretty close for some of them.

    MI-6:  Obama won 54.8 to 43.5 w/out Allegan county, which is split.  D+2.2 for 2008.

    MI-3:  John McCain won here 49.5 to 48.8, including a conservative portion of Kent county that is actually in the 2nd District.  Obama may have won this district that George Bush won 40 to 59.  R+2.6 for 2008

    MI-1: Obama won 50.5 to 48.8 w/out Democratic-leaning portions of Bay county.  R+3.1 for 2008.

    MI-2: McCain won here 50.0 to 48.3 w/out heavily Republican portions of Kent and Allegan counties.  R+2.3 for 2008.

    MI-7: Obama won 51.1 to 47.1 w/out strongly Democratic areas of Washtenaw County.  R+1.0 for 2008.

    The rest of the districts have too many split counties to provide any accurate results.

  3. First, the tinyurl listed doesn’t work.  

    Second, MN data are available by precinct for download and there is also a precinct=>CD table for download.  I’ve already done this analysis and have a combined Pres and Senate tally by precinct with both CD and county identifiers.  I can easily collapse it down to CD subtotals.

    What specific data and layout are wanted?  I could just post a google spreadsheet with 4-5 columns — CD, Obama votes, McCain votes, and Other votes (and state if multiple states will eventually be combined).  I could also include percents or other info if wanted.  

  4. Of how that one poster that was banned was telling us all how CT-04 is the conservative area of Connecticut and Shays will never be defeated.  Oh how I loved his predictions blowing up in his face.  

  5. interesting thing I found. Virginia’s 4th District, currently represented by Randy Forbes, was won narrowly by Obama and by 20 points by Warner. Our candidate Andrea Miller raised only $48,073 vs. Forbes’ $637,688. In the 10th district our candidate Judy Feder raised $1,930,607 compared to Republican Frank Wolf’s $1,877,504.

    And yet Andrea Miller was closer to winning than Judy Feder.

    I think that might be a prime pickup opportunity.  

  6. In our Ohio county about five percent of the vote total will not be counted until November 14 (2,828 provisionals and an unknown number of offshore absentees). Due to various screw-ups we know that percentage will be higher in some CDs, notably OH-15.

    At the start of the day Tuesday, those who requested paper ballots in some areas were only given page one of a two page ballot.

    So I wouldn’t plan on our results being final for, shall we say, “a while”?

  7. I was going to do Kentucky next, although there it doesn’t quite follow county lines.

    Each district includes at least one incomplete county, although in several cases the numbers of voters are fairly minimal.

    I’ll keep looking and try to find precinct breakdowns, but if that fails I’ll write up my results based on what we have so far, and try to explain my fuzzy math.

    Obviously, if somebody else finds the precinct breakdowns first, I will be eternally in their debt and will incorporate that into my write-up with the appropriate thanks.

  8. Judging by the low turnout in the fifth CD, there were plenty of voters who didn’t like either choice there.

    It is the most Appalachian district, but it had the lowest turnout of Kentucky’s CDs in 2000, 2002 and 2004 too (although not by nearly as great a margin). There were only about 3,000 fewer votes cast for the Senate race than for the Presidential here, which is a vastly lower rate of undervote than in any other district.

    It might be worth investigating further whether this is due to racism or whether it’s primarily a distaste for recent national Democratic figures (especially since Clinton won the district twice). If it’s the latter, there might be some mileage in the netroots supporting a very economically populist candidate here running against both Republicans and corporate elements of the national party.

    I guess I’ll move on to Arkansas next, continuing to pick up the low-hanging Appalachian/Ozark fruit, and then I thought I might try my hand at Missouri if I can find the numbers.

  9. that have whole counties in them. I know only a small minority of the 53 districts have whole counties, but I was still curious anyway, about the 2008 results and the 2004 results which are in parentheses. Here goes.

    CA-01 (North Coast)

    Del Norte: McCain 52.1-45.5 (Bush 56.9-41.3)

    Humboldt: Obama 62.8-34.0 (Kerry 57.7-39.0)

    Lake: Obama 58.3-39.1 (Kerry 53.2-44.9)

    Mendocino: Obama 69.7-27.3 (Kerry 63.5-33.7)

    Napa: Obama 65.2-32.9 (Kerry 59.5-39.0)

    CA-02 (Northern Sacramento Valley)

    Colusa: McCain 58.7-39.6 (Bush 67.2-31.6)

    Glenn: McCain 60.0-37.7 (Bush 66.7-31.7)

    Shasta: McCain 61.7-36.2 (Bush 67.2-31.3)

    Siskiyou: McCain 53.6-43.7 (Bush 60.6-37.7)

    Sutter: McCain 57.7-40.7 (Bush 67.2-31.9)

    Tehama: McCain 60.1-37.3 (Bush 66.4-32.0)

    Trinity: Obama 50.9-46.2 (Bush 54.7-42.7)

    Yuba: McCain 56.1-41.6 (Bush 67.0-31.6)

    CA-03 (Sacramento suburbs)

    Alpine: Obama 61.1-36.5 (Kerry 53.2-44.4)

    Amador: McCain 56.1-41.8 (Bush 62.1-36.6)

    Calaveras: McCain 55.0-42.4 (Bush 60.9-37.1)

    CA-04 (Northeast)

    El Dorado: McCain 54.0-44.0 (Bush 61.2-37.3)

    Modoc: McCain 67.9-29.9 (Bush 72.4-25.7)

    Nevada: Obama 51.4-46.7 (Bush 53.4-44.9)

    Placer: McCain 55.0-42.4 (Bush 62.6-36.3)

    Plumas: McCain 54.7-42.8 (Bush 61.7-36.9)

    Sierra: McCain 58.8-37.8 (Bush 64.1-33.2)

    CA-06 (North SF Bay)

    Marin: Obama 78.0-20.4 (Kerry 73.2-25.4)

    CA-17 (Northern Central Coast)

    Monterey: Obama 68.1-30.1 (Kerry 60.4-38.4)

    San Benito: Obama 61.4-36.9 (Kerry 52.6-46.5)

    CA-18 (Upper San Joaquin Valley)

    Merced: Obama 53.2-45.3 (Bush 56.5-42.3)

    CA-19 (Mother Lode, Yosemite)

    Mariposa: McCain 55.2-42.6 (Bush 60.2-37.6)

    Tuolumne: McCain 55.2-42.5 (Bush 60.0-38.5)

    CA-20 (Lower San Joaquin Valley)

    Kings: McCain 56.5-41.8 (Bush 65.4-33.7)

    CA-21 (Tulare)

    Tulare: McCain 57.0-41.6 (Bush 66.2-32.9)

    CA-25 (High Desert, Sierras)

    Inyo: McCain 52.9-44.3 (Bush 59.1-38.9)

    Mono: Obama 55.5-42.4 (Kerry 49.2-49.1)

    CA-51 (Imperial Valley, southern S.D. suburbs)

    Imperial: Obama 61.1-37.3 (Kerry 52.4-46.4)

  10. I put together a spreadsheet for Virginia. The numbers are not quite final since they updated their count while I did the spreadsheet.

    To compare the results to past elections I added the Cook PVI (based on the 2000 and 2004 elections) of the districts.

    Interestingly Democrats outperformed the PVI in all districts but VA-09. This was the only district Hillary Clinton won in the primaries and is basically an “Appalachian district”.

    Interestingly, Democrat Tom Perriello outperformed Obama by about 1,356 votes in VA-05, while Obama outperformed Glenn Nye in VA-02 by 818 votes and Gerry Connolly in VA-11 by about 11,000 votes.
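Several comments above quote figures like “D+2.2 for 2008” or compare results to a PVI. As a rough sketch of how a single-year lean of that kind could be computed, the snippet below compares a district’s two-party Democratic share to the national two-party share. This is a simplified, PVI-style measure and an assumption on my part: it is not Cook’s published formula (which averages across elections) and may not be what any commenter actually used. The function names are hypothetical.

```python
def two_party_share(dem, rep):
    """Democratic share of the two-party vote, in percent.

    Works with raw votes or with percentages, since only the ratio matters.
    """
    return 100 * dem / (dem + rep)


def single_year_lean(district_dem, district_rep, national_dem, national_rep):
    """District two-party Dem share minus national two-party Dem share.

    Returns a label such as 'D+3.0' or 'R+1.5'. A simplified, one-election
    stand-in for a PVI-style figure, not the official Cook PVI.
    """
    diff = (two_party_share(district_dem, district_rep)
            - two_party_share(national_dem, national_rep))
    party = "D" if diff >= 0 else "R"
    return f"{party}+{abs(diff):.1f}"


# District figures are the partial MI-6 percentages quoted in the thread.
# The national figures (roughly 52.9 to 45.7) are an assumption added here:
# approximate final national percentages, which were not yet settled when
# the comments were written.
print("MI-6 (partial):", single_year_lean(54.8, 43.5, 52.9, 45.7))
```

The result shifts with whichever national totals are plugged in, so a figure computed this way will not necessarily match the ones quoted in the comments.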
