Sequoia Voting System Witch Hunt, err... Study Project
Posted by Brad Wood
Oct 21, 2009 08:15:00 UTC
Matt Woodward pointed out this Slashdot article today about the accidental release of code from Sequoia Voting Systems and a web site dedicated to studying that code. Apparently the Election Defense Alliance obtained a copy of the election data for Riverside County, California. It came in the form of a Microsoft SQL Server backup that was SUPPOSED to have all the code, such as stored procs and triggers, redacted. I wandered over to the "Sequoia Voting System Study Project" and scored me a copy of the data.
Apparently what happened was that a DBA at Sequoia simply ran drop statements for all the code in the database, backed it up, and sent it on. What he didn't realize was that SQL Server doesn't automatically shrink the data files when you do that. It's like when you delete files but don't wipe your hard drive's free space: the content is still there until it gets written over with something else. If you parse through the files as text you can find thousands of lines of code from stored procedures, triggers, and functions lying around the file. Sequoia's DBA presumably had no clue those were still in there.
Well, the source code for Sequoia's voting system is NOT open source. In fact, it must adhere to a number of federal guidelines including one that forbids "Self-modifying, dynamically loaded, or interpreted code" in any voting system. Whatever that means.
I installed the latest version of SQL Server 2008 Express and restored the backup to a database. At first it errored, but then I realized all it wanted was for me to specify a different file name for the two data files contained in the backup. What's interesting is that the Sequoia Voting Project site originally accused them of "vandalizing" the database backup but later retracted that, admitting it was a perfectly valid backup file. That was a pretty strong charge and I think the Sequoia Project site needs to watch what they claim.
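The "parse through the files as text" step mentioned above can be sketched in a few lines of Python. This is only an illustration, not necessarily the tool the study project used: the function name extract_text_runs and the min_len threshold are made up for the example, and scanning for UTF-16LE as well as plain ASCII is an assumption about how the leftover text might be stored in the data pages.

```python
import re


def extract_text_runs(data: bytes, min_len: int = 8) -> list[str]:
    """Return runs of printable text found in a blob of binary data.

    Looks for both plain ASCII runs and UTF-16LE runs (each printable
    byte followed by a NUL byte). Both patterns are heuristics.
    """
    # min_len or more printable ASCII bytes (plus tab/CR/LF) in a row
    ascii_pat = re.compile(rb"[\x20-\x7e\t\r\n]{%d,}" % min_len)
    # The same characters, but encoded as 2-byte UTF-16LE code units
    utf16_pat = re.compile(rb"(?:[\x20-\x7e\t\r\n]\x00){%d,}" % min_len)

    runs = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    runs += [m.group().decode("utf-16-le") for m in utf16_pat.finditer(data)]
    return runs


def find_sql_fragments(data: bytes, keyword: str = "create") -> list[str]:
    """Keep only the text runs that mention a keyword (case-insensitive)."""
    return [r for r in extract_text_runs(data) if keyword.lower() in r.lower()]
```

To try it on a real file you would read the bytes first, e.g. `open(path, "rb").read()`, though for a large backup reading in chunks would probably be kinder to your memory.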
In fact, the whole web site and all the comments on the Slashdot article seem to be on a witch hunt to prove Sequoia are horrible people who will rot for breaking federal law. Interestingly enough, most (if not all) of the Slashdot comments appeared to be from people who had not even checked out the code first hand. The comments seemed to be centered around a specific example posted on the site of some table and index create statements. I don't know that I'm convinced though. I'm not exactly sure what qualifies as "Self-modifying, dynamically loaded, or interpreted code". CFML is interpreted, so does that mean you couldn't use ColdFusion for a voting machine? What about dynamic SQL that is passed into the EXEC command? I guess it depends on what it is doing. Here is one of the popular code samples I copied from the BAK file:
[code]BAL_ID null
-- 1 - show candidate on ballot (default)
-- 0 - remove candidate from the ballot
-- 2 - don't show candidate on the ballot, but reserve space for
--     her on the layout
, IS_ON_BALLOT T_P_BOOL null
-- Code used by State reports
, STATE_CODE char(7) null
-- Reference to AUDIO; clip used to describe candidate header
[/code]Most people seem up in arms that the Sequoia software appears to have the "real working code ... buried in with the data" and therefore the application flow is determined at run time. What? Simply because there is an IF statement somewhere based on the value in a column? I don't see the travesty yet. Further to the point, who says that all the code strewn about the backup file is even used by the voting software? For all we know, we are looking at maintenance scripts which are never run by the voting software. Further to that point, here is another interesting procedure I found buried in the data files:
[code]create procedure CreateAllObjects
/******************************************************************************
PROCEDURE: CreateAllObjects
Description: This procedure creates the following tables: AUDIO, BALLOT_CONTEST, BALLOT_CONTEST_POSIUION, BALLOT_PRECINCT, BALLOT_STYLE, CANDIDATE,...[/code]This proc totally looks like some sort of maintenance proc-- not something the voting app would use. Most of the code contains this header:
[code]Copyright 2005 Sequoia Voting Systems, Inc. All Rights Reserved. Any distribution of source code by others is prohibited. [/code]Oops. One thing is for sure: Sequoia commented their code judiciously and performed code reviews. Here are some sample header comments:
[code]Description/Modifications:
Date      Author      Comments
7/16/98   ToolSmith   Initial creation.
8/1/05    ECoomer     Added comment blocks, removed unused or redundant variables. Modified line lengths- all to meet code review comments. Combined multiple calls to the same function into creation of a single temp table.
9/20/05   MMcKinney   Formatting compacted to meet 240 line limit
9/20/05   MMcKinney   Comments added in response to code review for following issues: 1) Numeric constant other than 1 or 0 needs to be enumerated or defined or commented 2) thrown error needs to be listed in header as output[/code]Frankly, I find the existence of standards to be vaguely comforting. So anyway, it will be interesting to see where this goes. Especially if the Sequoia company flips over their source code being spilled. If anything, I think this faux pas will stand to help them, not hurt them. I believe that reliable software CAN be developed to accurately score an election. I also don't think they meant to open source their code, but if there are any flaws, I guarantee they will be pointed out.
One last note-- I took a quick look at the [WRITEIN] table to see the kinds of names people put in by hand. 22 people voted for "MICKEY MOUSE", 15 for "NONE OF THE ABOVE", and 7 for "POOH". (Winnie the?) Other notable write-ins included "BONO", "OBAMA", "SANTA CLAUS", "NO PREFERENCE" (anyone heard of him?), "DAFFY DUCK", "DONALD DUCK", and "F**K REPUBLICANS". (I added the asterisks.) Way to go, Riverside County-- you sure showed them. :)
So far the only thing actually interesting is some odd records scattered throughout the logs. They appear as if some of the data was handled by a different revision of the code and had some boundary errors or buffer overruns or something. No smoking gun to be sure but something to pique the curiosity.
Witch hunt? Nice name for a movement to ensure that your fancy thingie, so-called "american democracy", is still a democracy. By the way, this is not the first blunder from Sequoia. Just google it.
"Frankly, I find the existence of standards to be vaguely comforting. " Frankly, I find typos, signs of incompetence and complete mess to be very, very uncomforting.
But the first and most egregious problem is: why are these things (software and hardware) not open-sourced by law?
@MaDeR: The "Witch Hunt" reference is really directed towards the guns-blazing-they've-broken-federal-law (even though there's no real proof yet) thing. I mean, that's kind of embarrassing to come out and blatantly accuse them of vandalizing the database only to come back hours later and admit someone just doesn't know how to restore a bak file.
"Frankly, I find typos, signs of incompetence and complete mess to be very, very uncomforting." Typos I don't care about. I've seen lots of open source software and I can guarantee you being closed source gives them no corner on the market of bad spelling. Incompetence and a complete mess might be a reason to worry, but my main point is there are a LOT of conclusions flying around by people who haven't even looked at the source code yet, and there is very little real information so far. When you're looking through the fragmented remains of reclaimed memory, I kind of expect everything to be a complete mess.
Don't get me wrong-- I'm not standing up for Sequoia. If we find a smoking gun in there, they should get what's coming. The attitude at this point is just a little tilted in my opinion. This "democracy" is still innocent until proven guilty, right? :)
As to why it isn't open sourced already. That's a good question. I would probably support open sourced voting systems, but not everybody sees it that way.
I agree. Pulling out fragments of code from the file and using them plus lots of assumptions is a witch hunt.
This makes me question EDA more than Sequoia.
The real point of all of this is that by law all voting systems should be open source. No ifs, ands, or buts about it. Companies like Diebold^H^H^H^H^H^H^HSequoia shouldn't be getting rich off democracy and claiming proprietary secrets every time any issues come up. Things can't continue this way.
The government needs to get serious about funding open source voting projects. I encourage everyone who's interested in this issue to watch the film "Hacking Democracy" and see if that changes your opinions on things.
This particular issue may not be a smoking gun, but there are SO many others that I simply can't understand why open source in voting systems isn't already a mandate.
@Matt: I can agree with your sentiments, but I don't know if that reflects my personal opinion. Obviously the government is concerned about the security of polling software. There are two trains of thought there. Either open it up and let everyone see it, or close it up tight and let no one see it. It's no surprise that our government seems to like the latter approach.
I will openly admit the former approach has the most potential to be safer PROVIDING THERE WILL ALWAYS BE AN AMPLE AMOUNT OF GOOD PEOPLE IN THE WORLD WHO WILL IMPROVE THE CODE AND NOT EXPLOIT IT. At this point I think there are plenty of good people out there willing to make open code as watertight as possible. Opening code can also "bring on the dogs" so to speak. Most of the comments in the code that I saw were from 2005, so it appears to be a fairly stagnant code base. If this leak turns up a number of critical bugs, the development cycle of Sequoia may be too slow to deal with them in a timely manner. (And don't say "if they had been open source in the first place they wouldn't have that problem")
The other thing about voting machines is they have a limited interface and are only accessible for a few minutes while you place your vote. This presents a fairly narrow range of attack vectors to compromise them unless you are on the inside-- and in that case you may well already have access to the code.
Anyway, my overall point being that I welcome open source projects especially in the best interests of the country. However, I do not feel that everything should be open source. Companies need to make money and part of their intellectual property is the code they have that no one else has.
I get a feeling you haven't been keeping up with this issue over the last few years like I have.
Someone made a Diebold voting machine key from a picture of the key on the Diebold web site and guess what? It worked on the Diebold voting machines. Add to that the number of physical hacks people have shown are possible because the voting machines aren't physically secure, not to mention the obvious problems with the software itself (again, watch "Hacking Democracy") and your argument that there are limited attack vectors just doesn't hold water. Not to mention the fact that this is so important that ANY attack vectors are too many.
And as shown in "Hacking Democracy," the votes can be hacked after the fact and there's practically ZERO audit process (in some cases they don't even check to see that total votes match the number of votes spread between all the candidates), and again this is just ridiculous that this is the way things work.
Not sure what you mean about people exploiting the code--if the code's all open source how can anyone exploit it, and for what purpose? Bringing on the dogs is a good thing in software. Have the code open so anyone can see it, have independent software developers verify the process of getting the code on physical voting machines, and that's an infinitely better process than an extremely small number of companies (perhaps down to ONE by the end of the year) having exclusive control over our voting process. I simply can't understand why that doesn't frighten people.
I don't disagree that people need to make money, but our democratic process is NOT the place for people to do this. And you seem to be implying that open source and making money are mutually exclusive. They aren't. In my opinion voting systems should be a public works project that's funded so people can work on this important project full-time, but this is not the sort of project that should be left to the private sector.
In the end I suspect we'll just agree to disagree because I disagree very fundamentally with the notion of intellectual property and patents in software, but that's a completely different argument.
As a 30+ year software engineer, what I am most puzzled about is the apparent complexity of this system at first glance. This should be pretty simple software to write and maintain. You list candidates or proposals, count the votes, verify the user, protect the data, something every ATM in the country does a million times a day. I was interviewed for a position at Election Systems several years ago and was shown some sample code that also puzzled me with its unnecessary complexity. This whole 'trade secrets' thing is just a smokescreen.
@Matt: You're probably correct that you have kept up with this issue more than I have. We probably agree on more than you think-- I'm just a little more skeptical of conspiracy claims I guess. If the voting industry has a good rebuttal to the information in Hacking Democracy, their secrecy will probably prevent them from ever sharing it. (And I wish it wasn't that way).
"Someone made a Diebold voting machine key from a picture of the key on the Diebold web site and guess what? It worked[!]"
That is a very convincing argument indeed. Of course, it convinces me that Diebold shouldn't be stupid and put pictures of real keys on their site. It doesn't necessarily convince me that open source software would in any way have prevented that.
By "exploiting the code" I simply meant find vulnerabilities in it and take action on them. Security through obscurity is probably one of the worst models, but it is rampant. I want to be clear that I love the idea of open code, but ONLY if the maintainers of the code are willing and prepared to deal with the bugs that are found. Calling the dogs can indeed be bad if your company is not positioned to quickly patch your product when the dogs come. Imagine what would happen if a huge problem was found, but took years to be fixed. Opening your code is a large responsibility and I sincerely hope Sequoia is prepared to handle what the public finds now.
"you seem to be implying that open source and making money are mutually exclusive"
Point taken. That being said, I do believe that companies do exist which would stand to lose money if their code was open sourced. I'm just generally not in favor of blanket statements about open sourcing or closed sourcing. You seem to be implying that there are no ill side effects to open sourcing.
This particular issue shouldn't be a test case or general discussion for whether or not code should or shouldn't be open sourced. That's an individual decision made by companies all the time, and frankly given how poorly the existing companies have been handling voting systems they deserve to be put out of business. I don't even want their code to be open sourced. We need a fresh start and luckily there are a couple of groups making great strides in this area despite the friction working against them.
In this particular case there's no question that there's nothing but upside to the code for voting machines being open source. Ill effects to existing companies don't concern me. Open source is simply what needs to be done in this case, no question in my mind.
Just remember software runs on hardware. Hardware which prevents public oversight, which in turn breaks the chain of custody. Open source software isn't the answer, especially when you're not even monitoring the doping process! (Another broken chain of custody.) Paper ballots and an unbroken chain of custody are.
For all the serial number people. Our elections are supposed to be transparent, that means you don't get to have identifiable marks like serial numbers. And electronics is a broken chain of custody, cause nobody can see the electronic signal representing the vote, alternatively with paper, you can watch a box with paper ballots in it.
Broken chain of custody.
@Jeff: Interesting observation. I had thought about that, but just assumed maybe there were a whole lot of requirements I just didn't understand. Seeing as how it is a government project, there is probably quite a lot of scope creep.
For what it's worth, neither open-source nor closed-source code is a guarantee of what is actually on the machine during the election ... and while using open-source code gives many more people the chance to review the code prior to and after an election, you can argue that the review also gives people the chance to take advantage of vulnerabilities in the code.
I've worked as a judge for the last five elections here in Indiana, and I can say that at least here, there is most definitely an audit process. (By the way, total votes will rarely match the number spread among all the candidates; if it's a single-vote office, typically there will be a number of undervotes, meaning you'll have fewer votes cast for that office than voters overall, and if it's a multi-vote office, it's much more complicated.) After the machines are certified and picked up by the inspector at our location, we open them the night before, go through a number of checks (correct polling place, date, and time; correct ballot; all candidates, parties, and referenda; no votes cast yet), and seal them back up. On election day, we open them again, repeat the checks, and prepare them for the election itself.
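The undervote point in the parenthetical above is just arithmetic, and a small sketch makes it concrete. Everything here is hypothetical: the function name, the candidate names, and the numbers are invented for illustration, and it only covers the simple single-vote office case where each voter casts at most one vote in the contest.

```python
def implied_undervotes(ballots_cast: int, candidate_votes: dict[str, int]) -> int:
    """Ballots on which no vote was recorded for a single-vote office.

    For a single-vote office each ballot contributes at most one vote,
    so the gap between ballots cast and votes counted is the undervote.
    """
    votes_counted = sum(candidate_votes.values())
    if votes_counted > ballots_cast:
        # More votes than voters is a red flag, not an undervote
        raise ValueError("more votes recorded than ballots cast")
    return ballots_cast - votes_counted


# Made-up numbers: 500 voters, but only 470 votes for this office
print(implied_undervotes(500, {"CANDIDATE A": 240, "CANDIDATE B": 230}))  # 30
```

So a mismatch between total voters and total votes for an office is expected; it's votes exceeding voters that should set off alarms.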
Once the election closes, we go through more checks, seal the results up, drive them to the courthouse, and hand them off for the next part of the process.
Is it solid? Not really; for one thing, there's no step in our part of the process to confirm that a vote cast for a candidate is actually recorded for that candidate. (Hopefully this is part of the certification process ... I wouldn't mind doing that verification, but depending on how we were to do it, it could take at least a full day for a presidential election year. That may not sound like much, but it's hard enough to get people to volunteer to work one day plus a couple of hours as it is. A full day Monday and then the long day Tuesday would make it even harder.) But it's a far cry from "ZERO", at least in my precinct.
Brad is correct, again at least in my experience, about knowledge not being the only key to compromising a machine. Your chances of modifying the machine without my knowledge are very small ... and based on mistakes we've made in the past, I doubt it would work without inside access, and of course if the election workers are compromised then it doesn't matter how you cast your votes. (Afghanistan, anyone?)
Am I happy with the current system? No, but it's a hell of a lot better than what we had, and sadly (at least here), even if there were an open-source project that had machines that were 10 times better than the existing machines (insert joke about zero), there's no guarantee they'd be adopted. Like most of the other Indiana precincts, we had lines out the door in 2008, mostly because the election office assigned machines to each precinct based on previous elections rather than expected turnout. The head of the election board? Not an elected position. If new machines were available in 2012, if she ensures that the machines aren't certified, we can't use them.
I think it's a big mistake to assume that the biggest problems with elections are with machines.
Hey, fascinating post. (I'm an e-voting researcher.) The existence of standards shouldn't give you much comfort. You might find this lengthy but interesting, as it's some of the best commentary on the new draft of said standards: http://accurate-voting.org/wp-content/uploads/2009/09/ACCURATE-vvsgv11-final.pdf
I would urge you to read the State of California's own report on the Sequoia software, as well as the hardware. The release of this database is just icing on a very large cake, and the cake is made of steaming pony loaf. Words fail at trying to describe the utter incompetence of the coders/testers of this software. Many routines were found to have never been tested even once, since they fail every time through. More egregious are all the "security" coded in, or talked about in the code, which is never actually used! Or if it is used, it is used incorrectly. The previous poster who talked of needless complexity was right on the money. The idea that our "democracy" is in the hands of vendors like this, would tend to make me want to give up on the quaint idea of democracy.
Here is the report commissioned by the state, done by UC Berkeley: http://www.sos.ca.gov/elections/voting_systems/ttbr/sequoia-source-public-jul26.pdf
Mmmmm.. steaming pony loaf. :)
@lonehighway: Seriously though, thanks for the link to that PDF. I haven't read the whole thing, but just skipping down to the conclusion part is pretty crazy. That report didn't pull any punches.
"We found pervasive security weaknesses throughout the Sequoia software. Virtually every important software security mechanism is vulnerable to circumvention."
"We are regrettably unable to suggest with confidence any comprehensive strategy for mitigating the vulnerabilities in the Sequoia system..." "Fixing some of the problems will require substantial changes to the software and the architecture. In fact, we are not optimistic that acceptable practical and secure mitigation procedures are even possible ... in the absence of a comprehensive re-engineering of the system itself."
@Joe: Thanks for your PDF link as well.
BTW, I was part of the (large) team of investigators for the CA Top-To-Bottom review and the OH EVEREST review. I think we were pretty diplomatic, considering what we found. Here, in case you haven't had enough, is the massive OH EVEREST review report:
@joe I am a glutton for punishment and I downloaded that report. It just takes a few seconds to find out the situation (with this particular system) is hopeless.
"The security failings of the ES&S system are severe and pervasive. There are exploitable weaknesses in virtually every election device and software module, and we found practical attacks that can be mounted by almost any participant in an election. For this reason, the team feels strongly that any prudent approach to securing ES&S-based elections must include a substantial re-engineering of the software and firmware architecture to make it 'secure by design.'"
The ONLY safe way to protect the vote is paper ballots, counted by hand at each voting location. A representative democracy is slow and messy. Computers make stealing an election way too easy and tempting, cutting out the need to involve hundreds (or more) people while erasing any trace of the crime.
A footnote to my earlier comment about ESS here in Omaha. I was offered a position as a contractor through an agency, but at 50% of my usual rate at that time. I was told by the agent that if I did not take it the work would just be sent offshore. I declined.
> I'm not exactly sure what qualifies as "Self-modifying, dynamically loaded, or interpreted code". It means code that has not been compiled into hardware machine code.
> CFML is interpreted, so does that mean you couldn't use ColdFusion for a voting machine? Yes, that's exactly what it means. You can't use anything that isn't directly executable by the hardware.
> What about dynamic SQL that is passed into the EXEC command? I guess it depends on what it is doing. No, you can't use dynamic SQL. That's interpreted code, and dynamic SQL would qualify as self-modifying code (from the standpoint of the overall system). Personally, having a lot of experience with large-scale business systems, and seeing what brain-dead programmers have tried to do with dynamic SQL, I'd want my voting machines to use static SQL that had been picked over with a fine-tooth comb by an experienced DBA.
@Whatever: I'm sure there is a middle of the road here. What makes "hundreds (or more)" people- each potentially with their own personal agenda-- any more trustworthy than a machine and a smaller set of people. No one's accusing Diebold of much more than stupidity at this point. It's the general public we're afraid will find a way to hack the system. But you want to include more of those people. The quality of oversight from hundreds of people in my opinion is only as good as your trust of them.
@JeffD: Now just look at what you've done. You had a chance to do it right! :) I'm sure some executive somewhere is patting himself on the back for saving millions of dollars by outsourcing that project overseas and dooming it to failure.
@JAM: "code that has not been compiled into hardware machine code."
Ok, fair enough.
"> CFML is interpreted" Actually, come to think of it, ColdFusion does allow you to do a sourceless deploy. If my memory serves me you can pre-compile all your CFML down to Java class files (bytecode), delete the CFML, and turn on trusted cache. If compiled Java counts as hardware machine code, then so would CFML when pre-compiled.
I hear you on dynamic SQL. Talk about obscurity and vulnerabilities-- I've seen my share of both of those with that magic little EXEC command. I will point out that there are quite a few instances of EXEC in the Sequoia code I looked at. Thing is, a lot of it looked like maintenance scripts for restoring the database-- not anything that would run when someone was casting votes. I wonder if that makes any difference. Either way, Sequoia will probably have some explaining to do about it.
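The worry about dynamic SQL is easier to see with a toy example. The sketch below is an assumption-laden illustration, not Sequoia's code: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the candidate table, its columns, and both lookup functions are made up for the demo. The "dynamic" version builds the statement text from its input at run time (roughly what passing a concatenated string to EXEC does), while the "static" version keeps the statement fixed and passes the value as a parameter.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with a hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE candidate (name TEXT, is_on_ballot INTEGER)")
    conn.executemany(
        "INSERT INTO candidate VALUES (?, ?)",
        [("ALICE", 1), ("BOB", 0)],
    )
    return conn


def lookup_dynamic(conn: sqlite3.Connection, name: str) -> list[str]:
    """Dynamic SQL: the statement text itself depends on the input."""
    sql = "SELECT name FROM candidate WHERE name = '" + name + "' ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


def lookup_static(conn: sqlite3.Connection, name: str) -> list[str]:
    """Static SQL: the statement is fixed; the input is only ever data."""
    sql = "SELECT name FROM candidate WHERE name = ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (name,))]


conn = make_demo_db()
hostile = "x' OR '1'='1"
print(lookup_dynamic(conn, hostile))  # ['ALICE', 'BOB'] -- input rewrote the query
print(lookup_static(conn, hostile))   # [] -- input treated as a plain string
```

The point isn't that every EXEC is exploitable, only that with the dynamic form a reviewer can't know what statement will actually run just by reading the code.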
** UPDATE ON THIS **
Sequoia officially addressed the leaked code in this entry on their blog: http://www.sequoiavote.com/blog.php?ID=69
Pertinent Quotes: "It has come to our attention that there are several blog postings regarding the possession of Sequoia's source code by the "Election Defense Alliance" and we would like to address those claims." "...in a recent instance in Riverside County, California, we did remove the proprietary information but did not thoroughly redact it, leaving text base remnants of various database level code related to various operations including portions of stored procedures and schema creation code." "There was no source code related to the voting machines – the code that actually counts votes – released or any front-end Election Management System code. Essentially only small portions of ballot layout, accumulation and reporting code were present in this database that Sequoia provided to Riverside County."
One of two things is probably true at this point.