Friday, October 13, 2017

Designing A Tamper-Proof Electronic Voting System

India's Electronic Voting Machine (EVM)

India's Electronic Voting Machines (EVMs) have been in the news a lot lately, and not always for the right reasons. There have been complaints by some candidates that when voters pressed the button in their favour, another party's symbol lit up. Faced with accusations that the EVMs may have been hacked (especially with the string of electoral successes of the ruling Bharatiya Janata Party), India's Election Commission has begun to conduct elections with a Voter-Verifiable Paper Audit Trail (VVPAT), which prints out the details of the party/candidate that a voter selected, so they can verify that their vote was registered correctly.

An EVM unit with a printer to provide immediate paper verification to voters

As a systems architect, I'm afraid I have to say that may not be good enough. Let me explain why, and then suggest a more foolproof alternative.

First of all, my familiarity with IT security tells me never to believe that a hardware device has in-built safeguards. We have all heard about how backdoors can be built into hardware, with whispers about the Russian mafia or Chinese government having control of fabrication plants that produce integrated circuits, so we know it's at least theoretically possible for criminal elements to inject malicious logic right into the hardware of an electronic device.

At the same time, I believe it's being Luddite to advocate a return to entirely paper-based ballots. It's true that many Western countries stick to paper ballots for the sheer auditability of a poll (which electronic voting makes opaque), but India has had bad experiences with paper-based polls in the past, with uniquely Indian subversions of the system such as "booth-capturing", as well as more conventional forms of fraud like "ballot-stuffing".

No, there's no going back to purely paper-based ballots, but there are serious vulnerabilities with the electronic voting system, even with VVPAT.

Let me illustrate.

The basic EVM logic is as follows. The voter presses a button corresponding to their preferred party or candidate, and the machine confirms their selection by lighting up the corresponding election symbol (because many voters are illiterate and can't read). The choice is also recorded in the unit's memory. After the polls close, all the voting units are collected and connected to a central unit that tallies the votes in each. Once all units have uploaded their votes to the central unit, the results of that election can be announced, with the tallies of all parties and candidates available.

Now, based on voter complaints about the wrong symbol lighting up, here's what many people suspect happened. Somehow (never mind how) a hack was introduced into some of the units that recorded a selection of party A as a selection of party B.

This is actually a pretty amateurish hack, as I'll explain shortly. It's readily detectable by an alert voter. What the Election Commission is attempting to do with the Voter-Verifiable Paper Audit Trail (VVPAT) is to make the voter's selection more explicit, in the hope that more of them will be forced to verify that their choice was correctly recorded. It does not make the system more secure in the sense of being able to trap more subtle hacks.

Here's the schematic of the basic logic when things go well.

(Click to expand)

When faced with a simple hack like the suspected one above, the system will respond as below.

(Click to expand)

However, any hacker with a little more smarts will realise that their subversion will have to be less readily detectable. In other words, the hack would have to be placed in a slightly different place.

(Click to expand)

With this kind of hack, any mischief would be virtually undetectable. Both the lighted symbol and the paper printout would confirm to the voter that their choice was faithfully recorded, yet their vote would have been subtly hijacked in favour of another party.

The logic of the hack could be designed to be extremely subtle indeed. Instead of switching every single vote from party A to party B, it could be designed to apply a random function so that, on average, only 1 in N votes was switched across. In many marginal constituencies, even a small skimming of votes would be enough to tip the balance, so desired results could be achieved without any suspiciously large vote swings. There could even be a threshold below which the logic would not start to kick in, say a few thousand votes. That way, if the Election Commission conducted a few test runs to ensure that a unit was working correctly, it would not arouse suspicions.

Now all this seems depressing. Is there any way to combat this?

Yes, there is, but it's not purely in hardware and software. If it were, this post would have been titled "Designing A Tamper-Proof Electronic Voting Machine". The system that we design needs to incorporate electronic and manual elements.

What we need are not one but two printouts for every vote. One copy is for the voter's own records. The other is for the Election Commission. The voter must verify that both match their selection, then place the EC copy into a ballot box before they leave the booth, just like in a paper-based poll. However, this paper ballot will only be used for verification, not for the actual vote tally on counting day, otherwise we may as well go back to a purely manual vote count.

(Click to expand)

A number of statistical techniques may be used to sample and test the performance of voting machine units in various constituencies.

Under the most pessimistic scenario, the ballot boxes of every single booth will be tallied offline, and the counting may continue for weeks after the official results. Elections will only be rescinded if the manual tally grossly contradicts the electronic one (there will always be minor discrepancies due to voter or official error).

Under less pessimistic scenarios, a random sample of booths may be chosen for such manual verification. If gross discrepancies are detected in any booth, then all of the ballot boxes in that constituency will have to be manually tallied. If more than a certain number of constituencies show suspicious results, then the tally may be expanded to cover an entire state, and so on.

There can be further refinements, such as ensuring that the random sample of booths to be verified is drawn publicly, after the voting is completed, so as to afford no opportunity for malicious elements to know in advance which booths are "safe" from being audited.

In general, the design of the overall process is meant to detect subversions after the fact, so the technically accurate term is tamper-evident rather than tamper-proof. However, advertising the fact that such audits will be taking place may deter malicious elements from attempting these hacks in the first place. Hence, in a larger sense, the system consisting of the combined electronic and manual process, plus a widespread foreknowledge of an inevitable audit, may result in a tamper-proof system after all.

Democracy works because citizens have faith that their will is reflected in the results of elections. If citizens lose faith in the electoral process, it could cause a breakdown in society, with violent revolution in the worst case. That's why it's important to act quickly to restore faith in the process, even if this makes the process costlier.

As the quote commonly attributed to Thomas Jefferson goes, "Eternal vigilance is the price of freedom."

Tuesday, August 05, 2014

My Books On Dependency-Oriented Thinking - Why They Should Count

InfoQ has published both volumes of my book "Dependency-Oriented Thinking". The links are below.

I'll admit it feels somewhat anticlimactic for me to see these books finally published, because I finished writing them in December 2013 after about two years of intermittent work. They have been available as white papers on Slideshare since Christmas 2013. The last seven months have gone by in reviews, revisions and the various other necessary steps in the publication process. And they have made their appearance on InfoQ's site with scarcely a splash. Is that all?, I feel like asking myself. But I guess I shouldn't feel blasé. These two books are a major personal achievement for me and represent a significant milestone for the industry, and I say this entirely without vanity.

You see, the IT industry has been misled for over 15 years by a distorted and heavyweight philosophy that has gone by the name "Service-Oriented Architecture" (SOA). It has cost organisations billions of dollars of unnecessary spend, and has fallen far short of the benefits that it promised. I too fell victim to the hype around SOA in its early days, and like many other converted faithful, tried hard to practise my new religion. Finally, like many others who turned apostate, I grew disillusioned with the lies, and what disillusioned me the most was the heavyhandedness of the "Church of SOA", a ponderous cathedral of orthodox practice that promised salvation, yet delivered nothing but daily guilt.

But unlike others who turned atheist and denounced SOA itself, I realised that I had to found a new church. Because I realised that there was a divine truth to SOA after all. It was just not to be found in the anointed bible of the SOA church, for that was a cynical document designed to suit the greed of the cardinals of the church rather than the needs of the millions of churchgoers. The actual truth was much, much simpler. It was not easy, because "simple" and "easy" are not the same thing. (If you find this hard to understand, think about the simple principle "Don't tell lies", and tell me whether it is easy to follow.)

I stumbled upon this simple truth through a series of learnings. I thought I had hit upon it when I wrote my white paper "Practical SOA for the Solution Architect" under the aegis of WSO2. But later, I realised there was more. The WSO2 white paper identified three core components at the technology layer. It also recognised that there was something above the technology layer that had to be considered during design. What was that something? Apart from a recognition of the importance of data, the paper did not manage to pierce the veil.

The remaining pieces of the puzzle fell into place as I began to consider the notion of dependencies as a common principle across the technology and data layers. The more I thought about dependencies, the more things started to make sense at layers even above data, and the more logical design at all these layers followed from requirements and constraints.

In parallel, there was another train of thought to which I once again owe a debt of gratitude to WSO2. While I was employed with the company, I was asked to write another white paper on SOA governance. A lot of the material I got from company sources hewed to the established industry line on SOA governance, but as with SOA design, the accepted industry notion of SOA governance made me deeply uncomfortable. Fortunately, I'm not the kind to suppress my misgivings to please my paymasters, and so at some point, I had to tell them that my own views on SOA governance were very different. To WSO2's credit, they encouraged me to write up my thoughts without the pressure to conform to any expected models. And although the end result was something so alien to establishment thought that they could not endorse it as a company, they made no criticism.

So at the end of 2011, I found myself with two related but half-baked notions of SOA design and SOA governance, and as 2012 wore on, my thoughts began to crystallise. The notion of dependencies, I saw, played a central role in every formulation. The concept of dependencies also suggested how analysis, design, governance and management had to be approached. It had a clear, compelling logic.

I followed my instincts and resisted all temptation to cut corners. Gradually, the model of "Dependency-Oriented Thinking" began to take shape. I conducted a workshop where I presented the model to some practising architects, and received heartening validation and encouragement. The gradual evolution of the model mainly came about through my own ruminations upon past experiences, but I also received significant help from a few friends. Sushil Gajwani and Ravish Juneja are two personal friends who gave me examples from their own (non-IT) experience. These examples confirmed to me that dependencies underpin every interaction in the world. Another friend and colleague, Awadhesh Kumar, provided an input that elegantly closed a gaping hole in my model of the application layer. He pointed out that grouping operations according to shared interface data models and according to shared internal data models would lead to services and to products, respectively. Kalyan Kumar, another friend who attended one of my workshops, suggested that I split my governance whitepaper into two to address the needs of two different audiences - designers and managers.

And so, sometime in 2013, the model crystallised. All I then had to do was write it down. On December 24th, I completed the two whitepapers and uploaded them to Slideshare. There has been a steady trickle of downloads since then, but it was only after their publication by InfoQ that the documents have gained more visibility.

These are not timid, establishment-aligned documents. They are audacious and iconoclastic. I believe the IT industry has been badly misled by a wrongheaded notion of SOA, and that I have discovered (or re-discovered, if you will) the core principle that makes SOA practice dazzlingly simple and blindingly obvious. I have not just criticised an existing model. I have been constructive in proposing an alternative - a model that I have developed rigorously from first principles, validated against my decades of experience, and delineated in painstaking detail. This is not an edifice that can be lightly dismissed. Again, these are not statements of vanity, just honest conviction.

I believe that if an organisation adopts the method of "Dependency-Oriented Thinking" that I have laid out in these two books (after testing the concepts and being satisfied that they are sound), then it will obtain the many benefits of SOA that have been promised for years - business agility, sustainably lower operating costs, and reduced operational risk.

It takes an arc of enormous radius to turn around a gigantic oil tanker cruising at top speed, and I have no illusions about the time it will take to bring the industry around to my way of thinking. It may be 5-10 years before the industry adopts Dependency-Oriented Thinking as a matter of course, but I'm confident it will happen. This is an idea whose time has come.

Thursday, June 19, 2014

An Example Of Public Money Used For The Public Good

I've always held that Free and Open Source Software (FOSS) is one of the best aspects of the modern IT landscape. But like all software, FOSS needs constant effort to keep up to date, and this effort costs money. A variety of funding models have sprung up, where for-profit companies try to sell a variety of peripheral services while keeping software free.

However, one of the most obvious ways to fund the development of FOSS is government funding. Government funding is public money, and if it isn't used to fund the development of software that is freely available to the public but spent on proprietary software instead, then it's an unjustifiable waste of taxpayers' money.

It was therefore good to read that the Dutch government recently paid to develop better support for the WS-ReliableMessaging standard in the popular Open Source Apache CXF services framework. I was also gratified to read that the developer who was commissioned to make these improvements was Dennis Sosnoski, with whom I have been acquainted for many years, thanks mainly to his work on the JiBX framework for mapping Java to XML and vice-versa. It's good to know that talented developers can earn a decent dime while doing what they love and contributing to the world, all at the same time.

Here's to more such examples of publicly funded public software!

Monday, June 09, 2014

A Neat Tool To Manage Sys V Services in Linux

I was trying to get PostgreSQL's "pgagent" process (written to run as a daemon) to run on startup like other Linux services, and came upon this nice visual (i.e., curses) tool to manage services.

It's called "sysv-rc-conf" (install with "sudo apt-get install sysv-rc-conf"), and when run with "sudo sysv-rc-conf", brings up a screen like this:

It's not really "graphics", but to a command-line user, this is as graphical as it gets

All services listed in /etc/init.d appear in this table. The columns are different Unix runlevels. Most regular services need to be running in runlevels 2, 3, 4 and 5, and stopped in the others. Simply move the cursor to the desired cells and press Tab to toggle it on or off. The 'K' (stop) and 'S' (start) symbolic links are automatically written into the respective rc.d directories. Press 'q' to quit the tool and satisfy yourself that the symbolic links are all correctly set up.

You can manually start and stop as usual:

/etc/init.d$ sudo ./myservice start
/etc/init.d$ sudo ./myservice stop

Plus, your service will be automatically started and stopped when the system enters the appropriate runlevels.


Saturday, April 05, 2014

The End Of Ubuntu One - What It Means

Although a big fan of Ubuntu Linux as a desktop OS, I've never been interested in their cloud storage platform Ubuntu One, and found it a bit of a nuisance when asked to sign up for it every time I installed the OS.

Now Ubuntu One is being shut down. I'm 'meh' but still a bit surprised.

The linked article talks about mobile, and how new mobiles such as the Ubuntu-powered ones need cloud storage to succeed. If so, isn't it really bad timing for Canonical to walk away from a fully operational cloud platform just when its mobile devices are entering the market?

Ubuntu-powered smartphones
(Do you know what the time on the middle phone refers to?)

I think it's about economics.

Ubuntu's statement says:

If we offer a service, we want it to compete on a global scale, and for Ubuntu One to continue to do that would require more investment than we are willing to make. We choose instead to invest in making the absolute best, open platform and to highlight the best of our partners’ services and content.
Hmm. I read this as Canonical trying to build a partner ecosystem that will substitute for having a big cloud-and-mobile story like Google does, without the investment that such a proprietary ecosystem will require. Let's see if they succeed.

The other side-story in the linked article is about telcos and their role. Having worked at a telco over the last two years, I can confirm that the major fear in the telco industry is being reduced to commodity carriers by "over the top" services. The telcos are fighting to offer content, and will want willing mobile wannabe partners like Mozilla and Canonical to offer smartphone platforms that will work with networking infrastructure and make the telcos more attractive (through content that both players source from content providers). It will be interesting to see how this four-way, federated partnership (between multiple telcos, independent smartphone platform vendors like Mozilla and Canonical, smartphone device OEMs and content providers) will play out. Many of these companies will think of themselves as the centre of the Universe and the others as partners.

"Nothing runs like a fox" - Well, let's see if the Firefox Smartphone has legs

In the meantime, some good news for startup cloud providers ("startup" only with respect to the cloud, since they will still need deep pockets to set up the infrastructure!): Canonical is open-sourcing its Ubuntu One storage code “to give others an opportunity to build on this code to create an open source file syncing platform.” This should be interesting.

Tuesday, March 11, 2014

Tools for HTML Table and Browser-Side Database Manipulation

I've been having a lot of fun building a realistic-looking demo app that is meant to showcase the features of a coming product. A big challenge in such cases is obviously dynamic data, since hard-coded data won't cut it. It needs to behave like a web app, providing the illusion of a server-side database, but without a server-side database.

HTML5 provides built-in client-side persistence features called sessionStorage and localStorage, but when the data requirements are complex, as in my application, these just aren't adequate. What I need is a client-side database like SQLite. As it turns out, different browsers have taken different routes to providing client-side databases, and it made my head hurt to read about Web SQL and SQLite and which browser incorporated which database and which browser abandoned which database. Finally, I found a JavaScript library called HTML5SQL.js that promised to abstract all those implementation issues from me, giving me a standard API to work with.

So far, it's worked great for me and seems very powerful and flexible. So that's the first tool I'd recommend, for client-side database manipulation - HTML5SQL.js.

Then I had the challenge of building HTML tables that would display this data, allow the user to manipulate it visually and save changes. After a lot of searching, I found another nice JavaScript library called DataTables. DataTables provides lots of powerful features, including sortable columns and pagination. These work even in the default settings with no configuration.

So that's the second tool I'd recommend, for HTML table manipulation - DataTables.js.

By blending HTML5SQL and DataTables functions, I could create HTML tables from a client-side  database. Even better, I could open multiple tabs on the browser, each representing a different part of the application, and they could all see the same data, because the database is a global resource to all tabs in the browser. Best of all, the data stayed persistent across browser restarts. It's a true database, with persistence, relational structure and SQL smarts.

Other tools:

Needless to say, jQuery has now become required technology, and a web developer simply cannot leave home without it. It's so powerful I'm still looking for the right categorisation to describe it.

Many of the new JavaScript libraries are built in an asynchronous style, so a rather naive usage assuming synchronous behaviour (that a function invoked first will complete before the next invoked function) will lead to some surprises. If you need to ensure that two asynchronous functions are executed in strict sequence, you will need to resort to callbacks, as explained on StackOverflow.

For visual designers, the old method of using HTML tables to lay out components is now passé. Using CSS layouts is the in thing. A popular CSS layout framework is called the "960 Grid System", or It's named for the fact that most modern screens have a width of at least 960 pixels, and this number lends itself to being divided into 12 or 16 columns with spaces or gutters in-between. A newer one is unsemantic, which is said to be better for "responsive" UIs (those that adapt to different screen sizes, such as desktop browsers, tablets and mobile phones), but is a bit more complex to use. In both these frameworks, HTML components such as "div" and "table" elements just need to be given special class names, and the CSS then uses these to lay them out. It's quite neat and powerful, but I'm not working in that area because I'm not a graphic or layout designer. I've just seen the nice effects you can get with a CSS layout framework.

Saturday, February 15, 2014

Sydney Workshop "Introduction to Dependency-Oriented Thinking" Held

After weeks of hectic preparation, Rahul Singh and I held our long-awaited workshop today on "Dependency-Oriented Thinking", comprising all-new material from my recent document "Dependency-Oriented Thinking: Vol 1 - Analysis and Design".

Somewhat disappointingly, we only had three signed-up participants, but the numbers only tell half the story. Only one of the three was from Sydney. One flew in from Melbourne the night before, and another got up at 4 am to make the 3+ hour drive from Canberra to Sydney. With such determination on their part, I just had to do my best, and I hope they were happy with the workshop. They certainly expressed satisfaction on their feedback forms :-).

The slides are now available on Slideshare. If anyone was put off from reading the original document on account of its size (264 pages), they could go through this slide pack instead (only 220 slides, heh).