Why Indian Government Sucks at Technology

Note: All opinions are mine alone. Please keep your opinions limited to comments.

This post follows the not-so-anonymous attacks on the NIC servers. Apparently a few people decided to brand their own version of AnonymousIndia and hack into the NIC servers. The Indian media treated this as almost normal. This is, after all, something that happens every other day in India. It’s just the Indian Army website, nothing that we care about.

Even though there are excellent tech-security companies in India, we have never developed the right attitude towards security. People still think that hacking is fun, and that it’s something that could never happen to them. I’ve never seen people read the fine print on the thousand social web sites they join nowadays. Piracy is rampant in India without any checks, and the Indian Government is silent.

Why? Because we are used to it. It has always happened this way. Information leaks have been a major part of Indian history and will remain so unless we realize that the more we embrace technology and the more dependent on it we become, the closer we move to the edge.

A recent tweet by @divyekapoor reminded me of the Aadhaar project, which aims to give a unique identification number to each Indian citizen.

I read through the UIDAI docs, which say that the authority would “prescribe protocols to ensure the confidentiality, privacy and security of data”, and “follow the confidentiality, privacy and security protocols prescribed by the UIDAI”. A search on the uidai.gov.in website gives a little more detail:

The UID database will be guarded both physically and electronically by a few select individuals with high clearance. It will not be available even for many members of the UID staff and will be secured with the best encryption, and in a highly secure data vault. All access details will be properly logged.

This is the most that they have to say on the subject. If that makes you feel safe, remember the earlier fiasco involving the security of the EVMs. Ultimately I’m with the government on that one, since an attacker needed physical access to actually do any harm to the machine. But it was possible, and the Election Commission kept denying it. It seems that our government agencies try harder to hide their technical details than to make them open.

So why is this approach taken in India? Bureaucracy? Probably, yes. But I feel that unless the technocrats rise up and actually enter the Indian Govt. technical agencies (such as NIC), nothing will change. For instance, I find that almost everything the Indian government does is tailored specifically for Windows. There is no reason to plaster “Works best in Internet Explorer 6 at 1024x768” on all your web sites; we already know how much the Indian government loves Windows. So much so that they get it pre-installed everywhere, even in government schools and technical institutions.

Yet there is a small faction that is working tirelessly in the other direction as well. For instance, the Sakshat project has been in the news recently. It would ship with a version of Android (which one?) with Wi-Fi, Bluetooth and other frills. However, the project is known for changing its details at each conference where it is unveiled, so beware. It may suddenly change from an Android device to a Windows phone in the next one. Or maybe a blade server (speaking of which, the UID project ordered 68 of them).

I’ve worked a little on online geo-mapping tools earlier (mostly using the Google Maps API). However, I wanted some accurate data, such as geographical boundaries, for one of my projects. Other sources for this data are not as reliable, and I found the Bhuvan online tool to be extremely accurate in this respect. If you’ve forgotten Bhuvan, it’s the Indian version of Google Maps. It was supposed to be the tool for mapping things. Unfortunately, as things have panned out, most of its claims have been rubbished (like the 10m resolution), or made moot by the extremely slow servers it uses (and it was supposedly optimized for low bandwidth). If you’re planning to fight Google, you have to step up your game. Try checking out the horrible design of the Bhuvan website. Leaving aside its horrible interface, its Windows-only support (.NET), and the additional plugins you must install just to run it, it had brilliant geographical data (collected via various government agencies). However, as it turns out, this data is not public. Why? It seems that ISRO plans to sell this data to people interested in using it. Wow! So a publicly funded agency decides to make money from development done using our money. It’s been close to 3 years since its launch, and I’d be highly interested in knowing where it has managed to sell this data.

In all fairness, they’ve said that they would only sell the high-resolution data, while making the general data freely available to users. But I don’t call it free unless it’s on my own system, damn it. If by free you mean I have to open your website each time I need to find a village, it seems we have different ideas about the word’s usage. Oh, and there’s a link on the website’s home page to a section called APIs, which redirects to the Bhuvan software download page where, guess what, there are no APIs at all. I managed to dig up a few links and download two versions of their APIs, which seem to be copies of the documentation for the OpenLayers JavaScript library they use for mapping in the browser. So the data is available, it seems. But they’re forgetting to mention where.

Seriously, people, wake up. This is the 21st century, and the most hyped buzzword today is open source (well, after cloud). If you really want to work on things, make them open source. Not just the technology, but the data as well, because open data is essential to the growth and planning of a nation, as Hans Rosling keeps reminding us. Meanwhile, DRDO has decided to go ahead and develop its own operating system. Why? Because it will be closed-source and so much more secure than any variant of Linux. Or so they think.

The National Informatics Centre still runs thousands of its websites on ASP (not even ASP.NET), and this fact alone is enough to scare me off. With major corporations (Sony, Gawker) suffering data leaks in which user accounts were compromised, it is high time that someone in the Indian Govt. realizes that you cannot secure your systems by locking them in a vault. These companies did the same, and look at the result.

Even the American Government has come up with an open-source initiative. Their website data.gov is a collection of applications, APIs and raw data collected by various government agencies. And to top it all, the US government invited the top application developers from their platform at data.gov to the White House, hailing them as unsung heroes of the new age. If you’re interested in reading more and taking a stand for the open data democracy, take a look at this whitepaper by the Netherlands Organisation for Applied Scientific Research for a keen review of the major barriers that keep a government from sharing its data. Also go ahead and donate some money to WikiLeaks while you’re at it.

And where is the Indian government in this? Not very far behind, but lagging nonetheless. Take the prime example of the decennial festival that is the Census. Apparently all census data is free (as it should be). But there is a minor caveat. The entire site is built in ASP (which doesn’t really matter, I just don’t like it), and all the data (tables, figures, maps) is in PDF format. So you can access the data by hand, one page at a time, across what might be thousands of documents, figures, charts, and what not. But since it is all in PDF, it is locked down. You could parse it by some means, but the data is supposed to be free, in the best format possible, so that everyone can use it easily. The geographical boundary data (which I mentioned earlier) is also available in the census results, but only via a Java applet, which does not expose the raw data that an application developer would need. Am I expected to file an RTI application just to know the exact boundaries of my state? Or perhaps I should just pay ISRO and be done with it.

As an additional easter egg, the census website states the following:

The Census of India or any data or content providers shall not be liable for any errors in the content, or for any actions taken in reliance thereon.

All efforts have been made to ensure the accuracy and currency of the content on this website. However, users are requested to verify/check any information with us to obtain appropriate professional advice before acting on the information provided in the website. In no event will the Government or office of the Registrar General India be liable for any expense, loss or damage including, without limitation, indirect or consequential loss or damage, or any expense, loss or damage whatsoever arising from use, or loss of use, of data, arising out of or in connection with the use of this website.

And lastly, as an analogue to the excellent Right To Information Act, there must be a Right To Technology Act. It should empower each and every person in the country to know what is happening behind the scenes. We demand total transparency in technological decisions. Not just the passing of tenders, but the decisions which involve actual technological development. For example, at the CDAC website, the government offers trial versions of various forensic tools they’ve developed. Why aren’t they open-sourced? Why were they written in a particular language? If RTI fought off the bureaucracy, this could help us eliminate the old technocrat thinking from the Indian Govt. If we are able to get the right data, in the right format, thousands of application developers across the world are willing to create great ways to access that data. Data by itself is not enough, however. It must be met with an equal resolve from people to make it accessible and usable.

This could be a turning point for the Indian Govt. Either they could continue what they’ve been doing and meet their doom in a major state-sponsored hack crippling the entire nation. Or they could take a step back, and do things the right way.

And what is the right way?

Peace, Love, Linux

Update: According to an article in the Economic Times, a working draft for bringing open source into e-governance systems is in progress.

Setting Up Sparkleshare Server using Gitolite and Ubuntu

Introduction

Everybody seems to be all about open-source cloud-backup and sync solutions nowadays. The hype is all around the cloud, they say. To me, cloud is just a stupid concept for sales people that I prefer to avoid. Still, people are coming up with all kinds of crazy ideas to create their own Dropbox clones. A few similar services include SpiderOak, Ubuntu One, SugarSync, and Wuala. However, not all of them are compatible with Linux (unlike Dropbox, which is).

Comparison

So here’s a brief comparison of some famous clients:

  • Dropbox : Current leader. It offers everything from sync, collaboration, sharing and public links to upgradable storage, and is the de-facto client for synchronization tasks. However, there have been a few concerns about its privacy recently.
  • Ubuntu One : Ubuntu One is Ubuntu’s answer to Dropbox. It’s excellent, with an open source API that allows one to create applications for the Ubuntu One platform very easily. However, the server side of Ubuntu One is still closed source, which means you cannot set it up on your own servers (similar to Dropbox). Canonical has hinted that it might be made open source in the future.
  • Wuala : A file-backup network where you trade your own hard disk space for extra storage. This allows Wuala to offer more space at a much lower price.
  • SpiderOak : I’m using this currently along with SparkleShare and Dropbox. It has proven to be very robust, allowing me to back up almost anything to its servers. I’ve got a 5GB account, which has been more than enough for me so far. Its very powerful interface allows you to control each and every aspect of your backup/sync/share process. It also boasts a true-privacy feature, meaning that all your documents are encrypted before being sent to its servers. It also means that you can only reset your password from your own computer.

Take a look at http://en.wikipedia.org/wiki/Comparison_of_file_synchronization_software for a better comparison of several other services as well.

Installing SparkleShare using gitolite

This is a simple tutorial on running your own SparkleShare server as a hosting server. Note that this implementation should ideally be built as a separate module for sparkleshare-admin, which is still in the works as of writing. SparkleShare’s basic concept is to use git repositories as storage. In case you don’t know what git is, I’d recommend this guide for more details. In short, it is an awesome version control system for anyone managing code (or content, for that matter). It allows you to keep track of what is happening in your directory, and revert to earlier versions (among several other things).

SparkleShare asks you to set up a git server somewhere and use it as a remote storage system. It offers out-of-the-box support for the git hosting providers GitHub and Gitorious. It also allows you to add your own custom servers. Enough description, let’s get down to some work:

Setup Gitolite

Assumptions :

  1. You are running a stable Linux OS (Fedora/Debian/Ubuntu etc). The commands below use apt-get, so adapt them on non-Debian systems.
  2. user1@host1 is your own computer
  3. user2@host2 is the primary computer where you intend to start the server
  4. The gitolite username is sparkle

#On your own machine (which will be the remote admin of the git share)
#Enable password-less login to the server
ssh-copy-id user2@host2
#Copy your public key over for the gitolite setup (assumes your key is ~/.ssh/id_rsa.pub)
scp ~/.ssh/id_rsa.pub user2@host2:/tmp/user.tmp
#Should not ask for a password:
ssh user2@host2
sudo apt-get install gitolite
sudo dpkg-reconfigure gitolite
#Configuration options may vary, but remember the gitolite user name that you specified
logout #Come back to your own computer
git clone sparkle@host2:gitolite-admin
#Should work, or else something went wrong. See the gitolite docs: http://sitaramc.github.com/gitolite/doc/
cd gitolite-admin
nano conf/gitolite.conf

Setup WildRepos

Edit the file and add the following lines at the bottom :

repo	share/[a-z0-9]{6}
	C	=   @all
	RW+	=   CREATOR
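As a quick sanity check, the repository pattern in the rule above can be tried against a few names before the config is pushed. This is only a sketch: it assumes the pattern is matched against the whole repository name, and the helper name is_share_repo is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the name matches the wildcard pattern
# share/[a-z0-9]{6} from gitolite.conf, matched against the whole name
# (the anchoring with ^ and $ is an assumption of this sketch).
is_share_repo() {
    printf '%s\n' "$1" | grep -Eq '^share/[a-z0-9]{6}$'
}

# Try a few candidate names and report which ones the rule would accept
for name in share/fh73ah share/FH73AH share/abc share/abc1234; do
    if is_share_repo "$name"; then
        echo "$name: matches"
    else
        echo "$name: does not match"
    fi
done
```

Only lowercase letters and digits are accepted, and the part after share/ must be exactly six characters long.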

Now we need a method to allow anyone to create git repositories on the server. This is accomplished via Gitolite’s very powerful Wildcard Repositories feature.

#Login back to server
ssh user2@host2
#Since we are using package install method, sparkle's password needs to be set
sudo passwd sparkle
su - sparkle
nano .gitolite.rc
#search for $GL_WILDREPOS and set it to 1
logout
#Now we push our admin repo to add the wildrepo settings
#you're still inside the gitolite-admin directory, right
git push
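
The manual edit of .gitolite.rc above could also be scripted. The sketch below assumes the setting sits on its own line as $GL_WILDREPOS = 0; (spacing may differ in your file) and the function name enable_wildrepos is made up. It runs against a scratch file with sample contents, so point it at the real file only after checking the result.

```shell
#!/bin/sh
# Sketch: flip $GL_WILDREPOS from 0 to 1 in a gitolite rc file.
# Assumption: the line looks like "$GL_WILDREPOS = 0;" (spacing may vary).
enable_wildrepos() {
    # Write to a temp file first so the original is only replaced on success
    sed 's/^\(\$GL_WILDREPOS[[:space:]]*=[[:space:]]*\)0;/\11;/' "$1" > "$1.tmp" &&
        mv "$1.tmp" "$1"
}

# Demonstration on a scratch file rather than the real ~/.gitolite.rc
rc_file=$(mktemp)
printf '%s\n' '$REPO_UMASK = 0077;' '$GL_WILDREPOS = 0;' > "$rc_file"
enable_wildrepos "$rc_file"
grep 'GL_WILDREPOS' "$rc_file"
```

Other lines in the file are left untouched; only the value after the equals sign changes.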

Setup Client

Now your server is all set up, but there is still stuff to be done:

#On user1@host1
mkdir -p ~/.ssh
sudo add-apt-repository ppa:warp10/sparkleshare
sudo apt-get update
sudo apt-get install sparkleshare libwebkit1.1-cil git-core
sparkleshare start &
sparkleshare stop

When you run SparkleShare for the first time, it asks you for a few things, including your email address. Fill in those details, but do not set up your repository yet. You first need to give your SparkleShare key access to gitolite.

cd ~/.config/sparkleshare
ls #Should reveal files called sparkleshare.email.key and sparkleshare.email.key.pub
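cp sparkleshare.email.key.pub /path/to/gitolite-admin/keydir/
cd /path/to/gitolite-admin/
#The key is a new file, so it must be added explicitly (commit -a only picks up tracked files)
git add keydir/sparkleshare.email.key.pub
git commit -m "Added sparkleshare client1"
git push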

Now, if all goes well, you’ll have granted this user access to gitolite. We now need to run the SparkleShare setup again. Find it in your Applications. When it asks you for a repository path, type in the following details:

Server: sparkle@host2
Path: /share/fh73ah

Please take care with the slashes, otherwise SparkleShare fails to recognize it as a valid ssh address. Instead of fh73ah, you can type any string of 6 lowercase letters and digits. You can change this pattern in your gitolite-admin conf.
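
If you would rather not invent the six characters yourself, a name can be generated on the client. A minimal sketch, assuming /dev/urandom is available; random_share_name is a made-up helper, not part of SparkleShare or gitolite.

```shell
#!/bin/sh
# Hypothetical helper: print a random 6-character name made of a-z and 0-9,
# suitable for the share/[a-z0-9]{6} pattern used in gitolite.conf.
random_share_name() {
    # LC_ALL=C makes tr treat the random input as plain bytes
    LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 6
    echo
}

# Print the values to type into the SparkleShare setup dialog
name=$(random_share_name)
echo "Server: sparkle@host2"
echo "Path: /share/$name"
```

Keep a note of the generated name, since every other client that should join the same share needs to enter the identical path.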

After your first sync is complete (in which it tries to clone your repo, and gitolite creates it for you), you will find a folder called SparkleShare in your home directory. This contains all your personal shares, including your first one. Put any content inside the fh73ah folder and it will be synchronized automatically.

Conclusion

The best thing about SparkleShare is that you can use your own server under your own rules. I’ve synced 143GB via SparkleShare so far, and it has been working excellently. It keeps a complete history, takes care of moves (thanks to git) and allows you to keep huge backups easily. Just drag and drop, and forget. If you want to sync already existing folders, just drag them and alt+drop them inside the shared folder. This way a symlink gets created, which refers to the original directory. The SparkleShare folder on my computer takes up hardly a few KB, but syncs close to 150GB.

This method is only useful if you need to manage multiple accounts on the same host. Otherwise, you can refer to this excellent post on webupd8 for instructions on installing it on a single-user system (which does not involve the complication of gitolite). I’ve been looking for some gitolite management scripts (I’ve written a few as well) which would allow people to easily add their own ssh keys. This way anyone could easily set up accounts on the system. However, as of now, this is just a dream.

New Website [CaptNemo.in]

Just in case someone's still following this blog (don't, it's already dead): I've moved over to my new website (http://www.captnemo.in). It's running from GitHub and will be a perfectly static website where the power of my awesome magical skills shall finally be revealed. Just joking. It's hosted on the awesome servers of GitHub and you might want to check it out.