Amazon Order History Encryption Bypass

The Amazon US website lets you export your order history easily from the “Order History Reports” page. No such option seems to exist on the Amazon websites for other countries. I was trying to write a simple scraper for the Amazon India order history page to get the same data, and discovered something interesting: Amazon encrypts the order history page and decrypts it using client-side cryptography[1]. If you visit the page and check the response HTML, you’d see something like this in the source code (fairly simplified):

// Define encrypted content in JS
var payload = {
  "kid": "b70014",
  "iv": "/HenfXwYrGrrw8ff",
  "ct": "Wt78pPcibe8HAdVtoJ8+E9EGwt4IQYNghBMubBy7Zy/..."
}
// The HTML div to be populated with the decrypted HTML
var elementId = "csd-encrypted-889C1D02..";
// if client side decryption library failed to load
if (!window.SiegeClientSideDecryption) {
  window.location.href = "?disableCsd=missing-library";
  return;
}
// Decrypt and populate the div
SiegeClientSideDecryption.decryptInElementWithId(
  elementId, payload, {callSource: "now"}
);

The easiest way to scrape past such hurdles is often to run a complete browser. The browser runs the JavaScript code with the decryption routine, so you can scrape the actual content. However, it is much slower and wastes CPU cycles, so I try to avoid it if I can.

I could have spent time reverse-engineering the decryption routine, extracting the key, and decrypting the payload. But I found a much simpler solution: Amazon offers an alternate URL which disables encryption. As a fallback, in case the decryption library fails to load, the page redirects to itself with the query parameter ?disableCsd=missing-library. That disables the server-side encryption entirely.

So if you’re trying to scrape Amazon and are stumped by the missing order history in the HTML, try adding ?disableCsd=missing-library to the order history URL instead.

Amazon also sets a cookie csd-key=disabled but I didn’t experiment with that much.
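
The fallback can be sketched in Python as follows. This is a minimal sketch, not the scraper I used: the helper names is_csd_encrypted and with_csd_disabled are made up for illustration, and the only things taken from the page source above are the SiegeClientSideDecryption identifier and the disableCsd=missing-library parameter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def is_csd_encrypted(html: str) -> bool:
    """Heuristic: the page still relies on client-side decryption if it
    references the decryption library seen in the page source."""
    return "SiegeClientSideDecryption" in html


def with_csd_disabled(url: str) -> str:
    """Return the same URL with disableCsd=missing-library in its query string."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier disableCsd value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "disableCsd"
    ]
    query.append(("disableCsd", "missing-library"))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A scraper would fetch the order history URL as usual, and only re-fetch with with_csd_disabled(url) when is_csd_encrypted(html) is true for the first response.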

Request My Data

Another alternative to scraping is to request your data from Amazon. Check the Retail.OrderHistory CSV files in the data export. The export from amazon.com includes data for other countries as well. The feature is also available on other Amazon sites.

  1. I’m hesitant to call this DRM, but it might qualify as such. 

Writing on books - Fantastic Beasts and Where to Find Them

“Fantastic Beasts and Where to Find Them” is a curious book:

  • The in-universe book is written by Newt Scamander and was published in 1927.
  • The first edition of the companion book was published in 2001. This is apparently the 52nd in-universe edition with a foreword from Dumbledore and was released to the muggle world for charity.
  • The 2001 edition pretends to be Harry’s copy of the book as of the end of Harry’s 4th year at Hogwarts. As such it includes hand-written comments from the trio. (Yes, Hermione writes on books!)

However, with the release of the film of the same name in 2016, a new edition was released with lots of changes:

  1. Six new beasts that appeared in the film were added to the book[1]:
    • Hidebehind
    • Hodag
    • Horned Serpent
    • Snallygaster
    • Thunderbird
    • Wampus cat
  2. The hand-lettering was removed.
  3. Dumbledore’s foreword was removed in favor of an in-universe foreword from Newt Scamander.
  4. The “About the Author” section changes Newt’s background. He no longer graduates from Hogwarts; he just “leaves” it, as portrayed in the film.

All of these changes are meant to fix the book’s inconsistencies with canon, but they also make the book much less charming. I got myself a copy of the Hogwarts Library boxset a few years ago, which includes the newer edition of the book (Bloomsbury), and that means no witty comments from Ron.

Since it didn’t have the hand-lettering, I took it upon myself to fix that mistake. Thankfully, lists of all the comments in the 2001 edition are available on the internet. The trickiest part was the “this book belongs to” page, which is missing from the newer edition. I ended up creating a faux library card for that instead.

Here is what it looks like:

  • Ron plays hangman and loses.
  • Hermione writes on books!

Thanks to Bhavya for helping with the troll illustration.

  1. I didn’t like the new additions. They sound less like a textbook and more like a transcript of what happened in the film. 

Analysing the Indian government cyberspace

I recently did some work on analysing the Indian government cyberspace, and thought I should document it somewhere outside of my Twitter[1].

List of GoI websites

I’d made a list of Indian government websites in Jan 2019:

The dataset came from two sources:

  1. GoI Directory
  2. crt.sh (All certificates ending in .gov.in were used)

I re-ran the scripts to get an updated list (12842 domains), then tabulated the domains by their public suffix[2]. There is a long tail, and I’ve published the results here. Here are the top public suffixes for Indian government sites:

Public Suffix   Domains
nic.in          2454
gov.in          7259
in              528
ac.in           490
com             568
co.in           171
org.in          168
edu.in          117
org             844
res.in          134
net.in          12
net             38
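
The tabulation step can be sketched roughly like this. It is a simplified sketch and not the script I actually ran: the SUFFIXES set is a tiny hand-picked subset taken from the table above, whereas a real run would load the full list from publicsuffix.org, and the wildcard and exception rules in that list are ignored here. The function names are made up for illustration.

```python
from collections import Counter

# Illustrative subset only (assumption); the real list lives at publicsuffix.org.
SUFFIXES = {
    "in", "gov.in", "nic.in", "ac.in", "co.in", "org.in",
    "edu.in", "res.in", "net.in", "com", "org", "net",
}


def public_suffix(domain, suffixes=SUFFIXES):
    """Return the longest known suffix of `domain`, or None if none matches."""
    labels = domain.lower().rstrip(".").split(".")
    # Try the longest candidate first so gov.in wins over in.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in suffixes:
            return candidate
    return None


def tabulate_suffixes(domains):
    """Count how many domains fall under each public suffix."""
    return Counter(public_suffix(domain) for domain in domains)
```

Calling tabulate_suffixes(domains).most_common() on the full domain list would give a table in the same shape as the one above, sorted by count.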

Sanskari Proxy

This was a long-standing idea in my ideas repo:

A lot of Indian Government websites are inaccessible on the public internet, because they are geo-fenced to within Indian boundaries. The idea is to make an Indian proxy service that specifically works only for the geo-fenced Indian government websites.

For example, if uidai.gov.in is inaccessible, hitting uidai.gov.sanskariproxy.in will get you the same result, proxied via our servers.

Since I’d made an updated list of GoI websites, this seemed easy enough. I realized that setting up uidai.gov.sanskariproxy.in would likely count as impersonation under Indian law, so I did the next best thing: ran an actual proxy. Here’s the announcement tweet:

The project page is https://github.com/captn3m0/sanskari-proxy, and if you’d like to get access, please reach out.

Cyberspace Ownership

I’d planned to get a complete list of geoblocked websites next. While I’m progressing on this front, the results have been inconsistent and inaccurate so far. As an intermediate step, I’d made a list of IPs against every domain[3], which looked like this:

Domain               IP Address
aavin.tn.gov.in      164.100.134.148
abnhpm.gov.in        14.143.233.34
agnii.gov.in         13.232.216.65
ap.gov.in            117.254.92.53
aponline.gov.in      125.16.9.130
appolice.gov.in      118.185.110.147
attendance.gov.in    164.100.166.114
cgg.gov.in           112.133.222.115
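
A list like this can be built with a few lines of Python. This is a minimal sketch under the same simplification as the table (one IPv4 address per domain), not the exact script I used. The resolve_all name and the injectable resolver parameter are made up for illustration.

```python
import socket


def resolve_all(domains, resolver=socket.gethostbyname):
    """Map each domain to a single IPv4 address, or None if it does not resolve.

    socket.gethostbyname returns only one address per name, so domains
    that resolve to multiple IPs are collapsed to a single entry.
    """
    table = {}
    for domain in domains:
        try:
            table[domain] = resolver(domain)
        except OSError:
            # socket.gaierror (lookup failure) is a subclass of OSError.
            table[domain] = None
    return table
```

Passing a different resolver makes it possible to swap in another lookup method, or to exercise the function without touching the network.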

While running numerous nmap scans (and failing), I started checking the ASN[4] for some of these IPs to see who was hosting each website, especially the ones I found to be blocked.

I stumbled upon a bulk IP to ASN service by Cymru, ran all the IPs against it and published the results. Here’s the important graph:

% of Indian Government domains hosted by each ASN. The image is a pie-chart representation showing share of each ASN. 45% of the chart is taken up by

As you can expect, NIC[5] has the highest share, with NKN[6], BSNL, and CtrlS following at roughly 5% each. There are a few other charts on the Twitter thread, and the raw data is available here with interactive versions of each visualization.
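
For reference, a bulk IP to ASN lookup can be sketched as below. Treat this as an assumption-laden sketch rather than what I ran: it assumes Cymru's bulk whois interface on whois.cymru.com port 43 with a begin/verbose/end envelope, and it assumes pipe-separated response rows where the first column is the AS number, the second is the IP, and the last is the AS name. All function names are made up for illustration.

```python
import socket
from collections import Counter

CYMRU_HOST = "whois.cymru.com"  # assumed bulk whois endpoint
CYMRU_PORT = 43


def build_bulk_query(ips):
    """Wrap the IPs in the begin/verbose/end envelope for a bulk lookup."""
    return "\n".join(["begin", "verbose", *ips, "end"]) + "\n"


def parse_bulk_response(text):
    """Parse pipe-separated rows, skipping banner and header lines."""
    rows = []
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) < 3 or fields[0] == "AS":
            continue
        rows.append({"asn": fields[0], "ip": fields[1], "as_name": fields[-1]})
    return rows


def lookup_asns(ips, timeout=30):
    """Send one bulk query over TCP and return the parsed rows."""
    with socket.create_connection((CYMRU_HOST, CYMRU_PORT), timeout=timeout) as conn:
        conn.sendall(build_bulk_query(ips).encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_bulk_response(b"".join(chunks).decode("utf-8", errors="replace"))


def asn_share(rows):
    """Percentage of looked-up IPs that belong to each AS name."""
    counts = Counter(row["as_name"] for row in rows)
    total = sum(counts.values())
    return {name: 100 * count / total for name, count in counts.items()}
```

Note that asn_share counts one entry per IP row, so getting a per-domain share like the chart above would need one row per domain rather than one row per unique IP.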

What next?

I’m working on running and comparing connectivity scans to these IPs to get a better understanding of the geoblocking situation. There are also some issues with the domain list, as it seems to be missing lots of domains, so more corrections are needed.

  1. Twitter decided to suspend 12 different accounts I had access to recently - I’m starting to get wary of using Twitter for archival now. 

  2. A “public suffix” is one under which Internet users can (or historically could) directly register names, for example nic.in or github.io. Mozilla manages the list at https://publicsuffix.org/

  3. There are issues with this approach, since a domain can resolve to multiple IPs. But this is okay for the rudimentary analysis I’ve been doing so far. 

  4. Autonomous Systems (AS) are how the internet is sliced up and managed by different entities. Each AS (usually an ISP) is responsible for routing within its network, while announcing the network routes through which it can be reached. 

  5. The primary government office (under MeitY) that provides infrastructure and support for government IT services. 

  6. National Knowledge Network is a multi-gigabit research and education network that provides a high speed network backbone for educational institutions in India.