A simple Bitcoin web API
Bitcoin has taken the world by storm and is making headlines; it's the most successful and famous (or infamous, depending on whom you speak to) decentralized cryptocurrency. Bitcoin is regarded as an "anonymous" online cash substitute. Silk Road, an illegal marketplace on the Tor network that has since been shut down, accepted Bitcoin as payment for illicit goods and services. As Bitcoin has gained popularity, some websites and brick-and-mortar stores have begun accepting it as payment. It has also attracted a great deal of public attention as its value rose well above everyone's expectations.
Bitcoin users store their Bitcoins at addresses and can send or receive Bitcoins by specifying the address they would like to use. In Bitcoin, addresses are represented as strings of case-sensitive alphanumeric characters, most commonly 34 characters long. Fortunately for examiners, all transactions are stored publicly on the blockchain. The blockchain keeps track of the time, inputs, outputs, and values for each transaction. In addition, each transaction is assigned a unique transaction hash.
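Before querying anything, it can be handy to reject input that obviously isn't an address. The following is a minimal sketch of such a plausibility check, not a full validation: it assumes legacy Base58-style addresses between 26 and 35 characters long, and it doesn't verify the checksum embedded in the address. The function name looks_like_legacy_address is our own.

```python
# Base58 omits the easily confused characters 0, O, I, and l.
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def looks_like_legacy_address(address):
    """Return True if the string has a plausible length and character set.

    Assumption: legacy addresses are 26 to 35 Base58 characters long.
    This does NOT verify the address checksum.
    """
    if not 26 <= len(address) <= 35:
        return False
    return all(char in BASE58_ALPHABET for char in address)
```

Because addresses are case-sensitive, lowercasing a valid address will usually introduce a character, such as l, that falls outside the Base58 alphabet, so this check catches that mistake as well.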
Blockchain explorers are programs that allow an individual to search the blockchain. For example, we can search for a particular address or transaction of interest. One such blockchain explorer is at https://www.blockchain.com/explorer and is what we'll use to generate our dataset. Let's take a look at some of the data we'll need to parse.
Our script will ingest the JSON-structured transaction data, process it, and output this information to examiners in an analysis-ready state. After the user inputs the address of interest, we'll use the blockchain.info API to query the blockchain and pull down the relevant account data, including all associated transactions, as follows:
https://blockchain.info/address/%btc_address%?format=json
We'll query the preceding URL by replacing %btc_address% with the actual address of interest. For this exercise, we'll be investigating the 125riCXE2MtxHbNZkRtExPGAfbv7LsY3Wa address. If we open a web browser and visit the URL with %btc_address% replaced by the address of interest, we can see the raw JSON data that our script will be responsible for parsing:
{
"hash160":"0be34924c8147535c5d5a077d6f398e2d3f20e2c",
"address":"125riCXE2MtxHbNZkRtExPGAfbv7LsY3Wa",
"n_tx":25,
"total_received":80000000,
"total_sent":80000000,
"final_balance":0,
"txs":
[
...
]
}
This is a more complicated version of our previous JSON example; however, the same rules apply. The top-level keys, starting with hash160, hold general account information: the address, the number of transactions, the total sent and received, and the final balance. Following that is the txs array, denoted by the square brackets, which contains each transaction the address was involved in.
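To make the account-level keys concrete, here is a minimal sketch of how we might request the data and summarize it. It assumes the endpoint still returns the structure shown above and that the values are expressed in satoshis (1 BTC is 100,000,000 satoshis). The function names build_url, fetch_account, and summarize_account are our own, not part of any API.

```python
import json
from urllib.request import urlopen

# URL template from the text, with a placeholder for the address.
API_URL = "https://blockchain.info/address/{}?format=json"
SATOSHI_PER_BTC = 100000000  # assumption: API values are in satoshis


def build_url(address):
    """Substitute the address of interest into the query URL."""
    return API_URL.format(address)


def fetch_account(address, timeout=20):
    """Download and decode the JSON account data (requires network access)."""
    with urlopen(build_url(address), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_account(account):
    """Extract the general account information from the parsed JSON."""
    return {
        "address": account["address"],
        "n_tx": account["n_tx"],
        "total_received_btc": account["total_received"] / SATOSHI_PER_BTC,
        "total_sent_btc": account["total_sent"] / SATOSHI_PER_BTC,
        "final_balance_btc": account["final_balance"] / SATOSHI_PER_BTC,
    }
```

Calling summarize_account(fetch_account("125riCXE2MtxHbNZkRtExPGAfbv7LsY3Wa")) would then return a small dictionary of headline figures, provided the service is reachable. Keeping the download separate from the summary means the parsing logic can be exercised with saved JSON when we're offline.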
Looking at an individual transaction, a few keys stand out: time, hash, and the addr values found in the inputs and out lists. When we iterate through the txs list, these keys will be used to reconstruct each transaction and display that information to the examiner. We have the following transaction:
"txs":[{
"lock_time":0,
"result":0,
"ver":1,
"size":225,
"inputs":[
{
"sequence":4294967295,
"prev_out":{
"spent":true,
"tx_index":103263818,
"type":0,
"addr":"125riCXE2MtxHbNZkRtExPGAfbv7LsY3Wa",
"value":51498513,
"n":1,
"script":"76a9140be34924c8147535c5d5a077d6f398e2d3f20e2c88ac"
},
"script":"4730440220673b8c6485b263fa15c75adc5de55c902cf80451c3c54f8e49df4357ecd1a3ae022047aff8f9fb960f0f5b0313869b8042c7a81356e4cd23c9934ed1490110911ce9012103e92a19202a543d7da710af28c956807c13f31832a18c1893954f905b339034fb"
}],
"time":1442766495,
"tx_index":103276852,
"vin_sz":1,
"hash":"f00febdc80e67c72d9c4d50ae2aa43eec2684725b566ec2a9fa9e8dbfc449827",
"vout_sz":2,
"relayed_by":"127.0.0.1",
"out":[
{
"spent":false,
"tx_index":103276852,
"type":0,
"addr":"12ytXWtNpxaEYW6ZvM564hVnsiFn4QnhAT",
"value":100000,
"n":0,
"script":"76a91415ba6e75f51b0071e33152e5d34c2f6bca7998e888ac"
}
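The keys highlighted above can be pulled out of each transaction along the following lines. This is only a sketch under the assumption that every transaction follows the layout of the excerpt; the function name parse_transactions and the shape of the returned dictionaries are our own choices. Missing addr keys are tolerated, since we can't be sure every input or output carries one.

```python
from datetime import datetime, timezone


def parse_transactions(account):
    """Reduce each entry in the txs list to the fields an examiner needs."""
    rows = []
    for tx in account.get("txs", []):
        # Sending addresses live under inputs -> prev_out -> addr.
        senders = [
            tx_input["prev_out"]["addr"]
            for tx_input in tx.get("inputs", [])
            if "addr" in tx_input.get("prev_out", {})
        ]
        # Receiving addresses and their values live in the out list.
        receivers = [
            (output.get("addr"), output.get("value"))
            for output in tx.get("out", [])
        ]
        # The time key is a Unix timestamp; render it in UTC.
        when = datetime.fromtimestamp(tx["time"], tz=timezone.utc)
        rows.append({
            "hash": tx["hash"],
            "time": when.isoformat(),
            "inputs": senders,
            "outputs": receivers,
        })
    return rows
```

Each returned dictionary pairs the transaction hash and a readable UTC timestamp with the addresses on either side of the transfer, which is the information we'll want to present to the examiner.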
As in the previous chapter, we'll approach this task in a modular way by iteratively building our script. Besides working with serialized data structures, we're also going to introduce the concepts of creating logs and writing data to CSV files. Like argparse, the logging and csv modules will feature regularly in our forensic scripts.
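As a preview of how these two modules fit together, here is a small sketch rather than the final script. The logger name, the column headers, and the function names configure_logging and write_csv are hypothetical, and each row is assumed to be a dictionary holding a hash, a time string, and lists of input and output addresses.

```python
import csv
import logging

# Hypothetical logger name for this sketch.
logger = logging.getLogger("bitcoin_address_lookup")

HEADERS = ["hash", "time", "input_addresses", "output_addresses"]


def configure_logging(log_path):
    """Send log messages to a file so the run can be reviewed later."""
    logging.basicConfig(
        filename=log_path,
        level=logging.DEBUG,
        format="%(asctime)s | %(levelname)s | %(message)s",
    )


def write_csv(rows, handle):
    """Write one CSV line per transaction to an open file-like object."""
    writer = csv.writer(handle)
    writer.writerow(HEADERS)
    for row in rows:
        writer.writerow([
            row["hash"],
            row["time"],
            "; ".join(row["inputs"]),
            "; ".join(row["outputs"]),
        ])
    logger.info("Wrote %d transaction rows", len(rows))
```

When writing to a real file, the csv documentation recommends opening it with newline="" so that the writer controls the line endings itself.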