Cracking the Foreign Exchange Market Using Social Data

The popular game show Who Wants to Be a Millionaire used to fascinate me as a child. The show itself was entertaining, novel for the time, and dramatic (“Is that your final answer?” anyone?). However, one part of Millionaire always stuck out to me: why was the audience always right? According to James Surowiecki, the best-selling author of “The Wisdom of Crowds”, the audience on Millionaire was correct 91% of the time. The basis of Surowiecki’s argument is that the decision of a group is always better than the decision of one of its members. Surowiecki postulates that the group needs to meet several requirements – such as coming from diverse backgrounds, being resistant to groupthink, and having access to the same basic information – in order for it to make the best decisions possible. That’s what made the audience from Millionaire so great – it was a group of diverse people who couldn’t communicate with each other giving their qualified opinions on which answer was correct. So how does this relate to social media, the foreign exchange market, and a potential correlation worth millions? Read on.

 

Armed with this knowledge, I was given access to a social media monitoring and engagement platform called Radian6. It allows the user to search, collect, and visualize data from the social web on a grand scale. Radian6 allows companies to track the effectiveness of their social campaigns (“are people actually talking about my brand?”) and to engage directly with their customers in one convenient platform. I was given access to this amazing platform thanks to a partnership between Clemson University, Dell, and Radian6 – which gave students like me unlimited access to research anything we wanted. I immediately thought back to an article in Wired in which researchers from Indiana University found a correlation between the general mood on Twitter and the stock market. The general “mood” on Twitter was able to accurately predict the daily movements of the stock market 86.7% of the time, which is an incredible correlation (source). Similarly, researchers from HP Labs identified a correlation between posts on Twitter about movies and their respective box office sales. I figured that this deserved a second look, so several of us (myself, Scott Cole, Paul Smith, James Kaplanges, and Brett Smentek) formed a group to analyze this further.

I learned very quickly that constructing a social query surrounding stocks would not be an easy feat. How would we differentiate the company “Apple” from the fruit? How would we interpret news about AAPL as positive, negative, or neither? Why were people misspelling “back” as “bac” and throwing off our search results for Bank of America’s stock ticker, BAC? This was going to be a nightmare. However, we also discovered that people on Twitter used the notation $BAC or $AAPL to talk about stock tickers. After analyzing how many people used that notation, we decided that we weren’t able to effectively watch a single stock: traffic only spiked around one company at a time – for instance if an earnings report came out, or a large piece of news was released. So, we plugged all 3000+ securities listed on the NYSE one-by-one into Radian6 and watched intently to see which stocks were being talked about the most. We were able to spot a correlation between social traffic on a particular ticker and big market movements, which was exciting. However, the number of queries that Radian6 had to run for our stock analysis slowed their servers down to the point where it affected other customers. We had our topic profile disabled, but thankfully we were able to keep our access to the platform. Whoops! If you want to see a full write-up on our efforts in the stock market, I’d recommend checking out Brett Smentek’s blog over here.

With our stock project essentially shut down, I turned my focus to the foreign exchange market. Since there are only a few currency pairs to examine, we could pull in enough social traffic to analyze with only a few search keywords. After searching through social media posts about foreign exchange, we discovered several keywords that would yield tons of relevant opinions on whether to buy or short a currency pair at a specific time. We had essentially found the Ask the Audience lifeline for the foreign exchange market.

Let me show you what I mean. This is a picture of Radian6’s output:

 

We look for instances where Buy or Sell volume outweighs the other by a certain amount. In this example we would have executed trades in the 8AM and 12PM ranges. Now if we compare that to the EUR/USD price graph for the day:
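To make the "outweighs the other by a certain amount" rule concrete, here is a minimal sketch of how such a check could look. The threshold values, the function name, and the hourly post counts are all made up for illustration; they are not the numbers or the logic from our actual Radian6 topic profile.

```python
from typing import Optional

def trade_signal(buy_posts: int, sell_posts: int,
                 min_ratio: float = 2.0, min_posts: int = 20) -> Optional[str]:
    """Return "BUY", "SELL", or None for one time bucket.

    min_ratio and min_posts are hypothetical thresholds: one side must have
    at least min_ratio times the volume of the other, and the bucket must
    contain enough posts overall to be worth acting on.
    """
    if buy_posts + sell_posts < min_posts:
        return None  # too little chatter to trust
    if buy_posts >= min_ratio * max(sell_posts, 1):
        return "BUY"
    if sell_posts >= min_ratio * max(buy_posts, 1):
        return "SELL"
    return None  # volumes too balanced

# Illustrative (invented) hourly counts of (buy posts, sell posts)
hourly_counts = {"08:00": (42, 15), "09:00": (18, 16), "12:00": (11, 37)}
signals = {hour: trade_signal(b, s) for hour, (b, s) in hourly_counts.items()}

for hour, signal in signals.items():
    print(hour, signal)
```

Requiring a minimum number of posts as well as a ratio keeps a quiet hour with three Buy posts and one Sell post from triggering a trade.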

 

After analyzing some preliminary data from our Radian6 topic profile, we had enough data to construct a rudimentary automated trading algorithm. The initial results of this algorithm were overwhelmingly positive, so we pressed on. Over the course of seven weeks we developed a far more sophisticated trading algorithm that can respond to a number of market conditions and has proven very effective in practice trading.

Out of 58 trades made by our Radian6-powered social algorithm, only 13 moved in the opposite direction. That is a 77% prediction rate, and the true rate may be higher, since some of the misses stem from inefficiencies within the autotrading algorithm. On average we secured 32 pips per trade, which beat our goal of 20 pips per day by a good percentage.

We started with $5,000 in a demo brokerage account leveraged 50:1 and let it trade over the course of seven weeks. As of today, we have roughly $44,000 in the account (a 784% increase) and are on course to have over one million dollars in the account before the end of June.
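For anyone who wants to check the arithmetic, here is a quick sketch using the figures quoted above. The extrapolation at the end assumes that all 58 trades contributed to the gain and that the account keeps compounding at the same average rate per trade, which is purely an assumption for illustration and not something the demo results guarantee. With the rounded $44,000 balance the increase comes out near 780%, in line with the quoted 784%.

```python
import math

TRADES = 58
LOSING_TRADES = 13
START_BALANCE = 5_000.0
CURRENT_BALANCE = 44_000.0   # rounded figure from the post
TARGET_BALANCE = 1_000_000.0

# Share of trades that moved in the predicted direction
hit_rate = (TRADES - LOSING_TRADES) / TRADES

# Percentage increase of the demo account
pct_increase = (CURRENT_BALANCE - START_BALANCE) / START_BALANCE * 100

# Average growth factor per trade, assuming the gain was spread over all 58 trades
per_trade_growth = (CURRENT_BALANCE / START_BALANCE) ** (1 / TRADES)

# Trades still needed to reach the target if that average rate simply continued
trades_to_target = math.ceil(
    math.log(TARGET_BALANCE / CURRENT_BALANCE) / math.log(per_trade_growth)
)

print(f"hit rate: {hit_rate:.1%}")
print(f"increase: {pct_increase:.0f}%")
print(f"average growth per trade: {(per_trade_growth - 1):.2%}")
print(f"additional trades to reach target: {trades_to_target}")
```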

The X axis represents individual trades, while the Y axis represents dollars in the demo account. We make one or two trades per day, and this represents 35 days’ worth of trades. Note that there was a period of neutral/negative growth for about a week. That week was extremely volatile price-wise (no discernible upward or downward trends), and it pointed out a flaw in our method. Group decision making may be accurate and effective, but it is far slower than individual decision making. By the time enough posts come through to trigger a trade, the market has already made its short-lived movement and is moving towards a correction. Regardless of this setback, our social media autotrading bot destroyed our expectations and continues to make great trades.

More updates to come! Stay tuned.

Hard /CIDR Networking Academy – VTPv3

Hard /CIDR Networking Academy – VTP and VTP Pruning

Another video: this time about VTP and VTP Pruning. Enjoy!

 

HardCIDR Networking Academy – DTP

I have decided to start an instructional video series of my own to mirror my own studies. I believe that teaching material is the best way to master it, and others might benefit from these videos as well :) Leave any comments/suggestions below!

Enjoy!

Capturing CDP Frames on Windows

Cisco Discovery Protocol (CDP) is an amazing information discovery protocol, but it seems that few people know that you can leverage the information found in CDP on a Windows machine in the field. I ran into a unique situation where I needed to find which switchport a Windows-based PC was connected to, but did not have my Fluke network diagnostic tools on me. After several minutes of googling for an answer, I ran across this blog post about collecting information from CDP-enabled switches on Windows using TCPDump. The author seems to have a strong bias against Wireshark because it’s not easily installed on a client’s computer, but any packet capture application would work here.

The steps involved here are pretty basic, but I’ll go through them here.

  1. First you need to download TCPDump
  2. Next, cd into the directory and figure out which adapter you want to sniff packets on. You can use the command “tcpdump -D” for that
  3. Next, run “tcpdump -i 2 -nn -v -s 1500 -c 1 ether[20:2] == 0x2000” and wait until it captures a CDP frame
  4. It will output the contents of the CDP frame in the cmd shell, and that’s it!

What information would you expect to find?

  • Switch IP address
  • Switchport native VLAN assignment
  • Switchport number
  • VTP domain
  • Switch hostname

CDP is an incredibly useful protocol in this case, but also keep this in mind when deciding whether keeping CDP enabled is worth the security risk involved! Assume that anyone on your network has access to this information when making that decision.
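If you are wondering why the capture filter in step 3 works: CDP frames use 802.3 framing with an LLC/SNAP header, so bytes 20 and 21 of the frame hold the SNAP protocol ID, which is 0x2000 for CDP. The filter “ether[20:2] == 0x2000” simply reads those two bytes. Here is a small sketch of the same check in Python; the function name and the hand-built sample frame (including its source MAC) are only for illustration.

```python
CDP_PROTOCOL_ID = 0x2000  # SNAP protocol ID used by CDP

def is_cdp_frame(frame: bytes) -> bool:
    """Mimic the tcpdump filter ether[20:2] == 0x2000 on a raw Ethernet frame."""
    if len(frame) < 22:
        return False  # too short to contain the SNAP protocol ID
    # Two bytes starting at offset 20, read as a big-endian integer
    return int.from_bytes(frame[20:22], "big") == CDP_PROTOCOL_ID

# Hand-built header of a CDP frame (payload omitted):
sample = (
    bytes.fromhex("01000ccccccc")    # bytes 0-5:   CDP multicast destination MAC
    + bytes.fromhex("001122334455")  # bytes 6-11:  source MAC (made up)
    + (0x0100).to_bytes(2, "big")    # bytes 12-13: 802.3 length field (made up)
    + bytes.fromhex("aaaa03")        # bytes 14-16: LLC header (SNAP)
    + bytes.fromhex("00000c")        # bytes 17-19: Cisco OUI
    + bytes.fromhex("2000")          # bytes 20-21: protocol ID = CDP
)

print(is_cdp_frame(sample))     # the CDP header above
print(is_cdp_frame(bytes(64)))  # a frame of all zeros
```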

The Value of a Macbook – The “Jade Plan” Revisited

I recently bought a new Macbook Pro (my first OS X computer, in fact) just a few weeks ago to replace my dying Google CR48 Chromebook. Since I deal a lot with used electronics, I understand that Macbooks hold their value very well, even over several hardware changes. I did some research on this topic before my purchase and came across something called the “Jade Plan” on the Ars Technica forums. The premise of this plan is very simple: sell your old Macbook to fund the purchase of your new one. There are several variations of this plan, from waiting until the next major OS upgrade to purchase (to save on software costs) to waiting for a redesign. There are numerous “Jade Plan” success stories on those forums, but I wanted to drill down and find some real evidence that you can pull this off successfully.

Objective:

Test the legitimacy of the “Jade Plan” by analyzing historical eBay prices with respect to Macbook Pro generations. The “Jade Plan” can be considered successful if the total cost of ownership of a Macbook Pro is less than $100 per year after a successful upgrade.

Data:

The data was collected over the course of a day by me (I tend to lead an exciting life), and is stored in this Google Doc. Sheet 1 contains summary data, and Sheet 2 contains the raw data. It’s a little messy, but you can figure it out.

Observations

Let’s start off with a couple of simple observations I had while collecting the data.

  1. People who overvalue their Macbook (i.e. set the price too high) will not sell it
  2. Likewise, those who set the price too low will not sell their Macbooks. If you plot the data on a histogram, you will in fact see observations #1 and #2 clearly defined in that regard
  3. Upgrades don’t tend to do much for the selling price. Upgrading your MBP to 8GB of RAM is definitely a plus, but don’t expect much more money when you go to resell it.
  4. Preinstalled software leads to higher selling prices. The highest selling MBPs, even some of the crazy high outliers, had things like Adobe CS5 and Office 2011 preinstalled. This applies to every model, for every year.

Those are some pretty simple observations, but the real shocker lies in the data. The average selling price of an early 2011 MBP is about $920, which is a 23% depreciation over the course of a year (and a MBP refresh). Without taking into account tax and eBay/Paypal fees, that is roughly a $280/yr cost of ownership. The same cost of ownership occurred with the 15″ model, but the 17″ model was by far the worst with a $425/yr cost of ownership. The older models of Macbook Pros (I went as far back as the mid-2009 model line) fare a little bit better – with the mid-2009 13″ MBP having a cost of ownership of $218/yr if you bought it on release day. Again, this isn’t taking eBay fees or sales tax into account, and as the data shows, those numbers aren’t pretty.
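The cost of ownership figures above are just purchase price minus resale price, divided by the years owned. Here is a small sketch of that calculation. The $1,199 purchase price is my assumption for illustration (it is the price that matches the 23% depreciation and roughly $280/yr figures), and the fee rate parameter is a hypothetical knob for eBay/Paypal fees, not an actual fee schedule.

```python
def annual_cost_of_ownership(purchase_price: float, resale_price: float,
                             years_owned: float = 1.0,
                             selling_fee_rate: float = 0.0) -> float:
    """Net cost per year of owning a laptop that is later resold.

    selling_fee_rate is the fraction of the sale price lost to selling fees
    (0.0 means fees are ignored, as in the figures quoted in this post).
    """
    net_proceeds = resale_price * (1 - selling_fee_rate)
    return (purchase_price - net_proceeds) / years_owned

PURCHASE_PRICE = 1199.0   # assumed retail price, see note above
AVG_RESALE_PRICE = 920.0  # average eBay selling price from the data
JADE_PLAN_LIMIT = 100.0   # success threshold from the Objective, in $/yr

depreciation = (PURCHASE_PRICE - AVG_RESALE_PRICE) / PURCHASE_PRICE
cost_per_year = annual_cost_of_ownership(PURCHASE_PRICE, AVG_RESALE_PRICE)
meets_jade_plan = cost_per_year < JADE_PLAN_LIMIT

print(f"depreciation: {depreciation:.0%}")
print(f"cost of ownership: ${cost_per_year:.0f}/yr")
print(f"meets the Jade Plan threshold: {meets_jade_plan}")
```

Passing a non-zero selling_fee_rate shows how quickly fees push the yearly cost even further from the $100 target.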

Conclusion

If you buy your Macbook Pro for the full retail price, then sell it on eBay after the next model comes out, then I can definitively say that the Jade Plan is nothing more than rumor. However, there are several things you can do to make sure you get the best price for your shiny Macbook Pro. The first is to keep everything – the laptop, box, cables, etc. – in perfect condition. Any kind of dings/scratches/dents can severely cut the price. The second is to go ahead and preinstall software on the MBP if you have the ability to do so. The MBPs that went for the most money had expensive software suites like Adobe CS5 and Office 2011 preinstalled and ready to go. Many of the items that I saw that fit those two descriptions were selling above their retail price – and therefore beat out the Jade Plan by a mile. Also, anything to avoid fees is a good thing, so try Craigslist if that is an option in your area.

 

Please feel free to make any comments/suggestions! I’ve spent the last several days collecting and poring over this data, so feel free to ask anything.

Green Tigers Pitch Deck

Last Friday I won the LaunchPadSC 2011 entrepreneurship competition with my idea for Green Tigers – an electronics recycling firm that targets college-aged students. It was a great experience, and the prize money wasn’t bad either! You can see my winning pitch deck here, and note that Green Tigers is currently a live web service. It may be down for the holidays, however, as I want to figure out where I’m taking this idea…so stay tuned!

GPU Cracking and Why You Should Use Pass Phrases

This past week in class, we discussed the importance of password security on the web. The conclusion of the discussion was that passwords should be at least eight characters long, but more is definitely better. While the concept still holds true, GPU hash cracking has set the bar higher for what the “minimum password length” should be.

According to Majuric’s blog (http://majuric.org/software/cudamd5/), a quad core Intel CPU can crack 4.1 million MD5 hashes per second. That means that it can take days to crack an MD5’d password over 6 characters. Regardless of how many cores you have, a CPU just isn’t built for brute forcing hashes. However, according to the MyTechEncounters blog (http://mytechencounters.wordpress.com/2011/04/03/gpu-password-cracking-crack-a-windows-password-using-a-graphic-card/), a $100 graphics card (also called a GPU) can brute force 3.3 billion MD5 hashes per second. That is an incredible increase in horsepower! For an 8 character password that would take 17 years to crack via conventional methods, this $100 GPU can crack it in only 26 days. While that is still a decent amount of time, it brings the once impossible back into the realm of possible. Several people have achieved results eight times faster through the use of better/more GPUs, but the fact remains – passwords need to go.
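If you want to play with these numbers yourself, brute-force time is just the size of the keyspace divided by the hash rate. Here is a rough sketch using the two hash rates quoted above. The 95-character alphabet (all printable ASCII) is my assumption, and the result is the worst case of trying every candidate, so it will not necessarily line up exactly with the figures from either blog, which depend on each author's own assumptions.

```python
SECONDS_PER_DAY = 86_400

def worst_case_days(length: int, alphabet_size: int,
                    hashes_per_second: float) -> float:
    """Days needed to try every password of the given length."""
    keyspace = alphabet_size ** length
    return keyspace / hashes_per_second / SECONDS_PER_DAY

CPU_RATE = 4.1e6   # MD5 hashes/second, quad core CPU (figure quoted above)
GPU_RATE = 3.3e9   # MD5 hashes/second, $100 GPU (figure quoted above)
ALPHABET = 95      # assumption: every printable ASCII character is allowed

cpu_days = worst_case_days(8, ALPHABET, CPU_RATE)
gpu_days = worst_case_days(8, ALPHABET, GPU_RATE)
speedup = GPU_RATE / CPU_RATE

print(f"CPU: {cpu_days / 365:.1f} years")
print(f"GPU: {gpu_days:.1f} days")
print(f"GPU speedup: {speedup:.0f}x")
```

Try bumping the length from 8 to 12: every extra character multiplies the work by the alphabet size, which is exactly why length matters more than clever substitutions.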

A pass phrase uses a sentence as a password, instead of a word and some numbers. There are two benefits: pass phrases are easier to remember, and they are nearly uncrackable by brute force. A pass phrase such as “The barking dog is annoying” is 27 characters long counting the spaces (23 without them), and much easier to remember than B@rk1ngD0g$.
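To put a rough number on the difference, here is a sketch that compares the two examples above by the size of the brute-force search space, expressed in bits. The alphabet sizes are my assumptions (95 printable ASCII characters for the password; 53 for the pass phrase, i.e. upper and lower case letters plus the space). Keep in mind this only models character-by-character brute forcing; an attacker guessing whole dictionary words would face a smaller search space than this suggests.

```python
import math

def brute_force_bits(length: int, alphabet_size: int) -> float:
    """Size of the brute-force search space, as a power of two."""
    return length * math.log2(alphabet_size)

password = "B@rk1ngD0g$"
pass_phrase = "The barking dog is annoying"

# Assumed alphabets: all printable ASCII vs. letters plus space
password_bits = brute_force_bits(len(password), 95)
phrase_bits = brute_force_bits(len(pass_phrase), 53)

print(f"{len(password)} chars  -> about {password_bits:.0f} bits")
print(f"{len(pass_phrase)} chars -> about {phrase_bits:.0f} bits")
```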

In conclusion: use pass phrases, not passwords.

 

The Cloud Revolution And Our Rentership Society

The “cloud revolution” in web services has had an immense impact on how we view not only computing, but also ownership. The old device-dependent computing model has clear concepts of ownership – namely you purchase a product, and you can do whatever you want with it. Usually the product comes in disc or direct download form, whereby you can actually “own” what you purchase. Let’s take for example purchasing a videogame from Best Buy for the PS3. You own the disc, you own the content, and you can use it whenever and however. You can even resell the product after you are done with it. That is a clear, conventional form of ownership from which we are starting to drift away.

The cloud model’s sense of ownership, on the other hand, isn’t as clear-cut. What do we own when all of the content is stored elsewhere, and we are merely accessing it? Let’s take two examples: Spotify and Netflix. You pay $10 a month in each case to “rent” thousands of different songs and movies at any given time. For $10 a month you don’t own anything at all, so why are we so compelled to jump on this bandwagon? The ability to create, and therefore “own”, things in these models is what keeps us coming back for more. In Spotify, I have several large playlists which I have created, and would be lost without. Same with Netflix – where would I be without my personal movie recommendations based on the hundreds of movies that I have rated? But do we technically own anything in this model? Not at all, for if Netflix or Spotify were to go out of business, I would be left with nothing to show for it.

The Gift Economy and Product Reviews

This past week we discussed the concept of the Web 2.0 “gift economy”, or the fact that many Web 2.0 firms and denizens give away products and services for no cost. This concept spans everything from open source projects, to Google offering search functionality for free, to users freely contributing product reviews to Amazon. One big question that was raised in our discussion of the Web 2.0 gift economy was simply “why?” Why would we freely give up our time, resources, and sometimes money with no quantifiable benefits in return? Altruism and irrationality were two theories posited, but I think that it goes deeper than that.

Let’s take a look at Amazon product reviews. Why do people spend anywhere between 5 and 45 minutes writing a detailed review of a product? Are we just irrationally or altruistically posting our opinions online? I believe that the answer lies not in the fields of economics or business, but rather in the realm of psychology. We, as human beings, love to be experts on something. We love for our voices to be heard, and what better way than giving your opinion online? When Amazon sends you an email asking you how your brand new coffee maker is working, you feel compelled to put your two cents in because you feel that it matters. Simply put, Web 2.0 is not about altruism; it is about ego. Is this bad? Not at all – it helps everyone involved – from Amazon to other potential purchasers of the product.