I’m a software developer, data scientist, artist and technology writer.
If you have a challenging technical problem or would just like some advice, then send me an email or message me on Twitter! My virtual door is always open.
Oh – and if you want to learn how a computer works or how to use the Rust programming language, then consider buying my book Rust in Action.
Get an introduction to programming while learning more about how to add computational methods to humanities and social science research. We’ll be using the theme of New Zealand’s early explorers to learn more about the exciting tools available to contemporary researchers.
The workshop is somewhat of a follow-up to an earlier workshop in Wellington. Dr James Smithies, the head of the University of Canterbury’s Digital Humanities programme, has invited me to provide a similar day for people down south.
The day will be split into two main sections, each combining tutorial-based content with interactive discussion:
Morning The web as data - an introduction to web scraping and other techniques for translating textual information into data that a computer can work with. We will be building an interactive website from James Cook’s journal of his first voyage.
Afternoon APIs for learning - an introduction to some of the excellent tools available for research on New Zealand society and history. In particular, we’ll look at Digital NZ and DBpedia.
To break up the technical work through the day, we will spend time discussing technology’s growing impact on the humanities and social sciences.
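To give a flavour of the morning session, here is a minimal sketch of the web-scraping idea using only Python’s standard library. The journal markup below is a made-up stand-in for a real page; the point is simply that a small parser can turn HTML into plain data.

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collects the text content of every <p> element on a page."""
    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            # Append text found inside the current paragraph
            self.paragraphs[-1] += data

# A stand-in for a page of Cook's journal fetched from the web.
page = "<html><body><p>Sunday, 27th August 1768.</p><p>Wind at NW.</p></body></html>"
extractor = ParagraphExtractor()
extractor.feed(page)
print(extractor.paragraphs)
```

In a real scraper you would fetch the page over HTTP first, but the extraction step looks much like this.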
You will need to bring a laptop that you have administrative rights on. You will also need to install some software called Oracle VirtualBox to make the most out of the day.
Assuming you have VirtualBox installed, download this image.
Great Scala talk in Auckland tonight. Thanks Movio!
To get a sense of a competitor’s level of engagement, http://search.twitter.com is a great tool. To begin, just add a user’s name in the search bar. You’ll be greeted with every message to and from that account within the last 3 or so days.
The search interface makes it very simple to see exactly who is talking to your target company, what they are talking about and how your target company responds. In essence, you are privy to their complete public interaction, which means you can draw inferences about the kind of Twitter strategy they are using. Do they respond quickly? Are they informal? Is there a particular time of day that they tweet?
You can get a sense of what positive tweets are being sent if you include a smile emoticon, :), with your query.
“@timClicks :)” returns messages like: @timClicks Thanks so much Tim. :-). Thankfully, there are no results for “@timClicks :(”.
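Queries like the ones above can also be built programmatically. The sketch below shows how to URL-encode a sentiment query for the search.twitter.com interface mentioned earlier; the URL shape is illustrative, and only Python’s standard library is used.

```python
from urllib.parse import urlencode

# The exact-phrase sentiment query from the text, including the smiley.
query = '"@timClicks :)"'

# urlencode handles the characters that are unsafe in URLs (quotes, @, :, ...).
url = "http://search.twitter.com/search?" + urlencode({"q": query})
print(url)
```

The same approach works for negative-sentiment queries: swap `:)` for `:(` in the query string.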
Remember that it’s possible to analyse competitors that are not even using Twitter. Not having a Twitter account doesn’t prevent people from tweeting about you. Experiment by searching for trade marks, company names or government agencies.
Search is great, but it doesn’t give you a full insight into a competitor’s use of Twitter. They might be sending great deals to selected followers via direct messages, for example. To find out, create an account and become their customer. Do not impersonate someone or indulge in deceptive conduct. Instead, make some genuine enquiries. Carry out a few purchases. Assess their response.
Some questions to think about:
Your company needs to make the ethical decision about how far to take this strategy. It’s plausible that you could create accounts in many target demographics to interact with everyone in your industry. This is ultimately very close to breaching Twitter’s rules. However, as long as you are being genuine, it is highly likely that you will be able to gain large quantities of information about how your industry interacts with new media.
One of the strangest bugs I’ve had to deal with was ensuring that content was “likeable” and “shareable” on Facebook. The Ushahidi platform is an amazing tool for sourcing data from the crowd in crisis situations. However, when we deployed it for eq.org.nz, there was a problem. When someone tried to share the link on Facebook, the blurb looked something like this:
The content of this site is licensed under a Creative Commons Attribution 3.0 New Zealand License. To satisfy attribution requirements, include “Source: http://eq.org.nz” in your derived content. This website is provided as is, and your use of it is exclusively at your own risk. …
Fairly uninviting for anyone discovering us via social media.
In order to get around this, we actually introduced a paragraph tag that wouldn’t be rendered by browsers, but would be picked up by Facebook’s crawler. Our HTML source looks like this:
<!-- For 'share' on Facebook links -->
<p class="hidden">
The community-driven situation map of the
Christchurch Earthquake
</p>
As it turns out, like many things in life, there’s probably a better way to do it.
Sidebar: graph means a mathematical graph, not a visual one. Unless you studied mathematics or computer science at university, you may not have a clue what’s intended here. In mathematics, graphs represent networks: things, and the connections between them. The upshot is that computers can understand the relationships between all the bits of content on the web.
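If that still sounds abstract, here is a toy graph in code. The pages and links below are invented for illustration: each node is a page, and the edges record which pages it links to, which is exactly the kind of relationship a computer can follow.

```python
# A tiny "graph" in the mathematical sense: pages as nodes, links as edges.
links = {
    "imdb.com/title/tt0117500": {
        "en.wikipedia.org/wiki/The_Rock_(film)",
    },
    "en.wikipedia.org/wiki/The_Rock_(film)": {
        "imdb.com/title/tt0117500",
        "en.wikipedia.org/wiki/Sean_Connery",
    },
}

def neighbours(page):
    """Pages reachable in one hop: the relationships the graph encodes."""
    return links.get(page, set())
```

Facebook’s “open graph” is this idea writ large, with people and content as the nodes.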
The Open Graph Protocol is a way to add metadata (data about your content) to your web pages so that Facebook can understand them. It uses the <meta> tag within HTML. Let’s look through an example from the Internet Movie Database:
<html xmlns:og="http://ogp.me/ns#">
<head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
<meta property="og:image" content="http://ia.media-imdb.com/images/rock.jpg" />
...
</head>
...
</html>
Uh… what? Let’s look over things line by line.
<html xmlns:og="http://ogp.me/ns#">
This tells the computer that the HTML uses the Open Graph Protocol. You can expand the pieces out: xmlns means XML namespace, og means Open Graph, and the URL is just an official reference to the protocol itself. It’s actually a dead link, but it means something to computers (e.g. Facebook’s crawler) that know about the protocol.
<head>
<title>The Rock (1996)</title>
This is actually an example of what we want to avoid. We have the title, “The Rock”, and information about the movie (its release year) mashed together. This looks ugly when shared through social media. With decent metadata, we make <title> refer just to the title. This makes for a much more pleasant browsing experience:
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
<meta property="og:image" content="http://ia.media-imdb.com/images/rock.jpg" />
This is the meat of the example. Here, we describe to Facebook what to put on its pages. We have a link to the right URL, not some crazy long one. We have a link to the right thumbnail. We have the title in isolation. We’re also telling Facebook that we’re referring to a movie, which might make searching easier.
However, we still haven’t solved our problem. We need to add one final piece:
<meta property="og:description"
content="A renegade general and his group of
U.S. Marines take over Alcatraz
and threaten San Francisco Bay
with biological weapons." />
Success! Once you’ve added the og:description, Facebook will use it when your page is shared. Sweet.
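To see how a crawler like Facebook’s might read these tags, here is a short sketch using Python’s standard-library HTML parser. The HTML fragment is a trimmed version of the example above, and the class name is my own invention.

```python
from html.parser import HTMLParser

class OpenGraphParser(HTMLParser):
    """Collects og:* properties from <meta> tags, much as a crawler would."""
    def __init__(self):
        super().__init__()
        self.og = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property", "")
        if prop.startswith("og:"):
            self.og[prop] = attrs.get("content")

html_source = """<html><head>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:description" content="A renegade general..." />
</head></html>"""

parser = OpenGraphParser()
parser.feed(html_source)
print(parser.og)
```

The result is a plain dictionary of properties, which is all the crawler really needs to build a share blurb.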
Yes. There have been other efforts in this area. The most common are known as [Dublin Core](http://en.wikipedia.org/wiki/Dublin_Core) and [FOAF](http://en.wikipedia.org/wiki/FOAF). However, using multiple standards on one website is complex. You can see the result of meshing these standards together on slide 16 of this presentation.
The Open Graph Protocol’s website provides a great overview of the features. You can learn how to tell Facebook your page’s location, how to describe your page as a product page, or how to add contact information. There are client libraries for several programming languages and plugins for tools such as WordPress.
Open Graph Protocol will follow a similar path as KML has in the geospatial field. When Google acquired Keyhole, there were several competing standards. With Google’s dominant market position, it was able to establish KML as the de facto standard. It is a simpler standard than competing alternatives. But, more importantly, everyone in the geospatial field needed to be able to talk to Google Earth and Google Maps. That’s where the consumers were.
Open Graph Protocol will do a similar thing. There are other ways to add metadata to web pages; Dublin Core and FOAF have a long tradition in the semantic web community. However, Facebook can exert its market dominance to drive adoption of this simpler standard. Businesses finally have a reason to get involved with semantic content. That’s where the consumers are.
This post looks broadly at why people are telling you that the cloud is the future. I want to clarify things, because I feel there’s a lot of confusion in this area. I can’t speak for others, but when I first heard the term “cloud computing”, I wondered whether it was some form of euphemism.
There are two senses of the term that are used most extensively. First, there is the sense that consumers are being taught: applications that are delivered via the web. The second sense is more technical. It refers to how companies purchase computing power. The conflict between these two senses is where most of the confusion derives from. When people are first introduced to the term, they are taught the first meaning. Then they prod deeper, only to be perplexed by technicians using the second meaning.
The first meaning is an advertising slogan. Here, cloud is intended to mean easy, light and free from installation. There are a few benefits to this for users. They never get interrupted with requests to update their software, and they can often run the application from many devices or computers running different operating systems. Every time they run the application, it is the most recent version. All the operating system needs is a web browser, and those are plentiful.
The second meaning is what makes IT managers smile. See, applications have been delivered over the web for a long time. But, they wouldn’t qualify as cloud computing in this sense. That’s because the companies running the applications would have been purchasing computers and hosting everything themselves.
In order to get something onto the web, you need to have a server somewhere that’s hosting the application. This is fairly simple when you’re a startup, because you have negligible traffic. But what happens when you grow? As you expand, you may need to have servers in multiple geographic locations. But how many? If you get very big, very fast, your servers might simply break. But if your startup grows slowly, you’ll end up paying for capacity that you didn’t need. The cloud makes those problems far easier to deal with.
Blame Christmas shopping. According to the urban legend, Amazon had computers idling most of the year because it only needed them for the Christmas rush. In order to service all of those orders, it needed to buy lots of extra hardware. Outside of December, this hardware would sit idle. Amazon thought about ways to provide this idle capacity to the world.
I also blame some things that I haven’t got evidence for, but which seem sensible.
Internet browsers
They’re much better than they were. Most importantly, they contain far faster JavaScript interpreters, which means that applications are much more responsive.
Corporate security policies
It’s very hard to get a new software package into the enterprise. Security and procurement policies are established to prevent new products from messing with the current setup. Corporate ICT contracts tend to cover a set range of software, and adding anything new means adding a new item to the support contract. This can be conveniently avoided by software delivered via the web.
Speed of execution
Lots of tiny companies have introduced products into the market very cheaply. It’s now much cheaper to run a startup than during the 1999 tech boom. They have been able to bypass traditional software distribution models and are run very efficiently. This creates a sense of panic in larger vendors, who are desperate to remove any advantages that smaller players exploit.
Peer pressure
Google, leading with Gmail, has demonstrated that it is possible to create very good web software that generates huge revenue. There has been a flood of new norms established since Wikipedia demonstrated that the web could be a force for good.
Web scale
Many web companies experience exponential growth. The problem is that this means provisioning new servers exponentially too. Dealing with the scale required to serve many millions of requests a month is pretty hard for a team that was used to having a few thousand users a few months ago. Sometimes, it’s just great to be able to offload the work of handling that growth to companies that build their own power stations to run their data centres.
Maturation of web frameworks
Over the last decade, a number of the tools used to build websites have matured. These range from libraries for dealing with databases, and databases themselves, through tools for sharing source code within teams, to conventions that tell you authoritatively where to put your images in your application directory. It all helps.
Agile software methodologies
Although it can look like a bit of a fad, agile has made a huge impression in the world of software development. If nothing else, it has produced a culture shift that allows companies to release products as ‘beta’.
Remember the two senses of the term cloud computing discussed above? The second sense is what’s most interesting here, because the benefits to software producers are far more immediate than those to software users.
Pay for what you use
Companies now only need to pay for what they use, not what they think they’re going to use. Resources are typically rented on hourly contracts. This is great for most web applications, because they have a cyclic demand curve. An internet dating website is likely to experience peaks during the evenings. By purchasing most of its computing power for those few hours of the day, it’s possible for the site to save lots of cash over the long run.
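The dating-site example can be made concrete with some back-of-envelope arithmetic. All the numbers below (server counts, peak hours, hourly price) are made-up assumptions purely for illustration.

```python
# Assumed figures for a site with an evening demand peak.
PEAK_HOURS_PER_DAY = 4        # the evening rush
BASELINE_SERVERS = 2          # always-on capacity
PEAK_SERVERS = 10             # capacity needed during the rush
PRICE_PER_SERVER_HOUR = 0.10  # assumed cloud rental rate

def yearly_cloud_cost():
    """Rent peak capacity only during peak hours, baseline otherwise."""
    peak = PEAK_SERVERS * PEAK_HOURS_PER_DAY
    off_peak = BASELINE_SERVERS * (24 - PEAK_HOURS_PER_DAY)
    return (peak + off_peak) * PRICE_PER_SERVER_HOUR * 365

def yearly_fixed_cost():
    """Own enough hardware for the peak and run it around the clock."""
    return PEAK_SERVERS * 24 * PRICE_PER_SERVER_HOUR * 365

print(yearly_cloud_cost(), yearly_fixed_cost())
```

Under these assumptions the hourly model costs roughly a third of running peak capacity all day, which is the whole appeal of the demand curve argument.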
Sharpened focus
Using cloud computing systems, software developers can specialise in developing software. They don’t need to worry about configuring servers, upgrades, temperature control, disaster recovery and more. All of this is taken care of for them.
Scalability
Cloud systems tend to be configured in a distributed manner. This means that resources can be allocated on demand. If your site experiences massive load, more resources can be quickly allocated to you.
Ease of distribution
Once upon a time, companies needed to sell software as if it were a physical product. Publishers needed to secure distributors in each market. Those distributors would need to find retailers to sell the product. Those products came packaged in boxes. Inside the boxes were CDs that contained software that would run on your computer. With the shift to cloud computing, none of that remains true.
In fact, the cost of distributing a company’s software is absorbed by the users. They pay for the bandwidth to use the service.
Ease of support
Software companies only need to support a single version of an application. This is a fairly big deal. Managing multiple versions of software is complex and frustrating. Not only do your programmers need to be fluent in multiple versions of the same thing, but so do your help desk staff, your documentation writers and your sales team. With applications delivered via the web, there is a single version for everyone. This is why it still makes sense for businesses to change the way things look or behave sometimes, even if there are gripes from some users.
As an interesting aside, even when it looks like there are multiple versions of software being sold, such as professional and educational editions, they’re probably the same thing under the hood. Often, there’s just an internal switch in the code that turns functionality off for the cheaper product line. It’s cheaper to arbitrarily weaken one piece of software than to maintain two products.
You’ll notice that web applications tend not to provide extra features in the premium version. Instead, they provide support for more users, more storage or a higher level of support. In this way, they get price discrimination without the complexity of maintaining multiple versions of the software.
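The “internal switch” idea is essentially a feature flag keyed on the edition you bought. A minimal sketch, with invented edition and feature names:

```python
# One codebase; editions differ only in which features are switched on.
FEATURES = {
    "educational": {"export"},
    "professional": {"export", "batch_processing", "priority_support"},
}

def can_use(edition, feature):
    """Return True if the customer's edition includes the feature."""
    return feature in FEATURES.get(edition, set())
```

Shipping one binary and flipping entries in a table like this is far cheaper than maintaining two genuinely different products.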
Me talking about http://eq.org.nz at #UPShake