<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[Alastair’s Place]]></title>
  <link href="https://alastairs-place.net/atom.xml" rel="self"/>
  <link href="https://alastairs-place.net/"/>
  <updated>2018-02-23T07:42:05+00:00</updated>
  <id>https://alastairs-place.net/</id>
  <author>
    <name><![CDATA[Alastair Houghton]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Aura]]></title>
    <link href="https://alastairs-place.net/blog/2018/02/21/aura/"/>
    <updated>2018-02-21T14:06:46+00:00</updated>
    <id>https://alastairs-place.net/blog/2018/02/21/aura</id>
    <content type="html"><![CDATA[<p>Having just written <a href="https://coriolis-systems.com/blog/2018/2/older-software">rather a sad piece on my company’s blog</a>,
I thought I’d cheer myself up a bit — and explain a few things that have been
hugely frustrating me over the past few years.</p>

<p>As you’ll know if you read the post I linked to above, iDefrag and iPartition
are very much in decline; they have been for some time, actually, and I’ve
been scratching around trying to work out what I’m going to do instead.  Well,
back in 2013, I thought I’d found a <em>great</em> new product that didn’t exist and
that I thought I had the skills to develop.</p>

<p>I was sat upstairs in the study in my old house, and my elderly Harman Kardon
SoundSticks (the original USB kind, not the newer analogue variety) died.
This made me sad: the SoundSticks sounded great, and although the built-in
speakers in my iMac weren’t <em>bad</em>, they didn’t sound half as good as decent
external speakers with a subwoofer.</p>

<p>It then occurred to me that I had a spare A/V receiver in the attic, along
with a full 5.1 speaker set that I wasn’t using.  It <em>also</em> occurred to me
that the headphone socket on my iMac was actually a mini-TOSLINK port, and
since the A/V receiver had optical inputs, I could hook my Mac up to it using
an optical fibre and — maybe — get surround sound!</p>

<p>Excited, I got all the hardware together, rearranged my workspace to fit
everything in, and turned it all on.  And got stereo.  Very nice sounding
stereo — <em>way</em> better than the SoundSticks, never mind the iMac’s built-in
speakers — but stereo nonetheless.</p>

<p>I was, of course, being naïve.  It isn’t possible to send 5.1-channel raw PCM
over a standard S/PDIF interface, so my Mac was doing the best it could and
sending two-channel stereo.</p>
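
<p>A rough back-of-the-envelope check makes the point.  The sketch below is
illustrative arithmetic only, not anything taken from Aura.  It assumes 48 kHz,
16-bit samples, treats the link as having room for two channels of such
samples, and uses 640 kbit/s for AC-3, which is the highest bit rate the format
defines.</p>

```python
SAMPLE_RATE_HZ = 48_000  # assumed sample rate
BITS_PER_SAMPLE = 16     # assumed sample size


def pcm_bit_rate(channels, sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Return the bit rate of uncompressed PCM audio in bits per second."""
    return channels * sample_rate_hz * bits


stereo_pcm = pcm_bit_rate(2)    # the two-channel budget of the link
surround_pcm = pcm_bit_rate(6)  # 5.1 means six discrete channels
ac3_max = 640_000               # highest AC-3 bit rate, bits per second

print("stereo PCM bit/s:", stereo_pcm)
print("5.1 PCM bit/s:", surround_pcm)
print("raw 5.1 fits:", stereo_pcm >= surround_pcm)
print("AC-3 fits:", stereo_pcm >= ac3_max)
```

<p>Under those assumptions raw 5.1 needs three times the stereo budget, while
a compressed AC-3 stream fits with room to spare.</p>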

<p>That was when I had my bright idea — I could write a Dolby Digital (aka AC-3)
encoder, that took 5.1 channel audio from Core Audio in my Mac, compressed it
in real time, and squirted it out over the optical interface.  I managed to
find the necessary specifications (not too hard, because AC-3 is part of
various other published standards), and started work.</p>

<p>I was, of course, aware that I’d have to license the AC-3 codec from Dolby
Laboratories, so I also started talking to them about that while I worked on
my encoder.</p>

<p>Well…</p>

<p>The folks from Dolby were very nice, and quite helpful, though it was clear
that they weren’t really set up for license applications for the kind of
product I wanted to make.  Months passed, and we were still talking to each
other; meanwhile I had the software side of things pretty much working.
Eventually, I was given a license agreement to look through, and that’s where
things really unravelled.</p>

<p>To explain: AC-3, like the competing DTS standard, is a non-optional part of
various other standards, including ATSC, the DVD and Blu-ray standards and so
on.  As a result, it is licensed under terms described as “Reasonable and
Non-Discriminatory” (aka RAND).</p>

<p>Now, that <em>sounds</em> great, right?  It means, surely, that the terms are
<em>reasonable</em> and that I, as a small software developer, will get the same
terms as (say) Sony.  Well, no, not quite.</p>

<p>What it <em>actually</em> means is that for the set of people who were expected to
want to license it when the license agreement was written, the terms are
reasonable, and that everyone gets the same license agreement (it doesn’t mean
that the same <em>terms</em> in that agreement necessarily apply).</p>

<p>There were two problems; the first was that the license agreement tried to
distinguish between “professional” use (i.e. content creation software and
hardware, which is typically very expensive) and “mass market” use
(i.e. people who make DVD players and the like), by charging different amounts
per unit depending on the volume of units shipped.  Sounds reasonable, right?
Well, yes, until some upstart comes along with the idea I’d had, expecting to
ship relatively small numbers at a relatively low cost.  I can’t be specific
about the licensing costs (they’re under NDA), but the numbers didn’t work.</p>

<p>The second problem was that the license fees increase every year with
U.S. inflation.  So <em>even if</em> I could just about stomach the initial per-unit
fee (and to do that I’d have had to charge a <em>lot</em> more than the $20 per
unit I had envisaged), in a few years’ time I’d simply have to stop selling
the product because it wouldn’t make economic sense.  And in the meantime,
Dolby Laboratories would see almost all of the profit from my work.</p>
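
<p>To see why an inflation-linked fee is a problem for a product sold at a
fixed price, here is a small sketch.  Every number in it is invented (the real
fees are under NDA and are not reproduced or hinted at here), and it assumes a
constant inflation rate and a price that never changes.</p>

```python
def fee_in_year(initial_fee, inflation_rate, year):
    """Per-unit license fee after `year` annual inflation-linked increases."""
    return initial_fee * (1 + inflation_rate) ** year


def first_unprofitable_year(price, initial_fee, inflation_rate, max_years=100):
    """First year in which the fee reaches the fixed price, or None."""
    for year in range(max_years + 1):
        if fee_in_year(initial_fee, inflation_rate, year) >= price:
            return year
    return None


# All three figures below are hypothetical, chosen only for illustration.
PRICE = 40.0        # fixed selling price per unit
INITIAL_FEE = 30.0  # made-up starting fee per unit
INFLATION = 0.03    # made-up constant annual inflation

for year in (0, 5, 10):
    margin = PRICE - fee_in_year(INITIAL_FEE, INFLATION, year)
    print(year, round(margin, 2))

print(first_unprofitable_year(PRICE, INITIAL_FEE, INFLATION))
```

<p>The margin shrinks every year, and once the fee overtakes the price the
only options are to raise the price or stop selling.</p>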

<p>At this point, I had a functioning piece of software, which worked really
nicely for me in my study, but I couldn’t even give it away because it
infringed Dolby’s patents, and I couldn’t license those patents because the
cost was prohibitive.  I asked Dolby if there was any way they could vary the
terms to make it work, and, to their credit, they did go away and think about
it, but eventually came back with the answer that they were unable to do so
because of their “Non-Discriminatory” obligation — they could only offer me
the same license they offered everyone else.</p>

<p>I managed to salvage <em>some</em> of the work I’d done — notably the new image-based
licensing system, which was included in newer versions of iDefrag and
iPartition — but most of it languished on my disk.  It had taken me about a
year’s work to get to this point.</p>

<p>I was upset.  Now, I’d made a mistake in that I’d worked on the product before
finalising the licensing — but then if it had worked out, that would
absolutely have been the right choice, as I’d have been in a position to
release it the moment the license was signed.</p>

<p>In retrospect, I should perhaps have realised that this problem existed; I had
heard that some PC sound card vendors had AC-3 encoding support in their
hardware, and that they had started trying to charge their customers extra to
enable it.</p>

<p>Anyway, fast-forward to 2018.  Sales of iDefrag and iPartition are falling
away, and it’s getting to the point where I can’t pare my company back much
more without actually shutting it down.  And that’s what I was considering
doing, as recently as two weeks ago; I’d intended to get to the end of this
financial year (31st March) and then close down.  It was looking like the end
of my 14 year run at working for myself — as I have a family to support now,
not to mention a wife doing a Master’s degree, there didn’t seem much
option.</p>

<p>And then I saw a tweet.  Just a small thing, noting that AC-3 was no longer
“patent encumbered”.  My heart leapt.  Sure enough, I found evidence that the
core AC-3 patents expired on the 20th of March 2017.  I could ship!</p>

<p>And so, finally, <a href="https://coriolis-systems.com/Aura/">Aura</a> was released,
today, some <em>four years</em> after I had something I could have shipped, but was
stopped by “Reasonable and Non-Discriminatory” licensing from doing so.</p>

<p>It’s been quite a journey, this one.</p>

<h2 id="update">Update</h2>

<p>Hah.  Apparently Apple has quietly phased out the optical outputs on its newer
models (anything made in 2016 and later, by the look of things).  Figures.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Paradise Papers]]></title>
    <link href="https://alastairs-place.net/blog/2017/11/07/the-paradise-papers/"/>
    <updated>2017-11-07T08:34:13+00:00</updated>
    <id>https://alastairs-place.net/blog/2017/11/07/the-paradise-papers</id>
    <content type="html"><![CDATA[<p>There’s currently a huge stink about the so-called
“<a href="http://www.bbc.co.uk/news/paradisepapers">Paradise Papers</a>” —
basically, a leak from a law firm called Appleby that specialises in
“offshore” activities, detailing the shenanigans of many of its clients.</p>

<p>Understandably, today, there’s a lot of anger in social media, particularly on
the left, about the tax evasion and tax avoidance that the BBC’s
<a href="http://www.bbc.co.uk/programmes/b006t14n">Panorama</a> current affairs programme
detailed in its latest episode.  The thing is that a lot of the anger is
misplaced, and, to be perfectly blunt about it, a lot of people are being
manipulated by politicians greedy for more government cash into thinking that
this is about schools or hospitals closing versus tax revenue being
collected.  Those same politicians don’t mention, of course, that they spend
taxpayers’ money on guns, bombs, five star hotels, art for their offices,
expensive office furniture and so on.  I’m not necessarily opposed to those
things, I might add, but they’re a lot less popular with the public than
schools and hospitals, and so if you put yourself into the mindset of a
politician who would like extra money to spend on his or her personal
priorities (which may or may not include a fancy office chair, for instance;
or extra “trade envoys” so their chums can enjoy a few foreign junkets at
taxpayers’ expense), you’re always going to play the schools and hospitals
card here.  You might, if you’re a political leftie, also go on about “cuts”
to people’s benefits: regardless of whether or not the current government
<em>has</em> cut benefits, <em>some</em> people are likely to be receiving less than they
were for whatever reason (rule changes, changes in personal circumstances,
etcetera), and those people will suck up your argument, even if it is
basically untrue.</p>

<p>Another fact that isn’t often mentioned, unless you talk to an economist
anyway, is that there’s an underlying assumption that the money would be
better used if paid in taxes to government.  A case in point here is Apple.  I
won’t for one moment claim that how Apple currently arranges its tax affairs
is anything short of outrageous (though I have a different view on it to the
one being loudly expressed across the Internet today, which I’ll go into
below), but in purely objective terms, the goal here should be to
maximise the benefit to citizens, across all measures.  So, for instance, one
could argue that Apple having a lot of cash benefits millions of people
because Apple spends a lot of that money on innovation, and that innovation
makes the lives of many millions of people, the world over, better.  Yes, it
may also enrich Apple shareholders, directors and employees, but I’d argue
that’s a much smaller effect.  On the other hand, if governments had
confiscated it as taxation, it’s quite unlikely they would choose to spend
it that way.  Yes, some of them might spend <em>some</em> of the money on schools and
hospitals, or alleviating homelessness or poverty, but let’s be honest here —
quite a bit of it would instead be spent on things the public isn’t so keen
on.</p>

<p>Note: I’m not trying to defend Apple here.  Apple can do that itself.  I’m
just trying to fill in some of the missing parts of this.</p>

<h2 id="on-corporate-taxation">On Corporate Taxation</h2>

<p>On the subject of Corporation Tax, which is what the fuss about Apple is all
about, and is also, actually, at the heart of some of the other tax avoidance
schemes that we’re talking about here, CT is just a bad tax, pure and simple.
The OECD even published a paper about taxation at one point showing that
Corporation Tax was the only widely deployed form of taxation that was
<em>negatively correlated with growth</em>.  It’s also too easy to manipulate,
because it’s based on profit calculations — that is, companies can reduce it
by <em>appearing</em> on their profit and loss sheets to have spent money or to have
lost money on their assets through depreciation or other kinds of loss.
Additional confusion is caused to the public by the fact that companies only
exist within the legal jurisdiction in which they were incorporated.  In a
very real sense, for instance, there is no such company as “Apple”.  Rather,
there is <em>Apple, Inc</em> (which is in the United States), <em>Apple Europe Limited</em>
(in the United Kingdom), <em>Apple Operations International</em> (Ireland), <em>Apple
Sales International</em> (Ireland), <em>Apple Distribution International</em> (Ireland),
as well as a host of other entities.  <em>All of them are separate companies</em>,
and therefore <em>separate legal entities</em>, though some may hold shares in others
and they likely share some directors too. The thing the public thinks of as
“Apple” is not, in a legal sense, real — but instead is projected by the
actions of a number of co-operating legal entities in various different
jurisdictions.  You might say this is a sleight of hand, but it’s how the
world works because it’s how the laws passed by our politicians work.</p>

<p>The upshot of this fact is that it’s possible for (for instance) <em>Apple, Inc</em>
to pay money to <em>Apple Operations International</em>, which doesn’t change the
amount of money “Apple” (the ephemeral thing the public thinks of) has, but
<em>does</em> change things for tax purposes because <em>Apple, Inc</em> pays Corporation
Tax at the United States rate (high by global standards), while <em>Apple
Operations International</em> is taxed in Ireland.  Of course, as the Paradise
Papers make clear, things are not quite that simple, and some other steps are
involved that reduce the Irish tax bill by, basically, paying money to another
company in Jersey.  But you get the idea.</p>
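
<p>The mechanics can be sketched in a few lines.  This is a deliberately
simplified model: the entity names are generic, the profit figures are
invented, and the rates are assumptions (35% for the US entity, the headline
US rate mentioned in this post, and 12.5% for the Irish one).  It ignores the
further steps the Paradise Papers describe.</p>

```python
def group_totals(profits, rates_percent):
    """Return (total pre-tax profit, total tax) for a group of entities."""
    total_profit = sum(profits.values())
    total_tax = sum(profits[name] * rates_percent[name] / 100 for name in profits)
    return total_profit, total_tax


def pay(profits, payer, payee, amount):
    """Return new profit figures after `payer` pays `amount` to `payee`."""
    updated = dict(profits)
    updated[payer] -= amount
    updated[payee] += amount
    return updated


# Assumed rates and invented profits, for illustration only.
rates_percent = {"US entity": 35, "Irish entity": 12.5}
before = {"US entity": 1000, "Irish entity": 0}
after = pay(before, "US entity", "Irish entity", 600)

print(group_totals(before, rates_percent))
print(group_totals(after, rates_percent))
```

<p>The group’s total pre-tax profit is identical in both cases; only the
total tax changes, because part of the profit is now taxed at the lower
rate.</p>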

<p>Now, you might say, as Donald Trump does, that this is all outrageous, that
Apple is a U.S. company, and that the billions it has “stashed away” outside
of the United States should have been taxable at 35% in the United States and
that Something Must Be Done.  Or, you might say, as some here in the United
Kingdom do, that Apple makes a lot of money here, but doesn’t seem to pay very
much tax here, and so some of those billions are “ours” in some sense.  But
that simply isn’t how the law works.</p>

<p>Nor, and I’m going to be controversial here, is that how we should <em>want</em> it
to work.  Let me take another example.  Imagine you operate a delivery and
warehousing company in the United Kingdom, whereby overseas suppliers can pay
you to stock goods for them and deliver them to addresses in the United
Kingdom.  Now imagine that there is a website that sells goods of all types,
all shapes and all sizes, and that that website does so in the United Kingdom
by hiring your delivery company to hold stock and deliver goods; in other
territories they do much the same thing, but with different delivery and
warehousing companies.  Clearly both the delivery company and the website
company will be able to calculate a profit figure (essentially sales minus
costs), and so the delivery company will pay Corporation Tax at the UK rate
on its profit, while the website company will pay tax on its own profit at
some other rate, depending on where it is incorporated.  Now, let’s say the website company
can choose where it incorporates — after all, it’s a website and the Internet
is everywhere.  So let’s pick somewhere with low tax rates.  Luxembourg, say.</p>

<p>Now, the delivery and warehousing company is free to charge whatever it
pleases.  Obviously if it goes too high, it will lose the website company’s
business, and if it goes too low, it will lose money and eventually go out of
business, so there are limits, both low and high.  Moreover, the high limit,
according to standard economic theory, will tend towards the low limit as the
level of competition increases — assuming perfect competition, the delivery
and warehousing company will be making a profit of £0 on its operations.</p>

<p>All of this is fine and dandy, and nobody would question the right of the
owners of the distribution company to operate at the lowest possible cost and
even to make no profit at all if that’s what they wish to do.  It’s their
company, and there is no innate requirement that a company run at a profit
(they could even choose to fund its operations by constantly shoveling money
at it, though that strategy can’t last indefinitely as the owners will
eventually run out).</p>

<p>I’m sure some people can see where I’m going with the situation I just
described.  So let’s cut to the chase.  Let’s call the UK company “Amazon UK
Limited”, and let’s call the website company “Amazon EU SARL”.  Just, you
know, for the sake of argument.  And since corporate ownership is rarely
straightforward, let’s say the two companies share some directors and
shareholders.  We might even imagine (though this <em>isn’t</em> how it happened in
Amazon’s case) that the website company might eventually decide to simply buy
the distribution and warehousing company (in which case, the two separate
companies still exist — it’s just that one owns the other).  <em>Why</em> does this
make it unreasonable for the distribution and warehousing company to run at
zero profit all of a sudden?  It was fine <em>before</em> I gave them both similar
sounding names, and before they had shared directors/shareholders.  Why is it
suddenly <em>not</em> OK now?</p>

<p>“Nobody would run a business for zero profit”, I hear you say.  Are you sure?
What about a family business where the owners are employed by the business?
Running at zero profit in that case might make sense — subject to tax
legislation not making it much worse to pay salary rather than dividend.</p>

<p>The fact is that “multinational” companies are largely a fiction — legally
speaking, they are really groups of national companies that happen to
co-operate for whatever reason, often but probably not always because they
have the same (or overlapping) ownership or directors.  Each of these separate
legal entities is separately taxable, in the jurisdiction in which it exists,
on its profits, and there’s little you can do to prevent them from paying one
another for services, intellectual property licensing and so on, thereby
reducing the profit in one jurisdiction and increasing it in another.  There
are <em>some</em> rules governing payments between companies with shared ownership,
so for instance you can’t have company A sell parts to company B at hugely
inflated prices in order to reduce profits at company B and increase them at
company A, but of course it’s very difficult to prove a value for intellectual
property, especially things like brand names, so this is something of a losing
battle for tax authorities as long as corporations’ accountants are on the
ball.</p>

<p>A final nail in the coffin for Corporation Tax is that while the intent is
that it should fall on the <em>owners</em> of corporations, in practice some fraction
is, for understandable reasons, borne by their customers and employees
instead, in the form of higher prices and lower wages respectively.</p>

<p>What should we do instead?  Well, CT is a non-starter.  It doesn’t work in a
globalised world, it isn’t an efficient tax, it’s poorly understood by the
public (which causes resentment when they hear that e.g. Amazon isn’t “paying
its fair share”), it forces governments to get involved in, and to legislate
about, the calculation of corporations’ profits, and it is just, in general,
not a good idea.</p>

<p>But let’s think for a moment; the goal here was to impose a tax on the owners
of the corporation.  How do those owners benefit from a corporation’s profits?
Well, in two ways:</p>

<ol>
  <li>
    <p>Through appreciation in the value of their shares.</p>
  </li>
  <li>
    <p>Through dividend payments and other distributions.</p>
  </li>
</ol>

<p>In the former case, we tax the rise in value when they sell the shares.
Currently in the United Kingdom, this would be covered by Capital Gains Tax,
which is levied at a lower rate than Income Tax, which may or may not be
desirable (it’s notionally to encourage investment in businesses).  We might
consider instead taxing at Income Tax rates, but taking into account inflation
when calculating the taxable gain, if any.</p>
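
<p>As a sketch of what taking inflation into account could look like: uplift
the purchase price by the change in a price index, and tax only what is left.
The index values, prices and the 35% rate below are all hypothetical, and the
function names are mine rather than anything from tax legislation.</p>

```python
def taxable_gain(purchase_price, sale_price, index_at_purchase, index_at_sale):
    """Gain after uplifting the purchase price by inflation; never negative."""
    indexed_cost = purchase_price * index_at_sale / index_at_purchase
    return max(0, sale_price - indexed_cost)


def tax_due(gain, rate_percent):
    """Tax on a gain at a flat percentage rate."""
    return gain * rate_percent / 100


# Hypothetical figures: prices rose 20% between purchase and sale.
gain = taxable_gain(10_000, 15_000, index_at_purchase=100, index_at_sale=120)
print("taxable gain:", gain)
print("tax at 35%:", tax_due(gain, 35))
print("tax on the nominal gain:", tax_due(15_000 - 10_000, 35))
```

<p>With these inputs only part of the nominal gain is taxed, because the rest
is just inflation.  If the sale price had not kept up with inflation, the
taxable gain would be zero.</p>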

<p>In the latter case, these are often taxed through Income Tax, though presently
here in the UK we have some special rules for dividends that make them a
little more tax efficient.  We could abolish those and instead tax them as
ordinary income — the original justification for the different treatment was
that the money had already been subject to Corporation Tax and that taxing it
twice was, essentially, double taxation.</p>

<p>The elephant in the room here is probably overseas distributions or capital
gains, and the solution there is quite straightforward: a withholding tax.
So, for instance, if a UK company pays a dividend to an entity (a person or a
company) that is outside of the United Kingdom, it should apply
income tax at the full rate to that money <em>at source</em>.  That tax could then be
claimed back by the foreign entity if it can show that it has been taxed on
the money.  There are variations we could consider (for instance, perhaps at most
the amount paid in tax in the foreign jurisdiction could be claimed back, even
if the UK payment was higher), but the central idea is that you make it
impossible to take the money out without paying some tax on it.</p>
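
<p>Here is a minimal sketch of that withholding idea, covering both the basic
version and the capped variation.  The 40% rate and the amounts are
hypothetical, and the function is an illustration of the proposal rather than
a description of any existing rule.</p>

```python
def withholding(dividend, uk_rate_percent, foreign_tax_paid, cap_refund=False):
    """Return (withheld at source, refund, tax the UK keeps)."""
    withheld = dividend * uk_rate_percent / 100
    if foreign_tax_paid > 0 and cap_refund:
        # Variation: refund at most what was actually paid abroad.
        refund = min(withheld, foreign_tax_paid)
    elif foreign_tax_paid > 0:
        # Basic version: proof of tax abroad refunds the whole amount.
        refund = withheld
    else:
        # Nothing was taxed abroad, so nothing can be claimed back.
        refund = 0
    return withheld, refund, withheld - refund


# Hypothetical dividend of 1000 with a 40% rate applied at source.
print(withholding(1000, 40, foreign_tax_paid=100))
print(withholding(1000, 40, foreign_tax_paid=100, cap_refund=True))
print(withholding(1000, 40, foreign_tax_paid=0))
```

<p>In every case some tax is paid somewhere: either abroad, or in the UK, or
a mixture of the two.</p>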

<p>TL;DR:</p>

<ul>
  <li>
    <p>Corporation Tax is negatively correlated with growth.</p>
  </li>
  <li>
    <p>Corporation Tax <em>should</em> fall on the owners (typically shareholders) of
corporations, but in practice is partially borne by customers and
employees.</p>
  </li>
  <li>
    <p>Corporation Tax is based on profits, and as such is easy to manipulate,
particularly for “multinationals”.</p>
  </li>
  <li>
    <p>To try to prevent manipulation, the rules have become increasingly complex
in many jurisdictions — for instance, banning “depreciation” (here in the
UK we have “capital allowances” instead), attempting to regulate the prices
of “intra-company” transfers (i.e. sales between related legal entities) to
prevent “profit-shifting” and so on.  This complexity is good for
accountants and lawyers, but it’s very bad for smaller businesses (and
therefore bad for competition, and thereby consumers) and makes it more
likely that loopholes are inadvertently introduced.</p>
  </li>
  <li>
    <p>The notion that businesses will “play fair” is a nonsense.  Big businesses,
in particular, have every incentive not to; they have adequate funds and
staff to challenge the tax authorities, and the sums involved can be
colossal.  They can also afford to employ the best and brightest — wages in
public service are typically more restricted, and worse, if someone <em>is</em>
really good, business may eventually poach them.  On the other hand,
small businesses are more likely to play fair because they don’t have those
resources and want to concentrate on their business, so treating everyone
the same way is likely to be advantageous to big business (whether you’re
going to come down hard on everyone or not).</p>
  </li>
</ul>

<h2 id="on-personal-taxation">On Personal Taxation</h2>

<p>It is often asserted that it is “immoral” not to pay your “fair share” of
tax.  But what <em>is</em> your fair share?  Let’s think about that for a moment.</p>

<p>Let’s start by assuming that we all agree that everyone should pay <em>some</em>
tax.  Not everyone thinks that, and there are details like whether the very
poor should pay any tax at all, but we’ll take it as given that, in principle,
we should pay <em>some</em> tax.</p>

<p>So, how much tax <em>should</em> you pay?  Should it be based on your income?  Or on
your wealth?  What if your income varies substantially from year to year?  Is
it fair to tax someone who earns £70,000 one year, but will only earn £20,000
the next, in the same way as someone who earns £70,000 every year?  What if
you have no income but are very wealthy?  How does that change if your wealth
is illiquid (maybe you own a large country house, or even a small house
somewhere expensive like London)?  And when we decide <em>what</em> to tax, at what
rate should we set the tax?  13%?  35%?  50%?  Higher?  Should the tax rate
increase (or decrease) with the overall amount?  Why?  Should anyone be exempt
from tax for any reason?  Should there be an amount you can earn or hold
<em>before</em> you start having to pay tax?  How should this interact with things
like benefits or indeed voting rights?</p>

<p>My goal here is to make you think.  This isn’t simple.  It isn’t like the
question of whether you should cheat on your wife (you shouldn’t, in case
you’re wondering).</p>

<p>So what we’ve chosen to do, at least in most countries, is as follows: we
elect people, who form a government.  They, in conjunction with some kind of
assembly, debate the matter and come up with a set of rules, which they pass
as legislation.  The legislation answers the above questions, at least as far
as that country is concerned.</p>

<p>That is, the amount of tax you should pay is defined <em>by the law</em>.  If the law
says you should only pay £1 in tax, then that is what you should pay.  There
is no “moral” case for paying more than that amount.  Paying less than that
amount is <em>evasion</em>, and that is both illegal and wrong — because it’s unfair
that everyone else has to comply with the law and you don’t.</p>

<p>“But tax avoiders…” I hear you say.  Well, what <em>is</em> tax avoidance?  Tax
avoidance is really just where you notice that the law says you could pay less
tax than someone else thinks you should.  Note: it isn’t paying less tax than
you owe; it’s really just where the amount of tax you owe is surprising for
some reason.  Now, <em>aggressive</em> tax avoidance can involve doing all kinds of
things that you wouldn’t ordinarily have done (typically this involves
companies owning things that you would normally have owned yourself; loans
being made where none were necessary; low tax rate investments like pensions
investing in assets you sell to them, and so on), solely to leave you in a
position where the law says you owe little or nothing in tax.  The UK, in
common with other jurisdictions, has passed legislation to prevent that,
namely the <em>General Anti-Abuse Rule</em> (or GAAR).</p>

<p>I’m in two minds about GAAR.  On the one hand, I’m not really a fan of
aggressive tax avoidance; yes, it’s legal, but I think where it’s obvious that
you’re using the law in a manner different to that which Parliament intended,
there’s an ethical problem with that.  I won’t do it, sometimes even in cases
where my accountant is convinced I should.  And I’ve been offered avoidance
schemes, which I’ve turned down.  On the other hand, it essentially amounts to
allowing the tax authorities to decide that the law doesn’t matter, and what
does matter is their view of how much you should pay.  I’m no fan of that
either, and while there are checks and balances in place, my view is simple:
<em>it’s up to Parliament to get the law right in the first place</em>.</p>

<p>Much of the aggressive avoidance is caused by Parliament complicating the tax
system for political reasons.  There are many examples; my favourite recent
example was the legislation passed to make the UK an attractive place to make
films.  This was intentionally designed to provide tax breaks for investors,
but many of the various vehicles that accountants and tax planners constructed
to take advantage of the tax break have fallen foul of GAAR, it seems because
someone didn’t realise the massive tax advantage they’d handed out.  Yes,
there were loans involved — but honestly, that’s quite normal — if you know
you’re going to make a profit on your investment, you might well take out a
loan in order to make a bigger investment than you could out of your own
capital.  The problem here was that in doing that, the investors were entitled
to a much larger tax break, and could write off in some cases very large
amounts of tax.  I don’t think you can really argue that this wasn’t what
Parliament intended; if it didn’t intend that, then those responsible for the
legislation were spectacularly inept.  And, I might add, an unfortunate
consequence is that there were many much less wealthy people who became
involved and for whom the tax consequences are dire, because the penalties
being imposed are based on the total investment, including the money they
borrowed, which was often many times the amount they put in themselves.  Nor
was this “wheeze” failing to result in films — the vehicles in question were
responsible for a number of blockbusters, so the tax break certainly
encouraged precisely what the government wanted it to.  Basically, I’m no fan
of Jimmy Carr, and I wouldn’t have invested in the scheme he did, but I’m a
little uneasy about saying that people who did were doing anything other than
what Parliament intended.</p>

<p>But even simple things like deliberately imposing high marginal rates on
“wealthier” individuals create the scope for avoidance.  Why?  Because you’ve
increased the value to those individuals of not paying that money in tax.
That can make it worthwhile to do something unusual that wouldn’t ordinarily
be viable because of the extra cost — like paying yourself
through a limited company (as, it turned out, even civil servants and BBC
staff were doing).</p>

<p>So what <em>should</em> we do?  Well, the tax system needs to be fair, but we need to
recognise that it’s just as unfair to confiscate very large portions of a rich
person’s income as it is to do the same to a poorer person, but that, unlike
poorer folk, the very rich are in a position to do something about it.  Thus,
we should resist the “soak the rich” mentality of the hard left, while
nevertheless making sure that the matter of calculating the tax that is due is
as simple as possible so that there is little room for manoeuvre.  So we should
<em>also</em> resist the urge of some to craft all kinds of exemptions, special
schemes and complicated rules to encourage this and discourage that.
Simplification should be the order of the day.</p>

<p>Personally I’m in favour of a flat tax at, say, 35%, with a large personal
allowance to cover basic living costs, which can be combined with someone
else’s so that a household with one person earning £X pays the same in taxes
as a household with two people earning, between them, £X.  At the same time,
I’d abolish National Insurance, which is far too complicated and is
contributing to the business of people unnecessarily being paid through
companies, I’d get rid of Capital Gains Tax as a special case (but allow
inflation to be used to reduce gains, and the same on savings in the bank, so
that you don’t get taxed on inflation), and I’d get rid of the special
treatment of dividends too.  Similarly, VCTs, EIS, ISAs and all the other
complications, I’d probably look to abolish — I’d rather people invested
because they had money I hadn’t taken off them, instead of investing in order
to prevent me taking money away from them which is what those schemes do — and
at the same time I’d be looking carefully at the benefit system to see how it
could be reformed (I find the universal basic income to be an interesting
suggestion in this area, though I’m not sure how it would work in practice).</p>
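
<p>The effect of a combinable personal allowance is easy to show with a small
sketch.  The 35% rate is the one suggested above; the £12,000 allowance and
the incomes are hypothetical, and the <code>pooled</code> flag is just my way
of comparing the two designs.</p>

```python
def household_tax(incomes, allowance, rate_percent=35, pooled=True):
    """Flat tax for a household, given a list with each adult's income."""
    if pooled:
        # Allowances are combined and set against the household's total income.
        taxable = max(0, sum(incomes) - allowance * len(incomes))
    else:
        # Each adult can only use their own allowance against their own income.
        taxable = sum(max(0, income - allowance) for income in incomes)
    return taxable * rate_percent / 100


ALLOWANCE = 12_000  # hypothetical personal allowance

one_earner = household_tax([50_000, 0], ALLOWANCE)
two_earners = household_tax([25_000, 25_000], ALLOWANCE)
one_earner_unpooled = household_tax([50_000, 0], ALLOWANCE, pooled=False)

print("one earner, pooled:", one_earner)
print("two earners, pooled:", two_earners)
print("one earner, not pooled:", one_earner_unpooled)
```

<p>With pooling, both households pay the same.  Without it, the single-earner
household pays more on the same total income, because one allowance goes
unused.</p>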

<p>TL;DR:</p>

<ul>
  <li>
    <p>“Evasion” is where you don’t pay the tax the law says you should.  That’s
wrong, pure and simple.</p>
  </li>
  <li>
    <p>“Avoidance” is where the law says you need to pay an amount of tax that
somebody finds surprisingly low for some reason.  It isn’t illegal, and if
something is at fault, it’s the law.  You should focus your anger about
this on the politicians who get to choose what the law says, not on those
people who pay less tax than you think they might otherwise.</p>
  </li>
  <li>
    <p>Aggressive avoidance is already tackled in many places through a General
Anti-Abuse Rule (GAAR).  Such rules are problematic, though, because they
effectively allow a body to decide that the law itself doesn’t matter as
much as their opinion.  I’m not a huge fan of GAAR overall — I’d rather
the underlying law didn’t provide opportunities that create a need for it.</p>
  </li>
  <li>
    <p>If someone starts bleating about “schools and hospitals”, you’re being
manipulated.  Governments spend money on lots of things you probably don’t
approve of, in addition to the schools and hospitals everyone likes.  The
notion that tax avoidance or even evasion is responsible for school
closures is nonsense, even at the outside estimates of the amount of tax
that is avoided or evaded every year.</p>
  </li>
  <li>
    <p>It’s also worth being a little more sceptical about some of the more
outspoken voices you may hear in the media on this.  Margaret Hodge, for
instance, the current chair of the Public Accounts Committee, <a href="https://www.scribd.com/document/113892078/Priti-Patel-Letter">appears to have a very large holding in Stemcor</a>,
much of it held in trusts.  The facts and figures in that letter are
disputed — <a href="https://www.channel4.com/news/by/michael-crick/blogs/hodge-threatens-tory-mp-patel-with-libel-writ-over-family-shares">both Hodge and Stemcor used the word “libellous” in their response</a>
— but the fact is that Stemcor is owned by and controlled by the
Oppenheimer family, of which Hodge is a member, and there is certainly
<a href="https://www.channel4.com/news/by/michael-crick/blogs/a-roasting-for-starbucks-but-a-grilling-for-hodge">some evidence that it engages in the kind of tax planning of which Hodge has been so publicly critical</a>.</p>
  </li>
</ul>

<h2 id="on-vat">On VAT</h2>

<p>Panorama made a point of mentioning Lewis Hamilton’s jet, on which he
apparently hasn’t paid any VAT in spite of it allegedly being used for
personal as well as business purposes.  I’m sure he isn’t the only
one doing this, I might add — it’s just that Panorama singled him out.</p>

<p>Now, again, this is more complicated than it seems on the face of it.  If <em>I</em>
were Lewis or his accountant or lawyer, I’d probably point out that much of
what Lewis does, including posting pictures of himself on Instagram (or wherever)
having a very nice time apparently on holiday, could be construed as
“business”, in that Lewis Hamilton is, himself, a brand (in the same way as,
for instance, David Beckham).  That, and the itinerant nature of Formula 1,
makes it quite difficult to separate personal use of his aeroplane from
business use, and I imagine that’s what the Isle of Man’s tax officials had in
mind when they allowed him to pay no VAT on its import into the European
Union.  It’s also, as I understand it, not unusual to discount a small amount
of business use on some items, though the rules are very complicated and I
don’t know how it applies to aeroplanes in practice, and it may well be that
the Isle of Man got it wrong when they did this.</p>

<p>I’m not sure VAT, or VAT avoidance, is really in scope for this piece, as it
only formed a minor part of the information coming out from the Paradise Papers,
so I’m not going to talk more about it here.</p>

<h2 id="summary">Summary</h2>

<p>As always with these things, I’d say it’s a good idea to be more sceptical
about what you read or hear in the media.  Rather than blaming “the rich” or
“corporations” for the problem of tax avoidance, you should look to your
politicians.  They, not the rich and not the corporations, are responsible for
setting the law, and a lot of this is caused by baroque legislation made for
political purposes and a failure to grasp the nettle of tax simplification.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Dmgbuild Update]]></title>
    <link href="https://alastairs-place.net/blog/2017/04/28/dmgbuild-update/"/>
    <updated>2017-04-28T12:01:56+01:00</updated>
    <id>https://alastairs-place.net/blog/2017/04/28/dmgbuild-update</id>
    <content type="html"><![CDATA[<p>Users of my command line disk image building tool,
<a href="https://bitbucket.org/al45tair/dmgbuild">dmgbuild</a>, might be interested in
<a href="https://pypi.python.org/pypi/dmgbuild/1.3.0">the new version</a>, which now has
support for attaching licenses to the disk images it generates.</p>

<p>This turned out to be quite a chore; I didn’t want to use Rez, because it’s
deprecated, and the <code>hdiutil udifrez</code> and <code>hdiutil udifderez</code> options are,
well, not particularly well documented (not to mention asymmetric unless
you’re using the undocumented XML format).</p>

<p>Anyway, it turns out that the documentation on the legacy MacOS resource
fork format is in the book <a href="https://developer.apple.com/legacy/library/documentation/mac/pdf/MoreMacintoshToolbox.pdf">More Macintosh Toolbox</a>,
though that doesn’t actually define the format of the resources themselves
(for that you have to look elsewhere).  I haven’t split out the resource fork
parsing/generating code, unlike the Alias/Bookmark code, because this really
is the only place that the resource fork format is being used now.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[OMG Trump]]></title>
    <link href="https://alastairs-place.net/blog/2016/11/09/omg-trump/"/>
    <updated>2016-11-09T09:29:03+00:00</updated>
    <id>https://alastairs-place.net/blog/2016/11/09/omg-trump</id>
    <content type="html"><![CDATA[<p>So America has apparently taken leave of its senses and elected Donald Trump
as President of the United States, my Twitter timeline is full of liberal (in
the American sense) Americans decrying the state of the world, and my wife is
suggesting that maybe we should finish doing up our house and move to New
Zealand at the earliest opportunity.  I think she might actually be serious.</p>

<p>To my American friends, I say this: take a deep breath.  The fact is that
America electing Trump does not mean you’re surrounded by racists, misogynists
or homophobes; yes, those people doubtless voted Trump, but they won’t be a
majority of his voters any more than you’d imagine that people who voted for
Obama were black supremacists.</p>

<p>I would have preferred that you elect Hillary<sup id="fnref:surprise"><a href="#fn:surprise" class="footnote">1</a></sup>.</p>

<p>But <em>it will be OK with Trump</em>, however awful he is.  Your Constitution was
designed to check the power of the executive, and the bulk of the GOP, which
now holds a majority in both the House and the Senate, was not united behind
Trump.  He will face plenty of opposition in both places from Democrats
and Republicans alike, and I am sure, and I hope, that the House, the Senate
and the Supreme Court will do their best to make sure that Trump’s presidency
is not the disaster many fear it will be, either for the United States or for
the rest of the world.</p>

<div class="footnotes">
  <ol>
    <li id="fn:surprise">

      <p>Some people will find this surprising; I am definitely
right-wing, though I identify myself as a classical liberal, certainly not a
conservative in the traditional sense and I definitely disagree with many of
the things Trump has said during his campaign.  If I had my pick of the
candidates for the GOP nomination, I’d probably have picked Rand Paul, though
I don’t share all of his views either (notably we differ on abortion). <a href="#fnref:surprise" class="reversefootnote">&#8617;</a></p>
    </li>
  </ol>
</div>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Why You Should Learn About Algorithms]]></title>
    <link href="https://alastairs-place.net/blog/2016/10/13/why-you-should-learn-about-algorithms/"/>
    <updated>2016-10-13T19:01:41+01:00</updated>
    <id>https://alastairs-place.net/blog/2016/10/13/why-you-should-learn-about-algorithms</id>
    <content type="html"><![CDATA[<p>Last month, <a href="http://redqueencoder.com/">Janie Clayton</a> wrote
<a href="http://redqueencoder.com/how-not-to-hire-an-ios-developer/">a blog post about a particularly odd interview she had</a>.
A lot of what she writes is spot on - it <em>is</em> ridiculous interviewing for an
iOS developer and expecting them to answer questions in Java, and it’s even
more ridiculous offering to allow someone to use a language you aren’t
comfortable with as an interviewer and then telling them that they can’t after
all because you don’t know it yourself!  Yes, all of that happened.
<a href="http://redqueencoder.com/how-not-to-hire-an-ios-developer/">Read the post here.</a></p>

<p>During this interview, Janie was asked to write a linked list; this is
probably the second simplest data structure after an array, and her response
to being asked about it was to tell the interviewer that she was</p>

<blockquote>
  <p>a hacker who learned programming by writing applications rather than
learning algorithms and data structures you only use to pass code interviews
at corporate entities</p>
</blockquote>

<p>and was slightly incensed when the interviewer responded</p>

<blockquote>
  <p>“Oh, so you’re not a programmer. You’re more of a management type.”</p>
</blockquote>

<p>I think one of the reasons Janie got a bit of pushback here
(which she talks about in <a href="http://redqueencoder.com/the-algorithms-of-discrimination/">her most recent blog post</a>)
is that while she’s right that it’s quite unlikely in run-of-the-mill
programming jobs that you’ll find yourself needing to implement a linked list,
the implication of her response is that this stuff is hard, that it needs a
great deal of learning, and that it will be a waste of her time.</p>

<p>None of that is true.</p>

<p>Put another way: there is a reason they teach this stuff in Computer Science
degrees.  (I <em>do</em> have a CS degree - well, Information Systems Engineering,
which included CS and Electronic Engineering - <em>but</em> I learned a lot of this
stuff on my own <em>before</em> starting my degree.)</p>

<h2 id="on-the-linked-list">On the Linked List</h2>

<p>Let’s deal with the linked list thing first.  Even if you know what one is,
the chances are very good that it’s the wrong data structure to use.  On
modern microprocessors, in 99% of cases cache locality is more important than
being able to manipulate lists using pointers, so you should use an array
instead.  Or a <code>CFArray</code>.  Or a Python <code>list</code>.  Or a C++ <code>std::vector</code>.</p>

<p>If I ever interview you and ask you about a linked list, it’s because you said
you had a CS degree and quite probably you failed to answer a question about a
more sophisticated data structure.  Either that, or I’m going to get you to
reason about it somehow and the list itself isn’t really what the question is
about; in that case, if you hadn’t studied CS and said you didn’t know what one
was, I’d show you, because the point wasn’t the list, right?  (If you
<em>did</em> study CS and don’t know what a linked list is, you
just failed the interview; regardless of whether you’ve ever used one or not
in a real program, you were taught about it and you really should know.)</p>

<p>For the benefit of those who <em>don’t</em> know what a linked list is, imagine you
want to store the integers 2, 4, 6, 8, 10.  You could use an array</p>

<div class="graphviz-wrapper">

<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: array Pages: 1 -->
<svg role="img" aria-label="array" width="130pt" height="45pt" viewBox="0.00 0.00 130.00 45.00">
<title>array</title>
<desc>digraph &quot;array&quot; { 
bgcolor=&quot;#f8f8f8&quot;;
node [shape=record];
array1 [label=&quot;2|4|6|8|10&quot;]
 }</desc>

<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 41)">
<title>array</title>
<polygon fill="#f8f8f8" stroke="none" points="-4,4 -4,-41 126,-41 126,4 -4,4" />
<!-- array1 -->
<g id="node1" class="node"><title>array1</title>
<polygon fill="none" stroke="black" points="0,-0.5 0,-36.5 122,-36.5 122,-0.5 0,-0.5" />
<text text-anchor="middle" x="11.5" y="-14.3" font-family="Times,serif" font-size="14.00">2</text>
<polyline fill="none" stroke="black" points="23,-0.5 23,-36.5 " />
<text text-anchor="middle" x="34.5" y="-14.3" font-family="Times,serif" font-size="14.00">4</text>
<polyline fill="none" stroke="black" points="46,-0.5 46,-36.5 " />
<text text-anchor="middle" x="57.5" y="-14.3" font-family="Times,serif" font-size="14.00">6</text>
<polyline fill="none" stroke="black" points="69,-0.5 69,-36.5 " />
<text text-anchor="middle" x="80.5" y="-14.3" font-family="Times,serif" font-size="14.00">8</text>
<polyline fill="none" stroke="black" points="92,-0.5 92,-36.5 " />
<text text-anchor="middle" x="107" y="-14.3" font-family="Times,serif" font-size="14.00">10</text>
</g>
</g>
</svg>
</div>

<p>but if you want to insert, say, 7 into the array, you’ll have to resize it
and copy data around.  <em>On modern architectures, in most cases, that’s
actually the right way to implement this</em>, but on older systems, on the less
powerful hardware used in embedded systems, or in certain special cases you
might instead choose to store the numbers like this:</p>

<div class="graphviz-wrapper">

<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: list Pages: 1 -->
<svg role="img" aria-label="list" width="519pt" height="45pt" viewBox="0.00 0.00 518.78 45.00">
<title>list</title>
<desc>digraph &quot;list&quot; { 
bgcolor=&quot;#f8f8f8&quot;;
rankdir=LR;
node [shape=record];
head [shape=plaintext,label=&quot;head&quot;];
e2 [label=&quot;{2|&lt;next&gt;}&quot;];
e4 [label=&quot;{4|&lt;next&gt;}&quot;];
e6 [label=&quot;{6|&lt;next&gt;}&quot;];
e8 [label=&quot;{8|&lt;next&gt;}&quot;];
e10 [label=&quot;{10|&lt;next&gt;nil}&quot;];
head -&gt; e2:w;
e2:next -&gt; e4:w;
e4:next -&gt; e6:w;
e6:next -&gt; e8:w;
e8:next -&gt; e10:w;
 }</desc>

<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 41)">
<title>list</title>
<polygon fill="#f8f8f8" stroke="none" points="-4,4 -4,-41 514.779,-41 514.779,4 -4,4" />
<!-- head -->
<g id="node1" class="node"><title>head</title>
<text text-anchor="middle" x="27" y="-14.3" font-family="Times,serif" font-size="14.00">head</text>
</g>
<!-- e2 -->
<g id="node2" class="node"><title>e2</title>
<polygon fill="none" stroke="black" points="90,-0.5 90,-36.5 144,-36.5 144,-0.5 90,-0.5" />
<text text-anchor="middle" x="104" y="-14.3" font-family="Times,serif" font-size="14.00">2</text>
<polyline fill="none" stroke="black" points="118,-0.5 118,-36.5 " />
<text text-anchor="middle" x="130.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- head&#45;&gt;e2 -->
<g id="edge1" class="edge"><title>head&#45;&gt;e2:w</title>
<path fill="none" stroke="black" d="M54.2459,-18.5C62.1809,-18.5 71.1188,-18.5 79.8752,-18.5" />
<polygon fill="black" stroke="black" points="80,-22.0001 90,-18.5 80,-15.0001 80,-22.0001" />
</g>
<!-- e4 -->
<g id="node3" class="node"><title>e4</title>
<polygon fill="none" stroke="black" points="180,-0.5 180,-36.5 234,-36.5 234,-0.5 180,-0.5" />
<text text-anchor="middle" x="194" y="-14.3" font-family="Times,serif" font-size="14.00">4</text>
<polyline fill="none" stroke="black" points="208,-0.5 208,-36.5 " />
<text text-anchor="middle" x="220.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e2&#45;&gt;e4 -->
<g id="edge2" class="edge"><title>e2:next&#45;&gt;e4:w</title>
<path fill="none" stroke="black" d="M144,-18.5C156,-18.5 161.25,-18.5 169.875,-18.5" />
<polygon fill="black" stroke="black" points="170,-22.0001 180,-18.5 170,-15.0001 170,-22.0001" />
</g>
<!-- e6 -->
<g id="node4" class="node"><title>e6</title>
<polygon fill="none" stroke="black" points="270,-0.5 270,-36.5 324,-36.5 324,-0.5 270,-0.5" />
<text text-anchor="middle" x="284" y="-14.3" font-family="Times,serif" font-size="14.00">6</text>
<polyline fill="none" stroke="black" points="298,-0.5 298,-36.5 " />
<text text-anchor="middle" x="310.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e4&#45;&gt;e6 -->
<g id="edge3" class="edge"><title>e4:next&#45;&gt;e6:w</title>
<path fill="none" stroke="black" d="M234,-18.5C246,-18.5 251.25,-18.5 259.875,-18.5" />
<polygon fill="black" stroke="black" points="260,-22.0001 270,-18.5 260,-15.0001 260,-22.0001" />
</g>
<!-- e8 -->
<g id="node5" class="node"><title>e8</title>
<polygon fill="none" stroke="black" points="360,-0.5 360,-36.5 414,-36.5 414,-0.5 360,-0.5" />
<text text-anchor="middle" x="374" y="-14.3" font-family="Times,serif" font-size="14.00">8</text>
<polyline fill="none" stroke="black" points="388,-0.5 388,-36.5 " />
<text text-anchor="middle" x="400.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e6&#45;&gt;e8 -->
<g id="edge4" class="edge"><title>e6:next&#45;&gt;e8:w</title>
<path fill="none" stroke="black" d="M324,-18.5C336,-18.5 341.25,-18.5 349.875,-18.5" />
<polygon fill="black" stroke="black" points="350,-22.0001 360,-18.5 350,-15.0001 350,-22.0001" />
</g>
<!-- e10 -->
<g id="node6" class="node"><title>e10</title>
<polygon fill="none" stroke="black" points="450,-0.5 450,-36.5 510.779,-36.5 510.779,-0.5 450,-0.5" />
<text text-anchor="middle" x="465" y="-14.3" font-family="Times,serif" font-size="14.00">10</text>
<polyline fill="none" stroke="black" points="480,-0.5 480,-36.5 " />
<text text-anchor="middle" x="495.39" y="-14.3" font-family="Times,serif" font-size="14.00">nil</text>
</g>
<!-- e8&#45;&gt;e10 -->
<g id="edge5" class="edge"><title>e8:next&#45;&gt;e10:w</title>
<path fill="none" stroke="black" d="M414,-18.5C426,-18.5 431.25,-18.5 439.875,-18.5" />
<polygon fill="black" stroke="black" points="440,-22.0001 450,-18.5 440,-15.0001 440,-22.0001" />
</g>
</g>
</svg>
</div>

<p>Each number is now stored in a structure with two elements (traditionally
called a <em>node</em>); the first is the
number, while the second is a pointer to the next structure in the list.  This
is called a <em>singly-linked list</em>, and it should be apparent that inserting 7
into it is just a matter of allocating a new list node, putting 7 into it,
setting its pointer to point at the node containing 8, and then updating the
pointer in the node containing 6 to point at it.</p>
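
<p>The insertion just described can be sketched in a few lines of Python.  This
is only an illustration, not a recommended implementation: the names
<code>Node</code>, <code>insert_after</code> and <code>to_list</code> are made up
for the example, and Python object references stand in for pointers.</p>

```python
class Node:
    """One list node: a value plus a reference to the next node (or None)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def insert_after(node, value):
    """Insert a new node holding `value` immediately after `node`."""
    new_node = Node(value, node.next)  # new node points at the old successor
    node.next = new_node               # predecessor now points at the new node
    return new_node


def to_list(head):
    """Walk the chain from `head` and collect the values into a Python list."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


# Build the list 2, 4, 6, 8, 10 back to front, then insert 7 after the 6.
head = None
for v in (10, 8, 6, 4, 2):
    head = Node(v, head)

six = head.next.next
insert_after(six, 7)
print(to_list(head))  # prints [2, 4, 6, 7, 8, 10]
```

<p>Nothing is copied or resized: the insertion touches exactly two references,
however long the list is.</p>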

<p>Obviously with a singly-linked list, if you have a pointer to a node, you can
easily obtain a pointer to the <em>next</em> node, but you have no way to go
backwards through the list; this also makes it hard to remove a node given
just a pointer to it, because you need the <em>previous</em> node in order to
update its pointer.  The desire to go either way through the list, and also to
make node removal as easy as node insertion, leads to the idea of the
<em>doubly-linked list</em>:</p>

<div class="graphviz-wrapper">

<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: doubly&#45;linked list Pages: 1 -->
<svg role="img" aria-label="doubly-linked list" width="570pt" height="143pt" viewBox="0.00 0.00 570.28 142.63">
<title>doubly-linked list</title>
<desc>digraph &quot;doubly-linked list&quot; { 
bgcolor=&quot;#f8f8f8&quot;;
rankdir=LR;
node [shape=record];
head [shape=plaintext, label=&quot;head&quot;];
e2 [label=&quot;{&lt;v&gt;2|&lt;prev&gt;|&lt;next&gt;}&quot;];
e4 [label=&quot;{&lt;v&gt;4|&lt;prev&gt;|&lt;next&gt;}&quot;];
e6 [label=&quot;{&lt;v&gt;6|&lt;prev&gt;|&lt;next&gt;}&quot;];
e8 [label=&quot;{&lt;v&gt;8|&lt;prev&gt;|&lt;next&gt;}&quot;];
{
  rank=same;
  e10 [label=&quot;{&lt;v&gt;10|&lt;prev&gt;|&lt;next&gt;nil}&quot;];
  tail [shape=plaintext, label=&quot;tail&quot;];
}
head:e -&gt; e2:w;
e2:next:e -&gt; e4:w;
e4:prev:n -&gt; e2:v:n; e4:next:e -&gt; e6:w;
e6:prev:s -&gt; e4:v:s; e6:next:e -&gt; e8:w;
e8:prev:n -&gt; e6:v:n; e8:next:e -&gt; e10:w;
e10:prev:s -&gt; e8:v:s;
tail:s -&gt; e10:n;
 }</desc>

<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 138.628)">
<title>doubly&#45;linked list</title>
<polygon fill="#f8f8f8" stroke="none" points="-4,4 -4,-138.628 566.279,-138.628 566.279,4 -4,4" />
<!-- head -->
<g id="node1" class="node"><title>head</title>
<text text-anchor="middle" x="27" y="-57.4282" font-family="Times,serif" font-size="14.00">head</text>
</g>
<!-- e2 -->
<g id="node2" class="node"><title>e2</title>
<polygon fill="none" stroke="black" points="90,-43.6282 90,-79.6282 152,-79.6282 152,-43.6282 90,-43.6282" />
<text text-anchor="middle" x="101.5" y="-57.4282" font-family="Times,serif" font-size="14.00">2</text>
<polyline fill="none" stroke="black" points="113,-43.6282 113,-79.6282 " />
<text text-anchor="middle" x="122.75" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
<polyline fill="none" stroke="black" points="132.5,-43.6282 132.5,-79.6282 " />
<text text-anchor="middle" x="142.25" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- head&#45;&gt;e2 -->
<g id="edge1" class="edge"><title>head:e&#45;&gt;e2:w</title>
<path fill="none" stroke="black" d="M54,-61.6282C66,-61.6282 71.25,-61.6282 79.875,-61.6282" />
<polygon fill="black" stroke="black" points="80,-65.1283 90,-61.6282 80,-58.1283 80,-65.1283" />
</g>
<!-- e4 -->
<g id="node3" class="node"><title>e4</title>
<polygon fill="none" stroke="black" points="188,-43.6282 188,-79.6282 250,-79.6282 250,-43.6282 188,-43.6282" />
<text text-anchor="middle" x="199.5" y="-57.4282" font-family="Times,serif" font-size="14.00">4</text>
<polyline fill="none" stroke="black" points="211,-43.6282 211,-79.6282 " />
<text text-anchor="middle" x="220.75" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
<polyline fill="none" stroke="black" points="230.5,-43.6282 230.5,-79.6282 " />
<text text-anchor="middle" x="240.25" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e2&#45;&gt;e4 -->
<g id="edge2" class="edge"><title>e2:next:e&#45;&gt;e4:w</title>
<path fill="none" stroke="black" d="M152,-61.6282C164,-61.6282 169.25,-61.6282 177.875,-61.6282" />
<polygon fill="black" stroke="black" points="178,-65.1283 188,-61.6282 178,-58.1283 178,-65.1283" />
</g>
<!-- e4&#45;&gt;e2 -->
<g id="edge3" class="edge"><title>e4:prev:n&#45;&gt;e2:v:n</title>
<path fill="none" stroke="black" d="M221,-80.6282C221,-130.42 116.408,-133.726 102.517,-90.5476" />
<polygon fill="black" stroke="black" points="105.972,-89.984 101,-80.6282 99.0523,-91.0425 105.972,-89.984" />
</g>
<!-- e6 -->
<g id="node4" class="node"><title>e6</title>
<polygon fill="none" stroke="black" points="286,-43.6282 286,-79.6282 348,-79.6282 348,-43.6282 286,-43.6282" />
<text text-anchor="middle" x="297.5" y="-57.4282" font-family="Times,serif" font-size="14.00">6</text>
<polyline fill="none" stroke="black" points="309,-43.6282 309,-79.6282 " />
<text text-anchor="middle" x="318.75" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
<polyline fill="none" stroke="black" points="328.5,-43.6282 328.5,-79.6282 " />
<text text-anchor="middle" x="338.25" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e4&#45;&gt;e6 -->
<g id="edge4" class="edge"><title>e4:next:e&#45;&gt;e6:w</title>
<path fill="none" stroke="black" d="M250,-61.6282C262,-61.6282 267.25,-61.6282 275.875,-61.6282" />
<polygon fill="black" stroke="black" points="276,-65.1283 286,-61.6282 276,-58.1283 276,-65.1283" />
</g>
<!-- e6&#45;&gt;e4 -->
<g id="edge5" class="edge"><title>e6:prev:s&#45;&gt;e4:v:s</title>
<path fill="none" stroke="black" d="M319,-42.6282C319,7.16346 214.408,10.4699 200.517,-32.7088" />
<polygon fill="black" stroke="black" points="197.052,-32.2139 199,-42.6282 203.972,-33.2724 197.052,-32.2139" />
</g>
<!-- e8 -->
<g id="node5" class="node"><title>e8</title>
<polygon fill="none" stroke="black" points="384,-43.6282 384,-79.6282 446,-79.6282 446,-43.6282 384,-43.6282" />
<text text-anchor="middle" x="395.5" y="-57.4282" font-family="Times,serif" font-size="14.00">8</text>
<polyline fill="none" stroke="black" points="407,-43.6282 407,-79.6282 " />
<text text-anchor="middle" x="416.75" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
<polyline fill="none" stroke="black" points="426.5,-43.6282 426.5,-79.6282 " />
<text text-anchor="middle" x="436.25" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e6&#45;&gt;e8 -->
<g id="edge6" class="edge"><title>e6:next:e&#45;&gt;e8:w</title>
<path fill="none" stroke="black" d="M348,-61.6282C360,-61.6282 365.25,-61.6282 373.875,-61.6282" />
<polygon fill="black" stroke="black" points="374,-65.1283 384,-61.6282 374,-58.1283 374,-65.1283" />
</g>
<!-- e8&#45;&gt;e6 -->
<g id="edge7" class="edge"><title>e8:prev:n&#45;&gt;e6:v:n</title>
<path fill="none" stroke="black" d="M417,-80.6282C417,-130.42 312.408,-133.726 298.517,-90.5476" />
<polygon fill="black" stroke="black" points="301.972,-89.984 297,-80.6282 295.052,-91.0425 301.972,-89.984" />
</g>
<!-- e10 -->
<g id="node6" class="node"><title>e10</title>
<polygon fill="none" stroke="black" points="482,-43.6282 482,-79.6282 562.279,-79.6282 562.279,-43.6282 482,-43.6282" />
<text text-anchor="middle" x="497" y="-57.4282" font-family="Times,serif" font-size="14.00">10</text>
<polyline fill="none" stroke="black" points="512,-43.6282 512,-79.6282 " />
<text text-anchor="middle" x="521.75" y="-57.4282" font-family="Times,serif" font-size="14.00"> </text>
<polyline fill="none" stroke="black" points="531.5,-43.6282 531.5,-79.6282 " />
<text text-anchor="middle" x="546.89" y="-57.4282" font-family="Times,serif" font-size="14.00">nil</text>
</g>
<!-- e8&#45;&gt;e10 -->
<g id="edge8" class="edge"><title>e8:next:e&#45;&gt;e10:w</title>
<path fill="none" stroke="black" d="M446,-61.6282C458,-61.6282 463.25,-61.6282 471.875,-61.6282" />
<polygon fill="black" stroke="black" points="472,-65.1283 482,-61.6282 472,-58.1283 472,-65.1283" />
</g>
<!-- e10&#45;&gt;e8 -->
<g id="edge9" class="edge"><title>e10:prev:s&#45;&gt;e8:v:s</title>
<path fill="none" stroke="black" d="M522.14,-42.6282C522.14,10.3467 410.396,13.6576 396.428,-32.6954" />
<polygon fill="black" stroke="black" points="392.959,-32.2319 395,-42.6282 399.887,-33.228 392.959,-32.2319" />
</g>
<!-- tail -->
<g id="node7" class="node"><title>tail</title>
<text text-anchor="middle" x="522.14" y="-112.428" font-family="Times,serif" font-size="14.00">tail</text>
</g>
<!-- tail&#45;&gt;e10 -->
<g id="edge10" class="edge"><title>tail:s&#45;&gt;e10:n</title>
<path fill="none" stroke="black" d="M522.14,-90.3448C522.14,-92.6141 522.14,-95.031 522.14,-98.6282" />
<polygon fill="black" stroke="black" points="525.64,-90.1282 522.14,-80.1282 518.64,-90.1282 525.64,-90.1282" />
</g>
</g>
</svg>
</div>
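
<p>As a rough sketch of why the extra pointer helps, here is one way removal
might look in Python.  The class and method names (<code>DNode</code>,
<code>DoublyLinkedList</code>, <code>append</code>, <code>remove</code>) are
invented for illustration, and this version keeps explicit head and tail
references with <code>None</code> marking the ends, as in the diagram.</p>

```python
class DNode:
    """A node with references to both of its neighbours (None at the ends)."""

    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        """Add a node at the tail and return it so callers can keep a handle."""
        node = DNode(value)
        node.prev = self.tail
        if self.tail is None:
            self.head = node       # list was empty
        else:
            self.tail.next = node
        self.tail = node
        return node

    def remove(self, node):
        """Unlink `node` given only the node itself; no traversal needed."""
        if node.prev is None:
            self.head = node.next  # removing the first node
        else:
            node.prev.next = node.next
        if node.next is None:
            self.tail = node.prev  # removing the last node
        else:
            node.next.prev = node.prev
        node.prev = node.next = None

    def forwards(self):
        out, node = [], self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out

    def backwards(self):
        out, node = [], self.tail
        while node is not None:
            out.append(node.value)
            node = node.prev
        return out


dll = DoublyLinkedList()
nodes = [dll.append(v) for v in (2, 4, 6, 8, 10)]
dll.remove(nodes[2])       # remove the node holding 6
print(dll.forwards())      # prints [2, 4, 8, 10]
print(dll.backwards())     # prints [10, 8, 4, 2]
```

<p>Notice how much of <code>remove</code> is special-casing for the two ends of
the list; that bookkeeping is the price of using <code>None</code> as an end
marker.</p>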

<p>There’s also a smart-ass variant of the above where there’s only one “pointer”
per node, consisting of the exclusive-or of the pointers to the previous and
next nodes.  It’s neat, but unless you’re on a memory-restricted
microcontroller you <em>really</em> shouldn’t.</p>

<h2 id="circular-lists">Circular lists</h2>

<p>By the way, there is a nice variant that I haven’t seen in any textbooks,
namely the <em>circular</em> list, which lets you quickly add elements at either end
of the list and also simplifies bookkeeping because there are never any null
pointers.</p>

<p>Here’s a singly-linked version:</p>

<div class="graphviz-wrapper">

<!-- Generated by graphviz version 2.38.0 (20140413.2041)
 -->
<!-- Title: singly&#45;linked circular list Pages: 1 -->
<svg role="img" aria-label="singly-linked circular list" width="512pt" height="110pt" viewBox="0.00 0.00 512.00 109.98">
<title>singly-linked circular list</title>
<desc>digraph &quot;singly-linked circular list&quot; { 
bgcolor=&quot;#f8f8f8&quot;;
rankdir=LR;
node [shape=record];
tail [shape=plaintext,label=&quot;tail&quot;];
e2 [label=&quot;{2|&lt;next&gt;}&quot;];
e4 [label=&quot;{4|&lt;next&gt;}&quot;];
e6 [label=&quot;{6|&lt;next&gt;}&quot;];
e8 [label=&quot;{8|&lt;next&gt;}&quot;];
e10 [label=&quot;{&lt;v&gt;10|&lt;next&gt;}&quot;];
tail:e -&gt; e10:w [weight=10];
e2:next -&gt; e4:w [weight=10];
e4:next -&gt; e6:w [weight=10];
e6:next -&gt; e8:w [weight=10];
e8:next:n -&gt; e10:v:n [weight=1];
e10:next -&gt; e2:w [weight=10];
 }</desc>

<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 105.979)">
<title>singly&#45;linked circular list</title>
<polygon fill="#f8f8f8" stroke="none" points="-4,4 -4,-105.979 508,-105.979 508,4 -4,4" />
<!-- tail -->
<g id="node1" class="node"><title>tail</title>
<text text-anchor="middle" x="27" y="-14.3" font-family="Times,serif" font-size="14.00">tail</text>
</g>
<!-- e10 -->
<g id="node6" class="node"><title>e10</title>
<polygon fill="none" stroke="black" points="90,-0.5 90,-36.5 144,-36.5 144,-0.5 90,-0.5" />
<text text-anchor="middle" x="106" y="-14.3" font-family="Times,serif" font-size="14.00">10</text>
<polyline fill="none" stroke="black" points="122,-0.5 122,-36.5 " />
<text text-anchor="middle" x="132.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- tail&#45;&gt;e10 -->
<g id="edge1" class="edge"><title>tail:e&#45;&gt;e10:w</title>
<path fill="none" stroke="black" d="M54,-18.5C66,-18.5 71.25,-18.5 79.875,-18.5" />
<polygon fill="black" stroke="black" points="80,-22.0001 90,-18.5 80,-15.0001 80,-22.0001" />
</g>
<!-- e2 -->
<g id="node2" class="node"><title>e2</title>
<polygon fill="none" stroke="black" points="180,-0.5 180,-36.5 234,-36.5 234,-0.5 180,-0.5" />
<text text-anchor="middle" x="194" y="-14.3" font-family="Times,serif" font-size="14.00">2</text>
<polyline fill="none" stroke="black" points="208,-0.5 208,-36.5 " />
<text text-anchor="middle" x="220.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e4 -->
<g id="node3" class="node"><title>e4</title>
<polygon fill="none" stroke="black" points="270,-0.5 270,-36.5 324,-36.5 324,-0.5 270,-0.5" />
<text text-anchor="middle" x="284" y="-14.3" font-family="Times,serif" font-size="14.00">4</text>
<polyline fill="none" stroke="black" points="298,-0.5 298,-36.5 " />
<text text-anchor="middle" x="310.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e2&#45;&gt;e4 -->
<g id="edge2" class="edge"><title>e2:next&#45;&gt;e4:w</title>
<path fill="none" stroke="black" d="M234,-18.5C246,-18.5 251.25,-18.5 259.875,-18.5" />
<polygon fill="black" stroke="black" points="260,-22.0001 270,-18.5 260,-15.0001 260,-22.0001" />
</g>
<!-- e6 -->
<g id="node4" class="node"><title>e6</title>
<polygon fill="none" stroke="black" points="360,-0.5 360,-36.5 414,-36.5 414,-0.5 360,-0.5" />
<text text-anchor="middle" x="374" y="-14.3" font-family="Times,serif" font-size="14.00">6</text>
<polyline fill="none" stroke="black" points="388,-0.5 388,-36.5 " />
<text text-anchor="middle" x="400.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e4&#45;&gt;e6 -->
<g id="edge3" class="edge"><title>e4:next&#45;&gt;e6:w</title>
<path fill="none" stroke="black" d="M324,-18.5C336,-18.5 341.25,-18.5 349.875,-18.5" />
<polygon fill="black" stroke="black" points="350,-22.0001 360,-18.5 350,-15.0001 350,-22.0001" />
</g>
<!-- e8 -->
<g id="node5" class="node"><title>e8</title>
<polygon fill="none" stroke="black" points="450,-0.5 450,-36.5 504,-36.5 504,-0.5 450,-0.5" />
<text text-anchor="middle" x="464" y="-14.3" font-family="Times,serif" font-size="14.00">8</text>
<polyline fill="none" stroke="black" points="478,-0.5 478,-36.5 " />
<text text-anchor="middle" x="490.75" y="-14.3" font-family="Times,serif" font-size="14.00"> </text>
</g>
<!-- e6&#45;&gt;e8 -->
<g id="edge4" class="edge"><title>e6:next&#45;&gt;e8:w</title>
<path fill="none" stroke="black" d="M414,-18.5C426,-18.5 431.25,-18.5 439.875,-18.5" />
<polygon fill="black" stroke="black" points="440,-22.0001 450,-18.5 440,-15.0001 440,-22.0001" />
</g>
<!-- e8&#45;&gt;e10 -->
<g id="edge5" class="edge"><title>e8:next:n&#45;&gt;e10:v:n</title>
<path fill="none" stroke="black" d="M491,-37.5C491,-119.546 136.934,-122.912 107.89,-47.5956" />
<polygon fill="black" stroke="black" points="111.28,-46.6852 106,-37.5 104.4,-47.9733 111.28,-46.6852" />
</g>
<!-- e10&#45;&gt;e2 -->
<g id="edge6" class="edge"><title>e10:next&#45;&gt;e2:w</title>
<path fill="none" stroke="black" d="M144,-18.5C156,-18.5 161.25,-18.5 169.875,-18.5" />
<polygon fill="black" stroke="black" points="170,-22.0001 180,-18.5 170,-15.0001 170,-22.0001" />
</g>
</g>
</svg>
</div>

<p>Note that we keep a pointer to the <em>last</em> element; to insert at the head of
the list, we update the last element’s pointer but <em>not</em> the tail pointer,
whereas to insert at the end of the list, we also update the tail pointer.</p>
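
<p>As a rough illustration of the scheme just described, here is a minimal C++
sketch of a circular singly linked list that keeps only a tail pointer.  The
names <code>Node</code>, <code>CircularList</code> and <code>link_after_tail</code> are made up for this
example, and representing an empty list with a null tail is an assumption of
the sketch rather than something the diagram shows.</p>

<figure class="code"><div class="highlight"><pre><code class="cpp">#include &lt;cassert&gt;
#include &lt;vector&gt;

// Hypothetical node type: one value plus a pointer to the next node.
struct Node {
    int value;
    Node *next;
};

// Circular singly linked list that stores only a pointer to the LAST node.
// The first node is always the one after the tail.
class CircularList {
public:
    CircularList() = default;
    CircularList(const CircularList &amp;) = delete;
    CircularList &amp;operator=(const CircularList &amp;) = delete;

    ~CircularList() {
        if (!tail) return;
        Node *n = tail-&gt;next;
        tail-&gt;next = nullptr;          // break the cycle so the loop ends
        while (n) {
            Node *next = n-&gt;next;
            delete n;
            n = next;
        }
    }

    // Insert at the head: the last node's next pointer changes, the tail
    // pointer does not (unless the list was empty).
    void push_front(int value) { link_after_tail(value); }

    // Insert at the end: link the node in the same way, then make it the tail.
    void push_back(int value) { tail = link_after_tail(value); }

    // Value of the first element; the list must not be empty.
    int front() const {
        assert(tail != nullptr);
        return tail-&gt;next-&gt;value;
    }

    // Copy the values out in list order, starting from the head.
    std::vector&lt;int&gt; to_vector() const {
        std::vector&lt;int&gt; out;
        if (!tail) return out;
        const Node *n = tail-&gt;next;
        do {
            out.push_back(n-&gt;value);
            n = n-&gt;next;
        } while (n != tail-&gt;next);
        return out;
    }

private:
    // Link a new node in between the tail and the head, and return it.
    Node *link_after_tail(int value) {
        Node *n = new Node{value, nullptr};
        if (!tail) {
            n-&gt;next = n;               // a lone node points at itself
            tail = n;
        } else {
            n-&gt;next = tail-&gt;next;      // new node points at the old head
            tail-&gt;next = n;            // last node now points at the new node
        }
        return n;
    }

    Node *tail = nullptr;
};</code></pre></div></figure>

<p>Calling <code>push_back(4)</code>, <code>push_back(6)</code>, <code>push_back(8)</code> and then
<code>push_front(2)</code> gives the list 2, 4, 6, 8, and neither operation ever has to
walk the list.</p>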

<p>If you ever have cause to implement a linked list algorithm, I strongly
recommend using the circular variant.  And if you are unlucky enough to turn
up for an interview where someone really does want you to show them a linked
list, draw that kind and explain to them what the benefits are (no null
pointers, simplified manipulation, fast insertion/removal at either end with
only a single tail pointer to manage).  Well, if you want the job, anyway.</p>

<h2 id="why-you-should-learn-about-algorithms">Why you should learn about algorithms</h2>

<p><em>Note that I said “learn about”, not “learn”.</em>  You <em>do not</em> need to be able to
write a Quicksort or Shell sort routine from scratch and I would never ask
someone to in an interview; if you need to do that, you’ll be able to look it
up.</p>

<p>The main thing to understand here is the idea of algorithmic <em>complexity</em>.
Usually we’re talking <em>time complexity</em> but occasionally someone might care
about <em>space complexity</em> too.  Complexity is a measure of how expensive the
algorithm is, and we typically express it using “big O notation”.  Some examples:</p>

<table>
  <thead>
    <tr>
      <th>Notation</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>O(1)</td>
      <td>The algorithm takes constant time (best possible)</td>
    </tr>
    <tr>
      <td>O(log n)</td>
      <td>The algorithm takes time proportional to the logarithm of the size of the input (good)</td>
    </tr>
    <tr>
      <td>O(n)</td>
      <td>The algorithm takes time proportional to the size of the input (OK)</td>
    </tr>
    <tr>
      <td>O(n<sup>2</sup>)</td>
      <td>The algorithm takes time proportional to the square of the size of the input (not great)</td>
    </tr>
    <tr>
      <td>O(2<sup>n</sup>)</td>
      <td>The algorithm takes exponential time (bad)</td>
    </tr>
    <tr>
      <td>O(n!)</td>
      <td>The algorithm takes time proportional to the factorial of the size of input (really bad)</td>
    </tr>
  </tbody>
</table>

<p>You may also see people talk about <em>worst case</em>, <em>amortised worst case</em> and
<em>average case</em>.  Worst case and average case are fairly easy; <em>amortised</em>
worst case is where you consider the total cost of a whole sequence of
operations and share it out between them - the idea being that an occasional
expensive operation matters less if most of the operations are cheap.</p>
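
<p>A classic illustration of amortised cost is a growable array that doubles its
capacity whenever it fills up.  The sketch below just counts how many element
copies that policy causes; the function name is invented for this example, and
the doubling policy is an assumption (real containers such as <code>std::vector</code>
pick their own growth factor).</p>

<figure class="code"><div class="highlight"><pre><code class="cpp">#include &lt;cstddef&gt;

// Count how many element copies are needed to append n items to an array
// that starts with capacity 1 and doubles its capacity whenever it is full.
std::size_t copies_for_appends(std::size_t n) {
    std::size_t capacity = 1;
    std::size_t size = 0;
    std::size_t copies = 0;
    for (std::size_t i = 0; i &lt; n; ++i) {
        if (size == capacity) {
            copies += size;   // a resize moves every existing element
            capacity *= 2;
        }
        ++size;               // the append itself
    }
    return copies;
}</code></pre></div></figure>

<p>A single append can cost O(n) when it triggers a resize, but under this policy
the total number of copies for n appends stays below 2n, so the amortised cost
per append is O(1).  For 1000 appends the function returns 1023.</p>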

<p>It’s also important to understand that, in addition to their complexity, many
algorithms have a <em>fixed cost</em>, and that there is a general trend towards
higher fixed costs for algorithms and data structures with lower time complexity.</p>

<p>How is this useful?  Well, many languages and runtime libraries make you
choose what kind of container to use to hold your data, and this choice can
have a noticeable – and sometimes <em>extreme</em> – impact on your program’s
run time and memory usage.  To help you make an informed choice, the
documentation will <em>hopefully</em> tell you the algorithmic complexity (or <em>cost</em>)
of the operations on the container.  For instance, looking at
<a href="http://en.cppreference.com/w/cpp/container/vector/operator_at"><code>std::vector::operator[]</code></a>,
we can see that its complexity is listed as “constant” (i.e. O(1)), whereas
<a href="http://en.cppreference.com/w/cpp/container/map/operator_at"><code>std::map::operator[]</code></a>
lists its complexity as “logarithmic in the size of the container” (i.e.
O(log n)).</p>
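
<p>If you want to see the “logarithmic” claim for <code>std::map</code> for yourself, one
option is to count key comparisons with a custom comparator.  This is only a
sketch: <code>CountingLess</code> and <code>comparisons_for_lookup</code> are names invented for
this example, and the exact counts depend on your standard library; only the
logarithmic bound is guaranteed.</p>

<figure class="code"><div class="highlight"><pre><code class="cpp">#include &lt;cstddef&gt;
#include &lt;map&gt;

// Comparator that behaves like operator&lt; but counts how often it is called.
struct CountingLess {
    std::size_t *calls;
    bool operator()(int a, int b) const {
        ++*calls;
        return a &lt; b;
    }
};

// Build a std::map holding the keys 0..n-1, then report how many key
// comparisons a single lookup of `key` performs.
std::size_t comparisons_for_lookup(int n, int key) {
    std::size_t calls = 0;
    CountingLess less{&amp;calls};
    std::map&lt;int, int, CountingLess&gt; m(less);
    for (int i = 0; i &lt; n; ++i) {
        m.emplace(i, i);
    }
    calls = 0;             // ignore the comparisons made while inserting
    (void)m.find(key);
    return calls;
}</code></pre></div></figure>

<p>With a typical implementation, a lookup among a thousand or so keys needs
somewhere in the region of ten to twenty comparisons, and making the map
dozens of times larger adds only a handful more.  Indexing a <code>std::vector</code>, by
contrast, involves no comparisons at all.</p>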

<p>The C++ STL also has a few other types you could use instead of <code>std::vector</code>,
for instance <code>std::deque</code> or <code>std::list</code>.  It makes <em>you</em>, the developer,
choose, and to make that choice you need some idea of which will be better for
your particular application.</p>

<p>That’s a bit painful, and on iOS and macOS, we’re very lucky – Core
Foundation’s containers are smart and automatically use an appropriate
implementation for the number of items they contain.  So, for instance, a
small <code>CFArray</code> is basically just a C array, but as it grows it changes into a
somewhat more sophisticated data structure that allows fast insertion and
deletion regardless of the number of elements it holds.  That said, there will
still sometimes be occasions where you need to choose between a <code>CFArray</code> and
a <code>CFDictionary</code>, and there may be occasions when you need a tree rather than
a hash, in which case you might end up rolling your own.</p>

<h2 id="but">But…</h2>

<p><em>Learning this stuff will take months?</em><br />You can learn the basics very quickly
(hopefully reading the above was quite useful).</p>

<p><em>I could more profitably spend my time learning Core Data?</em><br />Yes, maybe, though
this stuff will have applications there too.</p>

<p><em>Those algorithms textbooks are huge and hard to read :-(</em><br />Well, some of
 them are, yes.  I’d recommend you pick up a copy of Sedgewick’s
 <a href="https://www.amazon.co.uk/s/ref=nb_sb_noss?url=search-alias%3Daps&amp;field-keywords=Sedgewick+Algorithms+3rd+edition&amp;rh=i%3Aaps%2Ck%3ASedgewick+Algorithms+3rd+edition"><em>Algorithms in &lt;language&gt;</em></a>.
 It’s available in a variety of different language
 flavours (I have a C++ copy, but I’ve seen C, Pascal, and Java, and there are
 probably others too), and it’s short and accessible (lots of pictures and short
 example programs).  Even skimming it will give you at least some idea of
 where to look when you need to.</p>

<h2 id="finally">Finally</h2>

<p>If you go for an interview for a job as a programmer, it isn’t unreasonable to
expect that someone will ask some questions relating to fundamental algorithms
or data structures.  If someone does ask, they aren’t trying to discriminate
against the underprivileged; they’re trying to discriminate between job
applicants on grounds of competence.  Even if the question seems irrelevant to
what you’re going to do, it’s a good bet that someone who gives a good answer
is going to be better at doing the simpler work where you <em>don’t</em> need to know
this, and that is something that will factor in to the decision about who to
hire.  (Of course, that somebody may also be more expensive to hire, so bear
that in mind too.)</p>

<p>Now, as I said, I wouldn’t ask in an interview about linked lists per se,
unless you say you have a CS degree and you’ve just failed to answer a
question I think you should know the answer to, in which case I’m probably
trying to decide whether you lied about your degree.</p>

<p>I <em>might</em> ask you to show me how you would search a string (but I don’t expect
you to know the best answer off the top of your head; the point is to work through it and see how
you react).  I <em>might</em> ask about the merits of hash tables
(e.g. <code>std::unordered_map</code> or <code>CFDictionary</code>) versus trees (e.g. <code>std::map</code>).
I would, however, take into account your background when thinking about your
answer, and if you didn’t know about something I might explain a bit and see
what you had to say.  The point, often, is about testing your <em>reasoning
skills</em>, not about whether you know the answer and can rattle it off.</p>

<p>One final word of advice: if you respond to a question in an interview,
however silly you feel it is, with snark, you probably aren’t going to get the
job.  Part of the reason for interviewing people is for both parties to decide
whether they’d like to work together, and snark is going to put people off.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Don't Bash Iframe Payment Forms]]></title>
    <link href="https://alastairs-place.net/blog/2016/09/23/dont-bash-iframe-payment-forms/"/>
    <updated>2016-09-23T13:51:40+01:00</updated>
    <id>https://alastairs-place.net/blog/2016/09/23/dont-bash-iframe-payment-forms</id>
    <content type="html"><![CDATA[<h2 id="background">Background</h2>

<p>OK, some background first.  Owing to the increasing level of card-not-present
fraud committed via the Internet, and the generally lax security standards of
some of the websites involved, the Payment Card Industry Security Standards
Council (PCI SSC) was formed and tasked with creating and maintaining a set of
security standards called the
<a href="https://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard">Payment Card Industry Data Security Standard</a>
(PCI DSS).</p>

<p>The idea is a good one, as are many of the rules themselves, though I think
it’s legitimate to criticise PCI DSS for demanding things of smaller
businesses that are simply unrealistic.  The upshot of this is that smaller
companies, and the payment processors who serve their market, wish to avoid
the burden of being PCI compliant, but because they know that conversion rates
suffer badly when customers are sent to a third-party site for payment, they
would also like to design payment flows where a small business is able to take
card payments on its own website.</p>

<p>The first attempt at this was to use client-side JavaScript to securely
encrypt the user’s payment data, and then the payment form itself would be
submitted to the merchant’s system, but with only the encrypted blob rather
than the original payment details.  The downside of this approach is that if
something goes wrong with the JavaScript code and the HTML form isn’t carefully
written, payment details go to the merchant’s server anyway and they are
dragged into the scope of PCI compliance.</p>

<p>This method of avoiding having to be fully PCI compliant was “dealt with” in
PCI DSS 3.0, which specifically imposes a compliance burden on sites doing the
above.</p>

<p>However, PCI DSS 3.0 does allow payment processors to host parts of the
payment form on their own servers instead, such that the merchant can
embed those parts into the merchant’s own form using HTML
<code>iframe</code> tags.  This provides the same visual effect, but at
reduced risk because it no longer relies on client-side JavaScript to keep the
payment data away from the merchant’s servers.</p>

<p>So, that’s the background.</p>

<h2 id="why-am-i-writing-this">Why am I writing this?</h2>

<p>Now, on Troy Hunt’s blog, in the comments, I happened across some
<a href="http://disq.us/p/1c0czxb">remarks from Craig Francis</a>:</p>

<blockquote>
  <p>This Stripe implementation is insecure as well.</p>
</blockquote>

<blockquote>
  <p>They use an iframe, which is trivial for a malicious hacker to replace if the original website is hacked (often possible as they use old software, FTP, bad passwords, etc - which all gets missed at the basic level of PCI checking, that Regpack also seem to suggest is acceptable).</p>
</blockquote>

<blockquote>
  <p>Troy is right to suggest that you should go to the payment gateway directly to enter your details, at least customers will know who has them.</p>
</blockquote>

<blockquote>
  <p>I’m currently working with Christine at Google to pressure the PCI council into doing something about this.</p>
</blockquote>

<p>Craig then linked to
<a href="https://www.code-poets.co.uk/misc/security/pci-saq/">this piece on his blog</a>
which advocates extending full PCI compliance (technically SAQ A-EP) to those
businesses who are using iframe-based payment systems.</p>

<p>This would, in my opinion, be a <em>huge</em> mistake.</p>

<p>The claim, basically, is that an iframe-based system is insecure because a
third party could edit the page in which the iframe is embedded and make it
point somewhere else.  This is true, and it is a genuine vulnerability.</p>

<p>But what are the alternatives for smaller businesses?  Well, the alternative
being suggested is that they should send their customers off to a third-party
payment processor’s website, have the details filled in there, and then come
back again.  Those of us who run small businesses that take card details will
tell you for nothing that this causes two problems:</p>

<ol>
  <li>
    <p>Our conversion rate drops.  Instantly.  Customers don’t like being bumped
to another website, which they probably don’t recognise anyway, to make a card
payment.</p>
  </li>
  <li>
    <p><em>We actually get people e-mailing us to tell us they think they might be
being defrauded</em>.  Wait, what?  Yes, that’s right.  Customers don’t expect to
be suddenly redirected elsewhere; when it happens, they think something dodgy
is going on.</p>
  </li>
</ol>

<p>Now, if your goal is to destroy small business and make the huge advantages
experienced by big businesses even bigger, that’s a great idea.  What it won’t
do is improve security.  Why?  <strong>Because passing customers off to a
third-party payment website has the exact same vulnerability we were just
talking about</strong>.  The web page that does it could be edited by a malicious
third party, and pointed at a different page.</p>

<p>OK, you might say, but in that case you’ll see it in your browser’s address
bar.  Sure.  Do you know the names of every payment gateway on the Internet?
No, me neither.  So how do you know that the page you’re looking at is a
genuine payment processor?  If you’re about to utter the words “they have an
EV SSL certificate” or “because my address bar is green”, I have news for you:
it’s easy to get an EV certificate.  Even if we assume that certificate
authorities can’t be convinced to issue EV certificates in error, all the
certificate really says is that it belongs to the party listed in the
certificate details.  It doesn’t tell you they’re trustworthy.</p>

<h2 id="what-should-happen">What <em>should</em> happen?</h2>

<p>So Craig’s assertion that merchants using the iframe approach should be forced
to use SAQ A-EP, the more onerous compliance route, is clearly a non-starter.
It doesn’t improve security in practice, and has a significant impact on lots
of small businesses, most of whom will be forced to use third-party payment
gateways, which is not only bad for business but is annoying for their
customers too.</p>

<p>It’s also worth pointing out that, assuming we did tighten up this aspect of
PCI DSS, there is still nothing stopping someone from setting up a website
with a similar name, copying its appearance from a given merchant’s site, and
defrauding customers that way.  This is exactly the same kind of fraud we’re
worrying about here – customers are being sent to a site other than the one
they should be being sent to – only now it would be happening via Google,
instead of from the merchant’s own (hacked) page.  Should Google search
suddenly be dragged into scope for PCI DSS somehow?  I don’t think anyone
sensibly argues that.</p>

<p>This is a hard problem, and the iframe solution is not perfect, but it is an
improvement over the client-side JavaScript approach and it isn’t
significantly less secure than redirecting to a third-party website to perform
the payment.</p>

<p>The way forward is probably services like Apple Pay, which is now available in
Safari 10, where the browser is responsible for capturing the payment
information and sending it securely to the payment processor.  Even that is
not perfect – hackers could still change the merchant’s site to point at a
different payment processor and try to collect money that way.</p>

<h2 id="but-arent-servers-insecure-if-they-arent-completely-pci-compliant">But aren’t servers insecure if they aren’t completely PCI compliant?</h2>

<p>No.</p>

<p>Nor are completely PCI compliant systems necessarily secure.</p>

<p>PCI DSS compliance means that the system in question ticks all the relevant
checkboxes in the latest PCI DSS standard, meets any audit requirements and
has the appropriate paperwork in place.  There’s a good chance that systems
that are PCI DSS compliant <em>are</em> secure, but it isn’t guaranteed.</p>

<p>Why, if your system <em>is</em> secure, would you not want the burden of PCI DSS
compliance?  Well, unless you think that all small businesses’ websites (and
we’re talking about sites here that explicitly avoid touching payment data)
need automated audit logs, two factor authentication, sophisticated
penetration testing, incident response plans, written security policies,
written change control procedures, separate logging servers, and so on, I
think you already know the answer to that question.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[On Security Monoculture]]></title>
    <link href="https://alastairs-place.net/blog/2016/08/18/on-security-monoculture/"/>
    <updated>2016-08-18T17:39:41+01:00</updated>
    <id>https://alastairs-place.net/blog/2016/08/18/on-security-monoculture</id>
    <content type="html"><![CDATA[<p>A pet hate of mine for some time has been the blanket assertion from those who
like to identify themselves as “security professionals” that nobody should
write their own cryptographic code.  I’ve heard a number of individuals
voicing this view and implying that all that is wrong in the world of computer
security would be fixed if people would simply stop it.</p>

<p>This is, and has been, for some time, the conventional wisdom.  <strong>It is wrong.</strong></p>

<p>Why do I say this?  Simple.  The conventional wisdom implies that we should
all be using the exact same code behind the scenes (this is often accompanied
by claims of the superiority of Open Source implementations as they will be
reviewed by many more people).  For many people, and for many applications,
this thinking leads to using OpenSSL, as it is “tried and tested”, and is Open
Source so lots of people must have looked over the code and decided it was
good, right?  Well, let’s take a look at
<a href="https://www.openssl.org/news/vulnerabilities.html">the huge list of vulnerabilities that have been found in that library</a>,
or <a href="http://arstechnica.com/information-technology/2014/04/openssl-code-beyond-repair-claims-creator-of-libressl-fork/">the comments that the founder of OpenBSD, Theo de Raadt, made about it after deciding to fork it and create LibreSSL instead</a>.</p>

<p>(Fine, you might say, use LibreSSL, or Botan, or Secure Transport, or
CryptoAPI, or…; well, yes, that’s kind of my point.  But I wouldn’t want to
recommend that <em>everyone</em> should use LibreSSL, or Botan, or Secure Transport
either.  It’s much safer if there’s a mix of software performing this task.)</p>

<p>Heartbleed was only such a big problem <em>because everyone was using the single
implementation that contained that bug</em>.  Well, almost everyone; some software
was using Apple’s Secure Transport, or Microsoft’s implementation (via
CryptoAPI), or one of the various other implementations that are floating
about.  But the overwhelming majority uses OpenSSL, and as a result, a single
vulnerability affected everyone, everywhere, simultaneously.</p>

<p>Another implication of this “thou shalt not implement crypto” view is that the
set of implementations we presently have should be frozen.  Maybe even some of
them should go away.  After all, nobody should be implementing crypto software
(the only exception seems to be if the person quoting this rule knows your
name, in which case you’re probably D.J. Bernstein or Bruce Schneier or some
such). But that will make matters <em>worse</em>, not better.  It will increase the
reliance on OpenSSL and make the monoculture worse; and everyone switching
wholesale to LibreSSL won’t help in that regard (it might be better in other
respects, but that’s another matter).  Indeed, it even implies that <em>you</em>
shouldn’t be submitting any fixes to OpenSSL, because you can’t possibly be a
suitable person to be tampering with cryptographic software.</p>

<p>Now, do I think you, dear reader, should immediately go out and roll your
own RSA implementation?  No, absolutely not.  I am categorically <em>not</em> in
favour of <em>everyone</em> implementing their own crypto (or, worse, rolling their
own cryptographic algorithm).  It isn’t something you can throw together in an
afternoon, without carefully researching the subject first, and it certainly
isn’t something you should be doing without adequate testing to make sure you
haven’t slipped up.  There are lots of gotchas in this area that you won’t
appreciate unless you go and learn about it first.  But what I don’t like
about the conventional wisdom on the subject is that it has tended to
discourage people who <em>are</em> competent to do so from writing additional
implementations, and has created an atmosphere where you’re likely to be
yelled at for merely suggesting that it might be a good idea for that to
happen.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Code Coverage From the Command Line With Clang]]></title>
    <link href="https://alastairs-place.net/blog/2016/05/20/code-coverage-from-the-command-line-with-clang/"/>
    <updated>2016-05-20T11:14:55+01:00</updated>
    <id>https://alastairs-place.net/blog/2016/05/20/code-coverage-from-the-command-line-with-clang</id>
    <content type="html"><![CDATA[<p>Having searched the Internet several times to find out how to get coverage
information out of <code>clang</code>, I ended up feeling rather confused.  I’m sure I’m
not the only one.  The reason for the confusion is fairly simple; <code>clang</code>
supports <em>two different coverage mechanisms</em>, and one of them uses a tool whose
name used to belong to the tool for the other one!</p>

<p>About half of the posts seem to indicate that the right way to get coverage
information is to use the <code>--coverage</code> argument to <code>clang</code>:</p>

<figure class="code"><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
</pre></td><td class="code"><pre><code class=""><span class="line">$ clang --coverage -g -Wall testcov.c -o testcov
</span><span class="line">$ ls
</span><span class="line">testcov      testcov.c    testcov.dSYM testcov.gcno
</span><span class="line">$ ./testcov
</span><span class="line">$ ls
</span><span class="line">testcov      testcov.c    testcov.dSYM testcov.gcno testcov.gcda</span></code></pre></td></tr></table></div></figure>

<p>This appears to produce (approximately) GCOV format data which can then be
used with the <code>gcov</code> command, <em>noting that this is really LLVM’s gcov</em>, not
GNU gcov, though it appears to be designed to be broadly compatible with the
latter.  Older versions of LLVM apparently used to call this tool <code>llvm-cov</code>
rather than replacing <code>gcov</code> with it, <em>but that name is now used for a newer,
separate tool</em>.</p>

<p>The rest of the posts, including some on the LLVM site, instead recommend using
the <code>-fprofile-instr-generate</code> and <code>-fcoverage-mapping</code> options:</p>

<figure class="code"><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
<span class="line-number">3</span>
<span class="line-number">4</span>
<span class="line-number">5</span>
<span class="line-number">6</span>
</pre></td><td class="code"><pre><code class=""><span class="line">$ clang -fprofile-instr-generate -fcoverage-mapping -g -Wall testcov.c -o testcov
</span><span class="line">$ ls
</span><span class="line">testcov      testcov.c    testcov.dSYM
</span><span class="line">$ ./testcov
</span><span class="line">$ ls
</span><span class="line">default.profraw testcov         testcov.c       testcov.dSYM</span></code></pre></td></tr></table></div></figure>

<p>Instead of outputting GCOV data, this generates a file <code>default.profraw</code>,
which can be used with <code>llvm-profdata</code> and <code>llvm-cov</code>.</p>

<p>The way to use this file is to do something like</p>

<figure class="code"><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class="line-number">1</span>
<span class="line-number">2</span>
</pre></td><td class="code"><pre><code class=""><span class="line">$ llvm-profdata merge -o testcov.profdata default.profraw
</span><span class="line">$ llvm-cov show ./testcov -instr-profile=testcov.profdata testcov.c</span></code></pre></td></tr></table></div></figure>

<p>In case you were wondering: you <em>must</em> pass the raw profile data through
<code>llvm-profdata</code>.  It isn’t in the format <code>llvm-cov</code> wants, and apparently the
“merge” operation does more than just merging.</p>

<p>Also, you can change the name of the output file, either by setting the
<code>LLVM_PROFILE_FILE</code> environment variable, or by compiling your code with
<code>-fprofile-instr-generate=&lt;filename&gt;</code>.  This <em>is</em> mentioned in the help output
from the <code>clang</code> command, but doesn’t seem to be anywhere in the clang
documentation itself.</p>

<p><strong>In both cases</strong>, you need to pass the coverage options to the <code>clang</code> or
<code>clang++</code> driver when you are <em>linking</em> as well as when you are compiling.
This will cause <code>clang</code> to link with any libraries required by the profiling
system.  You <em>do not</em> need to explicitly link with a profiling library when
using <code>clang</code>.</p>

<p>One final remark: on Mac OS X, <code>gcov</code> will likely be in your path, but
<code>llvm-profdata</code> and <code>llvm-cov</code> will not; instead, you can access them via
Xcode’s <code>xcrun</code> tool.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Why NOT Have a Code of Conduct?]]></title>
    <link href="https://alastairs-place.net/blog/2016/04/20/why-not-have-a-code-of-conduct/"/>
    <updated>2016-04-20T09:34:51+01:00</updated>
    <id>https://alastairs-place.net/blog/2016/04/20/why-not-have-a-code-of-conduct</id>
    <content type="html"><![CDATA[<p>Having just seen a demand that WWDC adopt a formal Code of Conduct for its
attendees this year (rdar://25791520 if you want to dupe it, though please
give this post a read first), I thought I’d write a little to express my
thoughts about the Code of Conduct phenomenon (in more than 140 characters,
since that seems somehow inadequate).</p>

<p>Let me start by saying that it has always been the case that most conferences
reserved the right to eject you if you were in some way disruptive.  As private
events, they’re within their rights to do so (at least in Common Law countries),
and if they have the appropriate wording in their Terms &amp; Conditions they may
not even have to refund your money.</p>

<p>Let me also say that I am <em>not</em> in favour of allowing harassment or other bad
behaviour by conference attendees, and I realise that there will be situations
(e.g. where there are children present) where the organisers might want to
draw attention to the fact that attendees should keep to their best behaviour.</p>

<p>So what is this Code of Conduct thing about?  Well, a fair overview is
<a href="http://www.ashedryden.com/blog/codes-of-conduct-101-faq">this FAQ by Ashe Dryden</a>,
and there’s an example of the kind of thing we’re talking about
<a href="http://confcodeofconduct.com">on this website</a>.  To save time, I’d recommend
that you go and read those now, then come back if you’re still interested in
what I have to say.</p>

<p>OK, you’re back.  So <em>why</em> would anyone object to these things?</p>

<h2 id="necessity">Necessity.</h2>

<p>We’re grown-ups, right?  We should all, by now, know how to behave around
other people, and for those who don’t, we already have a set of rules that we’ve
collectively agreed upon that cover the worst kinds of harassment and bad
behaviour, namely <em>the law</em>, plus — as I already mentioned — most conferences
already reserve the right to remove you if you’re being disruptive.</p>

<p>I accept, for what it’s worth, that some people might find an explicit set of
rules reassuring.  Others, me included, do not.  Quite the opposite, in fact,
for reasons I’ll elucidate below.</p>

<h2 id="natural-justice">Natural Justice</h2>

<p>It’s commonly asserted that a problem with leaving this up to the law is that
the police “don’t have a great history of responding positively”, that
complainants may not wish to involve the police and that as a result it might
be better for conference organisers to deal with things themselves.</p>

<p>Except… conference organisers are not trained to deal with these types of
situations.  A lot of this is going to boil down to one person’s word against
another, and it’s very easy to allow your own personal biases to determine
your response.  Police officers are trained not to do that (not always
successfully, for sure, but they <em>are</em> at least trained); of course, that
does sometimes make people unhappy when they complain to the police, because
the police don’t seem to believe them — but that’s a misunderstanding.  The
function of the police is <em>not</em> to believe or to disbelieve, but to
<em>investigate</em>, and where there is evidence, to bring it before a court for
prosecution.</p>

<p>That courts of law require high standards of evidence — at least in Common Law
countries — is undeniable, and that’s because we’ve collectively agreed that
the principle should be that people are innocent until <em>proven</em> guilty.</p>

<p>This is <em>particularly</em> important in some of the areas we’re concerned with
here, because of the reputational impact on people subject to allegations of
sexism, racism or (worse) sexual assault, and the notion that the response to
allegations of that nature might be decided by conference organisers on the
basis of a low standard of evidence, without any right of appeal, <em>really
worries me</em>.</p>

<p>I know it’s also asserted that “false accusations are… incredibly rare”.  I’m
happy to believe that.  But there is a whole grey area of allegations that
might seem true from a certain point of view that isn’t necessarily shared by
all parties, and there are even situations where the accused and accusing
parties simply don’t know themselves what happened.</p>

<h2 id="out-of-venue-activities">Out-of-Venue Activities</h2>

<p>Ashe Dryden asserts that a “code of conduct should apply to any event where
your attendees may congregate”.  This seems generally problematic.</p>

<p>Certainly there are situations where conference organisers might need to get
involved; I accept that.  But it seems hard to justify extending the Code of
Conduct to all activities outside of those organised by the conference
organisers.</p>

<p>So, for instance, if someone misbehaves in a bar right outside the conference
venue, where there are a lot of conference attendees present, it is <em>totally</em>
appropriate for conference organisers to have words with that person.  Or,
actually, for anyone present to have words with that person.  But unless they
have broken the law, or upset the bar owner, you won’t be able to ban them
from hanging around in that bar, even if you kick them out of your conference.
And, furthermore, to the extent that you feel the Code of Conduct may
constrain their behaviour, it certainly won’t if you have invoked it to bar
them from the rest of your conference.</p>

<p>Equally, it seems preposterous to argue that the Code of Conduct should extend
to a shopping trip to a supermarket half way across town.  Or to e.g. a group
of attendees who decide to visit a strip club (not my cup of tea, but <em>some</em>
people clearly enjoy that kind of thing, and it’s very likely <em>effectively</em>
banned in the code of conduct you were thinking of using).</p>

<p>And then there are all kinds of questions about whether the Code of Conduct
protects people who are not conference attendees at all, or indeed <em>how</em> it
protects people who <em>are</em> conference attendees against those who are not
(hint: it doesn’t).</p>

<h2 id="scope">Scope</h2>

<p><em>What</em> should be banned?
<a href="http://confcodeofconduct.com">http://confcodeofconduct.com</a> suggests that
“harassment includes offensive verbal comments related to… technology
choices”!  So you could, in theory, be evicted from a conference for making
rude remarks about PHP (or, I suppose, for calling someone an idiot for using
it).  That seems a step too far, for sure.</p>

<p>In fact, while we’re about it, <em>what</em> constitutes an “offensive verbal
comment”?  Does it have to meet a reasonable person test?  Would it be
inappropriate to reproduce the cartoons of the Prophet Mohammed?  In all
circumstances?  Are you <em>sure</em>?  Does the whole of the community agree?  Or if
not, does everyone agree to compromise somehow?</p>

<p>And what exactly is “harassing photography”?  Some people are <em>very</em> sensitive
about having their photograph taken (even accidentally), and others <em>much</em>
less so.  Who decides?  Is there a right of appeal?  How many photographs does
one have to take before it becomes harassment?</p>

<p>It’s also worth reflecting that quite a bit of that code of conduct would ban
many well-respected and enjoyable comedy acts outright.</p>

<p>Again, please don’t misunderstand — I am all for conference staff taking
someone aside and explaining that they’re upsetting someone, asking them to
please be sensitive to that person’s concerns, and even if necessary warning
them that they will be ejected if they continue with their behaviour.  What
I’m trying to tease out here is that there is <em>a lot</em> of subjective judgement
involved, and attempting to codify this in a Code of Conduct is fraught with
danger.</p>

<h2 id="legal-certainty">Legal Certainty</h2>

<p>You might <em>think</em> that having a Code of Conduct would create some legal
certainty for organisers when they do decide to act, but if they use the one
at <a href="http://confcodeofconduct.com">http://confcodeofconduct.com</a> they could be
in for a nasty shock.  For instance, as it’s currently worded, it bans
“offensive verbal comments related to… sexual images in public spaces”, rather
than banning sexual images in public spaces as I’m sure its author intended.
Granted, it says “harassment includes…”, so we can be certain that the
definition is not exhaustive, but in cases where contracts are unclear, <em>Common
Law takes the view that they should not be interpreted in a way that
favours the party that drafted them</em>.  My <em>guess</em> is that in court you’d
find that they chose to use the legal definition of “harassment” (whatever
that may be) and then added in anything in the “includes” list, in which case
if you evicted someone for “following” and that person sued to recover their
conference fees (and potentially travel and legal expenses), you might well
find yourself out of luck and out of pocket.</p>

<p>Maybe that’s an argument for getting a lawyer to look over your Code of Conduct, but IMO, it
would have been much better to just put into the Terms and Conditions
that the organisers reserve the right to eject attendees for behaviour that
the organisers determine to be detrimental to other attendees or to the
conference as a whole.  I think you’d also want to make it clear what the
procedure for doing that should be — who had the right to make the
decision(s), whether there was an appeal process, under what circumstances
attendees’ money might be refunded and so on.  And on that subject, trying to
keep hold of the entire conference fee regardless is probably a bad idea; the
attendee’s credit card issuer is <em>very</em> likely to side with them, so if you’re
going to try to keep hold of it you’ll only want to do so in cases where you
have solid evidence of their misbehaviour.</p>

<h2 id="protecting-unpopular-people">Protecting “Unpopular” People</h2>

<p>Some people may be unpopular, or may hold views that are unpopular.  You can
certainly discuss this with them in advance if you think it will be a problem,
and ask them not to raise their unpopular views at your conference.  If they
aren’t relevant to the conference itself, they might even agree to that.</p>

<p>Anyway, there are two problems here; the first is that some people appear to
claim that the mere expression of a view with which they strongly disagree is
some form of harassment, in and of itself.  Indeed, there have even been
demands to ban certain people from certain conferences on the grounds that
people are aware that (or think that) they <em>hold</em> certain views, <em>even if</em>
they have promised not to express them at the conference.</p>

<p>The second problem is that unpopular people (or those with unpopular views)
are <em>far</em> more likely to be the targets of false — or at least <em>questionable</em>
— allegations.  I don’t want to pick individual people as examples, so I’ll
stick to generalising here: if a well-known feminist makes a joke about men,
it’s quite unlikely that anyone will complain, and even if they do, quite
unlikely that anyone will do anything about it.  If, however, a similar joke
about women was made by a man, I would <em>expect</em> there to be complaints, and I
would <em>expect</em> that Something Would Be Done.  (I’m not trying to be
anti-feminist here; I’m just observing that, right now, at least in
tech circles, a fairly muscular form of feminism is popular, and making any
remark that conflicts with or disagrees with that is not.)</p>

<p>It’s also worth reflecting that the first problem includes things like the
views Roman Catholics or Muslims hold about homosexuality, which certainly for
some people meet the definition of “offensive verbal comments related to
sexual orientation”.  While one might argue that people who hold those views
should keep them to themselves for politeness’ sake (and indeed most do), if
someone <em>knows</em> that they hold those kinds of views, they might be tempted to
try to goad them into expressing them in order to trigger the Code of Conduct
and get rid of those people from the conference.</p>

<p>The irony here is that the intention of advocates of Codes of Conduct is
generally to protect minorities, but that in practice they may in some cases
achieve the opposite.</p>

<h2 id="protecting-the-expression-of-unpopular-views-in-some-cases">Protecting the Expression of Unpopular Views (in some cases)</h2>

<p><em>Sometimes</em> it might actually be appropriate to prioritise freedom of speech
over someone else’s right to not be offended.  <em>Sometimes</em> it’s better to let
people debate points of view that they may find challenging or even downright
offensive.</p>

<p>I grant you that at most technology-related conferences, this won’t be
relevant, but I find Ashe Dryden’s assertion that this point can be
addressed by stating that “free speech laws do not apply to harassment”
overly simplistic, even leaving aside the obvious point that the United States
Constitution, wonderful as it is, doesn’t actually apply over most of the
surface of the Earth.  It occasionally does all of us good to hear views we
don’t like or agree with, even views we find offensive, if only because it
makes us think.</p>

<p>(FWIW, I can imagine that this might become a problem if you wanted to have a
conference talk or panel about gender politics in technology, which is
something of a live issue at the moment; it’s very likely going to involve,
one way or another, things that someone or other feels are “offensive verbal
comments related to gender”.  If you think not, imagine inviting e.g. Milo
Yiannopoulos to debate with Brianna Wu, assuming you could get them to sit on
the same stage.)</p>

<h2 id="tldr">TL;DR</h2>

<p>All of this is only my opinion, and hopefully I’ve explained above why I think
this way.</p>

<ul>
  <li>
    <p>Organisers <em>certainly</em> should have procedures to deal with poor behaviour by
attendees, or with situations where one attendee is upsetting another
somehow.</p>
  </li>
  <li>
    <p>It would be wise to put these procedures in the Terms and Conditions.</p>
  </li>
  <li>
    <p>It would be wise to train conference staff to follow these procedures
(e.g. insisting that they report complaints up the chain to organisers
until they reach someone you trust to deal with them sensibly).</p>
  </li>
  <li>
    <p>Trying to codify what constitutes good or bad behaviour creates problems,
and it’s probably better to use very general language in your Ts &amp; Cs,
instead of trying to write an explicit Code of Conduct.</p>
  </li>
  <li>
    <p>If someone breaks the law, or is alleged to have done so, you really should
consider letting the police deal with it, whatever your opinion of their
effectiveness might be.</p>
  </li>
  <li>
    <p>Attendees outside of your venue will be exposed to people who are not at
your conference and are not subject to your Code of Conduct at all anyway
(this potentially includes anyone you kick out for violating your Code of
Conduct).  As such, Codes of Conduct <em>do not</em> “protect” attendees.  At best,
if carefully drafted, they may protect conference organisers from future
lawsuits.</p>
  </li>
  <li>
    <p>Whether you have a Code of Conduct or not, you should consult a lawyer to
avoid creating problems for yourself down the road.</p>
  </li>
  <li>
    <p>There is nothing wrong with telling your attendees you expect them to behave
themselves, drawing to their attention the fact that there are children
present, telling them that you expect them not to stream pornography over
the conference WiFi and so on.  This is <em>not</em> the same as having a formal
Code of Conduct.</p>
  </li>
</ul>

<h2 id="still-tldr">Still TL;DR</h2>

<p>Codes of Conduct mainly protect the conference <em>organiser</em> (and only if they
are carefully worded); they don’t protect attendees. Defining what is and is
not acceptable is <em>hard</em>, and boils down to subjective judgement
anyway. Better to put procedures in place, stick them in your Ts &amp; Cs, and
train conference staff appropriately.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Symbolicating OS X Crash Logs]]></title>
    <link href="https://alastairs-place.net/blog/2015/12/30/symbolicating-os-x-crash-logs/"/>
    <updated>2015-12-30T12:26:00+00:00</updated>
    <id>https://alastairs-place.net/blog/2015/12/30/symbolicating-os-x-crash-logs</id>
    <content type="html"><![CDATA[<p>iOS developers have it easy; to symbolicate an iOS crash log, they can drop
the log onto the Organizer window in Xcode, and (in theory at least) it will
be symbolicated for them.</p>

<p>But on OS X, this doesn’t work.  Moreover, the <code>symbolicatecrash</code> Perl script
that iOS developers could use as an alternative doesn’t understand OS X crash
logs and so will refuse to process them.</p>

<p>You <em>could</em> try using Peter Hosey’s
<a href="https://bitbucket.org/boredzo/symbolicator">Symbolicator</a> package, but it’s a
bit buggy — looking at the code, Peter has misunderstood the “slide”, and it
also can’t cope with Xcode archives containing multiple dSYMs.  I did
contemplate fixing it and submitting a patch, but while I don’t want to be
unkind to Peter, I think I’d end up rewriting rather too much of it in the
process.</p>
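
<p>For anyone unfamiliar with the term, the “slide” is the difference between
the address at which an image was actually loaded (which the crash log records
in its Binary Images section) and the address at which it was linked.
Addresses in the backtrace need the slide subtracting before they can be looked
up in the dSYM.  A minimal sketch of the arithmetic in Python, with made-up
addresses and function names invented purely for illustration:</p>

<pre><code>def slide(load_address, linked_address):
    """How far the image moved from the address it was linked at."""
    return load_address - linked_address


def file_address(runtime_address, load_address, linked_address):
    """Undo the slide so that the address can be looked up in the dSYM."""
    return runtime_address - slide(load_address, linked_address)


# Made-up numbers: __TEXT linked at 0x100000000 but loaded at 0x10a3f2000.
print(hex(file_address(0x10a3f3f24, 0x10a3f2000, 0x100000000)))  # 0x100001f24
</code></pre>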

<p>You could also try
<a href="http://lldb.llvm.org/symbolication.html">LLDB’s symbolicator</a>, which you use
like this:</p>

<pre><code>$ lldb
(lldb) command script import lldb.macosx.crashlog
"crashlog" and "save_crashlog" command installed, use the "--help" option for detailed help
"malloc_info", "ptr_refs", "cstr_refs", and "objc_refs" commands have been installed, use the "--help" options on these commands for detailed help.
(lldb) crashlog /path/to/crash.log
</code></pre>

<p>This is actually really rather neat, or it would be if it worked.  Unlike
other symbolicators, it annotates the backtrace with your actual source code
(and/or in some cases disassembly) so that you can see where the crash took
place.  Additionally, if you run it as above, within lldb itself, it will set
up the memory map as if your program was loaded.  Very cool.</p>

<p>You will note that I said <em>if it worked</em>.  Because, out of the box, it does
not.  The first problem is that the version shipped by Apple relies on a
script, <code>dsymForUUID</code>, that is not provided and whose behaviour is not
documented anywhere.  I wrote something that should be suitable and put it up
<a href="https://pypi.python.org/pypi/dsymForUUID/0.1.0">on PyPI</a> so you can install
it with e.g.</p>

<pre><code>$ sudo -H pip install dsymForUUID
</code></pre>

<p>(But wait… you might not need to.)</p>

<p>The second problem is that it’s also a bit broken.  It chokes on some crash
logs because they contain tab characters rather than spaces, and it also only
loads the <code>__TEXT</code> segment in the correct place, which makes for a bit of fun
if you need to poke around in one of the other segments.</p>

<p>Anyway, I filed a bug report today about all of this, with a patch attached to
it that fixes these problems.  I’ve also put a copy of
<a href="https://bitbucket.org/al45tair/crashlog/raw/tip/crashlog.py">the fixed <code>crashlog.py</code> file here</a>
so you can download and use it.</p>

<p>In addition to the usage shown on the lldb website, you can, in fact, invoke
it directly from the Terminal prompt, e.g.</p>

<pre><code>$ crashlog.py /path/to/crash.log
</code></pre>

<p>which is a very convenient way to use it in many cases.  Likewise, if you want
to use this version rather than the built-in one, you just need to make sure
the directory containing it is in your <code>PYTHONPATH</code>; then you can do</p>

<pre><code>$ lldb
(lldb) command script import crashlog
</code></pre>

<p>to use it in lldb.</p>

<p>The fixed version does <em>not</em> require <code>dsymForUUID</code>, and indeed it’s rather
faster without it, but it <em>can</em> use a <code>dsymForUUID</code> script if you happen to
have one (e.g. because you work at Apple).  To use it with your custom
<code>dsymForUUID</code>, you need to set the <code>DSYMFORUUID</code> environment variable to the
full path of your script.</p>

<h2 id="update-2016-05-04">Update 2016-05-04</h2>

<p>I found an interesting bug in the symbolicator; I’ve uploaded a new
<code>crashlog.py</code> script that fixes it.</p>

<h2 id="update-2016-05-31">Update 2016-05-31</h2>

<p>I’ve moved <code>crashlog.py</code> to
<a href="https://bitbucket.org/al45tair/crashlog">Bitbucket</a>, and added support for
symbolicating the output of the <code>sample</code> command.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Apple Help in 2015]]></title>
    <link href="https://alastairs-place.net/blog/2015/01/14/apple-help-in-2015/"/>
    <updated>2015-01-14T13:26:03+00:00</updated>
    <id>https://alastairs-place.net/blog/2015/01/14/apple-help-in-2015</id>
    <content type="html"><![CDATA[<p>The last time I had to build a brand new help file was some time ago — maybe
even ten years ago — and in the world of software, that’s an age.</p>

<p>For the past few months I’ve been working hard on a new release of iDefrag,
version 5, and as part of this I’m rewriting the documentation.  Rather than
using hand-written HTML like I did before, I’ve chosen
this time around to use a documentation generator,
<a href="http://sphinx-doc.org">Sphinx</a>.  The advantages of this approach include:</p>

<ul>
  <li>
    <p>Built-in support for indexing and cross-referencing.</p>
  </li>
  <li>
    <p>The ability to write the documentation in plain text.</p>
  </li>
  <li>
    <p>Separation of the presentation details from the content (via theming and
templates).</p>
  </li>
  <li>
    <p>Support for multiple output formats, not just HTML.</p>
  </li>
</ul>

<p>The current version of Sphinx doesn’t directly support building Apple Help
Books, but I’ve
<a href="https://github.com/sphinx-doc/sphinx/pull/1675">submitted a pull request to fix that</a>
so hopefully by the time you read this you’ll be able to do</p>

<pre><code>$ sphinx-quickstart
</code></pre>

<p>fill in some fields and then do</p>

<pre><code>$ make applehelp
</code></pre>

<p>to generate a help book.</p>

<p>(If you <em>do</em> do that, you’ll want to edit your <code>conf.py</code> file quite a bit, and
you probably don’t want to use the default theme either.)</p>
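
<p>As a rough idea of the sort of thing that ends up in <code>conf.py</code>, here is a
fragment with placeholder values.  The option names are an assumption based on
the Apple Help builder’s settings as I understand them, so check them against
the documentation for the version of Sphinx you have installed:</p>

<pre><code># conf.py (fragment); every value here is a placeholder
applehelp_bundle_name = "SurfWriter"
applehelp_bundle_id = "com.mycompany.surfwriter.help"
applehelp_title = "SurfWriter Help"
applehelp_dev_region = "en-us"
applehelp_bundle_version = "108"
</code></pre>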

<p>Anyway, all of the <a href="http://sphinx-doc.org">Sphinx</a> related stuff was fine, and
worked as documented.  <em>Unlike Apple Help</em>, which doesn’t.  I spent <em>an entire
day</em> struggling to make a help book that actually worked, and most of that is
because of problems with the documentation.</p>

<p>Let’s start with the Info.plist.  Apple gives this not particularly helpful table:</p>

<table style="font-size: small">
  <thead>
    <tr>
      <th>Key</th>
      <th>Exact or sample value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>CFBundleDevelopmentRegion</td>
      <td>en_us</td>
    </tr>
    <tr>
      <td>CFBundleIdentifier</td>
      <td>com.mycompany.surfwriter.help</td>
    </tr>
    <tr>
      <td>CFBundleInfoDictionaryVersion</td>
      <td>6.0</td>
    </tr>
    <tr>
      <td>CFBundleName</td>
      <td>SurfWriter</td>
    </tr>
    <tr>
      <td>CFBundlePackageType</td>
      <td>BNDL</td>
    </tr>
    <tr>
      <td>CFBundleShortVersionString</td>
      <td>1</td>
    </tr>
    <tr>
      <td>CFBundleSignature</td>
      <td>hbwr</td>
    </tr>
    <tr>
      <td>CFBundleVersion</td>
      <td>1</td>
    </tr>
    <tr>
      <td>HPDBookAccessPath</td>
      <td>SurfWriter.html</td>
    </tr>
    <tr>
      <td>HPDBookIconPath</td>
      <td>shrd/SurfIcn.png</td>
    </tr>
    <tr>
      <td>HPDBookIndexPath</td>
      <td>SurfWriter.helpindex</td>
    </tr>
    <tr>
      <td>HPDBookKBProduct</td>
      <td>surfwriter1</td>
    </tr>
    <tr>
      <td>HPDBookKBURL</td>
      <td>https://mycompany.com/kbsearch.py?p='product'&amp;q='query'&amp;l='lang'</td>
    </tr>
    <tr>
      <td>HPDBookRemoteURL</td>
      <td>https://help.mycompany.com/snowleopard/com.mycompany.surfwriter.help/r1</td>
    </tr>
    <tr>
      <td>HPDBookTitle</td>
      <td>SurfWriter Help</td>
    </tr>
    <tr>
      <td>HPDBookType</td>
      <td>3</td>
    </tr>
    <tr>
      <td>HPDBookTopicListCSSPath</td>
      <td>sty/topiclist.css</td>
    </tr>
    <tr>
      <td>HPDBookTopicListTemplatePath</td>
      <td>sty/topiclist.xquery</td>
    </tr>
  </tbody>
</table>

<p>There are two serious problems with the table above.  The first is that some
of it is wrong(!), and the second is that it doesn’t indicate which values are
sample values and which are required.</p>

<p>Here’s what you actually need:</p>

<table style="font-size: small">
  <thead>
    <tr>
      <th>Key</th>
      <th>Value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>CFBundleDevelopmentRegion</td>
      <td><strong>en-us</strong></td>
    </tr>
    <tr>
      <td>CFBundleIdentifier</td>
      <td><em>your help bundle identifier</em></td>
    </tr>
    <tr>
      <td>CFBundleInfoDictionaryVersion</td>
      <td>6.0</td>
    </tr>
    <tr>
      <td>CFBundlePackageType</td>
      <td>BNDL</td>
    </tr>
    <tr>
      <td>CFBundleShortVersionString</td>
      <td><em>your short version string - e.g. 1.2.3 (108)</em></td>
    </tr>
    <tr>
      <td>CFBundleSignature</td>
      <td>hbwr</td>
    </tr>
    <tr>
      <td>CFBundleVersion</td>
      <td><em>your version - e.g. 108</em></td>
    </tr>
    <tr>
      <td>HPDBookAccessPath</td>
      <td><strong>_access.html</strong> <em>(see below)</em></td>
    </tr>
    <tr>
      <td>HPDBookIndexPath</td>
      <td><em>the name of your help index file</em></td>
    </tr>
    <tr>
      <td>HPDBookTitle</td>
      <td><em>the title of your help file</em></td>
    </tr>
    <tr>
      <td>HPDBookType</td>
      <td>3</td>
    </tr>
  </tbody>
</table>

<p>The first thing to note is that <code>CFBundleDevelopmentRegion</code> should have a
hyphen, <em>not</em> an underscore.  Apple’s utilities generate this properly, but
the documentation is wrong.</p>

<p>The second thing to note is that in spite of the documentation implying that
you can use your help bundle identifier to refer to your help bundle (which
would, admittedly, make sense), <em>you can’t</em>.  You need to use the <code>HPDBookTitle</code>
value.  Oh, and ignore any references to <code>AppleTitle</code> meta tags.  You don’t
need those.</p>

<p>The third thing relates to <code>HPDBookAccessPath</code>.  The file referred to there
<em>must</em> be a valid XHTML file.  In particular, it <em>cannot</em> be an HTML5 document
— that will simply not work, and the error messages you get on the system
console are completely uninformative.</p>

<p>The best solution I’ve come up with for this particular problem, as I want to
generate modern HTML output, is to make a file called <code>_access.html</code> and put
the following in it:</p>

<pre><code>&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head&gt;
    &lt;title&gt;Title Goes Here&lt;/title&gt;
    &lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8" /&gt;
    &lt;meta name="robots" content="noindex" /&gt;
    &lt;meta http-equiv="refresh" content="0;url=index.html" /&gt;
  &lt;/head&gt;
  &lt;body&gt;
  &lt;/body&gt;
&lt;/html&gt;
</code></pre>

<p>This means that both <code>helpd</code> and the help indexer (<code>hiutil</code>) are happy, and I
can write my index page using modern HTML.  Incidentally, Apple appears to be
using a similar trick in the help for the current version of Mail.  Obviously
you can change the <code>index.html</code> in the above to whatever you need.</p>
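
<p>Pulling the help bundle keys from the table above together, here is a
minimal sketch in Python that writes the plist out using the standard
<code>plistlib</code> module.  The SurfWriter values are just Apple’s sample names, the
version numbers are placeholders, and <code>write_help_info_plist</code> is a name made
up for illustration:</p>

<pre><code>import plistlib

# Keys and values follow the table above; substitute your own.
help_info = {
    "CFBundleDevelopmentRegion": "en-us",  # hyphen, not underscore
    "CFBundleIdentifier": "com.mycompany.surfwriter.help",
    "CFBundleInfoDictionaryVersion": "6.0",
    "CFBundlePackageType": "BNDL",
    "CFBundleShortVersionString": "1.2.3 (108)",
    "CFBundleSignature": "hbwr",
    "CFBundleVersion": "108",
    "HPDBookAccessPath": "_access.html",
    "HPDBookIndexPath": "SurfWriter.helpindex",
    "HPDBookTitle": "SurfWriter Help",
    "HPDBookType": "3",
}


def write_help_info_plist(path, info=help_info):
    """Write the help bundle's Info.plist as an XML property list."""
    with open(path, "wb") as f:
        plistlib.dump(info, f)
</code></pre>

<p>The resulting file belongs at <code>Contents/Info.plist</code> inside the help bundle.</p>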

<p>In your application bundle’s <code>Info.plist</code>, you need to fill in the following keys:</p>

<table style="font-size: small">
  <thead>
    <tr>
      <th>Key</th>
      <th>Value</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>CFBundleHelpBookFolder</td>
      <td><em>The path of your help book relative to <tt>Resources</tt> -
  e.g. SurfWriter.help</em></td>
    </tr>
    <tr>
      <td>CFBundleHelpBookName</td>
      <td><em>The value from <tt>HPDBookTitle</tt>, above</em></td>
    </tr>
  </tbody>
</table>

<p>Note that while the <code>HPDBookTitle</code> <em>is</em> displayed to the user, it can be
localised using <code>InfoPlist.strings</code>.  Note also that you <em>absolutely cannot</em>,
contrary to what the documentation implies, give a bundle ID here.  It just
doesn’t work.  You <em>could</em> however, if you wanted, write an
<code>InfoPlist.strings</code> file like this:</p>

<pre><code>HPDBookTitle = "SurfWriter Help";
</code></pre>

<p>then put the bundle ID in as the <code>HPDBookTitle</code> in the <code>Info.plist</code>.</p>

<p>Oh, and if you think you’re going to be able to double-click a help book to
preview it, think again.  That won’t work.  Instead, you need either to use it
from within your application, or you can put it in
<code>~/Library/Documentation/Help</code> (you might have to make that folder) and
double-click it in there.  Why?  Because help files are indexed and you can
only open them if they’re registered in the index.</p>

<p>One other thing that isn’t really documented at all is what exactly the
<code>HPDBookRemoteURL</code> will do for you.  There’s some handwaving about being able
to offer remote content updates, but how the URL is used is skirted over.
Well, if you <em>do</em> set <code>HPDBookRemoteURL</code>, Help Viewer will essentially expect
it to point at a copy of the <code>Resources</code> folder of your bundle; so if you have
<code>HPDBookRemoteURL</code> set to <code>http://example.com/foo/bar/</code>, then you’re going to
get requests like <code>http://example.com/foo/bar/en.lproj/index.html</code> (and so
on).</p>
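
<p>If you want to see which URLs that implies for your own help book, something
along these lines will list them.  This is only a sketch: it assumes Help
Viewer mirrors the relative paths exactly as described above, and
<code>remote_urls</code> is a made-up helper, not part of any Apple tool:</p>

<pre><code>import os
from urllib.parse import urljoin


def remote_urls(resources_dir, base_url):
    """Map each file under Resources to the URL it would be fetched from."""
    if not base_url.endswith("/"):
        base_url += "/"  # without the slash, urljoin would drop "bar"
    urls = []
    for root, _dirs, files in os.walk(resources_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), resources_dir)
            urls.append(urljoin(base_url, rel.replace(os.sep, "/")))
    return sorted(urls)
</code></pre>

<p>Given a <code>Resources</code> folder containing <code>en.lproj/index.html</code> and a base URL of
<code>http://example.com/foo/bar/</code>, the list will include
<code>http://example.com/foo/bar/en.lproj/index.html</code>.</p>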

<p><strong>Useful update (Feb 29th 2016)</strong></p>

<p>You may have noticed that Help Viewer has a button to toggle the table of
contents in your help file.  Matt Shepherd did a bit of work looking into this
and it turns out that it’s controlled by a JavaScript API;
<a href="https://gist.github.com/mattshepherd/54c66d38be90c2b1acf0">see Matt’s gist for more information</a>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[January VAT Changes and the VAT Threshold]]></title>
    <link href="https://alastairs-place.net/blog/2014/11/25/january-vat-changes-and-the-vat-exemption/"/>
    <updated>2014-11-25T12:27:47+00:00</updated>
    <id>https://alastairs-place.net/blog/2014/11/25/january-vat-changes-and-the-vat-exemption</id>
    <content type="html"><![CDATA[<p>I’ve just spotted <a href="https://www.change.org/p/vince-cable-mp-uphold-the-vat-exemption-threshold-for-businesses-supplying-digital-products">this petition</a>,
via a retweet from <a href="https://twitter.com/dancounsell">Dan Counsell</a>, and as a
member of HMRC’s Joint SME MOSS Working Group as well as the owner of a
microbusiness, I thought I’d make a couple of comments.</p>

<p>It isn’t particularly clear from the petition, but the problem being raised is
that in order to register for the Mini One Stop Shop in the UK, you currently
need to be registered for UK VAT.  <strong>This <em>is</em> something that we have been talking
to HMRC about</strong>, and I have the impression that HMRC is amenable, in
principle, to allowing non-VAT-registered entities to use the Mini One Stop
Shop system, though the details of that have not been worked out.</p>

<p>Note also that your sales here in the UK will continue to be subject to
ordinary UK VAT, and will <em>not</em> be reported through MOSS.  Even if your
UK-only sales are below the UK VAT threshold, it’s likely that you have
expenditure in the UK that involves an element of VAT, so you might want to
consider a voluntary registration in any event, in order to reclaim your input
tax.</p>

<p>(There is a related issue within the Mini One Stop Shop itself, in that there
are no thresholds for amounts reported via MOSS.  HMRC did try to negotiate a
threshold, but other member states didn’t support the idea and it was dropped.)</p>

<p>It is also worth pointing out that the Mini One Stop Shop is <strong>optional</strong>.
You don’t have to use it.  The alternatives are:</p>

<ul>
  <li>
    <p>Use a digital “marketplace” (e.g. Apple’s <a href="https://itunes.apple.com/gb/genre/ios/id36?mt=8">App Store</a>,
<a href="https://play.google.com/store?hl=en_GB">Google Play</a>,
<a href="https://www.paddle.com">Paddle</a>).  Marketplace operators, as of the 1st
of January 2015, are <em>required by law</em> to deal with EU VAT for you.  You
will only need to deal with B2B transactions between you and the store
operator.</p>
  </li>
  <li>
    <p>Register for VAT in EU member states into which you are selling.  This
will mean filing multiple VAT returns and complying fully with (up to) 28
different sets of VAT legislation.</p>
  </li>
  <li>
    <p>Use a distributor in EU member states you wish to sell into.  The
distributor is a business, so you only need worry about a B2B sale; B2C
sales will be made by the distributor within the member state(s) in which
it operates.</p>
  </li>
  <li>
    <p>Stop selling to other EU member states.</p>
  </li>
</ul>

<p>For a lot of digital micro-businesses, <strong>the best approach is likely to be to
use a digital marketplace</strong>.  MOSS gets you a single return and a single
payment; unlike using a marketplace or a distributor, <em>it does not free you
from the need to comply with up to 28 different sets of VAT rules</em>, though it
makes doing so considerably simpler in a number of ways.</p>

<p>As regards determining whether your sale is in the EU or not, with very few
exceptions (mostly having to do with e.g. mobile network operators, where
there is an obvious way to tell where the customer is) you need
to keep <em>two</em> non-contradictory pieces of information that identify your
customer’s location.  These might include, for instance</p>

<ul>
  <li>Your customer’s billing address</li>
  <li>The result of IP geolocation</li>
  <li>Your customer’s telephone number</li>
</ul>

<p>If those two pieces of information say your customer is outside the EU, then
it doesn’t matter (from your perspective) if the customer was really stood in
the middle of Brussels at the time; the rules say that you have done what is
expected of you.</p>
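
<p>To make the “two non-contradictory pieces” rule concrete, here is a rough
sketch in Python.  It is an illustration of the idea rather than tax advice;
the evidence names and the <code>customer_country</code> function are invented for the
example, and it simply returns the first country that two pieces of evidence
agree on:</p>

<pre><code>from itertools import combinations


def customer_country(evidence):
    """Return a country that two pieces of evidence agree on, or None."""
    # Ignore evidence we could not obtain (recorded as None or "").
    known = [(kind, country) for kind, country in evidence.items() if country]
    for (_, first), (_, second) in combinations(known, 2):
        if first == second:
            return first
    return None


print(customer_country({
    "billing_address": "DE",
    "ip_geolocation": "DE",
    "telephone_number": "FR",
}))  # DE
</code></pre>

<p>If it returns <code>None</code>, you don’t yet have two pieces of evidence that agree,
and would need to ask the customer for something more.</p>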
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[The Bash Bug]]></title>
    <link href="https://alastairs-place.net/blog/2014/09/25/the-bash-bug/"/>
    <updated>2014-09-25T09:38:47+01:00</updated>
    <id>https://alastairs-place.net/blog/2014/09/25/the-bash-bug</id>
    <content type="html"><![CDATA[<p>There are <a href="http://www.theregister.co.uk/2014/09/24/bash_shell_vuln/">lots</a>
of <a href="http://arstechnica.com/security/2014/09/bug-in-bash-shell-creates-big-security-hole-on-anything-with-nix-in-it/">scary</a>
<a href="http://www.cnet.com/news/bigger-than-heartbleed-bash-bug-could-leave-it-systems-shellshocked/">headlines</a> on the Internet today about a bug in <a href="http://www.gnu.org">the GNU Project</a>’s <a href="http://www.gnu.org/software/bash/">Bourne Again Shell</a>
(aka Bash).</p>

<p>Apparently, Bash allows subshells to inherit exported function definitions,
which it implements by passing environment variables with those functions’
names through to subshells, with the value of the variable containing the
function definition.  For instance</p>

<pre><code>outer$ function hello {
&gt; echo "Hello World"
&gt; }
outer$ export -f hello
outer$ PS1="inner$ " /bin/bash
inner$ hello
Hello World
inner$ exit
outer$ export -nf hello
</code></pre>

<p>In this case, the outer shell has exported the function <code>hello</code> to the inner
shell, by setting an environment variable <code>hello</code> to the string <code>() { echo
"Hello World"; }</code>.  We can test this:</p>

<pre><code>outer$ export hello='() { echo "Hello World"; }'
outer$ PS1="inner$ " /bin/bash
inner$ hello
Hello World
inner$ exit
outer$ export -n hello
</code></pre>

<p>On its own, this feature is only harmful if a user can specify the name <em>and</em>
content of an environment variable, and only then if some program is foolishly
trying to run commands without specifying their full path.  For example:</p>

<pre><code>outer$ ls='() { echo "No way, Jose"; }' PS1="inner$ " /bin/bash
inner$ ls
No way, Jose
inner$ /bin/ls
foo.txt    bar.txt
inner$ exit
</code></pre>

<p>However, current versions of Bash contain a bug that causes Bash to execute
any statements that trail the function definition in environment variables of
this form, so for example</p>

<pre><code>outer$ naughty='() { :;}; echo "Oh dear, oh dear"' PS1="inner$ " /bin/bash
Oh dear, oh dear
inner$ exit
</code></pre>

<p>In the above example, the inner shell runs the <code>echo</code> command.  It shouldn’t.</p>
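
<p>The same experiment can be scripted, which is handy if you have more than
one machine to look at.  A minimal sketch in Python, assuming Bash lives at
<code>/bin/bash</code>; the function name and the <code>VULNERABLE</code> marker are inventions
for the purposes of the example:</p>

<pre><code>import subprocess


def bash_runs_trailing_commands(bash="/bin/bash"):
    """Return True if this Bash executes code trailing a function definition."""
    env = {
        "PATH": "/usr/bin:/bin",
        # Same shape as the "naughty" variable in the transcript above.
        "naughty": "() { :;}; echo VULNERABLE",
    }
    result = subprocess.run(
        [bash, "-c", "true"],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    # A fixed Bash prints nothing here; an affected one runs the echo.
    return "VULNERABLE" in result.stdout
</code></pre>

<p>Calling <code>bash_runs_trailing_commands()</code> returns <code>True</code> on a Bash that has the
bug and <code>False</code> on one that doesn’t.</p>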

<p>Now, this <em>is</em> potentially a major security hole, but <em>only</em> in certain
circumstances, namely:</p>

<ol>
  <li>
    <p>If a user can set the value of an environment variable, <em>and</em></p>
  </li>
  <li>
    <p>Where a program passes control to a Bash shell and passes that value
through.</p>
  </li>
</ol>

<p>The two most common cases that you <em>might</em> find that allow remote exploitation
of this bug are CGI scripts (the old-fashioned kind, not FastCGI, and not
anything run via Apache’s mod_php, mod_perl or mod_python) and OpenSSH if you
were relying on the <code>ForceCommand</code> feature to provide restricted SSH
access. <code>sudo</code>, fortunately, already strips out Bash exported functions (and
has done since 2004), so is not affected.</p>
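
<p>To make the idea behind <code>sudo</code>’s protection concrete, here is a minimal sketch in Python of dropping anything that looks like an exported function before a shell is started.  It assumes that such values begin with <code>()</code>, as in the examples above; <code>strip_bash_functions</code> and <code>run_with_clean_env</code> are names made up for illustration, and this is not <code>sudo</code>’s actual code.</p>

```python
import os
import subprocess

# Assumption: an exported Bash function shows up in the environment as a
# value that starts with "()", as in the examples above.
FUNCTION_PREFIX = "()"


def strip_bash_functions(env):
    """Return a copy of env without values that look like Bash functions."""
    return {
        name: value
        for name, value in env.items()
        if not value.startswith(FUNCTION_PREFIX)
    }


def run_with_clean_env(argv):
    """Run argv with exported Bash functions removed from the environment."""
    return subprocess.run(argv, env=strip_bash_functions(os.environ))
```

<p>A filter like this is only a mitigation for programs that have to call out to a shell; it is no substitute for installing a fixed Bash.</p>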

<p>Put another way, unless you have very old code running on your web servers,
and unless you are doing something like running a public SSH server that
allows restricted log-ins (e.g. to run Git or Subversion via SSH, but nothing
else), the chances are that you <em>aren’t</em> vulnerable to remote exploits based
on this.  You <em>should check</em>, but you <em>should not</em> panic.</p>
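
<p>One way to do that check is to start Bash with a variable shaped like the <code>naughty</code> example and see whether the trailing command runs.  The sketch below assumes Python 3.7 or later; the helper name <code>is_vulnerable</code>, the variable name <code>probe</code> and the marker text are all arbitrary.</p>

```python
import subprocess

# Same shape as the "naughty" variable above: a function definition
# followed by a trailing command that a fixed Bash must not run.
PROBE = "() { :;}; echo VULNERABLE"


def is_vulnerable(bash_path="/bin/bash"):
    """Return True or False, or None if bash_path cannot be run at all."""
    try:
        result = subprocess.run(
            [bash_path, "-c", "echo probe finished"],
            env={"PATH": "/usr/bin:/bin", "probe": PROBE},
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # An affected Bash runs the trailing echo while importing the variable.
    return "VULNERABLE" in result.stdout
```

<p>It returns <code>None</code> rather than guessing when the given shell cannot be started.</p>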
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Twitter Is Not Private Chat]]></title>
    <link href="https://alastairs-place.net/blog/2014/09/04/twitter-is-not-private-chat/"/>
    <updated>2014-09-04T11:55:35+01:00</updated>
    <id>https://alastairs-place.net/blog/2014/09/04/twitter-is-not-private-chat</id>
    <content type="html"><![CDATA[<p>Let me say that again: <strong>Twitter is not private chat.</strong></p>

<p>Why do I say this?  Well, because it seems there are people out there who confuse Twitter with services like Glassboard, and think that people they don’t know shouldn’t respond to their tweets.  Or maybe it’s just people who disagree with them; it’s unclear.</p>

<p>There are a few important facts that such people need to be made aware of:</p>

<ol>
  <li>
    <p>People who follow them may retweet their tweets.  As a result they may very well be seen by people who do not follow them, who they do not know and who might disagree with whatever opinion they’ve expressed.</p>
  </li>
  <li>
    <p>By default, your tweets are public.  That being the case, tweeting is like standing on a soap box at Hyde Park Corner, talking loudly to all who will listen.  You don’t get to pick your audience.</p>
  </li>
  <li>
    <p>If you say something on Twitter (or indeed from a soap box at Hyde Park Corner), and someone who sees your tweet (or is listening to you) finds it interesting or controversial, they have every right to reply.  Your “conversation” is not private in any way, shape or form; indeed, it is not actually a conversation.</p>
  </li>
</ol>

<p>If you don’t like the above facts, <em>Twitter has a mode for you</em>; <a href="https://support.twitter.com/articles/14016-about-public-and-protected-tweets">set your account to “protected” tweet mode</a>.  At that point, you <em>do</em> get to screen your followers, who can’t retweet you.</p>

<p>Yes, there are downsides to protected tweet mode.  If you don’t like the way Twitter works, and you don’t want to protect your tweets, post to a blog instead and turn comments off.  Or use a private group chat system like Glassboard.  Alternatively, you will simply have to live with it.</p>

<p>Finally, if you ask on Twitter why people are replying to you when you don’t want them to, and someone points out all of the above, there is absolutely no excuse for threatening or abusing them.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Code-points Are a Red Herring]]></title>
    <link href="https://alastairs-place.net/blog/2014/06/17/code-points-are-a-red-herring/"/>
    <updated>2014-06-17T10:58:30+01:00</updated>
    <id>https://alastairs-place.net/blog/2014/06/17/code-points-are-a-red-herring</id>
    <content type="html"><![CDATA[<p>Having just read <a href="http://www.raywenderlich.com/73997/swift-language-highlights">Matt Galloway’s article about Swift from an Objective-C developer’s perspective</a>, I have a few things to say, but the most important of them is really nothing to do with Swift, but rather has to do with a common misunderstanding.</p>

<p>Let me summarise my conclusion first, and then explain why I came to it a long time ago, and why it’s relevant to Swift.</p>

<p><strong>If you are using Unicode strings, they should be (or at least look like they are) encoded in UTF-16.</strong></p>

<p>“But code-points!”, I hear you cry.</p>

<p>Sure.  If you use UTF-16, you can’t straightforwardly index into the string on a code-point basis.  But why would you want to do that?  The only justification I’ve ever heard is based around the notion that code-points somehow correspond to characters in a useful way.  Which they don’t.</p>

<p>Now, someone is going to object that UTF-16 means that all their English language strings are twice as large as they need to be.  But if you do what Apple did in Core Foundation and allow strings to be represented in ASCII (or more particularly in ISO Latin-1 or any subset thereof), converting to UTF-16 <em>on the fly</em> at the API level is trivial.</p>

<p>What about UTF-8?  Why not use that?  Well, if you stick to ASCII, UTF-8 is compact.  If you include ISO Latin-1, UTF-8 is never larger than UTF-16.  The problem comes with code-points that are inside the BMP, but have code-point values of <code>0x800</code> and above.  Those code-points take <em>three bytes</em> to encode in UTF-8, but only two in UTF-16.  For the most part this affects Oriental and Indic languages, though Eastern European languages and Greek are affected to some degree, as are mathematical symbols and various shape and dingbat characters.</p>
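
<p>The size argument is easy to check for yourself.  This Python sketch encodes a few sample strings both ways (without a byte order mark); the samples and the helper name <code>encoded_sizes</code> are arbitrary choices made for illustration.</p>

```python
def encoded_sizes(text):
    """Return (UTF-8 bytes, UTF-16 bytes) for text, with no byte order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


samples = {
    "ASCII": "hello",
    "Latin-1": "caf\u00e9",
    "Japanese": "\u65e5\u672c\u8a9e",
    "Devanagari": "\u0915\u0942",
}

for label, text in samples.items():
    utf8, utf16 = encoded_sizes(text)
    print(f"{label}: UTF-8 {utf8} bytes, UTF-16 {utf16} bytes")
```

<p>The Japanese sample comes out at nine bytes in UTF-8 against six in UTF-16, while the ASCII sample is five against ten.</p>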

<p>So, first off, UTF-8 is not necessarily any smaller than UTF-16.</p>

<p>Second, and this is an important one too, UTF-8 permits a variety of invalid encodings that can create security holes or cause other problems if not dealt with.  For instance, you can encode NUL (code-point 0) in any of the following ways:</p>

<pre><code>00
c0 80
e0 80 80
f0 80 80 80
</code></pre>

<p>Some older decoders may also accept</p>

<pre><code>f8 80 80 80 80
fc 80 80 80 80 80
</code></pre>

<p>Officially, only the first encoding (<code>00</code>) is valid, but <em>you</em> as a developer need to check for and reject the other encodings.  Additionally, any encoding of the code-points <code>d800</code> through <code>dfff</code> is invalid and should be rejected — a <em>lot</em> of software fails to spot these and lets them through.</p>
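
<p>If you would rather not write those checks by hand, a strict decoder will do them for you.  As a small illustration, Python 3’s built-in UTF-8 codec refuses both the over-long forms above and encoded surrogates; <code>is_valid_utf8</code> is just a hypothetical wrapper name.</p>

```python
# The over-long encodings of NUL listed above
OVERLONG_NUL = [
    bytes.fromhex("c080"),
    bytes.fromhex("e08080"),
    bytes.fromhex("f0808080"),
]

# What a naive encoder would produce for the surrogate code-point d800
ENCODED_SURROGATE = bytes.fromhex("eda080")


def is_valid_utf8(data):
    """Return True only if data decodes as strictly valid UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


for candidate in [b"\x00"] + OVERLONG_NUL + [ENCODED_SURROGATE]:
    print(candidate.hex(), is_valid_utf8(candidate))
```

<p>Only the plain <code>00</code> byte is accepted; everything else in the list is rejected.</p>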

<p>Finally, if you start in the middle of a UTF-8 string, you may need to move a variable number of bytes to find the character you’re in, and you can’t tell in advance how many that will be.</p>

<p>For UTF-16, the story is much simpler.  Once you’ve settled on the byte order, you really only need to watch out for broken surrogate pairs (i.e. use of <code>d800</code> through <code>dfff</code> that doesn’t comply with the rules).  Otherwise, you’re in pretty much the same boat as you would be if you’d picked UCS-4, except that in the majority of cases you’re using 2 bytes per code-point, and <em>at most</em> you’re using 4, so you <em>never use more than UCS-4 would</em> to encode the same string.</p>

<p>If you have a pointer into a UTF-16 string, you may <em>at most</em> need to move one code unit back, and that only happens if the code unit you’re looking at is between <code>dc00</code> and <code>dfff</code>.  That’s a much simpler rule than the one for UTF-8.</p>
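
<p>That rule fits in a few lines.  The following is a sketch in Python that works on a list of 16-bit code units and assumes the input is well-formed UTF-16; <code>utf16_units</code> and <code>code_point_start</code> are illustrative names rather than any real API.</p>

```python
def utf16_units(text):
    """Split text into a list of 16-bit UTF-16 code units."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def code_point_start(units, index):
    """Return the index at which the code-point containing units[index] starts.

    Only a low surrogate (dc00 through dfff) can be the second half of a
    pair, so that is the only case where we need to step back.
    """
    is_low_surrogate = units[index] in range(0xDC00, 0xE000)
    if is_low_surrogate and index != 0:
        return index - 1
    return index
```

<p>For a string such as <code>utf16_units("a\U0001F600b")</code>, indexes 1 and 2 both map back to 1, and every other index maps to itself.</p>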

<p>I can hear someone at the back still going “but code-points…”.  So let’s compare code-points with what the end user thinks of as characters and see how we get on, shall we?</p>

<p>Let’s start with some easy cases:</p>

<pre><code>0 - U+0030
A - U+0041
e - U+0065
</code></pre>

<p>OK, they’re straightforward.  How about</p>

<pre><code>é - U+00E9
</code></pre>

<p>Seems OK, doesn’t it?  But it could also be encoded</p>

<pre><code>é - U+0065 U+0301
</code></pre>

<p>Someone is now muttering about how “you could deal with that with normalisation”. And they’re right.  But you can’t deal with <em>this</em> with normalisation:</p>

<pre><code>ē̦ - U+0065 U+0304 U+0326
</code></pre>

<p>because there isn’t a precomposed variant of that character.</p>

<p>“Yeah”, you say, “but nobody would ever need that”.  Really?  It’s a valid encoding, and someone somewhere probably would like to be able to use it.  Nevertheless, to deal with that objection, consider this:</p>

<pre><code>בְּ - U+05D1 U+05B0 U+05BC
</code></pre>

<p>That character <em>is</em> in use in Hebrew.  And there are other examples, too:</p>

<pre><code>कू - U+0915 U+0942
कष - U+0915 U+0937
</code></pre>

<p>The latter case is especially interesting, because whether you see a single glyph or two depends on the font and on the text renderer that your browser is using(!)</p>

<p>The fact is that code-points don’t buy you much.  The end user is going to expect all of these examples to count as a single “character” (except, <em>possibly</em>, for the last one, depending on how it’s displayed to them on screen).  They are <em>not interested</em> in the underlying representation you have to deal with, and they will not accept that you have any right to define the meaning of the word “character” to mean “Unicode code-point”.  The latter simply does not mean anything to a normal person.</p>

<p>Now, sadly, the word “character” has been misused so widely that the Unicode consortium came up with a new name for the-thing-that-end-users-might-regard-as-a-unit-of-text.  They call these things <em>grapheme clusters</em>, and in general they consist of a sequence of code-points of essentially arbitrary length.</p>
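
<p>To see how little a code-point count tells you, here is a short Python sketch that counts code-points in some of the examples above before and after NFC normalisation.  It uses only the standard <code>unicodedata</code> module, which (as far as I know) has no grapheme cluster support, so it can show the problem but not solve it; the labels and the helper name are made up.</p>

```python
import unicodedata

examples = {
    "e-acute, precomposed": "\u00e9",
    "e-acute, decomposed": "e\u0301",
    "e, macron, comma below": "e\u0304\u0326",
    "bet, sheva, dagesh": "\u05d1\u05b0\u05bc",
}


def code_points_after_nfc(text):
    """Number of code-points left once text has been normalised to NFC."""
    return len(unicodedata.normalize("NFC", text))


for label, text in examples.items():
    print(f"{label}: {len(text)} raw, {code_points_after_nfc(text)} after NFC")
```

<p>Normalisation collapses the decomposed e-acute to a single code-point, but the last two examples still need more than one each, even though a reader would call each of them one character.</p>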

<p>Note that the reason people think using code-points will help them is that they are under the impression that a code-point maps one-to-one with some kind of “character”.  It does not.  As a result, you <em>already</em> have to deal with the fact that one “character” does not take up one code unit, even if you chose to use the Unicode code-point itself as your code unit.  <strong>So you <em>might as well use UTF-16</em>: it’s no more complicated for you to implement, and it’s <em>never</em> larger than UCS-4.</strong></p>

<p>It’s worth pointing out at this point that this is the exact choice that the developers of ICU (the Unicode reference implementation) and Java (whose string implementation derives from the same place) made.  It’s also the choice that was made in Objective-C and Core Foundation.  And it’s the <em>right</em> choice.  UTF-8 is more complicated to process and is not, actually, smaller for many languages.  If you want compatibility with ASCII, you can always allow some strings to be Latin-1 underneath and expand them to UTF-16 on the fly.  UCS-4 is always larger and actually no easier to process because of combining character sequences and other non-spacing code-points.</p>

<p>Why is this relevant to Swift?  Because in <a href="http://www.raywenderlich.com/73997/swift-language-highlights">Matt Galloway’s article</a>, it says:</p>

<blockquote>
  <p>Another nugget of good news is there is now a builtin way to calculate the true length of a string.</p>
</blockquote>

<p>Only what Matt Galloway means by this is that it can calculate <em>the number of code-points</em>, which is a figure that is almost completely useless for any practical purpose I can think of.  The <em>only</em> time you might care about that is if you were converting to UCS-4 and wanted to allocate a buffer of the correct size.</p>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Async in Swift]]></title>
    <link href="https://alastairs-place.net/blog/2014/06/12/async-in-swift/"/>
    <updated>2014-06-12T16:43:39+01:00</updated>
    <id>https://alastairs-place.net/blog/2014/06/12/async-in-swift</id>
    <content type="html"><![CDATA[<p>You may have seen <a href="https://alastairs-place.net/blog/2014/06/09/c-number-like-async-in-swift/">this piece I wrote about implementing something like C#’s async/await in Swift</a>.  While that code did work, it suffers from a couple of problems relative to what’s available in C#.  The first problem is that it only supports a single return type, <code>Int</code>, because of a problem with the current version of the Swift compiler.</p>

<p>The second problem is that you can’t use it from the main thread in a Cocoa or Cocoa Touch program, because <code>await</code> blocks.</p>

<p>As <a href="https://twitter.com/al45tair/status/476031352434077696">I mentioned previously</a> <a href="https://twitter.com/al45tair/status/476116595295924224">on Twitter</a>, making it work really well involves some shenanigans with the stack.  Anyway, I’m pleased to announce that I’ve been merrily hacking away and as a result you can <a href="https://bitbucket.org/al45tair/async">download a small framework project that implements async/await from BitBucket</a>.</p>

<p>I’m quite pleased with the syntax I’ve managed to construct for this as well; it looks almost as if it’s a native language feature:</p>

<figure class="code"><div class="highlight"><table><tr><td class="code"><pre><code class=""><span class="line">let task = async { () -&gt; () in
</span><span class="line">  let fetch = async { (t: Task&lt;NSData&gt;) -&gt; NSData in
</span><span class="line">    let req = NSURLRequest(URL: NSURL.URLWithString("http://www.google.com"))
</span><span class="line">    let queue = NSOperationQueue.mainQueue()
</span><span class="line">    var data: NSData! = nil
</span><span class="line">    NSURLConnection.sendAsynchronousRequest(req,
</span><span class="line">                                            queue:queue,
</span><span class="line">      completionHandler:{ (r: NSURLResponse!, d: NSData!, error: NSError!) -&gt; Void in
</span><span class="line">        data = d
</span><span class="line">        Async.wake(t)
</span><span class="line">      })
</span><span class="line">    Async.suspend()
</span><span class="line">    return data!
</span><span class="line">  }
</span><span class="line">
</span><span class="line">  let data = await(fetch)
</span><span class="line">  let str = NSString(bytes: data.bytes, length: data.length,
</span><span class="line">                     encoding: NSUTF8StringEncoding)
</span><span class="line">
</span><span class="line">  println(str)
</span><span class="line">}</span></code></pre></td></tr></table></div></figure>

<p>Now, to date I haven’t actually tried it on iOS; I think it should work, but it’s possible that it will crash <em>horribly</em>.  It is certainly working on OS X, though.</p>

<p>How does it work?  Well, behind the scenes, when you use the <code>async</code> function, a new (<em>very small</em>) stack is created for your code to run in.  The C code then uses <code>_setjmp()</code> and <code>_longjmp()</code> to switch between different contexts when necessary.  If you want to cringe slightly now, be my guest :-)</p>
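
<p>If the suspend and wake dance is hard to picture, the following Python sketch may help.  It is only an analogy: Python has no way to switch stacks by hand, so a generator stands in for the private stack, <code>yield</code> plays the part of <code>Async.suspend()</code> and <code>wake()</code> the part of <code>Async.wake(t)</code>.  The <code>Task</code> class, <code>fetch</code> and <code>on_data</code> here are invented for the example and are not the framework’s code.</p>

```python
class Task:
    """A suspendable computation; a generator stands in for the private stack."""

    def __init__(self, body):
        self.result = None
        self.done = False
        self._gen = body(self)
        self._resume()  # run the body up to its first suspend point

    def _resume(self):
        try:
            next(self._gen)
        except StopIteration as stop:
            self.result = stop.value
            self.done = True

    def wake(self):
        """Continue the body from where it last suspended."""
        if not self.done:
            self._resume()


def fetch(task):
    """Hypothetical body: wait for a callback to deliver some data."""
    received = []

    def on_data(data):
        received.append(data)
        task.wake()  # like Async.wake(t)

    task.on_data = on_data  # a real program would hand this to an async API
    yield  # like Async.suspend()
    return received[0]
```

<p>Creating <code>Task(fetch)</code> runs the body as far as the <code>yield</code>; calling <code>on_data</code> later resumes it and fills in <code>result</code>, without a second thread being involved.</p>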

<p>Possible improvements when I get the time:</p>

<ul>
  <li>Reduce the cost of async invocation by caching async context stacks</li>
  <li>Once Swift is fixed, remove the <code>T[]</code> hack that we’re using instead of declaring the result type in the <code>Task&lt;T&gt;</code> object as <code>T?</code>.  The latter presently doesn’t work because of a compiler limitation.</li>
</ul>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[C#-like Async in Swift]]></title>
    <link href="https://alastairs-place.net/blog/2014/06/09/c-number-like-async-in-swift/"/>
    <updated>2014-06-09T18:16:50+01:00</updated>
    <id>https://alastairs-place.net/blog/2014/06/09/c-number-like-async-in-swift</id>
    <content type="html"><![CDATA[<p>Justin Williams was <a href="http://carpeaqua.com/2014/06/08/the-next-five-years/">wishing for C#-like async support</a> in Swift.  I think it’s possible to come up with a fairly straightforward implementation in Swift, without any changes to the compiler, and actually without any hacking either. (If it weren’t for compiler bugs, the code below would be more than just a toy implementation too…)</p>

<p>Anyway, here goes:</p>

<figure class="code"><div class="highlight"><table><tr><td class="code"><pre><code class=""><span class="line">import Dispatch
</span><span class="line">
</span><span class="line">var async_q : dispatch_queue_t = dispatch_queue_create("Async queue",
</span><span class="line">  DISPATCH_QUEUE_CONCURRENT)
</span><span class="line">
</span><span class="line">/* If generics worked, we'd use Task&lt;T&gt; here and result would be of type T? */
</span><span class="line">class Task {
</span><span class="line">  var result : Int?
</span><span class="line">  var sem : dispatch_semaphore_t = dispatch_semaphore_create(0)
</span><span class="line">    
</span><span class="line">  func await() -&gt; Int {
</span><span class="line">    dispatch_semaphore_wait(sem, DISPATCH_TIME_FOREVER)
</span><span class="line">    return result!
</span><span class="line">  }
</span><span class="line">}
</span><span class="line">
</span><span class="line">func await(task: Task) -&gt; Int {
</span><span class="line">  return task.await()
</span><span class="line">}
</span><span class="line">
</span><span class="line">func async(b: () -&gt; Int) -&gt; Task {
</span><span class="line">  var r = Task()
</span><span class="line">  
</span><span class="line">  dispatch_async(async_q, {
</span><span class="line">    r.result = b()
</span><span class="line">    dispatch_semaphore_signal(r.sem)
</span><span class="line">    })
</span><span class="line">  
</span><span class="line">  return r
</span><span class="line">}
</span><span class="line">
</span><span class="line">/* Now use it */
</span><span class="line">func Test2(var a : Int) -&gt; Task { return async {
</span><span class="line">  sleep(1)
</span><span class="line">  return a * 7
</span><span class="line">  }
</span><span class="line">}
</span><span class="line">
</span><span class="line">func Test(var a : Int) -&gt; Task { return async {
</span><span class="line">  var t2 = Test2(a)
</span><span class="line">  var b = await(t2)
</span><span class="line">  
</span><span class="line">  return a + b
</span><span class="line">  }
</span><span class="line">}
</span><span class="line">
</span><span class="line">var t = Test(100)
</span><span class="line">
</span><span class="line">println("Waiting for result")
</span><span class="line">
</span><span class="line">for n in 1..10 {
</span><span class="line">  println("I can do work here while the function works.")
</span><span class="line">}
</span><span class="line">
</span><span class="line">var result = await(t)
</span><span class="line">
</span><span class="line">println("Result is available")</span></code></pre></td></tr></table></div></figure>

<p>Now, obviously if Swift supported continuations, this might be done more
efficiently (i.e. without any background threads or semaphores), but that’s an
implementation detail.</p>

<p>There are also some syntax changes that would make it cleaner, notably if it
was permissible to remove the <code>{ return</code> and <code>}</code> from the async function
declarations.  I did briefly try to see whether I was allowed to assign to a
function, à la</p>

<pre><code>func Test(var a : Int) -&gt; Task = async { }
</code></pre>

<p>but that syntax isn’t allowed (if it was, <code>async</code> would obviously need to return
a block).</p>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[1st January 2015 VAT Changes]]></title>
    <link href="https://alastairs-place.net/blog/2014/04/05/1st-january-2015-vat-changes/"/>
    <updated>2014-04-05T09:22:33+01:00</updated>
    <id>https://alastairs-place.net/blog/2014/04/05/1st-january-2015-vat-changes</id>
    <content type="html"><![CDATA[<p>On the 1st of January 2015, some changes to European Union law come into force
that significantly affect the way that VAT works for “electronic services”
delivered to consumers.  The laws in question were actually changed back in 2008,
but because of obstruction from some member states that benefit from the status
quo, the date at which they came into effect was pushed back by six years.</p>

<p><strong>If you are a software developer selling software in the European Union,
these changes matter to you.</strong> There has been very little publicity thus
far about these changes (that will change as we get closer to the end of the
year), but given that you may need to make changes to your website, it seems
like a good idea to tell you about them now.</p>

<p>So, what’s changing?  Currently, if you are established in the European Union
and you sell downloadable software to a customer who is also in the European
Union, you always charge VAT in <em>your</em> country, following the rules in your
country, and you pay it to the tax authority in your country.  This is simple,
because there is only one set of rules to follow, and it’s the one for your
country.</p>

<p>As of the 1st of January, the VAT will instead be due in the <em>customer’s</em>
country.  If there were no other changes to the rules, you would therefore be
obliged to register for VAT in other member states, according to their rules,
and submit multiple returns every quarter (or at whatever period they
specify).  That means you might have to register with up to 28 member states,
apply 28 different rates, 28 different sets of rules, make 28 times as many
VAT returns and 28 separate payments in different currencies (with currency
conversions and rounding following different rules in different
jurisdictions).  For a small software company or an independent developer,
this is clearly not going to work.</p>

<p>There are two other changes that are also coming in at the same time that
mitigate this problem.  The first is that app stores will be responsible for
charging and remitting consumer VAT.  Apple already does this, but some other
app stores may not.  Under the new rules, they will have to, so you will only
have to deal with VAT as it applies to transactions between you and the app
store provider.</p>

<p>If you sell direct to consumers, that doesn’t really help, though.  What
<em>will</em> help is that EU member states are going to operate a system known as
the Mini One Stop Shop (or MOSS for short).  This is similar to the scheme
that has been operating for businesses outside of the EU selling to EU
customers, whereby you can register with a single tax authority, submit a
single return to that tax authority, and pay all of the tax due to that one
place.  You are still required to charge VAT at the rate applicable in the
customer’s country, and in various respects the rules in that country will
still apply — with some simplifications.  Registration for this new scheme
starts in October, and, unless you plan on only selling via an app store, you
will probably want to register for it.</p>

<p>The other slight complication is that after 1st of January, you will need to
keep two non-conflicting pieces of evidence to identify the location of your
customer.  HMRC has indicated, at least in the case of the U.K., that they
will be fairly relaxed about this evidence — so, for instance, they realise
that IP geolocation may not be 100% accurate, and that some customers may lie
and give you false details.  It also does not matter if you have more data
that conflicts with your two non-conflicting pieces of evidence; all you need
is those two.  However, this affects <em>all</em> of your sales, not just those to
customers in the EU, since it applies equally to your decision not to charge
VAT to customers because they are not in any EU member state.</p>

<p>Why am I telling you about this?  Because I’m a member of H.M. Revenue and
Customs’ MOSS Joint SME Business/HMRC Working Group.  If you are in
the UK and have queries about the scheme, or issues you would like to
raise with HMRC, please do get in touch and I’ll try to help out.  (If you are
a member of <a href="http://www.tiga.org">TIGA</a>, they have a couple of
representatives on the working group also, so you can talk to them too.)</p>

<p>Finally, I will add that the law changes are already made — back in 2008 — so
the scope for changing the rules at this stage is very limited.  What we <em>can</em>
influence to some extent is how they’re enforced and whether HMRC is aware of
problems the new rules may cause us.</p>

<p>I’ll be posting some more on this topic over the coming weeks and months.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Dmgbuild - Build '.dmg' Files From the Command Line]]></title>
    <link href="https://alastairs-place.net/blog/2014/02/17/dmgbuild/"/>
    <updated>2014-02-17T10:54:30+00:00</updated>
    <id>https://alastairs-place.net/blog/2014/02/17/dmgbuild</id>
    <content type="html"><![CDATA[<p>I’ve just released a new command line tool, <code>dmgbuild</code>, that automates the
creation of (nice looking) disk images from the command line.  There are no
GUI tools necessary; there is no AppleScript, and it doesn’t rely on Finder,
or on any deprecated APIs.</p>

<p>Why use this approach?  Well, because everything about your disk image is
defined in a plain text file, you’ll get the <em>same</em> results every time; not
only that, but the resulting image will be the same no matter what version of
Mac OS X you build it on.</p>

<p>If you’re interested, the
<a href="https://pypi.python.org/pypi/dmgbuild">Python package is up on PyPI</a>, so you
can just do</p>

<pre><code>pip install dmgbuild
</code></pre>

<p>to get the program (if you don’t have pip, do <code>easy_install pip</code> first; or
download it from PyPI, extract it, then run <code>python setup.py install</code>).
You can also
<a href="http://dmgbuild.rtfd.org">read the documentation</a>, or
<a href="http://bitbucket.org/al45tair/dmgbuild">see the code</a>.</p>

<p>It’s <em>really</em> easy to use; all you need do is make a settings file
(<a href="http://dmgbuild.readthedocs.org/en/latest/example.html">see the documentation for an example</a>)
then from the command line enter something like</p>

<pre><code>dmgbuild -s my-settings.py "My Disk Image" output.dmg
</code></pre>
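
<p>To give a flavour of what goes in the settings file, here is a minimal sketch.  The file is ordinary Python; the setting names below follow the example in the dmgbuild documentation as best I can tell, so check them against the version you have installed, and <code>MyApp.app</code> is a placeholder.</p>

```python
# my-settings.py (hypothetical example; MyApp.app is a placeholder)

# Disk image format, passed through to hdiutil
format = "UDBZ"

# Items to copy into the image
files = ["MyApp.app"]

# Symlinks to create inside the image
symlinks = {"Applications": "/Applications"}

# Where each icon sits in the Finder window, as (x, y)
icon_locations = {
    "MyApp.app": (140, 120),
    "Applications": (500, 120),
}

# Window position and size, as ((x, y), (width, height))
window_rect = ((100, 100), (640, 280))
```

<p>Because it is plain text, a file like this can sit in version control next to the rest of your build scripts.</p>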

<p>The code for editing <code>.DS_Store</code> files and for generating Mac aliases has been
split out into two other modules, <code>ds_store</code> and <code>mac_alias</code>, for those who
are interested in such things.  The <code>ds_store</code> module should be fully portable
to other platforms; the <code>mac_alias</code> module relies on some OS X specific
functions to fill out a proper alias record, and on other systems those would
need to be replaced somehow.  The <code>dmgbuild</code> tool itself relies on <code>hdiutil</code>
and <code>SetFile</code>, so will only work on Mac OS X.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Bit-rot and RAID]]></title>
    <link href="https://alastairs-place.net/blog/2014/01/16/bit-rot-and-raid/"/>
    <updated>2014-01-16T10:28:00+00:00</updated>
    <id>https://alastairs-place.net/blog/2014/01/16/bit-rot-and-raid</id>
    <content type="html"><![CDATA[<p>There’s <a href="http://arstechnica.com/information-technology/2014/01/bitrot-and-atomic-cows-inside-next-gen-filesystems/">an interesting article on Ars Technica about next-generation filesystems</a>, which mentions something it calls “bit rot” — allegedly the “silent corruption of data on disk or tape”.</p>

<p>Is this a thing?  Really?  Well, no, not really.</p>

<p>Very early on, disks and tapes were relatively unreliable and so there have
basically always been checksums of some description to let you know if data
you read is corrupted.  Historically, we’re talking about some kind of
per-block cyclic redundancy check, which is why one of the error codes you can
receive at a disk hardware interface is “CRC error”.</p>
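
<p>As a rough illustration of the per-block check, the Python sketch below stores a CRC-32 alongside each block and refuses to return data that no longer matches it.  Real drives do this in firmware with their own codes; <code>zlib.crc32</code> and the helper names are simply convenient stand-ins.</p>

```python
import zlib


def make_block(payload):
    """Append a CRC-32 to payload, as a stand-in for a stored sector."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def read_block(block):
    """Return the payload, or raise OSError if the checksum does not match."""
    payload, stored = block[:-4], int.from_bytes(block[-4:], "big")
    if zlib.crc32(payload) != stored:
        raise OSError("CRC error")
    return payload


def flip_bit(block, bit_index):
    """Return a copy of block with a single bit inverted."""
    damaged = bytearray(block)
    damaged[bit_index // 8] ^= 2 ** (bit_index % 8)
    return bytes(damaged)
```

<p>Reading <code>flip_bit(block, n)</code> raises the error for any single bit <code>n</code>, so the corruption is reported rather than silently handed back.</p>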

<p>Modern disks actually use error correcting codes such as Reed-Solomon Encoding
or Low-Density Parity Check codes.  A single random bit error under such
schemes can be corrected, end of story.  They may be able to correct multiple
bit errors too, and these codes can detect more errors than they are able to
correct.</p>
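
<p>Reed-Solomon and LDPC are too involved to show in a few lines, so the sketch below swaps in the much simpler Hamming(7,4) code purely to demonstrate the principle: three check bits protect four data bits, and any single flipped bit can be located and put right.  The function names are made up for the example.</p>

```python
def encode(data):
    """Encode four data bits as a seven-bit Hamming(7,4) codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(codeword):
    """Return the four data bits, repairing at most one flipped bit."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 0 means no error was detected
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]
```

<p>This toy code is fooled by two flipped bits in the same codeword; the codes used in real drives are far stronger.</p>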

<p>The upshot is that a single bit flip on a disk surface won’t cause a read
error; in fact, the software in your computer won’t even notice it because the
hard disk will correct it and rewrite the data on its own.</p>

<p>It takes multiple flipped bits to cause a problem, and in most cases this will
result in the drive reporting a failure to the operating system when trying to
read the block in question.  The probability of a multi-bit failure that can
get past Reed-Solomon or LDPC codes is <em>tiny</em>.</p>

<p>The author then goes on to make a ludicrous claim that RAID won’t be able to
deal with this kind of event, and “demonstrates” by flipping “a single bit” on
one of his disks to make his point.  Unfortunately, this is a completely bogus
test.  He has, in fact, flipped <em>many</em> more bits than just the one, and
he’s done so by writing to the disk, which will encode his data using its
error correcting code, resulting in a block that reads correctly because he’s
actually stored the wrong data there deliberately.</p>

<p>The fact is that, in practice, when an unrecoverable data corruption occurs on
a disk surface, the disk returns an error when something tries to read that
block.  If a RAID controller gets such an error, it will attempt to rebuild
the data using parity (or whatever other redundancy mechanism it’s using).</p>
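
<p>For the simple single-parity case, the rebuild is nothing more than an XOR across the surviving blocks.  Here is a minimal sketch in Python, assuming a stripe made of equally sized data blocks followed by one parity block; the function names are illustrative and real controllers are considerably more elaborate.</p>

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(stripe, failed_index):
    """Recover the block at failed_index from the surviving blocks.

    stripe holds the data blocks followed by the parity block; the entry
    at failed_index is ignored because the drive reported it unreadable.
    """
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)
```

<p>The important point is that the controller knows <em>which</em> block to rebuild because the disk told it the read failed.</p>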

<p>So RAID <em>really does</em> protect you from changes that occur on the disk itself.</p>

<p>Where RAID does <em>not</em> protect you is on the computer side of the equation.  It
doesn’t prevent random bit flips in RAM, or in the logic inside your machine.
Some components in some computers have their own built-in protection against
these events — for instance, ECC memory uses error correcting codes to prevent
random bit errors from corrupting data, while some data busses themselves use
error correction.  If you <em>are</em> seeing random bit flips in files that
otherwise read OK, it’s much more likely they were introduced in the
electronics or even via software bugs and written in their corrupted form to
your storage device.</p>

<p>An aside: programmers generally use the term “bit rot” to refer to the fact
that unmaintained code will often at some point stop working because of
apparently unrelated changes in other parts of a large program.  Such modules
are said to be suffering from “bit rot”.  I’ve never heard it used in the
context of data storage before.</p>
]]></content>
  </entry>
  
</feed>
