Facebook’s Fall from Grace

Following the attack at a mosque in Christchurch in which 50 people were murdered, New Zealand’s Prime Minister Jacinda Ardern called on Facebook to do better;

“They are the publisher, not just the postman. It cannot be a case of all profit, no responsibility.”

She has a point. During the attack in Christchurch the shooter live-streamed his rampage through two mosques. I have seen a couple of screen grabs from the video, and the images look like a very graphic shooter game. We now know that the first man to see him at the first mosque greeted him with the words “Welcome, Brother”, and presumably this greeting was recorded on the live stream. It’s now illegal to publish the video in New Zealand, and the article where I saw these images has been taken down. To give Facebook credit, once the New Zealand police alerted them I understand their Global Escalations Teams worked to remove instances of the live stream from their platform. But technically, under US law, they cannot be held responsible in court.

The video may still be out there. I’m not interested in seeing it, but when researching this article I found an interesting autocomplete in a Google search, and it seems the effort to remove the video was not perfect.

After the Easter bombings across Sri Lanka, which had a significantly higher death toll, the government moved quickly to block social media, and continues to circumscribe citizens’ use of it. Sadly, it’s not the first time the Sri Lankan government has blocked social media over concerns about the spread of extremism.

How is this possible?

Social media platforms have benefited from a piece of US law, Section 230 of the US Communications Decency Act, which says;

“No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider”

It’s an important part of maintaining free speech on the internet and it means I’m not liable for comments someone leaves on this blog, and nor is WordPress. The EFF explains in more detail.

More scandal

This isn’t the only issue Facebook has faced. Last year they admitted to a security breach that may have affected 90 million accounts.

There are also growing concerns about health impacts as research piles up on the harmful effects of social media, particularly on children. There’s also evidence that anti-vaccination activists are targeting ads at people likely to be wavering on the vaccination question, and the number of measles outbreaks keeps growing.

More famously their algorithms have undermined democracy in at least two countries. This is via the link to Cambridge Analytica, here’s how that worked as explained by journalist Carole Cadwalladr;

With all this scandal, how is the company doing?

Well. Facebook is doing well.

Revenue continues to grow, user numbers continue to grow. User numbers have apparently levelled off slightly in the US and in Europe, but it’s not clear that this is due to scandals.

Facebook currently makes more than 1.6 million USD per employee, and 98% of their revenue is from advertising (2018 annual figures). Which raises the question of just who the customer is. Remember that they don’t pay for any of the content placed on Facebook – in contrast to, say, a glossy magazine like Vogue, which at least provides some content to dilute the advertisements. So we, the users, are the content providers, and our attention is the commodity sold to advertisers.
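Those two figures are easy to sanity-check with back-of-envelope arithmetic. A minimal sketch – the revenue and headcount numbers below are approximations of Facebook’s 2018 annual figures, not exact reported values:

```python
# Back-of-envelope check of the 2018 figures.
# All three inputs are approximations, not exact reported values.
total_revenue = 55.8e9   # USD, FY2018 total revenue (approx.)
ad_revenue = 55.0e9      # USD, FY2018 advertising revenue (approx.)
employees = 35_600       # headcount at end of 2018 (approx.)

revenue_per_employee = total_revenue / employees
ad_share = ad_revenue / total_revenue

print(f"Revenue per employee: {revenue_per_employee / 1e6:.2f} million USD")
print(f"Advertising share of revenue: {ad_share:.1%}")
```

With these approximations the per-employee figure lands around 1.57 million USD, consistent with the “more than 1.6 million” order of magnitude quoted above.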

Regulation Required

It seems this isn’t a problem that the free market can solve. We’re now living with a platform that is with us 24/7, pulls together a global community of almost half the world’s population, and holds data on our every move – and it tends to seek more data rather than less. One way Facebook has grown is by acquiring Instagram and WhatsApp, and the company is now so rich that it can buy any competitor, stifling innovation. Governments have seen the impact on their countries – in Sri Lanka and New Zealand with devastating effects – and on their elections. During the campaign to repeal the 8th Amendment in Ireland, Facebook banned all ads funded from outside Ireland, showing that it is possible to contain the damage of foreign influence. The EU put the GDPR in place in an attempt to protect citizens against the power that Facebook and other social media companies have accrued; in response, Facebook moved millions of accounts from Irish servers to US servers – out of the reach of EU legislation.

The US is also stepping up: the FTC is investigating Facebook’s use of personal data, and a hefty 5 billion USD fine is looming over the company. Even that might not be enough; there’s a bipartisan call for tougher protections on consumer privacy.

I started writing this post in December, and it’s been rewritten more than any other post I’ve ever made, but every time I thought I was ready to hit publish, something else happened. I nearly delayed again to analyse the information coming out of F8 and the apparent change in Facebook’s policy on privacy. There’s a pretty good analysis on the Vergecast – they’re not convinced, and nor am I.

Image via pixabay

Data for dummies: 6 data-analysis tools anyone can use

I’m a big fan of measurement, metrics and visualisations. I’m not a big fan of the amount of work it can take to set up those metrics. So I was excited back in 2013 to find an article listing six tools to help visualise data, my original post is below, and the original article I referenced is still online.

Five years on and the tools are mostly still around.

  1. BigML
    The company still exists and provides comprehensive data analysis, visualisation and predictions. There are strong integrations with other tools and programming languages. They’re aiming at the corporate market with enterprise pricing options. There’s still a free option and a subscription option for individuals, but their focus seems to be more at the company scale.
  2. Google Fusion Tables
    Disappeared? I suspect some of the data features are still available, but the tool itself has been retired.
  3. Infogram
    Makes infographics in a really simple way, but you need to upgrade for any downloads. I so want this tool at work though; I made the infographic scorecard on the right in a matter of minutes.
  4. Many Eyes
    Doesn’t seem to be a standalone product any more, but has been absorbed into IBM’s data tools. That’s fine, but IBM focuses on the corporate market – and is often at the pricier end of the market, in my experience.
  5. Statwing
    Statwing was bought by Qualtrics in 2016. It remains a standalone tool with the same focus on statistics: you’ll get insights into data and be able to apply statistical tests easily, but the output isn’t as designed as some of the other tools.
  6. Tableau
    As luck would have it I now have access to the subscription version of this at work, but have yet to play with it. It’s a powerful tool for visualising data, and you can wander through the public gallery to see what the possibilities are – including a visualisation of the shape of national happiness. The great thing is that you can always drill down into the data. We will be doing some stakeholder analysis early next year, and I think this tool will be a great way to visualise the results.

Five years since my original post and I’m still geekily enthusiastic about data visualisation tools.

 


I’ve spent about an hour playing with these tools. I’m loving Statwing, and will use it to analyse some of the data we’ve got on adoption of new technology. The Infogram tool also has potential to help present data in a more appealing way.

 

Image: data

Take the Survey

 


Creating a good survey, one that gives you robust results, takes skill. In a former life I worked for a data analytics company where a team created consumer surveys, and I gained an appreciation of that skill. I have since worked with online surveys. Here are some aspects of survey design to consider.

Sample Size

Imagine you want to know whether Dutch people prefer dark or milk chocolate. The population of the Netherlands is 16.8 million. How many of them do you need to ask?

It turns out, not that many. If I collected data from 1067 people I could be 95% sure that my answer was correct, with a margin of error of 3%. That means that if 70% choose milk chocolate, the answer in the general population will lie between 67% and 73%. So if you’re a chocolate manufacturer, you now know to make most of your flavours based on milk chocolate.
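The 1067 figure can be reproduced with the standard sample-size formula for estimating a proportion. A minimal sketch, assuming the conservative worst case of a 50/50 split and applying a finite population correction:

```python
def sample_size(population, margin_of_error, z=1.96, p=0.5):
    """Sample size needed to estimate a proportion at 95% confidence (z=1.96).

    p=0.5 is the worst case (largest variance), so this is a conservative
    estimate; rounding matches what most online calculators report.
    """
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    # Finite population correction -- barely matters for 16.8 million people
    return round(n0 / (1 + (n0 - 1) / population))

print(sample_size(population=16_800_000, margin_of_error=0.03))  # 1067
```

For a population the size of the Netherlands the finite population correction changes almost nothing; it only starts to matter for small populations, which is why the required sample size barely grows as the population does.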

You can be more sure of the answer the further the outcome is from 50%. For the chocolate maker an answer of 47-53% would still be useful, but it’s problematic if you’re predicting political outcomes.
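That trade-off is visible in the margin-of-error formula itself: at a fixed sample size the confidence interval is widest when the observed proportion is 50%, and narrows as the result moves towards the extremes. A quick illustration:

```python
import math

def margin_of_error(p, n, z=1.96):
    # Half-width of the 95% confidence interval for an observed proportion p
    return z * math.sqrt(p * (1 - p) / n)

for p in (0.5, 0.7, 0.9):
    print(f"observed {p:.0%}: margin of error ±{margin_of_error(p, 1067):.1%}")
```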

Once upon a time I knew the maths behind these calculations; now I just use an online calculator.

Sample Selection

Your sample should reflect your target population as much as possible. This may involve excluding some people from participating – if you are researching hair care products you don’t need bald men in your sample. For wider issues it is more likely that you will try to construct a sample that mirrors the total population in terms of gender, race, age, income, family status, religion, location, gender identity and sexuality. That’s not easy. The further your sample is from your target group, the less reliable the outcome of your survey.

Method Bias

Your method of collecting data may introduce bias. If you collect data by calling domestic numbers during working hours, you exclude working people. If you collect data online, you exclude those not on the internet, and limit respondents to the small group that finds your website.

If you are collecting data online you need to control for bots, and you may want to limit the number of times a respondent can answer.

Question construction

To get useful data from your survey you need to construct your questions to be neutral, unambiguous, not leading and specific.

Neutral

“Do you smoke cigarettes?” is neutral.

“Are you a filthy smoker?” is not.

Unambiguous

It should be clear what information you are seeking in your question; there are two traps to avoid here.

  • Asking two things in one question

“How friendly and helpful was your customer agent today?” asks two things, and it’s impossible to decide how to answer if your customer agent solved the problem but was grumpy on the phone with you. You need to split this into two questions.

  • Using negatives

“Do you disagree that raising taxes won’t create jobs?” is confusing. Rewrite it as “Do you agree that…  ?” to simplify it.

Avoid Leading Questions

Leading questions contain details that indicate the expected answer.

“When will you start offering free upgrades?” assumes that you will offer free upgrades.

Specific

You will get more accurate and useful data if you ask specifics.

“Do you eat chocolate regularly?” doesn’t tell you much, since ‘regularly’ means different things to different people. Much better to ask “How often do you eat chocolate?” and give people a series of ranges to choose from.

What led to this post? A friend posted a strange survey from the President of the United States that breaks every single one of these rules, and a few others.

Here’s the title page of the survey; given that it was sent out after the press conference where the press was repeatedly called “Fake news”, the title is clearly priming you to doubt the accountability of the media.

[Screenshot: the survey’s title page]

The survey was sent to known Republican supporters, yet the President represents all Americans. The questions are certainly not neutral, and some are just confusing. Here’s the most confusing;
[Screenshot: the most confusing question]

And here’s the most ironic, given that we have already seen that the President uses “alternative facts“, misleading statements and untruths.

[Screenshot: the most ironic question]
All of which is to say: when the Presidential PR machine talks about having data showing that people don’t trust mainstream media, remember that the data collection is flawed and the results cannot be trusted.

Images; Question mark | qimono via pixabay | CC0 1.0

Believe Data

We were sailing back to our home port and a dense fog descended. Suddenly we couldn’t see more than a boat length ahead. My father, a mariner by profession, plotted a course and steered by it, sending my brother and me forward as lookouts.

My mother was convinced we were sailing in the wrong direction, that we’d steered off course (and this was before the reassurance of GPS). “No,” said my father “you must trust your instruments”.

We made it safely home; it was an early lesson in believing data.

The amount of data produced and collected every day continues to grow. “Big Data” is a well-known, although poorly understood term. In many companies we’ve moved on to “data-driven decisions”. But we’re not always good at believing the data.

I was in a meeting recently where the most senior person in the room looked at a graph of Twitter follower growth and said “I just don’t believe this data”. The data showed that goals for follower numbers would not be met. Leaving aside the argument over whether follower numbers are a good goal, the data don’t lie. If there’s a straight line of progress that won’t reach the goal, then you need to change something or accept missing the goal.

It made me think about when we believe data and when we should be sceptical.

We tend to measure progress against an expected path, and in a large organisation we invariably report that progress upwards. In our plans and projections that progress follows a nice upward curve. But the reality is different: every project encounters setbacks, and the graph is more jagged than smooth.

In fact a smooth graph, where targets are always met, should raise questions.

Years ago I was chatting to a guy who had left his previous company after about four months. He left because the targets for the quarter were increased by 25%, and everyone met them. As an experienced business person he knew that a situation where every business unit meets a stretch goal in the first quarter it is applied is very, very unlikely. His suspicions were raised and he left as quickly as he could. A year later the company collapsed under its own lies. The company? Enron.

In his articles (and books) Ben Goldacre campaigns for greater journalistic care in reporting data, and better education on scientific method. He points to the dangerous habit of pharmaceutical companies in cherry-picking their data, choosing studies that support their product and ignoring those that don’t.

I said earlier that we should trust the data, but we also need to know how the data was collected, what errors might be inherent in the data collection methodology, and what limits there might be to interpreting the data. This should be part of everyone’s mental toolkit. It would help us evaluate all those advertising claims, refute 90% of the nonsense on the internet, be honest about progress to goals, and finally make data-driven decisions.

 

Image; Data via pixabay

 

 

I Think You’ll Find It’s a Bit More Complicated Than That

I Think You’ll Find It’s a Bit More Complicated Than That
Ben Goldacre

This is a romp through Dr. Goldacre’s analysis of weak claims and poorly reported science. He argues that journalists should cite, and link to, the sources of the research behind the headlines. He also argues that we, the unsuspecting public should know how to read scientific studies for ourselves, and we should question the reports rather than swallow the conclusions whole.

So if you’ve ever read a science-y headline and thought to yourself “that doesn’t sound right” this book is for you. It takes a look at scientific method and points out some of the pitfalls in constructing a good experiment and in the process gives some pointers about what to look for when evaluating a scientific story;

  • Who funded the study?
  • How well was the experiment designed?
    • sample size
    • scientific method; was there a simple
    • testing a single hypothesis
  • Cherry-picking the data; does the report use a small group of studies to prove a point rather than all the research?

In the past three weeks, three cases have popped up in social media that prove the need both to hold journalists to a higher standard and to educate us all.

(1) Proving nothing; A Swedish family ate organically for two weeks, and tests showed a drop in the concentration of pesticides in their urine.

So the family had their urine tested for various pesticides on their usual diet, then ate organic food for two weeks, then tested the urine again. Their urine was tested daily over the two weeks and by the end there was almost no pesticide in the urine.

Note that “organic” doesn’t mean pesticide-free, so the family could still have consumed some pesticide with their organic meals. The article doesn’t report on whether that was tested for.

The article calls this a “staggering result”. No, not staggering: school-level biology. You could do the exact same test with vitamin C. Give people a high vitamin C diet for a month, then remove vitamin C from their diet. Hey presto! No vitamin C in the urine.

This report hits the trifecta; small sample size, poor design, funded by a supermarket with a range of organic foods. Essentially this “experiment” simply proved that the Swedish family have well-functioning kidneys.

(2) Faked Data; There was a really interesting study on attitudes to same-sex marriage. It concluded that a conversation with a gay surveyor/canvasser could induce long-term attitude change. The study seemed to be well constructed, with a good data set supporting the conclusion, and the optimistic news was widely reported when the study was released late last year.

But when scientists started digging into the data and trying to replicate the results, something didn’t stack up. The study has now been retracted by one of the authors, and it seems there will be a further investigation.

It’s not always the journalists at fault.

(3) We’re easily fooled; Daily dose of chocolate helps you lose weight.

Before you rush out to buy a week’s supply of your favourite chocolate bars, it’s not true.

But it turns out that it’s rather easy to generate the research and result to prove this, and extremely easy to get mainstream media to report on it – as John Bohannon proved in setting up this experiment and the associated PR.

So there can be flaws, or outright fraud, in science. Journalists can, on occasion, twist the story to deliver the headline. And we, the public, are ready to believe reports that reinforce our own opinions – and we’re too ready to believe good news about chocolate.

Turns out if it sounds too good to be true we should ask more questions.

Many of the articles in this book were previously published in the Guardian, and if you want to read more on bad science Dr. Goldacre has his own site with the helpfully short title; Bad Science. He campaigns for greater journalistic responsibility in reporting science, for using the scientific method to test policy decisions, and for better education on scientific method.

He’s right, on all three.

“Let the Data set change your Mind Set”

Another fantastic TED talk on data visualisation, and how it might help us understand the enormous amounts of complex information we’re facing.

I love working on visualisation of information, and have had mild success in simplifying problem statements or project goals into single images. I admit I get a kick out of the moment when the data/information “clicks” into place and the diagram becomes clear and simple. I get another kick when someone else’s response is along the lines of “ah, now I get it”.