Digging Into My Facebook Data

Piles of Books

Facebook has been in the spotlight lately, over a variety of issues related to how they collect and use the data of their “members”. Which means they’re doing a lot of apologizing and tinkering with their system, hoping to avoid more negative publicity and political interference.

But even without the recent problems, Facebook would be making alterations to their data policies, because of new laws in the European Union that go into effect next month. Among other features, the General Data Protection Regulation (GDPR) will give citizens of the EU the right to see the data companies have collected on them.

Which is probably one reason why Facebook is now offering a way to download a copy of the information they have on you. You’ll find a link to make the request under General Account Settings.

If you’re an active Facebook user, be prepared for a large file. They will be sending your entire timeline, all the messages you’ve sent and received, every photo and video you’ve uploaded, and more.

My file, however, was not large at all, a zipped file of 74kb.

Although I registered for a Facebook account ten years ago, I’ve never posted anything in that time1 and very rarely comment on the posts of others. The only reasons I open the app a few times a month are to see the latest photos from friends and relatives, and to read new comics from Bloom County. I’m just not very social I guess.

In fact, the only even slightly interesting part of my Facebook data is in the Ads section, where we find a list of advertisers with my contact info. First advertiser: Cyndi Lauper. Farther down is Rod Stewart. Very odd.

The rest of the list includes a few companies I use regularly or from whom I’ve requested information. And many sites dealing with crowdfunding I’ve never heard of. I’m very sure I did not click on any ads for these firms in Facebook or on articles related to them.

All of which leads to a basic question: why did Facebook send my information to those advertisers? What did their algorithms find in my bland profile and very sparse timeline that led to those matches? I suspect some of this data came from the harvesting Facebook does on other websites.

Anyway, check out the data Facebook has stored in your account. You may find something even more interesting.
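If you’re curious what is inside the download before unpacking it, a few lines of Python can list the contents. This is only a sketch: the function names `summarize_archive` and `total_size` are mine, and the path you pass in is whatever you named your own download. The folder layout inside your archive may well differ from mine.

```python
import zipfile


def summarize_archive(path):
    """Return (name, size_in_bytes) for every file in the zip, largest first."""
    with zipfile.ZipFile(path) as archive:
        entries = [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip folder entries, keep only files
        ]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def total_size(entries):
    """Total uncompressed size, in bytes, of the listed files."""
    return sum(size for _, size in entries)
```

Calling `summarize_archive("facebook-archive.zip")` (a placeholder file name) gives a quick look at which sections of your data take up the most room, without opening any of the files.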


The image is piles of old fashioned data taken by Michael Coghlan, posted to his Flickr account, and used under a Creative Commons license.

1. Ok, maybe not never. I found one post I made in April 2010: “Still on my ongoing effort to figure out the appeal of Facebook and why I would want to spend time on it. At least the iPad makes it easier than the iPhone app. :-)”. I’m still working on that.

The Problem Is Greater Than Facebook

Following up on the previous post, a few more random thoughts related to the current Facebook data security mess.

First, the problem with the collection and use of personal data extends far beyond Facebook. Google, Twitter, Instagram, WhatsApp1, Snapchat, and many other social media companies all offer services you don’t pay for.

All make money through selling you, their “members”, to advertisers. All have long, legally detailed terms of service, which you agreed to (even if you didn’t read them), that allow them to use your contributions and data in pretty much any way they want. Which brings up copyright issues that are a whole ‘nother rant.

But it’s not just social media collecting your data. Plenty of companies that charge for products and services – Apple, Samsung, Amazon, your phone and cable companies, your supermarket, gas station, and big box stores (remember your loyalty card?) – collect valuable data on your buying habits. And pretty much anything else they can find. Information they can use to make even more profits.

It will be interesting to see whether Europe’s new data security laws, which take effect in May, will impact the behavior of Facebook and the others. One major goal of the legislation is to give users more control over their data, including the ability to have some of it deleted. Facebook and other data-driven companies, on the other hand, are dependent on users willingly giving over their information and not caring what happens next.

Over here in the US, despite calls for investigation and pending lawsuits, our current laws probably don’t cover this situation. It’s also very unclear what new regulations on Facebook and other social media companies would look like, considering the long tradition of free speech rights in this country. Plus, if actual data breaches of the past are any indication, there isn’t a lot of political will to do anything related to consumer protection.

I’ve seen many calls on Twitter and elsewhere to delete your Facebook accounts. That’ll show them. Except it probably won’t, since the people who actually follow through are a very, very small fraction of their overall membership. Plus, Facebook will still have your data and has the infrastructure in place to continue following you around the web.

On top of everything else, Facebook makes it very difficult to actually delete an account. Bill Fitzgerald, my go-to guy for understanding data security and privacy issues, has some recommendations for people who want to try. If you’d rather continue using Facebook, check out Wired’s guide to the complicated world of their privacy and security settings.

Finally, when Mark Zuckerberg’s name comes up in the news, does anyone else picture Jesse Eisenberg in The Social Network? Considering Zuck’s, shall we say, “relaxed” attitude towards the privacy of his customers, I’m beginning to think the portrayal of him in that film wasn’t all that far from real life. Maybe he needs to hire Eisenberg to front him and get Aaron Sorkin to write the script. Certainly would be more entertaining.


Cartoon is by the wonderful Randall Munroe, posted at his site xkcd and used under a Creative Commons license. Check out his book What If? in which he answers absurd hypothetical questions with real science.

1. Instagram and WhatsApp are both owned by Facebook.

Selling Your Personal Data Is Their Business

Grid

You probably noticed that Facebook was in the headlines again this week.

Social media, TV pundits, and politicians were outraged over high profile investigative reports in the New York Times and the Guardian claiming that personal information on 50 million Facebook users had been harvested by a researcher in 2014 and used to create targeted political ads for the Trump campaign.

The details, of course, are far more complicated.1

For one thing, too many reports are calling what happened a “data breach”, often comparing it in some way to the Equifax story from last year. But the term breach implies that someone outside of Facebook, in this case a researcher for the UK-based data analysis company Cambridge Analytica, broke in and stole the information.

In fact, the researcher followed Facebook’s rules and only collected information from something like 270,000 users, all of whom consented to the process. Then, thanks to the Facebook terms of service and API2 that applied in 2014, he was also able to harvest data from all of their friends, which brings us to the 50 million number most often quoted.
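To get a feel for how 270,000 consenting users turn into 50 million profiles, here is a rough back-of-the-envelope calculation. Both inputs are the approximate figures quoted above, the function name is my own, and the result ignores overlapping friend lists, so treat it as an illustration rather than a reconstruction of what actually happened.

```python
def average_friends_reached(total_profiles, consenting_users):
    """Friend profiles harvested per consenting user (their own excluded)."""
    friend_profiles = total_profiles - consenting_users
    return friend_profiles / consenting_users


# Approximate figures from the reporting: 50 million profiles, 270,000 users.
approx = average_friends_reached(50_000_000, 270_000)
print(round(approx))  # about 184 friend profiles per consenting user
```

In other words, each person who agreed to take part exposed, on average, well over a hundred people who never agreed to anything.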

So, rather than having personal data stolen, Facebook gave it away. Or more likely, sold it.

Because that is their business model. It’s why the company has a market cap of around half a trillion dollars and CEO Zuckerberg has a net worth north of $60 billion.3

Facebook is very successful at collecting data from its more than two billion active members and then selling it to advertisers. Cambridge Analytica was one more advertiser and it didn’t matter that their ads were misleading and dishonest (at best). As long as the funds transfer went through.

Whatever you call this particular abuse of member data, it’s only the latest in a long string of arrogant and clueless decisions the company has made over its short history. And, even with new privacy laws in Europe and Congress critters fighting over the opportunity to hold hearings, it probably won’t be the last.

And this is as good a time as any to again point out two facts about Facebook that anyone with an account should remember (but probably doesn’t):

1. Facebook is a multinational corporation, not a community. Communities are built by people and, while it’s possible to create one using an online platform, the company itself is not going to make it happen.

2. Facebook membership is free. Which means you are not the company’s customer; you are the product they sell to advertisers. Monetizing your content and data is their first, maybe their only, concern.


I’m not sure the image has anything to do with this story.

1. In addition to the two articles linked above (the Times piece is probably a little better), Wired has done some of the best analysis of this story. This piece is a good place to begin.

2. API stands for application programming interface, the rules established by tech companies that allow outside code to communicate with their systems. In most cases, companies like Facebook provide very specific instructions as to what can be done with APIs.

3. Both took a big hit on Monday when Facebook’s stock dropped hard after investors spent the weekend digesting the Times and Guardian reports from Friday.

The European Approach to Protecting Your Data

Almost four years ago, the highest court in the European Union (EU)1 ruled that citizens of member countries had a “right to be forgotten”. Of course, that ruling left some holes and more than a few questions. But it did trigger some increasingly public conversations around the general topic of privacy and personal data.

That discussion, paired with some massive data breaches at high profile companies, led the EU Parliament to create a new set of laws2 dealing with data security and privacy. Those rules, the General Data Protection Regulation (GDPR), will become effective in the EU beginning in May.

In general, the GDPR sets strict guidelines for the kind of data that can be collected from individuals by companies and organizations, and how that data can be used. That data includes anything that can be used to specifically identify a person (including social media posts, location info, photographs, etc.), as well as not so obviously personal information like race, religion, and politics.

GDPR also requires companies to obtain more specific consent from the user and to explain more clearly how their data will be used. Specifically excluded is vague language like “Improving users’ experience”, “marketing purposes”, or “future research”. Companies must also make it easy for users to withdraw their consent and are then required to delete the material they’ve collected.

So what has any of this got to do with those of us not living in Europe? Plenty.

While the regulations are specific to the member countries of the EU, most of what I’ve read about them suggests that all of us in the US, and elsewhere in the world, will likely be affected by them.

The law applies to any company or organization that does business in the EU member countries and collects personal data from their citizens. That includes many based in the US, familiar names like Facebook, Google, Microsoft, Apple, and more. Since most multinational corporations shuffle information around the world, it’s very likely that they will need to adapt their data handling practices everywhere, not just in Europe.

Plus the law also provides for some pretty hefty penalties for misusing or failing to secure the data, including fines of up to €20 million or 4% of “global turnover”, whichever is larger. To put that in some perspective, €20m (about $24m US at the moment) is pocket change for Facebook. 4% of their total income is not.
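The “whichever is larger” rule is easy to see with a little arithmetic. The sketch below only uses the two numbers quoted above; the function name and the turnover figures are hypothetical, picked by me to show both sides of the rule, and are not any real company’s financials.

```python
FIXED_CAP_EUR = 20_000_000   # the flat figure quoted above
TURNOVER_SHARE_PERCENT = 4   # the share of global turnover quoted above


def max_fine_eur(global_turnover_eur):
    """Upper limit on a fine: the larger of the flat cap or 4% of turnover."""
    return max(FIXED_CAP_EUR, global_turnover_eur * TURNOVER_SHARE_PERCENT / 100)


# Hypothetical turnover figures, chosen only to show both branches.
small_company = max_fine_eur(100_000_000)      # 4% is 4 million, so the flat cap applies
large_company = max_fine_eur(35_000_000_000)   # 4% is 1.4 billion, far above the flat cap
```

The two figures meet at a turnover of €500 million. Below that, the €20 million ceiling applies; above it, the percentage takes over, which is why the rule matters so much more to the giants.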

I know, all of this is pretty geeky stuff.

However, it’s also important if you’re concerned about the data most companies are already collecting about you and others. If you’re interested in more details of the GDPR in basic, non-legal language, check out this rough guide to GDPR and/or this short summary directed at US corporations.

Of course, the EU laws are not perfect. There will likely be much confusion when they take effect, and when the first lawsuits follow not long after. It will be interesting to see whether the big data collectors will be forced to change their behavior. Or will they just find new ways to continue their current practices? After all, our information is the foundation of their massive profits.

Beyond that, there’s also the larger question of whether the US should implement similar laws. It’s not likely to happen in this political climate, with political “leaders” who claim that the “free market” will protect us all. But maybe some outside pressure on US-based companies will effect some needed change.


The map is from the BBC, showing the current configuration of the European Union. Of course, their home country, the United Kingdom, is in the process of a very contentious “Brexit” from the EU, so that map could change in 2019. In more than one way if the people of Scotland and Northern Ireland make some hard decisions.

1. Very tangential side note: I love that the official anthem of the EU is based on Beethoven’s “Ode to Joy”. Certainly more uplifting music than the militaristic tones of most national anthems.

2. In some of what I’ve read, experts say that GDPR isn’t so much “new” law as it is a clarification of many different data and privacy regulations that are already on the books, combined with court rulings. Either way, GDPR is likely going to change the way companies do business in the EU, and possibly elsewhere.

The Surveillance Classroom

During the 2016 holiday season, Amazon’s Alexa devices were huge sellers. Google was second in the category with Home. Apple just started shipping their Siri-enabled HomePod and they will probably sell a bunch of them.

So now tens of millions of homes have always-on, internet-connected microphones listening to every sound, and more are coming. This despite the many cautions from privacy experts about allowing large corporations to have access to a new continuous stream of auditory data.

But who cares if the artificially intelligent software powering these devices is buggy? Does it matter that Amazon, Google, and Apple are vague about how they are using that information and who has access to it? Let’s bring these boxes into the classroom!

Michael Horn, co-author of Disrupting Class, the hot education-change book from a decade ago, says Alexa and her friends are “the next technology that could disrupt the classroom”.

It’s not entirely clear why Horn believes a “voice-activated” classroom would improve student learning. Other than that the superintendent he has interviewed is concerned that kids “will come in and will be used to voice-activated environments and technology-based learning programs”.

That’s nothing new. For a few decades (at least) we have been throwing technology into the classroom based on the premise that kids have the stuff at home. That approach hasn’t been especially successful, and Alexa is not likely to change that.

But these days, a major reason for using many, if not most, new classroom technologies is collecting and analyzing data.

These devices could also send teachers real-time data to help them know where and how they should intervene with individual students. Eastwood imagines that over time these technologies would also know the different students based on their reading levels, numeracy, background knowledge, and other areas, such that it could provide access to the appropriate OER content to support that specific child in continuing her learning.

Maybe I’m wrong but I think it’s better to have a teacher or other adult listening to kids.

Anyway, Horn presents a lot of questions about the use of Alexa and her peers in the classroom but his last one is probably the most salient: “What is the best use of big data and artificial intelligence in education?” Before ending, he also very briefly touches on the security of that data – “And there are bound to be privacy concerns.” As I said, briefly.

But the bottom line to all this is whether we want Amazon, Google, or Apple surveillance devices collecting data on everything that happens in the classroom. Horn seems to think the technology could be disruptive. It sounds creepy and rather invasive to me.


The image is from an article about a contest Amazon is running for developers, with cash prizes for the best Alexa apps that are “educational, fun, engaging or all of the above for kids under the age of 13”.