Google, YouTube, Data

Jul. 3rd, 2008 | 06:53 am

While reading my RSS feeds this morning I picked up this:
YouTube Must Give All User Histories To Viacom

After Scientology's DMCA request on Slashdot, we made an active choice to squash data on users to limit the possibility of this sort of request. We randomized incoming trackable data on users and kept nothing but aggregate data for long-term storage.
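
To make the idea concrete, here is a minimal sketch of that kind of scrubbing. It is not the code we actually ran; the record fields (`ip`, `day`, `path`) and the function names are assumptions made up for illustration. The IP is replaced with a random token that cannot be traced back, and only per-day hit counts survive.

```python
import secrets
from collections import Counter

def scrub(record):
    """Return a copy of a log record with the IP replaced by a random token."""
    cleaned = dict(record)
    # The token is random, not derived from the real IP, so it cannot be reversed.
    cleaned["ip"] = secrets.token_hex(8)
    return cleaned

def aggregate(records):
    """Count hits per (day, path); in this sketch, this is all that is kept."""
    counts = Counter()
    for r in records:
        counts[(r["day"], r["path"])] += 1
    return dict(counts)

# Hypothetical raw log entries (documentation-range IP addresses).
raw_log = [
    {"ip": "192.0.2.10", "day": "2008-07-03", "path": "/story/1"},
    {"ip": "192.0.2.10", "day": "2008-07-03", "path": "/story/2"},
    {"ip": "198.51.100.7", "day": "2008-07-03", "path": "/story/1"},
]

scrubbed = [scrub(r) for r in raw_log]
long_term = aggregate(scrubbed)  # the raw and scrubbed logs can now be discarded
```

With only `long_term` on disk, a court request can reach trend data but nothing tied to an individual reader.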

Why?

For one, we simply did not need to keep terabytes of log data sitting around collecting dust. For another, while the data might be useful for determining trends, we risked our users' privacy. We believed this was unacceptable. One court request and we could be handing over who knows what to any company that could find an uneducated judge to sign away the privacy of millions. The data was just not that valuable.

What would I like to see?

Sites handing control of data retention over to users.

It would be good to see more sites give users the opportunity to have their tracking information removed from companies' databases after a period of time.
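
As a rough sketch of what user-controlled retention could look like, assume each tracking row carries a `user` and a `seen_at` timestamp, and each user may pick a retention window in days. All of these names, and the 90-day default, are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 90  # hypothetical site-wide default

def purge_expired(rows, user_retention_days, now):
    """Keep only the rows still inside their owner's chosen retention window."""
    kept = []
    for row in rows:
        days = user_retention_days.get(row["user"], DEFAULT_RETENTION_DAYS)
        if now - row["seen_at"] <= timedelta(days=days):
            kept.append(row)
    return kept

now = datetime(2008, 7, 3, tzinfo=timezone.utc)
rows = [
    {"user": "alice", "seen_at": now - timedelta(days=10), "page": "/a"},
    {"user": "alice", "seen_at": now - timedelta(days=40), "page": "/b"},
    {"user": "bob", "seen_at": now - timedelta(days=40), "page": "/c"},
]
prefs = {"alice": 30}  # alice asked for 30 days; bob falls back to the default

remaining = purge_expired(rows, prefs, now)
```

A site would run something like this on a schedule against its real storage; the point is only that the window is set by the user, not by the company.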

We could start a trend by having websites publish data retention policies.

So what would it take to make this happen? Would peer pressure work? I am not in favor of creating more laws.

What if we petitioned sites to take steps in this direction? A few at a time, with the long-term goal of putting peer pressure on sites that do not follow the lead of sites oriented toward user privacy.

Is this too much to ask for?

Comments {7}

(no subject)

from: jamesd
date: Jul. 4th, 2008 12:04 am (UTC)

It's already the law here. A business is required to disclose why it wants information, not to use it for purposes other than that, and not to retain it any longer than required for those business purposes.

Now, time for me to consider using another law: compelling Google to provide to me all personally identifiable data it holds about me for a fee of about $20. Using my IP address as the personal identifier.

Viacom clearly has the technical capability to go from an IP address to a user, based on the IP address and the ISP, so the assertion that an IP address is not personally identifying is pretty ridiculous.

Someone also seems to have screwed up: IP addresses modified so that each IP maps to a unique identifier that is not itself an IP address would presumably satisfy Viacom's information need without disclosing the actual IP addresses. I don't see why Google could not have offered that option, and it most certainly should do so now, since it has just suffered a significant loss of MY trust by possibly leaking some IP-related information it holds about me.
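
A minimal sketch of that kind of mapping, assuming a keyed hash (HMAC-SHA256) and a secret key that stays with the site and never ships with the logs. The function name and the key value are made up for illustration, and nothing here is known to be how Google stores or exports data.

```python
import hashlib
import hmac

def pseudonymize_ip(ip, secret_key):
    """Map an IP address to a stable identifier that is not itself an IP.

    The same IP always yields the same identifier under the same key, so
    repeat visits can still be linked, but without the key the identifier
    cannot be recomputed from a guessed IP.
    """
    digest = hmac.new(secret_key, ip.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

# Hypothetical key; a real one would be random and kept out of the disclosure.
key = b"site-secret-kept-out-of-the-logs"

a1 = pseudonymize_ip("192.0.2.10", key)
a2 = pseudonymize_ip("192.0.2.10", key)
b = pseudonymize_ip("198.51.100.7", key)
```

Here `a1` and `a2` match while `b` differs, which is all a plaintiff would need in order to tell users apart. A plain unkeyed hash would be weaker, since the IPv4 space is small enough to hash exhaustively.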

Peer pressure won't work. LiveJournal has a feature that marks entries as not for archiving by search engines. Google circumvents that by grabbing the RSS feed and ignoring the HTML setting.

Brian "Krow" Aker

(no subject)

from: krow
date: Jul. 4th, 2008 01:32 am (UTC)

Who knows how Google stores IP data? They may already do that.

But does everyone? No way.

Public Libraries

from: cedarmulberry
date: Jul. 4th, 2008 12:44 am (UTC)

Public libraries started doing this right after the Patriot Act came out: most no longer maintain records of what patrons have checked out in the past, because they can't hand over what they don't have.

There was actually quite a shredding frenzy at public libraries right after the act came out, quickly followed by policies to prevent data from being retained in the first place.

Problem is poor judgement, although data retention practices would help

from: fungau
date: Jul. 4th, 2008 04:31 am (UTC)

This is IH from isohunt.com. Brian, you should have talked with robbat2 before. I've gone through this same process with our lawsuit brought by the MPAA, so I thought I'd share some of my thoughts on the issue.

As unreasonable as I think these lawsuits by Viacom/MPAA may be, in order for the legal process to work, plaintiffs are entitled to evidence in order to prove their case. However, user privacy should be a large concern when disclosing data (logs) as evidence, and in neither YouTube's case nor ours is there any reason for turning over data that would expose your personal identity (such as your IP address). From glancing at the order against YouTube, the reason they were ordered to turn over user histories is to prove user infringements, and the inclusion of IPs in such logs is to uniquely identify users who may have signed up multiple usernames/accounts. I call bullshit on that. If someone uses multiple usernames, he can just as easily log in from multiple IP addresses, so disclosing IPs would not help the plaintiffs in proving copyright infringements. I expect Google/YouTube to appeal the order (at least I sure hope so).

As for us, we successfully argued in our MPAA case that we don't need to turn over your IP addresses, as doing so is a violation of user privacy with no evidentiary value, and we only turned over .torrent access logs in anonymized form. You may not like to hear that .torrent logs are being turned over, but the truth is that we were ordered to do so and that the MPAA does need anonymous logs to try to prove their frivolous lawsuit.

More at http://isohunt.com/forum/viewtopic.php?t=134054

Data Information

from: hallesey
date: Jul. 5th, 2008 01:09 pm (UTC)

I am from the U.K., and here we have had problems with discs filled with confidential information going missing (whether by accident or not), so I am very wary of giving out information to authorities except that which is required by law, i.e. taxes and insurance. My own opinion is that this U.S. law should be attacked tooth and nail, as goodness knows where the information will end up.

backing off?

from: axehind
date: Jul. 15th, 2008 03:39 pm (UTC)

Looks like Viacom may be backing off some:
http://news.bbc.co.uk/2/hi/technology/7506948.stm

"Google, owners of YouTube, will now hand over the database but without data that could identify users."

"We are pleased to report that Viacom, MTV and other litigants have backed off their original demand for all users' viewing histories and we will not be providing that information," said a statement on the YouTube blog.

The problems with stopping tracking of any kind

from: cheuer
date: Aug. 17th, 2008 07:43 am (UTC)

Realistically, the problem is not that the judges may or may not be uneducated. The problem is the Patriot Act. Because many of the provisions in the Patriot Act are being shot down, the government may or may not be resorting to an old tactic of theirs to get people's private information in a legal manner, namely the indirect purchase of data that they had no part in either prompting or actually collecting.

For example, if the government recorded everywhere that you went and used, say, your cell phone, then it would be invading your privacy. However, your phone company already does this. So what the government does from time to time is purchase that data for a specific reason. This it can do, since it had no role in prompting the collection or in the collection itself. It uses this method to track many suspects in treason, espionage, etc.

Therefore, as more and more people try to get personal information out of the sites where most people spend their free time (YouTube, now a subset of Google and one of the most used sites in the world, is one example), all of this data ends up sitting around and, unless the government has no desire for it, will undoubtedly make its way to the government in one form or another.

Thus, the only cure for situations like this is not petitions to the sites that record this information, but petitions to the senators and representatives who allow such invasion of privacy to continue for private business. The law as it is written only protects people from the government and from people like themselves. This means that while it does keep the government from collecting your information, and you don't have that pervert of a neighbor stalking your every move, you do have tiny tidbits of related data lying about that can be pieced together to make a larger picture. Until this vast gap in the legal system is fully addressed, so that not only the state and the peeping Tom next door but also corporations may not track your actions without explicit consent to both the tracking and the later use of that information, we will constantly face issues such as these, where one corporation sues another and in the end screws over the customers/patrons of the second corporation, all to get information that it will later discard for lack of a use, out of a desire to turn as much profit as possible.

Now I will take a coder's perspective.
The only way for code to reference someone is to create a unique path, if you will, to that person. That is why many people can browse a site at the same time. In order to get this information, however, the code of the website has to invade our privacy and gather some near-perfectly unique ID tag. And what it grabs is your IP. Thus, no matter where we go, at some point or another our data does get written to a file. Even if it is there for a transient nanosecond, as the file rewrites itself constantly to hide that information, the fact remains that at one time or another it was there, and can be either stolen or given away. The only way to eliminate this recording is to make it so that the code never references a single, unique reference tag, but instead simply does its job.

To put this in a model for those who do not understand code or any of what I just said, imagine a classroom. In that classroom there are many students currently accessing the resource of the room, the grand taskmaster called the teacher. This would be the "code" of the website. So if you are the teacher, how do you address the student? By their name. How do you know what language to use? By where they are from and what they look like. So basically, an IP is, in short, the name and language info of the computer you're currently on.

But it is impossible to change the fact that someone somewhere is recording what you do, because A) it's what lets you sleep in peace at night, and B) it's the only reason why you can stay awake at 4 in the morning and not have someone going, "Knock knock, answer the door, mister terrorist, we know you're plotting something," as you play Hello Kitty Island Paradise or w/e you play at 4 in the mor
