cloudseer + politics   6

Jeff Bezos’s Patent Reform Ideas
Jeff Bezos has a few excellent ideas for how to reform our patent system:

Much (much, much, much) remains to be worked out, but here’s an outline of what I have in mind:

1. That the patent laws should recognize that business method and software patents are fundamentally different than other kinds of patents.

2. That business method and software patents should have a much shorter lifespan than the current 17 years — I would propose 3 to 5 years. This isn’t like drug companies, which need long patent windows because of clinical testing, or like complicated physical processes, where you might have to tool up and build factories. Especially in the age of the Internet, a good software innovation can catch a lot of wind in 3 or 5 years.

3. That when the law changes, this new lifespan should take effect retroactively so that we don’t have to wait 17 years for the current patents to enter the public domain.

4. That for business method and software patents there be a short (maybe 1 month?) public comment period before the patent number is issued. This would give the Internet community the opportunity to provide prior art references to the patent examiners at a time when it could really help. (Thanks to my friend Brewster Kahle for this suggestion.)

Two and four are brilliant. Reducing patent lifespans to 3-5 years would instantly make our current patent problems much smaller: not only would existing patents expire rather quickly, but with so short a lifespan, people would have much less reason to file them in the first place.

By the way, note the date on this.
links  Politics  Web  shared  from google
august 2011 by cloudseer
What Happened to the Imperial Presidency?
The War Powers Resolution forbids American forces to stay in a war zone for more than 60 days without authorization from Congress. May 20th was the 60th day of U.S. involvement in the war in Libya.

The Office of Legal Counsel and the Pentagon’s general counsel both advised President Obama that the U.S.’s involvement in NATO operations against Qaddafi amounted to hostilities, and thus the administration would need to end American involvement or receive approval from Congress. Obama overruled them, and insisted our involvement doesn’t amount to hostilities:

But Mr. Obama decided instead to adopt the legal analysis of several other senior members of his legal team — including the White House counsel, Robert Bauer, and the State Department legal adviser, Harold H. Koh — who argued that the United States military’s activities fell short of “hostilities.” Under that view, Mr. Obama needed no permission from Congress to continue the mission unchanged.

We are providing surveillance and refueling for NATO planes, two things that are absolutely required to continue operations, and we continue to use unmanned drones to fire missiles at targets on the ground. It is the President’s contention that those three things do not amount to “hostilities”?

The executive branch routinely has disagreements with Congress, but what’s ridiculous about this is just how quiet the left has been about it. This sounds remarkably similar to what Democrats rightly criticized the Bush administration for doing, but we’ve heard little more than silence. Apparently, an “imperial presidency” is only a concern if it isn’t their guy in the White House.
links  Politics  shared  from google
june 2011 by cloudseer
“My Student, the Terrorist”
After being accused of quartering a supporter of al Qaeda, American citizen Syed Fahad Hashmi was extradited from Britain and placed in solitary confinement:

The federal government established SAMs in 1996 for gang leaders and other crime bosses with demonstrated reach in cases of “substantial risk that an inmate’s communication or contacts with persons could result in death or serious bodily injury to persons.” After September 11, the Justice Department began using SAMs pretrial, with wide latitude to wall off terrorism suspects before they had been convicted of anything.

Fahad was allowed no contact with anyone outside his lawyer and, in very limited fashion, his parents—no calls, letters, or talking through the walls, because his cell was electronically monitored. He had to shower and relieve himself within view of the camera. He was allowed to write only one letter a week to a single member of his family, using no more than three pieces of paper. One parent was allowed to visit every two weeks, but often would be turned away at the door for bureaucratic reasons. Fahad was forbidden any contact—directly or through his lawyers—with the news media. He could read only portions of newspapers approved by his jailers—and not until 30 days after publication. Allowed only one hour out of his cell a day, he had no access to fresh air but was forced to exercise in a solitary cage.

The government cited Hashmi’s “proclivity for violence” as the reason for such harsh measures—even though he had no criminal record and was not charged with committing an actual act of violence or having any demonstrated reach outside of prison. Given the number of people convicted of a violent crime behind bars in the United States, “proclivity for violence” seemed an implausible justification for the harsh measures.

After nearly three years of solitary confinement, he pleaded guilty in April 2010 to providing material support to terrorism. He was never allowed to review evidence held against him, because it was deemed classified.

The federal government alleged that he provided support to al Qaeda by allowing an acquaintance to stay at his apartment in London with luggage filled with “military gear” (raincoats, ponchos, and waterproof socks, apparently), which the acquaintance later delivered to al Qaeda in Pakistan. This acquaintance, by the way, testified against him in court.

He was sentenced to 15 years in prison. He is now being held at Colorado’s Supermax prison, in solitary confinement.

Perhaps, as the government argued, Hashmi was radicalized and was attempting to support terrorism. It’s certainly possible; while pursuing his degree in political science in New York, he advocated Muslim religious law as a “utopian” society and called the U.S. the world’s largest terrorist. I don’t deny that he may very well have been a threat, and imprisoning him based on these rather flimsy charges may have prevented greater crimes.

But I don’t know. It may also be that an innocent man (a man who advocated a political system I strenuously disagree with and that runs counter to our system, yes, but an innocent man) is now being held in solitary confinement, serving out a 15-year sentence. The only people who know are Hashmi and the federal government officials who have access to the evidence held against him, evidence we have never seen.

The government is asking us to trust it; I am generally inclined to do so on issues related to terrorism, but this isn’t acceptable. Not only is this a slippery slope, but we may have already slid down it: there may very well be an innocent man rotting away in prison. That’s too much power for the federal government to hold. We are not very far away from the situation in China, where the government uses state secrets laws to throw dissenters in jail. The only thing that prevents the federal government from abusing “Special Administrative Measures” (SAMs) and classified evidence rules is its own moral rectitude, and if that is all we have, we are in trouble indeed.

We are at risk of a government with arbitrary power, where the law does not define their power and constrain it, but rather enables them to do as they please.
links  Politics  World  shared  from google
april 2011 by cloudseer
Crowdsourced document analysis and MP expenses
As you may have heard, the UK government released a fresh batch of MP expenses documents a week ago on Thursday. I spent that week working with a small team at Guardian HQ to prepare for the release. Here’s what we built:

http://mps-expenses2.guardian.co.uk/

It’s a crowdsourcing application that asks the public to help us dig through and categorise the enormous stack of documents—around 30,000 pages of claim forms, scanned receipts and hand-written letters, all scanned and published as PDFs.

This is the second time we’ve tried this—the first was back in June, and can be seen at mps-expenses.guardian.co.uk. Last week’s attempt was an opportunity to apply the lessons we learnt the first time round.

Writing crowdsourcing applications in a newspaper environment is a fascinating challenge. Projects have very little notice: I heard about the new document release the Thursday before, giving us less than a week to put everything together. In addition to the fast turnaround for the application itself, the 48 hours following the release are crucial. The news cycle moves fast, so if the application launches but we don’t manage to get useful data out of it quickly, the story will move on before we can impact it.

ScaleCamp on the Friday meant that development work didn’t properly kick off until Monday morning. The bulk of the work was performed by two server-side developers, one client-side developer, one designer and one QA on Monday, Tuesday and Wednesday. The Guardian operations team deftly handled our EC2 configuration and deployment, and we had some extra help on the day from other members of the technology department. After launch we also had a number of journalists helping highlight discoveries and dig through submissions.

The system was written using Django, MySQL (InnoDB), Redis and memcached.

Asking the right question

The biggest mistake we made the first time round was that we asked the wrong question. We tried to get our audience to categorise documents as either “claims” or “receipts” and to rank them as “not interesting”, “a bit interesting”, “interesting but already known” and “someone should investigate this”. We also asked users to optionally enter any numbers they saw on the page as categorised “line items”, with the intention of adding these up later.

The line items, with hindsight, were a mistake. 400,000 documents makes for a huge amount of data entry and for the figures to be useful we would need to confirm their accuracy. This would mean yet more rounds of crowdsourcing, and the job was so large that the chance of getting even one person to enter line items for each page rapidly diminished as the news story grew less prominent.

The categorisations worked reasonably well but weren’t particularly interesting—knowing if a document is a claim or receipt is useful only if you’re going to collect line items. The “investigate this” button worked very well though.

We completely changed our approach for the new system. We dropped the line item task and instead asked our users to categorise each page by applying one or more tags, from a small set that our editors could control. This gave us a lot more flexibility (we changed the tags shortly before launch based on the characteristics of the documents) and had the potential to be a lot more fun as well. I’m particularly fond of the “hand-written” tag, which has highlighted some lovely examples of correspondence between MPs and the expenses office.

Sticking to an editorially assigned set of tags provided a powerful tool for directing people’s investigations, and also ensured our users didn’t start creating potentially libellous tags of their own.
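
As a rough illustration of an editorially controlled tag set, here is a minimal Python sketch. It is not the Guardian's code: the `ALLOWED_TAGS` contents (apart from "hand-written", which the text mentions) and the `clean_tags` helper are hypothetical.

```python
# Hypothetical editor-controlled tag set; only "hand-written" comes from the text.
ALLOWED_TAGS = {"hand-written", "food", "travel"}


def clean_tags(submitted, allowed=ALLOWED_TAGS):
    """Keep only editor-approved tags, normalised and de-duplicated, in submission order."""
    kept = []
    for tag in submitted:
        tag = tag.strip().lower()
        if tag in allowed and tag not in kept:
            kept.append(tag)
    return kept
```

Anything a user submits outside the approved set is silently dropped, so free-form (and potentially libellous) tags never reach the database.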

Breaking it up into assignments

For the first project, everyone worked together on the same task to review all of the documents. This worked fine while the document set was small, but once we had loaded in 400,000+ pages the progress bar became quite depressing.

This time round, we added a new concept of "assignments". Each assignment consisted of the set of pages belonging to a specified list of MPs, documents or political parties. Assignments had a threshold, so we could specify that a page must be reviewed by at least X people before it was considered reviewed. An editorial tool let us feature one "main" assignment and several alternative assignments right on the homepage.

Clicking “start reviewing” on an assignment sets a cookie for that assignment, and adds the assignment’s progress bar to the top of the review interface. New pages are selected at random from the set of unreviewed pages in that assignment.
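
The assignment concept can be sketched in plain Python. This is an illustration under assumptions, not the project's Django models: the `Assignment` class and its method names are hypothetical, and review counts are kept in a dict rather than in MySQL.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Assignment:
    """Hypothetical stand-in for an assignment: a set of pages plus a review threshold."""
    name: str
    page_ids: set
    threshold: int = 1  # a page counts as reviewed after this many reviews
    review_counts: dict = field(default_factory=dict)

    def record_review(self, page_id):
        if page_id not in self.page_ids:
            raise ValueError(f"page {page_id} is not part of {self.name}")
        self.review_counts[page_id] = self.review_counts.get(page_id, 0) + 1

    def unreviewed_pages(self):
        """Pages that have not yet been reviewed by at least `threshold` people."""
        return {p for p in self.page_ids
                if self.review_counts.get(p, 0) < self.threshold}

    def progress(self):
        """Fraction of pages that have reached the threshold (drives the progress bar)."""
        if not self.page_ids:
            return 1.0
        done = len(self.page_ids) - len(self.unreviewed_pages())
        return done / len(self.page_ids)

    def next_page(self, rng=random):
        """Pick a random unreviewed page, or None when the assignment is complete."""
        remaining = sorted(self.unreviewed_pages())
        return rng.choice(remaining) if remaining else None
```

With `threshold=2`, a page only counts towards the progress bar once two different reviews have been recorded for it.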

The assignments system proved extremely effective. We could use it to direct people to the highest value documents (our top hit list of interesting MPs, or members of the shadow cabinet) while still allowing people with specific interests to pick an alternative task.

Get the button right!

Having run two crowdsourcing projects I can tell you this: the single most important piece of code you will write is the code that gives someone something new to review. Both of our projects had big “start reviewing” buttons. Both were broken in different ways.

The first time round, the mistakes were around scalability. I used a SQL “ORDER BY RAND()” statement to return the next page to review. I knew this was an inefficient operation, but I assumed that it wouldn’t matter since the button would only be clicked occasionally.

Something like 90% of our database load turned out to be caused by that one SQL statement, and it only got worse as we loaded more pages into the system. This caused multiple site slowdowns and crashes until we threw together a cron job that pushed 1,000 unreviewed page IDs into memcached and made the button pick one of those at random.
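
The stopgap described above can be sketched as follows. This is a minimal illustration, assuming a plain dict as a stand-in for memcached; the key name `POOL_KEY` and the function names are hypothetical.

```python
import random

POOL_KEY = "unreviewed-pool"  # hypothetical cache key
POOL_SIZE = 1000              # the pool size mentioned in the text


def refresh_pool(cache, unreviewed_ids, pool_size=POOL_SIZE, rng=random):
    """Run periodically (e.g. from cron): sample IDs once instead of sorting on every click."""
    ids = list(unreviewed_ids)
    pool = rng.sample(ids, min(pool_size, len(ids)))
    cache[POOL_KEY] = pool
    return pool


def pick_page(cache, rng=random):
    """Called on each button click: a cheap in-memory choice, no database query."""
    pool = cache.get(POOL_KEY) or []
    return rng.choice(pool) if pool else None
```

The expensive random selection happens once per refresh rather than once per click, at the cost that every click between refreshes draws from the same fixed pool.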

This solved the performance problem, but meant that our user activity wasn’t nearly as well targeted. For optimum efficiency you really want everyone to be looking at a different page—and a random distribution is almost certainly the easiest way to achieve that.

The second time round I turned to my new favourite in-memory data structure server, redis, and its SRANDMEMBER command (a feature I requested a while ago with this exact kind of project in mind). The system maintains a redis set of all IDs that needed to be reviewed for an assignment to be complete, and a separate set of IDs of all pages that had been reviewed. It then uses redis set difference (the SDIFFSTORE command) to create a set of unreviewed pages for the current assignment and then SRANDMEMBER to pick one of those pages.
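
A sketch of that selection step follows. SDIFFSTORE stores the members of the first set that are absent from the others, which is what "needed minus reviewed" requires. To keep the example runnable without a server, it uses a small in-memory `FakeRedis` class that mimics only the three commands involved; the class, the key names and `next_unreviewed` are all assumptions for illustration, not the project's code.

```python
import random


class FakeRedis:
    """In-memory stand-in exposing three set commands, so the sketch runs without a server."""

    def __init__(self):
        self.sets = {}

    def sadd(self, key, *members):
        self.sets.setdefault(key, set()).update(members)

    def sdiffstore(self, dest, first, *others):
        # SDIFFSTORE: members of the first set that appear in none of the others.
        result = set(self.sets.get(first, set()))
        for key in others:
            result -= self.sets.get(key, set())
        self.sets[dest] = result
        return len(result)

    def srandmember(self, key):
        # SRANDMEMBER: one random member, or None for an empty or missing set.
        members = sorted(self.sets.get(key, set()))
        return random.choice(members) if members else None


def next_unreviewed(client, assignment_id):
    """Pick a random page that the assignment needs and nobody has reviewed yet."""
    needed = f"assignment:{assignment_id}:pages"            # hypothetical key names
    unreviewed = f"assignment:{assignment_id}:unreviewed"
    client.sdiffstore(unreviewed, needed, "pages:reviewed")
    return client.srandmember(unreviewed)
```

A real client would return values in whatever encoding it is configured for, so treat the return type here as part of the simplification.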

This is where the bug crept in. Redis was just being used as an optimisation—the single point of truth for whether a page had been reviewed or not stayed as MySQL. I wrote a couple of Django management commands to repopulate the denormalised Redis sets should we need to manually modify the database. Unfortunately I missed some—the sets that tracked what pages were available in each document. The assignment generation code used an intersection of these sets to create the overall set of documents for that assignment. When we deleted some pages that had accidentally been imported twice I failed to update those sets.

This meant the “next page” button would occasionally turn up a page that didn’t exist. I had some very poorly considered fallback logic for that—if the random page didn’t exist, the system would return the first page in that assignment instead. Unfortunately, this meant that when the assignment was down to the last four non-existent pages every single user was directed to the same page—which subsequently attracted well over a thousand individual reviews.

Next time, I’m going to try and make the “next” button completely bullet proof! I’m also going to maintain a “denormalisation dictionary” documenting every denormalisation in the system in detail—such a thing would have saved me several hours of confused debugging.
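
One possible shape for a more defensive "next" button is sketched below. It is not the fix the author shipped; the function name, the `page_exists` callback and the `max_attempts` limit are assumptions. The idea is to check each candidate against the source of truth, discard stale IDs, and return nothing rather than fall back to a fixed page.

```python
import random


def next_page(candidate_ids, page_exists, rng=random, max_attempts=20):
    """Return a random candidate that still exists, or None; never fall back to a fixed page.

    candidate_ids: mutable set of page IDs from the denormalised store (may be stale).
    page_exists:   callable that checks the source of truth (the database, in the text).
    """
    for _ in range(max_attempts):
        if not candidate_ids:
            return None
        page_id = rng.choice(sorted(candidate_ids))
        if page_exists(page_id):
            return page_id
        # Stale entry: drop it so it cannot be picked again, then retry.
        candidate_ids.discard(page_id)
    return None
```

When only non-existent pages remain, the caller gets `None` and can show an "assignment complete" message instead of sending every user to the same page.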

Exposing the results

The biggest mistake I made last time was not getting the data back out again fast enough for our reporters to effectively use it. It took 24 hours from the launch of the application to the moment the first reporting feature was added—mainly because we spent much of the intervening time figuring out the scaling issues.

This time we handled this a lot better. We provided private pages exposing all recent activity on the site. We also provided public pages for each of the tags, as well as combination pages for party + tag, MP + tag, document + tag, assignment + tag and user + tag. Most of these pages were ordered by most-tagged, with the hope that the most interesting pages would quickly bubble to the top.

This worked pretty well, but we made one key mistake. The way we were ordering pages meant that it was almost impossible to paginate through them and be sure that you had seen everything under a specific tag. If you’re trying to keep track of everything going on in the site, reliable pagination is essential. The only way to get reliable pagination on a fast moving site is to order by the date something was first added to a set in ascending order. That way you can work through all of the pages, wait a bit, hit “refresh” and be able to continue paginating where you left off. Any other order results in the content of each page changing as new content comes in.
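
The pagination point can be demonstrated with a toy example. The rows and helper names below are made up for illustration: each row is a page ID, the time it was first tagged, and its tag count.

```python
def paginate(items, page, per_page=2):
    """Return one page (1-indexed) of an already ordered list."""
    start = (page - 1) * per_page
    return items[start:start + per_page]


def by_first_added(rows):
    """Ascending by the time each page first entered the set."""
    return [r[0] for r in sorted(rows, key=lambda r: r[1])]


def by_most_tagged(rows):
    """Descending by tag count, as most of the public pages were ordered."""
    return [r[0] for r in sorted(rows, key=lambda r: -r[2])]


# (page_id, first_tagged_at, tag_count); values are invented for the example.
tagged = [("a", 1, 5), ("b", 2, 1), ("c", 3, 9)]
before_stable = paginate(by_first_added(tagged), 1)
before_ranked = paginate(by_most_tagged(tagged), 1)

# New activity arrives: "d" is tagged for the first time, and "b" gains many tags.
tagged = [("a", 1, 5), ("b", 2, 12), ("c", 3, 9), ("d", 4, 1)]
after_stable = paginate(by_first_added(tagged), 1)
after_ranked = paginate(by_most_tagged(tagged), 1)
```

Under the first-added ordering, page one is unchanged and the new item simply appears at the end; under the most-tagged ordering, page one changes, so a reader working through the pages can miss or re-read items.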

We eventually added an undocumented /in-order/ URL prefix to address this issue. Next time I’ll pay a lot more attention to getting the pagination options right from the start.

Rewarding our contributors

The reviewing experience the first time round was actually quite lonely. We deliberately avoided showing people how others had marked each page because we didn’t want to bias the results. Unfortunately this meant the site felt like a bit of a ghost town, even when hundreds of other people were actively reviewing thing[…]
crowdsourcing  django  guardian  innodb  memcached  mpsexpenses  mysql  nosql  politics  projects  python  redis  shared  from google
december 2009 by cloudseer
Four short links: 14 October 2009
10Gui Video -- demo of a new take on multitouch, a tablet and new GUI conventions. (via titine on Twitter)
Behind the Scenes at WhatDoTheyKnow -- numbers and stories from the MySociety project, which provides a public place for Official Information Act requests and responses. The fact information is subject to copyright and restrictions on re-use does not exempt it from disclosure under the Freedom of Information Act (though there is a closely related exemption relating to “commercial interest”). Occasionally public bodies will offer to reply to a request, but in order to deter wider dissemination of the material they will refuse to reply via WhatDoTheyKnow.com. Southampton University have released information in protected PDF documents and the House of Commons has refused to release information via WhatDoTheyKnow.com which it has said it would be prepared to send to an individual directly.
The View from HadoopWorld (RedMonk) -- fascinating glimpse into the Hadoop user and developer world. Hadoop can be used with a variety of languages, from Perl to Python to Ruby, but as Doug Cutting admitted today, they’re all second class citizens relative to Java. The plan, however, is for that to change. Which can’t happen soon enough, in my view. It’s not that there’s anything intrinsically wrong with Java, or its audience. The point, rather, is that there are lots and lots of dynamic language developers out there that would be far more productive working in their native tongue versus translating into Java.
Be Lucky, It's an Easy Skill to Learn (Telegraph) -- this one resonated with me, as it ties into some life hacking I've been doing lately. And so it is with luck - unlucky people miss chance opportunities because they are too focused on looking for something else. They go to parties intent on finding their perfect partner and so miss opportunities to make good friends. They look through newspapers determined to find certain types of job advertisements and as a result miss other types of jobs. Lucky people are more relaxed and open, and therefore see what is there rather than just what they are looking for. (via Hacker News)
gov20  hadoop  lifehacks  multicore  multitouch  mysociety  politics  ui  web  shared  from google
october 2009 by cloudseer
