Latent semantic analysis - Wikipedia, the free encyclopedia
Analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms.
nlp  search  svd 
12 days ago
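A minimal sketch of the idea in the latent semantic analysis entry above, assuming a tiny hand-made term-document count matrix (the terms, documents, and helper names are illustrative, not from the article). It approximates a rank-2 truncated SVD with power iteration and deflation, then compares documents in the resulting concept space. Real LSA work would normally use a linear algebra library instead.

```python
import math

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b)) / (norm(a) * norm(b))

def truncated_svd(A, k, iters=200):
    """Return k (sigma, u, v) triplets of A via power iteration with deflation."""
    B = [row[:] for row in A]
    n_terms, n_docs = len(B), len(B[0])
    triplets = []
    for _ in range(k):
        Bt = [list(col) for col in zip(*B)]
        # Asymmetric start vector so it is not orthogonal to the target by accident.
        v = [float(j + 1) for j in range(n_docs)]
        for _step in range(iters):
            v = matvec(Bt, matvec(B, v))      # one step of power iteration on B^T B
            length = norm(v)
            v = [x / length for x in v]
        Bv = matvec(B, v)
        sigma = norm(Bv)
        u = [x / sigma for x in Bv]
        triplets.append((sigma, u, v))
        # Deflate: subtract the rank-1 component just found.
        B = [[B[i][j] - sigma * u[i] * v[j] for j in range(n_docs)]
             for i in range(n_terms)]
    return triplets

# Hypothetical corpus: rows are terms, columns are documents d0..d3.
TERMS = ["car", "automobile", "engine", "flower", "petal"]
A = [
    [1, 0, 0, 0],  # car
    [0, 1, 0, 0],  # automobile
    [1, 1, 0, 0],  # engine
    [0, 0, 1, 1],  # flower
    [0, 0, 1, 2],  # petal
]

columns = [list(col) for col in zip(*A)]          # raw document vectors
triplets = truncated_svd(A, k=2)
# Document j in concept space: (sigma_1 * v_1[j], sigma_2 * v_2[j]).
doc_vecs = [[sigma * v[j] for sigma, _u, v in triplets] for j in range(len(columns))]

print("raw cosine d0,d1:    ", round(cosine(columns[0], columns[1]), 3))
print("concept cosine d0,d1:", round(cosine(doc_vecs[0], doc_vecs[1]), 3))
print("concept cosine d0,d2:", round(cosine(doc_vecs[0], doc_vecs[2]), 3))
```

Documents d0 ("car engine") and d1 ("automobile engine") share only one term, yet they land on the same concept once the matrix is reduced to two dimensions, while d0 and d2 stay unrelated.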
Facebook's 900 Million? But What about Engagement?
Average Minutes per Visit
fb=10.9
pinboard=9.6

Average Monthly Visits per Visitor
f=36
twitter=7.5
p=5.9
L=4.9

Total Minutes
f=62,160
t=844
l=705
socialnetworks  statistics  usage  comScore 
12 days ago
Secretary problem - Wikipedia, the free encyclopedia
The optimal stopping rule is to reject the first n/e applicants or so outright (where e is the base of the natural logarithm), then stop at the first applicant who is better than every applicant interviewed so far (or proceed to the last applicant if this never occurs).
mathematics  statistics  probability 
19 days ago
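The stopping rule in the secretary problem entry above can be checked with a small simulation. This is a sketch under stated assumptions: applicants arrive in uniformly random order, the cutoff is n/e rounded to the nearest integer, and the function names and trial counts are illustrative.

```python
import math
import random

def secretary_trial(n, rng):
    """Run one round; return True if the rule picks the single best applicant."""
    ranks = list(range(n))          # higher number = better applicant
    rng.shuffle(ranks)
    cutoff = round(n / math.e)      # applicants rejected without choice
    best_seen = max(ranks[:cutoff], default=-1)
    for score in ranks[cutoff:]:
        if score > best_seen:       # first applicant better than all seen so far
            return score == n - 1
    return ranks[-1] == n - 1       # rule never triggered: stuck with the last one

def success_rate(n, trials, seed=0):
    rng = random.Random(seed)
    wins = sum(secretary_trial(n, rng) for _ in range(trials))
    return wins / trials

rate = success_rate(n=100, trials=10000)
print(f"picked the best applicant in {rate:.1%} of trials (1/e is about {1 / math.e:.1%})")
```

For large n the success probability of this rule approaches 1/e, so the simulated rate should come out close to 37%.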
Anscombe's quartet - Wikipedia, the free encyclopedia
Anscombe's quartet comprises four datasets that have identical simple statistical properties, yet appear very different when graphed.
data  statistics  outliers 
19 days ago
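A quick way to see the point of the Anscombe's quartet entry above is to compute the summary statistics directly. The sketch below uses the first two of the four datasets, with values as commonly reproduced from Anscombe's 1973 table (worth checking against the source before relying on them); the helper names are illustrative.

```python
import math

# Datasets I and II share the same x values.
X = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
Y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
Y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

def mean(values):
    return sum(values) / len(values)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

for name, ys in (("I", Y1), ("II", Y2)):
    print(f"dataset {name}: mean(y) = {mean(ys):.2f}, r = {pearson_r(X, ys):.3f}")
```

Both datasets report nearly the same mean and correlation, even though dataset I is roughly linear with noise and dataset II follows a smooth curve, which only a plot reveals.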
Carl Meyer: Books
Author of 'Google's PageRank and beyond : the science of search engine rankings'
books  authors  search  google 
22 days ago
Amy N. Langville's homepage
Author of 'Google's PageRank and beyond : the science of search engine rankings'
authors  books  search  google 
22 days ago
About John | John Battelle's Search Blog
Author of 'The search : how Google and its rivals rewrote the rules of business and transformed our culture' / John Battelle.
books  search  authors  google 
22 days ago
Spicy Grilled Salmon | Chowguys
2 boneless salmon fillets or steaks
1 Cup of Fresh cilantro
3 Large spoons of Extra virgin olive oil
1/2 Lemon juice
2 Cloves of Garlic, minced
1 Generous pinch of ground coriander
1 Generous pinch of Fresh ginger, peeled and minced
1 Generous pinch of ground cinnamon
1 Generous pinch of ground cardamom
Generous pinch of Kosher salt
Generous pinch of Black Pepper
fish  salmon  recipes  food 
22 days ago
Grilled Wild Salmon with Preserved Lemon Relish | Simply Recipes
Ingredients

One whole (4 pounds) or half (2 pounds) of a wild-caught salmon, gutted, cleaned, skin on, scales removed
Olive oil
Fresh squeezed lemon juice

Relish Ingredients

2 whole preserved lemons, rinsed of excess salt, seeds removed, chopped
1/4 cup chopped fresh parsley
1/4 cup chopped fresh dill
1/4 cup chopped shallots
1 teaspoon olive oil
1/2 teaspoon lemon juice
Ground black pepper
grilling  salmon  recipes  fish 
22 days ago
Clayton Christensen's "How Will You Measure Your Life?" — HBS Working Knowledge
Blockbuster's mistake? To follow a principle that is taught in every fundamental course in finance and economics. That is, in evaluating alternative investments, we should ignore sunk and fixed costs, and instead base decisions on the marginal costs and revenues that each alternative entails. But it's a dangerous way of thinking. Almost always, such analysis shows that the marginal costs are lower, and marginal profits are higher, than the full cost.

This doctrine biases companies to leverage what they have put in place to succeed in the past, instead of guiding them to create the capabilities they'll need in the future. If we knew the future would be exactly the same as the past, that approach would be fine. But if the future's different, and it almost always is, then it's the wrong thing to do. As Blockbuster learned the hard way, we end up paying for the full cost of our decisions, not the marginal costs, whether we like it or not.
innovation  business  strategy 
23 days ago
Getting Started Using HTML5 Boilerplate - Dan Wahlin's WebLog
Features offered in HTML5 Boilerplate include cross-browser compatibility (they deal with IE6, IE7 and IE8 in a clever way), inclusion of caching and compression rules, utility classes such as .no-js and .clearfix, .png support in IE6, Modernizr support, Google Analytics support, mobile browser optimizations, IE-specific classes for maximum cross-browser control, JavaScript profiling and testing support, CDN-hosted jQuery with a local fallback script, plus quite a bit more.
css  html5  html 
24 days ago
HTML5 Boilerplate - A rock-solid default template for HTML5 awesome.
HTML5 Boilerplate is the professional frontend developer's base HTML/CSS/JS template for a fast, robust and future-safe site.
css  html  html5  javascript 
24 days ago
NLTK Data
NLTK has built-in support for dozens of corpora and trained models, as listed below. To use these within NLTK we recommend that you use the NLTK corpus downloader, >>> nltk.download()
nlp  nltk  nltk_data  corpus 
25 days ago
Ch. 5: Categorizing and Tagging Words
The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS-tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The collection of tags used for a particular task is known as a tagset. Our emphasis in this chapter is on exploiting tags, and tagging text automatically.
nltk  nlp  pos_tagging 
25 days ago
Installing NLTK — NLTK 2.0 documentation
Open Finder>Applications>Utilities>Terminal and type python -V to find out what version of Python is installed
Install Setuptools: Download the corresponding version of Setuptools from http://pypi.python.org/pypi/setuptools (scroll to the bottom, and pick the filename that contains the right version number and which has the extension .egg). Install it by typing sudo sh Downloads/setuptools-...egg, giving the location of the downloaded file.
Install Pip: run sudo easy_install pip
Install Numpy: run sudo pip install numpy --upgrade
Install NLTK: run sudo pip install nltk --upgrade
Test installation: run python then type import nltk
nltk  python  nlp 
25 days ago
Penn Treebank P.O.S. Tags
Alphabetical list of part-of-speech tags used in the Penn Treebank Project
nlp  pos_tagging 
26 days ago
Git - Community Ubuntu Documentation
Git is an open source, distributed version control system designed to handle everything from small to very large projects with speed and efficiency.

Every Git clone is a full-fledged repository with complete history and full revision tracking capabilities, not dependent on network access or a central server. Branching and merging are fast and easy to do.
howto  ubuntu  git  sourcecontrol 
26 days ago
A Bayesian Approach to A/B Testing | Custora Blog
Significance testing is useful when the goal is inference. If we want to make falsifiable statements, and draw conclusions through experimentation, then we use statistical significance to measure certainty. However in the business setting, we want to make a decision with some other goal in mind: increasing conversions, improving ease of use, maximizing profit, or some other objective. In these cases there are better criteria to determine an experiment stopping point.
statistics  tools  web  a/btesting  webanalytics 
29 days ago
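One quantity often used in Bayesian A/B testing is the posterior probability that one variant's conversion rate exceeds the other's. The sketch below is not taken from the Custora post: it assumes a Beta-Binomial model with a uniform Beta(1, 1) prior, estimates the probability by Monte Carlo sampling, and uses made-up visitor and conversion counts.

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) under independent Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each variant: Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws

# Hypothetical experiment: 1000 visitors per variant.
p = prob_b_beats_a(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"P(B converts better than A) is roughly {p:.3f}")
```

A decision rule could stop the experiment once this probability crosses a threshold chosen for the business goal, such as 0.95. The threshold is itself an assumption to tune, not a universal constant.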
Stats: Facebook Made $9.51 in Ad Revenue Per User Last Year In The U.S. and Canada | TechCrunch
Facebook made about $9.51 in advertising revenue per user in the U.S. and Canada. Europe was about half that much with $4.86 in ad revenue per user. Asia and the rest of the world follow that at $1.79 and $1.42 per user.
facebook  marketing  advertising  marketshare 
29 days ago
Inherent Uncertainty: SEO is very sophisticated (and.... we're back!)
The main idea is that web traffic is driven by search queries, and for a large number of search terms it's relatively easy to game the ranking algorithm of Google, Yahoo, etc. -- as long as one can manage to get several links to the desired site embedded in enough relevant pages scattered around the web. So let us observe: the blog you are currently reading contains the words "computer", "online" and "learning" many, many times -- hence it's a prime target for boosting one's ranking for queries such as "online university".

A company hoping to boost the ranking of site X pays an SEO firm to spread links to X around the web via content farms, unmoderated blog comments, etc. -- but all of these practices can be done via scripting and web crawling, i.e. it's all automated. I've actually done some work [http://www.youtube.com/watch?v=UAO73M7NlHk] on detecting such "web spam", i.e. hosts and pages generated solely for the purpose of optimizing search rankings. When this research was published nearly 4 years ago, my coauthors and I emphasized a key observation: "good" sites rarely link to "spammy" sites. This makes detecting spammy sites relatively easy, since they don't get a lot of in-links from trusted sites.
seo  search 
29 days ago
Learn more - MetaOptimize Q+A
You and other data geeks can ask and answer questions on machine learning, natural language processing, artificial intelligence, text analysis, information retrieval, search, data mining, statistical modeling, and data visualization.
machinelearning  nlp  datamining 
4 weeks ago
Business development: the Goldilocks principle - Chris Dixon
At Hunch, we switched our focus (“pivoted”) about 14 months ago from B2C to B2B. Over that time, we pitched over 500 potential partners, trying to get them to use and eventually pay for our recommendation services. This process had its ups and downs, but eventually ended well when – after 8 months of grueling diligence – eBay decided to acquire Hunch in what I expect will be a successful outcome for both companies. During this time, I got a crash course in B2B sales/business development. Here is the first in a series of blog posts based on what I learned.

Somewhat counterintuitively, the biggest problem we encountered when pitching Hunch technology to potential partners wasn’t that it wasn’t interesting or useful to them, but that it was so interesting and useful that they considered it “strategic” or “core” and thus felt they needed to own and not rent it.
startup  strategy  entrepreneurship 
4 weeks ago
breaking culture • The Educational Lottery
Felicity Allen, ed.
Education
Whitechapel/MIT Press (Documents of Contemporary Art), August 2011. 240 pp.

Philip W. Jackson
What Is Education?
University of Chicago Press, December 2011. 136 pp.

John Marsh
Class Dismissed: Why We Cannot Teach or Learn Our Way Out of Inequality
Monthly Review Press, July 2011. 328 pp.

Professor X
In the Basement of the Ivory Tower: Confessions of an Accidental Academic
Viking Adult, March 2011. 288 pp.
books  education 
4 weeks ago