Wednesday, August 13, 2008

URLs Everywhere


Google analyzes backlinks posted on web pages to calculate PageRank, which in turn influences the search results ranking of pages. But web pages aren’t the only places to find URL pointers. Whether or not Google evaluates any of these, I don’t know:


  • There are URLs on billboards (some of which Google already stores in Google Street View)
  • There are URLs scribbled on walls using black marker
  • There are URLs mentioned in books (some of which Google already scanned as part of Google Book Search)
  • Besides books, there are URLs in other print media like magazines, newspapers and comic books
  • URLs are sent around in emails (some of which Google already stores in Gmail)
  • URLs are sent around in chat programs (and when you’re using Google Talk, Google could analyze this)
  • URLs are sometimes pasted in the chat rooms of IRC (or in the 3D worlds of Second Life, Lively and others)
  • URLs sometimes appear in text documents, spreadsheets and presentations (Google knows those stored in Google Docs)
  • Sometimes, URLs are mentioned on web pages but not linked, like in certain news reports; also, URLs may appear in Flash files, JavaScript files and so on, some of which Google says it already crawls
  • URLs appear on TV in spoken or printed form; they are also mentioned on the radio
  • URLs are sometimes sent by SMS
  • URLs are sometimes talked about
  • URLs are sometimes thought about

Whether accessing any of these makes sense is another issue. Billboards are ads, so they would probably be excluded. Links sent in emails are often spam (though Google has ways to find out about certain spam and could exclude those mails or, conversely, add penalty points to the linked pages). Links sent in emails are also sometimes private, as are some of the URLs talked about or thought about (and the latter are hard to scan with today’s technology in any case). In other instances, the sample would be skewed, as it would only be the “Google property” sample (e.g. when analyzing URLs in Google Spreadsheets but not in MS Excel files). Links in books could be more authoritative, though they might also be outdated. But sometimes, analyzing URLs outside of web pages may give a better ranking, because the signal is more diverse and fine-tuned.
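
To make the "mentioned but not linked" case a bit more concrete, here is a minimal Python sketch of how URL mentions could be pulled out of plain text (an email body, a scanned book page, a chat log) and tallied per host. This is only an illustration, not how Google does it: the regular expression, the function names extract_urls and count_hosts, and the idea of counting mentions per host are all my own assumptions.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed, deliberately simple pattern: http(s) URLs and bare "www." hosts.
# Real-world URL detection is messier than this.
URL_PATTERN = re.compile(r"\b(?:https?://|www\.)[^\s<>()\[\]]+", re.IGNORECASE)


def extract_urls(text):
    """Return URL-like strings found in plain text, in order of appearance."""
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Trim punctuation that usually belongs to the sentence, not the URL.
        urls.append(match.group(0).rstrip(".,;:!?\"'"))
    return urls


def count_hosts(texts):
    """Count how often each host is mentioned across several plain-text sources."""
    counts = Counter()
    for text in texts:
        for url in extract_urls(text):
            # Bare "www." mentions need a scheme before urlsplit sees a host.
            if not url.lower().startswith(("http://", "https://")):
                url = "http://" + url
            host = urlsplit(url).hostname
            if host:
                counts[host] += 1
    return counts


if __name__ == "__main__":
    sources = [
        "Visit www.example.com today!",
        "See http://www.example.com/a and https://example.org.",
    ]
    for host, mentions in count_hosts(sources).most_common():
        print(host, mentions)
```

A tally like this would only be a raw signal. The caveats above still apply: spam, private messages and a sample skewed towards a single provider’s data would all have to be dealt with before such counts could say anything about ranking.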


