This link has been bookmarked by 258 people. It was first bookmarked on 19 Mar 2007, by jengalbells.
-
01 Feb 12
-
01 Nov 11
al-Amjad Tawfiq Isstaif
Today's Web has terabytes of information available to humans, but hidden from computers. It is a paradox that information is stuck inside HTML pages, formatted in esoteric ways that are difficult for machines to process. The so-called Web 3.0, which is likely to be a precursor of the real semantic web, is going to change this. What we mean by 'Web 3.0' is that major web sites are going to be transformed into web services - and will effectively expose their information to the world.
-
28 Sep 11
-
23 Sep 11
-
09 Sep 11
-
26 Jun 11
-
09 Jun 11
-
07 Apr 11
-
25 Mar 11
-
24 Feb 11
-
23 Dec 10
-
31 Aug 10
-
13 Aug 10
-
20 Jun 10
-
19 Jun 10
-
30 May 10
dolors reig
Web 3.0: When Web Sites Become Web Services. Today's Web has terabytes of information available to humans, but hidden from computers. It is a paradox that information is stuck inside HTML pages, formatted in esoteric ways that are difficult for machines
-
17 May 10
-
09 Apr 10
-
19 Feb 10
-
18 Nov 09
-
12 Oct 09
Monica Sans
Today's Web has terabytes of information available to humans, but hidden from computers. It is a paradox that information is stuck inside HTML pages, formatted in esoteric ways ...
-
04 Sep 09
-
25 Aug 09
Peter van der Reijden
Today's Web has terabytes of information available to humans, but hidden from computers. It is a paradox that information is stuck inside HTML pages, formatted in esoteric ways that are difficult for machines to process. The so-called Web 3.0, which is li
-
22 Aug 09
-
18 Aug 09
-
01 Jul 09
-
26 May 09
-
14 May 09
-
09 Apr 09
-
18 Mar 09
-
15 Mar 09
-
20 Feb 09
-
12 Feb 09
-
04 Feb 09
-
23 Jan 09
-
09 Jan 09
-
30 Dec 08
-
29 Dec 08
-
23 Dec 08
-
15 Nov 08
-
What we mean by 'Web 3.0' is that major web sites are going to be transformed into web services - and will effectively expose their information to the world
-
-
03 Nov 08
-
08 Oct 08
-
03 Oct 08
-
04 Sep 08
-
31 Aug 08
-
09 Aug 08
-
01 Aug 08
-
24 Jul 08
-
21 Jul 08
Simone Economo
"As more and more of the Web is becoming remixable, the entire system is turning into both a platform and the database. Yet, such transformations are never smooth. For one, scalability is a big issue. And of course legal aspects are never simple."
semantic web semanticweb semantics web2.0 web3.0 webservices api mashup business structured data structure webdev webstandards for:andreagandino for:craiv
-
17 Jun 08
-
Today's Web has terabytes of information available to humans, but hidden from computers. It is a paradox that information is stuck inside HTML pages, formatted in esoteric ways that are difficult for machines to process. The so-called Web 3.0, which is likely to be a precursor of the real semantic web, is going to change this. What we mean by 'Web 3.0' is that major web sites are going to be transformed into web services - and will effectively expose their information to the world.
-
The transformation will happen in one of two ways. Some web sites will follow the example of Amazon, del.icio.us and Flickr and will offer their information via a REST API. Others will try to keep their information proprietary, but it will be opened via mashups created using services like Dapper, Teqlo and Yahoo! Pipes
-
The net effect will be that unstructured information will give way to structured information - paving the road to more intelligent computing
-
One of the first web services opened up by Amazon was the E-Commerce service. This service opens access to the majority of items in Amazon's product catalog
-
Why has Amazon offered this service completely free? Because most applications built on top of this service drive traffic back to Amazon (each item returned by the service contains the Amazon URL). In other words, with the E-Commerce service Amazon enabled others to build ways to access Amazon's inventory
-
It focuses on letting people create mashups and widgets from web services and RSS.
-
So bringing together Open APIs (like the Amazon E-Commerce service) and scraping/mashup technologies gives us a way to treat any web site as a web service that exposes its information. The information, or to be more exact the data, becomes open. In turn, this enables software to take advantage of this information collectively. With that, the Web truly becomes a database that can be queried and remixed.
-
There are several good reasons why web sites (online retailers in particular) should think about offering an API. The most important reason is control. Having an API will make scrapers unnecessary, but it will also allow tracking of who is using the data - as well as how and why. Like Amazon, sites can do this in a way that fosters affiliates and drives the traffic back to their sites.
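The tracking point can be illustrated with a small sketch. Everything here is hypothetical and is not taken from Amazon's or any other real service: the catalog, the API keys, and the `lookup_item` function only show how an API endpoint can record who asked for what, and how each returned item can carry a URL that leads back to the site.

```python
from collections import Counter

# Hypothetical in-memory catalog; a real service would query a database.
CATALOG = {
    "B0001": {"title": "Example Book", "url": "http://shop.example/item/B0001"},
}

# Usage log keyed by (api_key, operation). This is the "who, how" record
# that a scraper hitting plain HTML pages would never leave behind.
usage = Counter()


def lookup_item(api_key: str, item_id: str):
    """Return the catalog entry for item_id and record which key asked for it."""
    usage[(api_key, "lookup_item")] += 1
    # Each entry includes the item's URL, so client apps can link back to the site.
    return CATALOG.get(item_id)


lookup_item("affiliate-1", "B0001")
lookup_item("affiliate-1", "B0001")
lookup_item("affiliate-2", "B9999")  # unknown item, still counted

for (key, operation), calls in usage.most_common():
    print(key, operation, calls)
```

The same log could later feed per-call billing or affiliate reports, which is the kind of control the passage above describes.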
-
The old perception is that closed data is a competitive advantage. The new reality is that open data is a competitive advantage. The likely solution then is to stop worrying about protecting information and instead start charging for it, by offering an API. Having a small fee per API call (think Amazon Web Services) is likely to be acceptable, since the cost for any given subscriber of the service is not going to be high. But there is a big opportunity to make money on volume. This is what Amazon is betting on with their Web Services strategy and it is probably a good bet
-
21 May 08
-
20 May 08
-
09 May 08
-
25 Apr 08
-
11 Apr 08
-
05 Apr 08
-
04 Apr 08
-
31 Mar 08
-
Web 3.0, which is likely to be a precursor of the real semantic web
-
What we mean by 'Web 3.0' is that major web sites are going to be transformed into web services - and will effectively expose their information to the world.
-
The old perception is that closed data is a competitive advantage. The new reality is that open data is a competitive advantage.
-
The likely solution then is to stop worrying about protecting information and instead start charging for it, by offering an API. Having a small fee per API call (think Amazon Web Services) is likely to be acceptable, since the cost for any given subscriber of the service is not going to be high.
-
-
27 Mar 08
-
26 Mar 08
-
25 Mar 08
-
23 Mar 08
-
21 Mar 08
-
07 Mar 08
-
18 Feb 08
-
13 Feb 08
Steve Hecom API's explained
-
15 Jan 08
-
12 Jan 08
-
05 Jan 08
-
03 Jan 08
-
31 Dec 07
-
-
12 Dec 07
-
The transformation will happen in one of two ways. Some web sites will follow the example of Amazon, del.icio.us and Flickr and will offer their information via a REST API. Others will try to keep their information proprietary, but it will be opened via mashups created using services like Dapper, Teqlo and Yahoo! Pipes. The net effect will be that unstructured information will give way to structured information - paving the road to more intelligent computing. In this post we will look at how this important transformation is taking place already and how it is likely to evolve.
-
One of the first web services opened up by Amazon was the E-Commerce service. This service opens access to the majority of items in Amazon's product catalog. The API is quite rich, allowing manipulation of users, wish lists and shopping carts. However, its essence is the ability to look up Amazon's products.
-
Why has Amazon offered this service completely free? Because most applications built on top of this service drive traffic back to Amazon (each item returned by the service contains the Amazon URL). In other words, with the E-Commerce service Amazon enabled others to build ways to access Amazon's inventory. As a result many companies have come up with creative ways of leveraging Amazon's information - you can read about these successes in one of our previous posts.
-
What it does do is allow authorized mashups to manipulate the user information stored in del.icio.us. For example, an application may add a post, or update a tag, programmatically. However, there is no way to ask del.icio.us, via the API, what URLs have been posted to it or what has been tagged with the tag 'web 2.0' across the entire del.icio.us database. These questions are easy to answer via the web site, but not via the current API.
-
We recently covered Yahoo! Pipes, a new app from Yahoo! focused on remixing RSS feeds. Another similar technology, Teqlo, has recently launched. It focuses on letting people create mashups and widgets from web services and RSS. Before both of these, Dapper launched a generic scraping service for any web site. Dapper is an interesting technology that facilitates the scraping of web pages, using a visual interface.
-
Scraping technologies are actually fairly questionable. In a way, they can be perceived as stealing the information owned by a web site. The whole issue is complicated because it is unclear where copy/paste ends and scraping begins. It is okay for people to copy and save the information from web pages, but it might not be legal to have software do this automatically. And scraping a page and then offering a service that leverages the information without crediting the original source is unlikely to be legal.
-
Having an API will make scrapers unnecessary, but it will also allow tracking of who is using the data - as well as how and why. Like Amazon, sites can do this in a way that fosters affiliates and drives the traffic back to their sites.
-
open data is a competitive advantage. The likely solution then is to stop worrying about protecting information and instead start charging for it, by offering an API. Having a small fee per API call (think Amazon Web Services) is likely to be acceptable, since the cost for any given subscriber of the service is not going to be high. But there is a big opportunity to make money on volume. This is what Amazon is betting on with their Web Services strategy and it is probably a good bet.
-
Web is becoming remixable
-
it is not a question of if web sites become web services, but when and how. APIs are a more controlled, cleaner and altogether preferred way of becoming a web service. However, when APIs are not available or sufficient, scraping is bound to continue and expand.
-
Scraping is terribly unreliable and inefficient.
-
Don't forget microformats and similar things that will allow regular sites to offer up properly formatted data without significantly changing workflow. Services in Yahoo Pipes' ilk will be able to use data provided by microformats very accurately in future.
-
these guys are trying to give birth to a host of companies that build off their structured, managed approach to building schema for the "semantic web."
-
-
06 Dec 07
-
03 Dec 07
-
The net effect will be that unstructured information will give way to structured information - paving the road to more intelligent computing.
-
However, only a fraction of those APIs are opening up information - most focus on manipulating the service itself. This is an important distinction to understand in the context of this article.
-
So how do these services get around the fact that there is no API? The answer is that they leverage standardized URLs and a technique called Web scraping.
-
How Web Scraping Works
Web Scraping is essentially reverse engineering of HTML pages. It can also be thought of as parsing out chunks of information from a page.
-
Information that seems to be free is perceived as being free.
-
old perception is that closed data is a competitive advantage. The new reality is that open data is a competitive advantage
-
The likely solution then is to stop worrying about protecting information and instead start charging for it, by offering an API.
-
As more and more of the Web is becoming remixable, the entire system is turning into both a platform and the database.
-
But it is not a question of if web sites become web services, but when and how. APIs are a more controlled, cleaner and altogether preferred way of becoming a web service.
-
-
18 Nov 07
joseafigueira
Online photo editing made fun. With Picnik you can quickly edit all your online photos from one place. It's the easiest way on the Web to fix underexposed photos, remove red-eye, or apply effects to your photos. It's fast, easy, and fun.
-
15 Nov 07
-
31 Oct 07
-
30 Oct 07
-
28 Oct 07
-
27 Oct 07
Patricia Moss
Today's Web has terabytes of information available to humans, but hidden from computers. It is a paradox that information is stuck inside HTML pages, formatted in esoteric ways ...
-
26 Oct 07
-
24 Oct 07
-
23 Oct 07
-
21 Oct 07
Henk-Jan van der Klis
move from HTML pages to web services, SaaS
apps delicious internet socialnetworking web3.0 webservices mashup trends web api web2.0 webdesign
-
20 Oct 07
-
13 Oct 07
-
09 Oct 07
-
07 Oct 07
-
26 Sep 07
-
17 Sep 07
-
14 Sep 07
-
12 Sep 07
-
So how do these services get around the fact that there is no API? The answer is that they leverage standardized URLs and a technique called Web scraping. Let's understand how this works. In del.icio.us, for example, all URLs that have the tag 'book' can be found under the URL http://del.icio.us/tag/book; all URLs tagged with the tag 'movie' are at http://del.icio.us/tag/movie; and so on. The structure of this URL is always the same: http://del.icio.us/tag/[TAG]. So given any tag, a computer program can fetch the page that contains the list of sites tagged with it. Once the page is fetched, the program can perform the scraping - the extraction of the necessary information from the page.
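The URL pattern described above can be sketched in a few lines of Python. This is only an illustration: the names `BASE_URL` and `tag_url` are made up for this example, and the block builds the address without fetching anything, since the del.icio.us pages from the time of the article may no longer exist in this form.

```python
from urllib.parse import quote

# Base address taken from the examples in the text above.
BASE_URL = "http://del.icio.us/tag/"


def tag_url(tag: str) -> str:
    """Return the listing URL for a tag, following the /tag/[TAG] pattern."""
    # Percent-encode the tag so spaces or slashes cannot break the path.
    return BASE_URL + quote(tag, safe="")


for tag in ("book", "movie"):
    print(tag_url(tag))
```

A program that wants the listing for a tag would pass the resulting URL to an HTTP client and hand the returned HTML to a scraper.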
How Web Scraping Works
Web Scraping is essentially reverse engineering of HTML pages. It can also be thought of as parsing out chunks of information from a page. Web pages are coded in HTML, which uses a tree-like structure to represent the information. The actual data is mingled with layout and rendering information and is not readily available to a computer. Scrapers are the programs that "know" how to get the data back from a given HTML page. They work by learning the details of the particular markup and figuring out where the actual data is.
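As a rough sketch of what such a scraper looks like, the Python example below walks an HTML snippet with the standard library's `html.parser` and pulls out the links that hold the actual data. The markup in `SAMPLE_PAGE`, the `class="title"` convention, and the names `BookmarkScraper` and `scrape` are all assumptions made for illustration; they do not describe the real markup of del.icio.us or any other site.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a tag listing page; real pages differ.
SAMPLE_PAGE = """
<html><body>
  <div id="nav"><a href="/help">Help</a></div>
  <ul>
    <li class="post"><a class="title" href="http://example.com/a">First book site</a></li>
    <li class="post"><a class="title" href="http://example.org/b">Second book site</a></li>
  </ul>
</body></html>
"""


class BookmarkScraper(HTMLParser):
    """Collect (url, title) pairs from <a class="title"> elements."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._href = None  # href of the data link currently being read, if any
        self._text = []    # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # The scraper "knows" that data links carry class="title";
        # navigation and layout links do not, so they are skipped.
        if tag == "a" and attrs.get("class") == "title":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.bookmarks.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape(html: str):
    """Return the list of (url, title) pairs found in the given HTML."""
    scraper = BookmarkScraper()
    scraper.feed(html)
    return scraper.bookmarks


for url, title in scrape(SAMPLE_PAGE):
    print(url, title)
```

The knowledge of the markup lives entirely in the `class="title"` check, which is why a scraper like this breaks as soon as the site changes its page layout.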
-
-
05 Sep 07