Gridstone and the Top-Down Approach to the Semantic Web
What does Gridstone Research do? When an equity analyst asks this question, we give the answer on our home page:
Using cutting-edge technology, Gridstone assembles, analyzes and structures unstructured company information into financial data, guidance, operational data and structured text. Information that could take hours to assemble is available at your fingertips, at our website or directly in Excel.
This describes the end-user benefit, but for those interested in such matters it still doesn’t say what we actually do. To explain that, I will lean heavily on an excellent ReadWriteWeb post by Alex Iskold, called Top-Down: A New Approach to the Semantic Web.
Wikipedia describes the Semantic Web thus:
The Semantic Web is an evolving extension of the World Wide Web in which web content can be expressed not only in natural language, but also in a format that can be read and used by software agents, thus permitting them to find, share and integrate information more easily.
The Semantic Web and associated standards like RDF and OWL are rapidly gaining visibility. But are they anywhere near producing something of business value? Many commentators believe it is going to be a long haul. In another great piece, Iskold outlines several challenges with what he calls the bottom-up approach to the Semantic Web:
The biggest challenge that the Semantic Web is going to face is about what to do with all the existing content. How do the website owners justify the expense related to annotating their content with semantics? And until the content is converted, no useful applications can be built on top of it. There’s a bit of a chicken and egg problem here.
Might there be another approach, then? One where some company actually builds the technology to annotate web content with semantics? Iskold calls this the top-down approach:
The essence of a top-down semantic web service is simple - leverage existing web information, apply specific, vertical semantic knowledge and then redeliver the results via a consumer-centric application.
Iskold believes that this approach is not only more likely to succeed in the short term, it is already happening. He talks about Spock, a vertical search company focused on people:
Consider the vertical search engine Spock, which scans the web for information about people. It knows how to recognize names in HTML pages and it also looks for common information about people that all people have - birthdays, locations, marital status, etc. In addition, Spock “understands” that people relate to each other.
This is very similar to what Gridstone Research does, albeit in an entirely different domain – financial information.
We:
Crawl the web (the SEC website)
Recognize significant numbers (page numbers are not significant)
Understand relationships with other numbers through a taxonomy (S&M and G&A add up to SG&A)
Understand the attributes of each number ($, millions, US GAAP, Consolidated)
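To make the taxonomy idea concrete, here is a minimal Python sketch of a number that carries its attributes and a roll-up check against its components. It is an illustration only, not our production code: the `Fact` class, the `TAXONOMY` table, the `rollup_matches` function and the figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A reported number plus the attributes that give it meaning (hypothetical model)."""
    concept: str                  # e.g. "SG&A"
    value: float                  # the figure as printed in the filing
    currency: str = "USD"
    scale: int = 1_000_000        # reported in millions
    standard: str = "US GAAP"
    scope: str = "Consolidated"


# Hypothetical one-entry taxonomy: parent concept -> components that sum to it.
TAXONOMY = {"SG&A": ["S&M", "G&A"]}


def rollup_matches(facts, parent, tolerance=0.5):
    """Return True/False if the components can be checked against the parent.

    Returns None when the parent or any of its components is missing.
    The tolerance absorbs rounding in the reported figures.
    """
    by_concept = {f.concept: f for f in facts}
    children = TAXONOMY.get(parent, [])
    if parent not in by_concept or not children:
        return None
    if any(child not in by_concept for child in children):
        return None
    total = sum(by_concept[child].value for child in children)
    return abs(total - by_concept[parent].value) <= tolerance


# Made-up figures, in millions of dollars.
facts = [Fact("S&M", 120.0), Fact("G&A", 80.0), Fact("SG&A", 200.0)]
print(rollup_matches(facts, "SG&A"))
```

A check like this is one way a taxonomy could help separate significant numbers from noise: a figure that fits an expected roll-up is much more likely to be a real line item than a page number.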
Additionally, we:
Recognize named entities
Understand relationships of brands, products, management to companies as well as among companies themselves (competitors, suppliers, customers)
Recognize forward-looking statements
Enable semantic search
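As a rough illustration of what understanding relationships can mean in practice, the Python sketch below stores relationships as subject-predicate-object triples, the same shape RDF uses, and answers a simple question over them. The company names, predicate names and helper functions are all invented for the example and do not describe our actual data model.

```python
from collections import defaultdict

# Hypothetical relationships, written as (subject, predicate, object) triples.
TRIPLES = [
    ("Acme Corp", "competitor_of", "Globex Inc"),
    ("Acme Corp", "supplier_of", "Initech"),
    ("Rocket Skates", "product_of", "Acme Corp"),
    ("Jane Doe", "manages", "Acme Corp"),
]


def build_index(triples):
    """Index triples by (predicate, object) so lookups avoid a full scan."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(predicate, obj)].append(subject)
    return index


def subjects(index, predicate, obj):
    """Return every subject that has the given relationship to obj, sorted."""
    return sorted(index.get((predicate, obj), []))


index = build_index(TRIPLES)
print(subjects(index, "product_of", "Acme Corp"))
print(subjects(index, "supplier_of", "Initech"))
```

Once relationships are held in this form, a question such as "who supplies Initech?" becomes a lookup rather than a keyword match, which is the basic difference between semantic search and plain text search.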
Over the last two years, we have been busy building the enabling technologies. This isn’t an easy problem to solve, and there are many building blocks. But finally, all the pieces are in place. Later this month we will unveil Search on the Gridstone platform. It will be unlike anything you have seen in the financial domain.
Watch this space.
