
22 Jun 2012

From Big Data to Big Semantics

What could be a strategy for the Language Technology industry?

Is there a need to fragment the Language Technology industry?

On the 19th of June 2012, most of the European Innovators on Language Technology (LT) met in Brussels at the LT-Innovate summit.
The main idea of this summit was to provide a platform for Language Technology companies to showcase their technologies and solutions, to demonstrate their capabilities in the market and to illustrate which customer problems they can solve.
We took the opportunity to discuss and identify the inhibitors of the market (such as lack of language resources, i.e. manually annotated content) as well as its potential (business intelligence on unstructured data). I say "we" because my official role at the summit was to evaluate LT companies on their business potential, so that the organisers could give awards to a few winners based on the innovation, business potential and market approach of their newest offering.
And something struck me about how these brilliant guys and girls presented their passion and their hope for success.
But first let’s take a helicopter view of the technologies and solutions that were showcased. Of the 32 companies presenting their ideas for new business, we had:
-          14 companies with Information Extraction solutions/technologies (43,7%)
-          10 companies dealing with Machine Translation (31,2%)
-          2 E-learning companies (6,2%)
-          2 companies providing Virtual Assistant solutions (6,2%)
-          1 each on Terminology, Sentiment Analysis, Search and Service
Information Extraction now seems to have overtaken Machine Translation, which until now was the hot topic. And this is good. Indeed, both Language Technology and cloud-based computing power have reached a level that allows us to search more deeply into the unstructured web, to structure it and ... to be able to analyse this data and to give some meaning to it.
And this is the point that struck me: exactly at the three dots in the last sentence, between only structuring on one side and analysing on the other, there seems to be a barrier between LT companies that stopped at the technology level and those that decided to offer solutions based on their technologies. Typical of any new industry, you might say.
Can the LT industry afford to stay at the technology level? The answer of the jury was no. All the LT companies that won awards at LT-Innovate were those that had crossed this barrier and had a solution for a specific market.
And I would like to illustrate my point with the example of Textkernel, an 11-year-old Amsterdam-based company, which received one of the LT-Innovate awards. The kernel value of Textkernel (if I can say so ;-)), as always with technology companies, is its technology. They were proud of their technology and tried to sell it. They got project work and they made it through the years. So they were good at finding projects, you might say. Sure, but their turnover was stagnating. It was hard work. Until two years ago, when they decided to change their strategy, moving from a general technology provider that “only” structures unstructured data to a solution company providing specific solutions for the Human Resources market. And since then, they have been growing!
When I asked Jakub Zavrel, Textkernel’s Managing Director, about addressing another vertical market, the answer was a clear “NO”. There is plenty to do to optimise the technology for the Human Resources market. And there is much more to do to bring disruption and change the rules of the HR market, which would really showcase the value of LT.
And I liked his refusal even to think about going for a second growth dimension. At this stage of the LT industry, we essentially need success stories that show the magic of LT and demonstrate its intelligence, value and disruptive power. And we are the only ones able to build these success stories: do not expect a person who has never touched semantics to be able to extract valuable meaning out of millions of sentences. Nobody other than LT people can build LT solutions. We have to do the work.
So, yes, we do need to fragment the LT industry, specialising in niche markets with complete solutions. Consolidation is still far away.
And this is my recommendation to all LT providers:
-     Differentiate yourself and position your company (e.g. in a niche market); there is enough room for everybody.
-     Accept the fact that you are the only one to understand the real value of the data you generate.
-     Cross the solution barrier: forget for some time that your technology can do more, and just do one thing, but do it well.
-     Understand the business processes of your target market and find the fit.

Summary:

The LT industry transforms Big Data into Big Semantics. The semantically structured data generated by the LT industry is understandable only to the LT industry, not to its customers. This data is too new for them. We in the LT industry are the only ones who can demonstrate its value. Offering project work based on technology will not help us. We need to offer solutions.