Computers are being taught to understand the meaning behind words and images on the internet, bringing online a new generation of intelligent software that can perform tasks that until now only humans could do.
It’s thanks to machine learning, artificial intelligence and an emerging branch of computer science called the semantic web.
‘Semantic technologies can play a very important role to find a smarter way to get high-level information that is currently retrieved by human operators,’ said Andrea Ciapetti, open source specialist at Italian technology firm Innovation Engineering.
He is working with the Madrid police to create a search engine that can analyse video footage and discover criminal acts such as someone being pickpocketed.
Software identifies elements of the video which might be of interest, then semantic web technology looks at these events and picks out the ones that might indicate a crime is taking place.
His company is also working on the EU-funded DISCOVER-IT project to create a semantic search tool for start-up companies that can scour the web for relevant patents and data from open access research papers in order to help them come up with innovative ideas.
Semantic technology works by annotating words and images with supplementary information so that software can understand their meaning.
‘This is what the semantic web is about, turning the web of documents as it is today, as it is for humans, into a web of data that is for software consumption,’ said Luca De Santis, from Net7, an Italy-based web technology firm.
He is the project manager of StoM, an EU-funded project which is working out how to commercialise two semantic search engines developed as part of an earlier project, SemLib.
One of the products, called EventPlace, is a search tool that brings together information relating to an event, while the other, PunditBrain, is used to create annotations on web documents that, thanks to semantic technologies, are easier to search and reuse.
Semantic search is able to link similar ideas together in this way because it adds explanatory information to web pages, or links to external repositories which give meaning and context to words.
Wikipedia for computers
Data repositories such as DBpedia, a version of Wikipedia for computers, are at the heart of semantic technology. These can be used to annotate web pages, making them easier for semantic systems to understand.
It means that when the software comes across a word with more than one meaning, such as ‘rock’, which can refer to music or to geology, it can check the repository to find out which sense is intended.
‘I can provide a link to the DBpedia entity, and say, “Ok, this is about music”,’ said De Santis.
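The kind of annotation De Santis describes can be sketched in a few lines. This is a minimal, illustrative example, not any particular semantic web toolkit: the annotation structure is invented for clarity, though the entity URI is a real DBpedia identifier of the kind such systems link to.

```python
# Minimal sketch of semantic annotation: attaching a DBpedia entity URI
# to an ambiguous word so that software can resolve its meaning.
# The annotate() function and its output format are illustrative, not
# taken from any real semantic web library.

def annotate(text, word, entity_uri):
    """Attach a semantic annotation (word -> entity URI) to a text."""
    return {
        "text": text,
        "annotations": [{"word": word, "entity": entity_uri}],
    }

doc = annotate(
    "The band's new rock album tops the charts.",
    "rock",
    "http://dbpedia.org/resource/Rock_music",  # music, not geology
)
```

With the annotation in place, any downstream software reading the document no longer has to guess which sense of ‘rock’ is meant; it can follow the link and retrieve the context the repository provides.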
Semantic technologies are already being used to group news articles together when they are about the same thing, or to understand what a Facebook user is interested in by looking at the similarities between pages they have ‘liked’.
‘Facebook, through semantic web technology, can understand what you really like,’ said De Santis. ‘Is the page about restaurants, or is it about rock music?’
One of the problems with using semantic techniques is that, in many areas, meaning can be difficult to define in a mathematically precise way, creating data that is ill-suited to the logical reasoning used by computers.
One example is wine, where words that are used to describe taste, such as sweet or fruity, can mean slightly different things to different people. Yet for a semantic technology search to be able to answer questions, such as which wine goes well with a specific dish, it needs to understand and use these different terms.
‘Building a logical theory for these real-world domains is quite tricky,’ explained Dr Steven Schockaert, principal investigator of the FLEXILOG project.
He is working on a way to spatially model the meaning of words, so that it can be used to answer logical search queries.
‘The goal would be to have a system that can just learn, on its own, information about many different domains as it’s reading information on the web,’ said Dr Schockaert, whose work has been funded by the EU’s European Research Council.
The plan is to feed the system information that it can use to learn.
‘Initially we’re going to work on Wikipedia, then we’re going to scale it up to a substantial fragment of the web.’
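The idea of spatially modelling meaning can be illustrated with a toy example: represent each word as a point in a space whose dimensions stand for aspects of meaning, then compare words by the angle between their vectors. The coordinates below are hand-picked purely for illustration and are not from the FLEXILOG project.

```python
# Toy sketch of spatial word meaning: words as vectors in a 2-D
# "meaning space" (axis 1 ~ music-relatedness, axis 2 ~ geology-
# relatedness). Coordinates are invented for illustration only.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

vectors = {
    "guitar":  (0.9, 0.1),
    "granite": (0.1, 0.9),
    "rock":    (0.7, 0.6),  # ambiguous: between the two senses
}

# In this toy space, "rock" sits closer to "guitar" than to "granite".
sim_music = cosine_similarity(vectors["rock"], vectors["guitar"])
sim_geo = cosine_similarity(vectors["rock"], vectors["granite"])
```

In a real system the coordinates would be learned automatically from large text collections rather than set by hand, which is what makes the web-scale learning Dr Schockaert describes possible.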