Computer scientists develop an AI system that automatically rewrites outdated sentences in Wikipedia articles while maintaining a human tone
- The AI system compares a Wikipedia sentence with new data in a ‘claim’ sentence
- The algorithm labels each pair as agree, disagree or neutral
- If the pair disagrees, the outdated words or numbers are replaced
A computer system has been developed that scans a Wikipedia article and automatically detects, checks and corrects factual errors.
This AI-powered system can keep sentences up-to-date and save human editors the hassle, while maintaining a human tone in writing.
The technology was developed at MIT and could enable efficient and accurate updates to Wikipedia's 52 million articles.
HOW DOES THE MIT WIKIPEDIA AI TOOL WORK?
For example, suppose this sentence needs to be updated: “Fund A considers 28 of their 42 minority interests in operationally active companies of particular interest to the group.”
The claim sentence with the updated information might read: “Fund A considers 23 of 43 minority participations important”.
The system would find the relevant Wikipedia text for ‘Fund A’ based on the claim.
The outdated numbers (28 and 42) are then automatically deleted and replaced with the new numbers (23 and 43), while the rest of the sentence stays the same and remains grammatically correct.
The system compares a sentence in a Wikipedia article with an updated sentence with conflicting information – called a claim.
If these two sentences do not match, the AI uses a so-called ‘neutrality mask’.
Each existing Wikipedia sentence is paired with a claim: the first holds the current text and the second holds the new information.
Each pair of sentences is automatically labeled as “agree,” “disagree,” or “neutral.”
The system takes the pairs labeled “disagree,” and a custom “neutrality mask” then identifies the exact words that make the two sentences contradictory.
Researchers used the system on a dataset of specific Wikipedia sentences, not on all Wikipedia pages.
The system then removes those words so that the sentence no longer contradicts the claim, but the gaps still need to be filled with the updated information.
A binary mask labels words that should probably be removed with a 0 and words that should be kept with a 1.
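As a rough illustration of such a mask, the Python snippet below marks each word of the outdated sentence with a 0 or a 1. The real system learns which words to drop; the rule used here (a number that does not appear in the claim gets a 0) is a stand-in assumption, and `binary_mask` is a hypothetical helper, not part of the MIT code.

```python
def binary_mask(outdated: str, claim: str) -> list[tuple[str, int]]:
    """Pair every word of the outdated sentence with 0 (drop) or 1 (keep)."""
    claim_words = set(claim.split())
    mask = []
    for word in outdated.split():
        # Assumed toy rule: a number that is missing from the claim is outdated.
        is_outdated = word.isdigit() and word not in claim_words
        mask.append((word, 0 if is_outdated else 1))
    return mask


outdated = "Fund A considers 28 of their 42 minority interests"
claim = "Fund A considers 23 of 43 minority participations important"
mask = binary_mask(outdated, claim)
print(mask)
```

Here only “28” and “42” receive a 0, because every other word is either not a number or also appears in the claim.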
The researchers then created a third step, which they call a “novel two-encoder-decoder” framework.
This takes words from the claim, which carries the latest information, and inserts them into the existing sentence at the positions marked with a 0, where a word has been deleted.
This algorithm therefore removes the outdated information and replaces it with the correct statistics.
The information and words that are still accurate and current are retained.
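The fill-in step can be pictured with a short Python sketch. It takes a 0/1 mask as given and copies numbers from the claim, in order, into the positions marked 0. The actual model generates the replacement text rather than copying it, so `fill_from_claim` and its copy rule are assumptions made purely for illustration.

```python
def fill_from_claim(words: list[str], mask: list[int], claim: str) -> str:
    """Keep words marked 1 and fill positions marked 0 with numbers from the claim."""
    # Assumed toy rule: replacements are the claim's numbers, used in order.
    candidates = iter([w for w in claim.split() if w.isdigit()])
    result = []
    for word, bit in zip(words, mask):
        if bit == 1:
            result.append(word)  # still accurate, so it is retained
        else:
            # Fall back to the original word if the claim has nothing left.
            result.append(next(candidates, word))
    return " ".join(result)


words = "Fund A considers 28 of their 42 minority interests".split()
mask = [1, 1, 1, 0, 1, 1, 0, 1, 1]
claim = "Fund A considers 23 of 43 minority participations important"
updated = fill_from_claim(words, mask, claim)
print(updated)
```

Only the two masked numbers change; every word marked 1 passes through untouched.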
The system can also be used to counter fake news and bias that human writers have introduced into Wikipedia articles.
“If you have a bias in your dataset, and you’re fooling your model into just looking at one sentence in a disagree pair to make predictions, your model will not survive the real world,” says Darsh Shah, a PhD student at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).
“We let models look at both sentences in all agree-disagree pairs.”