Google-aided filter censors ‘harmful’ internet content; Researchers claim 91pc accuracy, a big jump over the 70pc efficiency of traditional machine censors

A research team on the mainland claims to have developed a text censor that can filter “harmful information” on the internet with unprecedented accuracy using artificial intelligence.

Traditional machine censors rely mainly on keywords to do this and struggle to achieve 70 per cent accuracy, while AI technology – which needs to be trained by humans – has taken that to about 80 per cent in recent years.

The team from Shenyang Ligong University and the Chinese Academy of Sciences said their AI technology did not need to be trained by humans and “outperforms other approaches” to attain more than 91 per cent accuracy. It would be particularly useful to “identify and filter sensitive information from online news media”, lead researcher Li Shu and her colleagues wrote in the Journal of Chinese Computer Systems.

The mainland has more than 900 million internet users, more than any other country, and is building the world’s largest 5G networks to boost speed. But the internet is tightly controlled: many sites are blocked, including Google, Facebook, Twitter and some foreign news outlets, and much of the content on the sites that remain available is banned.

Banned topics include cults, pornography, drug abuse, firearm use, terrorism and attacks on the Communist Party and its leaders.

But identifying them is a challenge for computers. Chinese is one of the most complex languages in the world, with nearly 10,000 characters. A sensitive word such as “gun” could get picked up in a non-sensitive context, triggering a false alarm, while illegal information could be posted online without the use of any sensitive words.
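The two failure modes described above can be illustrated with a minimal sketch of a keyword filter. This is not the researchers' code; the function name `keyword_flag`, the one-word blocklist and the English sample sentences are all assumptions made for illustration.

```python
import string
from typing import Set


def keyword_flag(text: str, keywords: Set[str]) -> bool:
    """Return True if any whitespace-separated word in `text` is a listed keyword."""
    # Lower-case each word and trim punctuation from its ends before matching.
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return any(w in keywords for w in words)


keywords = {"gun"}  # hypothetical one-word blocklist

# Harmless sentence that happens to contain the keyword.
benign = "The museum exhibits an antique gun from 1850."
# Sentence that disguises the keyword with hyphens.
evasive = "Selling a g-u-n, message me privately."

print(keyword_flag(benign, keywords))   # flagged even though the context is harmless
print(keyword_flag(evasive, keywords))  # missed because the word is disguised
```

A filter of this kind looks only at surface strings, which is why it both over-blocks and under-blocks.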

The Chinese government and internet companies have instead relied on a huge army of censors to manually vet online content, but this approach is too costly and inefficient to keep pace with the growth of information on China’s internet.

Li, an associate professor of computer science at Shenyang Ligong University, said the technology developed by her team could keep up with the fast-evolving language used online in China, with a powerful dictionary containing not only sensitive words but their changing forms.

She said it could also read between the lines when searching for illegal content that was hidden in a different context, improving its ability to identify text written to evade machine censors.

Many internet users in China avoid using sensitive words and instead use homonyms or add hyphens to fool the censors.

Part of the team’s text censor technology came from Google, Li said. In 2018, Google released an open-source language model known as Bidirectional Encoder Representations from Transformers, or BERT, which helps its search engine better understand users’ search terms. BERT can read words in different contexts, such as “running a business” versus “running a marathon”, as a result of training on huge text collections including English Wikipedia.

But BERT is not a censor and cannot process text longer than 512 tokens (words or word fragments) at a time. To make it work, Li’s system breaks a long text into segments, lets BERT read the shorter parts and uses another AI tool to assess the results using an up-to-date dictionary.
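The segmenting step can be sketched roughly as follows. This is an illustration under stated assumptions, not the team's implementation: whitespace-separated words stand in for BERT's tokens, the 50-token overlap between segments is an arbitrary choice, `score_segment` is a hypothetical placeholder for whatever model reads each segment, and taking the highest segment score is a stand-in for the second AI tool, whose workings the article does not describe. The default of 510 leaves room for the two special tokens BERT adds to each input.

```python
from typing import Callable, List, Sequence


def split_into_segments(
    tokens: Sequence[str], max_len: int = 510, overlap: int = 50
) -> List[List[str]]:
    """Split `tokens` into windows of at most `max_len`, each sharing
    `overlap` tokens with the previous window."""
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("need max_len > 0 and 0 <= overlap < max_len")
    step = max_len - overlap
    segments: List[List[str]] = []
    start = 0
    while start < len(tokens):
        segments.append(list(tokens[start:start + max_len]))
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the text
        start += step
    return segments


def score_document(
    tokens: Sequence[str],
    score_segment: Callable[[List[str]], float],  # hypothetical per-segment model
    max_len: int = 510,
    overlap: int = 50,
) -> float:
    """Score every segment and return the highest score (assumed aggregation)."""
    segments = split_into_segments(tokens, max_len, overlap)
    return max((score_segment(s) for s in segments), default=0.0)


# Usage with a dummy scorer that only reacts to one placeholder token.
document = ["word"] * 1199 + ["flagged"]
dummy_scorer = lambda seg: 1.0 if "flagged" in seg else 0.0
print(len(split_into_segments(document)))
print(score_document(document, dummy_scorer))
```

Overlapping the windows is one way to reduce the chance that a phrase is cut in half at a segment boundary; whether the published system does this is not stated in the article.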


Source: South China Morning Post

