Hints.AI
Misinformation (fake news) has pernicious effects on our communities and our forms of government. It has been implicated in civic harm across a wide variety of countries. For example, fake news has depressed vaccination rates, leading to outbreaks of infectious diseases. While malign actors have used information to deepen social fissures for many years, social media has increased our vulnerability.
Our solution uses network analysis to detect misinformation. Our system is robust, scalable, and impervious to adversaries, and it offers an explainable machine learning approach to misinformation detection.
In contrast to current approaches, which use manual labeling or natural language processing (NLP), our solution uses network analysis. Who interacts with an article is just as important as what the article states.
Journalists have expressed keen interest, and our partners in the media monitoring industry are eager to see us succeed.
Misinformation is global in nature, scale, and impact. Elections in modern democracies, trust in civic institutions, vaccination rates, and willingness to adopt new technologies (e.g., 5G) are all influenced by misinformation. Most importantly, the underlying social cohesion of modern societies is damaged by adversaries who exploit societal fissures for their own ends.
By stopping the spread of fake news in social networks, we can greatly reduce the impact of misinformation on our societies.
We are currently working with journalists at newspapers of record as well as with service providers to journalists. We are also working with companies that monitor social media and have started conversations with the social media companies themselves.
Current methods to detect misinformation mainly rely either on source analysis, which is easily spoofed, or on NLP/semantic analysis, which is not sufficiently advanced for the problem (e.g., "Pope endorses Trump" is perfectly valid semantically).
In contrast, we rely on network analysis. Constructing a network of articles, with weights showing the interactions between them, allows us to discover surprising relationships between disparate types of misinformation.
Formally, we define a graph of articles. Each article is linked to other articles by a weighted edge, where the weight depends on the number, type, and percentage of users the two articles have in common. The more two articles share a specialized audience, the more related they are. To bootstrap the protocol, some articles are labeled as True/False using data from the International Fact Checking Network (IFCN) as well as newspapers of record (e.g., the New York Times, the Washington Post).
For example, starting with labeled misinformation about #Khasoggi leads us to misinformation about Beto O'Rourke illegally funding a migrant caravan. Starting with misinformation about vaccinations leads us to misinformation about crack cocaine (created by the CIA), 9/11, QAnon, etc.
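The article graph described above can be illustrated with a minimal sketch. It is not the HINTS implementation: the edge weight here is assumed to be the Jaccard overlap of the two audiences (shared users divided by all users of either article), which is only one of several ways to combine the number and percentage of common users. The function name `build_article_graph`, the `min_shared` parameter, and the sample data are hypothetical.

```python
from itertools import combinations


def build_article_graph(interactions, min_shared=1):
    """Return {(article_a, article_b): weight} for articles that share users.

    interactions maps an article id to the set of user ids that interacted
    with it. The weight is the Jaccard overlap of the two audiences, an
    assumed stand-in for the weighting described in the text.
    """
    edges = {}
    for a, b in combinations(sorted(interactions), 2):
        shared = interactions[a] & interactions[b]
        if len(shared) < min_shared:
            continue  # no edge between articles without a common audience
        union = interactions[a] | interactions[b]
        edges[(a, b)] = len(shared) / len(union)
    return edges


# Hypothetical toy data: a1 and a2 share two of four distinct users.
interactions = {
    "a1": {"u1", "u2", "u3"},
    "a2": {"u2", "u3", "u4"},
    "a3": {"u9"},
}
edges = build_article_graph(interactions)
print(edges)
```

Comparing every pair of articles is quadratic in the number of articles, so a system at decahose scale would presumably index by user instead; the sketch only shows the shape of the data.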
Note that the labeling itself is highly robust. Even a relatively high percentage of mislabels will still result in correct sorting, because the network algorithms add redundancy and statistical power.
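The robustness claim can be illustrated with a generic label-spreading sketch. The source does not name the algorithm it uses, so this is an assumption for illustration only: every article repeatedly takes a weighted average of its neighbors' scores, blended with its own seed label through a hypothetical mixing parameter `alpha`. Scores run from -1 (false) to +1 (true), and the article names are invented.

```python
from itertools import combinations


def propagate(edges, seeds, alpha=0.8, iterations=50):
    """Spread seed labels over a weighted graph (generic label spreading).

    edges: {(a, b): positive weight}; seeds: {article: +1.0 or -1.0}.
    Seed labels are soft: a seed that disagrees with its neighborhood
    can be outvoted, which is what tolerates mislabels.
    """
    neighbors = {}
    for (a, b), w in edges.items():
        neighbors.setdefault(a, {})[b] = w
        neighbors.setdefault(b, {})[a] = w
    scores = {node: seeds.get(node, 0.0) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            total = sum(nbrs.values())
            avg = sum(w * scores[m] for m, w in nbrs.items()) / total
            updated[node] = alpha * avg + (1 - alpha) * seeds.get(node, 0.0)
        scores = updated
    return scores


# Hypothetical cluster of five articles sharing one audience.
nodes = ["f1", "f2", "f3", "f4", "f5"]
edges = {pair: 1.0 for pair in combinations(nodes, 2)}
# f4 is deliberately mislabeled as true; f5 has no label at all.
seeds = {"f1": -1.0, "f2": -1.0, "f3": -1.0, "f4": 1.0}
scores = propagate(edges, seeds)
print({node: round(score, 3) for node, score in scores.items()})
```

In this toy cluster the mislabeled article and the unlabeled one both end with negative scores, because the three correctly labeled neighbors outweigh the single bad seed.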
The system is robust and extremely quick. It is adversary-proof (even if an adversary knows how the algorithm works, the size of the manipulation is bounded by the amount of work they do) and explainable.
- Make government and other institutions more accountable, transparent, and responsive to citizen feedback
- Ensure all citizens can overcome barriers to civic participation and inclusion
- Prototype
- New technology
It's a new technology that is fast, robust, and adversary-proof. The technology has tunable precision and recall, which can be made arbitrarily high at the cost of more data collection and a longer wait for that data. As the number of users interacting with an article increases, the algorithm gets more accurate. Thus, by waiting until more users interact with an article, the result can be made more accurate. Note that the number of users needed is still small, and any article with too few users does not have enough traction to cause societal damage.
It also solves a known high impact societal problem.
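The precision/recall trade-off described above can be pictured as two tuning knobs. The sketch below is illustrative only: the function `classify`, the parameters `min_users` and `threshold`, their default values, and the score range of -1 (false) to +1 (true) are all assumptions, not part of the described system.

```python
def classify(score, n_users, min_users=20, threshold=0.5):
    """Turn a network score into a decision, abstaining when unsure.

    Raising min_users or threshold trades speed and coverage for
    accuracy: fewer articles get a verdict, but each verdict rests
    on more evidence.
    """
    if n_users < min_users:
        return "wait"  # too little traction to judge yet
    if score <= -threshold:
        return "likely misinformation"
    if score >= threshold:
        return "likely reliable"
    return "uncertain"


print(classify(-0.8, n_users=5))
print(classify(-0.8, n_users=50))
print(classify(0.1, n_users=50))
```

With these defaults the three calls print `wait`, `likely misinformation`, and `uncertain`.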
We utilize novel algorithms inspired by search algorithms. Our technology is patent pending.
One of the early methods of finding data on the web was the Yahoo directory, a manually annotated list (tree) of webpages. This is where we are today with misinformation: we try to manually annotate all of the fake news. This doesn't scale.
The next generation tried to understand the webpage using techniques such as NLP, semantic analysis, and TF-IDF. The technology never worked properly: "Pope endorses Trump" is perfectly valid semantically, and it takes deep domain knowledge to know that it is false.
The current generation uses network analysis (e.g., PageRank) and actually solves the problem.
It is worth noting that our solution is more robust than PageRank. To fool PageRank, an adversary wants to increase a score, whereas to fool HINTS, an adversary wants to decrease a score. This is much harder.
- Artificial Intelligence
- Machine Learning
- Big Data
- Social Networks
Since we can discover fake news within minutes of it being uploaded, we can provide information to the social media companies to limit its spread and exposure. This limits the impact of misinformation and the attendant known social harms.
Our solution is already working as a prototype, and we are scaling up the engineering to ingest the entire decahose (provided to us by a partner organization).
- United States
- United States
Currently we are in testing. In one year, we intend to be serving 500 journalists and in 5 years to be integrated into all the major social network platforms.
Scaling up the engineering to enable full coverage of all the social media data provided to us by our partners.
We have obtained partnerships which provide us with the crucial data. We are currently working on scaling up the engineering to an MVP for journalists who have expressed keen interest.
This MVP will provide a feed of trending misinformation for the journalists to debunk. It will also give us the visibility we need to integrate with the social media companies (who are risk averse) to prevent the spread of misinformation.
Our path has been shockingly smooth with everyone trying to help. We are utilizing cloud computing to enable scaleup and our team includes seasoned experts in distributed systems.
- For-Profit
1 Full time, 3 part time, 2 interns.
We have a technology which works :)
Meedan, Washington Post - journalist customers
Knowledge Futures Group @ MIT - research partnership.
Zignal Labs, Full Intel, Public relay - data partnership and customers.
Subscription model for journalists. This is not expected to bring much revenue but is for social benefit.
Push notifications and subscription model for media monitoring companies.
API model for social media companies.
We expect to be profitable from media monitoring companies. We have already signed on 3 clients. Several VCs have suggested that we enter the media monitoring market ourselves, but that would interfere with partnerships with social media companies and with our larger impact goals.
Exposure. Social cachet especially with the large companies. Funding. Connections.
- Distribution
- Talent or board members
- Monitoring and evaluation
- Media and speaking opportunities
Facebook, Twitter, Google :)
Our solution is completely based on novel AI techniques for machine learning with the social network data. We are experts in ML, distributed systems and algorithms.
Funding will go towards hiring more data scientists to accelerate our growth.
Our solution is based on novel network analysis algorithms for social graph data. Funding will be used towards engineering and scale up.
Our current data is the Twitter decahose, accessed through our partner Zignal Labs. This data is publicly available (for a cost), and thus the data sourcing and maintenance issues are de minimis.
Our solution has the potential to drastically transform the misinformation ecosystem by preventing misinformation from spreading. This has wide application to a range of issues such as race relations, communal tensions, anti-vaccine campaigns, etc.